
# Working with Arrays and Tables in Data Science

This notebook provides examples of how to work with arrays and tables using the `datascience` package in Python,
a tool designed to facilitate data science education.

## Importing Required Libraries

First, let's import the necessary libraries.



In [1]:
from datascience import *
import numpy as np



## Arrays

An array is a sequence of values that all share the same type. `make_array` returns a NumPy array, so arithmetic on an array applies to every element at once.

### Creating and Manipulating Arrays



In [2]:
# Creating an array
my_array = make_array(1, 2, 3, 4, 5)
print("Array:", my_array)

# Performing operations on the array
my_array = my_array * 2
print("Array after multiplication:", my_array)

# Accessing elements
print("First element:", my_array.item(0))

Array: [1 2 3 4 5]
Array after multiplication: [ 2  4  6  8 10]
First element: 2
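As a further illustration of elementwise arithmetic, here is a minimal sketch that combines two arrays of equal length. It assumes `np.array` as a stand-in for `make_array` (both produce NumPy arrays), and the names `base`, `doubled`, and `combined` are hypothetical.

```python
import numpy as np

# np.array stands in for make_array here; both produce NumPy arrays.
base = np.array([1, 2, 3, 4, 5])

doubled = base * 2          # the scalar is applied to every element
combined = base + doubled   # equal-length arrays combine position by position

print("Doubled:", doubled)
print("Combined:", combined)
print("Last element:", combined.item(4))
```

Arithmetic between two arrays pairs up elements by position, so both arrays need the same length.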




## Tables

A table stores data in labeled columns. Each row holds one record, and each column is an array of values.

### Creating a Table



In [12]:
# Creating a table
students = Table().with_columns(
    'Name', make_array('Alice', 'Bob', 'Charlie'),
    'Age', make_array(20, 21, 19),
    'Major', make_array('Biology', 'Physics', 'Art')
)

students

Name    | Age  | Major
Alice   | 20   | Biology
Bob     | 21   | Physics
Charlie | 19   | Art




### Selecting Columns with .select()



In [4]:
# Selecting specific columns to create a new table
name_major_table = students.select('Name', 'Major')
name_major_table

Name    | Major
Alice   | Biology
Bob     | Physics
Charlie | Art




## Extracting Arrays from Tables

We can extract specific data from a table as an array for further analysis.



In [5]:
# Extracting the 'Age' column as an array
ages_array = students.column('Age')
print("Ages array:", ages_array)

# Calculating the average age
average_age = np.mean(ages_array)
print("Average age:", average_age)

Ages array: [20 21 19]
Average age: 20.0
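Because an extracted column is an ordinary NumPy array, any array operation can follow. The sketch below assumes a hand-built array with the same values as `students.column('Age')`; the names `deviations` and `oldest_index` are hypothetical.

```python
import numpy as np

# Stand-in for students.column('Age'), which returns a NumPy array.
ages = np.array([20, 21, 19])

deviations = ages - np.mean(ages)   # distance of each age from the average
oldest_index = np.argmax(ages)      # position of the largest value

print("Deviations:", deviations)
print("Index of oldest student:", oldest_index)
```

The index returned by `np.argmax` refers to a row position, so it can be used to look up the matching entry in another column of the same table.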




The sections above cover the basics of working with arrays and tables in the `datascience` package,
including how to manipulate data and extract useful information for analysis. The remaining sections go further with both.



## More Operations with Arrays

### Operations on Arrays

NumPy provides functions that summarize an array as a single number, such as its minimum, maximum, sum, and standard deviation.



In [6]:
# Example array
numbers = make_array(4, 7, 1, 3, 9)

# Operations
print("Minimum:", np.min(numbers))
print("Maximum:", np.max(numbers))
print("Sum:", np.sum(numbers))
print("Standard Deviation:", np.std(numbers))

Minimum: 1
Maximum: 9
Sum: 24
Standard Deviation: 2.85657137142
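The standard deviation printed above is the population version: `np.std` divides by the number of elements by default. The following sketch reproduces that value step by step and contrasts it with the sample version obtained through the `ddof=1` argument. The variable names are hypothetical.

```python
import numpy as np

numbers = np.array([4, 7, 1, 3, 9])

# Population standard deviation, computed step by step:
# root of the mean of the squared deviations from the average.
deviations = numbers - np.mean(numbers)
manual_std = np.sqrt(np.mean(deviations ** 2))

population_std = np.std(numbers)        # default: divide by n
sample_std = np.std(numbers, ddof=1)    # divide by n - 1 instead

print("Manual:", manual_std)
print("Population:", population_std)
print("Sample:", sample_std)
```

Which version is appropriate depends on whether the array holds the whole population or a sample drawn from it.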




### Boolean Arrays and Conditional Selection

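Comparing an array to a value produces a boolean array with one `True` or `False` per element. Indexing the original array with it keeps only the elements marked `True`, which is useful for selecting subsets of data.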



In [7]:
# Creating a boolean array
is_even = numbers % 2 == 0
print("Is even:", is_even)

# Using the boolean array to select elements
print("Even numbers:", numbers[is_even])

Is even: [ True False False False False]
Even numbers: [4]
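Conditions can also be combined. This is a minimal sketch using the same values as `numbers`, with `np.array` standing in for `make_array`; the names `in_range`, `selected`, and `count` are hypothetical.

```python
import numpy as np

numbers = np.array([4, 7, 1, 3, 9])

# Parentheses are required: & binds more tightly than the comparisons.
in_range = (numbers > 2) & (numbers < 8)

selected = numbers[in_range]          # elements where both conditions hold
count = np.count_nonzero(in_range)    # how many elements matched

print("In range:", in_range)
print("Selected:", selected)
print("Count:", count)
```

Use `&` for "and" and `|` for "or" between boolean arrays; the Python keywords `and` and `or` do not work elementwise.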




## More Table Manipulations

### Adding Rows and Columns

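`with_column` and `with_row` each return a new table with the extra column or row and leave the original table unchanged, so the examples below assign the result back to `students`.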



In [13]:
# Adding a column
students = students.with_column('Graduation Year', make_array(2022, 2023, 2022))
students

Name    | Age  | Major   | Graduation Year
Alice   | 20   | Biology | 2022
Bob     | 21   | Physics | 2023
Charlie | 19   | Art     | 2022


In [14]:
# Adding a row
students = students.with_row(['Dana', 22, 'Computer Science', 2023])
students

Name    | Age  | Major            | Graduation Year
Alice   | 20   | Biology          | 2022
Bob     | 21   | Physics          | 2023
Charlie | 19   | Art              | 2022
Dana    | 22   | Computer Science | 2023




### Sorting Data

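Sorting a table by the values in one column is useful for both analysis and presentation. `sort` returns a new table in ascending order by default; pass `descending=True` to reverse the order.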



In [16]:
# Sorting by age
sorted_by_age = students.sort('Age')
sorted_by_age

Name    | Age  | Major            | Graduation Year
Charlie | 19   | Art              | 2022
Alice   | 20   | Biology          | 2022
Bob     | 21   | Physics          | 2023
Dana    | 22   | Computer Science | 2023


In [21]:
# Sorting by name in descending order
sorted_by_name_desc = students.sort('Name', descending=True)
sorted_by_name_desc

Name    | Age  | Major            | Graduation Year
Dana    | 22   | Computer Science | 2023
Charlie | 19   | Art              | 2022
Bob     | 21   | Physics          | 2023
Alice   | 20   | Biology          | 2022
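Conceptually, sorting a table by one column applies the same reordering to every column so that rows stay intact. The sketch below illustrates the idea with plain NumPy arrays and `np.argsort`. It is an illustration only, not a description of how `datascience` implements `sort`, and the arrays are hand-built copies of the `Name` and `Age` columns.

```python
import numpy as np

names = np.array(['Alice', 'Bob', 'Charlie', 'Dana'])
ages = np.array([20, 21, 19, 22])

order = np.argsort(ages)                 # row positions in ascending order of age

names_by_age = names[order]              # reorder every column with the same positions
ages_sorted = ages[order]
names_by_age_desc = names[order[::-1]]   # reversed positions give descending order

print(names_by_age)
print(ages_sorted)
print(names_by_age_desc)
```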
