# Lecture 4 – Arrays, NumPy, Indexing

### Data 6, Summer 2025

## Arrays

In [None]:
# Run this cell to import the needed functions
from datascience import *

## `make_array()`
The `make_array` function takes in multiple values separated by commas and combines them into one value called an **array**. We will use arrays extensively in this class because they let us write simpler code when working with lots of data. See the examples below for some uses of the function.

In [None]:
make_array(5, -1, 0, 5)

In [None]:
make_array(5, -1, 0.3, 5)

In [None]:
make_array(4, -4.5, "not a number")

In [None]:
arr = make_array("hello",
                 "world",
                 "!")
arr

In [None]:
print(make_array(5, -1, 0.3, 5))
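
The examples above differ in the type of value the array ends up storing. Here is a minimal sketch of that behavior. It uses `np.array` directly, on the assumption that `make_array` returns an ordinary NumPy array.

```python
import numpy as np

# Assumption: make_array(a, b, c) behaves like np.array([a, b, c]).
ints = np.array([5, -1, 0, 5])
mixed_numbers = np.array([5, -1, 0.3, 5])
with_text = np.array([4, -4.5, "not a number"])

# An array holds one type of value, so NumPy converts everything
# to a single type that can represent every element.
print(ints.dtype.kind)           # 'i' means integer
print(mixed_numbers.dtype.kind)  # 'f' means floating point
print(with_text.dtype.kind)      # 'U' means text (Unicode string)
print(with_text.item(0))         # the number 4 is now stored as the string '4'
```

This is why mixing numbers and strings in one array is usually a mistake: arithmetic no longer works on the converted values.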

### Education Levels

Task 1: Compute % of non-high school graduates by state

In [None]:
hs_or_higher = make_array(86.9, 83.9, 88.5, 87.2, 84.4)
hs_or_higher

In [None]:
100 - hs_or_higher
hs_or_higher

In [None]:
below_hs = 100 - hs_or_higher
below_hs
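
The cells above show that arithmetic on an array produces a new array and leaves the original alone, so the result must be assigned to a name to be kept. A small sketch of the same idea, written with `np.array` in place of `make_array` (an assumption made so the block runs with NumPy alone):

```python
import numpy as np

# Assumption: same values as hs_or_higher above, built with np.array.
hs_or_higher = np.array([86.9, 83.9, 88.5, 87.2, 84.4])

# The subtraction is applied to each element and produces a new array.
below_hs = 100 - hs_or_higher

print(below_hs)
print(hs_or_higher)  # the original array is unchanged
```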

Task 2: Compute number of bachelor’s degrees by state


In [None]:
bs_or_higher = make_array(26.2, 34.7, 30.5, 37.5, 30.7)
state_pop = make_array(3344006, 26665143, 15255326, 13649157, 18449851)
state_pop

In [None]:
... #YOUR CODE HERE

In [None]:
#from datascience import *
from numpy import * #Why is this not preferred?
import numpy as np #Why is this preferred?
import datascience as ds
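
One reason the `np` prefix is usually preferred: NumPy defines functions whose names match Python built-ins (for example `sum`, `min`, and `max`), and a star import can silently replace the built-in versions. The sketch below calls both versions explicitly to show that they are not interchangeable. The list `values` is hypothetical.

```python
import builtins
import numpy as np

values = [0, 1, 2, 3, 4]

# Python's built-in sum treats the second argument as a starting value.
print(builtins.sum(values, -1))  # adds 0 + 1 + 2 + 3 + 4 to a start of -1

# NumPy's sum treats the second argument as an axis instead.
print(np.sum(values, -1))        # sums along the last axis

# With `import numpy as np`, the prefix makes it clear which one is called.
```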

In [None]:
c_temps = ds.make_array(30, 18, -4.5, 0, 3)

type(c_temps.item(0))
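
The cell above checks the type of one element. As a sketch of what to expect, assuming `ds.make_array` builds a regular NumPy array: because the array contains `-4.5`, every element is stored as a floating-point number, and `item` returns a plain Python value.

```python
import numpy as np

# Assumption: same values as c_temps above, built with np.array.
c_temps = np.array([30, 18, -4.5, 0, 3])

first = c_temps.item(0)
print(first)        # 30 was written as an int, but it is stored as a float
print(type(first))  # item() hands back an ordinary Python value
```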


## Warmup

In [None]:
pop_2020 = make_array(5.025, .732, 7.178, 3.012, 39.500)
pop_2021 = make_array(5.040, .733, 7.276, 3.026, 39.238)

In [None]:
pop_2021 - pop_2020

In [None]:
average(pop_2020)

In [None]:
x = print(print(3))
print(x)

<br/><br/>

# Array Functions
Python's built-in functions `len`, `min`, `max`, and `sum` work on arrays and cover the most common summaries of a collection of data.

The `len` function returns the length of an array, that is, the number of elements it contains. This is useful when the size of an array is not known in advance.

The `min` function identifies the smallest value within an array, while the `max` function retrieves the largest value.

The `sum` function adds up all of the values in an array of numbers.

`min` and `max` are handy when searching for extreme values, and all four functions support basic statistical analysis on arrays.

Together, `len`, `min`, `max`, and `sum` handle many everyday data processing tasks with very little code.
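
A short sketch of the four functions on a hypothetical array, including what happens with an empty array. It uses `np.array` in place of `make_array` so that it runs with NumPy alone.

```python
import numpy as np

# Hypothetical data, built with np.array in place of make_array.
lengths = np.array([12, 7, 30, 4])
nothing = np.array([])

print(len(lengths))                # number of elements
print(min(lengths), max(lengths))  # smallest and largest values
print(sum(lengths))                # total of all values
print(len(nothing))                # an empty array has length 0

# min and max need at least one element to compare.
try:
    min(nothing)
except ValueError as error:
    print("ValueError:", error)
```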

In [None]:
empty_arr = make_array()
int_arr = make_array(3, -4, 0, 5, 2)
str_arr = make_array("cm", "m", "in", "ft", "yd")

In [None]:
print(empty_arr)
print(int_arr)
print(str_arr)

`len()`

In [None]:
len(str_arr)

In [None]:
len(empty_arr)

`min()`, `max()`

In [None]:
min(int_arr)

In [None]:
max(str_arr)

`sum()`

In [None]:
sum(int_arr)

In [None]:
sum(str_arr)

How would you compute the average of an array `arr`?

In [None]:
arr = make_array(30, -40, -4.5, 0, 35)
avg = ...
avg

## NumPy Functions

Elementwise functions

In [None]:
numbers_arr = make_array(5, 4, 9, 12, 100)
numbers_arr

In [None]:
np.sqrt(numbers_arr)

In [None]:
np.log(numbers_arr)   # natural log

In [None]:
np.log10(numbers_arr) # log base 10

In [None]:
np.sin(numbers_arr)

In [None]:
np.sqrt(144)
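
"Elementwise" means the function is applied to each element separately, so the result has the same length as the input. A minimal sketch with hypothetical values:

```python
import numpy as np

squares = np.array([1, 4, 9, 16])

roots = np.sqrt(squares)           # one result per element
print(roots)
print(len(roots) == len(squares))  # same length as the input

print(np.sqrt(25))                 # a single number works too
```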

Common functions

In [None]:
pop_2020

In [None]:
np.mean(pop_2020)

In [None]:
np.average(pop_2020)

In [None]:
np.sum(pop_2020)

In [None]:
np.prod(pop_2020)

In [None]:
np.count_nonzero(make_array(1, 2, 3, 0, 4, 0, -5))

### Even More Functions

In [None]:
daily_high_temps = make_array(73, 71, 69, 72, 76, 74, 75)
daily_high_temps

In [None]:
np.diff(daily_high_temps)
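
`np.diff` returns the change from each element to the next, so its result is one element shorter than the input. A sketch using the same temperatures, built with `np.array` as a stand-in for `make_array`:

```python
import numpy as np

# Assumption: same values as daily_high_temps above, built with np.array.
daily_high_temps = np.array([73, 71, 69, 72, 76, 74, 75])

changes = np.diff(daily_high_temps)
print(changes)  # each entry is one day's temperature minus the previous day's
print(len(daily_high_temps), len(changes))  # 7 temperatures give 6 differences
```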

In [None]:
pop_by_year = make_array(23, 45, 93, 101, 118)
pop_by_year

In [None]:
np.cumsum(pop_by_year)
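
`np.cumsum` returns a running total: each entry is the sum of all elements up to and including that position. A sketch with the same values, built with `np.array` as a stand-in for `make_array`:

```python
import numpy as np

# Assumption: same values as pop_by_year above, built with np.array.
pop_by_year = np.array([23, 45, 93, 101, 118])

running_total = np.cumsum(pop_by_year)
print(running_total)

# The last running total is the sum of the whole array.
print(running_total.item(-1) == np.sum(pop_by_year))
```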

In [None]:
# np.arange(10) produces the same values as make_array(0, 1, 2, ..., 9)
np.arange(10)

In [None]:
arr = np.arange(3, 9)
arr

In [None]:
np.arange(3, 14, 7)

In [None]:
arr = np.arange(1, 11)
arr

In [None]:
np.sum(arr)

In [None]:
np.arange(3, 11, 2)

In [None]:
np.arange(10, 1, -3)
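
The calls above follow one pattern, `np.arange(start, stop, step)`, where `stop` itself is never included. A short summary sketch with hypothetical values:

```python
import numpy as np

# Hypothetical values chosen for illustration.
print(np.arange(4))         # one argument: start at 0, stop before 4
print(np.arange(2, 6))      # two arguments: start at 2, stop before 6
print(np.arange(2, 12, 4))  # three arguments: step by 4, stop before 12
print(np.arange(9, 0, -4))  # a negative step counts down, stopping before 0
```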

# Indexing

In [None]:
sq_array = make_array(1, 4, 9, 16, 25)
sq_array

In [None]:
type(sq_array)

In [None]:
int_arr = make_array(3, -4, 0, 5, 2)
int_arr

In [None]:
int_arr.item(0)

In [None]:
int_arr.item(3)

In [None]:
int_arr.item(5) # Raises an IndexError: the valid indices are 0 through 4

In [None]:
int_arr.item(len(int_arr) - 1) #Grab the last item inside int_arr

In [None]:
int_arr.item(-1)

In [None]:
int_arr.item(-3)
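
The cells above suggest two rules: a negative index counts from the end of the array, and an index past the last position is an error. A sketch of both, using `np.array` as a stand-in for `make_array`:

```python
import numpy as np

# Assumption: same values as int_arr above, built with np.array.
int_arr = np.array([3, -4, 0, 5, 2])
n = len(int_arr)

# item(-k) selects the same element as item(n - k).
print(int_arr.item(-1) == int_arr.item(n - 1))
print(int_arr.item(-3) == int_arr.item(n - 3))

# Indices start at 0, so the largest valid index is n - 1.
try:
    int_arr.item(n)
except IndexError as error:
    print("IndexError:", error)
```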

In [None]:
test_array = make_array(10, 14, 67, -1.5, 3.2, 2.72, 104, 81, 3.14159)

In [None]:
# first (0th element)
test_array.item(0)

In [None]:
# second (1st element)
test_array.item(1)

In [None]:
# last element
test_array.item(len(test_array) - 1)
# OR 
test_array.item(-1)

In [None]:
# second to last element
test_array.item(-2)

What is the value of `five` after running this code?

In [None]:
threes = make_array(3, 6, 9, 12, 15)
five = threes.item(-1) + threes.item(1)
five