# Intro to NumPy

---

NumPy is a foundational package for numerical computing in Python.

*   NumPy provides `ndarray`, an efficient multi-dimensional array supporting fast array-oriented arithmetic operations.
*   Can perform math operations on entire arrays without writing `for` loops
*   Can perform common array operations such as sorting, finding unique values, and set operations
*   Linear algebra, random number generation, and Fourier transform capabilities
*   C API for connecting with C, C++, and Fortran libraries
*   Can map data directly onto underlying disk or memory representation
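
As a small illustration of the sorting, unique, and set operations listed above, here is a minimal sketch. The arrays `a` and `b` are made-up values chosen for this example.

```python
import numpy as np

# Hypothetical sample arrays, chosen only for illustration
a = np.array([3, 1, 2, 3, 1])
b = np.array([2, 3, 4])

print(np.sort(a))            # sorted copy of a
print(np.unique(a))          # sorted unique values in a
print(np.intersect1d(a, b))  # values present in both arrays
print(np.union1d(a, b))      # values present in either array
```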


### Efficiency
- NumPy stores data in contiguous memory blocks
- NumPy stores data with a single type, so operations don’t require per-element type checking
- Performs complex computations on entire arrays without the need for loops
- Slicing and many other operations return views rather than copying arrays by default

NumPy-based algorithms are generally 10 to 100 times faster (or more) than their pure Python counterparts and use significantly less memory.
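
The speed claim can be checked roughly with a sketch like the one below, which times a pure Python sum of squares against the equivalent NumPy expression. The array size, the repeat count, and the names `py_values` and `np_values` are arbitrary choices for this example, and the measured ratio will vary by machine.

```python
import timeit
import numpy as np

n = 100_000  # arbitrary size chosen for illustration
py_values = list(range(n))
np_values = np.arange(n, dtype=np.int64)

def python_sum_of_squares():
    # Loops over every element in the interpreter
    return sum(v * v for v in py_values)

def numpy_sum_of_squares():
    # One vectorized multiply followed by one vectorized sum
    return int(np.sum(np_values * np_values))

py_time = timeit.timeit(python_sum_of_squares, number=10)
np_time = timeit.timeit(numpy_sum_of_squares, number=10)
print(f"pure Python: {py_time:.4f}s, NumPy: {np_time:.4f}s")
```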

NumPy is designed & optimized for **vectorized** (batch) operations on array data without `for` loops.

- Arithmetic operations (addition, subtraction, multiplication, division) between equal-size arrays are applied element-wise
- Scalar operations propagate the scalar argument to each element in the array
- Comparisons between equal-size arrays yield boolean arrays
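
The three behaviours above can be sketched as follows. The arrays `a` and `b` are illustrative values, not part of the exercises.

```python
import numpy as np

# Illustrative equal-size arrays
a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])

print(a + b)   # element-wise addition
print(a * b)   # element-wise multiplication
print(a * 2)   # the scalar 2 is applied to every element
print(a < b)   # element-wise comparison yields a boolean array
```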


### Creating arrays
NumPy provides the `ndarray`, a multi-dimensional array structure optimized for fast numeric operations.

ndarrays are constructed from sequences of homogeneous values.

Unless explicitly specified, `np.array` infers a data type for the created array and stores the data type in a special `dtype` metadata object.
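
A minimal sketch of dtype inference, using made-up input lists. The exact integer dtype depends on the platform and NumPy version, so the example only prints it.

```python
import numpy as np

ints = np.array([1, 2, 3])                        # all integers -> an integer dtype
floats = np.array([1, 2.5, 3])                    # one float promotes the array to float64
explicit = np.array([1, 2, 3], dtype=np.float32)  # dtype requested explicitly

print(ints.dtype, floats.dtype, explicit.dtype)
```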


In [5]:
import numpy as np

Problem 1

In [None]:
# create an array called numbers with 2 rows and 4 columns from literal values. Have your literal values be the integers 1-8
numbers = np.array([
    [1, 2, 3, 4], 
    [5, 6, 7, 8]
    ])
print(numbers)

# Print each attribute of the array.
print("\nInspect the array\n--------------")
print("dimensions:", numbers.ndim)  # ndarrays have dimensions
print("shape:\t", numbers.shape)    # ndarrays have shape (number of rows & columns)
print("size:\t", numbers.size)      # number of elements in the array
print("datatype", numbers.dtype)    # numpy determines the datatype (platform-dependent for integers)
print("bytes:\t", numbers.nbytes)   # total bytes consumed by the array

[[1 2 3 4]
 [5 6 7 8]]

Inspect the array
--------------
dimensions: 2
shape:	 (2, 4)
size:	 8
datatype int32
bytes:	 32


Problem 2: NumPy has built-in functions to create special arrays.

In [17]:
# Create a 3 by 4 array of ones
ones = np.ones((3, 4))
print(ones, "\n")

# Create a 3 by 4 array of zeros
zeros = np.zeros((3, 4))
print(zeros, "\n")

# Create a 2 by 2 array with random values
rando = np.random.rand(2, 2)
print(rando, "\n")

# Create an empty 3 by 2 array (its values are whatever was already in memory)
empty = np.empty((3, 2))
print(empty, "\n")

# Create a 2 by 2 array full of 'x's. Hint: search for numpy full
full = np.full((2, 2), 'x')
print(full, "\n")

# Create an array of the form [10, 15, 20] using np.arange
even = np.arange(10, 25, 5)
print(even, "\n")

# Create an array of the form [0.   0.25 0.5  0.75 1.   1.25 1.5  1.75 2.  ]  using np.linspace
line = np.linspace(0, 2, num=9)
print(line, "\n")


[[1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [1. 1. 1. 1.]] 

[[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]] 

[[0.3965221  0.00991006]
 [0.24163363 0.94324556]] 

[[4.45057636e-308 9.05128263e-321]
 [0.00000000e+000 1.15280871e-311]
 [0.00000000e+000 3.25060610e-319]] 

[['x' 'x']
 ['x' 'x']] 

[10 15 20] 

[0.   0.25 0.5  0.75 1.   1.25 1.5  1.75 2.  ] 



### Indexing & Slicing

NumPy supports data access with Python-like indexing & slicing.

- One-dimensional ndarrays behave much like Python lists
- `ndarray` dimensions are referred to as axes, e.g. in a 2d array axis 0 indexes the ‘rows’ and axis 1 indexes the ‘columns’
- ndarray slices are **views** on the original array, not copies. Any changes to the view are reflected in the source array
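
The view behaviour can be checked with a short sketch. The names `source`, `view`, and `copy` are chosen for this example only.

```python
import numpy as np

source = np.arange(6)
view = source[2:5]          # basic slicing returns a view
copy = source[2:5].copy()   # .copy() allocates independent data

view[0] = 99                # writes through to `source`
copy[1] = -1                # does not affect `source`

print(source)
print(np.shares_memory(source, view), np.shares_memory(source, copy))
```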


Problem 3

In [20]:
arr1d = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8])

# using indexing select and output [5, 6, 7] from arr1d. Save this slice to a variable named slice_arr1d
slice_arr1d = arr1d[5:8]
print(slice_arr1d)

[5 6 7]


Problem 4

In [21]:
# Replace 6, 7, and 8 with 12 in arr1d using one line of code. Then print arr1d and slice_arr1d
arr1d[6:9] = 12
print(arr1d)
print(slice_arr1d)  # the slice is a view, so it reflects the change

[ 0  1  2  3  4  5 12 12 12]
[ 5 12 12]


Problem 5

In [None]:
# Create a copy of arr1d and name it new_array. Hint: You can't use new_array = arr1d
new_array = arr1d.copy()
print(new_array)

[ 0  1  2  3  4  5 12 12 12]


Problem 6

In [24]:
arr2d = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])

# There are two ways to select 3 from the 2 dimensional array above. Use both ways to select 3 and print the results. 
print(arr2d[0, 2])
print(arr2d[0][2])

3
3


Problem 7

In [25]:
# In multi-dimensional arrays, indexing that omits later indices will return a lower-dimensional ndarray
# Select and print out the following:

# select first two rows and all but first column
print(arr2d[0:2, 1:])
# select first two rows and only 3rd column
print(arr2d[0:2, 2])
# select all columns in the first row
print(arr2d[0, :])

[[2 3 4]
 [6 7 8]]
[3 7]
[1 2 3 4]


### Boolean Indexing

NumPy supports boolean expressions in place of indices, where the expression evaluates to an array of boolean values with the same length as the axis it’s indexing.
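
A minimal sketch of the idea, using a made-up `labels` array with one entry per row of a made-up `values` array:

```python
import numpy as np

labels = np.array(["a", "b", "a", "c"])
values = np.arange(12).reshape(4, 3)   # one row per label

mask = labels == "a"    # boolean array of length 4, matching axis 0 of `values`
print(mask)
print(values[mask])     # keeps only the rows where the mask is True
```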

Problem 8

In [None]:
import numpy as np
names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'])
data = np.random.randint(2, size=(7, 7)) # create an array of random values

# Select and print the first and fourth rows from 'data' using just the arrays 'names' and 'data'.
# Hint: "Bob" appears in both the first and fourth rows.
selected_rows = data[(names == 'Bob')]
print(selected_rows)

[[1 1 1 0 0 1 1]
 [1 1 1 0 0 0 0]]


Problem 9: The expression can be assigned to a variable and used in place of slicing

In [36]:
# Again output the first and fourth rows, this time using the cond below
cond = (names == 'Bob')
data[cond]

array([[1, 1, 1, 0, 0, 1, 1],
       [1, 1, 1, 0, 0, 0, 0]], dtype=int32)

Problem 10: The expression can be negated

In [37]:
# Select all rows except the first and fourth
data[~cond]

array([[0, 1, 0, 1, 0, 0, 0],
       [0, 1, 0, 0, 1, 0, 1],
       [1, 0, 1, 1, 0, 1, 1],
       [0, 0, 1, 0, 0, 1, 1],
       [1, 1, 0, 0, 0, 0, 1]], dtype=int32)

Problem 11: The expression can be combined with other indices

In [38]:
# Select the last 5 columns for the first and fourth rows
data[cond, -5:]

array([[1, 0, 0, 1, 1],
       [1, 0, 0, 0, 0]], dtype=int32)

Problem 12: boolean expressions can be combined with logical operators

In [39]:
# Select the first, third, fourth and fifth rows using similar conditions and an OR operator
cond = (names == 'Bob') | (names == 'Will')
data[cond]

### Array Oriented Programming

NumPy supports **vectorization**, where data processing is written as array expressions without **for loops**. Vectorized operations can be 1-2 orders of magnitude faster than pure Python equivalents.

Any arithmetic operation between equal-size arrays applies the operation element-wise.


Problem 13

In [40]:
array1 = np.array([[1., 2., 3.], [4., 5., 6.]])

print(array1)

# Add 5 to all values in array1 and print 
print(array1 + 5)


# Multiply all values in array1 by 10 and print 
print(array1 * 10)


[[1. 2. 3.]
 [4. 5. 6.]]
[[ 6.  7.  8.]
 [ 9. 10. 11.]]
[[10. 20. 30.]
 [40. 50. 60.]]


Problem 14: Comparisons between arrays of the same size yield boolean arrays:

In [41]:
array2 = np.array([[0., 4., 1.], [7., 2., 12.]])


# Test to see if pairwise elements in array1 are less than the corresponding element in array2. Output the results.
print(array1 < array2)
# i.e. comparing the first elements of the first rows would output False because 1 is not less than 0

[[False  True False]
 [ True False  True]]


### Universal Functions

NumPy provides universal functions (ufuncs) that perform element-wise operations on array data. They are fast vectorized wrappers for simple functions that take one or more scalar values and produce one or more scalar results.

- Unary functions - e.g. sqrt, exp - perform element-wise transformations
- Binary functions - e.g. add, maximum - take two arrays and return a single-array as the result
- Ufuncs can use an optional ‘out’ parameter to perform in-place transformations
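
The three points above can be sketched as follows. The arrays `x` and `y` are illustrative values.

```python
import numpy as np

x = np.array([1.0, 4.0, 9.0])
y = np.array([2.0, 3.0, 10.0])

print(np.sqrt(x))          # unary ufunc: element-wise square root
print(np.maximum(x, y))    # binary ufunc: element-wise maximum of two arrays

np.sqrt(x, out=x)          # write the result back into x instead of a new array
print(x)
```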


### Conditional Logic

Problem 15

In [31]:
arr1 = np.ones(5)
arr2 = np.zeros(5)
cond = np.array([True, False, True, True, False])

# Output the value from `arr1` whenever the corresponding value in `cond` is True, and otherwise take the value from `arr2`. 
result = np.where(cond, arr1, arr2)
print(result)

[1. 0. 1. 1. 0.]


### Math & Statistical Methods

NumPy can compute statistics for an entire array or the data along a single axis.
- Can compute aggregations by invoking the array instance method or the top-level NumPy function
- Can specify whether to compute across rows or columns
- Can use `sum` to count number of True values in a boolean array

Methods:
- mean
- sum
- cumsum
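
A short sketch of these aggregations on a made-up 2 by 3 array `m`, showing the method form, the top-level function form, and the `axis` argument:

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])

print(m.mean(), np.mean(m))   # method and top-level function agree
print(m.sum(axis=0))          # one sum per column
print(m.sum(axis=1))          # one sum per row
print(m.cumsum())             # running total over the flattened array
print((m > 2).sum())          # counts the True values in the boolean array
```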

Problem 16

In [29]:
arr = np.random.randint(4, size=(5, 4))
print(arr)

# Calculate and print the mean and sum for all values in the array
print("Mean:", np.mean(arr))
print("Sum:", np.sum(arr))


[[1 0 3 3]
 [0 0 0 0]
 [3 3 2 2]
 [3 0 1 3]
 [0 0 1 3]]
Mean: 1.4
Sum: 28


Problem 17

In [28]:
# Calculate the mean and sum for each column and each row of arr2d. Print out both
print("Mean of each column (axis=0):", np.mean(arr2d, axis=0))
print("Sum of each column (axis=0):", np.sum(arr2d, axis=0))
print("Mean of each row (axis=1):", np.mean(arr2d, axis=1))
print("Sum of each row (axis=1):", np.sum(arr2d, axis=1))

Mean of each column (axis=0): [3. 4. 5. 6.]
Sum of each column (axis=0): [ 6  8 10 12]
Mean of each row (axis=1): [2.5 6.5]
Sum of each row (axis=1): [10 26]


Problem 18

In [27]:

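# Print out the cumulative sum for the full array, the columns and the rows. 
# For example the array [[1, 1, 1], [1, 1, 1]] would have the following outputs:
# full array: [1, 2, 3, 4, 5, 6]
# columns: [[1, 1, 1], [2, 2, 2]]
# rows: [[1, 2, 3], [1, 2, 3]]
# Here arr is the 5 by 4 array from Problem 16
print("Full array:", np.cumsum(arr))
print("Columns:", np.cumsum(arr, axis=0), sep="\n")
print("Rows:", np.cumsum(arr, axis=1), sep="\n")


Full array: [ 1  1  4  7  7  7  7  7 10 13 15 17 20 20 21 24 24 24 25 28]
Columns:
[[ 1  0  3  3]
 [ 1  0  3  3]
 [ 4  3  5  5]
 [ 7  3  6  8]
 [ 7  3  7 11]]
Rows:
[[ 1  1  4  7]
 [ 0  0  0  0]
 [ 3  6  8 10]
 [ 3  3  4  7]
 [ 0  0  1  4]]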

Problem 19

In [26]:

arr = np.arange(10)
print(arr)


# Print out all odd values
print(arr[1::2])
# Print out all values in reverse order
print(arr[::-1])

[0 1 2 3 4 5 6 7 8 9]
[1 3 5 7 9]
[9 8 7 6 5 4 3 2 1 0]
