![Durham Tech logo](https://www.durhamtech.edu/themes/custom/durhamtech/images/durham-tech-logo-web.svg) 

# Utilizing Arrays with NumPy

# Table of Contents

### Basics
#### <a href='#1'>What is NumPy?</a>
#### <a href='#2'>NumPy contains an array object that is "fast"</a>
#### <a href='#3'>Creating an Array</a>
#### <a href='#4'>Slicing and Indexing</a>

### Advanced
#### <a href='#5'>Arrays vs. Lists</a>
* Structure
* Speed
* Operational Differences

### Mathematical Uses
#### <a href='#6'>Multidimensional Arrays</a>
#### <a href='#7'>Masks</a>
#### <a href='#8'>Linear Algebra and Matrix Operations</a>

#### <a href='#9'>Weekly Readings/Videos</a>
#### <a href='#10'>Extra Practice</a>

<a id='1'></a>
## What is NumPy?  

NumPy stands for **Numerical Python** and it is the fundamental package for scientific computing in Python. It is a package that lets you efficiently store and manipulate numerical arrays. It contains among other things:

* a powerful N-dimensional array object
* useful linear algebra and random number capabilities

<a id='2'></a>
## NumPy contains an array object that is "fast" 

<img src="https://miro.medium.com/max/700/1*X0Dg7QfSYtWhSAu-afi8-g.png">
<a href="https://towardsdatascience.com/numpy-array-cookbook-generating-and-manipulating-arrays-in-python-2195c3988b09">Source</a>

**It stores / consists of**:
* location of a memory block (allocated all at one time)
* a shape (3 x 3 or 1 x 9, etc)
* data type / size of each element

The core feature of NumPy is its multi-dimensional array. In NumPy, dimensions are called axes, and the number of axes is called the rank.

In [None]:
import numpy as np

np.__version__
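To make the terms above concrete, here is a minimal sketch that inspects the pieces an array stores. The array name `grid` is only an illustration and is not used elsewhere in this notebook.

```python
import numpy as np

# A hypothetical 2 x 3 array used only for illustration
grid = np.array([[1, 2, 3],
                 [4, 5, 6]], dtype=np.int64)

print(grid.ndim)      # number of axes (the rank): 2
print(grid.shape)     # length along each axis: (2, 3)
print(grid.dtype)     # data type shared by every element: int64
print(grid.itemsize)  # bytes per element: 8
print(grid.nbytes)    # total bytes in the memory block: 48
```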

<a id='3'></a>
## Creating an Array

In [None]:
# list to array
array_list = [1, 2, 3, 4, 5, 6]
numpy_array = np.array(array_list)
numpy_array

In [None]:
# object type and the data type stored in our array
type(numpy_array), numpy_array.dtype

In [None]:
# An array has only one data type, so if we store a float in an int array,
# the float is truncated (down-cast) to an int
numpy_array[1] = 7.9562
numpy_array # check the 2nd item (index 1) of the returned array

In [None]:
# If a list mixes data types, NumPy upcasts every element to one common type that can hold all the values
array1 = np.array([1, 2.5, 3, 4, 5, 6])
print(array1) # notice all values are now floats
print(array1.dtype) 
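If fractional values are expected, the data type can be chosen up front instead of being inferred. The following is a small sketch of that idea; the names `ints`, `floats`, and `as_float` are illustrative only.

```python
import numpy as np

ints = np.array([1, 2, 3])
ints[1] = 7.9562          # the fractional part is dropped, not rounded
print(ints)               # [1 7 3]

# Request a float array up front so fractional values are kept
floats = np.array([1, 2, 3], dtype=np.float64)
floats[1] = 7.9562
print(floats)

# astype returns a converted copy and leaves the original array alone
as_float = ints.astype(np.float64)
print(as_float.dtype)     # float64
print(ints.dtype == as_float.dtype)  # False, ints is still an integer array
```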

In [None]:
# very long sequences work too
data = np.array(range(320000))
print(data)
print("length of this array:",len(data)) # to see the length of the full array

In [None]:
# Mixing numbers and strings turns every element into a string,
# so mathematical operations no longer work on the array

data = np.array([1,2,'cat', 4])
print(data)
print(data.dtype) 

# For more on the printed data type, see:
#https://stackoverflow.com/questions/49751000/how-does-numpy-determine-the-array-data-type-when-it-contains-multiple-dtypes

<a id='4'></a>
## Slicing and Indexing

In [None]:
# Same logic as slicing a Python list:
data = np.array(range(10))
print(data[:])
print (data[0:3])
print (data[3:])
print (data[::-2])
print("")

x = np.array(range(18))
print ('x:',x)
print (x[5:15:2]) # [start : end : step_size]
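A few more slicing cases may be worth trying, shown here as a short sketch. It relies only on standard Python slice rules: negative indices count from the end, and the end index is excluded.

```python
import numpy as np

x = np.arange(18)

print(x[-1])        # last element: 17
print(x[-3:])       # last three elements: [15 16 17]
print(x[5:15:2])    # the end index 15 is excluded: [ 5  7  9 11 13]
print(x[::-1][:3])  # reversed, then the first three: [17 16 15]
```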

<a id='5'></a>
## Arrays vs. Lists

### Structure
A NumPy array holds a pointer to a single block of memory that stores every element of the array. A Python list, on the other hand, holds pointers to many separate objects in memory.
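One rough way to see this difference is to compare memory use. The sketch below is an estimate only: `sys.getsizeof` reports the size of the list's pointer table, so the size of each integer object is added separately, and the exact numbers depend on the Python build.

```python
import sys
import numpy as np

values = list(range(1000))
arr = np.array(values, dtype=np.int64)

# The array keeps its elements in one block: 1000 elements * 8 bytes each
print(arr.nbytes)  # 8000

# The list stores references, and each int is a separate Python object
list_bytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)
print(list_bytes)  # platform dependent, usually several times larger
```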

In [None]:
# Slicing returns a view in NumPy,
# and not a copy as is the case with Python lists
data = np.array(range(10))
view = data[0:3]
print(view)

l = list(range(10))
copy = l[0:3]
print(copy)

copy[0] = 99
view[0] = 99
print(copy)
print(view)

print('Python list:',l) # has not changed
print('NumPy array:',data) # has changed

# Creating copies of the array instead of views
data = np.array(range(10))
arr_copy = data[:3].copy()
print('Array copy',arr_copy)

arr_copy[0] = 555
print('New array copy',arr_copy)
print('Original array',data) # unchanged, because arr_copy is a copy and not a view

# same thing with assignment: it's not a copy, it's the same data
x = np.array(range(20))
print (x)
y = x

y[:]=0
print(x)

x is y

In [None]:
x = [1,2]
y = [1,2]
x is y
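Rather than modifying an element to find out whether two arrays share data, NumPy can be asked directly. This is a minimal sketch using `np.shares_memory` and the `base` attribute; the variable names are illustrative.

```python
import numpy as np

data = np.arange(10)
view = data[0:3]            # slice: a view onto the same memory
copied = data[0:3].copy()   # explicit copy: separate memory

print(np.shares_memory(data, view))    # True
print(np.shares_memory(data, copied))  # False
print(view.base is data)               # True, the view points back to data
```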

### Speed
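For element-wise numerical work, arrays are much faster than lists because the loop over the elements runs in compiled code instead of in the Python interpreter.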

In [None]:
# Arrays are faster and more efficient than lists
x = list(range(100000))

# Say that we want to square all elements
y = [i**2 for i in x]
print (y[0:10])

# Time the operation with the IPython %timeit magic command
print('Time for Python lists:')
list_time = %timeit -o -n 20 [i**2 for i in x]

In [None]:
z = np.array(x)
w = z**2
print(w[:10])

print('Time for NumPy arrays:')
np_time = %timeit -o -n 20 z**2

In [None]:
print('NumPy is ' + str(list_time.all_runs[0]//np_time.all_runs[0]) + \
      ' times faster than lists at squaring 100,000 elements.')
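The `%timeit` magic only works inside IPython or Jupyter. As a rough alternative for a plain Python script, the same comparison can be sketched with the standard library `timeit` module. The measured times vary by machine, so no particular speed-up is claimed here.

```python
import timeit
import numpy as np

x = list(range(100_000))
z = np.array(x)

# Run each operation 20 times and record the total time in seconds
list_seconds = timeit.timeit(lambda: [i**2 for i in x], number=20)
numpy_seconds = timeit.timeit(lambda: z**2, number=20)

print(f"lists: {list_seconds:.4f} s, NumPy: {numpy_seconds:.4f} s")
print(f"speed-up: about {list_seconds / numpy_seconds:.0f}x")
```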

### Operational Differences
A universal function (ufunc) is a function that is applied to an ndarray in an element-by-element fashion. Several universal functions can be found in the NumPy documentation here: https://docs.scipy.org/doc/numpy-1.13.0/reference/ufuncs.html
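As a brief illustration of element-by-element application, the sketch below calls two ufuncs, `np.sqrt` and `np.maximum`, on a small example array (the values are chosen only for illustration).

```python
import numpy as np

values = np.array([1.0, 4.0, 9.0, 16.0])

print(np.sqrt(values))                # [1. 2. 3. 4.]
print(np.maximum(values, 5.0))        # [ 5.  5.  9. 16.]
print(isinstance(np.sqrt, np.ufunc))  # True
```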

In [None]:
#Addition

# x and y are lists
x = list(range(5))
y = list(range(5,10))
print ("list x = ", x)
print ("list y = ", y)
print ("x + y = ", x+y)
print('\n')

# xn and yn are numpy arrays:
xn = np.array(x)
yn = np.array(y)
print ('np.array xn =', xn)
print ('np.array yn =', yn)
print ("xn + yn = ", xn + yn)

# An array is a sequence that can be manipulated easily
# An arithmetic operation is applied to each element individually
# When two arrays are added, their shapes must match
# (or be compatible, in which case NumPy broadcasts them)

# can also use a function
np.add(xn,yn)

In [None]:
#Multiplication
# python lists
print (3* x)

# Numpy Arrays
print (3 * xn)
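Broadcasting, mentioned in the comments above, is what lets arrays of different but compatible shapes be combined. The following sketch uses two hypothetical arrays, `row` and `column`, to show one shape pair that broadcasts and one that does not.

```python
import numpy as np

row = np.array([1, 2, 3])        # shape (3,)
column = np.array([[10], [20]])  # shape (2, 1)

print(row + 100)               # the scalar is applied to each element: [101 102 103]
print((row + column).shape)    # (2, 3)
print(row + column)

try:
    row + np.array([1, 2])     # shapes (3,) and (2,) cannot be broadcast together
except ValueError as err:
    print("ValueError:", err)
```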

## -------------PRACTICE-------------
1.  Create two arrays from array a, one with 4/5 of the elements and the other with the remaining 1/5, without modifying the original array a.

In [None]:
a = np.random.normal(0,1,1000)

#Your code below

2. Take every 50th element of the last 400 elements of a, store the result in a variable called "g", and print it out.

In [None]:
g = ...

3. Create an array called food that contains your top 5 favorite foods.

In [None]:
food = ...

4. Multiply every element in the array "g" by 10.  Then try multiplying every element in the array "food" by 10 as well... does this error out?  If so, why?

<a id='6'></a>
## Multidimensional Arrays
Multidimensional arrays let us create the matrices that are often used under the hood of many machine learning packages.

In [None]:
# This list has two dimensions
list3 = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
list3 # nested list

In [None]:
# Better print format in np.array
data = np.array(list3)
data

In [None]:
print('Dimensions:',data.ndim)
print ('Shape:',data.shape)
print('Size:', data.size)
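With more than one axis, an index or slice can be given per axis, separated by a comma. Here is a minimal sketch on the same 3 x 3 values, with the row index first and the column index second.

```python
import numpy as np

data = np.array([[1, 2, 3],
                 [4, 5, 6],
                 [7, 8, 9]])

print(data[0, 2])    # row 0, column 2: 3
print(data[1])       # the whole second row: [4 5 6]
print(data[:, 0])    # the whole first column: [1 4 7]
print(data[:2, 1:])  # first two rows, last two columns
```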

In [None]:
# You can also transpose an array (matrix) with either np.transpose(arr)
# or arr.T
data.T

In [None]:
np.transpose(data)

In [None]:
# Returns a new array of the specified shape, filled with zeros.
# Notice the shape is given as (number of rows, number of columns)
print('zeros')
print(np.zeros((2,5), dtype=np.int8))
print("")

# Returns a new array of the specified shape, filled with ones.
# (np.float64 is used here because np.float128 is not available on every platform)
print('ones')
print(np.ones((4,2), dtype=np.float64))
print("")

# Returns the square identity matrix of the specified size (here 5 x 5)
# Identity matrix link: https://www.studypug.com/algebra-help/identity-matrix
print('identity')
print(np.eye(5))

<a id='7'></a>
## Masks
Masks allow us to filter arrays quickly and efficiently based on some criteria.

In [None]:
arr = np.arange(10)
print(arr)
mask = arr>5
print(mask)

# Notice there are two ways to do this, which is important to remember
print(arr[mask])
print(arr[arr>5])

In [None]:
# Functions any / all
np.any( arr==8 )

In [None]:
np.all( arr>3 )
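Masks can also be combined and used for assignment. The sketch below assumes the same `arr = np.arange(10)` as above; `middle` and `clipped` are illustrative names.

```python
import numpy as np

arr = np.arange(10)

# Combine conditions with & (and), | (or), ~ (not); the parentheses are required
middle = arr[(arr > 2) & (arr < 7)]
print(middle)            # [3 4 5 6]

# A mask can also select which elements to overwrite
clipped = arr.copy()
clipped[clipped > 5] = 5
print(clipped)           # [0 1 2 3 4 5 5 5 5 5]

# True counts as 1, so summing a mask counts the matches
print((arr > 5).sum())   # 4
```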

<a id='8'></a>
## Linear Algebra and Matrix Operations

<a href="https://minireference.com/static/tutorials/linear_algebra_in_4_pages.pdf">Linear Algebra Basic Overview</a>

In [None]:
a = np.arange(0, 15)
print(a)
print('')

# Reshape the array to a 5 x 3 matrix
print('reshaped:')
a=a.reshape((5, 3))
print(a)

# We can easily find the sum, min, max, etc.
print()
print ('Sum:',a.sum())
print('Min:', a.min())
print('Max:', a.max())

print ('Column sum:',a.sum(axis=0))
print ('Row sum:',a.sum(axis=1))

In [None]:
a = np.arange(11, 21)
b = np.arange(0, 10)
print ("a = ",a)
print ("b = ",b)

print ('Vector dot product')
print (a.dot(b))

print ('Matrix multiplication')
c = np.arange(1,5).reshape(2,2)
print ("c = \n", c)
print()
d = np.arange(5,9).reshape(2,2)
print ("d = \n", d)
print()
print (np.matmul(d,c))
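It is easy to confuse matrix multiplication with the element-wise `*` shown earlier. The following short sketch contrasts the two on the same `c` and `d`, using the `@` operator as shorthand for `np.matmul`.

```python
import numpy as np

c = np.arange(1, 5).reshape(2, 2)   # [[1 2], [3 4]]
d = np.arange(5, 9).reshape(2, 2)   # [[5 6], [7 8]]

print(d * c)   # element-wise product: [[ 5 12], [21 32]]
print(d @ c)   # matrix product, same as np.matmul(d, c): [[23 34], [31 46]]
print(c @ d)   # order matters for matrix products: [[19 22], [43 50]]
```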

## -------------PRACTICE-------------
1. Create a 6 x 4 matrix filled with random numbers (created via numpy's random function).

2. Transpose the matrix you created above and print out its size, dimensions and shape.

3. Print out the maximum of the 2nd row and the sum of the 3rd column.

4. Create an identity matrix that has 3 rows and 3 columns.  Then print out the identity matrix multiplied by "g".  Also print out the matrix "g" multiplied by itself.

In [None]:
g = np.array([[1,2,3],[4,5,6],[7,8,9]])

#Your code below

<a id='9'></a>
# Weekly Readings/Videos

https://www.sciencenewsforstudents.org/article/explainer-what-is-an-algorithm

https://www.wired.com/2014/11/algorithms-great-can-also-ruin-lives/

<a id='10'></a>
# Extra Practice