# Intro to NumPy

NumPy is a Python library for fast numerical computing on arrays, including linear algebra calculations. It's also the foundation for many other libraries we'll be using in this course.

The first step to using a library is to import it:

In [1]:
import numpy as np

Once it's imported, we'll use `np.array` to create a NumPy array, which is like a Python list but with added functionality, such as element-wise math and convenience methods.

In [2]:
numbers=[5,6,43,546,3,6,25,46,36]

In [4]:
np_num=np.array(numbers)

In [5]:
np_num


array([  5,   6,  43, 546,   3,   6,  25,  46,  36])

## Convenience methods

We can use `min()` and `max()` to grab the smallest and largest numbers in the array, respectively.

In [6]:
np_num.min()

3

In [8]:
np_num.max()

546

`argmin()` and `argmax()` will give you the **index** of the smallest and largest numbers in the array, respectively.

In [10]:
np_num.argmin()  # index of the first occurrence of the smallest number in the array

4

In [12]:
np_num.argmax()  # index of the first occurrence of the largest number in the array

3
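To illustrate the "first occurrence" behaviour noted in the comments above, here is a small sketch. The array `scores` is made up for this example and is not part of the original notebook.

```python
import numpy as np

# `scores` is a hypothetical array with repeated smallest and largest values.
scores = np.array([7, 2, 9, 2, 9])

print(scores.argmin())   # 1: the first 2 sits at index 1
print(scores.argmax())   # 2: the first 9 sits at index 2

# The index can be used to look the value back up.
print(scores[scores.argmin()] == scores.min())   # True
```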

NumPy has a function for creating arrays from ranges of numbers: `np.arange`. It works like Python's `range` (the stop value is excluded), except that it returns a NumPy array.

In [15]:
np.arange(1,10)

array([1, 2, 3, 4, 5, 6, 7, 8, 9])
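Like `range`, `np.arange` also accepts an optional step. The following is a minimal sketch; the names `evens` and `countdown` are illustrative and do not appear in the original notebook.

```python
import numpy as np

# Illustrative names, not from the notebook.
evens = np.arange(0, 10, 2)       # start, stop (excluded), step
print(evens)                      # values: 0, 2, 4, 6, 8

countdown = np.arange(5, 0, -1)   # a negative step counts down
print(countdown)                  # values: 5, 4, 3, 2, 1
```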

In [17]:
np.random.choice([1,2,3],2)  # select 2 random numbers from the list, with replacement

array([1, 3])

In [None]:
np.random.choice([1,2,3,4,5],2,replace=False)  # 2 random numbers from the list, without replacement

In [26]:
np.random.choice([1,2,3,4,5,6],1,replace=False,p=[.1,.5,.1,.1,.1,.1])  # p gives one probability per item; 2 is drawn half the time

array([1])
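The `p` argument takes one probability per candidate, and the values should sum to 1. As a rough sketch of how this can be checked, the example below uses the newer `np.random.default_rng` generator so the draws are repeatable. The seed and the names `rng`, `faces`, `weights`, `draws`, and `pair` are assumptions made for illustration, not part of the original notebook.

```python
import numpy as np

# A seeded generator makes the draws repeatable (the seed 0 is arbitrary).
rng = np.random.default_rng(0)

faces = [1, 2, 3, 4, 5, 6]            # hypothetical candidates
weights = [.1, .5, .1, .1, .1, .1]    # one probability per face; they sum to 1

# Draw 1000 values with replacement; 2 should come up roughly half the time.
draws = rng.choice(faces, size=1000, p=weights)
share_of_twos = (draws == 2).mean()
print(share_of_twos)

# Without replacement, the same value is never drawn twice.
pair = rng.choice(faces, size=2, replace=False)
print(pair)
```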

# Broadcasting / Scalar Math

Exercise: Write a for loop (or a list comprehension) to add 1 to each item in your array.

In [20]:
[num+1 for num in np_num]  # returns a plain Python list

[6, 7, 44, 547, 4, 7, 26, 47, 37]

In [21]:
np.array([num+1 for num in np_num])  # wrap the list to get a NumPy array

array([  6,   7,  44, 547,   4,   7,  26,  47,  37])

With NumPy, we no longer have to write for loops to do these types of calculations. We can simply broadcast our arithmetic operations across the entire array:

In [None]:
np_num+1  # adds 1 to every item, no loop needed

We can also use broadcasting over a subset of an array:
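A minimal sketch of what that might look like, reusing the values of `np_num` from earlier. The names `first_three_plus_one` and `doubled` are illustrative only.

```python
import numpy as np

np_num = np.array([5, 6, 43, 546, 3, 6, 25, 46, 36])

# Add 1 to only the first three items; this builds a new array
# and leaves np_num itself unchanged.
first_three_plus_one = np_num[:3] + 1
print(first_three_plus_one)   # values: 6, 7, 44

# Other arithmetic operators broadcast the same way.
doubled = np_num[-2:] * 2
print(doubled)                # values: 92, 72
```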

# Matrices vs Vectors

Linear algebra (and, by extension, data science) uses vectors and matrices extensively. You can think of a **matrix** as an Excel spreadsheet: a grid of rows and columns. A **vector** is a special kind of matrix that has a single column (or row) of data.

Stated differently, matrix = many columns, vector = single column.

We can use the `reshape()` method to create a matrix from an array. The new shape must hold exactly as many items as the original array.

In [29]:
matr=np.arange(1,101).reshape(10,10)
matr

array([[  1,   2,   3,   4,   5,   6,   7,   8,   9,  10],
       [ 11,  12,  13,  14,  15,  16,  17,  18,  19,  20],
       [ 21,  22,  23,  24,  25,  26,  27,  28,  29,  30],
       [ 31,  32,  33,  34,  35,  36,  37,  38,  39,  40],
       [ 41,  42,  43,  44,  45,  46,  47,  48,  49,  50],
       [ 51,  52,  53,  54,  55,  56,  57,  58,  59,  60],
       [ 61,  62,  63,  64,  65,  66,  67,  68,  69,  70],
       [ 71,  72,  73,  74,  75,  76,  77,  78,  79,  80],
       [ 81,  82,  83,  84,  85,  86,  87,  88,  89,  90],
       [ 91,  92,  93,  94,  95,  96,  97,  98,  99, 100]])
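To make the vector/matrix distinction concrete, the sketch below inspects the `shape` and `ndim` attributes before and after reshaping. The names `vector`, `matrix`, and `column` are chosen for illustration.

```python
import numpy as np

# Illustrative names, not from the notebook.
vector = np.arange(1, 101)          # 1-D array with 100 items
matrix = vector.reshape(10, 10)     # 2-D array: 10 rows, 10 columns

print(vector.shape, vector.ndim)    # (100,) 1
print(matrix.shape, matrix.ndim)    # (10, 10) 2

# -1 asks NumPy to work out that dimension from the array's size.
column = vector.reshape(-1, 1)      # a single column
print(column.shape)                 # (100, 1)
```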

# Transposing Matrices

Transposing flips a matrix along its top-left to bottom-right diagonal, so rows become columns. We use the `T` attribute of a NumPy array to do this.

In [30]:
matr.T

array([[  1,  11,  21,  31,  41,  51,  61,  71,  81,  91],
       [  2,  12,  22,  32,  42,  52,  62,  72,  82,  92],
       [  3,  13,  23,  33,  43,  53,  63,  73,  83,  93],
       [  4,  14,  24,  34,  44,  54,  64,  74,  84,  94],
       [  5,  15,  25,  35,  45,  55,  65,  75,  85,  95],
       [  6,  16,  26,  36,  46,  56,  66,  76,  86,  96],
       [  7,  17,  27,  37,  47,  57,  67,  77,  87,  97],
       [  8,  18,  28,  38,  48,  58,  68,  78,  88,  98],
       [  9,  19,  29,  39,  49,  59,  69,  79,  89,  99],
       [ 10,  20,  30,  40,  50,  60,  70,  80,  90, 100]])
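The effect is easier to see on a non-square matrix, where the shape itself changes. This is a small sketch with made-up names (`small`, `flipped`).

```python
import numpy as np

# Illustrative names, not from the notebook.
small = np.arange(1, 7).reshape(2, 3)   # 2 rows, 3 columns
flipped = small.T                        # 3 rows, 2 columns

print(small.shape, flipped.shape)        # (2, 3) (3, 2)

# The item at row i, column j moves to row j, column i.
print(small[0, 2], flipped[2, 0])        # 3 3
```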

# Slicing

Slicing NumPy arrays is similar to slicing lists. We can get a single item by using bracket notation.

**Practice:** Create a NumPy array and grab the second item from that array.
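One possible answer to the practice item, as a sketch that reuses the values of `np_num`. The names `second` and `last` are illustrative.

```python
import numpy as np

np_num = np.array([5, 6, 43, 546, 3, 6, 25, 46, 36])

second = np_num[1]    # indexing starts at 0, so index 1 is the second item
last = np_num[-1]     # negative indices count back from the end
print(second, last)   # 6 36
```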

In [33]:
np_num[-2:]  # a negative start counts from the end: the last two items

array([46, 36])

You can also slice an array using a range of indices.

**Practice:** Grab the second through fifth items from your array.
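One possible answer to this practice item, again as a sketch on the values of `np_num`. The name `second_to_fifth` is illustrative.

```python
import numpy as np

np_num = np.array([5, 6, 43, 546, 3, 6, 25, 46, 36])

# The stop index is excluded, so 1:5 covers indices 1, 2, 3 and 4.
second_to_fifth = np_num[1:5]
print(second_to_fifth)   # values: 6, 43, 546, 3
```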

In [34]:
matr[:3,:3]  # first 3 rows, first 3 columns

array([[ 1,  2,  3],
       [11, 12, 13],
       [21, 22, 23]])

In [36]:
matr[2:6,2:6]  # rows 2-5 and columns 2-5 (the stop index 6 is excluded)

array([[23, 24, 25, 26],
       [33, 34, 35, 36],
       [43, 44, 45, 46],
       [53, 54, 55, 56]])

# Slicing Matrices

We still use bracket notation for slicing matrices, only now we separate our row slicing from column slicing with a comma.

In [38]:
matr[:,-3:]  # all rows, last 3 columns

array([[  8,   9,  10],
       [ 18,  19,  20],
       [ 28,  29,  30],
       [ 38,  39,  40],
       [ 48,  49,  50],
       [ 58,  59,  60],
       [ 68,  69,  70],
       [ 78,  79,  80],
       [ 88,  89,  90],
       [ 98,  99, 100]])

In [41]:
matr[3:7,:1]  # rows 3-6, first column; the slice :1 keeps the result 2-D

array([[31],
       [41],
       [51],
       [61]])

In [42]:
matr[3:7,1]  # rows 3-6, column 1; a plain integer index returns a 1-D array

array([32, 42, 52, 62])
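The last two cells differ in two ways: `:1` selects column 0 while `1` selects column 1, and a slice keeps the column axis while an integer drops it. A short sketch of the shape difference, with illustrative names `as_column` and `as_vector`:

```python
import numpy as np

matr = np.arange(1, 101).reshape(10, 10)

as_column = matr[3:7, :1]   # slice on the column axis: result stays 2-D
as_vector = matr[3:7, 1]    # integer on the column axis: result is 1-D

print(as_column.shape)      # (4, 1)
print(as_vector.shape)      # (4,)
```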

# Boolean Selection

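A boolean expression on a NumPy array is evaluated item by item, producing a mask of True/False values. Indexing the array with that mask keeps only the items where the mask is True: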

In [43]:
np_num[np_num%2==0]

array([  6, 546,   6,  46,  36])

In [44]:
np_num%2==0

array([False,  True, False,  True, False,  True, False,  True,  True], dtype=bool)

In [45]:
matr[matr%2==0]  # a mask on a 2-D array returns the matching items as a flat 1-D array

array([  2,   4,   6,   8,  10,  12,  14,  16,  18,  20,  22,  24,  26,
        28,  30,  32,  34,  36,  38,  40,  42,  44,  46,  48,  50,  52,
        54,  56,  58,  60,  62,  64,  66,  68,  70,  72,  74,  76,  78,
        80,  82,  84,  86,  88,  90,  92,  94,  96,  98, 100])
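Masks can also be combined. The sketch below is an illustration on the values of `np_num`; the names `even_and_large` and `even_count` are made up for this example.

```python
import numpy as np

np_num = np.array([5, 6, 43, 546, 3, 6, 25, 46, 36])

# Combine conditions with & (and) or | (or); each condition needs parentheses.
even_and_large = np_num[(np_num % 2 == 0) & (np_num > 10)]
print(even_and_large)    # values: 546, 46, 36

# Summing a mask counts the matches, because True counts as 1.
even_count = (np_num % 2 == 0).sum()
print(even_count)        # 5
```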

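# Views vs Copy

Slicing a NumPy array returns a view onto the original data, not a copy. Note the `In [n]` counters in this section: the `subset+=1` and `matr` cells (In [52] and In [53]) were run before In [55], so the output of In [55] already includes the added 1.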

In [55]:
subset=matr[:3,:3]
subset
# subset is a view into the same matrix, not a separate array
# changes made to subset will also change the original matrix

array([[ 2,  3,  4],
       [12, 13, 14],
       [22, 23, 24]])

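In [56]:
subset=matr[:3,:3].copy()
subset

# .copy() (note the parentheses) returns an independent array;
# changes made to the copy do not affect the original matrix

array([[ 2,  3,  4],
       [12, 13, 14],
       [22, 23, 24]])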

In [52]:
subset+=1  # in-place add; when this cell ran, subset was the view matr[:3,:3], so matr changed too
subset

array([[ 2,  3,  4],
       [12, 13, 14],
       [22, 23, 24]])

In [53]:
matr

array([[  2,   3,   4,   4,   5,   6,   7,   8,   9,  10],
       [ 12,  13,  14,  14,  15,  16,  17,  18,  19,  20],
       [ 22,  23,  24,  24,  25,  26,  27,  28,  29,  30],
       [ 31,  32,  33,  34,  35,  36,  37,  38,  39,  40],
       [ 41,  42,  43,  44,  45,  46,  47,  48,  49,  50],
       [ 51,  52,  53,  54,  55,  56,  57,  58,  59,  60],
       [ 61,  62,  63,  64,  65,  66,  67,  68,  69,  70],
       [ 71,  72,  73,  74,  75,  76,  77,  78,  79,  80],
       [ 81,  82,  83,  84,  85,  86,  87,  88,  89,  90],
       [ 91,  92,  93,  94,  95,  96,  97,  98,  99, 100]])
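Because the cells above were run out of order, a self-contained sketch may help. It starts from a fresh matrix and uses the illustrative names `view` and `independent`, which are not part of the original notebook.

```python
import numpy as np

matr = np.arange(1, 101).reshape(10, 10)

view = matr[:3, :3]                 # a view: shares memory with matr
independent = matr[:3, :3].copy()   # .copy() creates a separate array

independent += 100                  # only the copy changes
print(matr[0, 0])                   # 1

view += 1                           # the original matrix changes as well
print(matr[0, 0])                   # 2

# np.shares_memory reports whether two arrays overlap in memory.
print(np.shares_memory(view, matr), np.shares_memory(independent, matr))   # True False
```

If the original matrix must stay untouched, taking a copy first is the safer choice; a view is cheaper when the change is meant to reach the original.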

In [79]:
#bingo card
bingo=np.array([[15, 18, 32, 59, 74],
       [ 3, 24, 37, 50, 68],
       [11, 22, 34, 47, 70],
       [ 2, 17, 43, 56, 69],
       [ 4, 28, 41, 46, 73]])

In [80]:
bingo

array([[15, 18, 32, 59, 74],
       [ 3, 24, 37, 50, 68],
       [11, 22, 34, 47, 70],
       [ 2, 17, 43, 56, 69],
       [ 4, 28, 41, 46, 73]])

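In [3]:
# the numbers 1-75 as 5 rows of 15; each row holds the range for one bingo column
bingo_card=np.arange(1,76).reshape(5,15)

In [4]:
bingo_card

array([[ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15],
       [16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30],
       [31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45],
       [46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60],
       [61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75]])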

In [6]:
import random
import numpy as np

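In [None]:
# random_sample expects an output shape, not a pool of values; choice draws from a pool
np.random.choice(np.arange(1,16),5,replace=False)  # 5 distinct numbers from 1-15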

In [1]:
subset=bingo_card[:,:5]  # all rows, first 5 columns

In [157]:
subset

array([[ 1,  2,  3,  4,  5],
       [16, 17, 18, 19, 20],
       [31, 32, 33, 34, 35],
       [46, 47, 48, 49, 50],
       [61, 62, 63, 64, 65]])

In [88]:
np.random.choice([1,2,3,4,5],2,replace=False)


array([2, 3])

In [142]:
# each bingo column draws 5 distinct numbers from its own range of 15
row1 = np.random.choice(range(1,16), size=5, replace=False)
row2 = np.random.choice(range(16,31), size=5, replace=False)
row3 = np.random.choice(range(31,46), size=5, replace=False)
row4 = np.random.choice(range(46,61), size=5, replace=False)
row5 = np.random.choice(range(61,76), size=5, replace=False)
# the five draws are stacked as rows, then transposed so each one becomes a column
card = np.array([row1,row2,row3,row4,row5]).T
card

array([[ 1, 28, 32, 60, 72],
       [ 8, 19, 33, 55, 69],
       [10, 18, 36, 49, 66],
       [14, 25, 41, 54, 67],
       [ 4, 23, 34, 50, 63]])
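The five near-identical lines above could be folded into a loop. The following is one possible sketch, not the notebook's own method: the function name `make_bingo_card`, the seed, and the use of `np.random.default_rng` and `np.column_stack` are choices made for this illustration.

```python
import numpy as np

def make_bingo_card(rng):
    """Return a 5x5 card; `make_bingo_card` is a hypothetical helper name."""
    # Column k draws 5 distinct numbers from its own block of 15:
    # 1-15, 16-30, 31-45, 46-60, 61-75.
    columns = [
        rng.choice(np.arange(start, start + 15), size=5, replace=False)
        for start in range(1, 76, 15)
    ]
    # Place the five 1-D draws side by side as columns.
    return np.column_stack(columns)

# The seed 42 is arbitrary; it only makes the card repeatable.
card = make_bingo_card(np.random.default_rng(42))
print(card)
```

Building the columns directly with `np.column_stack` avoids the transpose step, though stacking rows and using `.T` as above gives the same layout.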