# NumPy 

NumPy (often written Numpy) is a linear algebra library for Python. It is so important for data science with Python because almost all of the libraries in the PyData ecosystem rely on NumPy as one of their main building blocks.

NumPy is also incredibly fast, as it has bindings to C libraries. For more on why you would want to use arrays instead of lists, check out this [StackOverflow post](http://stackoverflow.com/questions/993984/why-numpy-instead-of-python-lists).


## Importing NumPy

To use NumPy, we first have to import it.

In [115]:
import numpy as np

We will begin by learning how to create NumPy arrays.

## Creating NumPy Arrays

### From a Python List

We can create an array by directly converting a list or list of lists:

In [3]:
my_list = [1,2,3]
my_list

[1, 2, 3]

In [4]:
np.array(my_list)

array([1, 2, 3])

In [5]:
my_matrix = [[1,2,3],[4,5,6],[7,8,9]]
my_matrix

[[1, 2, 3], [4, 5, 6], [7, 8, 9]]

In [6]:
np.array(my_matrix)

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])
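As a small sketch of what the conversion produces, the snippet below inspects the resulting array. The variable names `matrix_arr` and `mixed_arr` are illustrative and not part of the original notebook.

```python
import numpy as np

my_matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
matrix_arr = np.array(my_matrix)

print(matrix_arr.ndim)   # number of dimensions: 2
print(matrix_arr.shape)  # (rows, columns): (3, 3)

# Mixing ints and floats in one list upcasts every element to float
mixed_arr = np.array([1, 2.5, 3])
print(mixed_arr.dtype)   # float64
```

A list of lists becomes a 2D array only when every inner list has the same length.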

## Built-in Methods

There are lots of built-in ways to generate arrays.

### Zeros and Ones
Generate arrays of zeros and ones.

In [7]:
np.zeros(3)

array([ 0.,  0.,  0.])

In [8]:
np.zeros((3,3))

array([[ 0.,  0.,  0.],
       [ 0.,  0.,  0.],
       [ 0.,  0.,  0.]])

In [9]:
np.ones(3)

array([ 1.,  1.,  1.])

In [10]:
np.ones((3,3))

array([[ 1.,  1.,  1.],
       [ 1.,  1.,  1.],
       [ 1.,  1.,  1.]])
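Both functions return floats by default, as the trailing dots in the outputs show. The sketch below uses the `dtype` argument to request integers, and `np.full` as one way to fill an array with any other constant. The names `int_zeros` and `sevens` are illustrative.

```python
import numpy as np

# Ask for integer zeros instead of the default float64
int_zeros = np.zeros((2, 3), dtype=int)
print(int_zeros)

# np.full fills an array of the given shape with an arbitrary constant
sevens = np.full((2, 2), 7)
print(sevens)
```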

### Eye

Creates an identity matrix

In [11]:
np.eye(3)

array([[ 1.,  0.,  0.],
       [ 0.,  1.,  0.],
       [ 0.,  0.,  1.]])
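To illustrate why this is called an identity matrix, the sketch below multiplies an arbitrary 3x3 matrix by `np.eye(3)` and checks that the matrix comes back unchanged. The name `m` is an illustrative example matrix.

```python
import numpy as np

m = np.arange(9).reshape(3, 3)
identity = np.eye(3)

# Matrix multiplication (the @ operator) by the identity leaves m unchanged
product = m @ identity
print(np.array_equal(product, m))  # True
```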

<hr>

### arange

Returns evenly spaced values within a given interval

In [12]:
np.arange(10)

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [13]:
np.arange(10,20)

array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19])

In [14]:
np.arange(10,20,2)

array([10, 12, 14, 16, 18])

### linspace

Return evenly spaced numbers over a specified interval.

In [16]:
np.linspace(10,20,10)

array([ 10.        ,  11.11111111,  12.22222222,  13.33333333,
        14.44444444,  15.55555556,  16.66666667,  17.77777778,
        18.88888889,  20.        ])

In [17]:
np.linspace(10,20,11)

array([ 10.,  11.,  12.,  13.,  14.,  15.,  16.,  17.,  18.,  19.,  20.])
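The practical difference between the two is worth a quick sketch: `arange` takes a step size and excludes the stop value, while `linspace` takes a number of points and includes the stop value by default. The variable names below are illustrative.

```python
import numpy as np

# arange: step of 0.25, stop value 1 is excluded
by_step = np.arange(0, 1, 0.25)
print(by_step)   # [0.   0.25 0.5  0.75]

# linspace: 5 points, stop value 1 is included
by_count = np.linspace(0, 1, 5)
print(by_count)  # [0.   0.25 0.5  0.75 1.  ]

# retstep=True also returns the spacing that linspace computed
samples, step = np.linspace(0, 1, 5, retstep=True)
print(step)      # 0.25
```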

## Random array generation

Numpy has a lot of functions to create random number arrays.

### rand
Create an array of the given shape and populate it with random samples from a uniform distribution over [0, 1).

In [19]:
np.random.rand(3)

array([ 0.52341181,  0.60087912,  0.88432446])

In [20]:
np.random.rand(3,3)

array([[ 0.79966681,  0.37276689,  0.6258196 ],
       [ 0.64121614,  0.61548034,  0.97714675],
       [ 0.18470481,  0.53312486,  0.96606184]])

### randn

Return a sample (or samples) from the "standard normal" distribution (mean 0, standard deviation 1), unlike `rand`, which samples from a uniform distribution:

In [21]:
np.random.randn(3)

array([-0.22948105,  1.00888744,  0.65226022])

In [22]:
np.random.randn(3,3)

array([[-0.29905177, -1.87302396,  1.05193144],
       [ 0.82254898, -0.0410661 , -0.73927155],
       [-0.74423391,  0.81125992, -0.99031573]])

### randint

Return random integers from `low` (inclusive) to `high` (exclusive).

In [27]:
np.random.randint(10)

9

In [29]:
np.random.randint(10, 15)

12

In [30]:
np.random.randint(10,15,5)

array([10, 13, 12, 11, 10])
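The values shown above will differ on every run. If reproducible results are needed, one option is to fix the seed first. The sketch below shows this with the legacy `np.random.seed` function, which matches the functions used here, and with the newer `np.random.default_rng` generator. The seed values 101 and 42 are arbitrary choices.

```python
import numpy as np

# Legacy global seed: the same seed gives the same sequence
np.random.seed(101)
first_run = np.random.rand(3)
np.random.seed(101)
second_run = np.random.rand(3)
print(np.array_equal(first_run, second_run))  # True

# Newer Generator API: each generator carries its own state
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)
ints_a = rng_a.integers(10, 15, size=5)  # low inclusive, high exclusive
ints_b = rng_b.integers(10, 15, size=5)
print(np.array_equal(ints_a, ints_b))    # True
```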

## Methods and Attributes

Let's discuss some methods and attributes of the NumPy array object.

### max(), min(), argmax(), argmin()

These methods find the maximum and minimum values of an array; `argmax` and `argmin` return the index positions of those values.

In [32]:
a = np.random.randint(10,100,8)

In [33]:
a

array([65, 48, 59, 71, 21, 71, 51, 46])

In [34]:
a.min()

21

In [35]:
a.argmin()

4

In [36]:
a.max()

71

In [37]:
a.argmax()

3
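Note that 71 appears twice in this array (at indices 3 and 5), and `argmax` reports only the first occurrence. The sketch below rebuilds the same values by hand and uses `np.flatnonzero` as one way to find every position of the maximum.

```python
import numpy as np

# Same values as the random array shown above
a = np.array([65, 48, 59, 71, 21, 71, 51, 46])

first_max_index = a.argmax()
print(first_max_index)  # 3

# Indices of every element equal to the maximum
all_max_indices = np.flatnonzero(a == a.max())
print(all_max_indices)  # [3 5]
```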

### dtype

You can also grab the data type of the elements in the array:

In [38]:
a.dtype

dtype('int32')
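The default integer type depends on the platform and NumPy version, so you may see `int64` instead of `int32`. When a specific type is needed, `astype` returns a converted copy. The sketch below is illustrative; note that converting floats to integers truncates toward zero rather than rounding.

```python
import numpy as np

a = np.array([65, 48, 59])

# astype returns a new array; the original keeps its dtype
as_float = a.astype(np.float64)
print(as_float.dtype)  # float64

# Float to int conversion drops the fractional part
truncated = np.array([1.9, -1.9]).astype(int)
print(truncated)       # [ 1 -1]
```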

### shape

In [40]:
a.shape

(8,)

In [41]:
a.reshape(1,8)

array([[65, 48, 59, 71, 21, 71, 51, 46]])

In [43]:
a.reshape(1,8).shape

(1, 8)

In [44]:
a.reshape(8,1)

array([[65],
       [48],
       [59],
       [71],
       [21],
       [71],
       [51],
       [46]])

In [45]:
a.reshape(8,1).shape

(8, 1)
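When reshaping, one dimension can be given as `-1` and NumPy infers it from the array's size. The sketch below also shows that a shape whose total size does not match raises a `ValueError`. The variable names are illustrative.

```python
import numpy as np

a = np.arange(8)

column = a.reshape(-1, 1)  # 8 rows inferred
print(column.shape)        # (8, 1)

grid = a.reshape(2, -1)    # 4 columns inferred
print(grid.shape)          # (2, 4)

# 8 elements cannot fill a 3x3 array
try:
    a.reshape(3, 3)
    reshape_failed = False
except ValueError:
    reshape_failed = True
print(reshape_failed)      # True
```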

## NumPy Indexing and Slicing

Here we will learn how to select an element or a group of elements from a NumPy array.

In [46]:
arr = np.arange(10)

In [47]:
arr

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [49]:
arr[5]

5

In [50]:
arr[4:9]

array([4, 5, 6, 7, 8])

In [51]:
arr[4:9:2]

array([4, 6, 8])
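The `start:stop:step` syntax follows the same rules as Python lists, so omitted bounds and negative values work too. A brief sketch with illustrative variable names:

```python
import numpy as np

arr = np.arange(10)

last = arr[-1]            # negative indices count from the end
first_three = arr[:3]     # omitted start means "from the beginning"
last_three = arr[-3:]     # omitted stop means "to the end"
reversed_arr = arr[::-1]  # negative step walks backwards

print(last)          # 9
print(first_three)   # [0 1 2]
print(last_three)    # [7 8 9]
print(reversed_arr)  # [9 8 7 6 5 4 3 2 1 0]
```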

### 2D-array

In [54]:
arr2 = np.arange(100).reshape(10,10)

In [55]:
arr2

array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
       [40, 41, 42, 43, 44, 45, 46, 47, 48, 49],
       [50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
       [60, 61, 62, 63, 64, 65, 66, 67, 68, 69],
       [70, 71, 72, 73, 74, 75, 76, 77, 78, 79],
       [80, 81, 82, 83, 84, 85, 86, 87, 88, 89],
       [90, 91, 92, 93, 94, 95, 96, 97, 98, 99]])

There are two ways to select a single element from a 2D array:<br>
1. **arr2[row][col]**<br>
2. **arr2[row,col]**

In [56]:
arr2[1]

array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19])

In [57]:
arr2[1][3]

13

In [59]:
arr2[1,3]

13

The second approach (comma-separated) is highly recommended: it resolves the element in a single indexing operation instead of first building an intermediate row array, and it is less error-prone when slicing.

In [62]:
arr2[1:4]

array([[10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34, 35, 36, 37, 38, 39]])

In [64]:
arr2[1:4][0]  # one might expect [10, 20, 30], but this returns the first row of the slice

In [66]:
arr2[1:4,0]
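The two expressions above give different results. The sketch below runs both on the same `arr2`: chaining applies `[0]` to the three-row slice and so picks its first row, while the comma form picks column 0 from rows 1 to 3.

```python
import numpy as np

arr2 = np.arange(100).reshape(10, 10)

# Chained: slice rows 1..3 first, then take row 0 of that result
chained = arr2[1:4][0]
print(chained)  # [10 11 12 13 14 15 16 17 18 19]

# Comma form: rows 1..3 and column 0 in one operation
comma = arr2[1:4, 0]
print(comma)    # [10 20 30]
```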

### Accessing elements by passing list

In [116]:
arr2

array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
       [40, 41, 42, 43, 44, 45, 46, 47, 48, 49],
       [50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
       [60, 61, 62, 63, 64, 65, 66, 67, 68, 69],
       [70, 71, 72, 73, 74, 75, 76, 77, 78, 79],
       [80, 81, 82, 83, 84, 85, 86, 87, 88, 89],
       [90, 91, 92, 93, 94, 95, 96, 97, 98, 99]])

In [117]:
arr2[[5,2,3]]  # indices need not be sorted; rows are returned in the order listed

array([[50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
       [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34, 35, 36, 37, 38, 39]])

In [69]:
arr2[[5,2,3],[1,2,3]]

array([51, 22, 33])
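Passing two lists pairs them up element by element, so the result above holds the elements at (5,1), (2,2) and (3,3). If the goal is instead the full block formed by those rows and those columns, `np.ix_` is one way to get it, as sketched below with illustrative variable names.

```python
import numpy as np

arr2 = np.arange(100).reshape(10, 10)

# Paired indices: one element per (row, col) pair
pairs = arr2[[5, 2, 3], [1, 2, 3]]
print(pairs)  # [51 22 33]

# np.ix_ builds an open mesh: every listed row with every listed column
block = arr2[np.ix_([5, 2, 3], [1, 2, 3])]
print(block)
# [[51 52 53]
#  [21 22 23]
#  [31 32 33]]
```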

Again, chained \[row\]\[col\] indexing does not work here, because the second index is applied to the result of the first.

The important takeaway: use **\[row\]\[col\] notation** only when you want to access a single element of the array. When slicing or indexing with lists, it is highly recommended to use **[row, col] notation**.

## Broadcasting

Things start getting interesting here. NumPy arrays have a great advantage over normal Python lists because of their ability to broadcast: a single value can be applied to every element of an array or slice at once.

In [70]:
arr

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [73]:
arr[2:6]

array([2, 3, 4, 5])

In [80]:
arr26 = arr[2:6]

In [82]:
arr26[:]

array([2, 3, 4, 5])

In [83]:
arr26[:] = 100
arr26

array([100, 100, 100, 100])

In [84]:
arr

array([  0,   1, 100, 100, 100, 100,   6,   7,   8,   9])

Changes are reflected in the original array as well. NumPy does this to save memory: slicing an array returns a view of the same data, not a new array holding the selected values.

In [85]:
arr37 = arr[3:7].copy()

Use the **.copy()** method to explicitly request a copy of the sliced array.

In [86]:
arr37

array([100, 100, 100,   6])

In [87]:
arr37[:] = 99

In [88]:
arr37

array([99, 99, 99, 99])

In [89]:
arr

array([  0,   1, 100, 100, 100, 100,   6,   7,   8,   9])
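To check whether a given array is a view or an independent copy, `np.shares_memory` and the `.base` attribute are two options. The sketch below repeats the slices used above on a fresh array; the names `view` and `independent` are illustrative.

```python
import numpy as np

arr = np.arange(10)
view = arr[2:6]                # slice: a view onto arr's data
independent = arr[3:7].copy()  # explicit copy: its own data

print(np.shares_memory(arr, view))         # True
print(np.shares_memory(arr, independent))  # False
print(view.base is arr)                    # True: the view points back at arr

# Writing through the view changes arr; writing to the copy does not
view[:] = 100
independent[:] = 99
print(arr)  # [  0   1 100 100 100 100   6   7   8   9]
```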

There are many other useful things you can do with broadcasting, such as applying comparison operators to every element:

In [90]:
arr 

array([  0,   1, 100, 100, 100, 100,   6,   7,   8,   9])

In [91]:
arr > 80

array([False, False,  True,  True,  True,  True, False, False, False, False], dtype=bool)

In [92]:
arr == 100

array([False, False,  True,  True,  True,  True, False, False, False, False], dtype=bool)

In [93]:
plus80_arr = arr>80

In [94]:
plus80_arr

array([False, False,  True,  True,  True,  True, False, False, False, False], dtype=bool)

In [95]:
arr[plus80_arr]

array([100, 100, 100, 100])
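Broadcasting is not limited to scalars. As a small sketch with illustrative arrays, a 1D row or a single column can also be stretched to match a 2D array, as long as the shapes are compatible (each dimension is equal or one of them is 1).

```python
import numpy as np

grid = np.arange(6).reshape(2, 3)  # shape (2, 3): [[0 1 2], [3 4 5]]
row = np.array([10, 20, 30])       # shape (3,)
col = np.array([[100], [200]])     # shape (2, 1)

# row is added to every row of grid
plus_row = grid + row
print(plus_row)
# [[10 21 32]
#  [13 24 35]]

# col is added to every column of grid
plus_col = grid + col
print(plus_col)
# [[100 101 102]
#  [203 204 205]]
```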

### Boolean masking or Indexing

In [96]:
arr

array([  0,   1, 100, 100, 100, 100,   6,   7,   8,   9])

In [97]:
arr > 7

array([False, False,  True,  True,  True,  True, False, False,  True,  True], dtype=bool)

In [98]:
arr[arr>7]

array([100, 100, 100, 100,   8,   9])
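Conditions can be combined with `&` (and) and `|` (or), with each condition wrapped in parentheses. Python's `and`/`or` keywords do not work element by element. The sketch below rebuilds the array shown above by hand and also uses `np.where` to replace the selected values. The names `between` and `capped` are illustrative.

```python
import numpy as np

# Same values as arr above
arr = np.array([0, 1, 100, 100, 100, 100, 6, 7, 8, 9])

# Keep values strictly between 5 and 50
between = arr[(arr > 5) & (arr < 50)]
print(between)  # [6 7 8 9]

# np.where(condition, value_if_true, value_if_false) returns a new array
capped = np.where(arr > 80, 80, arr)
print(capped)   # [ 0  1 80 80 80 80  6  7  8  9]
```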

## Arithmetic with NumPy arrays

In [100]:
arr = np.arange(10)

In [101]:
arr

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [102]:
arr+arr

array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18])

In [103]:
arr - arr

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

In [104]:
arr * arr

array([ 0,  1,  4,  9, 16, 25, 36, 49, 64, 81])

In [105]:
arr / arr

array([ nan,   1.,   1.,   1.,   1.,   1.,   1.,   1.,   1.,   1.])

In [106]:
1/arr

array([        inf,  1.        ,  0.5       ,  0.33333333,  0.25      ,
        0.2       ,  0.16666667,  0.14285714,  0.125     ,  0.11111111])
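Dividing by zero does not raise an exception in NumPy. It emits a `RuntimeWarning` and fills in `nan` (for 0/0) or `inf` (for 1/0). The sketch below shows two possible ways to handle this: silencing the warnings locally with `np.errstate`, or skipping the zero entries with the `where` argument of `np.divide`. The fallback value of 0 is an arbitrary choice for illustration.

```python
import numpy as np

arr = np.arange(10)

# Suppress the warnings only inside this block
with np.errstate(divide="ignore", invalid="ignore"):
    ratio = arr / arr       # 0/0 gives nan
    reciprocal = 1 / arr    # 1/0 gives inf

print(np.isnan(ratio[0]))       # True
print(np.isinf(reciprocal[0]))  # True

# Divide only where arr is non-zero; other slots keep the value from `out`
safe = np.divide(1.0, arr, out=np.zeros(10), where=arr != 0)
print(safe[:3])  # [0.  1.  0.5]
```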

In [108]:
arr**4

array([   0,    1,   16,   81,  256,  625, 1296, 2401, 4096, 6561], dtype=int32)
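These operators behave differently on plain Python lists, which is one practical reason to convert to arrays first. A short comparison sketch with illustrative variable names:

```python
import numpy as np

py_list = [1, 2, 3]
np_arr = np.array(py_list)

# Lists: * repeats and + concatenates
list_times_two = py_list * 2
list_plus_list = py_list + py_list
print(list_times_two)  # [1, 2, 3, 1, 2, 3]
print(list_plus_list)  # [1, 2, 3, 1, 2, 3]

# Arrays: * and + work element by element
arr_times_two = np_arr * 2
arr_plus_arr = np_arr + np_arr
print(arr_times_two)   # [2 4 6]
print(arr_plus_arr)    # [2 4 6]
```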

## Universal Array Functions

NumPy comes with many [universal array functions](http://docs.scipy.org/doc/numpy/reference/ufuncs.html) (ufuncs), which are mathematical operations applied element by element across the whole array.

In [109]:
np.sqrt(arr)

array([ 0.        ,  1.        ,  1.41421356,  1.73205081,  2.        ,
        2.23606798,  2.44948974,  2.64575131,  2.82842712,  3.        ])

In [110]:
np.square(arr)

array([ 0,  1,  4,  9, 16, 25, 36, 49, 64, 81], dtype=int32)

In [111]:
np.exp(arr)

array([  1.00000000e+00,   2.71828183e+00,   7.38905610e+00,
         2.00855369e+01,   5.45981500e+01,   1.48413159e+02,
         4.03428793e+02,   1.09663316e+03,   2.98095799e+03,
         8.10308393e+03])

In [112]:
np.log(arr)

array([       -inf,  0.        ,  0.69314718,  1.09861229,  1.38629436,
        1.60943791,  1.79175947,  1.94591015,  2.07944154,  2.19722458])

In [113]:
np.sin(arr)

array([ 0.        ,  0.84147098,  0.90929743,  0.14112001, -0.7568025 ,
       -0.95892427, -0.2794155 ,  0.6569866 ,  0.98935825,  0.41211849])
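The functions above each take one array, but some ufuncs take two and combine them element by element. The sketch below uses `np.maximum` as an example, along with the `reduce` method that ufuncs such as `np.add` provide for collapsing an array to a single value. The array `other` is illustrative.

```python
import numpy as np

arr = np.arange(10)
other = arr[::-1]  # 9, 8, ..., 0

# Binary ufunc: element-wise maximum of two arrays
larger = np.maximum(arr, other)
print(larger)  # [9 8 7 6 5 5 6 7 8 9]

# reduce applies the ufunc cumulatively; for np.add this is the sum
total = np.add.reduce(arr)
print(total)   # 45

print(isinstance(np.sqrt, np.ufunc))  # True
```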

## Great work, First milestone ACHIEVED !!