# Probability and Statistics Fall 1400
## Python Tutorial - Part 2_NumPy
* Prepared by : Shayan Vassef
* Email : sh.vassef@ut.ac.ir

![Python](Python-Tutorial.png)

# Let's get started !


**This notebook will just go through the following topics in order:**

* NumPy
    * Built-in Methods
    * Random
    * Array Attributes and Methods
    * Indexing and Selection
    * Broadcasting
    * Conditional Selection
    * Operations
    * Functions
* pandas
* Matplotlib
* Seaborn


# NumPy 

NumPy is important because almost all of the libraries in the PyData ecosystem (pandas, SciPy, scikit-learn, etc.) rely on it as one of their main building blocks. We will also use it to generate data for our analysis examples later on!

## NumPy Installation

 If you have Anaconda, install NumPy by going to your terminal or command prompt and typing:

<font color='red'>conda install numpy</font>

 You can also use pip to install the library you need:
 
<font color='red'>pip install numpy</font>

## Importing NumPy

In [4]:
import numpy as np

### <font color='green'>NumPy Arrays</font>

**Arrays and lists can easily be converted to each other**

In [5]:
my_list = [1,2,3]
my_array = np.array([1,2,3])
my_array

array([1, 2, 3])

In [6]:
list(my_array)

[1, 2, 3]
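
One detail worth knowing about the conversion above: `list(...)` keeps the elements as NumPy scalars, while the array method `tolist()` converts them to built-in Python numbers. The short sketch below illustrates the difference; the name `sample_array` is hypothetical and used only here.

```python
import numpy as np

sample_array = np.array([1, 2, 3])   # hypothetical name, used only in this sketch

as_list = list(sample_array)         # elements remain NumPy integer scalars
as_plain = sample_array.tolist()     # elements become built-in Python ints

print(type(as_list[0]))
print(type(as_plain[0]))
```

For small examples the two behave the same in arithmetic, so the distinction mostly matters when passing values to code that expects plain Python types.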

In [7]:
my_matrix = [[1,2,3],[4,5,6],[7,8,9]]
my_matrix

[[1, 2, 3], [4, 5, 6], [7, 8, 9]]

In [8]:
np.array(my_matrix)

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

### <font color='green'>Built-in Methods</font>

### arange

In [11]:
np.arange(0,10,1)

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [1]:
np.arange(0,10,2)

array([0, 2, 4, 6, 8])

### zeros and ones

In [13]:
np.zeros(5)

array([0., 0., 0., 0., 0.])

In [155]:
np.zeros((5,5))

array([[0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.]])

In [17]:
np.ones(5)

array([1., 1., 1., 1., 1.])

In [18]:
np.ones((5,5))

array([[1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.]])

### linspace

In [19]:
np.linspace(0,10,3)

array([ 0.,  5., 10.])

In [20]:
np.linspace(0,5,21)

array([0.  , 0.25, 0.5 , 0.75, 1.  , 1.25, 1.5 , 1.75, 2.  , 2.25, 2.5 ,
       2.75, 3.  , 3.25, 3.5 , 3.75, 4.  , 4.25, 4.5 , 4.75, 5.  ])
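
As a side-by-side comparison, the sketch below contrasts `arange` (which takes a step and excludes the stop value) with `linspace` (which takes a number of points and includes the stop value by default). The variable names are illustrative only.

```python
import numpy as np

# arange takes a step size and excludes the stop value
by_step = np.arange(0, 1, 0.25)

# linspace takes a number of points and includes the stop value by default;
# retstep=True also returns the spacing between neighbouring points
by_count, spacing = np.linspace(0, 1, 5, retstep=True)

print(by_step)     # 4 values, 1.0 is not included
print(by_count)    # 5 values, 1.0 is included
print(spacing)     # distance between neighbouring points
```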

### eye

In [21]:
np.eye(4)

array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]])

### <font color='green'>Random</font>

### rand 

Creates an array of the given shape and populates it with random samples from a uniform distribution over [0, 1)

In [22]:
np.random.rand(2)

array([0.10284176, 0.94206904])

In [161]:
np.random.rand(2,5)

array([[0.66079333, 0.16336672, 0.47545672, 0.04967573, 0.44870916],
       [0.39969553, 0.42619501, 0.94158406, 0.94378594, 0.79318951]])

In [156]:
np.mean(np.random.rand(1,10**6))

0.5003333566049761

In [157]:
np.var(np.random.rand(1,10**6))

0.08324084865386719
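
The two numbers above can be compared with theory: a uniform distribution on [0, 1) has mean 1/2 and variance 1/12 ≈ 0.0833. The following sketch repeats the experiment and prints the empirical and theoretical values next to each other. The exact sample values differ on every run, so no fixed output is shown.

```python
import numpy as np

n_samples = 10**6
uniform_samples = np.random.rand(n_samples)

theoretical_mean = 0.5        # (a + b) / 2 with a = 0, b = 1
theoretical_var = 1 / 12      # (b - a)**2 / 12 with a = 0, b = 1

print(uniform_samples.mean(), theoretical_mean)
print(uniform_samples.var(), theoretical_var)
```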

### randn 

Returns a sample (or samples) from the "standard normal" distribution (μ = 0, σ = 1). Unlike rand, which is uniform, values closer to zero are more likely to appear.

In [54]:
np.random.randn(2)

array([-0.04327251, -0.94104415])

In [55]:
np.random.randn(3,3)

array([[-0.95508977,  1.39559558, -1.67640166],
       [-0.79371526,  2.33788267, -0.72581334],
       [ 2.42903808,  0.73841752, -0.43092908]])
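
Samples from a normal distribution with a different mean and standard deviation can be obtained by shifting and scaling standard normal samples. The sketch below assumes illustrative values `mu = 10` and `sigma = 2`, which are not part of the original notebook.

```python
import numpy as np

mu, sigma = 10.0, 2.0                 # illustrative values, chosen for this sketch
standard = np.random.randn(10**6)     # samples from N(0, 1)
shifted = mu + sigma * standard       # samples from N(mu, sigma**2)

print(shifted.mean())   # should be close to mu
print(shifted.std())    # should be close to sigma
```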

### randint 

Returns random integers from low (inclusive) to high (exclusive).

In [56]:
np.random.randint(1,100)

89

In [57]:
np.random.randint(1,100,10)

array([50, 10, 72, 38, 86, 20, 97, 13, 56, 32])
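
Because `high` is exclusive, simulating a six-sided die needs `high=7`. The sketch below rolls a hypothetical die 1000 times and counts how often each face appears, using `np.unique` with `return_counts=True`.

```python
import numpy as np

rolls = np.random.randint(1, 7, 1000)   # high=7 is exclusive, so values are 1..6

# faces holds the distinct values seen, counts how often each one occurred
faces, counts = np.unique(rolls, return_counts=True)

print(faces)
print(counts / rolls.size)   # empirical frequencies, each should be near 1/6
```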

### seed 
Can be used to set the random state, so that the same "random" results can be reproduced. 

In [167]:
np.random.seed(42)
np.random.rand(4)

array([0.37454012, 0.95071431, 0.73199394, 0.59865848])

In [168]:
np.random.seed(42)
np.random.rand(4)

array([0.37454012, 0.95071431, 0.73199394, 0.59865848])
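
Newer NumPy versions also offer a `Generator` object, created with `np.random.default_rng`, which keeps its own random state instead of changing the global one. This is an alternative to the `seed` approach shown above, not what the notebook itself uses; a minimal sketch:

```python
import numpy as np

# Two independent generators created with the same seed
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)

first = rng_a.random(4)    # uniform samples on [0, 1)
second = rng_b.random(4)

# Same seed, same sequence
print(np.array_equal(first, second))   # True
```

Note that the values produced by `default_rng(42)` are not the same as those produced by `np.random.seed(42)` followed by `np.random.rand`, since the two use different underlying generators.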

### <font color='green'>Array Attributes and Methods</font>

### Reshape 

Returns an array containing the same data with a new shape

In [171]:
myarr=np.arange(0,25,3)
myarr

array([ 0,  3,  6,  9, 12, 15, 18, 21, 24])

In [172]:
myarr.reshape(3,3)

array([[ 0,  3,  6],
       [ 9, 12, 15],
       [18, 21, 24]])
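
The new shape must hold exactly the same number of elements as the original. One dimension can be given as `-1` so that NumPy infers it. The sketch below shows both cases with a hypothetical 12-element array.

```python
import numpy as np

values = np.arange(12)

inferred = values.reshape(3, -1)   # -1 lets NumPy infer the 4 columns
print(inferred.shape)              # (3, 4)

try:
    values.reshape(5, 3)           # 15 slots cannot hold 12 elements
except ValueError as err:
    print("reshape failed:", err)
```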

### max, min, argmax, argmin

These methods return the maximum or minimum value of an array; argmax and argmin return the index location of that value.

In [65]:
myarr.max()

24

In [66]:
myarr.min()

0

In [67]:
myarr.argmax()

8

In [68]:
myarr.argmin()

0
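
On a 2D array, `argmax` without an axis returns an index into the flattened array. The sketch below, using a small made-up matrix, shows how `np.unravel_index` turns that flat index back into a (row, column) pair.

```python
import numpy as np

grid = np.array([[3, 8, 1],
                 [9, 2, 7]])

flat_index = grid.argmax()                           # index into the flattened array
row, col = np.unravel_index(flat_index, grid.shape)  # convert back to 2D coordinates

print(flat_index, row, col)    # 3 1 0
print(grid.argmax(axis=1))     # column of the maximum in each row: [1 0]
```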

### Shape 

shape is an attribute of an array (not a method), so it is accessed without parentheses:  [[reference](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.ndarray.shape.html)]

In [72]:
#one dimensional !
myarr.shape

(9,)

In [78]:
#Two dimensional !
myarr=myarr.reshape(1,9)
myarr

array([[ 0,  3,  6,  9, 12, 15, 18, 21, 24]])

In [79]:
myarr.shape

(1, 9)

In [80]:
myarr.reshape(9,1)

array([[ 0],
       [ 3],
       [ 6],
       [ 9],
       [12],
       [15],
       [18],
       [21],
       [24]])

### <font color='green'>Indexing and Selection</font>

### 1D array

In [103]:
myarr=np.arange(0,36,3)
myarr

array([ 0,  3,  6,  9, 12, 15, 18, 21, 24, 27, 30, 33])

In [104]:
myarr[3]

9

In [105]:
# Just like indexing inside a list
myarr[1:4]

array([3, 6, 9])

### 2D array (Matrix)

In [106]:
newarr=myarr.reshape((1,len(myarr)))
newarr

array([[ 0,  3,  6,  9, 12, 15, 18, 21, 24, 27, 30, 33]])

In [107]:
# newarr has shape (1, 12): axis 0 has only one row, so index 3 is out of bounds
newarr[3]

IndexError: index 3 is out of bounds for axis 0 with size 1

In [108]:
# arr[row][column] or arr[row , column]
print(newarr[0][3])
print(newarr[0,3])

9
9


In [110]:
D2array=newarr.reshape((3,4))
D2array

array([[ 0,  3,  6,  9],
       [12, 15, 18, 21],
       [24, 27, 30, 33]])

In [112]:
#Selecting one item
D2array[2,3]

33

In [113]:
#Selecting several items
D2array[1:3,:]

array([[12, 15, 18, 21],
       [24, 27, 30, 33]])

In [114]:
D2array[1:3,1:3]

array([[15, 18],
       [27, 30]])

In [115]:
D2array[1:3,0:3:2]

array([[12, 18],
       [24, 30]])

### More Indexing Help
Indexing a 2D matrix can be a bit confusing at first, especially when you start to add a step size. Try a Google image search for *NumPy indexing* to find useful images, like this one:
![numpy-indexing](numpy-indexing.jpg)
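
Each axis accepts its own `start:stop:step` slice. The sketch below rebuilds the same 3×4 values as `D2array` under the hypothetical name `table` and tries a few more patterns, including negative indices and a negative step.

```python
import numpy as np

table = np.arange(0, 36, 3).reshape(3, 4)   # same values as D2array above

every_other_col = table[:, ::2]    # all rows, columns 0 and 2
last_row = table[-1, :]            # a negative index counts from the end
reversed_rows = table[::-1, :]     # a negative step reverses the row order

print(every_other_col)
print(last_row)
print(reversed_rows)
```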

### <font color='green'>Broadcasting</font> 

**With Python lists, you cannot assign a single value to a slice. If you wanted to set the first 5 elements of a list to a new value, you would have to pass in a new 5-element list. With NumPy arrays, you can broadcast a single value across a larger set of values:**

In [119]:
myarr[0:5]=20
myarr

array([20, 20, 20, 20, 20, 15, 18, 21, 24, 27, 30, 33])
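
For contrast, the following sketch tries the same slice assignment on a plain Python list and on a NumPy array. The names `plain_list` and `np_values` are made up for this example.

```python
import numpy as np

plain_list = [0, 3, 6, 9, 12, 15]
try:
    plain_list[0:5] = 20            # a scalar is not iterable, so this fails
except TypeError as err:
    print("list assignment failed:", err)

plain_list[0:5] = [20] * 5          # a list needs an explicit sequence of values

np_values = np.array([0, 3, 6, 9, 12, 15])
np_values[0:5] = 20                 # the scalar is broadcast over the slice

print(plain_list)
print(np_values)
```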

In [122]:
Narray=myarr[5:]
Narray

array([15, 18, 21, 24, 27, 30, 33])

In [123]:
Narray[:]=30

In [124]:
Narray

array([30, 30, 30, 30, 30, 30, 30])

In [126]:
# The original array also changed, because a slice is a view on the same data!
myarr

array([20, 20, 20, 20, 20, 30, 30, 30, 30, 30, 30, 30])

In [129]:
# To avoid changing the original array, create a copy of it:
new_array=myarr.copy()
new_array

array([20, 20, 20, 20, 20, 30, 30, 30, 30, 30, 30, 30])
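
Whether two arrays share the same underlying data can be checked with `np.shares_memory`. The sketch below uses a small made-up array to show that a slice is a view while `.copy()` is independent.

```python
import numpy as np

base = np.arange(6)
view = base[2:]                  # slicing returns a view on the same data
independent = base[2:].copy()    # copy() allocates new, separate data

print(np.shares_memory(base, view))          # True
print(np.shares_memory(base, independent))   # False

independent[:] = 0    # leaves base untouched
view[:] = -1          # writes through to base
print(base)
```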

### <font color='green'>Conditional Selection</font> 

In [179]:
arr = np.arange(1,10)
arr

array([1, 2, 3, 4, 5, 6, 7, 8, 9])

In [180]:
arr>5

array([False, False, False, False, False,  True,  True,  True,  True])

In [181]:
arr[arr>5]

array([6, 7, 8, 9])
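
Several conditions can be combined with `&` (and) and `|` (or). Each condition needs its own parentheses, because these operators bind more tightly than comparisons. A short sketch with a hypothetical array `numbers`:

```python
import numpy as np

numbers = np.arange(1, 10)

in_range = numbers[(numbers > 2) & (numbers < 7)]    # both conditions hold
extremes = numbers[(numbers < 3) | (numbers > 7)]    # at least one condition holds
count_even = (numbers % 2 == 0).sum()                # True counts as 1 when summed

print(in_range)     # [3 4 5 6]
print(extremes)     # [1 2 8 9]
print(count_even)   # 4
```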

### <font color='green'>Operations</font> 

In [133]:
arr+arr

array([ 2,  4,  6,  8, 10, 12, 14, 16, 18])

In [134]:
arr-arr*2

array([-1, -2, -3, -4, -5, -6, -7, -8, -9])

In [135]:
arr**2

array([ 1,  4,  9, 16, 25, 36, 49, 64, 81], dtype=int32)

In [136]:
1/arr

array([1.        , 0.5       , 0.33333333, 0.25      , 0.2       ,
       0.16666667, 0.14285714, 0.125     , 0.11111111])
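
Elementwise operations also work between arrays of different but compatible shapes, where the smaller array is stretched to match the larger one. The sketch below adds a row vector and a column vector to a small made-up matrix.

```python
import numpy as np

matrix = np.arange(6).reshape(2, 3)    # shape (2, 3)
row = np.array([10, 20, 30])           # shape (3,), added to every row
column = np.array([[100], [200]])      # shape (2, 1), added to every column

print(matrix + row)
print(matrix + column)
```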

### <font color='green'>Functions</font> 

In [137]:
np.sqrt(arr)

array([1.        , 1.41421356, 1.73205081, 2.        , 2.23606798,
       2.44948974, 2.64575131, 2.82842712, 3.        ])

In [138]:
np.exp(arr)

array([2.71828183e+00, 7.38905610e+00, 2.00855369e+01, 5.45981500e+01,
       1.48413159e+02, 4.03428793e+02, 1.09663316e+03, 2.98095799e+03,
       8.10308393e+03])

In [139]:
np.sin(arr)

array([ 0.84147098,  0.90929743,  0.14112001, -0.7568025 , -0.95892427,
       -0.2794155 ,  0.6569866 ,  0.98935825,  0.41211849])

In [140]:
np.log(arr)

array([0.        , 0.69314718, 1.09861229, 1.38629436, 1.60943791,
       1.79175947, 1.94591015, 2.07944154, 2.19722458])

In [142]:
arr.mean()

5.0

In [143]:
arr.var()

6.666666666666667

In [144]:
arr.std()

2.581988897471611
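
By default, `var` and `std` divide by n, which gives the population variance. For the sample variance, which divides by n − 1, pass `ddof=1`. The sketch below shows both on the same values 1 to 9, under the hypothetical name `data`.

```python
import numpy as np

data = np.arange(1, 10)

population_var = data.var()        # divides by n
sample_var = data.var(ddof=1)      # divides by n - 1
sample_std = data.std(ddof=1)      # square root of the sample variance

print(population_var)
print(sample_var)
print(sample_std)
```

Here the sum of squared deviations from the mean is 60, so the two variances are 60/9 and 60/8 respectively.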

In [145]:
arr.sum()

45

In [149]:
arr=arr.reshape((3,3))
arr

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [150]:
arr.sum(axis=1)

array([ 6, 15, 24])

In [151]:
arr.sum(axis=0)

array([12, 15, 18])

<img src='axis.png' width=600/>
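
The same `axis` argument works for other reductions such as `mean`. As a small sketch with a made-up 3×3 array, `keepdims=True` keeps the reduced axis with length 1, so the result can be subtracted from the original array directly.

```python
import numpy as np

grid3 = np.arange(1, 10).reshape(3, 3)

col_means = grid3.mean(axis=0)    # one mean per column: collapses the rows
row_means = grid3.mean(axis=1)    # one mean per row: collapses the columns

# keepdims=True gives shape (3, 1), which lines up with each row of grid3
centered = grid3 - grid3.mean(axis=1, keepdims=True)

print(col_means)   # [4. 5. 6.]
print(row_means)   # [2. 5. 8.]
print(centered)
```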

That's enough for NumPy! See you in the next tutorial ;)