___


<p style="text-align: center;"><img src="https://docs.google.com/uc?id=1lY0Uj5R04yMY3-ZppPWxqCr5pvBLYPnV" class="img-fluid" alt="Rossum"></p>

___

# NumPy 

In [17]:
import numpy as np

## Built-in Methods

There are many built-in ways to generate arrays:

### ``arange``

Return evenly spaced values within a given interval.

In [3]:
np.arange(0,31,5)

array([ 0,  5, 10, 15, 20, 25, 30])

### ``zeros``

Generate an array of zeros with the given shape and dtype (``ones`` does the same for ones).

In [7]:
np.zeros(4,dtype='int')

array([0, 0, 0, 0])

In [8]:
np.zeros((5,4),dtype='int')

array([[0, 0, 0, 0],
       [0, 0, 0, 0],
       [0, 0, 0, 0],
       [0, 0, 0, 0],
       [0, 0, 0, 0]])

### ``linspace``
Return evenly spaced numbers over a specified interval.

In [10]:
np.linspace(0,50,11)

array([ 0.,  5., 10., 15., 20., 25., 30., 35., 40., 45., 50.])
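The difference between ``arange`` and ``linspace`` is easy to miss, so here is a minimal sketch comparing them. The variable names `by_step` and `by_count` are illustrative only: ``arange`` takes a step and excludes the stop value, while ``linspace`` takes a number of points and includes the stop value by default.

```python
import numpy as np

# arange: fixed step size, the stop value (50) is excluded
by_step = np.arange(0, 50, 5)

# linspace: fixed number of points (11), the stop value (50) is included
by_count = np.linspace(0, 50, 11)

print(by_step)    # integers, last element is 45
print(by_count)   # floats, last element is 50.0
```

Use ``arange`` when the step matters and ``linspace`` when the number of points matters.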

## Random 

NumPy also has lots of ways to create random number arrays:

### ``rand``
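Create an array of the given shape and populate it with random samples from a uniform distribution over ``[0, 1)``.

In interval notation, a square bracket includes the endpoint and a parenthesis excludes it. For integers: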

- ``(0, 5)`` = 1, 2, 3, 4
- ``(0, 5]`` = 1, 2, 3, 4, 5
- ``[0, 5)`` = 0, 1, 2, 3, 4
- ``[0, 5]`` = 0, 1, 2, 3, 4, 5

In [11]:
np.random.seed(101)  # seed the generator to get the same result every time
np.random.rand(3)

array([0.51639863, 0.57066759, 0.02847423])

array([0.51639863, 0.57066759, 0.02847423])

In [12]:
np.random.seed(101)
np.random.rand(3,2)

array([[0.51639863, 0.57066759],
       [0.02847423, 0.17152166],
       [0.68527698, 0.83389686]])

In [13]:
np.random.seed(101)
np.random.rand(2,2,2)  # two 2x2 arrays

array([[[0.51639863, 0.57066759],
        [0.02847423, 0.17152166]],

       [[0.68527698, 0.83389686],
        [0.30696622, 0.89361308]]])

### ``randn``

Return a sample (or samples) from the "standard normal" distribution (mean 0, variance 1), unlike ``rand``, which samples uniformly:

In [15]:
np.random.seed(101)
np.random.randn(4)


array([2.70684984, 0.62813271, 0.90796945, 0.50382575])

In [13]:
np.random.seed(101)
np.random.randn(3, 4)

array([[ 2.70684984,  0.62813271,  0.90796945,  0.50382575],
       [ 0.65111795, -0.31931804, -0.84807698,  0.60596535],
       [-2.01816824,  0.74012206,  0.52881349, -0.58900053]])

In [18]:
np.random.randn(1903).mean()  # the standard normal distribution has mean 0, so the sample mean is close to 0
# a sample of roughly 1,000 values is enough for a good estimate

0.02901008015849377

In [19]:
np.random.randn(1903).std()**2  # the standard normal distribution has variance 1

1.0369104256819919
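To see how the sample size affects these estimates, here is a small sketch under stated assumptions: the seed and the two sample sizes are arbitrary choices, and `small` and `large` are illustrative names. It also uses ``var()``, which is equivalent to ``std()**2``.

```python
import numpy as np

np.random.seed(101)  # arbitrary seed, used only for reproducibility

small = np.random.randn(10)        # few samples: noisy estimates
large = np.random.randn(100_000)   # many samples: estimates settle near 0 and 1

print(small.mean(), small.var())
print(large.mean(), large.var())
```

The larger sample should give a mean much closer to 0 and a variance much closer to 1.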

In [20]:
np.random.seed(101)
np.random.randn(2, 2, 2)

array([[[ 2.70684984,  0.62813271],
        [ 0.90796945,  0.50382575]],

       [[ 0.65111795, -0.31931804],
        [-0.84807698,  0.60596535]]])

### ``randint``
Return random integers from `low` (inclusive) to `high` (exclusive).

In [28]:
# [low, high)
np.random.seed(42)
np.random.randint(10,20,3)


array([16, 13, 17])

In [29]:
np.random.seed(42)
np.random.randint(8, size=5)  # a single bound (8) is the exclusive upper limit; low defaults to 0

array([6, 3, 4, 6, 2])

In [32]:
np.random.seed(42)
np.random.randint(9, size=(3, 4))

array([[6, 3, 7, 4],
       [6, 2, 6, 7],
       [4, 3, 7, 7]])

In [33]:
np.random.seed(42)
np.random.randint(1, [5, 10, 20])  # 3 different upper bounds

array([ 3,  4, 15])

In [35]:
np.random.seed(42)
# three upper bounds, one per column, broadcast to a 4x3 array
np.random.randint(10, [14, 20, 25], size=(4, 3))

array([[12, 13, 22],
       [12, 17, 22],
       [10, 16, 19],
       [12, 16, 20]])
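The half-open interval ``[low, high)`` can be checked empirically. The sketch below draws many values and inspects the extremes; the sample sizes and the names `draws` and `per_column` are assumptions made for illustration.

```python
import numpy as np

np.random.seed(42)

# Many draws from [10, 20): the upper bound itself should never appear
draws = np.random.randint(10, 20, size=10_000)
print(draws.min(), draws.max())

# One upper bound per column: each column stays below its own bound
upper = [14, 20, 25]
per_column = np.random.randint(10, upper, size=(1_000, 3))
print(per_column.max(axis=0))
```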

## Array Attributes and Methods

Let's discuss some useful attributes and methods of an array:

In [36]:
arr=np.arange(0,15)

In [37]:
arr

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14])

### ``reshape``
Returns an array containing the same data with a new shape.

In [42]:
arr=arr.reshape(3,5)
arr

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])
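A new shape only works if it holds exactly the same number of elements. The following sketch (variable names are illustrative) shows a mismatched shape failing and ``-1`` being used to let NumPy infer one dimension.

```python
import numpy as np

arr = np.arange(15)

# -1 asks NumPy to infer the missing dimension: 15 elements / 3 rows = 5 columns
inferred = arr.reshape(3, -1)
print(inferred.shape)

# 4 x 4 = 16 does not match 15 elements, so reshape raises ValueError
mismatch_error = None
try:
    arr.reshape(4, 4)
except ValueError as err:
    mismatch_error = str(err)
print(mismatch_error)
```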

### ``max``, ``min``, ``argmax``, ``argmin``

These methods find the maximum or minimum value of an array; ``argmax`` and ``argmin`` return the index of that value instead.

In [52]:
arr=np.random.randint(25,size=15)
arr

array([ 2,  4, 18,  6, 20,  8,  6, 17,  3, 24, 13, 17,  8, 20,  1])

In [53]:
arr.max()

24

In [54]:
arr.argmax()

9

In [55]:
arr.min()

1

In [56]:
arr.argmin()

14

### Some other attributes
* ``ndim``: number of array dimensions.
* ``shape``: tuple of array dimensions.
* ``size``: number of elements in the array.
* ``dtype``: data type of the array's elements.
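As a quick illustration of the attributes listed above, here is a minimal sketch on a 3x5 array. The printed ``dtype`` depends on the platform, so no specific value is assumed for it.

```python
import numpy as np

arr = np.arange(15).reshape(3, 5)

print(arr.ndim)    # 2 (rows and columns)
print(arr.shape)   # (3, 5)
print(arr.size)    # 15
print(arr.dtype)   # an integer type; the exact width is platform dependent
```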

## Bracket Indexing and Selection
The simplest way to pick one or more elements of an array looks very similar to indexing Python lists. On a 2D array, a single slice selects whole rows:

In [57]:
arr=np.arange(1,21).reshape(4,5)

In [58]:
arr

array([[ 1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10],
       [11, 12, 13, 14, 15],
       [16, 17, 18, 19, 20]])

In [59]:
arr[1:3]

array([[ 6,  7,  8,  9, 10],
       [11, 12, 13, 14, 15]])

In [61]:
arr[:2]

array([[ 1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10]])

In [62]:
# arr[start:stop:step]: rows at odd indices (1, 3)


In [63]:
arr[1::2]

array([[ 6,  7,  8,  9, 10],
       [16, 17, 18, 19, 20]])

In [None]:
# arr[start:stop:step]: rows at even indices (0, 2)


In [65]:
arr[::2]

array([[ 1,  2,  3,  4,  5],
       [11, 12, 13, 14, 15]])

## Broadcasting

NumPy arrays differ from Python lists in their ability to broadcast: a single value assigned to a slice is stretched to fill every selected element.

In [68]:
# Set a value over an index range (broadcasting)
arr[:4]=33

In [69]:
#Show
arr

array([[33, 33, 33, 33, 33],
       [33, 33, 33, 33, 33],
       [33, 33, 33, 33, 33],
       [33, 33, 33, 33, 33]])

In [77]:
a=[0,0,0,7,9,12]

In [78]:
a[:3]=[101,101,100]

In [79]:
a

[101, 101, 100, 7, 9, 12]
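The contrast between the list and the array can be made explicit. In this sketch (names `py_list` and `np_arr` are illustrative), assigning a scalar to a list slice fails, while the same assignment on an array is broadcast.

```python
import numpy as np

py_list = [0, 0, 0, 7, 9, 12]
try:
    py_list[:3] = 33           # a list slice needs an iterable on the right-hand side
except TypeError as err:
    print("list:", err)

np_arr = np.array(py_list)
np_arr[:3] = 33                # the scalar is broadcast across the slice
print("array:", np_arr)
```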

In [80]:
# Reset the array; we'll see why in a moment
arr=np.arange(10,21)

# Show
arr

array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20])

In [84]:
# Important notes on Slices
slice_of_arr=arr[0:3]

# Show slice
slice_of_arr

array([10, 11, 12])

In [85]:
slice_of_arr[:3]=[9,7,9]

In [86]:
arr

array([ 9,  7,  9, 13, 14, 15, 16, 17, 18, 19, 20])

In [87]:
# To get a copy, you need to be explicit
arr_copy=arr.copy()
arr_copy

array([ 9,  7,  9, 13, 14, 15, 16, 17, 18, 19, 20])

In [88]:
# Assigning a scalar to a slice of the copy broadcasts it to every selected element
arr_copy[:4]=0

In [89]:
arr_copy

array([ 0,  0,  0,  0, 14, 15, 16, 17, 18, 19, 20])

In [90]:
arr

array([ 9,  7,  9, 13, 14, 15, 16, 17, 18, 19, 20])
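The cells above show that a slice is a view onto the original data, while ``copy()`` produces independent data. One way to check this directly is ``np.shares_memory``; the sketch below uses illustrative names `view` and `independent`.

```python
import numpy as np

arr = np.arange(10, 21)

view = arr[0:3]                # a slice is a view: no data is copied
independent = arr[0:3].copy()  # an explicit copy owns its own data

print(np.shares_memory(arr, view))         # True
print(np.shares_memory(arr, independent))  # False

view[:] = 0          # writes through to arr
independent[:] = 99  # leaves arr untouched
print(arr)
```

Views avoid copying large arrays, which is why NumPy makes the copy an explicit step.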

## Indexing a 2D array (matrices)

<p>The general format is <b>arr_2d[row][col]</b> or <b>arr_2d[row,col]</b>. I usually recommend the comma notation for clarity.</p>

In [91]:
arr_2d=np.array([[5,10,15],[20,25,30],[35,40,45]])

#Show
arr_2d

array([[ 5, 10, 15],
       [20, 25, 30],
       [35, 40, 45]])

In [95]:
#Indexing row
arr_2d[2]

array([35, 40, 45])

In [97]:
arr_2d[1:3,1:3]

array([[25, 30],
       [40, 45]])

In [98]:
arr_2d[1:3,0:2]=88

In [99]:

arr_2d

array([[ 5, 10, 15],
       [88, 88, 30],
       [88, 88, 45]])
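For single elements the two notations give the same result, but with slices they behave differently, which is one reason to prefer the comma form. A minimal sketch, reusing the original values of ``arr_2d``:

```python
import numpy as np

arr_2d = np.array([[5, 10, 15], [20, 25, 30], [35, 40, 45]])

# Single element: both notations agree
print(arr_2d[1][2], arr_2d[1, 2])

# Comma notation: rows 0-1 and columns 1-2 in one step
comma = arr_2d[:2, 1:]

# Chained brackets: the second slice is applied to the rows again, not the columns
chained = arr_2d[:2][1:]

print(comma)
print(chained)
```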

### Fancy Indexing

Fancy indexing allows you to select elements, or entire rows or columns, out of order by passing a list of indices. To show this, let's quickly build a NumPy array:

In [100]:
v=np.arange(0,30,3)
v

array([ 0,  3,  6,  9, 12, 15, 18, 21, 24, 27])

In [101]:
v[3]

9

In [102]:
# we can select separate elements using their indices in a list
index_list=[3,5,8]
v[index_list]

array([ 9, 15, 24])

<h3>any_array[[row indices], [column indices]]</h3>

In [106]:
jj=np.arange(1,17).reshape(4,4)

In [107]:
jj

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12],
       [13, 14, 15, 16]])

In [111]:
# let's select separate values of 6 and 8
jj[[1,1],[1,3]]

array([6, 8])

In [112]:
# this time let's select separate values of 1, 10 and 16
jj[[0,2,3],[0,1,3]]

array([ 1, 10, 16])
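Note that the two index lists are paired element by element, so the result is one value per (row, column) pair and not a rectangular block. If a block is what you want, ``np.ix_`` is one option. The sketch below uses illustrative names `pairs` and `block`.

```python
import numpy as np

jj = np.arange(1, 17).reshape(4, 4)

# Paired indices: picks elements (1, 0) and (3, 2)
pairs = jj[[1, 3], [0, 2]]

# np.ix_ builds an open mesh: every combination of rows 1, 3 with columns 0, 2
block = jj[np.ix_([1, 3], [0, 2])]

print(pairs)
print(block)
```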

## Statistical Calculations

In [113]:
v=np.array([1,1,2,2,3,3,3])
v

array([1, 1, 2, 2, 3, 3, 3])

In [114]:
np.mean(v)

2.142857142857143

In [115]:
np.median(v)

2.0

In [116]:
v.sum()

15

In [117]:
v.min()

1

# Pandas Series

In [118]:
import pandas as pd

In [119]:
labels = [i for i in 'python']
my_list = list(np.arange(6))
d = dict(zip(labels,my_list))

arr = np.array([10, 20, 30,40,50,60])


In [121]:
pd.Series(labels)

0    p
1    y
2    t
3    h
4    o
5    n
dtype: object

In [122]:
pd.Series(data=arr, index=labels)

p    10
y    20
t    30
h    40
o    50
n    60
dtype: int32

In [124]:
d

{'p': 0, 'y': 1, 't': 2, 'h': 3, 'o': 4, 'n': 5}

In [125]:
pd.Series(d)

p    0
y    1
t    2
h    3
o    4
n    5
dtype: int64

In [126]:
pd.Series(data = d, index= ['q', 'o', 'y','t','k','p'])

q    NaN
o    4.0
y    1.0
t    2.0
k    NaN
p    0.0
dtype: float64
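When both a dict and an ``index`` are given, pandas looks up each index label in the dict and fills labels that are not found with ``NaN``, which also turns the values into floats. A small sketch with a made-up dict:

```python
import pandas as pd

d = {'p': 0, 'y': 1, 't': 2}  # hypothetical data for illustration

# 'q' is not a key of d, so it becomes NaN; 't' is not requested, so it is dropped
s = pd.Series(d, index=['q', 'y', 'p'])

print(s)
print(s.isna())
```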

In [127]:
pd.Series(['pandas', 5, False, np.mean, len])

0                                   pandas
1                                        5
2                                    False
3    <function mean at 0x0000015E6CF36B80>
4                  <built-in function len>
dtype: object

In [128]:
ser = pd.Series([1,2,5,4,6],index = ['numpy', 'pandas','tableau', 'seaborn','matplotlib'])
ser

numpy         1
pandas        2
tableau       5
seaborn       4
matplotlib    6
dtype: int64

In [129]:
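ser.iloc[2]  # positional access; plain ser[2] on a string-labelled Series is deprecated in recent pandas versions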

5

In [130]:
ser['tableau']

5

In [132]:
ser[2:4]

tableau    5
seaborn    4
dtype: int64

In [134]:
ser['tableau':'seaborn']

tableau    5
seaborn    4
dtype: int64
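The two slices above return the same rows, but they follow different rules: a positional slice excludes its stop position, while a label slice includes its stop label. A minimal sketch using the explicit ``iloc`` and ``loc`` accessors:

```python
import pandas as pd

ser = pd.Series([1, 2, 5, 4, 6],
                index=['numpy', 'pandas', 'tableau', 'seaborn', 'matplotlib'])

by_position = ser.iloc[2:4]                # positions 2 and 3; position 4 is excluded
by_label = ser.loc['tableau':'seaborn']    # both end labels are included

print(by_position.equals(by_label))  # True
```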

In [135]:
ser.keys()

Index(['numpy', 'pandas', 'tableau', 'seaborn', 'matplotlib'], dtype='object')

In [136]:
ser.index

Index(['numpy', 'pandas', 'tableau', 'seaborn', 'matplotlib'], dtype='object')

In [137]:
ser.values

array([1, 2, 5, 4, 6], dtype=int64)

# DataFrames

In [139]:
from numpy.random import randn
np.random.seed(101)

In [140]:
df=pd.DataFrame(randn(5,4),columns='w x y z'.split(), index='a b c d e'.split())
df

          w         x         y         z
a  2.706850  0.628133  0.907969  0.503826
b  0.651118 -0.319318 -0.848077  0.605965
c -2.018168  0.740122  0.528813 -0.589001
d  0.188695 -0.758872 -0.933237  0.955057
e  0.190794  1.978757  2.605967  0.683509


In [142]:
df[['w','y']]

          w         y
a  2.706850  0.907969
b  0.651118 -0.848077
c -2.018168  0.528813
d  0.188695 -0.933237
e  0.190794  2.605967


In [143]:
df['c':'e']

          w         x         y         z
c -2.018168  0.740122  0.528813 -0.589001
d  0.188695 -0.758872 -0.933237  0.955057
e  0.190794  1.978757  2.605967  0.683509


In [144]:
df['w*z']=df['w']*df['z']  # new column: element-wise product of w and z
df

          w         x         y         z       w*z
a  2.706850  0.628133  0.907969  0.503826  1.363781
b  0.651118 -0.319318 -0.848077  0.605965  0.394555
c -2.018168  0.740122  0.528813 -0.589001  1.188702
d  0.188695 -0.758872 -0.933237  0.955057  0.180215
e  0.190794  1.978757  2.605967  0.683509  0.130410


In [145]:
df.drop('w*z', axis=1, inplace=True)
df

          w         x         y         z
a  2.706850  0.628133  0.907969  0.503826
b  0.651118 -0.319318 -0.848077  0.605965
c -2.018168  0.740122  0.528813 -0.589001
d  0.188695 -0.758872 -0.933237  0.955057
e  0.190794  1.978757  2.605967  0.683509


In [146]:
df=df.drop('c',axis=0)
df

          w         x         y         z
a  2.706850  0.628133  0.907969  0.503826
b  0.651118 -0.319318 -0.848077  0.605965
d  0.188695 -0.758872 -0.933237  0.955057
e  0.190794  1.978757  2.605967  0.683509


In [149]:
df[["w","y"]]

          w         y
a  2.706850  0.907969
b  0.651118 -0.848077
d  0.188695 -0.933237
e  0.190794  2.605967


In [150]:
df.loc['b','y']

-0.8480769834036315

In [151]:
df.iloc[1,2]

-0.8480769834036315
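``loc`` selects by label and ``iloc`` by integer position, which is why the two cells above return the same value. The sketch below uses a small made-up frame (not the random data from this notebook) to show the correspondence for single cells and for sub-tables.

```python
import numpy as np
import pandas as pd

# Hypothetical frame for illustration; row 'c' is deliberately absent
df = pd.DataFrame(np.arange(12).reshape(3, 4),
                  columns=list('wxyz'), index=list('abd'))

# Row 'b' is at position 1, column 'y' is at position 2
print(df.loc['b', 'y'], df.iloc[1, 2])

# The same idea with lists of labels and lists of positions
by_label = df.loc[['a', 'd'], ['w', 'z']]
by_position = df.iloc[[0, 2], [0, 3]]
print(by_label.equals(by_position))
```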

In [152]:
df[(df['w']>0) & (df['z']<1)]

          w         x         y         z
a  2.706850  0.628133  0.907969  0.503826
b  0.651118 -0.319318 -0.848077  0.605965
d  0.188695 -0.758872 -0.933237  0.955057
e  0.190794  1.978757  2.605967  0.683509


In [153]:
df.loc[((df.x>1) | (df.y<1)), ['x','w']]

          x         w
a  0.628133  2.706850
b -0.319318  0.651118
d -0.758872  0.188695
e  1.978757  0.190794
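With the random data above every row happens to pass both filters, so the effect of the conditions is hard to see. Here is a sketch with made-up values where only one row passes. Each comparison needs its own parentheses because ``&`` and ``|`` bind more tightly than ``>`` and ``<``.

```python
import pandas as pd

# Hypothetical values chosen so that the filter removes some rows
df = pd.DataFrame({'w': [1, -2, 3], 'z': [0.5, 0.2, 2.0]}, index=list('abc'))

# Element-wise AND of two boolean Series
mask = (df['w'] > 0) & (df['z'] < 1)

print(mask)
print(df[mask])           # rows where the mask is True
print(df.loc[mask, 'z'])  # the same rows, restricted to one column
```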


In [155]:
newindx='A B C D'.split()
newindx

['A', 'B', 'C', 'D']

In [156]:
df['newidx']=newindx

In [159]:
df.set_index('newidx',inplace=True)

In [160]:
df

               w         x         y         z
newidx
A       2.706850  0.628133  0.907969  0.503826
B       0.651118 -0.319318 -0.848077  0.605965
C       0.188695 -0.758872 -0.933237  0.955057
D       0.190794  1.978757  2.605967  0.683509
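Note that ``set_index`` replaces the existing index, so the old labels are discarded. The sketch below, on a small made-up frame, shows the non-inplace form, which returns a new frame, and ``reset_index``, which moves the index back into a column.

```python
import pandas as pd

# Hypothetical frame for illustration
df = pd.DataFrame({'w': [1, 2], 'x': [3, 4]}, index=['a', 'b'])
df['newidx'] = ['A', 'B']

indexed = df.set_index('newidx')   # returns a new frame; df itself is unchanged
restored = indexed.reset_index()   # 'newidx' becomes a regular column again

print(indexed)
print(restored)
```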

