<img src="pythonwebinar.png" style="width:650px">

### AccelerateAI - Python for Data Science - Notebook 04
##### Introduction to Python Language  (Python 3) 
In this notebook we will learn how to use the following package: 
* 1. NumPy       <br> 

We will cover the following in the next session:
* 2. Pandas      <br>
* 3. String & Text <br>
***

#### 1. Numpy
- NumPy is short for <b>Num</b>erical <b>Py</b>thon
- It is a fundamental package for high-performance scientific computing and data analysis
- Many data analysis libraries are built on top of NumPy
- Here are some of the things it provides:
    - <b>ndarray</b>, a fast and space-efficient multidimensional array providing 
        - fast vectorized array operations for data munging and cleaning
        - common array algorithms like sorting, unique, and set operations 
        - efficient descriptive statistics and aggregating/summarizing data 
        - data alignment and relational data manipulations for merging and joining datasets (in practice usually done through pandas, which builds on NumPy)
        - expressing conditional logic as array expressions instead of loops with if-elif-else branches
        - group-wise data manipulations (again, usually through pandas)
    - tools for reading / writing array data to disk and working with memory-mapped files 
    - linear algebra, random number generation, and Fourier transform capabilities
    - tools for integrating code written in C, C++, and Fortran
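
As a small illustration of the vectorized operations, conditional logic, and set operations listed above, here is a minimal sketch. The sample values and the "replace negatives with 0" rule are made up for the example.

```python
import numpy as np

# Hypothetical sample data, chosen only for illustration
values = np.array([3, -1, 4, -1, 5, -9, 2, 6])

# Loop version: replace negative numbers with 0
clipped_loop = []
for v in values:
    if v < 0:
        clipped_loop.append(0)
    else:
        clipped_loop.append(int(v))

# Vectorized version: the same conditional logic as a single array expression
clipped_vec = np.where(values < 0, 0, values)

# Common array algorithms: sorted unique values and a set intersection
unique_vals = np.unique(values)
common = np.intersect1d(values, [2, 3, 7])

print(clipped_vec)
print(unique_vals)
print(common)
```

Both versions produce the same values; the array expression avoids the explicit Python loop.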

##### 1.1.1 Scalars in Numpy
- Python defines only one type for each data class (one integer type, one floating-point type, etc.)
- NumPy adds 24 new fundamental Python types to describe different kinds of scalars, for example:
    - numpy.generic : base class for all NumPy scalar types
    - numpy.ushort, numpy.uint, numpy.ulonglong, etc. : unsigned integers
    - numpy.half, numpy.single, numpy.double, numpy.longdouble, etc. : floating point
    - numpy.datetime64 : dates and times
    - numpy.str_ : strings
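
To make the scalar types above more concrete, the sketch below inspects a few of them. The specific values and dates are arbitrary examples.

```python
import numpy as np

small = np.int8(100)
print(type(small))                       # a NumPy scalar type, not the built-in int
print(isinstance(small, np.generic))     # every NumPy scalar derives from np.generic

# Fixed-width integer types have a limited range
int8_info = np.iinfo(np.int8)
print(int8_info.min, int8_info.max)

# Floating-point types differ in precision (machine epsilon)
print(np.finfo(np.float32).eps, np.finfo(np.float64).eps)

# Time values use datetime64; subtracting two of them gives a timedelta64
delta = np.datetime64("2024-03-01") - np.datetime64("2024-02-01")
print(delta)
```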

In [None]:
#importing numpy
import numpy as np
import pandas as pd

In [None]:
print(np.ScalarType)

In [None]:
x = np.int8(1)
y = np.float32(1.0)
x == y                                          #What do you expect the result to be?      

In [None]:
print(x.dtype , y.dtype)

In [None]:
#check current version of pandas
print("Pandas Version:", pd.__version__)

##### 1.1.2 Numpy Constants - NumPy includes several constants
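 - numpy.inf    (the alias numpy.Inf was removed in NumPy 2.0)
 - numpy.nan    (prefer this lowercase spelling; the alias numpy.NaN was removed in NumPy 2.0)
 - -numpy.inf   (formerly numpy.NINF, removed in NumPy 2.0)
 - 0.0          (formerly numpy.PZERO, removed in NumPy 2.0)
 - -0.0         (formerly numpy.NZERO, removed in NumPy 2.0)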
 - numpy.pi
 - numpy.euler_gamma
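
The special floating-point values behave differently from ordinary numbers. The short sketch below uses the lowercase spellings `np.nan` and `np.inf`, which work across NumPy versions.

```python
import numpy as np

# NaN is not equal to anything, including itself, so test for it with np.isnan
print(np.nan == np.nan)
print(np.isnan(np.nan))

# Infinity compares as larger than any finite float
print(np.inf > 1e308)
print(np.isinf(-np.inf))

# Negative zero equals positive zero, but its sign bit is set
neg_zero = -0.0
print(neg_zero == 0.0, np.signbit(neg_zero))
```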

In [None]:
np.log(0)                                       #returns -inf and emits a RuntimeWarning

In [None]:
-0.0                                            #negative zero; np.NZERO was removed in NumPy 2.0

In [None]:
np.pi

##### 1.2 Numpy Ndarray creation and reshape

In [None]:
#importing numpy
import numpy as np

In [None]:
#creating array from list
data1 = [6, 7.5, 8, 0, 1]
arr1 = np.array(data1)
#arr1                                           #Note the array([...]) representation: parentheses around square brackets

In [None]:
#print("Dimensions:", arr1.ndim)

In [None]:
# Nested sequences, like a list of equal-length lists, will be converted into a multidimensional array
data2 = [[1, 2, 3, 4], [5, 6, 7, 8]]
arr2 = np.array(data2)                         # pass data2 to np.array()
arr2

In [None]:
print("Dimensions:", arr2.ndim)
print("Shape:", arr2.shape)                    # check the shape (an attribute, not a method)

In [None]:
# other functions for creating new arrays.
a = np.zeros(10)                               # np.zeros() - array with all zeros
b = np.ones((3, 6))                            # np.ones()  - array with all ones
print("a=", a,"\n","b=",b)

In [None]:
np.eye(3)                                      #Identity matrix

In [None]:
#np.random.rand(3,3)                           # random values

In [None]:
arr = np.arange(8)
arr
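
The creation functions above can be summarized in one place. This is a small sketch; `np.linspace` and the `dtype` argument are not used elsewhere in this notebook and are included here as related, standard NumPy features.

```python
import numpy as np

mixed = np.array([6, 7.5, 8, 0, 1])        # ints and a float are upcast to float64
print(mixed.dtype)

nested = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
print(nested.ndim, nested.shape)

print(np.zeros(3))
print(np.ones((2, 2), dtype=np.int64))     # dtype can be set explicitly

evens = np.arange(2, 10, 2)                # start, stop (excluded), step
points = np.linspace(0.0, 1.0, 5)          # 5 evenly spaced points, both ends included
print(evens)
print(points)
```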

In [None]:
#change the dimension of the array
arr.reshape((4, 2))                             #reshape() - Note the order of numbers 

In [None]:
x = arr.reshape((4,2), order='F')               #order: 'C' - C-style (row-major), 'F' - Fortran-style (column-major)
x

In [None]:
#chaining 
np.arange(10).reshape((2,5))                     #chain multiple methods in one line

In [None]:
#convert from higher dimension to single dimension
x.flatten()                                      #flatten() - note the order of numbers
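
The effect of the `order` argument is easiest to see side by side. A minimal sketch, using the same eight-element array as above:

```python
import numpy as np

arr = np.arange(8)

c_order = arr.reshape((4, 2), order="C")   # fill row by row (the default)
f_order = arr.reshape((4, 2), order="F")   # fill column by column

print(c_order)
print(f_order)

# flatten() reads in C order by default, so the F-ordered array comes back interleaved
print(f_order.flatten())
# Reading it back in F order recovers the original sequence
print(f_order.flatten(order="F"))
```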

##### 1.3 Ndarray attributes

In [None]:
arr = np.arange(9).reshape(3,3, order='F')
arr

In [None]:
arr.max()                                                    #max() - maximum value; min() - minimum value

In [None]:
arr.argmax(axis=1)                                           #argmax() - index of maximum value, similarly argmin

In [None]:
arr.dtype                                                    #data type - attribute, not method

##### 1.4 Array arithmetic 
 - Arithmetic operations on arrays are applied elementwise

In [None]:
arr = np.array([[1., 2., 3.], [4., 5., 6.]])
arr + 1                                                      # elementwise addition - the scalar is broadcast to every element

In [None]:
arr * 2                                                      # multiplication with a constant

In [None]:
arr*arr                                                      #what's going to happen?

In [None]:
np.sqrt(arr)

In [None]:
np.square(arr)

In [None]:
arr.sum()                                                     #sum() - total of all elements

In [None]:
arr.mean()                                                    #mean() - average of all elements           

In [None]:
arr.mean(axis=1)                                              #axis=1 gives the mean of each row; axis=0 gives the mean of each column
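
The `axis` argument names the dimension that is collapsed, which is easy to get backwards. The sketch below uses the same 2 x 3 array; the centering step at the end is an added illustration of broadcasting.

```python
import numpy as np

arr = np.array([[1., 2., 3.], [4., 5., 6.]])

col_means = arr.mean(axis=0)    # collapse the rows: one mean per column (3 values)
row_means = arr.mean(axis=1)    # collapse the columns: one mean per row (2 values)
print(col_means)
print(row_means)

# Broadcasting: the length-3 vector is applied to each row of the 2 x 3 array
centered = arr - col_means
print(centered)
```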

##### 1.5 Array Slicing

In [None]:
#slicing
arr = np.arange(10)
arr

In [None]:
arr[0]                                                        #index starts at 0

In [None]:
arr[5:8]                                                      #slicing within a range (end index is excluded)

In [None]:
#broadcasting 
arr[5:8] = 12                                                 #slicing creates a view - hence original arr is modified
arr

In [None]:
# Slicing in higher dimensions
arr = np.arange(9).reshape(3,3)
arr

In [None]:
# slicing : row, column
arr[1, 2]                                                     # element at row 1, column 2

In [None]:
# slicing : arr[rowstart:rowend, columnstart:columnend] selects rows rowstart..rowend-1 and columns columnstart..columnend-1

In [None]:
arr[1:3,1:3]                                                  # returns a 2 x 2 array (rows 1-2, columns 1-2)

In [None]:
#What would be the output?
arr[1,:3]                                  
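
The row/column slicing rules can be checked on a slightly different array. This sketch uses a hypothetical 3 x 4 array named `grid` so that rows and columns are easy to tell apart.

```python
import numpy as np

grid = np.arange(12).reshape(3, 4)   # rows: [0..3], [4..7], [8..11]

corner = grid[0, 3]                  # single element: row 0, column 3
block = grid[0:2, 1:3]               # rows 0-1 and columns 1-2
row_1d = grid[2, :2]                 # integer row index drops that dimension
row_2d = grid[2:3, :2]               # slicing the row keeps the result 2-D

print(corner)
print(block)
print(row_1d.shape, row_2d.shape)
```

An integer index removes a dimension, while a slice of length one keeps it.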

In [None]:
arr = np.empty((8, 4))
for i in range(8):
    arr[i] = i
arr

In [None]:
arr[[4, 3, 6]]                                #select specified rows

In [None]:
#Array copy
arr = np.array([11, 12, 13, 14, 15])

x = arr.copy()                                #creates a copy of original array

arr[0] = 42
print(arr)
print(x)                                      # note that x doesn't change
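
Whether an indexing operation returns a view or a copy determines if the original array can be modified through it. A minimal sketch using `np.shares_memory` to check:

```python
import numpy as np

arr = np.arange(10)

view = arr[5:8]             # basic slicing returns a view on the same data
view[:] = 12                # writing through the view changes arr
print(arr)
print(np.shares_memory(arr, view))

picked = arr[[5, 6, 7]]     # indexing with a list of positions returns a copy
picked[:] = 0               # arr is not affected
print(arr)
print(np.shares_memory(arr, picked))

safe = arr[5:8].copy()      # an explicit copy of a slice is also independent
print(np.shares_memory(arr, safe))
```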

##### 1.6 Transpose, Swapping and Dot product

In [None]:
arr = np.arange(16).reshape(4, 4)
arr

In [None]:
arr.T                                                   #transpose 

In [None]:
arr.swapaxes(1,0)                                       #same as transpose - as there are only 2 axes

In [None]:
np.dot(arr.T, arr)                                      #dot product (matrix product for 2-D arrays)

In [None]:
arr.trace()                                             #sum of the diagonal elements

In [None]:
np.linalg.eig(arr)                                       #eigen values & eigen vectors

In [None]:
np.linalg.svd(arr)                                       #singular value decomposition
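
A few quick checks tie these linear algebra operations together. The sketch below uses the `@` operator, which is not used above but is the standard matrix-multiplication operator for arrays.

```python
import numpy as np

arr = np.arange(16).reshape(4, 4)

# For 2-D arrays, @ and np.dot give the same matrix product
gram = arr.T @ arr
print(np.array_equal(gram, np.dot(arr.T, arr)))

# The trace is the sum of the diagonal elements
print(arr.trace(), np.diag(arr).sum())

# SVD factors arr into U, the singular values s, and V transposed
u, s, vt = np.linalg.svd(arr)
rebuilt = u @ np.diag(s) @ vt
print(np.allclose(rebuilt, arr))
```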

##### 1.7 Numpy Example

In [None]:
sample1 = np.random.normal(100, 15, 40)        # mean 100, std dev 15, 40 observations
sample2 = np.random.normal(125, 15, 40)        # mean 125, std dev 15, 40 observations

In [None]:
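def t_test(x, y):
    """Paired t statistic for the element-wise differences y - x."""
    diff = y - x
    var = np.var(diff, ddof=1)                  # sample variance of the differences
    num = np.mean(diff)
    denom = np.sqrt(var / len(x))               # standard error of the mean difference
    return num / denom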

In [None]:
# Null hypothesis : mean(sample1) = mean(sample2)

t_stats = t_test(sample1, sample2)


In [None]:
from scipy import stats
dof = len(sample1) - 1
p_value = stats.t.sf(t_stats, dof)              # one-sided p-value P(T > t), same as 1 - cdf

print("The t value is {} and the p value is {}.".format(t_stats, p_value))               
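
As a sanity check, the hand-written statistic can be compared with SciPy's paired t-test, `scipy.stats.ttest_rel`. This is a sketch under two assumptions: the samples are treated as paired (which is what `t_test` computes), and a fixed seed is used so the run is reproducible. The seed value is arbitrary.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)               # arbitrary seed, for reproducibility only
sample1 = rng.normal(100, 15, 40)
sample2 = rng.normal(125, 15, 40)

def t_test(x, y):
    # Paired t statistic for the differences y - x
    diff = y - x
    return np.mean(diff) / np.sqrt(np.var(diff, ddof=1) / len(x))

manual_t = t_test(sample1, sample2)
result = stats.ttest_rel(sample2, sample1)   # paired test on sample2 - sample1
print(manual_t, result.statistic)

# ttest_rel reports a two-sided p-value by default; the one-sided value is half of it
# when the statistic is positive
one_sided_p = stats.t.sf(manual_t, len(sample1) - 1)
print(one_sided_p, result.pvalue)
```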

That's all, folks!