# NumPy 

NumPy is a powerful linear algebra library for Python. What makes it so important is that almost all of the libraries in the <a href='https://pydata.org/'>PyData</a> ecosystem (pandas, scipy, scikit-learn, etc.) rely on NumPy as one of their main building blocks. 

NumPy is also incredibly fast, as it has bindings to C libraries. For more info on why you would want to use arrays instead of lists, check out this great [StackOverflow post](http://stackoverflow.com/questions/993984/why-numpy-instead-of-python-lists).

We will only learn the basics of NumPy. 


## Importing NumPy

To get started using NumPy, the first step is to import it.

The most common way (and the method you should use) is to import NumPy under the abbreviation np.

If you see the letters np used anywhere in machine learning or data science, it's probably referring to the NumPy library.

In [None]:
import numpy as np

NumPy has many built-in functions and capabilities. We won't cover them all but instead we will focus on some of the most important aspects of NumPy: vectors, arrays, matrices and number generation. Let's start by discussing arrays.

## NumPy Arrays

NumPy arrays are the main way we will use NumPy throughout the course. NumPy arrays essentially come in two flavors: vectors and matrices. Vectors are strictly 1-dimensional (1D) arrays and matrices are 2D (but you should note a matrix can still have only one row or one column).

### Why use a NumPy array? Why not just a list?

There are lots of reasons to use a NumPy array instead of a "standard" Python list. Our main reasons are:
* Memory efficiency of NumPy arrays compared with lists
* Easy extension to N-dimensional objects
* Speed of calculations on NumPy arrays
* Broadcasting operations and functions with NumPy
* All the data science and machine learning libraries we use are built on NumPy
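
As a rough illustration of the points above (a minimal sketch, not a benchmark), the snippet below doubles a sequence of numbers first with a list and then with an array. The variable names are made up for this example.

```python
import numpy as np

# Hypothetical sample data for illustration
values = list(range(5))
arr = np.array(values)

# A list needs an explicit loop (here, a comprehension) to double each element
doubled_list = [v * 2 for v in values]

# An array applies the operation to every element at once
doubled_arr = arr * 2

print(doubled_list)   # [0, 2, 4, 6, 8]
print(doubled_arr)    # [0 2 4 6 8]

# Multiplying a list by 2 repeats it instead of doubling the elements
print(values * 2)     # [0, 1, 2, 3, 4, 0, 1, 2, 3, 4]
```

The array version reads like the math it expresses, which is a large part of why the rest of the PyData libraries build on it.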

Let's begin our introduction by exploring how to create NumPy arrays.

## Creating NumPy Arrays from Objects

### From a Python List

We can create an array by directly converting a list or list of lists:

In [None]:
my_list = [1,2,3]

In [None]:
my_list

In [None]:
type(my_list)

In [None]:
arr = np.array(my_list)

In [None]:
arr

In [None]:
#help(np.array)

In [None]:
type(arr)

In [None]:
arr.dtype

In [None]:
#creating an object with list of lists (nested list)
my_matrix = [[1,2,3],[4,5,6],[7,8,9]]
my_matrix
#output is a normal python list consisting of three items. Each item in the list is another list

In [None]:
np.array(my_matrix)
#The output is a two-dimensional array: 3 rows and 3 columns

In [None]:
# 1-dimensional array, also referred to as a vector
a1 = np.array([1, 2, 3])

# 2-dimensional array, also referred to as matrix
a2 = np.array([[1, 2.0, 3.3],
               [4, 5, 6.5]])

# 3-dimensional array (a stack of matrices, sometimes called a tensor)
a3 = np.array([[[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]],
                [[10, 11, 12],
                 [13, 14, 15],
                 [16, 17, 18]]])

In [None]:
a1

In [None]:
a2 

In [None]:
a3

In [None]:
a1.ndim, a2.ndim, a3.ndim
# Printing array dimensions (axes)

In [None]:
# Printing type of elements in array
a1.dtype, a2.dtype, a3.dtype

In [None]:
# Printing size (total number of elements) of array
a1.size, a2.size, a3.size

In [None]:
a1.shape
# Printing shape of array

In [None]:
a2.shape

In [None]:
a3.shape
#The first dimension is the number of blocks (2D arrays); the other two are the rows and columns of each block

In [None]:
# Printing type of arr object
type(a1), type(a2), type(a3)

In [None]:
# Creating an array from list with type float 
arr = np.array([[1, 2, 4], [5, 8, 7]], dtype = 'float')

In [None]:
arr

## Built-in Methods to create arrays

There are lots of built-in ways to generate arrays.

### arange

Return evenly spaced values within a given interval. [[reference](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.arange.html)]

In [None]:
np.arange(0,10)
#To create sequences of numbers, NumPy provides a function arange that returns arrays instead of lists.
#help(np.arange)

In [None]:
#Now we can add a step size. 0 to 100 with a step size of 2
np.arange(0,101,2)

In [None]:
np.arange(10)**3

### zeros and ones

Generate arrays of zeros or ones. [[reference](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.zeros.html)]

In [None]:
np.zeros(3)
#help(np.zeros)
#It produces a one dimensional vector of zeros. Note that these zeros are floating point numbers.

In [None]:
np.zeros((5,5))
# five rows and five columns

In [None]:
np.ones(3)
#help(np.ones)

In [None]:
np.ones((3,2))

In [None]:
# Create a constant value array
np.full((3, 3),6)
#help(np.full)

In [None]:
# Create a constant value array of complex type
np.full((3, 3),6,dtype='complex')

### linspace 
Return evenly spaced numbers over a specified interval. [[reference](https://www.numpy.org/devdocs/reference/generated/numpy.linspace.html)]

In [None]:
#another useful function for creating NumPy arrays is linspace function. 
#It returns evenly spaced numbers over a specified interval.
np.linspace(0,10,3)
#It takes a starting point, a stopping point, and a third parameter, num,
#which is the total number of samples to generate, including the start and stop points.
#Here it gives 3 evenly (linearly) spaced numbers from 0 to 10: 0, 5 and 10.
#Notice that the gap from 0 to 5 equals the gap from 5 to 10.
#help(np.linspace)

In [None]:
np.linspace(0,50,5)

In [None]:
np.linspace(0,10,11) # By default endpoint is set to True

In [None]:
np.linspace(0,5,20)

<font color=green>Note that `np.linspace()` *includes* the stop value, so 20 points over [0, 5] are spaced 5/19 apart. To obtain round steps of 0.25, use 21 points:</font>

In [None]:
np.linspace(0,5,21)
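
The sketch below shows two optional `np.linspace` parameters that control this behavior: `endpoint` and `retstep`. Neither is used elsewhere in this notebook, so treat it as a side note.

```python
import numpy as np

# Default: the stop value (10) is included
with_stop = np.linspace(0, 10, 5)
print(with_stop)      # [ 0.   2.5  5.   7.5 10. ]

# endpoint=False leaves the stop value out
without_stop = np.linspace(0, 10, 5, endpoint=False)
print(without_stop)   # [0. 2. 4. 6. 8.]

# retstep=True also returns the spacing between samples
samples, step = np.linspace(0, 5, 21, retstep=True)
print(step)           # 0.25
```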

### eye

Creates an identity matrix [[reference](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.eye.html)]

In [None]:
np.eye(4)

## Random 
NumPy also has lots of ways to create random number arrays:

### rand
Creates an array of the given shape and populates it with random samples from a uniform distribution over ``[0, 1)``. [[reference](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.random.rand.html)]

In [None]:
np.random.rand(2)

In [None]:
np.random.rand(3,3)

### randint
Returns random integers from `low` (inclusive) to `high` (exclusive).  [[reference](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.random.randint.html)]

In [None]:
np.random.randint(1,100)
#Returns random integers. The lower threshold is inclusive; the higher threshold is exclusive.

In [None]:
np.random.randint(1,100,10)

In [None]:
np.random.randint(0,51,(3,4))

In [None]:
random_array = np.random.randint(10, size=(5, 3))
random_array

In [None]:
# Pseudo-random numbers
np.random.seed(seed=99999)
random_array = np.random.randint(10, size=(5, 3))
random_array

In [None]:
np.unique(random_array)

### seed
Can be used to set the random state, so that the same "random" results can be reproduced. [[reference](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.random.seed.html)]

In [None]:
#seed value is an arbitrary choice. You can use any number
np.random.seed(42)
np.random.rand(4)

In [None]:
np.random.seed(101)
np.random.rand(4)

In [None]:
np.random.seed(999)
np.random.rand(10)
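
For reference, NumPy (version 1.17 and later) also offers a newer generator interface, `np.random.default_rng`, which keeps the seed inside a generator object instead of setting global state. This notebook uses `np.random.seed` throughout; the sketch below is only an alternative, and the seed value 42 is arbitrary.

```python
import numpy as np

# Two generators created with the same seed (42 is arbitrary)
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)

# Both produce the same "random" draws
draws_a = rng_a.integers(0, 10, size=(5, 3))
draws_b = rng_b.integers(0, 10, size=(5, 3))
print(np.array_equal(draws_a, draws_b))  # True

# Uniform samples over [0, 1), comparable to np.random.rand(4)
print(rng_a.random(4))
```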

### Reshape
Returns an array containing the same data with a new shape. [[reference](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.reshape.html)]

In [None]:
arr = np.arange(25)

In [None]:
arr

In [None]:
#We can use reshape method to reshape an array.  
# Consider an array with shape (a1, a2, a3, …, aN). 
#We can reshape and convert it into another array with shape (b1, b2, b3, …, bM). 
#The only required condition is: a1 x a2 x a3 … x aN = b1 x b2 x b3 … x bM
arr.reshape(5,5)
#help(np.reshape)

In [None]:
#This raises a ValueError: 25 elements cannot fill a 5x3 (15-element) shape
arr.reshape(5,3)

In [None]:
#Reshaping 3X4 array to 2X2X3 array.
arr = np.array([[1, 2, 3, 4], [5, 2, 4, 2], [1, 2, 0, 1]]) 
newarr = arr.reshape(2, 2, 3) 
print ("\n Original array:\n", arr) 
print ("Reshaped array:\n", newarr)
print(arr.ndim)
print(newarr.ndim)
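
When only some of the new dimensions are known, `reshape` accepts `-1` for one of them and infers it from the array's size. Here is a minimal sketch, which also shows what happens when the element counts do not match:

```python
import numpy as np

arr = np.arange(12)

# -1 asks NumPy to infer that dimension from the array's size
print(arr.reshape(3, -1).shape)     # (3, 4)
print(arr.reshape(-1, 2, 3).shape)  # (2, 2, 3)

# An incompatible shape raises ValueError (12 elements cannot fill 5 x 3)
try:
    arr.reshape(5, 3)
except ValueError as err:
    print("Reshape failed:", err)
```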

### Flatten

In [None]:
#We can use flatten method to get a copy of array collapsed into one dimension.
# Flatten array 
arr = np.array([[1, 2, 3], [4, 5, 6]]) 
flarr = arr.flatten() 
print ("\n Original array:\n", arr) 
print ("Flattened array:\n", flarr)

#help(arr.flatten)
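
Because `flatten` returns a copy, changes to the result never reach the original array. The related method `ravel` (not used elsewhere in this notebook) returns a view when the memory layout allows it. A small sketch of the difference:

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

flat_copy = arr.flatten()  # always a new array
flat_view = arr.ravel()    # a view of arr when the memory layout allows it

flat_copy[0] = 100         # does not affect arr
print(arr[0, 0])           # 1

flat_view[0] = 100         # writes through to arr
print(arr[0, 0])           # 100

print(np.shares_memory(arr, flat_copy))  # False
print(np.shares_memory(arr, flat_view))  # True
```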

### max, min, argmax, argmin

These are useful methods for finding the max or min values of an array or, with argmax and argmin, their index locations.

In [None]:
randarr=np.random.randint(1,100,20)

In [None]:
randarr

In [None]:
randarr.max()

In [None]:
randarr.argmax()
#if you want to know the index location of max value. Index location starts at 0

In [None]:
randarr.min()

In [None]:
randarr.argmin()
#if you want to know the index location of min value. Index location starts at 0
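
On a 2D array, `argmax` and `argmin` return an index into the flattened array by default. The sketch below, using a made-up array, converts that index back to a row and column with `np.unravel_index`:

```python
import numpy as np

# Hypothetical 2D array for illustration
grid = np.array([[3, 9, 1],
                 [7, 2, 8]])

flat_index = grid.argmax()   # index into the flattened array
print(flat_index)            # 1

# Convert the flat index back to a (row, column) pair
row, col = np.unravel_index(flat_index, grid.shape)
print(row, col)              # 0 1

# argmax along an axis: index of the largest value in each row
print(grid.argmax(axis=1))   # [1 2]
```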

## NumPy Indexing and Selection
In this section we will discuss how to select elements or groups of elements from an array.

In [None]:
#Creating sample array
arr = np.arange(0,50,2)

In [None]:
#Show
arr

In [None]:
#Get a value at an index
arr[8]

In [None]:
#Get values in a range
arr[1:5]
#Selects a range of elements: starts at index 1 and goes up to, but not including, index 5.

In [None]:
#Get values in a range
arr[0:5]

In [None]:
arr[:5]
#These two are same

In [None]:
arr[5:]
#Gives you all the elements from index 5 onward

In [None]:
arr[5:-2]

In [None]:
arr[5:-4]

In [None]:
arr[2:10:2]

### Indexing a 2D array (matrices)
The general format is arr_2d[row][col] or arr_2d[row,col]. I recommend using the comma notation for clarity.

In [None]:
arr_2d = np.array(([5,10,15],[20,25,30],[35,40,45]))

#Show
arr_2d

In [None]:
#Indexing row
arr_2d[1]

In [None]:
#Indexing the bottom row
arr_2d[2]

In [None]:
# Format is arr_2d[row][col] or arr_2d[row,col]

# Getting individual element value
arr_2d[1][0]

In [None]:
# Getting individual element value
arr_2d[1,0]

In [None]:
arr_2d[:2]
#Returns all the rows up to, but not including, row index 2

In [None]:
# 2D array slicing
#Shape (2,2) from top right corner
arr_2d[:2,1:]

In [None]:
#All the rows and the last two columns
arr_2d[:,1:]

### Conditional Selection
This is a fundamental concept that will directly translate to pandas later on, so make sure you understand this part!

Let's briefly go over how to use brackets for selection based off of comparison operators.



In [None]:
arr = np.arange(0,51,2)
arr

In [None]:
arr > 10
#returns a numpy array of boolean values

In [None]:
bool_arr = arr>10

In [None]:
bool_arr

In [None]:
arr[bool_arr]

In [None]:
arr[arr>5]

In [None]:
x = 2
arr[arr>x]
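
Conditions can also be combined. The sketch below uses `&` (and), `|` (or) and `~` (not) on boolean arrays; each comparison needs its own parentheses because these operators bind more tightly than `>` and `<`.

```python
import numpy as np

arr = np.arange(0, 51, 2)

# Combine conditions with & (and) and | (or); each comparison needs parentheses
between = arr[(arr > 10) & (arr < 20)]
print(between)    # [12 14 16 18]

outside = arr[(arr < 4) | (arr > 46)]
print(outside)    # [ 0  2 48 50]

# ~ negates a boolean mask
print(arr[~(arr > 4)])   # [0 2 4]
```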

### Broadcasting
NumPy arrays differ from normal Python lists because of their ability to broadcast. With lists, you can only reassign parts of a list with new parts of the same size and shape. That is, if you wanted to replace the first 5 elements in a list with a new value, you would have to pass in a new 5 element list. With NumPy arrays, you can broadcast a single value across a larger set of values:

In [None]:
arr

In [None]:
#Setting a value with index range (Broadcasting)
arr[0:5]=100

#Show
arr

In [None]:
# Reset array
arr = np.arange(0,51,2)

#Show
arr

In [None]:
arr[2:10:2]=100
arr

In [None]:
#To get a copy, you need to be explicit
arr_copy = arr.copy()

arr_copy
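
The reason an explicit copy matters is that a slice of a NumPy array is a view onto the same data, not a new array. A minimal sketch with a fresh array:

```python
import numpy as np

arr = np.arange(0, 11)

# A slice is a view: it shares data with the original array
slice_of_arr = arr[0:5]
slice_of_arr[:] = 99
print(arr)        # [99 99 99 99 99  5  6  7  8  9 10]

# .copy() creates independent data
arr_copy = arr.copy()
arr_copy[:] = 0
print(arr[5:])    # [ 5  6  7  8  9 10]
```

NumPy behaves this way to avoid copying large arrays unnecessarily, so call `.copy()` whenever the original must stay unchanged.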

## Linear Algebra

In [None]:
# transpose of array 
a = np.array([[1, 2, 3], [3, 4, 5], [9, 6, 0]]) 
print ("\n Original array:\n", a) 
a.transpose()

#help(np.linalg)

In [None]:
np.linalg.inv(a)

In [None]:
#solve: x+2y = 5 , 3x+4y = 7 , AX = B
A = np.array([[1.0, 2.0], [3.0, 4.0]])
print("Array A is:\n",A)
B = np.array([[5.], [7.]])
print("Array B is:\n",B)

In [None]:
sol = np.linalg.inv(A).dot(B)
print(sol)
#or
#print(np.linalg.solve(A, B))

In [None]:
np.linalg.solve(A, B)

In [None]:
#Solve the following using linear algebra:
#  2x + 4y + 7z = 4
#  3x + 3y + 2z = 8
#  5x + 6y + 3z =0
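
One possible way to set up this exercise is sketched below, following the same `np.linalg.solve` pattern as the 2x2 example. Try it yourself first; the check at the end confirms the solution without relying on reading the numbers by eye.

```python
import numpy as np

# Coefficient matrix and right-hand side taken from the system above
A = np.array([[2.0, 4.0, 7.0],
              [3.0, 3.0, 2.0],
              [5.0, 6.0, 3.0]])
B = np.array([4.0, 8.0, 0.0])

# solve() finds x, y, z without forming the inverse explicitly
solution = np.linalg.solve(A, B)
print(solution)

# Check: A @ solution should reproduce B (up to rounding)
print(np.allclose(A @ solution, B))   # True
```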

## NumPy Operations

### Arithmetic
You can easily perform array with array arithmetic, or scalar with array arithmetic. Let's see some examples:

In [None]:
arr = np.arange(0,10)
arr

In [None]:
arr + arr

In [None]:
arr * arr

In [None]:
arr - arr

In [None]:
# This raises a warning on division by zero, but not an error!
# The 0/0 entry is filled with nan
arr/arr

In [None]:
# Also a warning (but not an error): the 1/0 entry is filled with inf
1/arr
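
If the warnings are expected and you want to silence them for a specific calculation, `np.errstate` can be used as a context manager. This is a minimal sketch; whether ignoring the warnings is appropriate depends on your data.

```python
import numpy as np

arr = np.arange(0, 5)

# Temporarily silence the divide-by-zero and invalid-value warnings
with np.errstate(divide='ignore', invalid='ignore'):
    ratio = arr / arr        # 0/0 gives nan
    reciprocal = 1 / arr     # 1/0 gives inf

print(np.isnan(ratio[0]))        # True
print(np.isinf(reciprocal[0]))   # True
```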

In [None]:
arr**3

In [None]:
#Basic array operations
#Operations on single array
# Defining Array 1
a = np.array([[1, 2], [3, 4]])
print ("\n Original array:\n", a)

# Adding 1 to every element
print ("Adding 1 to every element:\n", a + 1)

# Subtracting 2 from each element
print ("\n Subtracting 2 from each element:\n", a - 2)

# multiply each element by 10 
print ("\n Multiplying each element by 10:\n", a*10)

# square each element 
print ("\n Squaring each element:\n", a**2) 

# modify existing array 
a *= 2
print ("\n Doubled each element of original array:\n", a) 

# transpose of array 
a = np.array([[1, 2, 3], [3, 4, 5], [9, 6, 0]]) 
print ("\n Original array:\n", a) 
print ("\n Transpose of array:\n", a.T)

### Universal Array Functions
NumPy comes with many universal array functions, or ufuncs, which are essentially just mathematical operations that can be applied across the array.
Let's show some common ones:

In [None]:
arr

In [None]:
# Taking Square Roots
np.sqrt(arr)

In [None]:
# Calculating exponential (e^)
np.exp(arr)

In [None]:
# Trigonometric Functions like sine
np.sin(arr)

In [None]:
# Taking the Natural Logarithm
np.log(arr)

## Summary Statistics on Arrays
NumPy also offers common summary statistics like sum, mean and max. You would call these as methods on an array.

In [None]:
arr

In [None]:
arr.sum()

In [None]:
arr.mean()

In [None]:
arr.max()

In [None]:
#Unary operators
arr = np.array([[1, 5, 6], [4, 7, 2], [3, 1, 9]]) 
print("\n Array:\n", arr) 
# maximum element of array 
print ("Largest element is:", arr.max()) 
print ("Row-wise maximum elements:",  arr.max(axis = 1)) 
  
# minimum element of array 
print ("Column-wise minimum elements:", arr.min(axis = 0)) 
  
# sum of array elements 
print ("Sum of all array elements:", arr.sum()) 

# cumulative sum along each row 
print ("Cumulative sum along each row:\n", arr.cumsum(axis = 1))
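
The `axis` argument is easy to mix up, so here is a small sketch on a non-square array: `axis=0` collapses the rows (one result per column) and `axis=1` collapses the columns (one result per row).

```python
import numpy as np

arr = np.array([[1, 5, 6],
                [4, 7, 2]])   # shape (2, 3)

# axis=0 collapses the rows, leaving one value per column
print(arr.sum(axis=0))   # [ 5 12  8]

# axis=1 collapses the columns, leaving one value per row
print(arr.sum(axis=1))   # [12 13]

# keepdims=True keeps the collapsed axis with length 1
print(arr.sum(axis=1, keepdims=True).shape)   # (2, 1)
```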

In [None]:
#Binary operators
a = np.array([[1, 2], [3, 4]]) 
b = np.array([[4, 3], [2, 1]]) 
print("\n Array a:\n", a)
print("\n Array b:\n", b)
# add arrays 
print ("Array sum:\n", a + b) 
  
# multiply arrays (elementwise multiplication) 
print ("Array multiplication:\n", a*b) 
  
# matrix multiplication 
print ("Matrix multiplication:\n", a.dot(b))
#print ("Matrix multiplication:\n", a@b)
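
To make the difference between elementwise and matrix multiplication concrete, the sketch below repeats the two arrays and compares `@`, `.dot()` and `np.matmul`, which agree for 2D arrays:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[4, 3], [2, 1]])

# Three equivalent ways to write the matrix product for 2D arrays
print(a @ b)             # [[ 8  5]
                         #  [20 13]]
print(np.array_equal(a @ b, a.dot(b)))         # True
print(np.array_equal(a @ b, np.matmul(a, b)))  # True

# * is elementwise, so it generally gives a different result
print(a * b)             # [[4 6]
                         #  [6 4]]
```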

### Sorting

In [None]:
#Sorting an Array
#NumPy provides the np.sort function for sorting arrays.
a = np.array([[1, 4, 2], [3, 4, 6], [0, -1, 5]]) 
  
# sorted array 
print ("Array elements in sorted order:\n", np.sort(a, axis = None)) 
  
# sort array row-wise 
print ("Row-wise sorted array:\n", np.sort(a, axis = 1))

# sort array column-wise 
print ("Column-wise sorted array:\n", np.sort(a, axis = 0)) 
  
# specify sort algorithm 
print ("Column wise sort by applying merge-sort:\n", 
       np.sort(a, axis = 0, kind = 'mergesort')) 

#help(np.sort)
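
A closely related function is `np.argsort`, which returns the indices that would sort an array rather than the sorted values. That makes it possible to reorder a second array in the same way. The data below is made up for illustration.

```python
import numpy as np

# Hypothetical scores and names for illustration
scores = np.array([30, 10, 20])
names = np.array(['c', 'a', 'b'])

order = np.argsort(scores)   # indices that would sort scores
print(order)                 # [1 2 0]
print(scores[order])         # [10 20 30]
print(names[order])          # ['a' 'b' 'c']

# Reverse the order for a descending sort
print(scores[order[::-1]])   # [30 20 10]
```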

# SciPy: Scientific Computing

In [None]:
# We are trying to solve a linear algebra system which can be given as:
#               1x + 2y =5
#               3x + 4y =7
from scipy import linalg
# Create input array
A= np.array([[1,2],[3,4]])

# Right-hand side array
B= np.array([[5],[7]])

# Solve the linear algebra
X= linalg.solve(A,B)

# Print results
print(X)

# Checking Results
print("\n Checking results, following vector should be all zeros")
print(A.dot(X)-B)

In [None]:
#Finding inverse of a matrix
A = np.array([[1.0, 2.0], [3.0, 4.0]])
print("Original matrix: \n", A)
print("\n Inverse of matrix A is: \n", linalg.inv(A))

In [None]:
#Compute the determinant of a matrix
#The determinant of a square matrix is a value derived arithmetically from its coefficients; for a 2x2 matrix [[a, b], [c, d]] it is ad-bc
print(linalg.det(A))

In [None]:
#Singular Value Decomposition.
linalg.svd(A)

In [None]:
#SciPy’s Special Function package provides a number of functions 
#through which you can find exponents and solve trigonometric problems.
from scipy import special
a = special.exp10(3)
print(a)
 
b = special.exp2(3)
print(b)
 
#Sine of angle given in degrees
c = special.sindg(90)
print(c)
 
d = special.cosdg(45)
print(d)

e = special.tandg(45)
print(e)

#help(special)

In [None]:
#Cube root function: special.cbrt finds the element-wise cube root of values.
cb = special.cbrt([27, 64])
#print value of cb
print(cb)

In [None]:
#Eigenvalues and eigenvectors
#define two dimensional array
arr = np.array([[5,4],[6,3]])
#pass value into function
eg_val, eg_vect = linalg.eig(arr)
#get eigenvalues
print("eigen values:\n", eg_val)
#get eigenvectors
print("eigen vector:\n", eg_vect)
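
A quick way to check an eigendecomposition is to confirm that multiplying the matrix by each eigenvector gives the same vector scaled by its eigenvalue. The sketch below does this for the same 2x2 array:

```python
import numpy as np
from scipy import linalg

arr = np.array([[5, 4], [6, 3]])
eg_val, eg_vect = linalg.eig(arr)

# Each column of eg_vect is an eigenvector; multiplying by arr
# should scale column i by eigenvalue i
print(np.allclose(arr @ eg_vect, eg_vect * eg_val))   # True

# The eigenvalues of this matrix are 9 and -1 (order may vary)
print(np.sort(eg_val.real))
```

The values 9 and -1 follow from the characteristic equation λ² - 8λ - 9 = 0, where 8 is the trace and -9 the determinant of the matrix.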

# Matplotlib
Matplotlib is the "grandfather" library of data visualization with Python. It was created by John Hunter, who wanted to replicate the plotting capabilities of MATLAB (another programming language) in Python. So if you happen to be familiar with MATLAB, Matplotlib will feel natural to you.

It is an excellent 2D and 3D graphics library for generating scientific figures.

I encourage you to explore the official Matplotlib web page: http://matplotlib.org/

In [None]:
# importing matplotlib module  
import matplotlib.pyplot as plt
#or - from matplotlib import pyplot as plt

#help(plt)
#dir(plt)

### Line Plot
We can create a very simple line plot using the following (I encourage you to pause and use Shift+Tab along the way to check out the document strings for the functions we are using).
Let's walk through a very simple example using two NumPy arrays. You can also use lists, but most likely you'll be passing NumPy arrays or pandas columns (which essentially also behave like arrays).

In [None]:
import numpy as np

In [None]:
x = np.arange(0,10)

In [None]:
y = 2*x

In [None]:
x

In [None]:
y

In [None]:
# Function to plot 
plt.plot(x, y) 
plt.xlabel('x')
plt.ylabel('y')
plt.title('Test Plot')
#plt.show() # Required for non-jupyter users

In [None]:
#Editing more figure parameters
plt.plot(x, y) 
plt.xlabel('x')
plt.ylabel('y')
plt.title('Test Plot')
plt.xlim(0,6) # Lower Limit, Upper Limit
plt.ylim(0,12) # Lower Limit, Upper Limit
#plt.show() # Required for non-jupyter users

In [None]:
#Exporting a plot
plt.plot(x, y) 
plt.xlabel('x')
plt.ylabel('y')
plt.title('Test Plot')
plt.xlim(0,6) # Lower Limit, Upper Limit
plt.ylim(0,12) # Lower Limit, Upper Limit
plt.savefig('example.png')
#plt.savefig('C:/Users/Administrator/Desktop/fig1.png')
#help(plt.savefig)

In [None]:
ls

In [None]:
a = np.linspace(0, 2, 11)
a

In [None]:
plt.plot(a, a, label='a')
plt.plot(a, a**2, label='$a^2$')
plt.plot(a, a**3, label='$a^3$')

plt.xlabel('x label')
plt.ylabel('y label')

plt.title("Simple Plot")

plt.legend()

#plt.show()

In [None]:
a = np.arange(0, 10, 0.2)
a

In [None]:
b = np.sin(a)
plt.plot(a,b)
#plt.show()

In [None]:
plt.plot(a,b, marker='x')
#plt.show()
plt.plot(a, np.cos(a), marker='o')
#plt.show()

### Bar Plot

In [None]:
department=['IT','ECE','EEE','MECH','CIVIL']
students=[190,136,189,67,56]

In [None]:
# Function to plot the bar 
plt.bar(department,students)
plt.xlabel("Departments")
plt.ylabel("Students")
plt.title('Departments vs Students')
#help(plt.bar)

In [None]:
plt.barh(department,students)
plt.title('Departments vs Students')

In [None]:
#Multiple Bars showing total number of boys and girls in a department
boys=[40,25,34,56,23]
girls=[30,15,24,66,20]

In [None]:
w = 0.4
bar1 = np.arange(len(department))
plt.bar(bar1,boys, w, label="Boys")
bar2 = [i+w for i in bar1]
plt.bar(bar2,girls, w,label="Girls")
plt.title('Departments vs Students')
#plt.xticks(bar1,department)
plt.xticks(bar1+w/2,department)
plt.legend()

### Scatter Plot

In [None]:
x

In [None]:
y

In [None]:
# Function to plot scatter 
#plt.scatter(x, y)
plt.scatter(x, y,color='r',s=100,marker='*')
plt.grid()
#help(plt.scatter)

In [None]:
plt.scatter(x, y)
plt.scatter(x, x*3)

### Histogram

In a histogram, the x-axis shows ranges (bins) of a variable and the y-axis shows how often values fall in each bin.
We have sample data on the blood sugar levels of different patients. We will plot the number of patients per blood sugar range to figure out how many patients are normal, pre-diabetic and diabetic.

In [None]:
blood_sugar = [113, 85, 90, 150, 149, 88, 93, 115, 135, 80, 77, 82, 129]
plt.hist(blood_sugar) # by default number of bins is set to 10

In [None]:
blood_sugar = [113, 85, 90, 150, 149, 88, 93, 115, 135, 80, 77, 82, 129]
plt.hist(blood_sugar, rwidth=0.95) # by default number of bins is set to 10

In [None]:
plt.hist(blood_sugar,rwidth=0.95,bins=3)

Histogram showing the distribution of normal, pre-diabetic and diabetic patients:
* 80-100: Normal
* 100-125: Pre-diabetic
* 125-150: Diabetic

In [None]:
plt.xlabel("Sugar Level")
plt.ylabel("Number Of Patients")
plt.title("Blood Sugar Chart")

plt.hist(blood_sugar, bins=[80,100,125,150], rwidth=0.95, color='g')
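
To read the bar heights as numbers rather than off the chart, the same bin edges can be passed to `np.histogram`. This is a small sketch using the data above:

```python
import numpy as np

blood_sugar = [113, 85, 90, 150, 149, 88, 93, 115, 135, 80, 77, 82, 129]
bins = [80, 100, 125, 150]

# np.histogram returns the same counts that plt.hist draws as bars
counts, edges = np.histogram(blood_sugar, bins=bins)
print(counts)   # [6 2 4]

# The reading of 77 lies below the first bin edge, so it is not counted
print(counts.sum(), "of", len(blood_sugar), "readings fall inside the bins")
```

Each bin includes its left edge and excludes its right edge, except the last bin, which includes both. That is why the reading of 150 is counted.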

In [None]:
plt.xlabel("Sugar Level")
plt.ylabel("Number Of Patients")
plt.title("Blood Sugar Chart")

plt.hist(blood_sugar,bins=[80,100,125,150],rwidth=0.95,histtype='step')

In [None]:
#horizontal orientation
plt.xlabel("Number Of Patients")
plt.ylabel("Sugar Level")
plt.title("Blood Sugar Chart")

plt.hist(blood_sugar, bins=[80,100,125,150], rwidth=0.95, orientation='horizontal')

In [None]:
plt.xlabel("Sugar Level")
plt.ylabel("Number Of Patients")
plt.title("Blood Sugar Chart")

blood_sugar_men = [113, 85, 90, 150, 149, 88, 93, 115, 135, 80, 77, 82, 129]
blood_sugar_women = [67, 98, 89, 120, 133, 150, 84, 69, 89, 79, 120, 112, 100]

plt.hist([blood_sugar_men,blood_sugar_women], bins=[80,100,125,150], rwidth=0.95, color=['green','orange'],label=['men','women'])
plt.legend()

## Subplots

In [None]:
#help(plt.subplots)

In [None]:
#Subplot: number of rows, number of columns, index of the current plot (starting at 1), e.g. subplot(2,1,1)
plt.subplot(2,1,1)
x = np.arange(0, 10, 0.2)
sin = np.sin(x)
cos = np.cos(x)
plt.plot(x,sin,color='red',linewidth=2.4)

plt.subplot(2,1,2)
plt.plot(x,cos,color='green',linewidth=3.4)

In [None]:
plt.figure(figsize=(10,2))
plt.subplot(1,2,1)
x = np.arange(0, 10, 0.2)
sin = np.sin(x)
cos = np.cos(x)
plt.plot(x,sin,color='red',linewidth=2.4)

plt.subplot(1,2,2)
plt.plot(x,cos,color='green',linewidth=3.4)

In [None]:
x = np.linspace(0,10,11)
x

In [None]:
plt.figure(figsize=(10,8))
y1 = x
y2 = x**2
y3 = x**3
y4 = np.sqrt(x)

plt.subplot(2,2,1)
plt.plot(x,y1,'ro')
plt.title('$y_1=x$')

plt.subplot(2,2,2)
plt.plot(x,y2,'g--')
plt.title('$y_2=x^2$')

plt.subplot(2,2,3)
plt.plot(x,y3,'b^')
plt.title('$y_3=x^3$')

plt.subplot(2,2,4)
plt.plot(x,y4,'ks')
plt.title(r'$y_4=\sqrt{x}$')  # raw string so that \s is not treated as an escape sequence

In [None]:
#Creating subplots using Axes
#Create a figure and two subplots
x = np.arange(0, 10, 0.2)
sin = np.sin(x)
cos = np.cos(x)
fig, (ax1, ax2) = plt.subplots(2, 1)
ax1.plot(x, sin, marker='x')
ax1.set_title('SinX')
ax2.plot(x, cos, marker='o')
ax2.set_title('CosX')
#plt.show()

#help(plt.subplots)
#dir(ax1)
#help(ax1)
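
The Axes-based approach scales to grids as well. As a sketch, the four-panel figure built earlier with `plt.subplot` could be written with a single `plt.subplots(2, 2)` call. The `matplotlib.use("Agg")` line is only there so the snippet runs without a display; in a notebook you can leave it out.

```python
import matplotlib
matplotlib.use("Agg")   # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 11)

# One call creates the figure and a 2 x 2 grid of Axes objects
fig, axes = plt.subplots(2, 2, figsize=(10, 8))

axes[0, 0].plot(x, x, 'ro')
axes[0, 0].set_title('$y_1=x$')

axes[0, 1].plot(x, x**2, 'g--')
axes[0, 1].set_title('$y_2=x^2$')

axes[1, 0].plot(x, x**3, 'b^')
axes[1, 0].set_title('$y_3=x^3$')

axes[1, 1].plot(x, np.sqrt(x), 'ks')
axes[1, 1].set_title(r'$y_4=\sqrt{x}$')

fig.tight_layout()   # reduce overlap between titles and tick labels
```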

# Pandas
Pandas is an open source data analysis library for Python.
It uses an extremely powerful table object called a DataFrame, which is built directly on NumPy.

## Series
The first main data type we will learn about for pandas is the Series data type. Let's import Pandas and explore the Series object.
A Series is a data structure in Pandas that holds an array of information along with a named index.
A Series is very similar to a NumPy array (in fact it is built on top of the NumPy array object). What differentiates a Series from a NumPy array is that a Series can have axis labels, meaning it can be indexed by a label instead of just a number location. It also doesn't need to hold numeric data; it can hold any arbitrary Python object.

In [None]:
import numpy as np
import pandas as pd

In [None]:
#Creating a Series from Python Objects. We can create a Series from Python lists (also from NumPy arrays)
#help(pd.Series)

In [None]:
myindex = ['USA','Canada','Mexico']

In [None]:
mydata = [1776,1867,1821]

In [None]:
myser = pd.Series(data=mydata)

In [None]:
myser

In [None]:
myser = pd.Series(data=mydata,index=myindex)
#A Pandas Series adds a labeled index.

In [None]:
myser

In [None]:
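# Position-based access; plain myser[0] on a labeled Series is deprecated in recent pandas versions
myser.iloc[0]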

In [None]:
myser['USA']

In [None]:
#From a Dictionary
ages = {'Sammy':5,'Frank':10,'Spike':7}

In [None]:
ages

In [None]:
pd.Series(ages)

In [None]:
# Imaginary Sales Data for 1st and 2nd Quarters for Global Company
q1 = {'Japan': 80, 'China': 450, 'India': 200, 'USA': 250}
q2 = {'Brazil': 100,'China': 500, 'India': 210,'USA': 260}

In [None]:
# Convert into Pandas Series
sales_Q1 = pd.Series(q1)
sales_Q2 = pd.Series(q2)

In [None]:
sales_Q1

In [None]:
sales_Q2

In [None]:
# Call values based on Named Index
sales_Q1['Japan']

In [None]:
# Integer-based location information is also retained (use .iloc for positions)
sales_Q1.iloc[0]

In [None]:
# Grab just the index keys
sales_Q1.keys()

In [None]:
# Can Perform Operations Broadcasted across entire Series
sales_Q1 * 2

In [None]:
sales_Q2 / 100

In [None]:
# Notice how Pandas marks labels that are missing from either Series with NaN
sales_Q1 + sales_Q2
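
If a missing country should count as zero sales rather than produce NaN, the `add` method with `fill_value` is one option. A minimal sketch using the same imaginary sales data:

```python
import pandas as pd

sales_Q1 = pd.Series({'Japan': 80, 'China': 450, 'India': 200, 'USA': 250})
sales_Q2 = pd.Series({'Brazil': 100, 'China': 500, 'India': 210, 'USA': 260})

# fill_value=0 treats a country missing from one quarter as 0 instead of NaN
total = sales_Q1.add(sales_Q2, fill_value=0)
print(total)
```

Whether zero is the right substitute is a judgment call about the data: it assumes a missing entry means no sales rather than an unrecorded value.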

## DataFrames
A DataFrame is a table of columns and rows in Pandas that we can easily restructure and filter. (Formal definition: a group of Pandas Series objects that share the same index.)

In [None]:
#Creating a DataFrame from Python Objects
# help(pd.DataFrame)
# Make sure the seed is in the same cell as the random call
np.random.seed(42)
mydata = np.random.randint(0,101,(4,3))

In [None]:
mydata

In [None]:
myindex = ['CA','NY','AZ','TX']

In [None]:
mycolumns = ['Jan','Feb','Mar']

In [None]:
df = pd.DataFrame(data=mydata)
df

In [None]:
df = pd.DataFrame(data=mydata,index=myindex)
df

In [None]:
df = pd.DataFrame(data=mydata,index=myindex,columns=mycolumns)
df 

In [None]:
df.info()

### Reading a .csv file for a DataFrame

In [None]:
#Print your current directory file path with pwd
#WHERE IS MY PYTHON CODE LOCATED?

In [None]:
pwd

In [None]:
ls

In [None]:
df = pd.read_csv('Data.csv')

In [None]:
df

In [None]:
#Obtaining Basic Information About DataFrame
df.columns

In [None]:
df.index

In [None]:
df.head(3)

In [None]:
df.tail(3)

In [None]:
df.shape

In [None]:
df.info()

In [None]:
len(df)

In [None]:
df.describe()

In [None]:
df.describe().transpose()

In [None]:
#Selection and Indexing
df.head()

In [None]:
#Grab a Single Column
df['Salary']

In [None]:
#Grab Multiple Columns
# Note how it's a Python list of column names! Thus the double brackets.
df[['Salary','Purchased']]

In [None]:
#Create New Columns
df['monthly_salary'] = df['Salary'] / 12

In [None]:
df

In [None]:
#Adjust Existing Columns
df['monthly_salary'] = np.round(df['monthly_salary'],2)
#help(np.round)

In [None]:
df

In [None]:
#Remove Columns
df = df.drop("monthly_salary",axis=1)

In [None]:
df

In [None]:
#Grab a Single Row
# Integer Based
df.iloc[0]

In [None]:
# Grab Multiple Rows
df.iloc[0:4]

In [None]:
# Grab Multiple Rows
df.iloc[2:7]
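
Rows can also be selected by label with `.loc`, and by condition in the same way as the NumPy conditional selection shown earlier. The sketch below uses a small made-up DataFrame as a stand-in for `Data.csv`; only the column names `Salary` and `Purchased` come from this notebook, and the values and index labels are hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for Data.csv; the real file's contents are not shown here
df = pd.DataFrame(
    {'Salary': [72000, 48000, 54000, 61000],
     'Purchased': ['No', 'Yes', 'No', 'Yes']},
    index=['a', 'b', 'c', 'd'],
)

# Label-based row selection with .loc
print(df.loc['b'])
print(df.loc[['a', 'c']])

# Conditional selection works like it does for NumPy arrays
high_salary = df[df['Salary'] > 55000]
print(high_salary)

# Combine conditions with & and parentheses
print(df[(df['Salary'] > 50000) & (df['Purchased'] == 'Yes')])
```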