# Connect Intensive - Machine Learning Nanodegree

## Week 1. Python Crash Course  

## Objectives    

- Jupyter notebook 
- Basic Python programming  
- Numpy
- Pandas 
- Data visualization with Matplotlib and Seaborn 

## Prerequisites   

 - You should have **Python 2.7** installed (if not, please [download and install Python 2.7](https://www.python.org/downloads/))
 - You should also install (and perhaps upgrade) the following packages, if you haven't already:
    - [numpy](http://www.numpy.org/)
    - [pandas](http://pandas.pydata.org/)
    - [matplotlib](http://matplotlib.org/)  
    - [seaborn](http://seaborn.pydata.org)  

---

## 2 | Numpy

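NumPy is a numerical computing library for Python, built around a fast N-dimensional array type and a set of linear algebra routines. Almost all of the libraries in the PyData ecosystem rely on NumPy as one of their main building blocks. NumPy is also fast, because its core array operations are implemented in compiled C code.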

Some basic topics we cover here include:
- Creating NumPy arrays
- Built-in NumPy methods
- Array indexing / selection / slicing
- Broadcasting 
- Array operations 

> **Additional reference:** 
> See [this Stack Overflow post](http://stackoverflow.com/questions/993984/why-numpy-instead-of-python-lists) on why to use NumPy arrays instead of Python lists, and [this NumPy cheat sheet](https://www.dataquest.io/blog/numpy-cheat-sheet/) for a quick reference.

In [None]:
# import numpy as a library
import numpy as np
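
As a rough illustration of why arrays are preferred over lists for numeric work, the sketch below squares the same values twice: once with a list comprehension and once with a single array expression. The variable names are made up for this example.

```python
import numpy as np

values = list(range(5))

# Pure Python: the interpreter handles each element in turn
squared_list = [v * v for v in values]

# NumPy: one vectorized expression applied to the whole array
squared_arr = np.array(values) ** 2

print(squared_list)           # [0, 1, 4, 9, 16]
print(squared_arr.tolist())   # [0, 1, 4, 9, 16]
```

Both approaches give the same numbers; the array version simply pushes the loop into compiled code, which tends to matter more as the data grows.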

### Create Numpy Array from Python List

In [None]:
my_list = [1, 2, 3]
my_list

In [None]:
# create numpy array from list
np.array(my_list)

In [None]:
my_matrix = [[1,2,3],[4,5,6],[7,8,9]]
my_matrix

In [None]:
np.array(my_matrix)

### Built-in Methods

**arange**

In [None]:
np.arange(0, 10)

In [None]:
np.arange(0, 10, 3) # a step size of 3

**linspace**   

In [None]:
# np.linspace(start, stop, num)
np.linspace(0, 10, 3) # generate 3 evenly spaced numbers, stop included

In [None]:
np.linspace(0, 10) # default num is 50
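
The two functions are easy to mix up, so here is a small side-by-side sketch: `arange` takes a step and leaves out the stop value, while `linspace` takes a count and includes the stop value by default. The names `stepped` and `spaced` are just for illustration.

```python
import numpy as np

# arange: third argument is the step, stop (10) is excluded
stepped = np.arange(0, 10, 5)
print(stepped.tolist())   # [0, 5]

# linspace: third argument is the number of points, stop (10) is included
spaced = np.linspace(0, 10, 3)
print(spaced.tolist())    # [0.0, 5.0, 10.0]

# With no count given, linspace returns 50 points
print(len(np.linspace(0, 10)))   # 50
```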

**zeros and ones**

In [None]:
np.zeros(5)

In [None]:
np.zeros((5, 5))

In [None]:
np.ones(5)

In [None]:
np.ones((5, 5))

**random**

In [None]:
# Create an array of the given shape and fill it with random samples from a 
# uniform distribution over the interval [0, 1): 0 is included, 1 is not
np.random.rand(3)

In [None]:
# Return an array of the given shape filled with samples from the
# standard normal distribution (mean 0, standard deviation 1)
np.random.randn(3)

In [None]:
# Return random integers from the given range of low (inclusive) to high (exclusive).
np.random.randint(1, 10) # specify numbers with np.random.randint(low, high, size)
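
The random cells above give different numbers on every run. If repeatable results are needed (for example, when comparing notes with a classmate), one option is to fix the seed first. This is a minimal sketch; the seed value 42 is arbitrary.

```python
import numpy as np

# Fixing the seed makes the sequence of random numbers repeatable
np.random.seed(42)
first = np.random.rand(3)

np.random.seed(42)
second = np.random.rand(3)

print(np.array_equal(first, second))   # True

# Passing size draws several integers at once, all in [1, 10)
draws = np.random.randint(1, 10, size=5)
print(draws.shape)   # (5,)
```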

**dtype** 

In [None]:
arr = np.random.randint(1, 10, 5)
arr.dtype
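
Every element of an array shares one `dtype`. The sketch below shows how the dtype follows from the input data and how `astype` converts it. The exact integer type (for example `int64` or `int32`) depends on the platform, so the code only checks that it is some integer type.

```python
import numpy as np

# Integers in, integer dtype out (exact width is platform dependent)
arr_int = np.array([1, 2, 3])
print(np.issubdtype(arr_int.dtype, np.integer))   # True

# A single float in the input promotes the whole array to float
arr_mixed = np.array([1.0, 2, 3])
print(arr_mixed.dtype)   # float64

# astype returns a converted copy; arr_int itself is unchanged
arr_float = arr_int.astype(np.float64)
print(arr_float.dtype)   # float64
```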

**max, min, argmax, argmin**

In [None]:
arr = np.array([10, 2, 3, 6, 7])
arr

In [None]:
arr.max()

In [None]:
arr.argmax()

In [None]:
arr.min()

In [None]:
arr.argmin()
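
To make the relationship explicit: `max` and `min` return values, while `argmax` and `argmin` return the positions of those values. A short sketch using the same array:

```python
import numpy as np

arr = np.array([10, 2, 3, 6, 7])

idx_max = arr.argmax()   # position of the largest value
idx_min = arr.argmin()   # position of the smallest value
print(idx_max, idx_min)  # 0 1

# Indexing with the returned position gives back the value
print(arr[idx_max] == arr.max())   # True
print(arr[idx_min] == arr.min())   # True
```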

**shape, reshape**

In [None]:
arr = np.random.randn(4, 4)
arr

In [None]:
arr.shape

In [None]:
arr.reshape(1, 16).shape

In [None]:
# Pass -1 for one dimension and NumPy infers it from the array size and the remaining dimensions
arr.reshape(2, -1).shape 

In [None]:
# reshape returns a new array object; the original array keeps its shape
arr.shape
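
The reshape cells above can be summarized in one sketch: `-1` is inferred, the original shape is untouched unless the result is assigned back, and a shape that cannot hold all the elements raises a `ValueError`. The flag name `failed` is only for this example.

```python
import numpy as np

arr = np.arange(12)

# 12 elements over 3 rows: NumPy infers 4 columns for the -1
reshaped = arr.reshape(3, -1)
print(reshaped.shape)   # (3, 4)
print(arr.shape)        # (12,)

# To keep the new shape, assign the result back to a name
arr = arr.reshape(3, 4)
print(arr.shape)        # (3, 4)

# 12 elements cannot be split evenly into 5 rows
try:
    arr.reshape(5, -1)
    failed = False
except ValueError:
    failed = True
print(failed)           # True
```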

### Array Indexing / Slicing / Selection

In [None]:
arr = np.arange(0, 11)
arr

In [None]:
# Get a value at an index
arr[8]

In [None]:
# Get values in a range
arr[1: 5]

In [None]:
# Select elements by a condition
arr[arr > 5]
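
Selection by condition works in two steps: the comparison builds a boolean array (a mask), and indexing with that mask keeps the elements where it is `True`. A minimal sketch, which also shows how to combine conditions with `&` (each condition needs its own parentheses):

```python
import numpy as np

arr = np.arange(0, 11)

# Step 1: the comparison yields one boolean per element
mask = arr > 5
print(mask.dtype)           # bool

# Step 2: indexing with the mask keeps the True positions
print(arr[mask].tolist())   # [6, 7, 8, 9, 10]

# Combine conditions element-wise with & (and) or | (or)
even_above_two = arr[(arr > 2) & (arr % 2 == 0)]
print(even_above_two.tolist())   # [4, 6, 8, 10]
```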

In [None]:
arr_2d = np.array(([1, 2, 3], [4, 5, 6], [7, 8, 9]))
arr_2d

In [None]:
# Indexing row
arr_2d[1]

In [None]:
# Get individual element value [row][col]
arr_2d[1][2]

In [None]:
# Get individual element value [row, col]
arr_2d[1, 2]

In [None]:
# Get slice of 2d array
arr_2d[:2, 1:]

In [None]:
arr_2d[2, :]

### Broadcasting

In [None]:
arr = np.arange(0, 11)
arr

In [None]:
# Broadcast a scalar: assign one value to every element in an index range
arr[0:5] = 99
arr
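
Assigning a scalar to a slice is the simplest case of broadcasting: the single value is stretched to fit the target. The same rule applies between arrays whose shapes are compatible. The sketch below is an illustration with made-up names (`row`, `col`, `grid`, `matrix`), not part of the original exercise.

```python
import numpy as np

row = np.array([1, 2, 3])            # shape (3,)
col = np.array([[10], [20], [30]])   # shape (3, 1)

# (3, 1) + (3,) broadcasts to (3, 3): every col value meets every row value
grid = col + row
print(grid.tolist())   # [[11, 12, 13], [21, 22, 23], [31, 32, 33]]

# Assigning a 1-D array to a 2-D target repeats it across the rows
matrix = np.zeros((2, 3))
matrix[:] = row
print(matrix.tolist())   # [[1.0, 2.0, 3.0], [1.0, 2.0, 3.0]]
```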

In [None]:
# Reset array
arr = np.arange(0,11)
arr

In [None]:
# Create a slice of the array and inspect it
arr_slice = arr[0:5]
arr_slice

In [None]:
# Broadcast 99 to every element of the slice
arr_slice[:] = 99
arr_slice

In [None]:
# ORIGINAL ARRAY
arr

**Changes also occur in our original array!** Slicing does not copy data: the sliced array is a view onto the original array's memory, which avoids the cost of duplicating large arrays. To get an independent array, make an explicit copy.

In [None]:
# Make a copy of an array
arr_2 = arr.copy()
arr_2
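
To check the difference between a view and a copy directly, the sketch below writes to both and then inspects the original. It uses `np.shares_memory`, which is available in reasonably recent NumPy releases; the names `view` and `copied` are illustrative.

```python
import numpy as np

arr = np.arange(0, 11)

view = arr[0:5]            # a view: shares memory with arr
copied = arr[0:5].copy()   # a copy: owns its own data

copied[:] = -1             # does not touch arr
view[:] = 99               # writes through to arr

print(arr.tolist())   # [99, 99, 99, 99, 99, 5, 6, 7, 8, 9, 10]
print(np.shares_memory(arr, view))     # True
print(np.shares_memory(arr, copied))   # False
```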

### Array Operators

In [None]:
arr = np.arange(0, 11)
arr

In [None]:
arr + arr

In [None]:
arr * arr

In [None]:
arr / arr # expect a warning: the first element is 0 / 0

In [None]:
1 / arr # expect a warning: the first element is 1 / 0

In [None]:
arr * 2

In [None]:
arr**2

In [None]:
np.sqrt(arr) # square root

In [None]:
np.exp(arr) # exponential: e raised to each element

In [None]:
np.log(arr) # natural log, expect a warning: log(0) is -inf

In [None]:
np.max(arr) # same as arr.max()
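
Several cells above emit warnings because the array contains 0. NumPy does not stop on these operations; it fills in `nan` or `inf` and carries on. If the warnings are expected, one way to silence them locally is the `np.errstate` context manager. The sketch below uses a float array so that the results are the same on Python 2 and Python 3.

```python
import numpy as np

arr = np.arange(0, 4, dtype=float)   # [0., 1., 2., 3.]

# Silence divide-by-zero and invalid-value warnings inside this block only
with np.errstate(divide='ignore', invalid='ignore'):
    ratio = arr / arr      # 0 / 0 gives nan
    inverse = 1 / arr      # 1 / 0 gives inf
    logs = np.log(arr)     # log(0) gives -inf

print(np.isnan(ratio[0]))      # True
print(np.isinf(inverse[0]))    # True
print(logs[0] == -np.inf)      # True
```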