# Intro to Machine Intelligence

___


### Agenda

1. Introductions
2. How a computer works
3. What are AI, DS, and ML?
4. Supervised ML
5. Unsupervised ML
6. Reinforcement Learning
7. Deep Learning
8. Issues with ML and the Data Pipeline
9. Recap and Questions

# How Does a Computer Work?

---



***A computer*** is a device that can be instructed to carry out arbitrary sequences of arithmetic or logical operations automatically. The ability of computers to follow generalized sets of operations, called programs, enables them to perform an extremely wide range of tasks. - Wikipedia

***Computer programming*** is a process that leads from an original formulation of a computing problem to executable computer programs. Programming involves activities such as analysis, developing understanding, generating algorithms, verification of requirements of algorithms including their correctness and resources consumption, and implementation of algorithms in a target programming language. - Wikipedia

[***Python***](https://docs.python.org/3/) is an interpreted high-level programming language for general-purpose programming. Python has a design philosophy that emphasizes code readability, and a syntax that allows programmers to express concepts in fewer lines of code. - Wikipedia

### Printing Out Results Using Python

Run our first Python program, which prints the string "Hello World!!".

In [0]:
# Our First Python Program
print("Hello World!!")

Hello World!!


1. Define a math function called "***math_function***" that will compute $y = (x_1 \cdot x_2 + x_3)$.

2. Run the function to compute $y = (1 \cdot 2 + 3)$, then display the result.

3. Run the function to compute $y = (3 \cdot 4 + 5)$, then display the result.

### Python Functions

In [0]:
# A program that defines and runs a simple math function

# Define a function that multiplies inputs 1 and 2, then adds input 3 to the result
def math_function(input_1, input_2, input_3):
  # y = (x1 * x2) + x3
  output = input_1 * input_2 + input_3
  return output

# Run the function on both sets of inputs and display the results
print(math_function(1, 2, 3))  # (1 * 2) + 3
print(math_function(3, 4, 5))  # (3 * 4) + 5

1. Create two lists called 'array_1' and 'array_2'

2. Define a new math function called "***mult***" that will multiply elements in both lists together.

3. Calculate the results of multiplying elements of *'array_1'* and *'array_2'* and store results in *'array_3'*

4. Print the results contained in *'array_3'* to display

5. Call the built-in function *'sum'* to total all the elements in *'array_3'*, then print the result

In [0]:
# Create two lists of numbers
array_1 = [1, 2, 3, 4, 5]
array_2 = [2, 4, 6, 8, 10]
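
The five steps listed above can be sketched as follows. This is a minimal sketch rather than the workshop's own solution: the notebook names the function `mult` but does not show its body, so the implementation here (pairing elements with `zip`) and the variable `total` are assumptions.

```python
# Create two lists of numbers
array_1 = [1, 2, 3, 4, 5]
array_2 = [2, 4, 6, 8, 10]

# Hypothetical implementation of 'mult': multiply the lists element by element
def mult(list_1, list_2):
    return [a * b for a, b in zip(list_1, list_2)]

# Store the element-wise products in array_3 and display them
array_3 = mult(array_1, array_2)
print(array_3)

# Add up the products with the built-in sum; this total is the dot product
total = sum(array_3)
print(total)
```

Summing the element-wise products is exactly what a dot product computes, which is why the last step uses `sum`.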

Now that we can calculate the dot product of two lists, we can use an external library called NumPy to do it more efficiently.


1. Load additional predefined functions from an external library called NumPy

2. Create two NumPy arrays called 'array_1' and 'array_2'

3. Multiply 'array_1' and 'array_2' element by element and store the result in 'array_3'

4. Print the contents of 'array_3'

5. Call the NumPy (np) function 'dot' to calculate the dot product of arrays 1 and 2

6. Print the dot product

### Python Classes

In [0]:
# Define a class object (a blueprint or template to create our AIAgents)
class AIAgent():
    def __init__(self, health, greeting):
        self.health = health
        self.greeting = greeting
        
    def talk(self):
        print(self.greeting)

In [0]:
# Initialize our Hal AIAgent with health 10 and his greeting
Hal = AIAgent(10, "Hello Dave.")
Hal.talk()
print(Hal.health)

Hello Dave.
10


In [0]:
# Initialize our Jarvis AIAgent with health 200 and his greeting
Jarvis = AIAgent(200, "Good morning. It's 7 A.M. The weather in Malibu is 72 degrees with scattered clouds. The\nsurf conditions are fair with waist to shoulder highlines, high tide will be at 10:52 a.m.")
Jarvis.talk()
print(Jarvis.health)

Good morning. It's 7 A.M. The weather in Malibu is 72 degrees with scattered clouds. The
surf conditions are fair with waist to shoulder highlines, high tide will be at 10:52 a.m.
200


In [0]:
# Initialize T1000 AIAgent with health 1000 and his greeting
T1000 = AIAgent(1000, "Hasta la vista, baby!")
T1000.talk()
print(T1000.health)

Hasta la vista, baby!
1000


### Python NumPy Library

In [0]:
# Load a library (a file of additional predefined functions and classes not in the standard Python language)
import numpy as np # Load the NumPy library and rename it 'np'
# Create two NumPy arrays using the np.array function
array_1 = np.array([1,2,3,4,5])
array_2 = np.array([2,4,6,8,10])
# Multiply arrays 1 and 2 element by element
array_3 = array_1 * array_2
# Print the result
print(array_3)

# A shortcut NumPy function called 'dot' that computes the dot product
array_3 = np.dot(array_1, array_2)
print(array_3)


# Supervised Machine Learning

___


[***NumPy***](http://www.numpy.org/) is a library for the Python programming language, adding support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays.

[***Pandas***](http://pandas.pydata.org/pandas-docs/stable/) is a software library written for the Python programming language for data manipulation and analysis. In particular, it offers data structures and operations for manipulating numerical tables and time series. 

[***Scikit-learn***](http://scikit-learn.org/stable/documentation.html) is a machine learning library for the Python programming language. It features various classification, regression and clustering algorithms including support vector machines, random forests, gradient boosting, k-means and DBSCAN. - Wikipedia

### Regression

1. Load the sklearn and numpy libraries

2. Load (or define) our dataset

3. Define the performance metric we would like the ML model to optimize

4. Define the ML model (linear regression) we would like to use to learn from the data

5. Train (fit) the ML model

6. Make a prediction with a new data point

In [0]:
# Supervised Learning - Regression 
# Import the sklearn machine learning library
import sklearn
import numpy as np
from sklearn import datasets, linear_model

# Load in a dataset
X = np.array([1,2,3,4,5,6,7,8,9,10]).reshape(-1, 1)
y = np.array([1.1, 1.9,2.8,4,5.2,5.8,6.9,8.1,9,9.9]).reshape(-1, 1)

# Setup a performance metric
from sklearn.metrics import mean_squared_error, r2_score

# Create a machine learning model
linear_regression_model = linear_model.LinearRegression()

# Train the model
linear_regression_model.fit(X, y)

# Make a prediction with the model (predict expects a 2D array: one row per sample)
prediction = linear_regression_model.predict([[4.5]])
print(prediction)
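
The cell above imports `mean_squared_error` and `r2_score` but never calls them. As a hedged sketch of how the metric step might be used, the block below refits the same model and scores it on the data it was trained on. The names `model`, `fitted`, `mse`, and `r2` are illustrative assumptions, and scoring on training data only shows how well the line fits these ten points, not how it would do on unseen data.

```python
import numpy as np
from sklearn import linear_model
from sklearn.metrics import mean_squared_error, r2_score

# Same toy dataset as the cell above (y is kept 1D here for simplicity)
X = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]).reshape(-1, 1)
y = np.array([1.1, 1.9, 2.8, 4, 5.2, 5.8, 6.9, 8.1, 9, 9.9])

# Create and train the model
model = linear_model.LinearRegression()
model.fit(X, y)

# Score the fitted line against the training targets
fitted = model.predict(X)
mse = mean_squared_error(y, fitted)
r2 = r2_score(y, fitted)
print("slope:", model.coef_[0], "intercept:", model.intercept_)
print("MSE:", mse, "R2:", r2)

# Predict for a new data point (2D input: one row, one feature)
prediction = model.predict([[4.5]])
print(prediction)
```

Because the data lie close to the line y = x, the learned slope should be close to 1 and the R2 score close to 1.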

### Classification

1. Load the sklearn and numpy libraries

2. Load (or define) our dataset

3. Define the performance metric we would like the ML model to optimize

4. Define the ML model (decision tree) we would like to use to learn from the data

5. Train (fit) the ML model

6. Make a prediction with a new data point

In [0]:
# Supervised Learning - Classification
# Import the sklearn machine learning library
import sklearn
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Load in a dataset
X = [[1,1],[1,2],[1,3],[1,4],[2,1],[2,2],[2,3],[2,4],[3,1],[3,2],[3,3],[3,4],[4,1],[4,2],[4,3],[4,4]]
y = [0,1,1,1,0,0,1,1,0,0,1,1,0,0,0,1]

# Setup a performance metric (accuracy suits classification)
from sklearn.metrics import accuracy_score

# Create a machine learning model
decision_tree_classifier_model = DecisionTreeClassifier()

# Train the model
decision_tree_classifier_model.fit(X, y)

# Make a prediction with model
prediction = decision_tree_classifier_model.predict([[1,3.5],[3,1]])
print(prediction)
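
For a classifier, a natural performance metric is accuracy: the fraction of labels predicted correctly. The sketch below is an illustration under assumptions, not part of the original cell: it uses `accuracy_score` from `sklearn.metrics`, fixes `random_state=0` for repeatability, and scores the tree on its own training data, which gives an optimistic number.

```python
from sklearn.metrics import accuracy_score
from sklearn.tree import DecisionTreeClassifier

# Same toy dataset as the cell above
X = [[1,1],[1,2],[1,3],[1,4],[2,1],[2,2],[2,3],[2,4],
     [3,1],[3,2],[3,3],[3,4],[4,1],[4,2],[4,3],[4,4]]
y = [0,1,1,1,0,0,1,1,0,0,1,1,0,0,0,1]

# Create and train the model
model = DecisionTreeClassifier(random_state=0)
model.fit(X, y)

# Compare the model's predictions on the training points with the true labels
train_predictions = model.predict(X)
train_accuracy = accuracy_score(y, train_predictions)
print(train_accuracy)
```

An unrestricted decision tree can memorize 16 distinct points, so a perfect training score says little about new data; a held-out test set gives a fairer estimate.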

# Unsupervised Machine Learning

---


### Clustering

1. Load the sklearn and numpy libraries

2. Load (or define) our dataset

3. Define our unsupervised ML Cluster Model

4. Train (fit) the ML model

5. Make predictions (assign new points to clusters) and inspect the cluster centers

In [0]:
# Use the sklearn library
import numpy as np
from sklearn.cluster import KMeans

# Load a dataset
X = np.array([[1, 2], [1, 4], [1, 0], [4, 2], [4, 4], [4, 0]])

# Choose a clustering model
kmeans = KMeans(n_clusters=2, random_state=0)

# Train the model
kmeans.fit(X)

# Display the results of the model: cluster labels for new points, then the cluster centers
print(kmeans.predict([[0, 0], [4, 4]]))
print(kmeans.cluster_centers_)
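
To show what "fit" does for a clustering model, here is a minimal NumPy sketch of the basic k-means loop (assign each point to its nearest center, then move each center to the mean of its points). It is not scikit-learn's implementation: the function name `kmeans_sketch`, the hand-picked starting centers, and the fixed iteration cap are all assumptions, and the sketch assumes no cluster ever ends up empty.

```python
import numpy as np

def kmeans_sketch(points, initial_centers, n_iters=10):
    """Hypothetical helper: plain k-means with caller-supplied starting centers."""
    centers = np.array(initial_centers, dtype=float)
    for _ in range(n_iters):
        # Distance from every point to every center, shape (n_points, n_clusters)
        distances = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        # Assignment step: each point joins its nearest center
        labels = distances.argmin(axis=1)
        # Update step: each center moves to the mean of its assigned points
        new_centers = np.array([points[labels == k].mean(axis=0)
                                for k in range(len(centers))])
        if np.allclose(new_centers, centers):
            break  # centers stopped moving, so the clustering has converged
        centers = new_centers
    return centers, labels

X = np.array([[1, 2], [1, 4], [1, 0], [4, 2], [4, 4], [4, 0]])
centers, labels = kmeans_sketch(X, initial_centers=[[1, 2], [4, 2]])
print(centers)
print(labels)
```

The result depends on the starting centers, which is one reason library implementations use randomized initialization and expose a `random_state` parameter.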

# Reinforcement Learning

---

[The Snake Game](https://www.google.com/search?q=snake+game&oq=snake+game&aqs=chrome..69i57.2067j0j4&sourceid=chrome&ie=UTF-8)

1. Import the Pygame library for animation

2. Define the game logic

3. Define the agent that interacts with the game environment

4. Run the simulation
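
The notebook does not include the Snake code, so the block below is only a stand-in sketch of the same loop (game logic, agent, simulation) on a much smaller problem, without Pygame. Everything in it is an assumption for illustration: a one-dimensional board with five positions, "food" at the right end, and a tabular Q-learning agent with hand-picked hyperparameters.

```python
import random

N_STATES = 5        # positions 0..4 on a line; position 4 holds the food
ACTIONS = [-1, +1]  # action 0 moves left, action 1 moves right

def step(state, action):
    """Game logic: move, stay on the board, reward 1 for reaching the food."""
    next_state = min(max(state + ACTIONS[action], 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done

def choose_action(q_table, state, epsilon, rng):
    """Agent: explore with probability epsilon, otherwise act greedily."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    best = max(q_table[state])
    # Break ties at random so an untrained agent does not favor one direction
    return rng.choice([a for a, q in enumerate(q_table[state]) if q == best])

def train(episodes=500, max_steps=200, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Simulation: play many episodes and update the Q-table after every move."""
    rng = random.Random(seed)
    q_table = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            action = choose_action(q_table, state, epsilon, rng)
            next_state, reward, done = step(state, action)
            # Q-learning update: move the estimate toward
            # reward + discounted value of the best next action
            future = 0.0 if done else gamma * max(q_table[next_state])
            q_table[state][action] += alpha * (reward + future - q_table[state][action])
            state = next_state
            if done:
                break
    return q_table

q_table = train()
# Greedy policy for the non-terminal positions: 0 = left, 1 = right
policy = [values.index(max(values)) for values in q_table[:-1]]
print(policy)
```

After training, the greedy policy should choose action 1 (move right, toward the food) in every position. A real Snake agent would need a far larger state representation, but the loop of act, observe reward, and update has the same shape.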

# Deep Learning

---

[TensorFlow](https://www.tensorflow.org/versions/r1.1/get_started/) is an open-source software library for dataflow programming across a range of tasks. It is a symbolic math library, and also used for machine learning applications such as neural networks. - Wikipedia

### Short TensorFlow Tutorial

1. Initialize our TensorFlow and NumPy libraries

2. Start the TensorFlow session

3. Initialize our input variables 'input1' and 'input2'

4. Define the math function to add inputs 1 and 2 and store the result in 'output'

5. Evaluate 'output' and store the value in 'result'

6. Display 'result'

In [0]:
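# Note: tf.Session belongs to the TensorFlow 1.x API; TensorFlow 2.x exposes it as tf.compat.v1.Session
import tensorflow as tf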
import numpy as np

with tf.Session():
  input1 = tf.constant(1.0, shape=[2, 3])
  input2 = tf.constant(np.reshape(np.arange(1.0, 7.0, dtype=np.float32), (2, 3)))
  output = tf.add(input1, input2)
  result = output.eval()

result

array([[ 2.,  3.,  4.],
       [ 5.,  6.,  7.]], dtype=float32)
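
For comparison, the same arithmetic can be written directly in NumPy, which evaluates each line immediately instead of building a graph and running it in a session. This is a sketch for intuition only; `np.full` stands in for `tf.constant(1.0, shape=[2, 3])`.

```python
import numpy as np

# A 2x3 array filled with 1.0 (the counterpart of tf.constant(1.0, shape=[2, 3]))
input1 = np.full((2, 3), 1.0, dtype=np.float32)
# The values 1.0 through 6.0 arranged in 2 rows and 3 columns
input2 = np.reshape(np.arange(1.0, 7.0, dtype=np.float32), (2, 3))

# Element-wise addition, computed immediately
result = input1 + input2
print(result)
```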

### TensorFlow with MNIST

1. Load NumPy and TensorFlow libraries

2. Load the 'MNIST' digit image dataset

3. Define our Neural Network Weights

4. Define our Input and Output variables

5. Define our Error Reduction and Training Algorithms

6. Train Our Neural Network Model

7. Evaluate and Display the Performance of Model

In [0]:
# Import Google's TensorFlow Deep Learning Library (this cell uses the TensorFlow 1.x API)
import numpy as np
import tensorflow as tf

# Load the Dataset
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

# Define the Neural Network Model Parameters
W = tf.Variable(tf.zeros([784, 10])) # Neural Network Weights
b = tf.Variable(tf.zeros([10])) # Bias
 
# Implement the Neural Network Model using the Softmax Activation Function
# Model Inputs and Outputs
x = tf.placeholder(tf.float32, [None, 784]) # Input feature variables
y_ = tf.placeholder(tf.float32, [None, 10]) # True labels (one-hot encoded)
y = tf.nn.softmax(tf.matmul(x, W) + b) # Predicted class probabilities

# Define the Training Parameters
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1])) # Loss function (cross-entropy error)
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy) # Training method (gradient descent, learning rate 0.5)

# Train the Model
sess = tf.InteractiveSession() # Start an interactive session to run the graph
tf.global_variables_initializer().run()
for _ in range(1000): # Train for 1K iterations
  batch_xs, batch_ys = mnist.train.next_batch(100) # Train with a batch of 100 training examples
  sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

# Evaluate and Display Accuracy
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))


Successfully downloaded train-images-idx3-ubyte.gz 9912422 bytes.
Extracting MNIST_data/train-images-idx3-ubyte.gz
Successfully downloaded train-labels-idx1-ubyte.gz 28881 bytes.
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Successfully downloaded t10k-images-idx3-ubyte.gz 1648877 bytes.
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Successfully downloaded t10k-labels-idx1-ubyte.gz 4542 bytes.
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz
0.9177
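
The two formulas at the heart of the model above, softmax and cross-entropy, can be checked by hand on a tiny example. The NumPy sketch below is an illustration under assumptions: the logits and labels are made up, and subtracting the row maximum inside `softmax` is a common numerical-stability step that the TensorFlow cell does not spell out.

```python
import numpy as np

def softmax(logits):
    # Subtract each row's maximum first so np.exp cannot overflow
    shifted = logits - logits.max(axis=1, keepdims=True)
    exponentials = np.exp(shifted)
    # Normalize so every row sums to 1 and can be read as probabilities
    return exponentials / exponentials.sum(axis=1, keepdims=True)

def cross_entropy(probabilities, one_hot_labels):
    # Same formula as the cell above: mean over examples of -sum(y_ * log(y))
    return np.mean(-np.sum(one_hot_labels * np.log(probabilities), axis=1))

# Two made-up examples with three classes each
logits = np.array([[2.0, 1.0, 0.1],
                   [0.0, 0.0, 0.0]])
labels = np.array([[1.0, 0.0, 0.0],
                   [0.0, 0.0, 1.0]])

probabilities = softmax(logits)
loss = cross_entropy(probabilities, labels)
print(probabilities)
print(loss)
```

The first example puts most of its probability on the correct class and so contributes a small loss; the second spreads probability evenly across the three classes and contributes log(3). Training lowers the average of these per-example losses.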


# Model Evaluation

---

### Data Science Pipeline for Predictive Machine Learning Models

1. Import Libraries

2. Load the Data

3. Analyze the Dataset

4. Clean, Transform, and Prepare Data

5. Split the Data into Training and Testing Datasets

6. Choose Performance Metric for Model

7. Initialize our ML Model

8. Train and Validate ML Algorithm with Multiple Parameters to Find Optimal Model

9. Make Predictions

In [0]:
# Import libraries
# Note: load_boston was removed in scikit-learn 1.2, so this cell needs an older release
import sklearn
import numpy as np
import pandas as pd
from sklearn.datasets import load_boston
from sklearn.tree import DecisionTreeRegressor
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.metrics import r2_score
from sklearn.metrics import make_scorer
from sklearn.model_selection import train_test_split
from sklearn.model_selection import ShuffleSplit
from sklearn.model_selection import GridSearchCV
from sklearn.preprocessing import Normalizer

# Load dataset
boston = load_boston()
features, prices = boston['data'], boston['target']

# Analyze the data
features = pd.DataFrame(features)
print(features.describe())

# Clean, transform, and prepare data
normal_features = Normalizer().fit_transform(features)
# Keep the 3 features most related to price (f_regression, since the target is continuous)
new_features = SelectKBest(f_regression, k=3).fit_transform(normal_features, prices)

# Split the data
features_train, features_test, prices_train, prices_test = train_test_split(new_features, prices, test_size=0.2, random_state=42)

# Choose a performance metric
scoring_fnc = make_scorer(r2_score)

# Initialize a machine learning model
regressor = DecisionTreeRegressor()

# Train and validate best model
cv_sets = ShuffleSplit(n_splits = 10, test_size = 0.20, random_state = 0)
params = {'max_depth': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]}
grid = GridSearchCV(regressor, param_grid = params, scoring = scoring_fnc, cv = cv_sets)
grid = grid.fit(features_train, prices_train)
best_regressor = grid.best_estimator_

# Make new predictions
best_predictions = best_regressor.predict(features_test)
print("Final R2 score on the testing data: {:.4f}".format(r2_score(prices_test, best_predictions)))
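
The nine pipeline steps can also be sketched on a dataset that still ships with current scikit-learn releases. The block below is an assumption-laden variant, not the workshop's original: it swaps in `load_diabetes` (whose features are already scaled, so the normalization step is dropped) and wraps feature selection and the model in a `Pipeline` so that selection is re-fit inside each cross-validation split.

```python
from sklearn.datasets import load_diabetes
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.metrics import r2_score
from sklearn.model_selection import GridSearchCV, ShuffleSplit, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.tree import DecisionTreeRegressor

# Load the data (stand-in dataset, chosen because it is bundled with scikit-learn)
features, target = load_diabetes(return_X_y=True)

# Split the data into training and testing sets
features_train, features_test, target_train, target_test = train_test_split(
    features, target, test_size=0.2, random_state=42)

# Prepare data and initialize the model in one pipeline
pipeline = Pipeline([
    ("select", SelectKBest(f_regression, k=3)),       # keep the 3 strongest features
    ("tree", DecisionTreeRegressor(random_state=0)),  # the regression model
])

# Train and validate over several tree depths, scoring each with R2
cv_sets = ShuffleSplit(n_splits=10, test_size=0.20, random_state=0)
params = {"tree__max_depth": list(range(1, 11))}
grid = GridSearchCV(pipeline, param_grid=params, scoring="r2", cv=cv_sets)
grid.fit(features_train, target_train)

# Make predictions on the held-out test set with the best model found
best_depth = grid.best_params_["tree__max_depth"]
test_score = r2_score(target_test, grid.predict(features_test))
print("Best max_depth:", best_depth)
print("R2 on the testing data: {:.4f}".format(test_score))
```

Keeping the test set out of the grid search means the final R2 score reflects data the model never saw during training or tuning.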


# Summary

----

### Recap

1. Python Programming
2. AI, DS, and ML
3. Supervised ML
4. Unsupervised ML
5. Reinforcement Learning
6. Deep Learning
7. Issues with ML and the Data Pipeline