This study guide should reinforce and provide practice for all of the concepts you have seen in the past week. There is a mix of written questions and coding exercises; both are equally important for preparing you for the sprint challenge and for speaking about these topics comfortably in interviews and on the job.

If you get stuck or are unsure of something, remember the 20-minute rule. If that doesn't help, research a solution with Google and Stack Overflow. Only once you have exhausted these methods should you turn to your Team Lead; they won't be there during your sprint challenge or an interview. That being said, don't hesitate to ask for help if you truly are stuck.

Have fun studying!

# Deep Learning Architectures

## Definitions

Define the following terms in your own words. Do not simply copy and paste a definition found elsewhere; reword it so that it is understandable and memorable to you. *Double click the markdown to add your definitions.*

**Recurrent Neural Network:** `Your Answer Here`

**Long Short Term Memory:** `Your Answer Here`

**Convolution:** `Your Answer Here`

**Convolutional Neural Network:** `Your Answer Here`

**Transfer Learning:** `Your Answer Here`

**Autoencoder:** `Your Answer Here`

**Generative Adversarial Network:** `Your Answer Here`

# Questions of Understanding

1. What does the "deep" in Deep Learning refer to?
```
Your Answer Here
```

2. Name at least two types of problems that RNNs/LSTMs are good for. Why are RNNs particularly suited for these problems?
```
Your Answer Here
```

3. What weakness in RNNs does an LSTM help to address?
```
Your Answer Here
```

4. Name at least two types of problems that CNNs are good for. Why are CNNs particularly suited for these problems?
```
Your Answer Here
```

5. What are the advantages of transfer learning?
```
Your Answer Here
```

6. Name at least two types of problems that autoencoders are good for. Why are autoencoders particularly suited for these problems?
```
Your Answer Here
```

## Practice Problems

### RNNs

Run the code below to create a dummy dataset. Then build a Neural Network in Keras using at least one LSTM layer.

In [1]:
# Starter code for dummy data
from numpy import array
 
def split_sequence(sequence, n_steps):
	X, y = list(), list()
	for i in range(len(sequence)):
		end_ix = i + n_steps
		if end_ix > len(sequence)-1:
			break
		seq_x, seq_y = sequence[i:end_ix], sequence[end_ix]
		X.append(seq_x)
		y.append(seq_y)
	return array(X), array(y)
 
raw_seq = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120]
X, y = split_sequence(raw_seq, 3)
n_features = 1
X = X.reshape((X.shape[0], X.shape[1], n_features))
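
To make the windowing in `split_sequence` easier to follow, here is a minimal list-based sketch of the same idea on a shorter sequence. The name `split_sequence_lists` and the five-element input are illustrative assumptions, not part of the starter code; the sketch only aims to show how each window of `n_steps` values is paired with the value that follows it, and how the trailing feature axis is added.

```python
def split_sequence_lists(sequence, n_steps):
    """List-based sketch of the sliding-window split (hypothetical helper)."""
    # Each sample needs n_steps inputs plus one target, so there are
    # len(sequence) - n_steps samples in total (never fewer than zero).
    n_samples = max(len(sequence) - n_steps, 0)
    X = [sequence[i:i + n_steps] for i in range(n_samples)]
    y = [sequence[i + n_steps] for i in range(n_samples)]
    return X, y


X_small, y_small = split_sequence_lists([10, 20, 30, 40, 50], n_steps=3)
print(X_small)  # [[10, 20, 30], [20, 30, 40]]
print(y_small)  # [40, 50]

# Add a trailing feature axis: (samples, timesteps) -> (samples, timesteps, 1),
# which is the 3D layout an LSTM layer expects.
X_3d = [[[value] for value in window] for window in X_small]
print(len(X_3d), len(X_3d[0]), len(X_3d[0][0]))  # 2 3 1
```

With the 12-element `raw_seq` and `n_steps = 3`, the same logic yields 9 samples, which matches the shapes of `X` and `y` printed in this notebook.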

In [2]:
from tensorflow.keras.layers import Dense, LSTM, Embedding
from tensorflow.keras.models import Sequential

In [3]:
X

array([[[ 10],
        [ 20],
        [ 30]],

       [[ 20],
        [ 30],
        [ 40]],

       [[ 30],
        [ 40],
        [ 50]],

       [[ 40],
        [ 50],
        [ 60]],

       [[ 50],
        [ 60],
        [ 70]],

       [[ 60],
        [ 70],
        [ 80]],

       [[ 70],
        [ 80],
        [ 90]],

       [[ 80],
        [ 90],
        [100]],

       [[ 90],
        [100],
        [110]]])

In [4]:
y

array([ 40,  50,  60,  70,  80,  90, 100, 110, 120])

In [5]:
# define model
RNN = Sequential()

# RNN.add(Embedding(input_dim=100, output_dim=128))
RNN.add(LSTM(32, input_shape=X[0].shape))
RNN.add(Dense(64, activation='relu'))
RNN.add(Dense(128, activation='relu'))
RNN.add(Dense(1, activation='relu'))

RNN.compile(loss='mean_absolute_error', optimizer='adam', metrics=['MAE', 'MSE'])

# fit model

RNN.fit(X,y, batch_size=1, epochs=1000)

Epoch 1/1000
Epoch 2/1000
Epoch 3/1000
...

<tensorflow.python.keras.callbacks.History at 0x7fe410172b70>

Once you have trained your model, run the code below to test it. The predicted output should be near `130` and the rounded output should be exactly `130`. We can round our prediction because we know our sequence only deals in increments of 10; with real data this would be a bad idea.

In [30]:
# demonstrate prediction
x_input = array([100, 110, 120])
x_input = x_input.reshape((1, 3, n_features))
yhat = RNN.predict(x_input, verbose=0)[0][0]

print('Prediction:', yhat)
print('Rounded to Nearest 10:', int((10 * round(yhat/10))))

Prediction: 124.50973
Rounded to Nearest 10: 120
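
The rounding step above can be wrapped in a small helper, which makes it easier to see why a prediction of about `124.5` lands on `120` rather than `130`. This is a minimal sketch; the function name `round_to_nearest` and the `step` parameter are assumptions introduced for illustration.

```python
def round_to_nearest(value, step=10):
    """Round value to the nearest multiple of step (hypothetical helper)."""
    # Python's built-in round() uses round-half-to-even, so exact halves
    # such as 12.5 go to the nearest even integer (12), not always up.
    return int(step * round(value / step))


print(round_to_nearest(124.50973))  # 120
print(round_to_nearest(128.2))      # 130
print(round_to_nearest(125))        # 120 (12.5 rounds to the even value 12)
```

A prediction therefore has to be above 125 before this rounding reports `130`.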


### CNNs

[Jian-Yang](https://www.youtube.com/watch?v=ACmydtFDTGs) from Silicon Valley might be on to the next billion-dollar company. Beat him to market by creating your own hotdog detector using a pretrained model from TensorFlow Hub. You should be able to feed your model an image, and the final output should be either "Hotdog" or "Not hotdog".

In [18]:
from tensorflow.keras.applications.resnet50 import ResNet50
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications.resnet50 import preprocess_input, decode_predictions
import numpy as np

In [19]:
model = ResNet50(weights='imagenet')

Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/resnet/resnet50_weights_tf_dim_ordering_tf_kernels.h5


Once complete, test out your model with random hotdog and non-hotdog related images.

In [20]:
img_path = 'elephant.jpg'
img = image.load_img(img_path, target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)

preds = model.predict(x)
# decode the results into a list of tuples (class, description, probability)
# (one such list for each sample in the batch)
print('Predicted:', decode_predictions(preds, top=3)[0])

Downloading data from https://storage.googleapis.com/download.tensorflow.org/data/imagenet_class_index.json
Predicted: [('n01871265', 'tusker', 0.6167958), ('n02504013', 'Indian_elephant', 0.21267118), ('n02504458', 'African_elephant', 0.17049867)]


In [21]:
img_path = 'hotdog.jpg'
img = image.load_img(img_path, target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)

preds = model.predict(x)
# decode the results into a list of tuples (class, description, probability)
# (one such list for each sample in the batch)
print('Predicted:', decode_predictions(preds, top=3)[0])

Predicted: [('n07697537', 'hotdog', 0.9972065), ('n07684084', 'French_loaf', 0.00096120656), ('n01955084', 'chiton', 0.00037866313)]
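
The cells above print the top ImageNet classes, but the exercise asks for a final answer of "Hotdog" or "Not hotdog". One possible way to get there is to post-process the decoded predictions, as in the sketch below. The helper name `hotdog_label` and the `0.5` probability threshold are assumptions chosen for illustration; the two sample lists are copied from the outputs shown above.

```python
def hotdog_label(decoded, threshold=0.5):
    """Map decoded predictions for one image to 'Hotdog' or 'Not hotdog'.

    decoded is a list of (class_id, description, probability) tuples,
    i.e. one element of what decode_predictions returns.
    The threshold is an assumed cutoff, not a tuned value.
    """
    for _class_id, description, probability in decoded:
        if description == "hotdog" and probability >= threshold:
            return "Hotdog"
    return "Not hotdog"


# Decoded predictions copied from the notebook outputs above.
elephant_preds = [('n01871265', 'tusker', 0.6167958),
                  ('n02504013', 'Indian_elephant', 0.21267118),
                  ('n02504458', 'African_elephant', 0.17049867)]
hotdog_preds = [('n07697537', 'hotdog', 0.9972065),
                ('n07684084', 'French_loaf', 0.00096120656),
                ('n01955084', 'chiton', 0.00037866313)]

print(hotdog_label(elephant_preds))  # Not hotdog
print(hotdog_label(hotdog_preds))    # Hotdog
```

In the notebook this would be called as `hotdog_label(decode_predictions(preds, top=3)[0])`.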


# Artificial Intelligence

1. What is General Artificial Intelligence?
```
Your Answer Here
```

2. Is General Artificial Intelligence possible? Why or why not?
```
Your Answer Here
```

3. As data science improves, what positive and negative impacts do you think it will have on our society?
```
Your Answer Here
```

4. What are the benefits and dangers of pursuing AI? Does the good outweigh the bad?
```
Your Answer Here
```

5. The world is unlikely to stop pursuing the goal of thinking machines, so what steps/precautions can we take to approach the challenge in a safe and ethical way?
```
Your Answer Here
```

6. Automation is now starting to impact fields that have traditionally been safe from such concerns, such as transportation, machine learning, and healthcare. What impact do you think this will have on these professions? Will these jobs simply shift to accommodate the automation, or will the automation eliminate the jobs?
```
Your Answer Here
```

## Practice Problem

Try using AutoML on the dataset below. It's great if you can get it working but don't spend too much time struggling with it. If you get stuck, move on to the next question.

In [None]:
#https://github.com/mljar/mljar-supervised

In [1]:
# Use the raw file URL; the github.com/.../blob/... page serves HTML, not CSV
data_url = 'https://raw.githubusercontent.com/bundickm/Study-Guides/master/data/hearts.csv'

In [2]:
!pip install mljar-supervised



In [3]:
import pandas as pd
from sklearn.model_selection import train_test_split
from supervised.automl import AutoML

In [16]:
df = pd.read_csv("https://raw.githubusercontent.com/bundickm/Study-Guides/master/data/hearts.csv")

In [17]:
df.head()

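| Unnamed: 0.1 | Unnamed: 0 | age | sex | cp | trestbps | chol | fbs | restecg | thalach | exang | oldpeak | slope | ca | thal | target |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 63 | Male | D | 145 | 233 | 1 | 0 | 150 | 0 | 2.3 | 0 | 0 | 1 | 1 |
| 1 | 1 | 37 | Male | C | 130 | 250 | 0 | 1 | 187 | 0 | 3.5 | 0 | 0 | 2 | 1 |
| 2 | 2 | 41 | Female | B | 130 | 204 | 0 | 0 | 172 | 0 | 1.4 | 2 | 0 | 2 | 1 |
| 3 | 3 | 56 | Male | B | 120 | 236 | 0 | 1 | 178 | 0 | 0.8 | 2 | 0 | 2 | 1 |
| 4 | 4 | 57 | Female | A | 120 | 354 | 0 | 1 | 163 | 1 | 0.6 | 2 | 0 | 2 | 1 |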


In [18]:
X_train, X_test, y_train, y_test = train_test_split(
    df[df.columns[:-1]], df["target"], test_size=0.25, random_state = 42)

In [20]:
automl = AutoML()

Create directory AutoML_3


In [21]:
automl.fit(X_train, y_train)

AutoML task to be solved: binary_classification
AutoML will use algorithms: ['Baseline', 'Linear', 'Decision Tree', 'Random Forest', 'Xgboost', 'Neural Network']
AutoML will optimize for metric: logloss
1_Baseline final logloss 0.6893155378231264 time 0.02 seconds
2_DecisionTree final logloss 1.000000500029089e-06 time 7.47 seconds
3_Linear final logloss 0.05850106617097879 time 2.76 seconds
4_Default_RandomForest final logloss 1.000000500029089e-06 time 8.02 seconds
5_Default_Xgboost final logloss 0.011891497560499007 time 2.7 seconds
6_Default_NeuralNetwork final logloss 0.005129155547215784 time 336.89 seconds
Ensemble final logloss 1.000000500029089e-06 time 0.12 seconds


In [22]:
predictions = automl.predict(X_test)

In [23]:
predictions

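| | prediction_0 | prediction_1 | label |
|---|---|---|---|
| 0 | 1.0 | 0.0 | 0 |
| 1 | 1.0 | 0.0 | 0 |
| 2 | 0.0 | 1.0 | 1 |
| 3 | 1.0 | 0.0 | 0 |
| 4 | 0.0 | 1.0 | 1 |
| ... | ... | ... | ... |
| 71 | 0.0 | 1.0 | 1 |
| 72 | 1.0 | 0.0 | 0 |
| 73 | 0.0 | 1.0 | 1 |
| 74 | 1.0 | 0.0 | 0 |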


In [24]:
from sklearn.metrics import accuracy_score

In [25]:
print(f"Model scores {accuracy_score(y_test,predictions.label):.3f} on heart disease dataset")

Model scores 1.000 on heart disease dataset
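
A perfect score like this is worth a second look. The feature selection `df[df.columns[:-1]]` keeps every column except `target`, which includes the index-like `Unnamed` columns visible in `df.head()`. If the rows happen to be ordered by target, those columns could leak the label; whether that is what happened here is an assumption, not something the outputs confirm. The sketch below shows one way to drop such columns before splitting, using a tiny made-up frame (`toy`) rather than the real dataset.

```python
import pandas as pd

# Toy frame with hypothetical values; only the column names mirror
# the pattern seen in df.head().
toy = pd.DataFrame({
    "Unnamed: 0.1": [0, 1, 2, 3],
    "Unnamed: 0": [0, 1, 2, 3],
    "age": [63, 37, 41, 56],
    "target": [1, 1, 0, 0],
})

# Treat any column whose name starts with "Unnamed" as a leftover index.
index_like = [col for col in toy.columns if col.startswith("Unnamed")]

features = toy.drop(columns=index_like + ["target"])
labels = toy["target"]

print(index_like)              # ['Unnamed: 0.1', 'Unnamed: 0']
print(list(features.columns))  # ['age']
```

Re-running the AutoML fit on features cleaned this way would show whether the perfect accuracy depends on those columns.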


What are some of the uses and limitations of AutoML and other similar tools?
```
Your Answer Here
```