<img align="left" src="https://lever-client-logos.s3.amazonaws.com/864372b1-534c-480e-acd5-9711f850815c-1524247202159.png" width=200>
<br></br>
<br></br>

# Major Neural Network Architectures Challenge
## *Data Science Unit 4 Sprint 3 Challenge*

In this sprint challenge, you'll explore some of the cutting edge of Data Science. This week we studied several famous neural network architectures: 
recurrent neural networks (RNNs), long short-term memory networks (LSTMs), convolutional neural networks (CNNs), and autoencoders. In this sprint challenge, you will revisit these models. Remember, we are testing your knowledge of these architectures, not your ability to fit a model with high accuracy. 

__*Caution:*__  these approaches can be pretty heavy computationally. All problems were designed so that you should be able to achieve results within at most 5-10 minutes of runtime locally, on AWS SageMaker, on Colab or on a comparable environment. If something is running longer, double check your approach!

## Challenge Objectives
*You should be able to:*
* <a href="#p1">Part 1</a>: Train an LSTM classification model
* <a href="#p2">Part 2</a>: Utilize a pre-trained CNN for object detection
* <a href="#p3">Part 3</a>: Describe a use case for an autoencoder
* <a href="#p4">Part 4</a>: Describe yourself as a Data Scientist and elucidate your vision of AI

<a id="p1"></a>
## Part 1 - LSTMs

Use an LSTM to fit a multi-class classification model on Reuters news articles to distinguish the topics of the articles. The data is already encoded properly for use in an LSTM model. 

Your Tasks: 
- Use Keras to fit a predictive model, classifying news articles into topics. 
- Report your overall score and accuracy

For reference, the [Keras IMDB sentiment classification example](https://github.com/keras-team/keras/blob/master/examples/imdb_lstm.py) will be useful, as well as the LSTM code we used in class.

__*Note:*__  Focus on getting a running model, not on maxing accuracy with extreme data size or epoch numbers. Only revisit and push accuracy if you get everything else done!

In [1]:
from tensorflow.keras.datasets import reuters

(X_train, y_train), (X_test, y_test) = reuters.load_data(num_words=None,
                                                         skip_top=0,
                                                         maxlen=None,
                                                         test_split=0.2,
                                                         seed=723812,
                                                         start_char=1,
                                                         oov_char=2,
                                                         index_from=3)

In [2]:
# Demo of encoding

word_index = reuters.get_word_index(path="reuters_word_index.json")

print(f"Iran is encoded as {word_index['iran']} in the data")
print(f"London is encoded as {word_index['london']} in the data")
print("Words are encoded as numbers in our dataset.")

Iran is encoded as 779 in the data
London is encoded as 544 in the data
Words are encoded as numbers in our dataset.


In [3]:
# Do not change this line. You need the +1 for some reason. 
max_features = len(word_index.values()) + 1

# TODO - your code! - Imports

import warnings
warnings.filterwarnings("ignore", category=FutureWarning)

import random
import sys
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.preprocessing import sequence
from tensorflow.keras.layers import Dense, Embedding, Dropout, LSTM

import matplotlib.pyplot as plt
import seaborn as sns
sns.set()

In [4]:
# Print the number of samples for train and test sets
print(f'Number of Training Samples: {len(X_train)}')
print(f'Number of Test Samples: {len(X_test)}')

# Print the number of classes
num_class = max(y_train) + 1
print(f'Number of Training Classes: {num_class}')

Number of Training Samples: 8982
Number of Test Samples: 2246
Number of Training Classes: 46


In [5]:
# Look at one training example and its label
print(X_train[3])
print(y_train[3])

[1, 346, 273, 94, 187, 53, 74, 472, 26, 14, 46, 19, 124, 15, 39, 74, 32, 6582, 18, 14, 46, 61, 6097, 18, 1730, 1668, 32, 11, 14, 996, 12, 11, 123, 346, 39, 235, 627, 276, 5, 19, 19, 11, 15, 17, 12]
3


In [6]:
# Shift indices by 3 so the "special" tokens map to human-readable names
word_index = {k:(v + 3) for k, v in word_index.items()}
word_index["<PAD>"] = 0
word_index["<START>"] = 1
word_index["<UNKNOWN>"] = 2

# Perform reverse word lookup and make it callable
rev_word_index = dict([(value, key) for (key, value) in word_index.items()])
def decode_review(text):
  return ' '.join([rev_word_index.get(i, '?') for i in text])

In [7]:
# Look at the max, min, average length of the whole dataset
all_articles = np.concatenate((X_train, X_test), axis=0)
print(f'Maximum article length: {len(max((all_articles), key=len))}')
print(f'Minimum article length: {len(min((all_articles), key=len))}')
result = [len(x) for x in all_articles]
print(f'Average article length: {round(np.mean(result))}')

# Print an article and its class as stored in the dataset
print('\nA Machine Readable Article:')
print('  Article Text: ' + str(X_train[9]))
print('  Article Topic: ' + str(y_train[9]))

# Print the same article in human readable format
print('\nA Human Readable Article:')
print('  Article Text: ' + decode_review(X_train[9]))

Maximum article length: 2376
Minimum article length: 2
Average article length: 146.0

A Machine Readable Article:
  Article Text: [1, 53, 46, 312, 26, 14, 74, 134, 26, 39, 46, 5775, 18, 14, 74, 19, 3843, 18, 86, 981, 19, 11, 14, 924, 19, 11, 155, 230, 53, 74, 321, 26, 14, 74, 119, 26, 39, 74, 32, 5328, 18, 14, 74, 32, 3253, 18, 86, 2389, 44, 11, 14, 2012, 61, 11, 17, 12]
  Article Topic: 3

A Human Readable Article:
  Article Text: <START> shr loss seven cts vs profit 12 cts net loss 662 000 vs profit 1 520 000 revs 59 1 mln vs 63 1 mln six mths shr profit 23 cts vs profit 20 cts net profit 2 802 000 vs profit 2 543 000 revs 138 5 mln vs 126 7 mln reuter 3


In [8]:
# Add padding to the beginning of each sequence (and truncate long ones to max_len)
max_len = 125
X_train = sequence.pad_sequences(X_train, maxlen=max_len)
X_test = sequence.pad_sequences(X_test, maxlen=max_len)

In [9]:
X_train[3]

array([   0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
          0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
          0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
          0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
          0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
          0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
          0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
          0,    0,    0,    1,  346,  273,   94,  187,   53,   74,  472,
         26,   14,   46,   19,  124,   15,   39,   74,   32, 6582,   18,
         14,   46,   61, 6097,   18, 1730, 1668,   32,   11,   14,  996,
         12,   11,  123,  346,   39,  235,  627,  276,    5,   19,   19,
         11,   15,   17,   12], dtype=int32)

In [10]:
from tensorflow.keras.layers import Bidirectional

# Create an LSTM model
lstm = Sequential()
lstm.add(Embedding(max_features, 64, input_length=max_len))
lstm.add(LSTM(256, return_sequences=True))
lstm.add(Dropout(0.25))
lstm.add(Bidirectional(LSTM(128)))
lstm.add(Dropout(0.25))
# One output unit per topic; softmax over a single unit would always output 1.0
lstm.add(Dense(num_class, activation='softmax'))

# Compile the model
# The labels are integer class ids (not one-hot), so use the sparse loss
lstm.compile(loss='sparse_categorical_crossentropy',
             optimizer='sgd',
             metrics=['accuracy'])

# Look at the summary of the model
lstm.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding (Embedding)        (None, 125, 64)           1982720   
_________________________________________________________________
lstm (LSTM)                  (None, 125, 256)          328704    
_________________________________________________________________
dropout (Dropout)            (None, 125, 256)          0         
_________________________________________________________________
bidirectional (Bidirectional (None, 256)               394240    
_________________________________________________________________
dropout_1 (Dropout)          (None, 256)               0         
_________________________________________________________________
dense (Dense)                (None, 46)                11822     
Total params: 2,717,486
Trainable params: 2,717,486
Non-trainable params: 0
______________________________________________

In [11]:
# Fit my model
lstm1 = lstm.fit(X_train, y_train,
                 batch_size=64,
                 epochs=10,
                 validation_data=(X_test, y_test))

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


## Sequence Data Question
#### *Describe the `pad_sequences` method used on the training dataset. What does it do? Why do you need it?*

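  - `pad_sequences` adds 0s to either the beginning (the default) or the end of each sequence, and truncates sequences longer than `maxlen`, so that all sequences in a list have the same length. This is needed because the model consumes each batch as a fixed-shape tensor, so every article must be encoded with the same number of time steps. 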
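
To make the default behaviour concrete, here is a minimal pure-Python sketch of pre-padding and pre-truncation. The helper name `pad_pre` is hypothetical and is not part of Keras; it only mirrors what `pad_sequences` does with its default `padding='pre'` and `truncating='pre'` settings, and it returns plain lists rather than a NumPy array.

```python
def pad_pre(sequences, maxlen, value=0):
    """Left-pad (or left-truncate) each sequence to exactly `maxlen` items.

    Hypothetical helper for illustration only; not a Keras function.
    """
    padded = []
    for seq in sequences:
        seq = list(seq)
        if len(seq) > maxlen:
            # Too long: drop tokens from the front, keep the last `maxlen`
            seq = seq[len(seq) - maxlen:]
        # Too short: fill the front with the padding value
        fill = [value] * (maxlen - len(seq))
        padded.append(fill + seq)
    return padded


example = pad_pre([[1, 346, 273], [1, 53, 46, 312, 26]], maxlen=4)
print(example)
```

The short sequence gains a leading 0, while the long one loses its first token, so both end up with four entries.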


## RNNs versus LSTMs
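#### *What are the primary motivations behind using Long Short-Term Memory (LSTM) cell units over traditional Recurrent Neural Networks?*

  - As a traditional RNN steps through a long sequence, it loses its "memory" of earlier values that could still be important, because gradients shrink (vanish) as they are propagated back through many time steps. An LSTM adds a gated cell state that can carry information across many steps, which lets the model learn longer-range dependencies.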
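
The loss of "memory" can be illustrated with a toy calculation. In a simple RNN, the gradient that reaches an early time step is a product of one factor per step; if those factors are below 1, the product shrinks exponentially. The sketch below is a deliberately simplified scalar model, not a real RNN: the function name `backprop_factor` and the per-step value 0.9 are assumptions chosen for illustration, and a factor of 1.0 stands in for an LSTM cell state whose forget gate stays fully open.

```python
def backprop_factor(per_step_factor, steps):
    """Scale of a gradient after being multiplied by the same factor each step."""
    factor = 1.0
    for _ in range(steps):
        factor *= per_step_factor
    return factor


# 0.9 is an assumed, illustrative per-step scaling, not a measured value
for steps in (10, 50, 100):
    rnn_like = backprop_factor(0.9, steps)
    lstm_like = backprop_factor(1.0, steps)
    print(f"{steps:>3} steps: rnn-like={rnn_like:.6f}  lstm-like={lstm_like:.1f}")
```

After 100 steps the RNN-like signal is close to zero, while the idealized LSTM-like path passes it through unchanged.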

## RNN / LSTM Use Cases
#### *Name and Describe 3 Use Cases of LSTMs or RNNs and why they are suited to that use case*

  1. Weather Forecasting - An LSTM suits this case because seasonal patterns matter: data from a year or more ago helps the forecast account for the time of year and what the average weather is like then.

  2. Stock Value Forecasting - An LSTM is a reasonable choice here unless the stock is new and has little history. The more historical data you can train on, the closer you could get to predicting what that stock will do based on its past behavior.

  3. Predicting Text - This could be done with either an RNN or an LSTM, depending on the use case. For short sequences a plain RNN works fine, but for long sequences an LSTM is the better choice, because an RNN struggles to retain older values even when they are valuable for predicting future ones.
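
All three use cases share the same framing: a window of past values is used to predict the next one. The sketch below shows one way that framing could be set up before feeding data to an RNN or LSTM. The helper name `make_windows` and the temperature values are made up for illustration.

```python
def make_windows(series, window):
    """Split a series into (input window, next value) training pairs."""
    pairs = []
    for start in range(len(series) - window):
        inputs = series[start:start + window]   # the `window` most recent values
        target = series[start + window]         # the value to predict
        pairs.append((inputs, target))
    return pairs


temps = [12, 14, 15, 13, 16, 18]  # made-up daily temperatures
pairs = make_windows(temps, window=3)
for inputs, target in pairs:
    print(inputs, "->", target)
```

A longer `window` gives the model more history per example, which is where an LSTM's ability to retain older values becomes useful.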

<a id="p2"></a>
## Part 2 - CNNs

### Find the Frog

Time to play "find the frog!" Use Keras and [ResNet50v2](https://www.tensorflow.org/api_docs/python/tf/keras/applications/resnet_v2) (pre-trained) to detect which of the images within the `frog_images` subdirectory have a frog in them. Note: You will need to upload the images to Colab. 

<img align="left" src="https://d3i6fh83elv35t.cloudfront.net/newshour/app/uploads/2017/03/GettyImages-654745934-1024x687.jpg" width=400>

The skimage function below will help you read all the frog images into memory at once. You should use the preprocessing functions that come with ResNetV2, and you should also resize the images using scikit-image.

In [None]:
from skimage.io import imread_collection

images = imread_collection('./frog_images/*.jpg')

In [None]:
print(type(images))
print(type(images[0]), end="\n\n")

Your goal is to validly run ResNet50v2 on the input images - don't worry about tuning or improving the model. Print out the predictions in any way you see fit. 

*Hint* - ResNet50v2 doesn't just return "frog". The three labels it has for frogs are: `bullfrog, tree frog, tailed frog`
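
One way to act on the hint is to scan the decoded predictions for any of the frog labels. The sketch below assumes the predictions have already been passed through `decode_predictions`, which yields, for each image, a list of `(class_id, label, score)` tuples, and that the label strings use underscores (`tree_frog`, `tailed_frog`). The helper name `find_frogs` and the sample predictions are hypothetical and hand-written; they are not real model output.

```python
FROG_LABELS = {"bullfrog", "tree_frog", "tailed_frog"}


def find_frogs(decoded_predictions):
    """Return the indices of images whose top-k labels include a frog class."""
    frog_indices = []
    for index, top_k in enumerate(decoded_predictions):
        labels = {label for _class_id, label, _score in top_k}
        if labels & FROG_LABELS:  # any overlap with the frog labels
            frog_indices.append(index)
    return frog_indices


# Hypothetical predictions in the (class_id, label, score) shape; scores are made up
sample = [
    [("n01644373", "tree_frog", 0.81), ("n01641577", "bullfrog", 0.07)],
    [("n02123045", "tabby", 0.64), ("n02124075", "Egyptian_cat", 0.21)],
]
print(find_frogs(sample))
```

Checking the top few labels rather than only the first makes the search more forgiving when a frog is not the model's most confident guess.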

*Stretch goals:* 
- Check for other things such as fish.
- Print out the image with its predicted label
- Wrap everything nicely in well-documented functions

In [None]:
from tensorflow.keras.applications.resnet_v2 import ResNet50V2, decode_predictions, preprocess_input
# TODO - your code!


<a id="p3"></a>
## Part 3 - Autoencoders

Describe a use case for an autoencoder given that an autoencoder tries to predict its own input. 

__*Your Answer:*__ 


<a id="p4"></a>
## Part 4 - More...

Answer the following questions, with a target audience of a fellow Data Scientist:

- What do you consider your strongest area, as a Data Scientist?
- What area of Data Science would you most like to learn more about, and why?
- Where do you think Data Science will be in 5 years?
- What are the threats posed by AI to our society?
- How do you think we can counteract those threats? 
- Do you think achieving Artificial General Intelligence is ever possible?

A few sentences per answer is fine - only elaborate if time allows.

## Congratulations! 

Thank you for your hard work, and congratulations! You've learned a lot, and you should proudly call yourself a Data Scientist.


In [None]:
from IPython.display import HTML

HTML("""<iframe src="https://giphy.com/embed/26xivLqkv86uJzqWk" width="480" height="270" frameBorder="0" class="giphy-embed" allowFullScreen></iframe><p><a href="https://giphy.com/gifs/mumm-champagne-saber-26xivLqkv86uJzqWk">via GIPHY</a></p>""")