# Homework 4
Charles Kornoelje | CS 344 | 4/23/2020 | Cal Uni


## Exercise 1 - Essay
I believe that "deep" neural networks are a breakthrough and will be widely used in the future. One reason is
	that neural networks are founded on ideas that have been around for decades, such as the perceptron from the
	1950s and the connectionism that followed in the 1980s. Though these ideas are old, their applications were not
	widely used until the 2010s and the rise of machine learning. It took until the 2010s to have processors fast
	enough and data sets large enough to create useful neural networks. The applications of neural networks are vast,
	so these networks will continue to be used for years. Such applications include identifying zip codes on
	envelopes (as studied in class with the MNIST dataset) as well as personalized ads, suggestion features (such as
	recommended videos on YouTube or Netflix, friend suggestions on Facebook, and product suggestions on Amazon), and
	deciding on sentencing for those guilty of crimes. We generate so much data that we need neural networks as a
	way to find meaningful answers. And although we use neural networks today, there is still a need to improve them
	and to learn how to apply them more effectively.
	Interestingly, some applications of neural networks work so well that Amazon is removing product suggestions
	to limit how much it sells during the COVID-19 pandemic
	([a Robinhood Article](https://snacks.robinhood.com/newsletters/25IZbjYcBGpV6UMaUGQIGz/),
	[and a forum post](https://www.resetera.com/threads/did-amazon-remove-their-%E2%80%9Ccustomers-who-bought-viewed-this-item-also-bought-viewed-%E2%80%9D-section.177169/)).
	Because of the wide application of "deep" neural networks, the large amounts of data being generated,
	technological advancements, and both companies and individuals pushing for efficient problem solving, these
	neural networks are here to stay for a long time.
	
New efficiencies and techniques for neural networks will arrive in the future, making "deep" neural networks
	even more powerful and solidifying their place in society. Humans desperately want to create some sort of
	artificial intelligence that mimics the human mind, and I believe that using networks that relate to neurons in
	the mind is a critical step toward creating human-like AI. We are advancing quicker than ever, and the fact that
	we have kept using neural networks since the 2010s instead of moving on to something new is reason to believe
	that this technology isn't a bust. I do concede that current neural networks are mostly used for pattern matching
	and perception and lack things like logical deduction, so I believe that neural networks might be used as part of
	larger AI systems in the future. Currently, neural networks rely on good data, so if we get better at collecting
	proper data, then the results of the networks will also improve. Also in the future, I see neural network use
	becoming more commercialized and available to small businesses and people without a computer science background.
	Currently, big tech companies are paving the path forward for neural networks, but I think additional APIs and
	services will be built upon our current tools to allow widespread, commercial access for all. William Vorhies of
	Data Science Central says we have reached a plateau and that "Automation or even well-accepted rules of thumb
	are still out of reach" [from his article "What Comes After Deep Learning"](https://www.datasciencecentral.com/profiles/blogs/what-comes-after-deep-learning).
	I agree with this statement for now, which is why using neural nets is still an art. But with enough time, I
	think that rules of thumb can be developed and network training can be automated, thus moving neural
	networks forward.

## Exercise 2 - Back-Propagation

![image](./bp1.jpg)
![image](./bp2.jpg)


## Exercise 3 - ConvNet

1) Create model

In [2]:
# Adapted from 344-code unit 10.
from keras import layers
from keras import models

model = models.Sequential()

# Configure a convnet with 2 convolution layers and one max-pooling layer
# between them. Using two convolution layers increases the number of
# parameters to allow for more complexity in the model, which I think is
# important because clothes are more complex than digits.
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))

# Add layers to flatten the feature maps and then do a 10-way classification.
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10, activation='softmax'))

model.summary()

Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_4 (Conv2D)            (None, 26, 26, 32)        320       
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 13, 13, 32)        0         
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 11, 11, 64)        18496     
_________________________________________________________________
flatten_2 (Flatten)          (None, 7744)              0         
_________________________________________________________________
dense_3 (Dense)              (None, 64)                495680    
_________________________________________________________________
dense_4 (Dense)              (None, 10)                650       
Total params: 515,146
Trainable params: 515,146
Non-trainable params: 0
________________________________________________
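
The shapes and parameter counts in the summary can be checked by hand. The sketch below is only an illustration in plain Python, not part of the original notebook; the helper names (`conv_out`, `conv_params`, `dense_params`) are made up for this check and are not Keras functions. It assumes unpadded ("valid") convolutions with stride 1, which is the Keras default for `Conv2D`.

```python
# Hand-check of the output shapes and parameter counts in the summary above.
# The helper names are hypothetical; they are not part of Keras.

def conv_out(size, kernel):
    """Side length after an unpadded, stride-1 convolution."""
    return size - kernel + 1

def conv_params(kernel, in_channels, filters):
    """Weights per filter (kernel * kernel * in_channels) plus one bias, times filters."""
    return (kernel * kernel * in_channels + 1) * filters

def dense_params(inputs, units):
    """One weight per input plus one bias, for each unit."""
    return (inputs + 1) * units

side = conv_out(28, 3)    # first Conv2D: 28 -> 26
side = side // 2          # 2x2 max pooling: 26 -> 13
side = conv_out(side, 3)  # second Conv2D: 13 -> 11
flat = side * side * 64   # Flatten: 11 * 11 * 64 values

counts = [
    conv_params(3, 1, 32),   # conv2d_4
    conv_params(3, 32, 64),  # conv2d_5
    dense_params(flat, 64),  # dense_3
    dense_params(64, 10),    # dense_4
]
total = sum(counts)
print(flat, counts, total)
```

Almost all of the parameters sit in the first dense layer, because it connects every one of the flattened values to each of its 64 units.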

2) Prepare data

In [2]:
from keras.datasets import fashion_mnist
from keras.utils import to_categorical

(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()

train_images = train_images.reshape((60000, 28, 28, 1))
train_images = train_images.astype('float32') / 255

test_images = test_images.reshape((10000, 28, 28, 1))
test_images = test_images.astype('float32') / 255

train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)

# Hold out the last 6,000 training examples as a validation set.
val_images = train_images[54000:]
train_images = train_images[:54000]

val_labels = train_labels[54000:]
train_labels = train_labels[:54000]

val_data = (val_images, val_labels)
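
To make the two preparation steps concrete, here is a scaled-down sketch in plain Python of what one-hot encoding and the 90/10 hold-out split do. It is an illustration under assumptions, not the notebook's code: `one_hot` and `holdout_split` are hypothetical helpers, and the ten labels are invented stand-ins for the 60,000 real ones.

```python
# Scaled-down illustration of the label encoding and the train/validation split.
# one_hot and holdout_split are hypothetical helpers, and the labels are made up.

def one_hot(label, num_classes=10):
    """Return a list with 1.0 at the label's index and 0.0 elsewhere."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]

def holdout_split(items, n_train):
    """Keep the first n_train items for training and the rest for validation."""
    return items[:n_train], items[n_train:]

labels = [9, 0, 3, 3, 1, 7, 2, 5, 4, 8]  # invented class ids, 0 through 9
encoded = [one_hot(label) for label in labels]

# 9 of 10 examples for training mirrors the 54,000 / 6,000 split above.
train, val = holdout_split(encoded, 9)
print(len(train), len(val), encoded[0])
```

Because the split is a plain slice, the validation set is simply the tail of the training data; this works as a fair sample only if the data is not ordered by class.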

3) Train model and assess accuracy.

In [3]:
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(train_images, train_labels, epochs=5, batch_size=64, validation_data=val_data)
print(model.evaluate(train_images, train_labels))
print(model.evaluate(val_images, val_labels))
print(model.evaluate(test_images, test_labels))

Train on 54000 samples, validate on 6000 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
[0.12685728906591734, 0.9557036757469177]
[0.22108589073022206, 0.9229999780654907]
[0.2301205246180296, 0.919700026512146]


Training Accuracy: 95.57%
Validation Accuracy: 92.30%
Test Accuracy: 91.97%

It makes sense that the model performs best on the data it was trained on
and worst on the test set, because the test set is completely new to it. Because the training accuracy is
about 3.6 percentage points higher than the test accuracy (95.57% vs. 91.97%), the model may be overfitting.
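
As a small sketch, the percentages and the train/test gap can be computed directly from the `[loss, accuracy]` pairs printed by `model.evaluate` above. The dictionary layout and variable names here are only for illustration; the numbers are copied from the output.

```python
# [loss, accuracy] pairs copied from the evaluate() output above.
results = {
    "train": (0.12685728906591734, 0.9557036757469177),
    "validation": (0.22108589073022206, 0.9229999780654907),
    "test": (0.2301205246180296, 0.919700026512146),
}

# Format each accuracy as a percentage with two decimal places.
formatted = {name: f"{acc:.2%}" for name, (loss, acc) in results.items()}
for name, text in formatted.items():
    print(f"{name} accuracy: {text}")

# Gap between training and test accuracy, in percentage points.
gap_points = (results["train"][1] - results["test"][1]) * 100
print(f"train - test gap: {gap_points:.2f} points")
```

A gap of a few percentage points with higher loss on unseen data is consistent with mild overfitting, though a single run is not conclusive evidence.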