<a href="https://colab.research.google.com/github/taylan-sen/CIS490b_computer_vision/blob/main/AI_comp_vision_neural_network_intro.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**SUMMARY**: This notebook contains class/reference notes on using basic neural networks with Keras/TensorFlow for image classification.

<img src="https://miro.medium.com/v2/resize:fit:1280/1*lJ3wVL_JGs4udYZTNZJOng.jpeg" width="300">  


This notebook uses TensorFlow, a neural network library, together with Keras, a "wrapper" around it that makes it easier to use.

https://keras.io/

There is a great neural network simulator at:

http://playground.tensorflow.org/  

Go there and find out:  
* With one hidden layer, what is the minimum number of neurons needed to learn each classification task?  
* What is happening in the Output section with Test loss and Training loss?  
* How can you use the plots to determine when you need more training or not?
* What is Occam's razor?

In [None]:
# GET SOME DATA
from keras.datasets import mnist

# Load MNIST handwritten digit data
(X_train, y_train), (X_test, y_test) = mnist.load_data()


QUESTIONS:

1. What is the size and dimension of each of the variables:  
* X_train
* y_train  
* X_test  
* y_test
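
One way to approach questions like these is to inspect a NumPy array's `shape`, `ndim`, and `dtype` attributes. The sketch below uses a small made-up array (`demo`, a hypothetical stand-in) rather than the MNIST variables, so the answers are still left to find.

```python
import numpy as np

# Hypothetical stand-in array: 5 tiny "images", each 4 rows by 3 columns
demo = np.zeros((5, 4, 3), dtype=np.uint8)

print("shape:", demo.shape)   # size along each dimension
print("ndim: ", demo.ndim)    # number of dimensions
print("dtype:", demo.dtype)   # type of each element
print("len:  ", len(demo))    # size of the first dimension only
```

The same attributes should be available on `X_train`, `y_train`, `X_test`, and `y_test`, assuming `mnist.load_data()` returns NumPy arrays.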

# What does it look like?
![](https://www.thescottishsun.co.uk/wp-content/uploads/sites/2/2019/01/birdybox.jpg)

In [None]:
# Print the first label and the first image array
print("y_train[0]:", y_train[0])
print("X_train[0]:", X_train[0])

In [None]:
import matplotlib.pyplot as plt
plt.imshow(X_train[0])
plt.title(str(y_train[0]))  # use the y_train label as the title

In [None]:
# Display some images
fig, axes = plt.subplots(ncols=5, sharex=False,
                         sharey=True, figsize=(10, 4))
for i in range(5):
    axes[i].set_title('label:' + str(y_train[i]))
    axes[i].imshow(X_train[i], cmap='gray')
    axes[i].get_xaxis().set_visible(False)
    axes[i].get_yaxis().set_visible(False)
plt.show()


In [None]:
# PREPROCESS DATA INTO A "GOOD FORMAT" FOR NN TRAINING
import numpy as np
from tensorflow.keras.utils import to_categorical

# Convert y_train into one-hot format (one row of 10 values per label)
y_train = to_categorical(y_train, num_classes=10)

# Convert y_test into one-hot format
y_test = to_categorical(y_test, num_classes=10)
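
One-hot format replaces each integer label with a length-10 vector that is 1 at the label's position and 0 elsewhere. Below is a minimal NumPy-only sketch of that idea. It is not how `to_categorical` is implemented, and the `labels` array is made up for illustration.

```python
import numpy as np

# Hypothetical digit labels, standing in for a few entries of y_train
labels = np.array([5, 0, 4])

# Row k of the 10x10 identity matrix is the one-hot vector for class k
one_hot = np.eye(10, dtype="float32")[labels]

print(one_hot)
# Taking argmax along each row recovers the original integer labels
print(np.argmax(one_hot, axis=1))
```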

### Define the neural network architecture
We will use a single hidden layer architecture followed by a ***softmax*** layer for multiclass probabilities.  
<img src="https://successfularchistudent.com/wp-content/uploads/2020/02/successfularchistudent-how-to-ace-any-project-2.png" width="300">
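
To make the architecture concrete, here is a rough NumPy sketch of what one forward pass through such a network computes: flatten the image, apply a hidden layer with a sigmoid, then an output layer with a softmax. The weights are random placeholders, and the names (`W1`, `b1`, `W2`, `b2`, `n_hidden`) and the hidden width of 1 are assumptions for illustration, not values taken from a trained model.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - np.max(z))  # shift by the max for numerical stability
    return e / e.sum()

n_hidden = 1                      # assumed hidden-layer width for this sketch
image = rng.random((28, 28))      # placeholder image, not real MNIST data

# Random placeholder parameters (a trained model would have learned these)
W1 = rng.normal(size=(28 * 28, n_hidden)) * 0.01
b1 = np.zeros(n_hidden)
W2 = rng.normal(size=(n_hidden, 10))
b2 = np.zeros(10)

x = image.reshape(-1)             # Flatten: (28, 28) -> (784,)
h = sigmoid(x @ W1 + b1)          # hidden layer: linear sum, then sigmoid
p = softmax(h @ W2 + b2)          # output layer: 10 class probabilities

print(x.shape, h.shape, p.shape)
print(p.sum())
```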

In [None]:
# DEFINE THE ARCHITECTURE
from keras.layers import Dense, Flatten
from keras.models import Sequential

# Create simple Neural Network model
model = Sequential()
model.add(Flatten(input_shape=(28,28)))     # just rearranges input data
model.add(Dense(1, activation='sigmoid'))  # a linear sum with a sigmoid output
model.add(Dense(10, activation='softmax'))  # a linear sum with a softmax output

model.summary()

### What is a softmax?

<img src="https://i0.wp.com/cultivatecourage.com/wp-content/uploads/2014/08/judging_scores.jpg" width="300">

$\sigma(\mathbf{z})_i = \frac{e^{z_i}}{\sum_{j=1}^K e^{z_j}} \ \ \text{ for } i = 1, \dotsc, K \text{ and } \mathbf{z} = (z_1, \dotsc, z_K) \in \mathbb{R}^K.$

$\text{probability}(\text{class } i) = \frac{e^{\text{score}_i}}{\sum_{j=1}^{\text{total } \# \text{ classes}} e^{\text{score}_j}}$
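
The formula can be sketched directly in NumPy. The scores below are made up, and subtracting the maximum score before exponentiating is a common numerical-stability trick that leaves the result unchanged (it is an addition here, not part of the formula above).

```python
import numpy as np

def softmax(scores):
    """Convert a 1-D array of scores into probabilities that sum to 1."""
    scores = np.asarray(scores, dtype=float)
    # Subtracting the max does not change the result but avoids overflow in exp
    exps = np.exp(scores - scores.max())
    return exps / exps.sum()

scores = [2.0, 1.0, 0.1]   # made-up scores for 3 classes
probs = softmax(scores)

print(probs)        # larger scores get larger probabilities
print(probs.sum())  # the probabilities add up to 1 (up to rounding)
```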

QUESTION:
1. How many network weights and biases need to be trained in this network?  
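
As a hint for counting, a fully connected (Dense) layer has one weight per input-unit pair plus one bias per unit, while a Flatten layer only reshapes and has nothing to train. The sketch below applies that rule to a hypothetical toy network (4 inputs, 3 hidden units, 2 outputs), not to the network in this notebook; `dense_param_count` is a helper name made up for illustration.

```python
def dense_param_count(n_inputs, n_units):
    """Weights plus biases for one fully connected layer."""
    weights = n_inputs * n_units   # one weight per (input, unit) pair
    biases = n_units               # one bias per unit
    return weights + biases

# Hypothetical toy network: 4 inputs -> 3 hidden units -> 2 outputs
layer_sizes = [4, 3, 2]
total = sum(dense_param_count(n_in, n_out)
            for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))

print(total)  # (4*3 + 3) + (3*2 + 2) = 23
```

The "Param #" column printed by `model.summary()` can be used to check a hand count.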

von Neumann to Fermi: "With four parameters I can fit an elephant, and with five I can make him wiggle his trunk."  
*The point: with enough free parameters, a model can be made to fit almost any data, which is a warning about overfitting.*  

<img src="https://francis.naukas.com/files/2010/05/dibujo20100527_five_complex_parameters_encoding_elephant_with_wiggling_trunk.png" width="200">

### Neural Network Training
![](https://image.freepik.com/free-vector/robot-exercise-set-character-cartoon-mascot-vector_193274-4896.jpg)  

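* ***epoch***: one full pass over the training set.
* ***batch***: the group of training examples used for one weight update (Keras `fit` uses 32 per batch unless told otherwise).
* ***accuracy***: the fraction of examples whose predicted class matches the true label.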
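
As a rough worked example of how these three terms turn into numbers, the sketch below assumes the 60,000-image MNIST training set, the `epochs=20` used in this notebook, and the Keras default batch size of 32. The `y_true` and `y_pred` arrays are made up.

```python
import math
import numpy as np

n_train = 60000     # number of MNIST training images
batch_size = 32     # Keras default when fit() is not given batch_size
epochs = 20         # value passed to fit() in this notebook

steps_per_epoch = math.ceil(n_train / batch_size)  # weight updates per epoch
total_updates = steps_per_epoch * epochs
print(steps_per_epoch, total_updates)  # 1875 37500

# Accuracy: fraction of predictions equal to the true labels (made-up values)
y_true = np.array([7, 2, 1, 0, 4])
y_pred = np.array([7, 2, 1, 0, 9])
accuracy = np.mean(y_true == y_pred)
print(accuracy)  # 0.8
```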

In [None]:
# TRAIN
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['acc'])

# Train the Neural Network model
model.fit(X_train, y_train, epochs=20, validation_data=(X_test, y_test))

# Making predictions using our trained model
predictions = model.predict(X_test)
predictions = np.argmax(predictions, axis=1)  # most probable class per image

# Display predictions on the first 40 test images, 20 per row
for start in (0, 20):
    fig, axes = plt.subplots(ncols=20, sharex=False,
                             sharey=True, figsize=(20, 4))
    for i in range(20):
        axes[i].set_title(str(predictions[start + i]))
        axes[i].imshow(X_test[start + i], cmap='gray')
        axes[i].get_xaxis().set_visible(False)
        axes[i].get_yaxis().set_visible(False)
    plt.show()
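
The `np.argmax(..., axis=1)` step above turns each row of class probabilities into a single predicted class. A small sketch with made-up probabilities and one-hot labels (4 classes instead of 10, to keep it readable) shows the idea, along with one possible way to locate the mistakes:

```python
import numpy as np

# Made-up softmax outputs for 3 images and 4 classes (each row sums to 1)
probs = np.array([[0.1, 0.7, 0.1, 0.1],
                  [0.6, 0.2, 0.1, 0.1],
                  [0.2, 0.2, 0.5, 0.1]])
# Made-up one-hot labels for the same 3 images
one_hot_labels = np.array([[0, 1, 0, 0],
                           [0, 0, 0, 1],
                           [0, 0, 1, 0]])

pred_classes = np.argmax(probs, axis=1)               # most probable class per row
true_classes = np.argmax(one_hot_labels, axis=1)      # undo the one-hot encoding
wrong = np.flatnonzero(pred_classes != true_classes)  # indices of mistakes

print(pred_classes)  # [1 0 2]
print(true_classes)  # [1 3 2]
print(wrong)         # [1]
```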

![](https://loonylabs.files.wordpress.com/2016/03/alive.jpg)

In [None]:
# Train more?
# INSERT CODE


In [None]:
# Display again
predictions = model.predict(X_test)
predictions = np.argmax(predictions, axis=1)

# INSERT CODE

A good description of the softmax function:

https://towardsdatascience.com/softmax-activation-function-how-it-actually-works-d292d335bd78