
# The Hello World of Deep Learning with Neural Networks

## Imports

Import **TensorFlow** and **NumPy** (the latter lets us represent our data as arrays easily and quickly). The framework for defining a neural network as a set of sequential layers is called **Keras**. Keras is a **deep learning** API written in Python, running on top of the **machine learning** platform TensorFlow.

In [1]:
import tensorflow as tf
import numpy as np
from tensorflow import keras

## Providing the Data

Next up we'll feed in some data. Can you guess what the relationship is between the xs and the ys? (If you are not familiar with NumPy, it is worth taking a look at the pre-course Python training.)

In [2]:
xs = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0], dtype=float)
ys = np.array([-2.0, 1.0, 4.0, 7.0, 10.0, 13.0], dtype=float)

Notes (personal):

* xs are the input values (features)
* ys are the target outputs (labels)
* Floats are specified because neural networks work with floating-point numbers (gradients, weight updates and loss calculations all require floats)

The neural network's task is to learn the relationship in this data without being told the equation explicitly.
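
A guess about the relationship can be checked by hand before any training. The sketch below uses plain Python lists holding the same values as `xs` and `ys`, and computes the change in y per unit change in x between neighbouring points. The names `slopes` and `intercept` are illustrative and not part of the notebook.

```python
# Same values as xs and ys above, as plain Python lists.
xs = [-1.0, 0.0, 1.0, 2.0, 3.0, 4.0]
ys = [-2.0, 1.0, 4.0, 7.0, 10.0, 13.0]

# Change in y divided by change in x for each pair of neighbouring points.
slopes = [(ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i]) for i in range(len(xs) - 1)]

# The y value at x = 0 is the intercept of a straight line.
intercept = ys[xs.index(0.0)]

print(slopes)     # a constant slope means the points lie on a straight line
print(intercept)
```

Every slope is 3 and the intercept is 1, so the data follow y = 3x + 1. This is the relationship the network has to discover from the numbers alone.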



## Define and Compile the Neural Network

Next we will create a very simple neural network using the Keras **Sequential** class. This allows us to add layers one after another to the model, describing the inputs, outputs and hidden layers of the neural network. We define the **input shape** to have a value of 1 (i.e. x). The **Dense** class describes a fully connected layer, in which every neuron is connected to every output of the previous layer. The simplest possible model would be a single Dense **layer** with one neuron that receives one input and generates a single output. The code below adds a hidden Dense layer of 10 neurons in front of that output neuron; because no activation is specified, both layers are linear, so the model can still only represent a straight line.

In [8]:
model = tf.keras.Sequential()          # Sequential creates a model where layers are stacked one after another
model.add(tf.keras.Input(shape=(1,)))  # Declares that each input is a single number: shape=(1,) means one feature per observation
model.add(tf.keras.layers.Dense(10))   # Hidden layer: Dense means 'fully connected' and 10 means 10 neurons. The default activation is linear
model.add(tf.keras.layers.Dense(1))    # Output layer: one neuron predicts a single number
# model.summary()
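
Since neither Dense layer is given an activation, both are linear, and stacking them still yields a straight line. The sketch below illustrates this in plain Python with randomly chosen weights (hypothetical values, not the ones Keras would initialise or learn). It evaluates a 10-neuron hidden layer followed by a 1-neuron output layer, then shows that a single slope and intercept give the same result. It also counts the trainable parameters implied by these layer sizes.

```python
import random

random.seed(0)
hidden = 10

# Hypothetical weights and biases, chosen at random for illustration only.
w1 = [random.uniform(-1, 1) for _ in range(hidden)]  # input -> hidden weights
b1 = [random.uniform(-1, 1) for _ in range(hidden)]  # hidden biases
w2 = [random.uniform(-1, 1) for _ in range(hidden)]  # hidden -> output weights
b2 = random.uniform(-1, 1)                           # output bias


def two_layer(x):
    """Linear hidden layer with 10 neurons, then a linear output neuron."""
    h = [w * x + b for w, b in zip(w1, b1)]
    return sum(w * v for w, v in zip(w2, h)) + b2


# The same function collapsed into one slope and one intercept.
slope = sum(a * b for a, b in zip(w2, w1))
intercept = sum(a * b for a, b in zip(w2, b1)) + b2

# Parameter count: (1 weight + 1 bias) per hidden neuron, then 10 weights + 1 bias.
n_params = hidden * (1 + 1) + (hidden + 1)

print(two_layer(10.0), slope * 10.0 + intercept)  # the two values agree
print(n_params)
```

The two printed predictions agree up to floating-point rounding, and the parameter count comes to 31 (20 in the hidden layer, 11 in the output layer). The extra layer changes how training proceeds, but not the family of functions the model can represent.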

Now we compile our Neural Network and we specify 2 functions, a loss and an optimizer. The LOSS function measures the guessed answer against the known correct answers. The OPTIMIZER function will try to minimize the loss. Here we use 'MEAN SQUARED ERROR' for the loss and 'STOCHASTIC GRADIENT DESCENT' for the optimizer.

In [9]:
model.compile(optimizer='sgd', loss='mean_squared_error')
# ^ tells TensorFlow how the model should learn
# optimizer='sgd' means 'stochastic gradient descent'
# SGD updates the weights step by step to reduce the error
# In simple words: move the weights slightly in the direction that reduces the error

# loss='mean_squared_error' penalises large errors more than small ones
# It is the standard choice for regression
# The model attempts to minimise this value during training
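
To make the loss concrete, here is a minimal sketch of mean squared error in plain Python, applied to a deliberately wrong guess (y = 2x) for the data above. The function name `mean_squared_error` and the variable `guesses` are illustrative; this is not how Keras implements the loss internally.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


xs = [-1.0, 0.0, 1.0, 2.0, 3.0, 4.0]
ys = [-2.0, 1.0, 4.0, 7.0, 10.0, 13.0]

# A deliberately wrong guess for illustration: y = 2x.
guesses = [2.0 * x for x in xs]

loss = mean_squared_error(ys, guesses)
print(loss)
```

The errors are 0, 1, 2, 3, 4 and 5, so the loss is 55/6, about 9.17. The largest error alone contributes 25 of the 55, which is what penalising large errors more than small ones means in practice.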

## Training the Neural Network

The process of training the neural network, where it 'learns' the relationship between the xs and ys, is in the **model.fit** call. It loops for a number of epochs, making a guess, measuring the loss and using the optimizer to make another guess. In the results you can see the loss at the end of the line for each epoch.

In [10]:
model.fit(xs, ys, epochs=200)

Epoch 1/200
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 400ms/step - loss: 53.1324
Epoch 2/200
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 64ms/step - loss: 25.8606
Epoch 3/200
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 58ms/step - loss: 10.0978
Epoch 4/200
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 55ms/step - loss: 2.4661
Epoch 5/200
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 53ms/step - loss: 0.3465
Epoch 6/200
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 56ms/step - loss: 0.0358
Epoch 7/200
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 68ms/step - loss: 0.0072
Epoch 8/200
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 58ms/step - loss: 0.0047
Epoch 9/200
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 60ms/step - loss: 0.0042
Epoch 10/200
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 65ms/step - loss: 0.0038
Epoch

<keras.src.callbacks.history.History at 0x7dd875e13950>

For each epoch, this training code:
* predicts ys from xs
* calculates the loss (error)
* computes the gradients of the loss with respect to the weights
* updates the weights using SGD

We use 200 epochs because the dataset is tiny: more epochs give the model more chances to converge, and overfitting is not a concern here.
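
The four steps listed above can be sketched by hand for a single neuron, y = w·x + b, in plain Python. This is a simplified illustration, not what `model.fit` does internally: it trains one neuron rather than the 10 + 1 network, starts both weights at zero, and uses an assumed learning rate of 0.05 with 500 epochs. Every epoch uses all six points at once, which matches the `1/1` step per epoch shown in the output above.

```python
xs = [-1.0, 0.0, 1.0, 2.0, 3.0, 4.0]
ys = [-2.0, 1.0, 4.0, 7.0, 10.0, 13.0]

w, b = 0.0, 0.0        # assumed starting weights (Keras initialises differently)
learning_rate = 0.05   # assumed step size for this sketch
epochs = 500           # assumed number of passes for this sketch
n = len(xs)

for epoch in range(epochs):
    # 1. Predict ys from xs with the current weights.
    preds = [w * x + b for x in xs]
    # 2. Calculate the loss (mean squared error).
    loss = sum((p - y) ** 2 for p, y in zip(preds, ys)) / n
    # 3. Compute the gradients of the loss with respect to w and b.
    grad_w = sum(2 * (p - y) * x for p, y, x in zip(preds, ys, xs)) / n
    grad_b = sum(2 * (p - y) for p, y in zip(preds, ys)) / n
    # 4. Update the weights by a small step against the gradient.
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(w, b, loss)      # w approaches 3 and b approaches 1
print(w * 10.0 + b)    # prediction for x = 10
```

With these settings the weights settle very close to w = 3 and b = 1. A much larger learning rate would make the updates overshoot and diverge, while a much smaller one would need far more epochs to get there.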

Now that we have a trained model, you can use the **model.predict** method to have it figure out the y for a previously unseen x.

In [11]:
print(model.predict(np.array([10])))

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 102ms/step
[[31.000027]]


This code feeds a new input, x = 10, into the trained network, and the model returns its predicted y as a 1×1 array.
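
As a quick sanity check, the printed prediction can be compared with the value implied by the training data, which lie on the line y = 3x + 1. The sketch below uses plain Python, and the variable names are illustrative. The `reported` value is copied from the output above; a fresh training run would be expected to print a slightly different number.

```python
x_new = 10.0
expected = 3.0 * x_new + 1.0   # value from the relationship y = 3x + 1
reported = 31.000027           # prediction printed by the notebook above

abs_error = abs(reported - expected)
print(expected, abs_error)
```

The prediction is very close to 31 but not exactly 31, because the network estimated its weights from only six points rather than being given the equation.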