# Intro to Neural Networks

If you are looking at this notebook, it is probably because you have some interest in Artificial Intelligence. This notebook is a brief explanation and overview of AI concepts.

### What is AI?

Artificial Intelligence might sound scary to some (it certainly makes for some great headlines, such as ['Can Humans Be Replaced by Machines?'](https://www.nytimes.com/2021/03/19/books/review/genius-makers-cade-metz-futureproof-kevin-roose.html)), but at its very core, AI is all mathematics and programming. <br>According to [Britannica](https://www.britannica.com/technology/artificial-intelligence), 'Artificial intelligence (AI) is the ability of a computer or a robot controlled by a computer to do tasks that are usually done by humans because they require human intelligence and discernment.'


### Fields of AI

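Artificial Intelligence is an umbrella term that covers several fields. They differ in their approach and in the types of tasks they try to accomplish, and they overlap: Neural Networks are one Machine Learning technique, and Deep Learning refers to Neural Networks with many layers. Here are some of the most well known: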
- Machine Learning (ML)
- Deep Learning (DL)
- Neural Networks (NN)
- Natural Language Processing (NLP)
- Computer Vision

For this notebook, I will explain what the Machine Learning (ML) approach is and give a simple example of a Neural Network (NN).



### Machine Learning

Machine Learning is the science of programming computers so they can learn from data. ML differs from conventional programming in how it approaches a problem. Imagine you are told to create a program that identifies cats. With a conventional programming approach, you might write classes or functions that spell out exactly what a cat is. For example, you might define a class that checks whether the image has a tail, whiskers, eyes, four legs, and a nose. If all of those elements are found in the image, the program asserts that it's a cat.
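A minimal sketch of that rule-based approach is shown here. The names *REQUIRED_PARTS* and *is_cat* are hypothetical, and the sketch assumes that some earlier step has already detected which parts appear in the image, which is the hard part in practice.

```python
# Hand-written rule: the programmer decides up front what makes a cat.
# The part names and the detected-parts input are hypothetical; a real
# program would first need some way to find these parts in an image.
REQUIRED_PARTS = {"tail", "whiskers", "eyes", "four legs", "nose"}


def is_cat(detected_parts):
    """Return True only if every required part was detected."""
    return REQUIRED_PARTS.issubset(detected_parts)


print(is_cat({"tail", "whiskers", "eyes", "four legs", "nose"}))  # True
print(is_cat({"tail", "eyes", "four legs", "nose"}))              # False
```

Every rule here was chosen by the programmer, so a cat photographed without a visible tail would be rejected.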

Machine Learning approaches this task a little differently. With ML, you might collect a lot of images of cats and label them as cats, then create a model that goes over every image and derives from the examples the features that make something a cat. Then, given a new image, it predicts whether that image has those features.

The big difference is that in regular programming, you define the rules and explicitly program them. With Machine Learning, those rules are derived from data and examples.

### What are Neural Networks?
Neural Networks (NN) are a technique for building a computer program that learns from data. They are based very loosely on how we think the human brain works. First, a collection of software “neurons” is created and connected together, allowing them to send messages to each other. Next, the network is asked to solve a problem, which it attempts to do over and over, each time strengthening the connections that lead to success and diminishing those that lead to failure.

You can think of neurons as functions or variables, in which a computation takes place. One way to visualize it is with the following image:


<img src="imgs/neuralnet.jpeg" height=900 width=1200>

A way to think of this image is that the input layer receives the examples you collected (the cat images), and the output layer produces the answer: whether that image is a cat or not. During training, that answer is compared with the label you provided.
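To make "a neuron is a function" concrete, here is a minimal sketch of the computation inside a single neuron: a weighted sum of its inputs plus a bias. The function name *neuron* and the numbers are illustrative assumptions. Many networks also pass the result through an activation function, which is omitted here.

```python
def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias: the computation inside one neuron."""
    return sum(x * w for x, w in zip(inputs, weights)) + bias


# Two inputs with hand-picked weights (illustrative values only).
output = neuron(inputs=[1.0, 2.0], weights=[0.5, -1.0], bias=3.0)
print(output)  # 0.5*1.0 + (-1.0)*2.0 + 3.0 = 1.5
```

Training a network means finding good values for the weights and biases instead of picking them by hand.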

### An example of a Neural Network with TensorFlow

Before we begin with code, you might want to familiarize yourself with [NumPy](https://numpy.org) and [TensorFlow](https://www.tensorflow.org). I have listed resources at the bottom of this notebook that you can check out. For this example, it is sufficient to know that TensorFlow is a library that allows us to create, train and deploy ML models. It does a lot of the heavy lifting so you don't have to (for example, coding all the math behind our models).

We will write a Neural Network that is given degrees in Fahrenheit and converts them to degrees in Celsius. With a traditional programming approach, we would write a function that evaluates the equation
$$
C = (F - 32) \times \frac{5}{9} \approx (F - 32) \times 0.5556
$$
Instead, we will solve it using the ML approach: we give our model example inputs and outputs, and it derives from them how to convert any temperature from °F to °C.
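For comparison, the traditional approach can be sketched in a few lines. The function name *fahrenheit_to_celsius* is just an illustrative choice.

```python
def fahrenheit_to_celsius(degrees_f):
    """Convert Fahrenheit to Celsius with the explicit formula."""
    return (degrees_f - 32) * 5 / 9


print(fahrenheit_to_celsius(32))   # 0.0
print(fahrenheit_to_celsius(212))  # 100.0
```

Here the rule is written by hand. In the rest of the notebook, the model has to discover an equivalent rule from examples.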

We begin by importing the needed libraries:

In [8]:
import tensorflow as tf
import numpy as np
from tensorflow import keras

We now define our inputs and outputs.
- Inputs: The degrees in F
- Outputs: The degrees in C

In [13]:
degrees_in_f = np.array([32, 60.8, 68, 122, 212, 107.6], dtype=float)
degrees_in_c = np.array([0, 16, 20, 50, 100, 42], dtype=float)

### Defining our model
This might seem a little daunting to understand, but don't worry. I will provide brief explanations of the terms that you see and resources where you can learn more about them.


In [10]:
model = tf.keras.Sequential([keras.layers.Dense(units=1, input_shape=[1])])

The line above defines our model. We use the tf.keras syntax to access the Keras library within TensorFlow. In this example we will be using a [Sequential](https://keras.io/guides/sequential_model/) model. A Sequential model in Keras is simply an API that allows us to build a model layer by layer.
So far our model has one [Dense](https://www.tensorflow.org/api_docs/python/tf/keras/layers/Dense) layer with a single neuron (units=1). The input_shape=[1] argument tells the layer to expect one value per example, that is, a 1-dimensional array with a single element.
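To make the layer less abstract: a Dense layer with one unit and one input computes w * x + b, with one weight and one bias to learn (Keras applies no activation function unless you ask for one). The sketch below is plain Python, not Keras, and the names *dense_one_unit*, *ideal_w* and *ideal_b* are hypothetical. It shows which values of w and b would reproduce the conversion formula exactly.

```python
def dense_one_unit(x, w, b):
    """What a Dense layer with units=1 and one input computes: w * x + b."""
    return w * x + b


# Expanding C = (F - 32) * 5/9 gives C = (5/9) * F - 160/9,
# so these are the values training should move w and b toward.
ideal_w = 5 / 9      # about 0.5556
ideal_b = -160 / 9   # about -17.78

print(round(dense_one_unit(212, ideal_w, ideal_b), 2))  # 100.0
print(round(dense_one_unit(122, ideal_w, ideal_b), 2))  # 50.0
```

The model does not start with these values. Training is the process of getting close to them.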

### Compiling model

In [11]:
model.compile(optimizer=tf.keras.optimizers.Adam(0.1), loss='mean_squared_error')

The line above compiles our model. For now I will give a short explanation of each keyword, with links if you want to look into them further. Other notebooks will go into greater depth on what these keywords mean.

First, let's understand what the code *loss='mean_squared_error'* is doing. This code tells our model that the loss function we are using is [mean squared error](https://statisticsbyjim.com/regression/mean-squared-error-mse/). A loss function tells an ML model how wrong a prediction is. There are several ways of calculating the error, but for now it's enough to know that the formula for this loss function is:
$$
MSE = \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^{2}}{n}
$$
Where:
- $y_i$ is the ith observed value.
- $\hat{y}_i$ is the corresponding predicted value.
- $n$ is the number of observations.
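The formula can be sketched directly in plain Python. The function below is a hand-rolled illustration, not the Keras implementation, and the "rough guess" rule 0.5 * F - 15 is a made-up example of an imperfect model. The temperature values are the ones used in this notebook.

```python
def mean_squared_error(observed, predicted):
    """Average of the squared differences between observed and predicted values."""
    n = len(observed)
    return sum((y - y_hat) ** 2 for y, y_hat in zip(observed, predicted)) / n


degrees_in_f = [32, 60.8, 68, 122, 212, 107.6]
degrees_in_c = [0, 16, 20, 50, 100, 42]

# A deliberately rough guess at the conversion rule (hypothetical values).
guesses = [0.5 * f - 15 for f in degrees_in_f]

print(mean_squared_error(degrees_in_c, guesses))       # roughly 18.27
print(mean_squared_error(degrees_in_c, degrees_in_c))  # 0.0: perfect predictions
```

Because the differences are squared, large mistakes (like the 9-degree miss at 212 °F) dominate the result.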

The goal of training is to minimize that value, or in other words, to make the predictions as accurate as possible. ML models typically accomplish that by adjusting their parameters, and the component that decides how to adjust them is the optimizer.

To illustrate what these two pieces do, imagine you are dropped in the middle of California and your task is to get to the beach. Since you can't really tell where you are, you start walking in some direction until you see a landmark or something that tells you which way you are heading. As you walk, you start seeing signs that point to the beach, so you adjust your route until you get there.
That's essentially what an ML model does. It starts walking (making predictions) towards the beach, the loss function tells it how far off it is, and the optimizer uses that information to update the model's parameters so that the next prediction lands closer.

The code *tf.keras.optimizers.Adam* is the way we access the optimizers API from Keras. In this case, we will be using [Adam](https://optimization.cbe.cornell.edu/index.php?title=Adam) as our optimizer. The value 0.1 passed to it is the learning rate, which controls how large each parameter update is.
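The interplay between loss and optimizer can be sketched with a single parameter update. Note the assumptions: this uses plain gradient descent as a simpler stand-in for Adam, it is written in plain Python and not Keras, and the starting point and the small learning rate are choices made for this illustration only.

```python
degrees_in_f = [32, 60.8, 68, 122, 212, 107.6]
degrees_in_c = [0, 16, 20, 50, 100, 42]


def loss(w, b):
    """Mean squared error of the line w * f + b on the training pairs."""
    errors = [(w * f + b) - c for f, c in zip(degrees_in_f, degrees_in_c)]
    return sum(e ** 2 for e in errors) / len(errors)


def gradient_step(w, b, learning_rate):
    """One plain gradient-descent update (a simpler stand-in for Adam)."""
    n = len(degrees_in_f)
    errors = [(w * f + b) - c for f, c in zip(degrees_in_f, degrees_in_c)]
    grad_w = 2 * sum(e * f for e, f in zip(errors, degrees_in_f)) / n
    grad_b = 2 * sum(errors) / n
    return w - learning_rate * grad_w, b - learning_rate * grad_b


w, b = 0.0, 0.0                                   # arbitrary starting point
loss_before = loss(w, b)
w, b = gradient_step(w, b, learning_rate=1e-5)    # small step chosen for this data
loss_after = loss(w, b)
print(loss_before > loss_after)                   # True: the update reduced the loss
```

The loss measures how wrong the current parameters are, and the update nudges them in the direction that lowers it. Adam follows the same idea but adapts the step size for each parameter.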


### Training the model

Now we will train our model. Since we want our model to convert degrees Fahrenheit to Celsius, we pass the degrees in Fahrenheit as the inputs and the degrees in Celsius as the expected outputs. Both are NumPy arrays of the same length, so the model can tell which output is the correct answer for each input. For example, if we wanted to teach it that 80 °F is 26.67 °C, our arrays would look like this:

degrees_in_f = np.array([80.0])
degrees_in_c = np.array([26.67])

Our model knows that the index 0 in our input array should output the value at index 0 in our output array.
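This index-by-index pairing can be made visible with a short sketch. It uses plain Python lists holding the same training values as this notebook's arrays, and the name *pairs* is an illustrative assumption.

```python
degrees_in_f = [32, 60.8, 68, 122, 212, 107.6]
degrees_in_c = [0, 16, 20, 50, 100, 42]

# Each input is matched with the target at the same index.
pairs = list(zip(degrees_in_f, degrees_in_c))

for f, c in pairs:
    print(f"{f} F should map to {c} C")
```

If the two arrays were in a different order or had different lengths, the model would be trained on the wrong answers.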

We will tell our model to go over all the training data a certain number of times, and we do that with the *epochs=n* parameter. For example, if you want your model to go over the training data 100 times, you would define it like this: *epochs=100*.

Run the cell below to train the model, and feel free to change the number of epochs to see how much each prediction differs after training!

In [None]:
history = model.fit(degrees_in_f, degrees_in_c, epochs=800)
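To get a feel for what the number of epochs changes, here is a plain Python sketch that does not use Keras. It rests on several assumptions: it uses plain gradient descent in place of Adam, it rescales the inputs so this simple optimizer converges quickly, and the function name *train* and its learning rate are hypothetical. The exact loss values will differ from what Keras reports.

```python
degrees_in_f = [32, 60.8, 68, 122, 212, 107.6]
degrees_in_c = [0, 16, 20, 50, 100, 42]

# Assumption: rescale the inputs so plain gradient descent converges quickly.
mean_f = sum(degrees_in_f) / len(degrees_in_f)
std_f = (sum((f - mean_f) ** 2 for f in degrees_in_f) / len(degrees_in_f)) ** 0.5
scaled_f = [(f - mean_f) / std_f for f in degrees_in_f]


def train(epochs, learning_rate=0.1):
    """Run `epochs` full passes over the data and return the final loss."""
    w, b = 0.0, 0.0
    n = len(scaled_f)
    for _ in range(epochs):
        errors = [(w * x + b) - c for x, c in zip(scaled_f, degrees_in_c)]
        w -= learning_rate * 2 * sum(e * x for e, x in zip(errors, scaled_f)) / n
        b -= learning_rate * 2 * sum(errors) / n
    errors = [(w * x + b) - c for x, c in zip(scaled_f, degrees_in_c)]
    return sum(e ** 2 for e in errors) / n


for epochs in (1, 10, 100):
    print(epochs, train(epochs))  # the loss shrinks as the number of epochs grows
```

Each epoch is one more pass over the same six examples, and each pass gives the optimizer another chance to reduce the loss.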

### Making a prediction

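We are now ready to make a prediction. To do it, we use the model.predict(np.array([value_to_convert])) syntax. Wrapping the value in a NumPy array, as we did for the training data, avoids the input-type error that some newer Keras versions raise for plain Python lists.

In [18]:
print("Predicting...", model.predict(np.array([10.0])))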

Predicting... [[-12.19515]]
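As a quick sanity check, the prediction above can be compared with the explicit formula. The variable names are illustrative, and your own predicted value will vary slightly from run to run.

```python
predicted_c = -12.19515        # model output shown above for 10 F
exact_c = (10 - 32) * 5 / 9    # explicit conversion formula

print(round(exact_c, 2))                     # -12.22
print(round(abs(predicted_c - exact_c), 3))  # 0.027
```

The model was never given the formula, yet its answer is within a few hundredths of a degree of the exact value.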


Congratulations! You have now trained a NN that converts degrees F to C.