<a href="https://colab.research.google.com/github/tinkercademy/ml-notebooks/blob/main/Machine Learning in Pytorch/01_Hello_ML_World.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# The Hello World of Deep Learning with Neural Networks

Adapted from https://codelabs.tf.wiki/codelabs/tensorflow-lab1-helloworld/#0.

As with every first app, you should start with something super simple that shows the overall scaffolding for how your code works.

In the case of creating neural networks, the sample I like to use is one where it learns the relationship between two numbers. So, for example, if you were writing code for a function like this in Python, you know the 'rules' (and so do I):


```python
def my_function(x):
    y = (3 * x) + 1
    return y

print(my_function(10)) # this prints 31
```

So how would you train a neural network to do the equivalent task? Using data! By feeding it with a set of `x`s, and a set of `y`s, it should be able to figure out the relationship between them.

Let's step through this piece by piece.


## Imports

Let's start with our imports. Here we are importing numpy and part of the sklearn library. Specifically, we're importing a section that will allow us to use a Stochastic Gradient Descent (SGD) model.

```python
import numpy as np
from sklearn.linear_model import SGDRegressor
```

## Define the Model

Next we will specify the model to use. You can look up information on the [SGDRegressor model](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.SGDRegressor.html) in the documentation. It gives us basic information (as well as a lot of details that may be way beyond our grasp at the moment):

*    Linear model fitted by minimizing a regularized empirical loss with SGD.
*    SGD stands for Stochastic Gradient Descent: the gradient of the loss is estimated each sample at a time and the model is updated along the way with a decreasing strength schedule (aka learning rate).

We can also view many of the parameters that we can customise. For now we'll only specify two:
*    max_iter = the maximum number of passes over the training data (aka epochs). This essentially defines how long the model is allowed to train.
*    tol = tolerance, which determines the stopping criterion. Training stops early once the loss no longer improves by at least `tol` over several consecutive epochs; in other words, once further training has stopped helping much. A higher tolerance will stop the model sooner, while a lower tolerance will require more training to satisfy the stopping requirement.

```python
model = SGDRegressor(max_iter=500, tol=1e-3)
```
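To get a feel for what `tol` does, here is a small sketch that trains two models on the same six data points used in this notebook and compares how many epochs each one actually ran. The `random_state=0` argument and the two tolerance values are assumptions added here so that both runs see the same shuffling; they are not part of the original notebook. `n_iter_` is the attribute sklearn sets after fitting to record the number of epochs run.

```python
import numpy as np
from sklearn.linear_model import SGDRegressor

# Same data as the rest of the notebook: y = 3x + 1
xs = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0]).reshape(-1, 1)
ys = np.array([-2.0, 1.0, 4.0, 7.0, 10.0, 13.0])

# random_state is fixed (an assumption, for repeatability) so that
# the only difference between the two runs is the tolerance.
loose = SGDRegressor(max_iter=500, tol=1e-1, random_state=0).fit(xs, ys)
tight = SGDRegressor(max_iter=500, tol=1e-4, random_state=0).fit(xs, ys)

# n_iter_ is the number of epochs actually run before stopping
print("epochs with tol=1e-1:", loose.n_iter_)
print("epochs with tol=1e-4:", tight.n_iter_)
```

The looser tolerance should stop no later than the tighter one, and neither can exceed `max_iter`.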

## Providing the Data

Next up we'll feed in some data. In this case we are taking 6 `x`s and 6 `y`s. The _actual_ relationship between these is `y = 3x + 1`, which you can infer with a bit of mental math. We will be submitting our data in numpy arrays:

```python
xs = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0]).reshape(-1, 1)
ys = np.array([-2.0, 1.0, 4.0, 7.0, 10.0, 13.0])
```
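If the `.reshape(-1, 1)` looks mysterious: sklearn models expect the inputs as a 2D array with one row per sample and one column per feature, while the targets can stay as a flat 1D array. The sketch below shows the shapes before and after reshaping, and double-checks that the data really follows `y = 3x + 1`. The name `flat` is only used here for illustration.

```python
import numpy as np

flat = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0])
xs = flat.reshape(-1, 1)  # -1 lets numpy work out the number of rows
ys = np.array([-2.0, 1.0, 4.0, 7.0, 10.0, 13.0])

print(flat.shape)  # (6,)   one flat list of 6 numbers
print(xs.shape)    # (6, 1) 6 samples, 1 feature each
print(ys.shape)    # (6,)   one target per sample

# Check the data against the rule y = 3x + 1
follows_rule = np.allclose(ys, 3 * xs.ravel() + 1)
print(follows_rule)  # True
```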

## Training the Model

Next we must train the model. In sklearn, this is done with the `.fit()` method. If you've seen lots of math for machine learning, this is where it's usually used, but here it's nicely encapsulated in functions for you. Let's walk through what happens under the hood.

We know that in our function, the relationship between the numbers is `y=3x+1`.

When the computer is trying to 'learn' that, it makes a guess...maybe `y=10x+10`? A **loss function** compares the guessed answers against the known correct answers and measures how well or how badly it did.

It then uses an **optimizer function** to make another guess. Based on how the loss function went, it will try to minimize the loss. At that point maybe it will come up with something like `y=5x+5`, which, while still pretty bad, is closer to the correct result (i.e. the loss is lower).

It will repeat this for a number of **epochs** (here, a maximum of 500 as specified above, or fewer if the loss stops improving by more than the tolerance).

In sklearn, all this happens behind the scenes: you don't need to specify any of these options (unless you want to! For example, there's a `loss` parameter if you want to use a specific loss function). The computer goes through a loop where it makes a guess, measures how good or bad it is (aka the loss), uses an optimizer to make another guess, and so on. Once again, it repeats these steps for up to the number of epochs you specify.
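To make the guess, measure, adjust loop concrete, here is a hand-written sketch in plain numpy. It is not what sklearn runs internally: it uses ordinary (full-batch) gradient descent with mean squared error, a hand-picked `learning_rate` of 0.05, and starts from the deliberately bad guess `y=10x+10` mentioned above. All of these choices are assumptions made for illustration.

```python
import numpy as np

# Same training data as the notebook: y = 3x + 1
xs = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0])
ys = np.array([-2.0, 1.0, 4.0, 7.0, 10.0, 13.0])

# Deliberately bad first guess: y = 10x + 10
w, b = 10.0, 10.0
learning_rate = 0.05  # assumed step size, picked by hand for this data
epochs = 500

for epoch in range(epochs):
    guesses = w * xs + b             # make a guess
    errors = guesses - ys
    loss = np.mean(errors ** 2)      # loss function: mean squared error

    # Optimizer step: move w and b against the gradient of the loss
    grad_w = 2 * np.mean(errors * xs)
    grad_b = 2 * np.mean(errors)
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

    if epoch % 100 == 0:
        print(f"epoch {epoch:3d}: loss = {loss:.6f}")

print(f"final guess: y = {w:.3f}x + {b:.3f}")
```

Each pass through the loop is one epoch. The printed loss should shrink as the guess moves from `y=10x+10` towards `y=3x+1`.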

```python
model.fit(xs, ys)
```

OK, now you have a model that has been trained to learn the relationship between X and Y. You can use the **model.predict** method to have it figure out the Y for a previously unknown X. So, for example, if X = 10, what do you think Y will be? Take a guess before you run this code:

```python
model.predict([[10]])
```

```
array([30.81114987])
```

You might have thought 31, right? But it ended up not being exactly 31. Why do you think that is?

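Remember that the model never sees the rule itself, only the 6 data points we fed it. SGD improves its guess a little at a time and stops once the loss is no longer improving by more than `tol` (or after `max_iter` epochs), so the slope and intercept it learns end up close to 3 and 1 without being exactly equal to them. As a result, the result for 10 is very close to 31, but not necessarily 31. It can also differ slightly from run to run, because by default SGD shuffles the training data randomly.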

As you work with neural networks, you'll see this pattern recurring. You will almost always deal with probabilities, not certainties, and will do a little bit of coding to figure out what the result is based on the probabilities, particularly when it comes to classification.
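To see how close the trained model came to `y = 3x + 1`, one option is to look at the slope and intercept it learned, which sklearn stores in `coef_` and `intercept_` after fitting. This sketch retrains the same model; `random_state=0` is an assumption added only to make the run repeatable.

```python
import numpy as np
from sklearn.linear_model import SGDRegressor

xs = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0]).reshape(-1, 1)
ys = np.array([-2.0, 1.0, 4.0, 7.0, 10.0, 13.0])

# random_state is fixed here only for repeatability (an assumption)
model = SGDRegressor(max_iter=500, tol=1e-3, random_state=0)
model.fit(xs, ys)

slope = model.coef_[0]           # learned multiplier for x
intercept = model.intercept_[0]  # learned constant term
print(f"learned rule: y = {slope:.3f}x + {intercept:.3f}")

# predict() simply applies the learned rule
by_hand = slope * 10 + intercept
print(by_hand, model.predict([[10]])[0])
```

The learned values should be near 3 and 1, and applying them by hand gives the same answer as `model.predict`.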


# Exercise


In this exercise you'll try to build a neural network that predicts the price of a house according to a simple formula.

So, imagine house pricing were as simple as this: a house costs 50k plus 50k per bedroom, so that a 1 bedroom house costs 100k, a 2 bedroom house costs 150k, etc.

How would you create a neural network that learns this relationship, so that it would predict a 7 bedroom house as costing close to 400k? (This part is where we see a hilarious amount of deviation from Singapore housing prices.)

Hint: Your network might work better if you scale the house price down. You don't have to give the answer 400...it might be better to create something that predicts the number 4, and then your answer is in the 'hundreds of thousands' etc.
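As a sketch of what the scaling in the hint could look like (data preparation only, the model is still up to you): the constant `PRICE_UNIT`, the helper `to_price`, and the choice of 1 to 6 bedrooms as training data are all hypothetical names and values made up for this illustration.

```python
import numpy as np

PRICE_UNIT = 100_000  # hypothetical scale: 1.0 means "100k"

# Training inputs: number of bedrooms, one row per house
bedrooms = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0]).reshape(-1, 1)

# Full prices from the rule: 50k + 50k per bedroom
prices = 50_000 + 50_000 * bedrooms.ravel()

# Scaled targets for training: 1.0, 1.5, 2.0, ...
scaled_prices = prices / PRICE_UNIT
print(scaled_prices)

def to_price(scaled_prediction):
    """Convert a scaled prediction back into a full price."""
    return scaled_prediction * PRICE_UNIT

print(to_price(4.0))  # a prediction of 4 means 400k
```

Training on `scaled_prices` keeps the numbers small, and `to_price` turns a prediction back into a price afterwards.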

Adapt the code below:

```python
import numpy as np
from sklearn.linear_model import SGDRegressor

xs = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0]).reshape(-1, 1)
ys = np.array([-2.0, 1.0, 4.0, 7.0, 10.0, 13.0])

model = SGDRegressor(max_iter=500, tol=1e-3)
# To try ordinary least squares instead, import LinearRegression from
# sklearn.linear_model and use: model = LinearRegression()

model.fit(xs, ys)

print(model.predict([[10]]))
```

```
[30.79069858]
```
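The code above hints at trying `LinearRegression`. As a point of comparison, here is a sketch of that swap on the original `y = 3x + 1` data. `LinearRegression` solves the least-squares problem directly instead of improving a guess step by step, so on data that lies exactly on a line it should recover the rule almost perfectly.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

xs = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0]).reshape(-1, 1)
ys = np.array([-2.0, 1.0, 4.0, 7.0, 10.0, 13.0])

# Ordinary least squares: no epochs, no tolerance, no learning rate
model = LinearRegression()
model.fit(xs, ys)

prediction = model.predict([[10]])[0]
print(prediction)
print(model.coef_[0], model.intercept_)
```

Comparing this result with the `SGDRegressor` output is a nice way to see the difference between an exact solution and an iterative approximation.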


# Takeaways

*    The **scikit-learn** (sklearn) library hides most of the complexities in machine learning and allows us to simply call functions to perform tasks.
*    You can practice using sklearn with real data to solve problems.
*    Once you're comfortable with the basics, you can begin to explore customising the various parameters. This allows us to fine-tune and optimise our models.

