## 1. What is ML?
### Consider the traditional manner of building apps, as represented in the following diagram:

You express rules in a programming language. They act on data, and your program provides answers. In the case of activity detection, the rules (the code you wrote to define activity types) acted upon the data (the person's movement speed) to produce an answer: the return value from the function for determining the activity status of the user (whether they were walking, running, biking, or doing something else).
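As a rough sketch of that rules-based approach, the function below hard-codes the rules itself. The name `detect_activity` and the speed thresholds (in km/h) are hypothetical and chosen only for illustration.

```python
def detect_activity(speed_kmh: float) -> str:
    """Return an activity label from hand-written rules (thresholds are illustrative)."""
    if speed_kmh < 6:
        return "walking"
    if speed_kmh < 15:
        return "running"
    if speed_kmh < 40:
        return "biking"
    return "something else"


# The rules act on the data (a speed) to produce an answer (a label).
for speed in (4.0, 10.0, 25.0, 90.0):
    print(speed, "->", detect_activity(speed))
```

Every new activity means another hand-written rule, which is the limitation that the ML approach tries to avoid.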

The process for detecting that activity status via ML is very similar, only the axes are different.

Instead of trying to define the rules and express them in a programming language, you provide the answers (typically called labels) along with the data, and the machine infers the rules that determine the relationship between the answers and data. For example, your activity detection scenario might look like this in an ML context:

You gather lots of data and label it to effectively say, "This is what walking looks like," or "This is what running looks like." Then, the computer can infer from the data the distinct patterns that denote each activity.

Beyond being an alternative way to program that scenario, this approach also opens up new scenarios, such as the golfing one, that may not have been possible with the traditional rules-based approach.

In traditional programming, your code compiles into a binary that is typically called a program. In ML, the item that you create from the data and labels is called a model.

You pass the model some data and the model uses the rules that it inferred from the training to make a prediction, such as, "That data looks like walking," or "That data looks like biking."
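Here is a toy sketch of that workflow: data plus labels go in, a model comes out, and the model makes predictions. It swaps in a deliberately simple learner (the average speed per label, sometimes called nearest centroid) instead of a neural network, and the sample speeds and helper names are hypothetical.

```python
from collections import defaultdict

# Hypothetical labeled data: (speed in km/h, label). The labels are the "answers".
labeled_data = [
    (3.5, "walking"), (4.8, "walking"), (5.2, "walking"),
    (9.0, "running"), (11.5, "running"), (12.5, "running"),
    (20.0, "biking"), (24.0, "biking"), (28.0, "biking"),
]


def train_activity_model(samples):
    """Build a "model": the average speed observed for each label."""
    speeds_by_label = defaultdict(list)
    for speed, label in samples:
        speeds_by_label[label].append(speed)
    return {label: sum(v) / len(v) for label, v in speeds_by_label.items()}


def predict_activity(model, speed):
    """Return the label whose average speed is closest to the given speed."""
    return min(model, key=lambda label: abs(model[label] - speed))


activity_model = train_activity_model(labeled_data)
print(predict_activity(activity_model, 4.0))   # closest to the walking average
print(predict_activity(activity_model, 22.0))  # closest to the biking average
```

Nobody wrote a speed threshold here. The boundaries between activities come from the labeled examples.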

## 2. Create your first ML model
### Consider the following sets of numbers. Can you see the relationship between them?

| X | -1 | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|---|
| Y | -2 | 1 | 4 | 7 | 10 | 13 |

As you look at them, you might notice that the value of X is increasing by 1 as you read left to right and the corresponding value of Y is increasing by 3. You probably think that Y equals 3X plus or minus something. Then, you'd probably look at the 0 on X and see that Y is 1, and you'd come up with the relationship Y=3X+1.
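You can check that reasoning with a few lines of plain Python. The values are the six X and Y values that this tutorial uses for training later on; the variable names are only for illustration.

```python
xs = [-1.0, 0.0, 1.0, 2.0, 3.0, 4.0]
ys = [-2.0, 1.0, 4.0, 7.0, 10.0, 13.0]

# Y changes by the same amount every time X increases by 1: that's the slope.
steps = [ys[i + 1] - ys[i] for i in range(len(ys) - 1)]
slope = steps[0]
assert all(step == slope for step in steps)

# The value of Y where X is 0 is the intercept.
intercept = ys[xs.index(0.0)]

print(f"Y = {slope:g}X + {intercept:g}")  # prints: Y = 3X + 1
```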

That's almost exactly how you would use code to train a model to spot the patterns in the data!

Now, look at the code to do it.

How would you train a neural network to do the equivalent task? Using data! By feeding it with a set of X's and a set of Y's, it should be able to figure out the relationship between them.

Start with your imports. Here, you're importing TensorFlow and calling it tf for ease of use.

Next, import a library called NumPy, which lets you represent your data as arrays easily and quickly.

The framework for defining a neural network as a set of sequential layers is called Keras, so import that, too.

In [1]:
import tensorflow as tf
import numpy as np
from tensorflow import keras

#### Define and compile the neural network
Next, create the simplest possible neural network. It has one layer, that layer has one neuron, and the input shape to it is only one value.

In [2]:
model = keras.Sequential([keras.Input(shape=(1,)), keras.layers.Dense(units=1)])

Next, write the code to compile your neural network. When you do so, you need to specify two functions—a loss and an optimizer.

In this example, you know that the relationship between the numbers is Y=3X+1.

When the computer is trying to learn that, it makes a guess, maybe Y=10X+10. The loss function compares the guessed answers against the known correct answers and measures how well or badly the guess did.

Next, the model uses the optimizer function to make another guess. Based on the loss function's result, it tries to minimize the loss. At this point, maybe it will come up with something like Y=5X+5. While that's still pretty bad, it's closer to the correct result (the loss is lower).
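To make "the loss is lower" concrete, here is a minimal sketch that computes the mean squared error by hand for those two guesses, using the six X and Y values from this tutorial. The helper name `mean_squared_error` is only illustrative and is not the Keras function.

```python
xs = [-1.0, 0.0, 1.0, 2.0, 3.0, 4.0]
ys = [-2.0, 1.0, 4.0, 7.0, 10.0, 13.0]


def mean_squared_error(slope, intercept):
    """Average of the squared differences between guessed and known Y values."""
    squared_errors = [(slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)]
    return sum(squared_errors) / len(squared_errors)


print(mean_squared_error(10, 10))  # first guess, Y=10X+10: large loss
print(mean_squared_error(5, 5))    # second guess, Y=5X+5: smaller loss
print(mean_squared_error(3, 1))    # the true relationship: loss of 0.0
```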

The model repeats that for the number of epochs, which you'll see shortly.

First, here's how to tell it to use mean_squared_error for the loss and stochastic gradient descent (sgd) for the optimizer. You don't need to understand the math for those yet, but you can see that they work!

Over time, you'll learn the different and appropriate loss and optimizer functions for different scenarios.

In [3]:
model.compile(optimizer='sgd', loss='mean_squared_error')
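If you're curious what that pairing does, here is a hand-rolled sketch of gradient descent on the mean squared error for a single weight and bias. It is not Keras's actual implementation. It assumes a learning rate of 0.01 (the default for Keras's SGD optimizer), starts from the Y=10X+10 guess instead of random weights, and uses all six data points in every step.

```python
xs = [-1.0, 0.0, 1.0, 2.0, 3.0, 4.0]
ys = [-2.0, 1.0, 4.0, 7.0, 10.0, 13.0]

w, b = 10.0, 10.0      # starting guess: Y=10X+10 (Keras would start from random weights)
learning_rate = 0.01   # assumed step size; matches the default of Keras's SGD optimizer
n = len(xs)

for epoch in range(500):
    # Guess, then measure how far off each guess is.
    errors = [w * x + b - y for x, y in zip(xs, ys)]
    loss = sum(e ** 2 for e in errors) / n  # mean squared error

    # Gradients of the loss with respect to w and b.
    grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
    grad_b = 2 * sum(errors) / n

    # Step against the gradient to make a better guess next time.
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

    if epoch % 100 == 0:
        print(f"epoch {epoch}: loss={loss:.4f}, w={w:.3f}, b={b:.3f}")

print(f"learned: Y = {w:.2f}X + {b:.2f}")
```

The loop body is the guess, measure, adjust cycle described above, repeated once per epoch.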

#### Provide the data
Next, feed it some data. In this case, you take the six X and six Y values from earlier. You can see that the relationship between those is Y=3X+1, so where X is -1, Y is -2.

A Python library called NumPy provides lots of array-type data structures to do this. Specify the values as a NumPy array with np.array().

In [4]:
xs = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0], dtype=float)
ys = np.array([-2.0, 1.0, 4.0, 7.0, 10.0, 13.0], dtype=float)
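As an optional sanity check before training, you can confirm that the arrays have the shape and type you expect and that every pair really follows Y=3X+1. This sketch repeats the array definitions so that it runs on its own.

```python
import numpy as np

xs = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0], dtype=float)
ys = np.array([-2.0, 1.0, 4.0, 7.0, 10.0, 13.0], dtype=float)

print(xs.shape, xs.dtype)           # six values stored as 64-bit floats
print(np.allclose(ys, 3 * xs + 1))  # True: every pair satisfies Y=3X+1
```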

Now you have all the code you need to define the neural network. The next step is to train it to see if it can infer the patterns between those numbers and use them to create a model.

## 3. Train the neural network
The process of training the neural network, where it learns the relationship between the X's and Y's, is in the model.fit call. That's where it goes through the loop of making a guess, measuring how good or bad it is (the loss), and using the optimizer to make another guess. It does that for the number of epochs that you specify. When you run that code, you'll see the loss printed out for each epoch.

In [5]:
model.fit(xs, ys, epochs=500)

Epoch 1/500
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 351ms/step - loss: 99.2257
Epoch 2/500
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 51ms/step - loss: 78.0750
Epoch 3/500
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 77ms/step - loss: 61.4345
Epoch 4/500
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 81ms/step - loss: 48.3423
Epoch 5/500
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 75ms/step - loss: 38.0418
Epoch 6/500
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 57ms/step - loss: 29.9377
Epoch 7/500
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 54ms/step - loss: 23.5617
Epoch 8/500
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 50ms/step - loss: 18.5451
Epoch 9/500
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 46ms/step - loss: 14.5982
Epoch 10/500
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 48ms/step - loss: 11.492

<keras.src.callbacks.history.History at 0x18101730c20>

You probably don't need all 500 epochs, so experiment with different values. In the full training output (only the first 10 epochs are shown above), the loss is already very small after about 50 epochs, so that might be enough!
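One way to check how many epochs you need is to look at the value that model.fit returns. It is a History object, and its `history["loss"]` list holds one loss value per epoch. The sketch below rebuilds and retrains the same one-neuron model so that it runs on its own, and it passes `verbose=0` only to silence the progress output. The exact numbers vary between runs because the initial weights are random.

```python
import numpy as np
from tensorflow import keras

xs = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0], dtype=float)
ys = np.array([-2.0, 1.0, 4.0, 7.0, 10.0, 13.0], dtype=float)

model = keras.Sequential([keras.Input(shape=(1,)), keras.layers.Dense(units=1)])
model.compile(optimizer="sgd", loss="mean_squared_error")

history = model.fit(xs, ys, epochs=500, verbose=0)
losses = history.history["loss"]  # one entry per epoch

# Compare the loss reported at a few points during training.
for epoch in (1, 10, 50, 100, 500):
    print(f"loss reported for epoch {epoch}: {losses[epoch - 1]:.6f}")
```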

## 4. Use the model
You have a model that has been trained to learn the relationship between X and Y. You can use the model.predict method to have it figure out the Y for a previously unknown X. For example, if X is 10, what do you think Y will be? Take a guess before you run the following code:

In [8]:
model.predict(tf.constant([[10.0]]))

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 40ms/step

array([[30.996449]], dtype=float32)

You might have guessed 31, but the result came out slightly under. Neural networks deal with probabilities, so the model calculated that there is a very high probability that the relationship between X and Y is Y=3X+1, but it can't know for sure with only six data points. The result is very close to 31, but not necessarily 31.

As you work with neural networks, you'll see that pattern recurring. You will almost always deal with probabilities, not certainties, and will do a little bit of coding to figure out what the result is based on the probabilities, particularly when it comes to classification.
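To see where a value like 30.996449 comes from, you can inspect the two numbers that the model learned. For this one-neuron network, `model.get_weights()` returns the weight (kernel) and the bias, and the prediction is the weight times X plus the bias. The sketch below retrains the same model so that it runs on its own; your learned values will differ slightly from run to run.

```python
import numpy as np
from tensorflow import keras

xs = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0], dtype=float)
ys = np.array([-2.0, 1.0, 4.0, 7.0, 10.0, 13.0], dtype=float)

model = keras.Sequential([keras.Input(shape=(1,)), keras.layers.Dense(units=1)])
model.compile(optimizer="sgd", loss="mean_squared_error")
model.fit(xs, ys, epochs=500, verbose=0)

# The kernel has shape (1, 1) and the bias has shape (1,).
kernel, bias = model.get_weights()
w, b = float(kernel[0][0]), float(bias[0])
print(f"learned: Y = {w:.4f}X + {b:.4f}")

# The prediction for X=10 is w * 10 + b, up to float32 rounding.
prediction = float(model.predict(np.array([[10.0]]), verbose=0)[0][0])
print(prediction, w * 10 + b)
```

The learned weight and bias land close to 3 and 1 without matching them exactly, which is why the prediction is close to 31 rather than exactly 31.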

## 5. Congratulations
Believe it or not, you covered most of the concepts in ML that you'll use in far more complex scenarios. You learned how to train a neural network to spot the relationship between two sets of numbers by defining the network. You defined a set of layers (in this case only one) that contained neurons (also in this case, only one), which you then compiled with a loss function and an optimizer.

The collection of a network, loss function, and optimizer handles the process of guessing the relationship between the numbers, measuring how good those guesses are, and then generating new parameters for new guesses. Learn more at TensorFlow.org.
