<a href="https://colab.research.google.com/github/lamiaehana/Udacity-Intro-to-TenserFlow-for-deep-learning/blob/master/Lesson%201_Note.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Intro to TensorFlow for Deep Learning

## Introduction

When studying Machine Learning you will come across many different terms such as <u>artificial intelligence</u>, <u>machine learning</u>, <u>neural network</u>, and <u>deep learning</u>. But what do these terms actually mean and how do they relate to each other?

Below we give a brief description of these terms:  

**Artificial Intelligence:** A field of computer science that aims to make computers achieve human-style intelligence. There are many approaches to reaching this goal, including machine learning and deep learning.

*   **Machine Learning:** A set of related techniques in which computers are trained to perform a particular task rather than being explicitly programmed to do it.
*   **Neural Network:** A construct in Machine Learning inspired by the network of neurons (nerve cells) in the biological brain. Neural networks are a fundamental part of deep learning, and will be covered in this course.
*   **Deep Learning:** A subfield of machine learning that uses multi-layered neural networks. Often, “machine learning” and “deep learning” are used interchangeably.

<u>Machine learning</u> and <u>deep learning</u> also have many subfields, branches, and special techniques. A notable example of this diversity is the separation of **Supervised Learning** and **Unsupervised Learning**.

To oversimplify: in **supervised learning** you know what you want to teach the computer, while **unsupervised learning** is about letting the computer figure out what can be learned. Supervised learning is the most common type of machine learning, and will be the focus of this course.






## Applications of Machine Learning


## Recap:

We saw that by training the model with input data and the corresponding output, the model learned to multiply the input by 1.8 and then add 32 to get the correct result.

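This was really impressive considering that we only needed a few lines of code:

```python
import numpy as np
import tensorflow as tf

# celsius_q and fahrenheit_a are the training inputs and labels defined earlier in the lesson.
l0 = tf.keras.layers.Dense(units=1, input_shape=[1])  # one layer with a single unit
model = tf.keras.Sequential([l0])
model.compile(loss='mean_squared_error', optimizer=tf.keras.optimizers.Adam(0.1))
history = model.fit(celsius_q, fahrenheit_a, epochs=500, verbose=False)
model.predict(np.array([100.0]))
```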

This example is the general plan for any machine learning program. You will use the same structure to create and train your neural network, and use it to make predictions.
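The recap notes that the trained model ends up multiplying the input by 1.8 and adding 32. For comparison, here is a minimal sketch of that same rule written by hand, which is what the model had to discover from data alone. The function name `celsius_to_fahrenheit` is an illustrative choice, not something defined in the lesson.

```python
def celsius_to_fahrenheit(celsius):
    """Explicitly programmed rule: multiply by 1.8, then add 32."""
    return celsius * 1.8 + 32


# The trained model's prediction for 100 degrees Celsius should land
# close to what this hand-written rule returns.
print(celsius_to_fahrenheit(100.0))
```

In traditional programming the rule is given and the outputs are computed; in machine learning the inputs and outputs are given and the rule is learned.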

## The Training Process

The training process (which happens in model.fit(...)) is really about tuning the internal variables of the network to the best possible values, so that they can map the input to the output. This is achieved through an optimization process called gradient descent, which uses numerical analysis to find the best possible values for the internal variables of the model.

To do machine learning, you don't really need to understand these details. But for the curious: **gradient descent** iteratively adjusts parameters, nudging them in the correct direction a bit at a time until they reach the best values. In this case “best values” means that nudging them any more would make the model perform worse. The function that measures how good or bad the model is during each iteration is called the **“loss function”**, and the goal of each nudge is to “minimize the loss function.”
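As a rough illustration of the "nudging" described above, the sketch below runs gradient descent on a single parameter. The loss function `(w - 3)^2`, the starting value, and the learning rate are all assumptions chosen to keep the example small; a real network has many parameters and a loss computed from data.

```python
def loss(w):
    """Toy loss function with its minimum at w = 3."""
    return (w - 3.0) ** 2


def loss_gradient(w):
    """Derivative of the toy loss with respect to w."""
    return 2.0 * (w - 3.0)


w = 0.0               # arbitrary starting value
learning_rate = 0.1   # size of each nudge
losses = []

for step in range(50):
    losses.append(loss(w))
    # Move w a small step in the direction that reduces the loss.
    w -= learning_rate * loss_gradient(w)

print(w)           # close to 3, the value that minimizes the loss
print(losses[:3])  # the loss shrinks from one step to the next
```

Each step moves `w` against the gradient, so the loss falls until further nudges no longer help.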

The training process starts with a forward pass, where the input data is fed to the neural network. Then the model applies its internal math on the input and internal variables to predict an answer.

Once a value is predicted, the difference between that predicted value and the correct value is calculated. This difference is called **the loss**, and it's a measure of how well the model performed the mapping task. The value of the loss is calculated using a loss function, which we specified with the loss parameter when calling **model.compile()**.

After the loss is calculated, the internal variables (weights and biases) of all the layers of the neural network are adjusted, so as to minimize this loss — that is, to make the output value closer to the correct value.

This optimization process is called **Gradient Descent**. The specific algorithm used to calculate the new value of each internal variable is specified by the optimizer parameter when calling **model.compile(...)**. In this example we used the **Adam optimizer**.
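The steps above (forward pass, loss calculation, adjustment of weights and biases) can be sketched without any library. This is a simplified illustration, not what Keras does internally: it uses plain gradient descent rather than the Adam optimizer, a single weight and bias, and hypothetical training data whose labels are generated from the known conversion formula.

```python
# Hypothetical training data: labels are generated from the known formula.
celsius = [-40.0, -10.0, 0.0, 8.0, 15.0, 22.0, 38.0]
fahrenheit = [c * 1.8 + 32.0 for c in celsius]

weight, bias = 0.0, 0.0   # internal variables, starting from zero
learning_rate = 0.001     # assumed value, small enough to stay stable here
n = len(celsius)
loss_history = []

for epoch in range(10000):
    # Forward pass: predict an output for every input.
    predictions = [weight * c + bias for c in celsius]
    errors = [p - y for p, y in zip(predictions, fahrenheit)]

    # Loss: mean squared error between predictions and labels.
    loss = sum(e * e for e in errors) / n
    loss_history.append(loss)

    # Backward pass: gradient of the loss for each internal variable.
    grad_weight = 2.0 * sum(e * c for e, c in zip(errors, celsius)) / n
    grad_bias = 2.0 * sum(errors) / n

    # Update: nudge each variable against its gradient.
    weight -= learning_rate * grad_weight
    bias -= learning_rate * grad_bias

print(round(weight, 3), round(bias, 3))  # approaches 1.8 and 32
```

Plain gradient descent needs many more iterations here than the 500 epochs used with Adam in the recap, which is one reason adaptive optimizers are popular.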



By now you should know what the following terms are:
*   **Feature:** The input(s) to our model
*   **Examples:** An input/output pair used for training
*   **Labels:** The correct output for a given input, which the model learns to predict
*   **Layer:** A collection of nodes connected together within a neural network.
*   **Model:** The representation of your neural network
*   **Dense and Fully Connected (FC):** Each node in one layer is connected to each node in the previous layer.
*   **Weights and biases:** The internal variables of the model
*   **Loss:** The discrepancy between the desired output and the actual output
*   **MSE:** Mean squared error, a type of loss function that counts a small number of large discrepancies as worse than a large number of small ones.
*   **Gradient Descent:** An algorithm that changes the internal variables a bit at a time to gradually reduce the loss function.
*   **Optimizer:** A specific implementation of the gradient descent algorithm. (There are many algorithms for this. In this course we will only use the “Adam” optimizer, whose name is short for Adaptive Moment Estimation. It is widely used as a default optimizer.)
*   **Learning rate:** The “step size” for loss improvement during gradient descent.
*   **Batch:** The set of examples used during training of the neural network
*   **Epoch:** A full pass over the entire training dataset
*   **Forward pass:** The computation of output values from input
*   **Backward pass (backpropagation):** The calculation of internal variable adjustments according to the optimizer algorithm, starting from the output layer and working back through each layer to the input.
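
The MSE entry in the list above can be made concrete with a small worked example. The two sets of predictions below are made up for illustration: both miss the targets by the same total amount, but one concentrates the error in a single large miss.

```python
def mean_squared_error(targets, predictions):
    """Average of the squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)


targets = [0.0] * 10

one_large_miss = [10.0] + [0.0] * 9   # a single prediction off by 10
many_small_misses = [1.0] * 10        # ten predictions each off by 1

mse_large = mean_squared_error(targets, one_large_miss)     # 100 / 10 = 10.0
mse_small = mean_squared_error(targets, many_small_misses)  # 10 / 10 = 1.0

print(mse_large, mse_small)
```

Because each discrepancy is squared before averaging, the single large miss produces a loss ten times higher than the ten small ones.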