<img src = 'https://www.gstatic.com/devrel-devsite/prod/v15f72515e1c53f03e6d573e85fc193d888eb8fb1758082e4a5ecf80f00fa48ef/tensorflow/images/lockup.svg'/>

An end-to-end open source machine learning platform. The core open source library to help you develop and train ML models. Get started quickly by running Colab notebooks directly in your browser.


TensorFlow is an end-to-end open source platform for machine learning. It has a comprehensive, flexible ecosystem of tools, libraries and community resources that lets researchers push the state of the art in ML and developers easily build and deploy ML-powered applications.

1. Easy model building
2. Robust ML production anywhere
3. Powerful experimentation for research


In [2]:
import tensorflow as tf

In [7]:
tf.__version__

'2.0.0'

In [8]:
# Create a constant tensor.
# TensorFlow 2.x executes eagerly by default, so tf.constant returns
# a tensor that already holds its value (no graph or session needed).

hello = tf.constant('Hello, TensorFlow!')

In [9]:
hello

<tf.Tensor: id=1, shape=(), dtype=string, numpy=b'Hello, TensorFlow!'>

In [6]:
print(hello)

tf.Tensor(b'Hello, TensorFlow!', shape=(), dtype=string)


In [14]:
# Basic constant operations
# The value returned by the constructor represents the output
# of the Constant op.
a = tf.constant(2)
b = tf.constant(3)

In [15]:
c = a+b

In [24]:
print(c)

tf.Tensor(5, shape=(), dtype=int32)


In [29]:
import tensorflow as tf

# Add two constants with tf.add (same result as a + b above)
a = tf.constant(2)
b = tf.constant(3)
c = tf.add(a, b)
print(c)

tf.Tensor(5, shape=(), dtype=int32)


In [31]:
tf.subtract(a,b)

<tf.Tensor: id=29, shape=(), dtype=int32, numpy=-1>

In [33]:
tf.multiply(a,b)

<tf.Tensor: id=31, shape=(), dtype=int32, numpy=6>

In [35]:
tf.pow(a,b)

<tf.Tensor: id=33, shape=(), dtype=int32, numpy=8>

In [18]:
const1 = tf.constant([[1, 2, 3], [1, 2, 3]])
const2 = tf.constant([[3, 4, 5], [3, 4, 5]])

result = tf.add(const1, const2)
print(result)

tf.Tensor(
[[4 6 8]
 [4 6 8]], shape=(2, 3), dtype=int32)


In [23]:
%load_ext tensorboard
%tensorboard --logdir {logs_base_dir}

The tensorboard extension is already loaded. To reload it, use:
  %reload_ext tensorboard


Reusing TensorBoard on port 6006 (pid 11880), started 0:00:39 ago. (Use '!kill 11880' to kill it.)

##### Keras – High-Level API

Keras is the default high-level API of TensorFlow. In this article, we will use this API to build a simple neural network later, so let’s explore a little how it works. Depending on the type of problem, we can use a variety of layers for the neural network we want to build. Essentially, Keras provides different types of layers (tensorflow.keras.layers) that we need to connect into a meaningful graph that will solve our problem. There are several ways to use this API when building deep learning models:

1. Using Sequential class
2. Using Functional API
3. Model subclassing

The first approach is the simplest one. We use the Sequential class, which is essentially a container for layers, and we add layers in the order we want them. This approach is a good choice when we want to build a neural network as quickly as possible. There are many types of Keras layers to choose from, too. The most basic one, and the one we are going to use in this article, is called Dense. It has many options for setting the inputs, activation functions and so on. Apart from Dense, the Keras API provides different types of layers for Convolutional Neural Networks, Recurrent Neural Networks, etc., which are out of the scope of this post. So, let’s see how one can build a neural network using Sequential and Dense.

In [60]:
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense

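model = Sequential()
# Hidden layer: 3 units, each taking the 2 input features.
model.add(Dense(3, input_dim=2, activation='relu'))
# Single output unit: softmax over one value is always 1.0, so use sigmoid here.
model.add(Dense(1, activation='sigmoid'))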

In [62]:
model

<tensorflow.python.keras.engine.sequential.Sequential at 0x248ecc7c908>
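To get a feel for what these two Dense layers contain, we can count their internal variables by hand. This is a small sketch in plain Python; the helper name `dense_param_count` is made up for illustration and is not part of Keras. The totals should correspond to the parameter counts that `model.summary()` reports for this model.

```python
def dense_param_count(inputs, units):
    """Weights (inputs * units) plus one bias per unit."""
    return inputs * units + units

hidden = dense_param_count(inputs=2, units=3)   # 2*3 + 3 = 9
output = dense_param_count(inputs=3, units=1)   # 3*1 + 1 = 4
print(hidden, output, hidden + output)          # 9 4 13
```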

There are many types of neural network architectures. However, no matter what architecture you choose, the math it contains (what calculations are being performed, and in what order) is not modified during training. Instead, it is the internal variables (“weights” and “biases”) which are updated during training.

For example, in the Celsius to Fahrenheit conversion problem, the model starts by multiplying the input by some number (the weight) and adding another number (the bias). Training the model involves finding the right values for these variables, not changing from multiplication and addition to some other operation.
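As a rough illustration of that point, the computation of a single-unit Dense layer can be written in plain Python. The function name `dense_unit` and the starting values are assumptions made for this sketch, not part of TensorFlow. Training would only change `weight` and `bias`, never the multiply-then-add structure.

```python
def dense_unit(x, weight, bias):
    """What one Dense unit without an activation computes: weight * x + bias."""
    return weight * x + bias

# Untrained guess: arbitrary starting values (hypothetical).
print(dense_unit(100.0, weight=0.5, bias=0.0))   # 50.0

# Values a trained model should approach for Celsius -> Fahrenheit.
print(dense_unit(100.0, weight=1.8, bias=32.0))  # 212.0
```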

One cool thing to think about. If you solved the Celsius to Fahrenheit conversion problem you saw in the video, you probably did so because you had some prior knowledge of how to convert between Celsius and Fahrenheit. For example, you may have known that 0 degrees Celsius corresponds to 32 degrees Fahrenheit. On the other hand, machine learning systems never have any previous knowledge to help them solve problems. They learn to solve these types of problems without any prior knowledge.

Recap
Congratulations! You just trained your first machine learning model. We saw that by training the model with input data and the corresponding output, the model learned to multiply the input by 1.8 and then add 32 to get the correct result.


This was really impressive considering that we only needed a few lines of code:

In [64]:
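import numpy as np

# Training data is not shown above; these pairs are assumed example values
# that follow f = 1.8 * c + 32 (rounded to whole degrees).
celsius_q = np.array([-40, -10, 0, 8, 15, 22, 38], dtype=float)
fahrenheit_a = np.array([-40, 14, 32, 46, 59, 72, 100], dtype=float)

l0 = tf.keras.layers.Dense(units=1, input_shape=[1])
model = tf.keras.Sequential([l0])
model.compile(loss='mean_squared_error', optimizer=tf.keras.optimizers.Adam(0.1))
history = model.fit(celsius_q, fahrenheit_a, epochs=500, verbose=False)
model.predict(np.array([100.0]))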

This example is the general plan for any machine learning program. You will use the same structure to create and train your neural network, and use it to make predictions.

The Training Process
The training process (happening in model.fit(...)) is really about tuning the internal variables of the network to the best possible values, so that they can map the input to the output. This is achieved through an optimization process called Gradient Descent, which uses numerical methods to find the best possible values for the internal variables of the model.

To do machine learning, you don't really need to understand these details. But for the curious: gradient descent iteratively adjusts parameters, nudging them in the correct direction a bit at a time until they reach the best values. In this case “best values” means that nudging them any more would make the model perform worse. The function that measures how good or bad the model is during each iteration is called the “loss function”, and the goal of each nudge is to “minimize the loss function.”
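For the curious, the nudging described above can be sketched in a few lines of plain Python. This is a bare-bones gradient descent on the Celsius to Fahrenheit problem, not what `model.fit` does internally. The training values, the learning rate and the number of steps are all assumptions chosen so that this toy example converges.

```python
# Training pairs (hypothetical values following f = 1.8 * c + 32).
celsius = [-40.0, -10.0, 0.0, 8.0, 15.0, 22.0, 38.0]
fahrenheit = [1.8 * c + 32.0 for c in celsius]
n = len(celsius)

def mse(w, b):
    """Mean squared error of the line w * c + b over the training pairs."""
    return sum((w * c + b - f) ** 2 for c, f in zip(celsius, fahrenheit)) / n

w, b = 0.0, 0.0          # internal variables, arbitrary starting point
learning_rate = 0.001    # step size (assumed; small enough to be stable here)

for step in range(10000):
    # Gradients of the mean squared error with respect to w and b.
    grad_w = sum(2 * (w * c + b - f) * c for c, f in zip(celsius, fahrenheit)) / n
    grad_b = sum(2 * (w * c + b - f) for c, f in zip(celsius, fahrenheit)) / n
    # Nudge each variable a little in the direction that lowers the loss.
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(round(w, 2), round(b, 2), mse(w, b))
```

After enough nudges, `w` and `b` should settle close to 1.8 and 32, the point where any further nudge would make the loss worse.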

The training process starts with a forward pass, where the input data is fed to the neural network (see Fig.1). Then the model applies its internal math on the input and internal variables to predict an answer ("Model Predicts a Value" in Fig. 1).

In our example, the input was the degrees in Celsius, and the model predicted the corresponding degrees in Fahrenheit.


Figure 1. Forward Pass

Once a value is predicted, the difference between that predicted value and the correct value is calculated. This difference is called the loss, and it's a measure of how well the model performed the mapping task. The value of the loss is calculated using a loss function, which we specified with the loss parameter when calling model.compile().
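To make the idea of a loss function concrete, here is a small plain-Python sketch of mean squared error, the loss we passed to model.compile(). The function name and the numbers are made up for illustration. The two sets of predictions have the same total absolute error, but the one with a single large miss gets the higher loss.

```python
def mean_squared_error(predicted, correct):
    """Average of the squared differences between predictions and correct values."""
    return sum((p - c) ** 2 for p, c in zip(predicted, correct)) / len(correct)

correct = [32.0, 46.0, 59.0, 72.0]

# Four small errors of 1 degree each.
many_small = [33.0, 47.0, 60.0, 73.0]
# One large error of 4 degrees, same total absolute error.
one_large = [36.0, 46.0, 59.0, 72.0]

print(mean_squared_error(many_small, correct))  # 1.0
print(mean_squared_error(one_large, correct))   # 4.0
```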

After the loss is calculated, the internal variables (weights and biases) of all the layers of the neural network are adjusted, so as to minimize this loss — that is, to make the output value closer to the correct value (see Fig. 2).


Figure 2. Backpropagation

This optimization process is called Gradient Descent. The specific algorithm used to calculate the new value of each internal variable is specified by the optimizer parameter when calling model.compile(...). In this example we used the Adam optimizer.
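As an optional aside, the Adam update rule itself is short enough to sketch in plain Python for a single variable. This is a simplified illustration, not the Keras implementation. The function name `adam_minimize`, the toy loss (x - 3)^2 and the hyperparameter values are assumptions chosen for this example.

```python
import math

def adam_minimize(grad_fn, x, learning_rate=0.1, beta1=0.9, beta2=0.999,
                  epsilon=1e-8, steps=500):
    """Minimize a one-variable function, given its gradient, with Adam updates."""
    m = 0.0  # running average of the gradients
    v = 0.0  # running average of the squared gradients
    for t in range(1, steps + 1):
        g = grad_fn(x)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        # Correct the bias caused by starting both averages at zero.
        m_hat = m / (1 - beta1 ** t)
        v_hat = v / (1 - beta2 ** t)
        # Step size adapts to the recent magnitude of the gradients.
        x -= learning_rate * m_hat / (math.sqrt(v_hat) + epsilon)
    return x

# Toy loss (x - 3)^2 has gradient 2 * (x - 3) and its minimum at x = 3.
result = adam_minimize(lambda x: 2 * (x - 3.0), x=0.0)
print(result)
```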

It is not required for this course, but if you're interested in learning more details about how the training process works, you can look at the lesson on reducing loss in Google’s machine learning crash course.

By now you should know what the following terms are:

Feature: The input(s) to our model
Examples: An input/output pair used for training
Labels: The output of the model
Layer: A collection of nodes connected together within a neural network.
Model: The representation of your neural network
Dense and Fully Connected (FC): Each node in one layer is connected to each node in the previous layer.
Weights and biases: The internal variables of the model
Loss: The discrepancy between the desired output and the actual output
MSE: Mean squared error, a type of loss function that counts a small number of large discrepancies as worse than a large number of small ones.
Gradient Descent: An algorithm that changes the internal variables a bit at a time to gradually reduce the loss function.
Optimizer: A specific implementation of the gradient descent algorithm. (There are many algorithms for this. In this course we will only use the “Adam” optimizer, whose name comes from ADAptive Moment estimation. It is widely used as a sensible default.)
Learning rate: The “step size” for loss improvement during gradient descent.
Batch: The set of examples used in one training step of the neural network
Epoch: A full pass over the entire training dataset
Forward pass: The computation of output values from input
Backward pass (backpropagation): The calculation of internal variable adjustments according to the optimizer algorithm, starting from the output layer and working back through each layer to the input.
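The relationship between examples, batches and epochs from the list above can be sketched in plain Python. The helper `batches` and the data values are hypothetical and only show how a dataset is walked through during training; no model is updated here.

```python
def batches(examples, batch_size):
    """Yield consecutive slices of the dataset, each with at most batch_size examples."""
    for start in range(0, len(examples), batch_size):
        yield examples[start:start + batch_size]

# (feature, label) pairs; hypothetical values following f = 1.8 * c + 32.
examples = [(-40.0, -40.0), (-10.0, 14.0), (0.0, 32.0), (8.0, 46.4), (15.0, 59.0)]

for epoch in range(2):                        # two full passes over the dataset
    for batch in batches(examples, batch_size=2):
        features = [c for c, _ in batch]      # inputs to the model
        labels = [f for _, f in batch]        # outputs the model should produce
        print(epoch, features, labels)
```

With five examples and a batch size of 2, each epoch consists of three batches, the last of which holds a single example.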

The Rectified Linear Unit (ReLU)
In this lesson we talked about ReLU and how it gives our Dense layer more power. ReLU stands for Rectified Linear Unit and it is a mathematical function that looks like this:
<img src ='image/tensorflow-l3f1.png'/>

As we can see, the ReLU function gives an output of 0 if the input is negative or zero; if the input is positive, the output is equal to the input.

ReLU gives the network the ability to solve nonlinear problems.

Converting Celsius to Fahrenheit is a linear problem because f = 1.8*c + 32 is the same form as the equation for a line, y = m*x + b. But most problems we want to solve are nonlinear. In these cases, adding ReLU to our Dense layers can help solve the problem.
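A short plain-Python sketch may help show why ReLU matters. The weights below are arbitrary values picked for illustration. Without an activation, two stacked units collapse into a single straight line; with ReLU on the hidden unit, the output bends at the point where the hidden value crosses zero.

```python
def relu(x):
    """Return 0 for negative or zero input, and the input itself otherwise."""
    return max(0.0, x)

def two_linear_layers(x):
    """Two stacked units with no activation: still a straight line."""
    hidden = 2.0 * x + 1.0
    return 3.0 * hidden - 4.0      # same as 6.0 * x - 1.0

def two_layers_with_relu(x):
    """Same weights (hypothetical values), but ReLU is applied to the hidden unit."""
    hidden = relu(2.0 * x + 1.0)
    return 3.0 * hidden - 4.0

for x in [-2.0, -1.0, 0.0, 1.0]:
    print(x, two_linear_layers(x), two_layers_with_relu(x))
```

For the two negative inputs the ReLU version stays flat at -4.0 while the purely linear version keeps falling, which is exactly the kind of shape a single line cannot represent.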

ReLU is a type of activation function. There are several of these functions (ReLU, Sigmoid, tanh, ELU), but ReLU is used most commonly and serves as a good default. To build and use models that include ReLU, you don’t have to understand its internals. But, if you want to know more, see this article on ReLU in Deep Learning.

Let’s review some of the new terms that were introduced in this lesson:

Flattening: The process of converting a 2D image into a 1D vector
ReLU: An activation function that allows a model to solve nonlinear problems
Softmax: A function that provides probabilities for each possible output class
Classification: A machine learning model used for distinguishing among two or more output categories
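Two of the terms above, flattening and softmax, are easy to try out in plain Python. The function names, the tiny 2x2 "image" and the scores are made up for illustration; in a real model, Keras layers perform these steps.

```python
import math

def flatten(image):
    """Turn a 2D list of pixel rows into one 1D list."""
    return [pixel for row in image for pixel in row]

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]   # subtract max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

image = [[0, 1],
         [2, 3]]
print(flatten(image))            # [0, 1, 2, 3]

# One raw score per output class; the highest score gets the highest probability.
probs = softmax([2.0, 1.0, 0.1])
print(probs, sum(probs))
predicted_class = probs.index(max(probs))
print(predicted_class)           # 0
```

Note that softmax over a single score always returns 1.0, which is why softmax is normally paired with an output layer that has one unit per class.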
