# TensorFlow

+ You first define in Python a graph of computations to perform, and then TensorFlow takes that graph and runs it efficiently using optimized C++ code.

![image.png](images/tfgraph.png)

+ It runs not only on Windows, Linux, and macOS, but also on mobile devices, including both iOS and Android.
+ It provides a very simple Python API called TF.Learn (tensorflow.contrib.learn), compatible with Scikit-Learn. As you will see, you can use it to train various types of neural networks in just a few lines of code. It was previously an independent project called Scikit Flow (or skflow).
+ It also provides another simple API called TF-slim (tensorflow.contrib.slim) to simplify building, training, and evaluating neural networks.
+ Several other high-level APIs have been built independently on top of TensorFlow, such as Keras or Pretty Tensor.
+ Its main Python API offers much more flexibility (at the cost of higher complexity) to create all sorts of computations, including any neural network architecture you can think of.
+ It includes highly efficient C++ implementations of many ML operations, particularly those needed to build neural networks. There is also a C++ API to define your own high-performance operations.
+ It provides several advanced optimization nodes to search for the parameters that minimize a cost function. These are very easy to use since TensorFlow automatically takes care of computing the gradients of the functions you define. This is called automatic differentiation (or autodiff).
+ It also comes with a great visualization tool called TensorBoard that allows you to browse through the computation graph, view learning curves, and more.
+ Google also launched a cloud service to run TensorFlow graphs.
+ Last but not least, it has a dedicated team of passionate and helpful developers, and a growing community contributing to improving it. It is one of the most popular open source projects on GitHub, and more and more great projects are being built on top of it (for examples, check out the resources page on https://www.tensorflow.org/, or https://github.com/jtoy/awesome-tensorflow).
+ To ask technical questions, you should use http://stackoverflow.com/ and tag your question with "tensorflow". You can file bugs and feature requests through GitHub. For general discussions, join the Google group.

### Creating Your First Graph and Running It in a Session




In [17]:
def reset_graph(seed=42):
    tf.reset_default_graph()
    tf.set_random_seed(seed)
    np.random.seed(seed)

In [2]:
import tensorflow as tf


x = tf.Variable(3, name="x")
y = tf.Variable(4, name="y")
f = x*x*y + y + 2

+ The most important thing to understand is that this code does not actually perform any computation, even though it looks like it does (especially the last line). 
+ It just creates a computation graph. 
+ In fact, even the variables are not initialized yet. 
+ To evaluate this graph, you need to open a TensorFlow session and use it to initialize the variables and evaluate f. 
+ A TensorFlow session takes care of placing the operations onto devices such as CPUs and GPUs and running them, and it holds all the variable values.

In [3]:
with tf.Session() as sess:    
    x.initializer.run()
    y.initializer.run()
    result = f.eval()

In [4]:
result

42

+ Instead of manually running the initializer for every single variable, you can use the `global_variables_initializer()` function. 
+ Note that it does not actually perform the initialization immediately, but rather creates a node in the graph that will initialize all variables when it is run.


In [5]:
init = tf.global_variables_initializer() # prepare an init node

with tf.Session() as sess:
    init.run() # actually initialize all the variables
    result = f.eval()
    
print(result)

42


+ A TensorFlow program is typically split into two parts: the first part builds a computation graph (this is called the construction phase), and the second part runs it (this is the execution phase). 
    + The construction phase typically builds a computation graph representing the ML model and the computations required to train it. 
    + The execution phase generally runs a loop that evaluates a training step repeatedly (for example, one step per mini-batch), gradually improving the model parameters.

Any node you create is automatically added to the default graph:

In [6]:
x1 = tf.Variable(1)
x1.graph is tf.get_default_graph()

True

+ In most cases this is fine, but sometimes you may want to manage multiple independent graphs. 
+ You can do this by creating a new Graph and temporarily making it the default graph inside a with block, like so:

In [7]:
graph = tf.Graph()
with graph.as_default():
    x2 = tf.Variable(2)
    
print(x2.graph is graph)
print(x2.graph is tf.get_default_graph())

True
False


+ In Jupyter (or in a Python shell), it is common to run the same commands more than once while you are experimenting. As a result, you may end up with a default graph containing many duplicate nodes. 
+ One solution is to restart the Jupyter kernel (or the Python shell), but a more convenient solution is to just reset the default graph by running:

In [8]:
tf.reset_default_graph()

### Lifecycle of a Node Value


When you evaluate a node, TensorFlow automatically determines the set of nodes that it depends on and evaluates these nodes first. 

For example, consider the following code:

In [9]:
w = tf.constant(3)
x = w + 2
y = x + 5
z = x * 3
with tf.Session() as sess:
    print(y.eval()) # 10
    print(z.eval()) # 15

10
15


+ First, this code defines a very simple graph. Then it starts a session and runs the graph to evaluate y:
+ TensorFlow automatically detects that y depends on x, which depends on w, so it first evaluates w, then x, then y, and returns the value of y. 
+ Finally, the code runs the graph to evaluate z. 
+ Once again, TensorFlow detects that it must first evaluate w and x. 
+ It is important to note that it will **not reuse the result of the previous evaluation** of w and x. 
+ In short, the preceding code evaluates w and x twice.

+ All node values are dropped between graph runs, except variable values, which are maintained by the session across graph runs.
+ A variable starts its life when its initializer is run, and it ends when the session is closed.

If you want to evaluate y and z efficiently, without evaluating w and x twice as in the previous code, you must ask TensorFlow to evaluate both y and z in just one graph run, as shown in the following code:

In [10]:
with tf.Session() as sess:
    y_val, z_val = sess.run([y, z])
    print(y_val) # 10
    print(z_val) # 15

10
15
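
The difference between separate graph runs and a single combined run can be illustrated with a toy graph in plain Python. This is only a sketch of the idea, not how TensorFlow is implemented: the `Node` class, the `run()` helper, and the `n_evals` counter are hypothetical names introduced for this illustration.

```python
# Toy illustration only: NOT TensorFlow's implementation.
class Node:
    """A node in a tiny computation graph that counts its evaluations."""

    def __init__(self, name, fn, *inputs):
        self.name = name
        self.fn = fn          # function computing this node's value
        self.inputs = inputs  # nodes this node depends on
        self.n_evals = 0


def run(fetches):
    """Evaluate the fetched nodes in one 'graph run'.

    Intermediate values are shared within the run and dropped afterwards.
    """
    cache = {}  # lives only for the duration of this run

    def evaluate(node):
        if node not in cache:
            args = [evaluate(inp) for inp in node.inputs]
            node.n_evals += 1
            cache[node] = node.fn(*args)
        return cache[node]

    return [evaluate(node) for node in fetches]


w = Node("w", lambda: 3)
x = Node("x", lambda w_val: w_val + 2, w)
y = Node("y", lambda x_val: x_val + 5, x)
z = Node("z", lambda x_val: x_val * 3, x)

# Two separate runs: x is recomputed in each run.
y_val, = run([y])
z_val, = run([z])
evals_separate = x.n_evals

# One combined run: x is computed once and shared by y and z.
x.n_evals = 0
y_val2, z_val2 = run([y, z])
evals_combined = x.n_evals

print(y_val, z_val, "x evaluated", evals_separate, "times")
print(y_val2, z_val2, "x evaluated", evals_combined, "time")
```

In this toy model the cache plays the role of the per-run node values: it is thrown away when `run()` returns, which is why fetching y and z together is cheaper than fetching them one after the other.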


+ In single-process TensorFlow, multiple sessions do not share any state, even if they reuse the same graph (each session would have its own copy of every variable).
+ In distributed TensorFlow, variable state is stored on the servers, not in the sessions, so multiple sessions can share the same variables.

### Linear Regression with TensorFlow

+ TensorFlow operations (also called ops for short) can take any number of inputs and produce any number of outputs. 
+ For example, the addition and multiplication ops each take two inputs and produce one output.
+ Constants and variables take no input (they are called source ops). 
+ The inputs and outputs are multidimensional arrays, called tensors (hence the name “tensor flow”). 
+ Just like NumPy arrays, tensors have a type and a shape. In fact, in the Python API, the values of evaluated tensors are returned as NumPy ndarrays.
+ They typically contain floats, but you can also use them to carry strings (arbitrary byte arrays).
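
Since tensors are described here by analogy with NumPy arrays, the notions of type and shape can be sketched with NumPy alone. The array names below are made up for this illustration, and the example shows NumPy behavior, not TensorFlow internals.

```python
import numpy as np

# Arrays of different ranks; each has a dtype (type) and a shape.
scalar = np.array(3.0, dtype=np.float32)      # rank 0
vector = np.array([1, 2, 3], dtype=np.int32)  # rank 1
matrix = np.zeros((2, 3), dtype=np.float32)   # rank 2
labels = np.array([b"cat", b"dog"])           # byte strings

for name, arr in [("scalar", scalar), ("vector", vector),
                  ("matrix", matrix), ("labels", labels)]:
    print(name, "shape:", arr.shape, "dtype:", arr.dtype)
```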

##### California Housing Prices

+ The following code manipulates 2D arrays to perform Linear Regression on the California housing dataset. 
+ It starts by fetching the dataset; 
+ then it adds an extra bias input feature (x0 = 1) to all training instances (it does so using NumPy so it runs immediately); 
+ then it creates two TensorFlow constant nodes, X and y, to hold this data and the targets, and it uses some of the matrix operations provided by TensorFlow to define theta. 
+ These matrix functions (transpose(), matmul(), and matrix_inverse()) are self-explanatory, but as usual they do not perform any computations immediately; instead, they create nodes in the graph that will perform them when the graph is run. 
+ You may recognize that the definition of theta corresponds to the Normal Equation: $\hat{\theta} = (X^T \cdot X)^{-1} \cdot X^T \cdot y$. 
+ Finally, the code creates a session and uses it to evaluate theta.

In [11]:
import numpy as np
from sklearn.datasets import fetch_california_housing

housing = fetch_california_housing()

m, n = housing.data.shape

housing_data_plus_bias = np.c_[np.ones((m, 1)), housing.data]

X = tf.constant(housing_data_plus_bias, dtype=tf.float32, name="X")
y = tf.constant(housing.target.reshape(-1, 1), dtype=tf.float32, name="y")

XT = tf.transpose(X)

theta = tf.matmul(tf.matmul(tf.matrix_inverse(tf.matmul(XT, X)), XT), y)

with tf.Session() as sess:
    theta_value = theta.eval()

In [12]:
theta_value

array([[-3.7185181e+01],
       [ 4.3633747e-01],
       [ 9.3952334e-03],
       [-1.0711310e-01],
       [ 6.4479220e-01],
       [-4.0338000e-06],
       [-3.7813708e-03],
       [-4.2348403e-01],
       [-4.3721911e-01]], dtype=float32)

In [13]:
## using plain numpy

X = housing_data_plus_bias
y = housing.target.reshape(-1, 1)
theta_numpy = np.linalg.inv(X.T.dot(X)).dot(X.T).dot(y)

print(theta_numpy)

[[-3.69419202e+01]
 [ 4.36693293e-01]
 [ 9.43577803e-03]
 [-1.07322041e-01]
 [ 6.45065694e-01]
 [-3.97638942e-06]
 [-3.78654266e-03]
 [-4.21314378e-01]
 [-4.34513755e-01]]


The main benefit of the TensorFlow code versus computing the Normal Equation directly using NumPy is that TensorFlow will automatically run this on your GPU card if you have one (and if TensorFlow was installed with GPU support).
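
The TensorFlow and NumPy results above agree only to roughly two or three significant digits. One plausible contributor is precision: the TensorFlow constants are declared as float32, while NumPy defaults to float64. The sketch below uses synthetic data (an assumption, not the housing dataset) to compare the Normal Equation in both precisions against `np.linalg.lstsq`, which avoids forming an explicit inverse. The `normal_equation()` helper is a hypothetical name for this example.

```python
import numpy as np

rng = np.random.RandomState(42)
m, n = 500, 3

# Synthetic features on different scales, plus a bias column of ones.
features = rng.rand(m, n) * [1.0, 10.0, 100.0]
X = np.c_[np.ones((m, 1)), features]
theta_true = np.array([[4.0], [3.0], [0.2], [-0.05]])
y = X @ theta_true + 0.01 * rng.randn(m, 1)


def normal_equation(X, y):
    """Closed-form least squares using an explicit matrix inverse."""
    return np.linalg.inv(X.T @ X) @ X.T @ y


theta_64 = normal_equation(X, y)  # float64 throughout
theta_32 = normal_equation(X.astype(np.float32), y.astype(np.float32))
theta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print("max |float32 - float64|:", np.abs(theta_32 - theta_64).max())
print("max |lstsq   - float64|:", np.abs(theta_lstsq - theta_64).max())
```

How large the float32 discrepancy becomes depends on how well conditioned the data is, so the numbers printed here should not be read as a prediction for the housing dataset.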

### Implementing Gradient Descent

When using Gradient Descent, remember that it is important to first normalize the input feature vectors, or else training may be
much slower. You can do this using TensorFlow, NumPy, Scikit-Learn’s StandardScaler, or any other solution you prefer. The
following code assumes that this normalization has already been done.

#### Manually computing the gradient



In [14]:
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()

scaled_housing_data = scaler.fit_transform(housing.data)
scaled_housing_data_plus_bias = np.c_[np.ones((m, 1)), scaled_housing_data]
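
As a rough sketch of what this scaling step computes, the same standardization can be written in plain NumPy. With its default settings, `StandardScaler` subtracts each feature's mean and divides by its population standard deviation; the data below is synthetic and the variable names are made up for this example.

```python
import numpy as np

rng = np.random.RandomState(0)

# Synthetic features with very different scales and offsets.
raw = rng.rand(100, 3) * [1.0, 50.0, 1000.0] + [0.0, 10.0, -500.0]

mean = raw.mean(axis=0)
std = raw.std(axis=0)  # population standard deviation (ddof=0)
scaled = (raw - mean) / std

print("means after scaling:", scaled.mean(axis=0))
print("stds after scaling: ", scaled.std(axis=0))
```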

In [21]:
reset_graph()

n_epochs = 1000
learning_rate = 0.01

X = tf.constant(scaled_housing_data_plus_bias, dtype=tf.float32, name="X")
y = tf.constant(housing.target.reshape(-1, 1), dtype=tf.float32, name="y")

theta = tf.Variable(tf.random_uniform([n + 1, 1], -1.0, 1.0), name="theta")

y_pred = tf.matmul(X, theta, name="predictions")

error = y_pred - y

mse = tf.reduce_mean(tf.square(error), name="mse")

# gradients = 2/m * tf.matmul(tf.transpose(X), error)
# training_op = tf.assign(theta, theta - learning_rate * gradients)
# init = tf.global_variables_initializer()
# with tf.Session() as sess:
#     sess.run(init)
#     for epoch in range(n_epochs):
#         if epoch % 100 == 0:
#             print("Epoch", epoch, "MSE =", mse.eval())
#         sess.run(training_op)
#     best_theta = theta.eval()

<tf.Tensor 'mse:0' shape=() dtype=float32>
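
The manual update rule in the commented-out lines (gradient = 2/m · Xᵀ · error, then theta ← theta − learning_rate · gradient) can be tried without TensorFlow. The following is a minimal NumPy sketch on synthetic, already standardized data (an assumption, not the housing dataset). It reuses the same `n_epochs` and `learning_rate` values; `theta_true` and `theta_closed_form` are names introduced for this example.

```python
import numpy as np

rng = np.random.RandomState(42)
m, n = 200, 3

# Synthetic standardized features plus a bias column of ones.
features = rng.randn(m, n)
features = (features - features.mean(axis=0)) / features.std(axis=0)
X = np.c_[np.ones((m, 1)), features]
theta_true = np.array([[2.0], [0.5], [-1.0], [3.0]])
y = X @ theta_true + 0.1 * rng.randn(m, 1)

n_epochs = 1000
learning_rate = 0.01
theta = rng.uniform(-1.0, 1.0, size=(n + 1, 1))  # random initialization

for epoch in range(n_epochs):
    error = X @ theta - y
    if epoch % 100 == 0:
        print("Epoch", epoch, "MSE =", np.mean(error ** 2))
    gradients = 2 / m * X.T @ error  # gradient of the MSE w.r.t. theta
    theta = theta - learning_rate * gradients

# Reference solution for comparison (least squares, no gradient descent).
theta_closed_form, *_ = np.linalg.lstsq(X, y, rcond=None)
print("max |GD - closed form|:", np.abs(theta - theta_closed_form).max())
```

Comparing against the least-squares solution is a convenient sanity check that the hand-written gradient has the right sign and scale.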