# TensorFlow
An open-source library for numerical computation using data flow graphs. Created and maintained by Google.<br><br>
TensorFlow gets its name from the **tensor**, an array of arbitrary dimensions. Using TensorFlow, one can manipulate tensors of higher dimensions.

## Why TensorFlow?
1. Efficient
2. Scalable
3. Maintainable
4. Portable
5. Flexible
6. Visualization in TensorBoard
7. Easy to save and restore models

In [2]:
import os
os.environ['TF_CPP_MIN_LOG_LEVEL']='2'
import tensorflow as tf

## How does TensorFlow work?
TensorFlow operations create, destroy, and manipulate tensors. All of these operations can be easily visualized using a *computation graph* or *data flow graph*.<br>
The graph's **nodes** are operations and its **edges** are tensors. Tensors flow through the graph and are manipulated at each node by an operation.
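
The build-first, run-later idea can be sketched in plain Python. This is only an analogy and not how TensorFlow is implemented; the `Node` class and the `constant` and `run` helpers are hypothetical names made up for this illustration.

```python
# Plain-Python analogy of a data flow graph; NOT TensorFlow's implementation.
class Node:
    """A graph node: an operation plus the nodes whose outputs it consumes."""

    def __init__(self, name, op, inputs=()):
        self.name = name
        self.op = op          # callable applied to the input values
        self.inputs = inputs  # upstream nodes (the incoming edges)


def constant(value, name):
    """A node with no inputs that always produces `value`."""
    return Node(name, lambda: value)


def run(node):
    """Evaluate `node` by first evaluating everything it depends on."""
    values = [run(parent) for parent in node.inputs]
    return node.op(*values)


# Building the graph computes nothing yet.
x = constant(5, "x")
y = constant(4, "y")
add = Node("add", lambda a, b: a + b, inputs=(x, y))
mul = Node("mul", lambda a, b: a * b, inputs=(x, y))

# Values only appear when a node is explicitly evaluated.
print(run(add))  # 9
print(run(mul))  # 20
```

Here the values flowing between nodes play the role of tensors, and `run` plays a role loosely similar to evaluating a node in a session.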

### Tensor
A tensor is an n-dimensional array:
* 0-d tensor : scalar
* 1-d tensor : vector
* 2-d tensor : matrix
<br>

A tensor can be defined as a constant or a variable.
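
To make the idea of dimensionality concrete, here is a small plain-Python sketch that reports the number of dimensions and the shape of nested lists. The helper `shape_of` is a hypothetical name for this illustration and assumes regular, non-empty nested lists; it is not part of TensorFlow.

```python
def shape_of(nested):
    """Return the shape of a regular, non-empty nested list as a tuple."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        nested = nested[0]  # descend one level; assumes equal-length rows
    return tuple(shape)


scalar = 24                # 0-d
vector = [1, 2, 3, 4]      # 1-d
matrix = [[1, 2], [3, 4]]  # 2-d

for value in (scalar, vector, matrix):
    shape = shape_of(value)
    # Number of dimensions is the length of the shape tuple.
    print(len(shape), shape)
```

The number of entries in the shape tuple is the number of dimensions: zero for a scalar, one for a vector, two for a matrix.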

### Constants

In [3]:
s = tf.constant(24)  #scalar
v = tf.constant([1, 2, 3, 4], dtype=tf.int64, name='vector')  #vector
m = tf.constant([[1,2], [3,4]]) #matrix
t = tf.constant( [ [[1,2,3],[2,3,4],[3,4,5]] , [[4,5,6],[5,6,7],[6,7,8]] , [[7,8,9],[8,9,10],[9,10,11]] ] )

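For example, tf.constant(24, name="scalar") creates a new tf.Operation named "scalar" and returns a tf.Tensor named "scalar:0". The scalar `s` above was created without a name, so TensorFlow assigns it the default name "Const".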

In [4]:
s

<tf.Tensor 'Const:0' shape=() dtype=int32>

As you can see, it just shows the name, shape and type of the tensor in the graph. We will see its value when we run it in a TensorFlow session.

## Using tf.Session() to evaluate the graph
A Session object encapsulates the environment in which memory is allocated for storing values of variables, operations are executed, and tensors are evaluated.

In [5]:
with tf.Session() as sess:
    result = sess.run(s)
    print("Scalar (1 entry):\n %s \n" % result)
    result = sess.run(v)
    print("Vector (4 entries) :\n %s \n" % result)
    result = sess.run(m)
    print("Matrix (2x2 entries):\n %s \n" % result)
    result = sess.run(t)
    print("Tensor (3x3x3 entries) :\n %s \n" % result)

Scalar (1 entry):
 24 

Vector (4 entries) :
 [1 2 3 4] 

Matrix (2x2 entries):
 [[1 2]
 [3 4]] 

Tensor (3x3x3 entries) :
 [[[ 1  2  3]
  [ 2  3  4]
  [ 3  4  5]]

 [[ 4  5  6]
  [ 5  6  7]
  [ 6  7  8]]

 [[ 7  8  9]
  [ 8  9 10]
  [ 9 10 11]]] 



In [6]:
s.shape, v.shape, m.shape, t.shape

(TensorShape([]),
 TensorShape([Dimension(4)]),
 TensorShape([Dimension(2), Dimension(2)]),
 TensorShape([Dimension(3), Dimension(3), Dimension(3)]))

### Some Operations

In [7]:
#Creating a graph
g = tf.Graph()

#Setting the generated graph as default graph
with g.as_default():
    x = tf.constant(5, name="x")
    y = tf.constant(4, name="y")
    
    add = tf.add(x, y, name="add")
    mul = tf.multiply(x, y, name="mul")
    
    with tf.Session() as sess:
        print(sess.run(add))
        print(mul.eval())

9
20


In [8]:
g = tf.Graph()

with g.as_default():
    x = tf.constant(5, name="x")
    y = tf.constant(4, name="y")
    
    add = tf.add(x, y, name="add")
    mul = tf.multiply(x, y, name="mul")
    
    with tf.Session() as sess:
        #sess.run(fetches) will help you fetch multiple values, eval() cannot.
        a, m = sess.run(fetches=[add, mul])
        print(a)
        print(m)

9
20


In [9]:
graph3 = tf.Graph()
with graph3.as_default():
    Matrix_one = tf.constant([[1,2,3],[2,3,4],[3,4,5]])
    Matrix_two = tf.constant([[2,2,2],[2,2,2],[2,2,2]])

    add_1_operation = tf.add(Matrix_one, Matrix_two)
    add_2_operation = Matrix_one + Matrix_two

with tf.Session(graph =graph3) as sess:
    result = sess.run(add_1_operation)
    print ("Defined using tensorflow function :")
    print(result)
    result = sess.run(add_2_operation)
    print ("Defined using normal expressions :")
    print(result)

Defined using tensorflow function :
[[3 4 5]
 [4 5 6]
 [5 6 7]]
Defined using normal expressions :
[[3 4 5]
 [4 5 6]
 [5 6 7]]


In [10]:
graph4 = tf.Graph()
with graph4.as_default():
    Matrix_one = tf.constant([[2,3],[3,4]])
    Matrix_two = tf.constant([[2,3],[3,4]])

    mul_operation = tf.matmul(Matrix_one, Matrix_two)

with tf.Session(graph = graph4) as sess:
    result = sess.run(mul_operation)
    print ("Defined using tensorflow function :")
    print(result)

Defined using tensorflow function :
[[13 18]
 [18 25]]


### Variables

TensorFlow variables are used to share and persist state that is manipulated by our program. When you define a variable, TensorFlow adds a tf.Operation to your graph. This operation stores a writable tensor value that persists between tf.Session.run calls. So, you can update the value of a variable on each run, whereas you cannot update an ordinary tensor (e.g. a tensor created by tf.constant()) across multiple runs in a session.

In [11]:
#Creating variable using Variable object
v_s = tf.Variable(5)
v_v = tf.Variable([1, 2, 3, 4], dtype=tf.int32)
v_m = tf.Variable(tf.zeros([25,4]), dtype=tf.float32, name="matrix")

To define variables we use the command tf.Variable(). To be able to use variables in a computation graph it is necessary to initialize them before running the graph in a session. This is done by running tf.global_variables_initializer().

In [12]:
update = tf.assign(v_s, 25)

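To update the value of a variable, we use the tf.assign(reference_variable, value_to_update) command. tf.assign takes two arguments: the reference_variable to update and the value_to_update to assign to it. Creating the assign operation does not change the variable; the update happens only when the operation is run in a session.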

In [13]:
init_op = tf.global_variables_initializer()

As stated earlier, Variables must be initialized by running an initialization operation after having launched the graph. We first have to add the initialization operation to the graph.

In [14]:
with tf.Session() as session:
    session.run(init_op)
    print(session.run(v_s))
    session.run(update)
    print(session.run(v_s))

5
25
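
The initialize-then-update behaviour shown above can be mimicked in plain Python. This is a loose analogy under stated assumptions, not the TensorFlow API: the `Variable` class and its `initialize`, `assign` and `read` methods are hypothetical names invented for this sketch.

```python
# Plain-Python analogy of variable state; NOT the TensorFlow API.
class Variable:
    """Holds a value that must be initialized before it can be read."""

    def __init__(self, initial_value):
        self.initial_value = initial_value
        self.value = None  # unset until initialize() is called

    def initialize(self):
        self.value = self.initial_value

    def assign(self, new_value):
        """Return a deferred update; nothing changes until it is called."""
        def update():
            self.value = new_value
            return self.value
        return update

    def read(self):
        if self.value is None:
            raise RuntimeError("variable used before initialization")
        return self.value


v_s = Variable(5)
update = v_s.assign(25)  # builds the update, does not apply it

v_s.initialize()
print(v_s.read())  # 5
update()           # the stored value changes only now
print(v_s.read())  # 25
```

As in the session example, the first read gives the initial value and the second read gives the assigned value, because the update takes effect only once it is executed.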


In [17]:
#Creating variable with tf.get_variable method
Weights = tf.get_variable("Weights", shape=(25,4), initializer=tf.random_uniform_initializer())
Bias = tf.get_variable("Bias", initializer=tf.random.normal([25]))

ValueError: Variable Weights already exists, disallowed. Did you mean to set reuse=True or reuse=tf.AUTO_REUSE in VarScope? Originally defined at:

  File "<ipython-input-15-4e5fe2949989>", line 2, in <module>
    Weights = tf.get_variable("Weights", shape=(25,4), initializer=tf.random_uniform_initializer())
  File "/home/akshay/venv/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 3326, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "/home/akshay/venv/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 3249, in run_ast_nodes
    if (await self.run_code(code, result,  async_=asy)):
  File "/home/akshay/venv/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 3058, in run_cell_async
    interactivity=interactivity, compiler=compiler, result=result)
  File "/home/akshay/venv/lib/python3.6/site-packages/IPython/core/async_helpers.py", line 68, in _pseudo_sync_runner
    coro.send(None)


In [18]:
g = tf.Graph()

with g.as_default():
    weights = tf.get_variable("Weights", shape=(25,4), initializer=tf.random_uniform_initializer())
    bias = tf.get_variable("Bias", initializer=tf.random.normal([25]))
    
    with tf.Session() as sess:
        #initialising all variables at once
        sess.run(tf.global_variables_initializer())
        print(weights.eval())
        print(sess.run(bias))

[[1.99679613e-01 5.72880149e-01 7.97352433e-01 4.76609468e-02]
 [1.45309567e-01 2.71592140e-02 7.42301941e-02 8.92637730e-01]
 [1.34758353e-01 2.31332541e-01 7.14537382e-01 2.79857516e-01]
 [7.24956632e-01 7.66727805e-01 8.31718445e-02 2.09059954e-01]
 [4.53525782e-02 2.70085216e-01 3.33217621e-01 8.06650996e-01]
 [4.68064547e-01 7.51365542e-01 4.29036856e-01 1.87196255e-01]
 [4.66478586e-01 4.64671135e-01 8.54185820e-02 5.02733350e-01]
 [1.13506794e-01 5.74873209e-01 1.01101160e-01 2.47488618e-01]
 [6.57383442e-01 6.27930045e-01 2.67067671e-01 8.87267113e-01]
 [7.32314587e-02 6.64256692e-01 7.02024460e-01 8.70902538e-01]
 [2.92692184e-01 9.94082689e-01 1.19687080e-01 1.02327228e-01]
 [7.93508172e-01 1.89821482e-01 1.98092818e-01 3.21712136e-01]
 [2.22542286e-02 3.03220987e-01 9.84072685e-04 3.58362436e-01]
 [7.48715043e-01 8.98232102e-01 9.74064112e-01 7.15899467e-02]
 [2.48135328e-02 3.58215094e-01 9.67699409e-01 9.16957140e-01]
 [3.49476218e-01 4.48064208e-01 5.31490088e-01 4.410138

## Visualizing Graphs using TensorBoard

In [19]:
import shutil
if os.path.exists('./graphs'):
    shutil.rmtree('./graphs/')

In [20]:
x = tf.constant(5, name="x")
y = tf.constant(4, name="y")

add = tf.add(x, y, name="add")
mul = tf.multiply(x, y, name="mul")

with tf.Session() as sess:
    #Create the summary writer after the graph is defined
    #and before running any operations in the session.
    #Since we have not created a graph explicitly,
    #every operation is added to the default graph.
    writer = tf.summary.FileWriter('./graphs', tf.get_default_graph())
    a, m = sess.run(fetches=[add, mul])
    print(a, m)
    #Close the writer so the graph is flushed to disk
    writer.close()
    
#To access graph in Tensorboard
#0. Copy the code. Add import tensorflow as tf (at the top). Save the file as tboard.py.
#1. Open terminal. Run python(or python3) tboard.py.
#2. Check for graphs folder in the same directory. 
#3. If it is present. Run: tensorboard --logdir="./graphs" --port 6006
#4. Open browser and go to: http://localhost:6006/

9 20


In [21]:
#If you are using Jupyter Notebook, you can launch TensorBoard directly with the following command
!tensorboard --logdir="./graphs" --port 6008
#Open browser and go to: http://localhost:6008/

TensorBoard 1.14.0 at http://Ubuntu:6008/ (Press CTRL+C to quit)

### Placeholders
A placeholder is simply a tensor that we will assign data to at a later time. It allows us to create our operations and build our computation graph without needing the data.
<br>
Placeholders are the simplest way to load data, but they are not efficient for loading large datasets. For those, consider estimators or the other options available in eager execution mode or tensorflow==2.x.

In [22]:
#creating a placeholder
data = tf.placeholder(shape=[25,4], dtype=tf.float32)

In [23]:
a = tf.placeholder(tf.float32, shape=[3])
b = tf.constant([5, 5, 5], tf.float32)

In [24]:
add = a+b

In [25]:
#Value to the placeholder is provided during the run
#Using feed_dict
with tf.Session() as sess:
    res = sess.run(add, feed_dict={a:[4,4,4]})
    print(res)

[9. 9. 9.]
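
The idea of building an expression first and supplying the data later can be sketched in plain Python. This is an analogy only, not the TensorFlow API: the `Placeholder` class, the `("add", left, right)` tuple encoding and the `run` helper are hypothetical constructs made up for this illustration.

```python
# Plain-Python analogy of a placeholder; NOT the TensorFlow API.
class Placeholder:
    """Names an input whose value is supplied only at run time."""

    def __init__(self, name):
        self.name = name


def run(expression, feed):
    """Evaluate `expression`, looking placeholder values up in `feed`."""
    if isinstance(expression, Placeholder):
        return feed[expression]
    if isinstance(expression, tuple):  # encoded as ("add", left, right)
        _, left, right = expression
        left_values = run(left, feed)
        right_values = run(right, feed)
        return [p + q for p, q in zip(left_values, right_values)]
    return expression  # a plain constant list


a = Placeholder("a")
b = [5.0, 5.0, 5.0]
add = ("add", a, b)  # built without knowing the value of `a`

# The same expression can be evaluated with different data.
print(run(add, {a: [4.0, 4.0, 4.0]}))  # [9.0, 9.0, 9.0]
print(run(add, {a: [1.0, 2.0, 3.0]}))  # [6.0, 7.0, 8.0]
```

The dictionary passed to `run` plays a role loosely similar to `feed_dict`: it maps each placeholder to the data used for that particular evaluation.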


#### An example to show how placeholders and variables are created in a data flow graph

In [26]:
import shutil
if os.path.exists('./graphs_linear/'):
    shutil.rmtree('./graphs_linear/')

In [29]:
import numpy as np
inp_data = np.random.rand(100,1).reshape(25,4)

glinear = tf.Graph()
with glinear.as_default():
    
    inputs = tf.placeholder(shape=[25,4], dtype=tf.float32)

    W = tf.get_variable("W", shape=(1, 4), initializer=tf.random_uniform_initializer())
    B = tf.get_variable("B", initializer=tf.random.normal([25,1]))

    y = tf.matmul(inputs, tf.transpose(W)) + B

    with tf.Session() as sess:
        writer = tf.summary.FileWriter('./graphs_linear', tf.get_default_graph())
        sess.run(tf.global_variables_initializer())
        res = sess.run(y, feed_dict={inputs:inp_data})
        print(res)

[[ 0.4380539 ]
 [-0.330014  ]
 [ 1.0499535 ]
 [ 0.4561507 ]
 [ 1.5493466 ]
 [ 0.02349877]
 [ 1.4971657 ]
 [ 3.0115342 ]
 [ 1.9058092 ]
 [ 0.05143076]
 [ 0.43108734]
 [ 0.05005777]
 [-0.79303914]
 [ 0.65684104]
 [ 1.1828899 ]
 [-0.19562519]
 [ 0.81509984]
 [ 1.8582444 ]
 [ 0.87833464]
 [ 1.3966749 ]
 [ 0.9746148 ]
 [-0.7965375 ]
 [ 0.25242484]
 [ 1.6816807 ]
 [ 1.7769268 ]]


Run the line below to visualize the graph in TensorBoard

In [30]:
!tensorboard --logdir="./graphs_linear/" --port 6008

TensorBoard 1.14.0 at http://Ubuntu:6008/ (Press CTRL+C to quit)

## Simple Linear Regression

Linear regression, in simple terms, is the approximation of a linear model used to describe the relationship between two or more variables. In a simple linear regression there are two variables: the dependent variable, which can be seen as the "state" or "final goal" that we study and try to predict, and the independent variable, also known as the explanatory variable, which can be seen as the "cause" of the "state".

When more than one independent variable is present the process is called multiple linear regression. <br>
When multiple dependent variables are predicted the process is known as multivariate linear regression.

The equation of a simple linear model is

$$Y = a X + b $$

where Y is the dependent variable, X is the independent variable, and <b>a</b> and <b>b</b> are the parameters we adjust. <b>a</b> is known as the "slope" or "gradient" and <b>b</b> is the "intercept". You can interpret this equation as Y being a function of X, or Y being dependent on X.

In [31]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

### Loading and pre-processing dataset

In [32]:
df = pd.read_csv('./../data/imdbtop1000/imdb_data.csv', sep='\t')
df = df.rename(columns={'User Votes': 'Votes',
                        'Imdb Rating': 'Rating',
                       'Gross(in Million Dollars)': 'Earnings',
                       'Runtime(Minutes)' : 'Runtime'})
#It is very important to scale the input features to a suitable range;
#this avoids very large values in the calculations
df.Votes = df.Votes / 1000000
df.head()

FileNotFoundError: [Errno 2] File b'./../data/imdbtop1000/imdb_data.csv' does not exist: b'./../data/imdbtop1000/imdb_data.csv'

In [None]:
df.describe()

In [None]:
#Correlation between columns to identify best feature for training a model
df.corr()

In [None]:
train_x = np.asarray(df.Votes)
train_y = np.asarray(df.Rating)

In [None]:
plt.figure(figsize=(8,6))
plt.title("Analysis of data points Votes Vs Rating")
sns.scatterplot(x=train_x, y=train_y)
plt.xlabel('User Votes')
plt.ylabel('IMDB Rating')
plt.show()

### Initialising weights and bias, and defining the model

In [None]:
w = tf.Variable(10.0)
b = tf.Variable(12.3)
y = w * train_x + b

Now, we are going to define a loss function for our regression, so we can train our model to better fit our data. In a linear regression, we minimize the mean squared error: the mean of the squared differences between the predicted values (obtained from the equation) and the target values (the data that we have). We define this quantity as the loss.

In [None]:
loss = tf.reduce_mean(tf.square(y - train_y))
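
As a hand-checkable illustration of the loss above, here is a plain-Python sketch of the mean squared error. The data points and the starting values of `w` and `b` are made up for this example and are not taken from the IMDB dataset; `mean_squared_error` is a hypothetical helper name.

```python
def mean_squared_error(predictions, targets):
    """Mean of the squared differences between predictions and targets."""
    squared = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return sum(squared) / len(squared)


# Hypothetical data, chosen so the arithmetic is easy to follow.
xs = [1.0, 2.0, 3.0]
targets = [2.0, 4.0, 6.0]

# A deliberately poor model: y = 1.0 * x + 0.0
w, b = 1.0, 0.0
predictions = [w * x + b for x in xs]

# Squared errors are 1, 4 and 9, so the loss is 14 / 3.
loss = mean_squared_error(predictions, targets)
print(loss)
```

A perfect fit would give a loss of zero, so a smaller value means the line is closer to the data.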

Then, we define the optimizer. The gradient descent optimizer takes one parameter, the learning rate, which controls how large a step the optimizer takes on each update.

In [None]:
optimizer = tf.train.GradientDescentOptimizer(0.1)

Finally, we define the training operation of our graph, which asks the optimizer to minimize the loss.

In [None]:
train = optimizer.minimize(loss)
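
To show roughly what one training step does, here is a minimal plain-Python sketch of gradient descent on the mean squared error of a line. It is an illustration under stated assumptions, not TensorFlow's optimizer: the gradients are written out by hand, and the data, the learning rate of 0.05 and the helper names `mse` and `gradient_step` are all made up for this example.

```python
def mse(w, b, xs, ys):
    """Mean squared error of the line y = w * x + b on the data."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradient_step(w, b, xs, ys, learning_rate):
    """Move w and b one step against the gradient of the loss."""
    n = len(xs)
    errors = [w * x + b - y for x, y in zip(xs, ys)]
    # Partial derivatives of the mean squared error.
    grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
    grad_b = 2 * sum(errors) / n
    return w - learning_rate * grad_w, b - learning_rate * grad_b


# Hypothetical data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

w, b = 0.0, 0.0
initial_loss = mse(w, b, xs, ys)
for step in range(1000):
    w, b = gradient_step(w, b, xs, ys, learning_rate=0.05)
final_loss = mse(w, b, xs, ys)

# w and b should approach the slope 2 and intercept 1 of the data.
print(w, b, initial_loss, final_loss)
```

A larger learning rate takes bigger steps and can diverge, while a smaller one converges more slowly; the value used here was chosen only because it is stable for this toy data.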

### Combining all previous steps under a graph

In [None]:
input_shape = train_x.shape

In [None]:
graph_lin_reg = tf.Graph()
with graph_lin_reg.as_default():
    #Placeholder for input features
    X = tf.placeholder(shape=input_shape, dtype=tf.float32)
    
    #Initialising weights
    w = tf.Variable(10.0)
    b = tf.Variable(12.3)
    
    #defining model
    y = w * X + b
    
    #defining loss
    loss = tf.reduce_mean(tf.square(y - train_y))
    
    #defining optimizer
    optimizer = tf.train.GradientDescentOptimizer(0.1)
    
    #initialising training
    train = optimizer.minimize(loss)
    
    init = tf.global_variables_initializer()
    sess = tf.Session()
    writer = tf.summary.FileWriter('./graphs_linearRegression', tf.get_default_graph())
    sess.run(init)
    
    loss_values = []
    train_data = []
    for step in range(100):
        _, loss_val, w_val, b_val = sess.run([train, loss, w, b], feed_dict={X:train_x})
        loss_values.append(loss_val)
        if step % 5 == 0:
            print(step, loss_val, w_val, b_val)
            train_data.append([w_val, b_val])

Uncomment the line below to visualize the graph in TensorBoard

In [None]:
# !tensorboard --logdir="./graphs_linearRegression/" --port 6008

In [None]:
plt.figure(figsize=(8,6))
plt.plot(loss_values, label='loss')
plt.xlabel('epochs')
plt.ylabel('loss')
plt.legend()
plt.show()