In [22]:
import numpy as np
import tensorflow as tf

----------
**What is TensorFlow?**  

---------
TensorFlow is a library for numerical computation created and maintained by Google. It's used mainly for machine learning (especially deep learning) tasks. While still in beta (version 1.0 is currently in alpha), the library was open sourced more than a year ago (November 9, 2015). Since then it has taken the Deep Learning (DL) community by storm, and many companies use it in production. The best place to learn more is the official TensorFlow page.

On the more technical side, TensorFlow allows you to do computations on your PC/Mac (CPU & GPU), Android, iOS and lots more places. Of course, being created by Google, it aims to bring massive parallelism to your backprop musings. The main abstraction behind all the magic is stateful dataflow graphs.

-----------
![caption](inline_images/parallel.gif)

---------------
TensorFlow provides multiple APIs. The lowest-level API, TensorFlow Core, gives you complete programming control. The higher-level APIs are built on top of TensorFlow Core and are typically easier to learn and use. In addition, the higher-level APIs make repetitive tasks easier and more consistent between different users.

----------------

### Definitions  

-------------------
**Tensors**

--------------------
A Tensor is an n-dimensional array (the term is most often used when n is 3 or more)  

It's a catch-all for terms you've previously heard
    
    --> i.e. A scalar is a 0-D tensor, a vector is a 1-D tensor, a matrix is a 2-D tensor, and a 3-D array is a 3-D tensor

- For example: A 4-D array of floating point numbers representing a mini-batch of images with dimensions [batch, height, width, channel]    
        
- Tensorflow uses numpy arrays to represent tensor values

--------------------

Some Examples

    --> This is an example of a rank 0 tensor; a scalar with shape [] -->  3.0
    --> This is an example of a rank 1 tensor; a vector with shape [3] -->  [1.0, 2.0, 3.0]
    --> This is an example of a rank 2 tensor; a matrix with shape [2, 3] -->  [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
    --> This is an example of a rank 3 tensor; a 3-D array with shape [2, 1, 2] -->  [[[1.0, 2.0]], [[5.0, 6.0]]]
    
--------------
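To make the rank and shape vocabulary concrete, here is a minimal pure-Python sketch (not part of TensorFlow) that derives both from nested lists. The helper name `shape_of` is made up for illustration, and it assumes the nesting is regular (every sub-list at a given depth has the same length).

```python
def shape_of(value):
    """Return the shape of a regularly nested list as a list of ints.

    Hypothetical helper for illustration; it only inspects the first
    element at each depth, so it assumes the nesting is regular.
    """
    shape = []
    while isinstance(value, list):
        shape.append(len(value))
        if not value:  # an empty list has no elements to descend into
            break
        value = value[0]
    return shape


examples = [
    3.0,                                   # scalar
    [1.0, 2.0, 3.0],                       # vector
    [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],    # matrix
    [[[1.0, 2.0]], [[5.0, 6.0]]],          # 3-D array
]

for value in examples:
    shape = shape_of(value)
    print("rank", len(shape), "shape", shape)
```

The rank is simply the length of the shape, which is why a scalar (shape `[]`) has rank 0.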

-------------
**Graphs**

-------------
- A computational graph is a series of TensorFlow operations (*nodes/vertices*) connected by tensors (*edges/connections*)
- The graph is composed of two types of objects:
    1. ***tf.Operation (or "ops")***
        - The nodes of the graph
        - Operations describe calculations that consume and produce tensors (constant, addition, log, matmul, etc.)
    2. ***tf.Tensor***
        - The edges in the graph
        - These represent the values that will flow through the graph (inputs and outputs of various operations)
        - Most TensorFlow functions return tf.Tensors
        
--------------
**An Example:**

- In a TensorFlow graph, the <span style="color:red">tf.matmul operation</span> would correspond to a <span style="color:red">single node</span> with <span style="color:blue">two incoming edges/tensors (the matrices to be multiplied)</span> and <span style="color:blue">one outgoing edge/tensor (the result of the multiplication)</span>  

----------------
**Even More Information (Skip This Section If You'd Like...):**

- Most TensorFlow programs start with a dataflow graph construction phase. In this phase, you invoke TensorFlow API functions that construct new <span style="color:red">tf.Operation (node)</span> and <span style="color:blue">tf.Tensor (edge)</span> objects and add them to a tf.Graph instance. TensorFlow provides a default graph that is an implicit argument to all API functions in the same context

- For example:
    - Calling tf.constant(42.0) creates a single <span style="color:red">tf.Operation</span> that produces the value 42.0, adds it to the default graph, and returns a <span style="color:blue">tf.Tensor</span> that represents the value of the constant
    - Calling tf.matmul(x, y) creates a single <span style="color:red">tf.Operation</span> that multiplies the values of <span style="color:blue">tf.Tensor objects x and y</span>, adds it to the default graph, and returns a <span style="color:blue">tf.Tensor that represents the result of the multiplication</span>
    - Executing v = tf.Variable(0) adds to the graph a <span style="color:red">tf.Operation</span> that will store a <span style="color:blue">writeable tensor value</span> that persists between tf.Session.run calls
        - The tf.Variable object wraps this operation and can be used like a tensor; doing so reads the current stored value
        - The tf.Variable object also has methods such as tf.Variable.assign and tf.Variable.assign_add that create tf.Operation objects that, when executed, update the stored value. (We will discuss variables more in depth later)
    - Calling tf.train.Optimizer.minimize will add <span style="color:red">operations</span> and <span style="color:blue">tensors</span> to the default graph that calculate gradients, and return a <span style="color:red">tf.Operation</span> that, when run, will apply those gradients to a set of variables.

----------------
**Important: tf.Tensors do not have values, they are just handles to elements in the computation graph** 

*Calling most functions in the TensorFlow API merely adds operations and tensors to the default graph; it does not perform the actual computation.   
You compose functions until you have a tf.Tensor or tf.Operation that represents the overall computation (the blueprint), e.g. performing one step of gradient descent, and then pass that object to a tf.Session to perform the computation (Note: **Sessions** will be covered in more detail later)*

------------------------
Let's build a simple computational graph to help illustrate the principles:  

The most basic **operation** is a ***constant***. The Python function that builds the **operation** takes a **tensor value** as *input*. The resulting **operation** takes *no inputs*. When run, it outputs the value that was passed to the ***constructor***.    
To create our first graph we will define two constants and add them together  

    --> We will create two floating point constants: a and b  
    --> We will create an operation that adds them together (total) 
        -- This is depicted in the image below
        -- Code is in the cell below the image

-----------------------

![first_graph](inline_images/first_graph.PNG)


In [23]:
# Define operations and their tensors for graph
a = tf.constant(1.0, dtype=tf.float32)
b = tf.constant(2.0) # also tf.float32 implicitly

total = a + b

print(a)
print(b)
print(total)

Tensor("Const_2:0", shape=(), dtype=float32)
Tensor("Const_3:0", shape=(), dtype=float32)
Tensor("add_2:0", shape=(), dtype=float32)


------------
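On a fresh graph, the print statements produce the output below. If the cell has been run more than once (as in the captured output above), the names carry larger numeric suffixes such as `Const_2`, because every run adds new, uniquely named operations to the default graph: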

*Tensor("Const:0", shape=(), dtype=float32)*  
*Tensor("Const_1:0", shape=(), dtype=float32)*  
*Tensor("add:0", shape=(), dtype=float32)*  

-------------

Notice that printing the tensors does not output the values 1.0, 2.0, and 3.0 as you might expect 

-------------

The above definition statements have only **built the computation graph**  
  
These tf.Tensor objects just represent the **future results** of the operations that **will be** run (within a ***session***)

Each operation in a graph is given a unique name
    
    - This name is independent of the names the objects are assigned to in Python
    - Tensors are named after the operation that produces them followed by an output index, as in "add:0" above

-----------------
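The naming behaviour described above can be mimicked with a small counter. This is a toy model written for this walkthrough, not TensorFlow's actual implementation; the class `ToyGraph` and its methods `unique_name` and `tensor_name` are hypothetical names.

```python
from collections import defaultdict


class ToyGraph:
    """Toy stand-in for a graph that hands out unique operation names."""

    def __init__(self):
        self._name_counts = defaultdict(int)

    def unique_name(self, base):
        # First use keeps the bare name; later uses get _1, _2, ...
        count = self._name_counts[base]
        self._name_counts[base] += 1
        return base if count == 0 else "{}_{}".format(base, count)

    def tensor_name(self, op_name, output_index=0):
        # A tensor is named after the op that produces it plus an output index
        return "{}:{}".format(op_name, output_index)


graph = ToyGraph()
first_run = [graph.unique_name(n) for n in ("Const", "Const", "add")]
second_run = [graph.unique_name(n) for n in ("Const", "Const", "add")]

print(first_run)   # ['Const', 'Const_1', 'add']
print(second_run)  # ['Const_2', 'Const_3', 'add_1']
print(graph.tensor_name(first_run[2]))  # add:0
```

Because the counter lives in the graph, re-running a defining cell keeps adding operations with fresh suffixes until the graph is reset.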
  
**A graph is the backbone of TensorFlow and every computation/operation/variable resides on the graph**  

**Everything that happens in the code, resides on a default graph provided by TensorFlow**

------------
*You can access and view the graph with the code below...*

**NOTE:**
    - if you've run the cells above multiple times then you can clear the graph (as it will develop duplicates) with the code
        --> tf.reset_default_graph()
        
--------------

In [3]:
# Access the graph
graph = tf.get_default_graph()

# Print the operation names if the graph holds exactly the three ops defined above;
# otherwise reset the graph and ask the user to re-run the cell that defines them
if len(graph.get_operations()) == 3:
    for op in graph.get_operations(): # Print the name of each operation in the graph
        print(op.name)
else:
    tf.reset_default_graph()  # Reset the default graph (removes every operation added so far)
    print("Please re-run the cell above that defines the graph operations!")

Const
Const_1
add


-----------
**Sessions**

------------
- Remember earlier when we tried to print the constant tensor and it returned a computational graph description?
- A session is what actually pushes values along the edges of the graph, using the architecture we built to compute real results

-------------------
- You can imagine a graph to be similar to a blueprint, and a session to be similar to a construction site

                        Note that graphs and sessions are created independently of each other

    - A **graph** only defines the computations or builds the blueprint
    - However, there are no values unless we run the graph or part of the graph within a **session**
    
--------------
**More Information (Skip This Section If You'd Like...)**
- TensorFlow uses the tf.Session class to represent a connection between the client program (typically a Python program, although a similar interface is available in other languages) and the C++ runtime
- A tf.Session object provides access to devices in the local machine, and remote devices using the distributed TensorFlow runtime
- A tf.Session also caches information about your graph so that you can efficiently run the same computation multiple times

------------
To get the value of the tensor we created (*'total'*), we pass it to *sess.run()* inside a session... let's see how that works!

------------

In [4]:
# This is one way to structure a session
with tf.Session() as sess:
    print("Inside a session total returns : {}".format(sess.run(total)))
print("Outside a session total returns : {}".format(total))


# ------------------------------------------------------------------------------------------------------------------------------
#                                                NOTE ON OPENING/STRUCTURING SESSIONS
# ------------------------------------------------------------------------------------------------------------------------------
#
# - Since a tf.Session owns physical resources (such as GPUs and network connections) ..
#   .. it is typically used as a context manager (in a with block) that automatically closes the session when you exit the block
# 
# - It is also possible to create a session without using a with block ..
#   .. but you should explicitly call tf.Session.close when you are finished with it to free the resources
# ------------------------------------------------------------------------------------------------------------------------------

Inside a session total returns : 3.0
Outside a session total returns : Tensor("add:0", shape=(), dtype=float32)


---------------------
**SOoo... What Just Happened?**

------------------
When you request the output of a node with Session.run, TensorFlow backtracks through the graph and runs all the nodes that provide input to the requested output node

    --> i.e. TF runs the two constant nodes ('a and b') as they provide input into the output node ('total')

--------------  
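The backtracking step can be illustrated with a toy evaluator in plain Python. This is a sketch of the idea only, not how TensorFlow executes graphs; the dictionary layout and the `run` function are assumptions made for this example, and the `unused` node is added to show that nodes the fetch does not depend on are never evaluated.

```python
# Each node maps to (kind, payload): constants hold a value,
# "add" nodes hold the names of their two input nodes.
toy_graph = {
    "a": ("const", 1.0),
    "b": ("const", 2.0),
    "unused": ("const", 99.0),
    "total": ("add", ("a", "b")),
}


def run(graph, fetch, evaluated=None):
    """Evaluate `fetch` by first evaluating the nodes it depends on."""
    if evaluated is None:
        evaluated = []
    kind, payload = graph[fetch]
    if kind == "const":
        value = payload
    elif kind == "add":
        left, right = (run(graph, name, evaluated)[0] for name in payload)
        value = left + right
    else:
        raise ValueError("unknown node kind: {}".format(kind))
    evaluated.append(fetch)
    return value, evaluated


value, evaluated = run(toy_graph, "total")
print(value)      # 3.0
print(evaluated)  # ['a', 'b', 'total']
```

The `unused` node never appears in the evaluation log because nothing on the path to `total` needs it.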

*NOTE:* You could also pass multiple tensors to tf.Session.run
    
    --> The run method transparently handles any combo of tuples (val_1, val_2) or dictionaries {'key1':val_1, 'key2':val_2}
    --> I'll print both the constants and the addition operation in the cell below

In [5]:
with tf.Session() as sess:
    # Note: The names ('constants' and 'addition') are arbitrary and do not change the output
    print(sess.run({'constants':(a, b), 'addition':total}))
    print(sess.run((a,b,total)))

{'constants': (1.0, 2.0), 'addition': 3.0}
(1.0, 2.0, 3.0)


----------
**What Else Do I Need To Know?**

----------

During a single call to tf.Session.run, each tensor takes exactly **one value**, for example:

    --> Let's use tf.random_uniform to produce a tf.Tensor that generates a random 3-element vector (with values in [0, 1))

In [6]:
# Add an operation that generates 3 random numbers between 0 and 1 to the graph taking no inputs and having one output (vec)
vec = tf.random_uniform(shape=(3,))

In [7]:
# Values rounded to 3 decimal places for ease of viewing (np.round(#,3))
with tf.Session() as sess:
    print("These two rows show different values as the vector is generated within different tf.Session.runs")
    print(np.round(sess.run(vec),3))
    print(np.round(sess.run(vec),3))

    print("\nThese two rows show the same value as the vector is generated within one single tf.Session.runs")
    print(np.round(sess.run((vec, vec)),3))

These two rows show different values as the vector is generated within different tf.Session.runs
[0.511 0.406 0.237]
[0.526 0.469 0.454]

These two rows show the same value as the vector is generated within one single tf.Session.runs
[[0.568 0.742 0.081]
 [0.568 0.742 0.081]]
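
One way to picture why both fetches agree within a single run: imagine that each call to `run` keeps a private cache, so every node is computed at most once per call. The sketch below is a plain-Python analogy, not TensorFlow's mechanism; `ToyRandomUniform` and `toy_run` are hypothetical names.

```python
import random


class ToyRandomUniform:
    """Toy node that produces `size` fresh random floats in [0, 1) when computed."""

    def __init__(self, size):
        self.size = size

    def compute(self):
        return [random.random() for _ in range(self.size)]


def toy_run(fetches):
    """Evaluate each fetched node, computing any node at most once per call."""
    cache = {}  # lives only for the duration of this call
    results = []
    for node in fetches:
        if id(node) not in cache:
            cache[id(node)] = node.compute()
        results.append(cache[id(node)])
    return results


vec = ToyRandomUniform(3)

first = toy_run([vec])[0]
second = toy_run([vec])[0]
print(first != second)          # separate runs draw separate values

same_a, same_b = toy_run([vec, vec])
print(same_a == same_b)         # one run, one value
```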


-------------
**One Final Note on Sessions For Now (Skip This Section if You'd Like...)**  

-------------
- Some TensorFlow functions return tf.Operations instead of tf.Tensors
    - The result of calling run on an Operation is ***None***
    - You run an operation to cause a side-effect, not to retrieve a value
        - Examples of this include the initialization, and training ops (To be demonstrated later)

-----------------

--------------
**Placeholders and Feeding**

-------------------
As it stands, our graph so far is not especially interesting because it always produces a constant result. This isn't always the case though.  
A graph can be *parameterized* to accept **external inputs, known as placeholders**
    
    -->  A placeholder is a promise to provide a value later, like a function argument, or training data

- Placeholders are used to feed in data from outside the computational graph
    - i.e. if you need to pass data to the model from outside TensorFlow, you'll need to define a placeholder    
    
    
- Each placeholder must be fed at runtime with the appropriate inputs it is waiting for
    - You supply your data through the **feed_dict** argument when running your computation  
    - feed_dict is a dictionary, passed to tf.Session.run, that maps each placeholder to the value it should take for that run
----------------
*Let's see an example in the cell below...*

----------------

In [8]:
x = tf.placeholder(tf.float32) # We define our datatype as float32
y = tf.placeholder(tf.float32) # We define our datatype as float32
z = x + y # We define an operation that adds the two placeholders

In [9]:
with tf.Session() as sess:
    
    print("We can feed in constants: x=100, y=50")
    print(sess.run(z, feed_dict = {x : 100, y : 50}))
    
    print("\nWe can feed in vectors: x = [1,2], y = [4,8]")
    print(sess.run(z, feed_dict = {x : [1,2], y : [4,8]}))
    
    print("\nWe can feed in multidimensional arrays/matrices: x = [[-1,2,-3,4],[-5,6,-7,8]], y = [[16,-15,14,-13],[12,-11,10,-9]]")
    print(sess.run(z, feed_dict = {x : [[-1,2,-3,4],[-5,6,-7,8]], y : [[16,-15,14,-13],[12,-11,10,-9]]}))
    

We can feed in constants: x=100, y=50
150.0

We can feed in vectors: x = [1,2], y = [4,8]
[ 5. 10.]

We can feed in multidimensional arrays/matrices: x = [[-1,2,-3,4],[-5,6,-7,8]], y = [[16,-15,14,-13],[12,-11,10,-9]]
[[ 15. -13.  11.  -9.]
 [  7.  -5.   3.  -1.]]


----------
**Final Note On Placeholders/Feeding**  
  
----------  
- Typically we load feed_dict from something else (like training data or testing data)
- feed_dict can also be defined prior to the sess.run call
        
        >>> feed = {x:100, y:100}
        
        >>> sess.run(z, feed_dict = feed) 
        OR 
        >>> sess.run(z, feed_dict = {x:100, y:100})

        
- The feed_dict argument can be used to overwrite any tensor in the graph  
- The only difference between placeholders and other tf.Tensors is that placeholders throw an error if no value is fed to them

----------
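The feeding rules above can be sketched with a toy evaluator. This is an illustration in plain Python, not TensorFlow code: nodes are tuples, and `toy_eval` and `MissingFeedError` are names invented for the example.

```python
class MissingFeedError(Exception):
    """Raised when a placeholder is evaluated without a fed value."""


# Nodes are plain tuples so they can be used as dictionary keys
x = ("placeholder", "x")
y = ("placeholder", "y")
z = ("add", x, y)


def toy_eval(node, feed_dict):
    # A fed value wins over whatever the node would have computed
    if node in feed_dict:
        return feed_dict[node]
    kind = node[0]
    if kind == "placeholder":
        raise MissingFeedError("no value fed for placeholder {!r}".format(node[1]))
    if kind == "add":
        return toy_eval(node[1], feed_dict) + toy_eval(node[2], feed_dict)
    raise ValueError("unknown node kind: {}".format(kind))


print(toy_eval(z, {x: 100, y: 50}))   # 150

feed = {x: 100, y: 100}               # a feed dictionary built ahead of time
print(toy_eval(z, feed))              # 200

print(toy_eval(z, {z: -1}))           # -1, the fed value replaces the addition

try:
    toy_eval(z, {x: 1})
except MissingFeedError as err:
    print("error:", err)
```

Feeding `z` directly short-circuits the addition, which mirrors the point that a feed can override any tensor, while only placeholders insist on being fed.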

------------
**Variables**

------------

***In short:***

- Variables are tensor-like objects that can be added to a graph and hold a value that persists between runs
- You use **tf.Variable** for trainable variables such as weights (W) and biases (b) for your model (mutable)
    - Variables require an initial value to be specified when they are declared
    - Variables also require initialization before use ... which can be taken care of globally so as to avoid repetition
- You use **tf.placeholder** to feed in actual training examples (immutable)  
    - Placeholders DO NOT require a value to be specified when they are declared
---------

***In long:***  

- You add a variable to the graph by constructing an instance of the class Variable
    - The Variable() constructor requires an initial value for the variable, which can be a Tensor of any type and shape
    - The initial value defines the type and shape of the variable
    - After construction, the type and shape of the variable are fixed
    - The value can be changed using one of the assign methods
- If you want to change the shape of a variable later you have to use an assign Op with validate_shape=False.
- Just like any Tensor, variables created with Variable() can be used as inputs for other Ops in the graph
    - Additionally, all the operators overloaded for the Tensor class are carried over to variables, so you can also add nodes to the graph by just doing arithmetic on variables

------------

**Let's see an example below:**

------------------------

In [10]:
b = tf.Variable(2.0)
print(b)

<tf.Variable 'Variable:0' shape=() dtype=float32_ref>


------------
- As before... this doesn't print 2.0
- To print the number instead of the tensor object we have to run it inside a session
    - Also remember that we have to initialize the variables using the command...

                                        >>> tf.global_variables_initializer()  
    
**Let's see it below**

------------

In [11]:
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(b))

2.0


------------------
**As we can now see, it printed as we expected from within the Session... let's move on!**

------------------

----------------
### LET'S PAUSE TO REVIEW WITH A FEW IMAGES

--------------
**As we can see below we have the following processes:**

--------------
- **Tensors** are edges
- **Operations** are nodes
- Operations connected by tensors make up what is known as a **Graph** (a blueprint for what you wish to accomplish)
- A graph can only be run inside a **Session** (a session is the implementation of the blueprint -- bringing it to life)  
--------------
  
  
- **Tensorflow**, as a whole, allows the processes executed within the sessions (flow through the graph) to be distributed across multiple devices/processes
--------------

![summary](./inline_images/tensorflow_helpful_image.png)  
    
----------- 
- Below is an example graph showing the *flow* along *tensors* through the *nodes* of the graph    
      
-----------
![example_graph](./inline_images/example_graph.PNG)

--------------

### LET'S MOVE ON TO SOME ACTUAL CODING!!

--------------

### Let's Explore Some Basic Functions We Will Use

---------------
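***1. tf.random_normal([array shape], stddev=standard_deviation)***

---------------
- Draws random values from a normal distribution to fill a tensor of the specified shape, using the specified standard deviation (the second positional argument is the mean, which defaults to 0, so pass the standard deviation by keyword)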

---------------
***2. tf.global_variables_initializer()***
- Returns an operation that initializes all TensorFlow variables (as previously discussed); it replaces the deprecated tf.initialize_all_variables()  

---------------
***3. tf.reduce_mean(array)***
- Calculates the mean of an array (the array can be a tensorflow variable)

---------------
***4. tf.argmax(array, axis)***
- Very similar to NumPy's argmax. Returns the index of the maximum value in a tensor along the specified axis (the index, not the value itself)

---------------
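To get a feel for what these functions compute, here are rough plain-Python counterparts that operate on nested lists. They are simplified sketches written for this walkthrough (2-D input only, hypothetical `toy_` names), not the TensorFlow implementations.

```python
import random


def toy_random_normal(rows, cols, stddev, seed=None):
    """Fill a rows x cols matrix with samples from a zero-mean normal distribution."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, stddev) for _ in range(cols)] for _ in range(rows)]


def toy_reduce_mean(matrix):
    """Mean over every element of a 2-D list."""
    flat = [value for row in matrix for value in row]
    return sum(flat) / len(flat)


def toy_argmax(matrix, axis):
    """Index of the largest entry along `axis` (0 = per column, 1 = per row)."""
    if axis == 1:
        return [row.index(max(row)) for row in matrix]
    if axis == 0:
        columns = [list(col) for col in zip(*matrix)]
        return [col.index(max(col)) for col in columns]
    raise ValueError("axis must be 0 or 1 for a 2-D list")


weights = toy_random_normal(2, 3, stddev=0.01, seed=0)
print(len(weights), len(weights[0]))              # 2 3

print(toy_reduce_mean([[1.0, 2.0], [3.0, 4.0]]))  # 2.5

scores = [[0.1, 0.7, 0.2],
          [0.9, 0.05, 0.05]]
print(toy_argmax(scores, axis=1))                 # [1, 0]
print(toy_argmax(scores, axis=0))                 # [1, 0, 0]
```

Note that `toy_argmax` returns positions, not values: for the first row of `scores` it reports index 1, where the value 0.7 sits.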

### Let's Complete Some Linear Regression To Explore These Functions and Tensorflow

---------------

**Create the Training Data**

In [12]:
# 101 numbers equally spaced between -1 and 1
train_X = np.linspace(-1, 1, 101)

print("\nX -- Training Array\n")
print(train_X)


X -- Training Array

[-1.   -0.98 -0.96 -0.94 -0.92 -0.9  -0.88 -0.86 -0.84 -0.82 -0.8  -0.78
 -0.76 -0.74 -0.72 -0.7  -0.68 -0.66 -0.64 -0.62 -0.6  -0.58 -0.56 -0.54
 -0.52 -0.5  -0.48 -0.46 -0.44 -0.42 -0.4  -0.38 -0.36 -0.34 -0.32 -0.3
 -0.28 -0.26 -0.24 -0.22 -0.2  -0.18 -0.16 -0.14 -0.12 -0.1  -0.08 -0.06
 -0.04 -0.02  0.    0.02  0.04  0.06  0.08  0.1   0.12  0.14  0.16  0.18
  0.2   0.22  0.24  0.26  0.28  0.3   0.32  0.34  0.36  0.38  0.4   0.42
  0.44  0.46  0.48  0.5   0.52  0.54  0.56  0.58  0.6   0.62  0.64  0.66
  0.68  0.7   0.72  0.74  0.76  0.78  0.8   0.82  0.84  0.86  0.88  0.9
  0.92  0.94  0.96  0.98  1.  ]


In [13]:
randomness = np.random.randn(train_X.shape[0]) * 0.225

print("\nRandomness Array\n")
print(randomness)


Randomness Array

[ 0.03582397 -0.09258469  0.07339797  0.27250087  0.3084785   0.53581066
 -0.14874046 -0.05947265 -0.28003108 -0.11182668 -0.24166805  0.35693169
 -0.30925841 -0.01404736  0.00328727  0.03434783  0.26064318 -0.1532392
 -0.10889427  0.22139626  0.4223643   0.05864717 -0.23961182  0.21394186
 -0.27066577  0.0248633   0.12284425 -0.00753126  0.07545059  0.24240982
 -0.06081502 -0.0830893   0.40294765 -0.03582967  0.21237549 -0.05147962
 -0.19288975 -0.15837292 -0.28907397 -0.0714619   0.29404456  0.10993598
 -0.34383004 -0.20625316 -0.17389004  0.31465138 -0.47942715  0.25871168
  0.13702043 -0.14613349  0.45383417  0.45570307  0.00278165 -0.13320626
  0.08012388 -0.09368986  0.01102227  0.3902249  -0.14005521 -0.42216742
 -0.07528841 -0.14303251 -0.22940418 -0.04435646 -0.1791005   0.15628359
  0.20833272  0.42119224 -0.04125379 -0.13258771  0.03082654  0.21025027
  0.11857941 -0.08236529 -0.06541995  0.25716597 -0.0477811  -0.16055015
 -0.13388875  0.06250947  0.15075

In [14]:
# Generate an artificial relationship where Y is 3 times bigger than X (plus/minus the randomness)
train_Y = (3 * train_X) + randomness

print("\nY -- Training Array\n")
print(train_Y)


Y -- Training Array

[-2.96417603 -3.03258469 -2.80660203 -2.54749913 -2.4515215  -2.16418934
 -2.78874046 -2.63947265 -2.80003108 -2.57182668 -2.64166805 -1.98306831
 -2.58925841 -2.23404736 -2.15671273 -2.06565217 -1.77935682 -2.1332392
 -2.02889427 -1.63860374 -1.3776357  -1.68135283 -1.91961182 -1.40605814
 -1.83066577 -1.4751367  -1.31715575 -1.38753126 -1.24454941 -1.01759018
 -1.26081502 -1.2230893  -0.67705235 -1.05582967 -0.74762451 -0.95147962
 -1.03288975 -0.93837292 -1.00907397 -0.7314619  -0.30595544 -0.43006402
 -0.82383004 -0.62625316 -0.53389004  0.01465138 -0.71942715  0.07871168
  0.01702043 -0.20613349  0.45383417  0.51570307  0.12278165  0.04679374
  0.32012388  0.20631014  0.37102227  0.8102249   0.33994479  0.11783258
  0.52471159  0.51696749  0.49059582  0.73564354  0.6608995   1.05628359
  1.16833272  1.44119224  1.03874621  1.00741229  1.23082654  1.47025027
  1.43857941  1.29763471  1.37458005  1.75716597  1.5122189   1.45944985
  1.54611125  1.80250947  1.95

**Create Placeholders for X and Y We Wish To Investigate**

In [15]:
X = tf.placeholder(tf.float32)
Y = tf.placeholder(tf.float32)

-----------------
**A Note On Linear Regression**

----------------
Background:  

- You have 2 *(assume just two for now although there are possibly more)* variables
    - An ***independent*** variable (let's call it ***X***)
    - A ***dependent*** variable (let's call it ***Y***)

-----------------
Linear Regression:

- Linear regression is a technique that accomplishes two things (mainly)
    1. Find out how the two (or more) variables are related to each other
        - Fitting a line gives us a slope that tells us how much ***Y*** changes per unit change in ***X***. The closely related *correlation coefficient* is a number between -1 and 1 that indicates the strength of the linear relationship between the two variables.
            -  ***0*** means no linear correlation
            -  ***1*** means they are perfectly positively correlated (an ***increase in X*** means an ***increase in Y***)
            -  ***-1*** means perfectly negatively correlated (***increase in X*** means a ***decrease in Y*** and vice versa).
    2. We can use linear regression for prediction
        - i.e. If we know the rough relationship between ***X*** and ***Y***, then we can use this relationship to predict values of ***Y*** for a new value of ***X*** that we are curious about

------------------
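The difference between the correlation coefficient (bounded by -1 and 1) and the fitted slope (unbounded) is easy to check numerically. The sketch below uses the standard textbook formulas in plain Python; the function names and the data values are made up for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def least_squares_slope(xs, ys):
    """Slope of the least-squares line through the points."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


xs = [1.0, 2.0, 3.0, 4.0]
ys_up = [10.0, 20.0, 30.0, 40.0]      # Y rises with X
ys_down = [40.0, 30.0, 20.0, 10.0]    # Y falls as X rises

# Correlation is +1 / -1 for these perfectly linear data; the slope is +10 / -10
print(pearson_r(xs, ys_up), least_squares_slope(xs, ys_up))
print(pearson_r(xs, ys_down), least_squares_slope(xs, ys_down))
```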
The Typical Form Of Linear Regression:  

![linear regression](inline_images/linear_regression.PNG)

------------------
For example: 
  
Let's say ***X*** is the ***number of workers*** painting a house and ***Y*** is the ***amount of time*** needed to finish a job.   
You do several jobs with different numbers of workers and you time how long it takes to finish each job.

We put those numbers on a graph, do simple linear regression, and learn the following:   

1. Does increasing the number of workers really decrease the time needed to finish a job? (i.e. are they correlated and how much)

        --> The HOW MUCH part, is where the slope comes in
        --> Slope is the RATE OF CHANGE of Y with respect to X               <<<<  m = Δy/Δx = rise/run  >>>>
            --> i.e. if we change X by 1, how much does Y change!

2. If we get a customer who wants the job done in a very short time, then we can use our study to predict how many workers we might need to finish it  

3. The other factor to take into consideration is the offset or *bias* (represented by the symbol b): in some models, if we decrease the independent variable to zero and some value of y (the dependent variable) still remains, then there is an offset inherent in the model; this offset is denoted b.

-----------------
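As a worked version of the painting example, the sketch below fits a line to a handful of jobs and then inverts it to estimate how many workers a deadline requires. The job data are hypothetical numbers invented for illustration, and the closed-form least-squares formulas are used instead of TensorFlow to keep it short.

```python
import math

# Hypothetical past jobs: (number of workers, hours to finish)
workers = [1, 2, 3, 4, 5]
hours = [20.0, 16.0, 13.0, 9.0, 5.0]

mean_x = sum(workers) / len(workers)
mean_y = sum(hours) / len(hours)

# Slope m: change in hours per additional worker (negative means faster)
m = (sum((x - mean_x) * (y - mean_y) for x, y in zip(workers, hours))
     / sum((x - mean_x) ** 2 for x in workers))
# Offset b: the fitted value of Y when X is zero
b = mean_y - m * mean_x

# Invert y = m*x + b to find the crew size for a target time,
# rounding up because we cannot hire a fraction of a worker
target_hours = 8.0
workers_needed = math.ceil((target_hours - b) / m)

print("slope m:", round(m, 2))
print("offset b:", round(b, 2))
print("workers needed for", target_hours, "hours:", workers_needed)
```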
***NOTE:***

*Another important property of linear regression is that it extends easily to more than 2 variables. The only difference is that instead of a line you fit a plane with 3 variables and a hyperplane with 4 or more.*

-----------------

**Create the Model**

Using what we discussed above we can create a model for our data

        --> Note w (weights) replaces m (slope) as this nomenclature is more common in machine learning
        --> The function of the term is identical... it explains the rate of change between two variables

---------------

In [16]:
# Create the variable w (w stands for weights, which in this case plays the same role as the slope (m) discussed above)
w = tf.Variable(0.0, name="weights")

In [17]:
# Create the model (equation) assuming no bias (b = 0) --- i.e. y = wx instead of y = mx+b
Y_model = tf.multiply(X, w)

**Mean Squared Error (MSE) is the average of the squared error that is used as the loss function for least squares regression:**

\begin{align}
MSE = \frac{1}{n}\sum_{i=1}^n (w^T \cdot x_i - y_i)^2
\end{align}

**In this equation (assuming MSE loss) the numerator (the individual loss) is also known as the *cost***

\begin{align}
COST = (w^T \cdot x_i - y_i)^2
\end{align}
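
For the single-weight model used in this notebook (y = wx with no bias, so the product is an ordinary multiplication), the gradient that gradient descent follows can be written out explicitly. This is a sketch of the textbook update rule, not a description of TensorFlow's internals; the symbol η for the step size (learning rate) is introduced here for the derivation:

\begin{align}
\frac{\partial}{\partial w}(w \cdot x_i - y_i)^2 &= 2\,x_i\,(w \cdot x_i - y_i) \\
w &\leftarrow w - \eta \cdot 2\,x_i\,(w \cdot x_i - y_i)
\end{align}

Each training example nudges w in the direction that shrinks its own squared error.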

In [18]:
# Define cost as described above
cost = tf.pow(Y_model-Y, 2)

**Now we need to tell TensorFlow to find the line (y = wx here, since we assumed b = 0) that minimizes the cost (i.e. makes MSE as small as possible)**

--> This is known (for two variables) as the line of best fit and SHOULD be  -->  y ≈ 3x  (from above)
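
What the optimizer does with this cost can be sketched without TensorFlow. The loop below is a plain-Python approximation of per-example gradient descent on the model y = wx; the function name `fit_slope` is hypothetical, the gradient 2x(wx - y) is derived by hand rather than by automatic differentiation, and the data are noise-free for simplicity.

```python
def fit_slope(xs, ys, learning_rate=0.01, epochs=100):
    """Fit y = w * x by per-example gradient descent on the squared error."""
    w = 0.0  # same starting point as a weight initialised to 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            gradient = 2.0 * x * (w * x - y)  # d/dw of (w*x - y)**2
            w -= learning_rate * gradient
    return w


# 101 evenly spaced points between -1 and 1, with y exactly 3 times x
xs = [-1.0 + 0.02 * i for i in range(101)]
ys = [3.0 * x for x in xs]

w = fit_slope(xs, ys)
print(round(w, 4))
```

With noisy targets, as in the training data above, the result lands near 3 rather than exactly on it.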

In [19]:
# Define the training operation with a step size of 0.01; its goal is to minimize the cost function defined above
train_op = tf.train.GradientDescentOptimizer(0.01).minimize(cost)

**Time To Train!!**

In [20]:
first_summary = tf.summary.scalar(name='My_first_scalar_summary', tensor=w)

In [21]:
# Start session
with tf.Session() as sess:
    
    # Initialize all the tensorflow variables
    sess.run(tf.global_variables_initializer())

    # Define how many iterations we want
    iterations = 100
    
    writer = tf.summary.FileWriter('./graphs', sess.graph)
    
    # Run the loop for the specified number of iterations
    for i in range(iterations):
    
        # loop through all the training pairs (X,Y)
        for x, y in zip(train_X, train_Y):
            
            # Run the training operation feeding the individual example in (TF will use gradient descent on this to calculate path to cost minimization)
            sess.run(train_op, feed_dict={X: x, Y: y})

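        # TensorBoard: log w once per pass over the data so that each step i gets a single point
        summary = sess.run(first_summary)
        writer.add_summary(summary, i)

    # Flush pending summaries to disk and release the file handle
    writer.close()

    # Print the weight after we're done (should be approx. 3)
    print(sess.run(w))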

3.0343678
