# TensorFlow Introduction

Dale Smith, Ph.D. Math, Georgia Tech.

Data Scientist with FraudScope.

https://github.com/dtsmith2001

## Preliminaries

I am using Anaconda 4.3.1. I ran

    conda update --yes conda
    conda update --all --yes

TensorFlow can be installed from conda-forge, but that package is still at version 0.12. To get a current release, use pip instead:

    pip install tensorflow

For GPU use, follow the [instructions](https://www.tensorflow.org/install/), or use an AWS instance that is already configured. You can also build your own instance; see [http://expressionflow.com/2016/10/09/installing-tensorflow-on-an-aws-ec2-p2-gpu-instance/](http://expressionflow.com/2016/10/09/installing-tensorflow-on-an-aws-ec2-p2-gpu-instance/) but note this uses an older version of TensorFlow.

I also installed [jupyter-themer](https://github.com/transcranial/jupyter-themer), which is why my Pandas tables have every other row grey.

## TensorFlow

TensorFlow was created internally at Google to serve as their AI platform for deep learning, but it does more than deep learning or neural networks. We will use it to train a linear regression model.

[Keras](https://keras.io) was recently adopted by Google as their neural network front-end. I highly recommend using Keras instead of TensorFlow to build a neural network.

Many people believe Google open sourced TensorFlow because they realized the data is more valuable than the code itself.

## Declarative versus Imperative Programming

With **declarative programming** you tell the computer what you want to do, but not how to do it. This is what you do when you write an SQL query or a program in TensorFlow.

**Imperative programming** requires you to tell the computer how to do the job (Python, C/C++, C#, etc).
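
To make the contrast concrete, here is a minimal sketch in plain Python (no TensorFlow required). The names ```imperative_total``` and ```declared_total``` are made up for this illustration. The imperative version computes each step as it is written. The declarative-flavored version only describes the computation and defers the work until it is asked for, which loosely mirrors how a TensorFlow graph is built first and run later.

```python
# Imperative: every statement executes immediately, in the order written.
imperative_total = 0.0
for value in [3.0, 4.0, 10.0]:
    imperative_total += value


# Declarative flavor: describe the computation first, run it later.
# The description here is just a function (a hypothetical stand-in for
# a graph); nothing is added up until it is called.
def declared_total():
    return sum([3.0, 4.0, 10.0])


print(imperative_total)   # already computed above, step by step
print(declared_total())   # computed only at this call
```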

## Let's get started!

In [1]:
import tensorflow as tf
import numpy as np

In [2]:
hello = tf.constant('Hello, TensorFlow!')
hello

<tf.Tensor 'Const:0' shape=() dtype=string>

In [3]:
sess = tf.Session()

**Note**: Creating the session does not compute anything. Nothing is evaluated until you call ```sess.run```.

In [4]:
print(sess.run(hello))

b'Hello, TensorFlow!'


## Computational Graphs

Yes, these are real graph structures. More precisely, they are directed acyclic graphs (DAGs). These structures are often used to find the best path for recomputing dependent values, say, bond prices.

A **computational graph** is a series of operations (+, -, etc.) expressed as nodes of a graph. Each node takes zero or more tensors as input and produces a tensor as output.

A **[tensor](https://en.wikipedia.org/wiki/Tensor)** is an object represented as a multidimensional array with respect to a basis, much as a matrix is a representation of a **linear transformation** with respect to a basis. The stress tensor and the curvature tensor are used extensively in the study of elastic materials and in general relativity. Tensors have also been used in psychology to handle multidimensional data, and more recently in machine learning.
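
As a rough picture of what a computational graph is, here is a small sketch in plain Python. The classes ```Constant``` and ```Add``` are hypothetical and are not TensorFlow's API; they only illustrate that building nodes and evaluating them are two separate steps.

```python
class Constant:
    """A leaf node that holds a fixed value and has no inputs."""

    def __init__(self, value):
        self.value = value

    def evaluate(self):
        return self.value


class Add:
    """An operation node whose inputs are other nodes."""

    def __init__(self, left, right):
        self.left = left
        self.right = right

    def evaluate(self):
        # Evaluate the inputs first, then apply the operation.
        return self.left.evaluate() + self.right.evaluate()


# Building the graph computes nothing yet.
node1 = Constant(3.0)
node2 = Constant(4.0)
node3 = Add(node1, node2)

# Evaluation walks the graph from node3 back to its inputs.
result = node3.evaluate()
print(result)
```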

### Create Nodes and Visualize a Simple Computation Graph

We can create a constant node as

In [5]:
node1 = tf.constant(3.0)

In [6]:
node2 = tf.constant(4.0)

In [7]:
node2

<tf.Tensor 'Const_2:0' shape=() dtype=float32>

Add the two nodes:

In [8]:
node3 = tf.add(node1, node2)
node3

<tf.Tensor 'Add:0' shape=() dtype=float32>

In [9]:
sess = tf.Session()

In [10]:
print("sess.run(node3): ", sess.run(node3))

sess.run(node3):  7.0


In [11]:
node4 = tf.constant(10.0)

In [12]:
node5 = tf.add(node3, node4)

TensorFlow comes with an application called **TensorBoard** which we can use to visualize the computational graph for the nodes we've constructed.

In [13]:
graph = tf.get_default_graph()
summary_writer = tf.summary.FileWriter("/Users/driver.dan12/Temp", graph)
summary_writer.flush()

Now run ```tensorboard --logdir=/Users/driver.dan12/Temp``` and open http://localhost:6006 in a browser. Here is the computation graph: ![graph1.png](graph1.png)

## Warnings from TensorFlow

> ```The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.```

These warnings are harmless. To get rid of them, compile your own TensorFlow with those instructions enabled. See http://stackoverflow.com/questions/42270739/how-do-i-resolve-these-tensorflow-warnings/42975902#42975902.

Let's clear the default graph and take a closer look at how a graph is executed.

In [14]:
tf.reset_default_graph()

## Does this look odd?

In [15]:
x = tf.Variable(0, name='x')
model = tf.global_variables_initializer()
with tf.Session() as session:
    for i in range(5):
        session.run(model)
        x = x + 1
        print(session.run(x))

1
2
3
4
5


In [16]:
x = tf.Variable(0, name='x')
model = tf.global_variables_initializer()
with tf.Session() as session:
    for i in range(5):
        x = x + 1
        session.run(model)
        print(session.run(x))

1
2
3
4
5


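Nope, this makes perfect sense: nothing in the computation graph is executed until the ```run``` command. It therefore does not matter whether we put ```x = x + 1``` before or after ```session.run(model)```. The statement ```session.run(model)``` initializes the variable to zero. The statement ```x = x + 1``` does not change that variable; it rebinds the Python name ```x``` to a new node that adds 1 to the previous node. Each pass through the loop makes the graph one addition longer, so ```session.run(x)``` evaluates the graph built so far and prints 1, 2, 3, 4, 5.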
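
The effect of ```x = x + 1``` can be imitated in plain Python. The ```Node``` class below is a hypothetical stand-in, not TensorFlow code: its ```+``` operator builds a new node instead of adding numbers, and ```run``` evaluates whatever has been built so far.

```python
class Node:
    """Hypothetical graph node; `+` builds a new node instead of adding."""

    def __init__(self, value=None, inputs=()):
        self.value = value
        self.inputs = inputs

    def __add__(self, other):
        # Wrap plain numbers so they can be graph inputs too.
        if not isinstance(other, Node):
            other = Node(value=other)
        return Node(inputs=(self, other))

    def run(self):
        # Leaf nodes return their value; other nodes sum their inputs.
        if not self.inputs:
            return self.value
        return sum(node.run() for node in self.inputs)


variable = Node(value=0)   # plays the role of tf.Variable(0)
x = variable
outputs = []
for i in range(5):
    x = x + 1              # rebinds x to a larger graph; variable is untouched
    outputs.append(x.run())

print(outputs)
print(variable.value)
```

The stored value never changes; only the expression attached to the name ```x``` grows.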

But what does ```global_variables_initializer``` do?

It initializes all the *global variables* in the graph. Global variables are shared across machines in a distributed environment, in contrast to *local variables*, which are per-process variables.

We don't need to initialize ```Constant```, just ```Variable```. Reflect on that a little bit.

In [17]:
graph = tf.get_default_graph()
summary_writer = tf.summary.FileWriter("/Users/driver.dan12/Temp", graph)
summary_writer.flush()

Here is the computation graph ![graph2.png](graph2.png)

In [18]:
tf.reset_default_graph()

## Matrix Multiplication

In [19]:
W = tf.Variable(tf.random_uniform([1000,1000]))
x = tf.Variable(tf.ones([1000,1]))
sess = tf.Session()

In [20]:
sess.run(tf.matmul(W, x))

FailedPreconditionError: Attempting to use uninitialized value Variable_1
	 [[Node: Variable_1/read = Identity[T=DT_FLOAT, _class=["loc:@Variable_1"], _device="/job:localhost/replica:0/task:0/cpu:0"](Variable_1)]]

Caused by op 'Variable_1/read', defined at:
  [... Jupyter and IPython kernel frames omitted ...]
  File "<ipython-input-19-803b4c44af68>", line 2, in <module>
    x = tf.Variable(tf.ones([1000,1]))
  File "/Applications/anaconda/lib/python3.5/site-packages/tensorflow/python/ops/variables.py", line 197, in __init__
    expected_shape=expected_shape)
  File "/Applications/anaconda/lib/python3.5/site-packages/tensorflow/python/ops/variables.py", line 315, in _init_from_args
    self._snapshot = array_ops.identity(self._variable, name="read")
  File "/Applications/anaconda/lib/python3.5/site-packages/tensorflow/python/ops/gen_array_ops.py", line 1490, in identity
    result = _op_def_lib.apply_op("Identity", input=input, name=name)
  File "/Applications/anaconda/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 763, in apply_op
    op_def=op_def)
  File "/Applications/anaconda/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 2327, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/Applications/anaconda/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1226, in __init__
    self._traceback = _extract_stack()

FailedPreconditionError (see above for traceback): Attempting to use uninitialized value Variable_1
	 [[Node: Variable_1/read = Identity[T=DT_FLOAT, _class=["loc:@Variable_1"], _device="/job:localhost/replica:0/task:0/cpu:0"](Variable_1)]]

The variables ```W``` and ```x``` were created but never initialized, so reading them fails. Running the initializer first fixes this.


In [21]:
sess.run(tf.global_variables_initializer())

In [22]:
sess.run(tf.matmul(W, x))

array([[ 486.96716309],
       [ 494.3180542 ],
       [ 493.93426514],
       [ 498.88598633],
       [ 496.36682129],
       [ 490.76159668],
       [ 496.66485596],
       [ 501.85339355],
       [ 496.61224365],
       [ 484.89224243],
       [ 499.96755981],
       [ 502.76635742],
       [ 485.69223022],
       [ 489.42373657],
       [ 499.30114746],
       [ 479.47381592],
       [ 518.17590332],
       [ 502.5244751 ],
       [ 492.92700195],
       [ 507.40032959],
       [ 488.90527344],
       [ 521.24780273],
       [ 500.87557983],
       [ 483.1940918 ],
       [ 502.75689697],
       [ 473.52655029],
       [ 500.91094971],
       [ 503.74822998],
       [ 500.21224976],
       [ 500.36148071],
       [ 504.19726562],
       [ 493.81417847],
       [ 494.17263794],
       [ 504.15057373],
       [ 498.22952271],
       [ 504.20053101],
       [ 506.99389648],
       [ 499.05679321],
       [ 502.36593628],
       [ 509.06689453],
       [ 496.71917725],
       [ 488.823
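
Why are all the entries close to 500? ```tf.random_uniform``` draws from the interval [0, 1) by default, and multiplying by a column of ones sums each row of ```W```, so every entry is a sum of 1000 uniform draws. The plain-Python sketch below reproduces one such entry; the seed and variable names are arbitrary choices for this illustration.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

n = 1000
# One row of W: n independent draws from uniform [0, 1).
row = [random.random() for _ in range(n)]

# Multiplying by a column of ones is the same as summing the row.
entry = sum(w_ij * 1.0 for w_ij in row)

expected_mean = n * 0.5            # the mean of uniform [0, 1) is 1/2
expected_std = (n / 12.0) ** 0.5   # the variance of uniform [0, 1) is 1/12

print(entry, expected_mean, expected_std)
```

With a standard deviation of roughly 9, values between about 470 and 530 are what we should expect to see.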

In [23]:
graph = tf.get_default_graph()
summary_writer = tf.summary.FileWriter("/Users/driver.dan12/Temp", graph)
summary_writer.flush()

Open http://localhost:6006 for TensorBoard. The computation graph is ![graph3.png](graph3.png)

In [24]:
sess.close()
tf.reset_default_graph()

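## NumPy-Style Fancy Indexing is Not Supported!

Basic slices such as ```W[0:2, :]``` work on tensors, but NumPy-style indexing with arrays of indices does not. Picking arbitrary elements of a matrix requires reshaping it into a vector and using an index vector, for example with ```tf.reshape``` and ```tf.gather```.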
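
The reshape-and-index idea can be sketched in plain Python with nested lists. The matrix and the chosen positions are made up for this illustration; in TensorFlow the corresponding steps would presumably be a ```tf.reshape``` to a vector followed by a ```tf.gather``` with the index vector.

```python
# A 3x4 matrix stored as nested lists (row-major order).
matrix = [[0, 1, 2, 3],
          [10, 11, 12, 13],
          [20, 21, 22, 23]]
n_cols = len(matrix[0])

# "Reshape" into a vector by concatenating the rows.
flat = [value for row in matrix for value in row]

# Pick elements (0, 1), (1, 3) and (2, 0) with one index vector.
positions = [(0, 1), (1, 3), (2, 0)]
index_vector = [r * n_cols + c for r, c in positions]

picked = [flat[i] for i in index_vector]
print(index_vector)
print(picked)
```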

## Let's Look at Linear Regression in TensorFlow

For another example, see http://tneal.org/post/tensorflow-iris/TensorFlowIris/.

In [25]:
w = tf.Variable(tf.random_uniform([4, 1]))

In [26]:
sess = tf.Session()

In [27]:
sess.run(w)

FailedPreconditionError: Attempting to use uninitialized value Variable
	 [[Node: _send_Variable_0 = _Send[T=DT_FLOAT, client_terminated=true, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=545785210948208991, tensor_name="Variable:0", _device="/job:localhost/replica:0/task:0/cpu:0"](Variable)]]

In [28]:
sess.run(tf.global_variables_initializer())

In [29]:
sess.run(w)

array([[ 0.21173167],
       [ 0.71594012],
       [ 0.55553162],
       [ 0.00845778]], dtype=float32)

In [30]:
def f(X):
	return tf.matmul(X, w)

In [31]:
def objective(X, Y):
	return tf.reduce_sum(tf.square(tf.subtract(Y,f(X))))

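Let's generate some synthetic data, and see how we can get a NumPy array into TensorFlow. The vector ```a``` holds the true weights. ```YY``` is computed from ```XX``` and ```a``` without any noise, so the regression should recover ```a```.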

In [32]:
a = np.array([[1, 2, 3, 4]], dtype=np.float32)
a

array([[ 1.,  2.,  3.,  4.]], dtype=float32)

In [33]:
XX = np.random.rand(10000, 4)
XX

array([[ 0.2978405 ,  0.51167069,  0.21030663,  0.8166861 ],
       [ 0.8610743 ,  0.73845028,  0.86269288,  0.43507281],
       [ 0.0254931 ,  0.59831671,  0.81338779,  0.29694047],
       ..., 
       [ 0.56366656,  0.39858554,  0.45267797,  0.89401097],
       [ 0.56901679,  0.5466776 ,  0.70618478,  0.59461648],
       [ 0.34110282,  0.86880576,  0.45805236,  0.75228494]])

In [34]:
YY = np.dot(XX, a.transpose())
YY

array([[ 5.21884618],
       [ 6.66634475],
       [ 4.8500518 ],
       ..., 
       [ 6.2949154 ],
       [ 6.15939224],
       [ 6.46201116]])

In [35]:
X = tf.placeholder(tf.float32, [None, 4])
Y = tf.placeholder(tf.float32, [None, 1])

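**Automatic differentiation** at work here: ```tf.gradients``` adds nodes to the graph that compute the derivative of the objective with respect to ```w```, so there is no more calculus by hand.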

In [36]:
grad = tf.gradients(objective(X,Y), [w])
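
For reference, the gradient that TensorFlow derives for this objective can also be written by hand: the derivative of the sum of squared residuals with respect to weight j is -2 times the sum over rows of x_ij * (y_i - x_i . w). The plain-Python sketch below checks that formula against finite differences on a tiny made-up data set. The finite-difference check is only a sanity test for this illustration; it is not how ```tf.gradients``` works internally.

```python
def objective(X, Y, w):
    """Sum of squared residuals, mirroring the TensorFlow objective."""
    total = 0.0
    for x_row, y in zip(X, Y):
        prediction = sum(x * w_j for x, w_j in zip(x_row, w))
        total += (y - prediction) ** 2
    return total


def analytic_gradient(X, Y, w):
    """d(objective)/dw_j = -2 * sum_i x_ij * (y_i - x_i . w)."""
    grad = [0.0] * len(w)
    for x_row, y in zip(X, Y):
        residual = y - sum(x * w_j for x, w_j in zip(x_row, w))
        for j, x in enumerate(x_row):
            grad[j] += -2.0 * x * residual
    return grad


def numeric_gradient(X, Y, w, eps=1e-6):
    """Central finite differences, used only as an independent check."""
    grad = []
    for j in range(len(w)):
        up = list(w)
        up[j] += eps
        down = list(w)
        down[j] -= eps
        grad.append((objective(X, Y, up) - objective(X, Y, down)) / (2 * eps))
    return grad


# Tiny made-up data set: 3 rows, 2 features.
X_small = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0]]
Y_small = [1.0, 0.0, 2.0]
w_small = [0.1, -0.2]

exact = analytic_gradient(X_small, Y_small, w_small)
approx = numeric_gradient(X_small, Y_small, w_small)
print(exact)
print(approx)
```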

Let's set up the optimization problem. We want to adjust the weights ```w``` in order to minimize the sum of squared differences.

```python
def objective(X, Y):
    return tf.reduce_sum(tf.square(tf.subtract(Y,f(X))))
```

In [37]:
step = tf.constant(1e-5)

In [38]:
sess.run(tf.global_variables_initializer())

In [39]:
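# Build the update op once. Calling tf.assign_add inside the loop would add a new node to the graph on every iteration.
update = tf.assign_add(w, tf.multiply(-step, grad[0]))
for i in range(200):
	sess.run(update, feed_dict={X:XX, Y:YY})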

In [40]:
sess.run(w)

array([[ 1.05474031],
       [ 2.00871181],
       [ 2.98940301],
       [ 3.94768739]], dtype=float32)

What is ```w```? It is close to the true weights ```a = [1, 2, 3, 4]``` that we used to generate ```YY```. Remember, we're trying to find the weights that fit

```python
def f(X):
	return tf.matmul(X, w)
```

to ```Y```.
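
The update loop is ordinary gradient descent, and it can be reproduced without TensorFlow. The sketch below is a plain-Python version under a few assumptions of my own: only 100 rows instead of 10000, a step size of 1e-3 scaled up to match the smaller data set, 2000 iterations so the weights converge fully, and a starting point of zero rather than random weights.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

true_w = [1.0, 2.0, 3.0, 4.0]   # same role as `a` in the notebook
n_rows = 100                    # smaller than the notebook's 10000 rows

X_data = [[random.random() for _ in true_w] for _ in range(n_rows)]
Y_data = [sum(x * a for x, a in zip(row, true_w)) for row in X_data]

w = [0.0] * len(true_w)
step = 1e-3                     # assumed step size, scaled for fewer rows

for _ in range(2000):
    # Gradient of sum((y - x.w)^2) with respect to w.
    grad = [0.0] * len(w)
    for row, y in zip(X_data, Y_data):
        residual = y - sum(x * w_j for x, w_j in zip(row, w))
        for j, x in enumerate(row):
            grad[j] -= 2.0 * x * residual
    # Same update rule as the notebook: w <- w - step * grad.
    w = [w_j - step * g for w_j, g in zip(w, grad)]

print(w)
```

Because the data contain no noise, the recovered weights should approach the true weights as the number of iterations grows.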

In [41]:
sess.run(objective(X, Y))

InvalidArgumentError: You must feed a value for placeholder tensor 'Placeholder' with dtype float
	 [[Node: Placeholder = Placeholder[dtype=DT_FLOAT, shape=[], _device="/job:localhost/replica:0/task:0/cpu:0"]()]]

Caused by op 'Placeholder', defined at:
  [... Jupyter and IPython kernel frames omitted ...]
  File "<ipython-input-35-bfdb0d5263c1>", line 1, in <module>
    X = tf.placeholder(tf.float32, [None, 4])
  File "/Applications/anaconda/lib/python3.5/site-packages/tensorflow/python/ops/array_ops.py", line 1502, in placeholder
    name=name)
  File "/Applications/anaconda/lib/python3.5/site-packages/tensorflow/python/ops/gen_array_ops.py", line 2149, in _placeholder
    name=name)
  File "/Applications/anaconda/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 763, in apply_op
    op_def=op_def)
  File "/Applications/anaconda/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 2327, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/Applications/anaconda/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1226, in __init__
    self._traceback = _extract_stack()

InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'Placeholder' with dtype float
	 [[Node: Placeholder = Placeholder[dtype=DT_FLOAT, shape=[], _device="/job:localhost/replica:0/task:0/cpu:0"]()]]


How do we fix this? Remember, ```X``` and ```Y``` are *placeholders*: they have a type and a shape but no value, so we need to tell TensorFlow what values to use through ```feed_dict```. See http://stackoverflow.com/questions/33810990/how-to-feed-a-placeholder.

In [42]:
sess.run(objective(X, Y), feed_dict={X:XX, Y:YY})

4.9653783
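
One way to picture placeholders and ```feed_dict``` is the following plain-Python sketch. The ```Placeholder``` class, the ```run``` function, and the error text are hypothetical stand-ins written for this illustration, not TensorFlow's implementation.

```python
class Placeholder:
    """Hypothetical input node: it has a name but no value of its own."""

    def __init__(self, name):
        self.name = name


def run(node, feed_dict=None):
    """Evaluate `node`, looking placeholder values up in `feed_dict`."""
    feed_dict = feed_dict or {}
    if isinstance(node, Placeholder):
        if node not in feed_dict:
            raise ValueError("You must feed a value for placeholder " + node.name)
        return feed_dict[node]
    # In this sketch, anything that is not a placeholder is a
    # (function, inputs) pair describing an operation.
    func, inputs = node
    return func(*(run(arg, feed_dict) for arg in inputs))


X_in = Placeholder("X_in")
Y_in = Placeholder("Y_in")
# squared_error describes (Y_in - X_in)^2 without computing anything.
squared_error = (lambda x, y: (y - x) ** 2, (X_in, Y_in))

try:
    run(squared_error)            # no values supplied, so this fails
    error_message = None
except ValueError as exc:
    error_message = str(exc)

value = run(squared_error, feed_dict={X_in: 3.0, Y_in: 5.0})
print(error_message)
print(value)
```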

# Finis

We're finished with this tutorial.

You can practice by using TensorFlow to [fit a linear regression model to housing prices](http://www.learndatasci.com/predicting-housing-prices-linear-regression-using-python-pandas-statsmodels/).

And to facilitate your TensorFlow work on GPU, you can [run notebooks on AWS](https://blog.keras.io/running-jupyter-notebooks-on-gpu-on-aws-a-starter-guide.html).