This Colab is the practical part of the 13th Session "From PyTorch to TensorFlow" (guest lecture by Andreas Steiner) in François Fleuret's Deep Learning Course (https://fleuret.org/dlc)

##### Copyright 2018 Google LLC.

Licensed under the Apache License, Version 2.0 (the "License");

In [0]:
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# 1. Colab Mechanics

Important shortcuts – **try them now**!

- `<CTRL-ENTER>`  : executes current cell
- `<SHIFT-ENTER>` : executes current cell and moves to next cell
- `<CTRL-SHIFT-P>` : shows searchable command palette
- `<CTRL-M> <A>` : insert cell above
- `<CTRL-M> <B>` : append cell below
- `<CTRL-M> <Y>` : convert cell to code
- `<CTRL-M> <M>` : convert cell to Markdown
- ...

Note: On OS X you can use `<COMMAND>` instead of `<CTRL>`

In [0]:
# Kernel : Before you can execute any code, your Colab needs to be connected
# to a Python kernel (only the frontend runs in your browser...)
# Colab connects to a hosted kernel automatically when you click on the
# "Connect" button (top right), or when you execute a first cell.

# (There is more about runtimes in the "Bonus : runtimes" section below...)

In [1]:
# So let's give it a try : execute this cell
# ... you should see the output "3" (without quotation marks)

1 + 2

3

In [0]:
# Note that the Python context spans the entire notebook.
# You can define a function in this cell...

def hello_world():
  print('hello world')

In [0]:
# ... and then execute it in the next cell:

hello_world()

# Of course this only works if you have executed the previous cell !

In [3]:
# Colab also has some extra syntax, you can for example run shell
# commands like this:

! ls -lh
! df -h
! ps -x

total 4.0K
drwxr-xr-x 1 root root 4.0K May 11 18:15 datalab
Filesystem      Size  Used Avail Use% Mounted on
overlay         359G  6.8G  334G   2% /
tmpfs           6.4G     0  6.4G   0% /dev
tmpfs           6.4G     0  6.4G   0% /sys/fs/cgroup
/dev/root       1.2G  537M  684M  44% /opt/bin
tmpfs           6.4G  248M  6.2G   4% /usr/lib64-nvidia
/dev/sda1       365G  9.8G  356G   3% /etc/hosts
shm              64M     0   64M   0% /dev/shm
tmpfs           6.4G     0  6.4G   0% /sys/firmware
  PID TTY      STAT   TIME COMMAND
    1 ?        Ss     0:00 /bin/bash -e /datalab/run.sh
   70 ?        Sl     0:00 node /tools/node/bin/forever --minUptime 1000 --spinS
   80 ?        Sl     0:00 /tools/node/bin/node /datalab/web/app.js
   90 ?        Sl     0:00 /usr/bin/python2 /usr/local/bin/jupyter-notebook -y -
   98 ?        Ssl    0:11 /usr/bin/python3 -m ipykernel_launcher -f /content/.l
  132 pts/0    Ss+    0:01 /bin/sh -c  ps -x
  133 pts/0    R+     0:00 ps -x


In [0]:
# Another special syntax is the so called "magic commands".
# They are often used to do some "cell level" stuff, like for example directly
# entering HTML code:
%%html
<marquee>Cutting-edge HTML tags :-)</marquee>

In [0]:
# Magic commands can also be single-line statements:
%time print('printing...')

In [0]:
# Let's import some interesting libraries...
import tensorflow as tf

In [0]:
# This cell is NOT A VALID STATEMENT.
# But you can place the cursor between the parens and press <TAB>.
# This will show you pydoc in a popover. Very useful for exploring an API...
tf.constant()

## TOC, Folding

In [0]:
# See the subtitle "TOC, Folding" above. See the same subtitle in the TOC to the
# left. You can toggle folding of the current subsection with <CTRL-'>.

## Code Snippets

In [0]:
# Choose the tab "Code snippets" next to "Table of Contents" on the left.
# Then search for "torch" and click on "INSERT".
# This will paste a snippet below and executing this snippet will install
# PyTorch on the machine running this kernel! Neat.

# But please continue with TensorFlow for the other exercises ;-)

# (You probably want to switch back to "Table of Contents" tab again.)

In [0]:
# http://pytorch.org/
from os import path
from wheel.pep425tags import get_abbr_impl, get_impl_ver, get_abi_tag
platform = '{}{}-{}'.format(get_abbr_impl(), get_impl_ver(), get_abi_tag())

accelerator = 'cu80' if path.exists('/opt/bin/nvidia-smi') else 'cpu'

!pip install -q http://download.pytorch.org/whl/{accelerator}/torch-0.3.0.post4-{platform}-linux_x86_64.whl torchvision
import torch

-> Please finish the remainder of this Colab **BEFORE unfolding this section** ...

## Bonus : runtimes

In [0]:
# You should now be connected to a default hosted Python 3 runtime.
# You can switch the runtime with "Change Runtime type" command in the "Runtime"
# menu.

import sys
sys.version

In [0]:
# Execute this cell to see files in the current directory.
! ls -lh

In [0]:
# Generate a new timestamp file in the current directory.
! touch $(date +%Y%m%d_%H%M%S)

In [0]:
# We can also define some variable
foo = 'bar'

In [0]:
# And then check whether that variable is currently defined.
'foo' in globals()

In [0]:
# Try to change the runtime type Python 2<->3 and check:
# - Do you see the same files? Do the processes run on the same computer?
# - Do you reset the kernel when changing the runtime?

# More things to try out:
# - What happens when you restart the runtime with "<CTRL-M> ." ?
# - Do you have access to the same files in a new Colab? (File/New ...)
# - What about state and files when you open the SAME notebook in a second tab?

In [0]:
# Let's check our hardware configuration...
from tensorflow.python.client import device_lib
device_lib.list_local_devices()

In [0]:
# When changing the runtime you can select "GPU" !
# - You should see a different output above.
# - What about files and state?

In [0]:
# You can also connect Colab to a locally running Jupyter instance. You'll need
# to install the jupyter_http_over_ws extension.
# See detailed instructions here:
# https://research.google.com/colaboratory/local-runtimes.html

# 2. Low-Level TensorFlow

Note : This entire section is mostly copied from the AMLD workshop notebook
[2_tf_basics](https://colab.research.google.com/github/andsteing/workshops/blob/master/extras/amld/notebooks/exercises/2_tf_basics.ipynb).

In [0]:
from __future__ import division, print_function

import tensorflow as tf
# Always make sure you are running the expected version.
# There are considerable differences between versions...
# Tested with 1.7.0
tf.__version__

#### The Graph

The most important concept in Tensorflow: there is a Graph to which
tensors are attached. This graph is usually not specified explicitly, but it
has important consequences for the tensors that are attached to it
(e.g. you cannot connect two tensors that are in different graphs).

The python variable "tensor" is simply a reference to the actual
tensor in the Graph. More precisely, it is a reference to an operation
that will produce a tensor (in the Tensorflow Graph, the nodes are
actually operations and the tensors "flow" on the edges between
the nodes...)
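A minimal sketch of the two points above: a tensor belongs to exactly one graph, and the Python variable refers to the output of an operation. The names `graph_a`, `graph_b`, `in_a` and `in_b` are made up for illustration, and the `tensorflow.compat.v1` fallback import is an assumption for newer installs (with the 1.7 version used in this Colab the plain `import tensorflow as tf` branch is taken).

```python
try:
    # Assumption: newer installs expose the 1.x graph API under compat.v1.
    import tensorflow.compat.v1 as tf
    tf.disable_eager_execution()
except ImportError:
    import tensorflow as tf  # plain TF 1.x, as used in this Colab

# Two separate graphs (hypothetical names, not used elsewhere in this Colab).
graph_a = tf.Graph()
graph_b = tf.Graph()

with graph_a.as_default():
    in_a = tf.constant(1, name='in_a')
with graph_b.as_default():
    in_b = tf.constant(2, name='in_b')

# Every tensor remembers the graph it was created in.
print(in_a.graph is graph_a, in_b.graph is graph_b)

# The operation is the node, the tensor is its output on an edge.
print(in_a.op.name, '->', in_a.name)

# Connecting tensors from two different graphs is rejected.
cross_graph_error = None
try:
    with graph_a.as_default():
        tf.add(in_a, in_b)
except ValueError as e:
    cross_graph_error = e
print('cross-graph add failed:', cross_graph_error is not None)
```

Because both graphs are created explicitly, running this sketch does not add anything to the default graph explored in the cells below.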

In [0]:
# There is always a "graph" even if you haven't defined one.
tf.get_default_graph()

In [0]:
# Store the default graph in a variable for exploration.
graph = tf.get_default_graph()

In [0]:
# Ok let's try to get all "operations" that are currently defined in this
# default graph.
# Remember : Placing the caret at the end of the line and typing <tab> will
# show an auto-completed list of methods...
graph.get_


In [0]:
# Let's create a separate graph:
graph2 = tf.Graph()

# Try to predict what these statements will output.
print(tf.get_default_graph() == graph)
print(tf.get_default_graph() == graph2)
with graph2.as_default():
    print(tf.get_default_graph() == graph)
    print(tf.get_default_graph() == graph2)

In [0]:
# We define our first TENSOR. Fill in your favourite numbers
# You can find documentation to this function here:
# https://www.tensorflow.org/versions/master/api_docs/python/tf/constant

# Try to change data type and shape of the tensor...

favorite_numbers = tf.constant([13, 22, 83])

print(favorite_numbers)

# (Note that this only prints the "properties" of the tensor
# and not its actual value -- more about this strange behavior
# in the section "The Session".)

In [0]:
# Remember that graph that is always in the background? All the
# tensors that you defined above have been dutifully attached to the
# graph by Tensorflow -- check this out:
# (Also note how the operations are named by default)

graph.get_operations()  # Show graph operations.

In [0]:
# Note that above are the OPERATIONS that are the nodes in the
# graph (in our case the "Const" operation creates a constant
# tensor). The tensors themselves are the EDGES between the nodes,
# and their name is usually the operation's name + ":0".
favorite_numbers.name

In [0]:
# Let's say we want to clean up our experimental mess...
# Search on Tensorflow homepage for a command to "reset" the graph:
# https://www.tensorflow.org/api_docs/

# YOUR ACTION REQUIRED:
# Find the right Tensorflow command to reset the graph.
tf.
tf.get_default_graph().get_operations()

In [0]:
# Important note: "resetting" didn't clear our original graph but
# rather replaced it with a new graph:
tf.get_default_graph() == graph

In [0]:
# Because we cannot define operations across graphs, we need to
# redefine our favorite numbers in the context of the new
# graph:

favorite_numbers = tf.constant([13, 22, 83])

In [0]:
# Now let's do some computations. Actually we don't really execute
# any computation yet (see next section "The Session" for that), but
# rather define how we intend to do computation later on...

# We first multiply our favorite numbers with our favorite multiplier:
favorite_multiplier = tf.constant(7)
# Do you have an idea how to write below multiplication more succinctly?
# Try it! (Hint: operator overloading)
favorite_products = tf.multiply(favorite_multiplier, favorite_numbers)
print('favorite_products.shape=', favorite_products.shape)

# Now we want to add up all the favorite products to a single scalar
# (0-dim tensor).
# There is a Tensorflow function for this. It starts with "reduce"...
# (Use <tab> auto-completion and/or tensorflow documentation)
# YOUR ACTION REQUIRED:
# Find the correct Tensorflow command to sum up the numbers.
favorite_sum = tf.
print('favorite_sum.shape=', favorite_sum.shape)

In [0]:
# Because we really like our "first" favorite number we add this number
# again to the sum:

favorite_sum_enhanced = favorite_sum + favorite_numbers[0]
# See how we used Python's overloaded "+" and "[]" operators?

# You could also define the same computation using Tensorflow
# functions only:
# favorite_sum_enhanced = tf.add(favorite_sum, tf.slice(favorite_numbers, [0], [1]))

In [0]:
# Of course, it's good practice to avoid a global invisible graph, and
# you can use a Python "with" block to explicitly specify the graph for
# a codeblock:
with tf.Graph().as_default():
    within_with = tf.constant([1, 2, 3], name='within_with')
    print('within with:')
    print(tf.get_default_graph())
    print(within_with)
    print(tf.get_default_graph().get_operations())
print('\noutside with:')
print(tf.get_default_graph())
print(within_with)
print(tf.get_default_graph().get_operations())

# You can execute this cell multiple times without messing up any graph.
# Note that you won't be able to connect the tensor to other tensors
# because we didn't store a reference to the graph of the with statement.

In [0]:
# Note : Tested with Chrome 66 -- might not work with all browsers :-(

# Let's visualize our graph!
# Tip: to make your graph more readable you can add a
# name="..." parameter to the individual Ops.

# src: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/tutorials/deepdream/deepdream.ipynb

import numpy as np
import tensorflow as tf
from IPython.display import clear_output, Image, display, HTML

def strip_consts(graph_def, max_const_size=32):
    """Strip large constant values from graph_def."""
    strip_def = tf.GraphDef()
    for n0 in graph_def.node:
        n = strip_def.node.add() 
        n.MergeFrom(n0)
        if n.op == 'Const':
            tensor = n.attr['value'].tensor
            size = len(tensor.tensor_content)
            if size > max_const_size:
                # tensor_content is a bytes field, so encode the placeholder.
                tensor.tensor_content = ("<stripped %d bytes>" % size).encode()
    return strip_def

def show_graph(graph_def, max_const_size=32):
    """Visualize TensorFlow graph."""
    if hasattr(graph_def, 'as_graph_def'):
        graph_def = graph_def.as_graph_def()
    strip_def = strip_consts(graph_def, max_const_size=max_const_size)
    code = """
        <script>
          function load() {{
            document.getElementById("{id}").pbtxt = {data};
          }}
        </script>
        <link rel="import" href="https://tensorboard.appspot.com/tf-graph-basic.build.html" onload=load()>
        <div style="height:600px">
          <tf-graph-basic id="{id}"></tf-graph-basic>
        </div>
    """.format(data=repr(str(strip_def)), id='graph'+str(np.random.rand()))

    iframe = """
        <iframe seamless style="width:1200px;height:620px;border:0" srcdoc="{}"></iframe>
    """.format(code.replace('"', '&quot;'))
    display(HTML(iframe))

In [0]:
show_graph(tf.get_default_graph())

## The Session

So far we have only set up our computational Graph. If you want to actually
*do* any computations, you need to attach the graph to a Session.
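As a rough sketch of the split between building and running: the first half below only defines nodes, and values appear once a session executes them. The tensor names (`x`, `squared`, `shifted`) are illustrative, and the `tensorflow.compat.v1` fallback import is an assumption for newer installs.

```python
try:
    # Assumption: newer installs expose the 1.x graph API under compat.v1.
    import tensorflow.compat.v1 as tf
    tf.disable_eager_execution()
except ImportError:
    import tensorflow as tf  # plain TF 1.x, as used in this Colab

with tf.Graph().as_default():
    # Graph construction: nothing is computed here.
    x = tf.constant([1, 2, 3], name='x')
    squared = x * x
    shifted = squared + 1
    print(shifted)  # Shows name, shape and dtype, but no values.

    with tf.Session() as sess:
        # Graph execution: run() returns the values as NumPy arrays.
        squared_value, shifted_value = sess.run([squared, shifted])

print(squared_value.tolist(), shifted_value.tolist())
```

Fetching several tensors in a single `run()` call executes the shared part of the graph only once.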

In [0]:
# The only difference to a "normal" session is that the interactive
# session registers itself as default so .eval() and .run() methods
# know which session to use...
interactive_session = tf.InteractiveSession()

In [0]:
# Hooray -- try printing other tensors of above to see the intermediate
# steps. What is their type and shape ?
print(favorite_sum.eval())

In [0]:
# Note that the session is also connected to a Graph, and if no Graph
# is specified then it will connect to the default Graph. Try to fix
# the following code snippet:

graph2 = tf.Graph()
with graph2.as_default():
    graph2_tensor = tf.constant([1])
with tf.Session() as sess:
    print(graph2_tensor.eval())

In [0]:
# Providing input to the graph: The value of any tensor can be overwritten
# by the "feed_dict" parameter provided to Session's run() method:
a = tf.constant(1)
b = tf.constant(2)
a_plus_b = tf.add(a, b)
print(interactive_session.run(a_plus_b))
print(interactive_session.run(a_plus_b, feed_dict={a: 123000, b:456}))

In [0]:
# It's good practice not to override just any tensor in the graph, but to
# rather use "tf.placeholder" that indicates that this tensor must be
# provided through the feed_dict:
placeholder = tf.placeholder(tf.int32)
placeholder_double = 2 * placeholder
# YOUR ACTION REQUIRED:
# Modify below command to make it work.
placeholder_double.eval()


## The Shapes

Another basic skill with Tensorflow is the handling of shapes. This
sounds pretty simple but you will be surprised by how much of
your Tensorflow coding time you will spend on massaging Tensors into the
right form...

Here we go with a couple of exercises with increasing difficulty...

Please refer to the Tensorflow documentation
[Tensor Transformations](https://www.tensorflow.org/versions/master/api_guides/python/array_ops#Shapes_and_Shaping)
for useful functions.
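As a warm-up, here is a small sketch of a few shape transformations and the static shapes they produce. No session is needed because static shapes are known while the graph is built. The variable names are illustrative, and the `tensorflow.compat.v1` fallback import is an assumption for newer installs.

```python
try:
    # Assumption: newer installs expose the 1.x graph API under compat.v1.
    import tensorflow.compat.v1 as tf
    tf.disable_eager_execution()
except ImportError:
    import tensorflow as tf  # plain TF 1.x, as used in this Colab

with tf.Graph().as_default():
    values = tf.constant([[1, 2, 3], [4, 5, 6]])
    transformed = {
        'values': values,
        'transposed': tf.transpose(values),               # swap the two axes
        'flat': tf.reshape(values, [6]),                  # same data, one axis
        'with_batch_dim': tf.expand_dims(values, axis=0), # add a leading axis
        'tiled': tf.tile(values, [2, 1]),                 # repeat along axis 0
        'side_by_side': tf.concat([values, values], axis=1),
    }
    # .shape is available at graph construction time.
    static_shapes = {name: t.shape.as_list() for name, t in transformed.items()}

for name in sorted(static_shapes):
    print(name, static_shapes[name])
```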

In [0]:
tensor12 = tf.constant([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
print(tensor12)
batch = tf.placeholder(tf.int32, shape=[None, 3])
print(batch)

In [0]:
# Tensors must be of the same datatype. Try to change the datatype
# of one of the tensors to fix the ValueError...
multiplier = tf.constant(1.5)
# YOUR ACTION REQUIRED:
# Fix error below.
tensor12 * multiplier


In [0]:
# What does tf.squeeze() do? Try it out on tensor12_3!
tensor12_3 = tf.reshape(tensor12, [3, 2, 2, 1])
# YOUR ACTION REQUIRED:
# Check out the effects of tf.squeeze()
tensor12_3.shape


In [0]:
# This cell is about accessing individual elements of a 2D tensor:
batch = tf.constant([[1, 2, 3, 0, 0],
                     [2, 4, 6, 8, 0],
                     [3, 6, 0, 0, 0]])
# Note that individual elements have lengths < batch.shape[1] but
# are zero padded.
lengths = tf.constant([3, 4, 2])

# The FIRST elements can be accessed by using Python's
# overloaded bracket indexing OR the related tf.slice():
print('first elements:')
print(batch[:, 0].eval())
print(tf.slice(batch, [0, 0], [3, 1]).eval())

In [0]:
# Accessing the LAST (non-padded) element within every sequence is
# somewhat more involved -- you need to specify both the indices in
# the first and the second dimension and then use tf.gather_nd():
# YOUR ACTION REQUIRED:
# Provide the correct expressions for indices_0 and indices_1.
indices_0 =
indices_1 =
print('last elements:')
print(tf.gather_nd(batch, tf.transpose([indices_0, indices_1])).eval())

In [0]:
# Below you have an integer tensor and then an expression that is set True
# for all elements that are odd. Try to print those elements using the
# operations tf.where() and tf.gather()
numbers = tf.range(1, 11)
odd_condition = tf.logical_not(tf.equal(0, tf.mod(numbers, 2)))
# YOUR ACTION REQUIRED:
# Provide the correct expressions for odd_indices and odd_numbers.
odd_indices =
odd_numbers =
odd_numbers.eval()

In [0]:
# "Dynamic shapes" : This feature is mainly used for variable size batches.
# "Dynamic" means that one (or multiple) dimensions are not specified
# before graph execution time (when running the graph with a session).
batch_of_pairs = tf.placeholder(dtype=tf.int32, shape=(None, 2))

# Note how the "unknown" dimension displays as a "?".
batch_of_pairs

In [0]:
# So we want to reshape the batch of pairs into a batch of quadruples.
# Since we don't know the batch size at graph construction time we will use the
# special value "-1" (meaning "as many as needed") for the first dimension.
# (Note that this wouldn't work for batch_of_triplets.)
# YOUR ACTION REQUIRED:
# Complete next line.
batch_of_quadruples = tf.reshape(batch_of_pairs, 

# Test run our batch of quadruples:
batch_of_quadruples.eval(feed_dict={
    batch_of_pairs: [[1,2], [3,4], [5,6], [7,8]]})

In [0]:
# Dynamic shapes cannot be accessed at graph construction time;
# accessing the ".shape" attribute (which is equivalent to the
# .get_shape() method) will return a "TensorShape" with "Dimension(None)".
batch_of_pairs.shape

# i.e. .shape is a property of every tensor that can contain
# values that are not specified -- Dimension(None)

In [0]:
# i.e. first dimension is dynamic and only known at runtime
batch_of_pairs.shape[0].value is None

In [0]:
# The actual dimensions can only be determined at runtime
# by calling tf.shape() -- the output of the tf.shape() Op
# is a tensor like any other tensor whose value is only known
# at runtime (when also all dynamic shapes are known).
batch_of_pairs_shape = tf.shape(batch_of_pairs)
batch_of_pairs_shape.eval(feed_dict={
    batch_of_pairs: [[1, 2]]
})

# i.e. tf.shape() is an Op that takes a tensor (that might have
# a dynamic shape or not) as input and outputs another tensor
# that fully specifies the shape of the input tensor.

In [0]:
# So you think shapes are easy, right?
# Well... Then here we go with a real-world shape challenge!
#
# (You probably won't have time to finish this challenge during
# the workshop; come back to this later and don't feel bad about
# consulting the solution...)
#
# Imagine you have a recurrent neural network that outputs a "sequence"
# tensor with dimension [?, max_len, ?], where
# - the first (dynamic) dimension is the number of elements in the batch
# - the second dimension is the maximum sequence length
# - the third (dynamic) dimension is the number of values per element
#
# The actual length of every sequence in the batch (<= max_len) is also
# specified in the tensor "lens" (length=number of elements in batch).
#
# The task at hand is to extract the last (non-padded) element of every sequence.
# The resulting tensor "last_elements" should have the shape [?, ?],
# matching the first and third dimension of tensor "sequence".
#
# Hint: The idea is to reshape the "sequence" to "partially_flattened"
# and then construct a "idxs" tensor (within this partially flattened
# tensor) that returns the requested elements.
#
# Handy functions:
# tf.gather()
# tf.range()
# tf.reshape()
# tf.shape()

lens = tf.placeholder(dtype=tf.int32, shape=(None,))
max_len = 5
sequences = tf.placeholder(dtype=tf.int32, shape=(None, max_len, None))
# YOUR ACTION REQUIRED:
# Find the correct expression for below tensors.
batch_size = 
hidden_state_size = 
idxs = 
partially_flattened =
last_elements =

sequences_data = [
    [[1,1], [1,1], [2,2], [0,0], [0,0]],
    [[1,1], [1,1], [1,1], [3,3], [0,0]],
    [[1,1], [1,1], [1,1], [1,1], [4,4]],
]
lens_data = [3, 4, 5]
# Should output [[2,2], [3,3], [4,4]]
last_elements.eval(feed_dict={sequences: sequences_data, lens: lens_data})

## Variables

So far all our computations have been purely stateless. Obviously,
programming becomes much more fun once we add some state to our code...
Tensorflow's **variables** encode state that persists between calls to
`Session.run()`.

The confusion with Tensorflow and variables comes from the fact that we
usually "execute" the graph from within Python by running some nodes of
the graph -- via `Session.run()` -- and that variable assignments are also
encoded through nodes in the graph that only get executed if we ask for the
value of one of their descendants (see explanatory code below).

For more details, see Tensorflow's overview of
[variable related functions](https://www.tensorflow.org/versions/r1.0/api_guides/python/state_ops#Variables),
the
[variable HOWTO](https://www.tensorflow.org/versions/r1.0/programmers_guide/variables),
and the
[variable guide](https://www.tensorflow.org/programmers_guide/variables).

And finally some notes on [sharing variables](https://www.tensorflow.org/api_guides/python/state_ops#Sharing_Variables).

In [0]:
counter = tf.Variable(0)
increment_counter = tf.assign_add(counter, 1)
with tf.Session() as sess:
    # Something is missing here...
    # -> Search the world wide web for the error message...
    # YOUR ACTION REQUIRED:
    # Add a statement that fixes the error.

    print(increment_counter.eval())
    print(increment_counter.eval())
    print(increment_counter.eval())

In [0]:
# Same conditions apply when we use our global interactive session...
interactive_session.run([tf.global_variables_initializer()])
increment_counter.eval()

In [0]:
# Execute this cell multiple times and note how our global interactive
# session keeps state between cell executions.
increment_counter.eval()

In [0]:
# Usually you would create variables with tf.get_variable() which makes
# it possible to "look up" variables later on.

# For a change let's not try to fix a code snippet but rather to make it
# fail:
# 1. What happens if the block is not wrapped in a tf.Graph()?
# 2. What happens if reuse= is not set?
# 3. What happens if dtype= is not set?

with tf.Graph().as_default():
    with tf.variable_scope('counters'):
        counter1 = tf.get_variable('counter1', initializer=1)
        counter2 = tf.get_variable('counter2', initializer=2)
        counter3 = tf.get_variable('counter3', initializer=3)
    with tf.Session() as sess:
        sess.run([tf.global_variables_initializer()])
        print(counter1.eval())
        with tf.variable_scope('counters', reuse=True):
            print(tf.get_variable('counter2', dtype=tf.int32).eval())
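To recap the point from the start of this section, here is a small sketch showing that an assignment is just another node in the graph: defining it changes nothing, and the variable is only updated when that node is actually run. The names `before`, `still_before` and `after` are illustrative, and the `tensorflow.compat.v1` fallback import is an assumption for newer installs.

```python
try:
    # Assumption: newer installs expose the 1.x graph API under compat.v1.
    import tensorflow.compat.v1 as tf
    tf.disable_eager_execution()
except ImportError:
    import tensorflow as tf  # plain TF 1.x, as used in this Colab

with tf.Graph().as_default():
    counter = tf.Variable(0, name='counter')
    # Defining the assignment only adds a node; the counter is not changed yet.
    increment = tf.assign_add(counter, 1)

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        # Reading the variable does not execute the assignment node.
        before = int(sess.run(counter))
        still_before = int(sess.run(counter))
        # Each explicit run of the assignment node updates the state once.
        sess.run(increment)
        sess.run(increment)
        after = int(sess.run(counter))

print(before, still_before, after)
```

The state lives in the session, so a new session would start again from the initial value after running the initializer.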

# 3. MNIST CodeJam

Refer back to the lecture accompanying Colab (https://goo.gl/xYKeWD) for a simple working example. It's fine to copy the code from "putting it all together", but you should play around with it to make sure you understand everything completely.

Recommended next steps:
- Add some plotting of loss/accuracy (train/test) over time.
- Implement a convolutional model with the following helpers : tf.layers.conv2d(), tf.layers.max_pooling2d(), tf.layers.dense().
- You can get inspiration from https://www.tensorflow.org/tutorials/layers
- Compare the difference in runtime when using a GPU ("Change Runtime Type" in the "Runtime" menu).
- Implement the same model using Keras (you probably need to restart the kernel for this).
- Implement the same model using Eager (you need to restart the kernel for this).

In [0]:
import tensorflow as tf
print(tf.__version__)

from tensorflow.python.client import device_lib
device_lib.list_local_devices()

In [0]:
# Download MNIST data.
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

In [0]:
# ~~ YOUR TURN ~~