In [4]:
# This note was created using TensorFlow 2.0.0

import tensorflow as tf
tf.__version__

'2.0.0'

# Summary of Intro to TensorFlow

# Week 1

## Introduction

- You will learn how to use the TensorFlow libraries to solve numerical problems

- You will learn how to troubleshoot and debug common TensorFlow code pitfalls

- You will learn how to debug TensorFlow programs

- Understanding how well a machine learning model is doing requires you to view values such as losses and weights as charts over the course of training

- You will learn what lazy evaluation and imperative (eager) execution mean, and how to write programs in both styles

- Lazy evaluation means that TensorFlow works with variables that are part of graphs, and those graphs are tied to sessions

- You also typically want to look at things called embeddings, or projectors, and the architecture of your model

### What is TensorFlow?

- A scalar is a single number, a vector is a 1D array, and a two-dimensional array is a matrix. A three-dimensional array is simply called a 3D tensor. So the progression is scalar, vector, matrix, 3D tensor, 4D tensor, and so on.

- A tensor is an n-dimensional array of data, and all of your data in TensorFlow is held in tensors

- The way TensorFlow works is that you create a directed acyclic graph, a DAG
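
The scalar, vector, matrix, 3D tensor progression can be illustrated with nested Python lists. This is only a plain-Python sketch: `shape_of` is a hypothetical helper written for this note, not a TensorFlow function, and it assumes the nested lists are regular (every row has the same length).

```python
def shape_of(data):
    """Return the shape of a regular nested list as a tuple of dimension sizes."""
    shape = []
    while isinstance(data, list):
        shape.append(len(data))
        data = data[0] if data else None
    return tuple(shape)

scalar = 5                               # rank 0
vector = [3, 5, 7]                       # rank 1
matrix = [[3, 5, 7], [4, 6, 8]]          # rank 2
tensor_3d = [matrix, matrix]             # rank 3

for name, value in [("scalar", scalar), ("vector", vector),
                    ("matrix", matrix), ("3D tensor", tensor_3d)]:
    shape = shape_of(value)
    # The rank is simply the number of dimensions in the shape.
    print(name, "rank:", len(shape), "shape:", shape)
```

The rank grows by one each time another level of nesting is added, which is the sense in which a tensor is an n-dimensional array.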


### Benefits of directed acyclic graph, a DAG

<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/c/c6/Topological_Ordering.svg/1200px-Topological_Ordering.svg.png" alt="DAG" style="width: 250px;"/>

- Portability: the DAG is a language-independent representation of the code in your model

- You can use the same Python code and execute it on both CPUs and GPUs, so it gives you language and hardware portability

- Much as the JVM executes Java programs, the TensorFlow execution engine runs the graph that your Python code builds

- You can train a TensorFlow model in the cloud, on lots of powerful hardware, and then take that trained model and put it on a mobile device.

- For example, the Google Translate app can work completely offline because a trained translation model is stored on the phone.

- These sorts of smaller, less powerful models are typically implemented using TensorFlow Lite.

### Model Training on Mobile Device

- One situation is that you train a model and then deploy it to a bunch of phones. When it makes a prediction, the user says "nope, this isn't right" or "please show me more results like this"

- You want to update the weights of the model to reflect that user's preferences. To do this you can set up what is called federated learning, where you aggregate many users' updates.

- This aggregate is essentially like a weight update on a batch of samples, except that it comes from different users

- This consensus change is applied to the shared model in the cloud. So you deploy the shared model, fine-tune it on different users' devices, then rinse and repeat
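
The aggregation step can be sketched in plain Python. This is an assumption-laden illustration, not a TensorFlow API: `federated_average`, the update values, and the per-client example counts are all hypothetical. Weighting each client by its number of examples is one common choice.

```python
def federated_average(client_updates, client_sizes):
    """Weighted mean of per-client weight updates (weight = examples per client)."""
    total = sum(client_sizes)
    n_weights = len(client_updates[0])
    return [
        sum(update[i] * size for update, size in zip(client_updates, client_sizes)) / total
        for i in range(n_weights)
    ]

# Shared model in the cloud (two weights) and updates computed on three devices.
shared_weights = [0.5, -1.0]
updates = [[0.10, 0.00], [0.30, -0.20], [-0.10, 0.40]]
sizes = [10, 30, 10]  # hypothetical number of local examples per device

consensus = federated_average(updates, sizes)
# Apply the consensus change to the shared model, then redeploy and repeat.
shared_weights = [w + d for w, d in zip(shared_weights, consensus)]
print(consensus, shared_weights)
```

The aggregate plays the same role as a gradient step computed on one batch, except that each contribution comes from a different user's device.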

## TensorFlow Hierarchy

- Many times you don't need a custom neural network model; you are quite happy to go with a relatively standard way of training

- If you don't need to customize the way you train, you are going to use one of a family of gradient descent optimizers, backpropagate to update the weights, and do this iteratively. In that case, don't write a low-level session loop. Just use an estimator

### TensorFlow Hierarchy API

![tensorflow_herarchy](https://developers.google.com/machine-learning/crash-course/images/TFHierarchy.svg?hl=id)

- TensorFlow has in it a number of abstraction layers

- The lowest level of abstraction is a layer that is implemented to target different hardware platforms

- The next level is the TensorFlow C++ API. This is how you can write a custom TensorFlow op: you implement the function you want in C++ and register it as a TensorFlow operation

- The next level, the core Python API, contains much of the numeric processing code: add, subtract, divide, matrix multiply, etc., as well as creating variables, creating tensors, and getting the shape (all the dimensions) of a tensor. All of that core numeric processing lives in the Python API

- The next level is a set of Python modules for common model components: tf.layers for building layers, tf.metrics for computing metrics such as root mean squared error as data comes in, and tf.losses for loss functions such as cross entropy with logits, a common loss measure in classification problems

- The estimator is the high-level API in TensorFlow. It knows how to do distributed training, how to evaluate, how to create a checkpoint, how to save a model, and how to set it up for serving
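
The two measures named above can be written out in plain Python to show what they compute. This is a sketch of the math only: the function names are chosen for this note and it does not call tf.metrics or tf.losses.

```python
import math

def rmse(predictions, labels):
    """Root mean squared error between two equal-length sequences."""
    squared = [(p - y) ** 2 for p, y in zip(predictions, labels)]
    return math.sqrt(sum(squared) / len(squared))

def softmax_cross_entropy_with_logits(logits, one_hot_label):
    """Cross entropy between a one-hot label and softmax(logits)."""
    # Subtract the max logit before exponentiating for numerical stability.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    log_probs = [z - log_sum_exp for z in logits]
    return -sum(y * lp for y, lp in zip(one_hot_label, log_probs))

print(rmse([2.0, 4.0], [1.0, 5.0]))  # 1.0
print(softmax_cross_entropy_with_logits([2.0, 0.5, -1.0], [1, 0, 0]))
```

"With logits" means the function takes the raw, unnormalized scores and applies the softmax internally, which is why the log-sum-exp appears inside the loss.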

## Graph and Session

### Benefit of using directed graph

- TensorFlow can assign different parts of the DAG to different devices, depending on whether a part is I/O bound or requires GPU capabilities

- TensorBoard is used to visualize the graph

- When the graph is being compiled, TensorFlow can take two ops and fuse them to improve performance, e.g. two consecutive add operations can be fused into one

- The most exciting part is that the DAG can be remotely executed and assigned to devices. TensorFlow can partition your program across multiple devices (CPUs, GPUs, TPUs, etc.), even ones attached to different machines

### Lazy Evaluation Mode

- Why does TensorFlow do lazy evaluation? It's because lazy evaluation allows for a lot of flexibility and optimization when you're running the graph
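
The build-then-run idea behind lazy evaluation can be sketched with a toy graph in plain Python. Everything here (`Node`, `constant`, `add`, `run`) is hypothetical and only mimics the shape of the TensorFlow 1.x workflow. It is not how TensorFlow is implemented.

```python
class Node:
    """A node in a tiny computation graph: an op name plus its input nodes."""
    def __init__(self, op, inputs=(), value=None):
        self.op = op
        self.inputs = inputs
        self.value = value

def constant(value):
    return Node("const", value=value)

def add(a, b):
    # Only records the operation; no arithmetic happens here.
    return Node("add", inputs=(a, b))

def run(node):
    """Evaluate a node by recursively evaluating its inputs (the 'session' step)."""
    if node.op == "const":
        return node.value
    if node.op == "add":
        left, right = (run(n) for n in node.inputs)
        return [x + y for x, y in zip(left, right)]
    raise ValueError(f"unknown op: {node.op}")

# Build the DAG.
a = constant([3, 5, 7])
b = constant([1, 2, 3])
c = add(a, b)

print(c.op)    # add  (c is a graph node, not a result)
print(run(c))  # [4, 7, 10]
```

Because `c` is only a description of the computation until `run` is called, a real engine has the chance to optimize, fuse, or distribute the graph before executing it.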

In [15]:
# Ways to evaluate a tensor z in TensorFlow 1.x (these need an existing Session named sess)
sess.run(z)
z.eval()
sess.run([z])

NameError: name 'sess' is not defined

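TensorFlow 1.x evaluates tensors through a Session, which is why the calls above fail when no `sess` has been created. TensorFlow 2.x executes eagerly by default, so a Session is not needed.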

In [16]:
import tensorflow as tf

# build the DAG
a = tf.constant([3,5,7])
b = tf.constant([1,2,3])

c = tf.add(a,b)

# run it (with eager execution this prints [4 7 10]; in graph mode it only creates a print op)
tf.print(c)

<tf.Operation 'PrintV2' type=PrintV2>

### Eager Mode

In [17]:
import tensorflow as tf

tf.executing_eagerly()  # returns True when eager execution is enabled; the result is not displayed here

x = tf.constant([3,5,7])
y = tf.constant([1,2,3])

tf.print(x-y)

<tf.Operation 'PrintV2_1' type=PrintV2>

## Tensor and Variable

- You can build a higher-dimensional tensor by stacking lower-dimensional ones in code (tf.stack) instead of counting nested brackets.

- You can also slice a tensor to pull out lower-dimensional tensors

- Once the data is in a tensor, you can reshape it into a different shape that holds the same number of elements

In [18]:
import tensorflow as tf

# stacking
print('stacking')
a = tf.constant([3,5,7])
b = tf.stack([a,a])
tf.print(b)

# slicing
print('slicing')
tf.print(b[:,1])
tf.print(b[1,:])
tf.print(b[1,0:2])

# reshape tensor
print('reshaping')
tf.print(tf.reshape(b, [3,2]))

stacking
slicing
reshaping


<tf.Operation 'PrintV2_6' type=PrintV2>
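
The cell output above shows only the section labels, so here is a plain-Python sketch of the values these stack, slice, and reshape operations correspond to. It uses nested lists rather than tensors, and `reshape_2d` is a hypothetical helper written for this note that assumes row-major order.

```python
a = [3, 5, 7]
b = [list(a), list(a)]             # like tf.stack([a, a]): shape (2, 3)

column_1 = [row[1] for row in b]   # like b[:, 1]
row_1 = b[1]                       # like b[1, :]
row_1_first_two = b[1][0:2]        # like b[1, 0:2]

def reshape_2d(rows, new_rows, new_cols):
    """Reshape a 2D nested list in row-major order; the element count must match."""
    flat = [x for row in rows for x in row]
    if len(flat) != new_rows * new_cols:
        raise ValueError("new shape must hold the same number of elements")
    return [flat[r * new_cols:(r + 1) * new_cols] for r in range(new_rows)]

reshaped = reshape_2d(b, 3, 2)     # like tf.reshape(b, [3, 2])

print(b)                # [[3, 5, 7], [3, 5, 7]]
print(column_1)         # [5, 5]
print(row_1)            # [3, 5, 7]
print(row_1_first_two)  # [3, 5]
print(reshaped)         # [[3, 5], [7, 3], [5, 7]]
```

Reshaping keeps the elements in the same flattened order and only changes where the row boundaries fall.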

### Variable vs Placeholder

- A variable is initialized with a value, and we can modify it later as well. A placeholder doesn’t require an initial value; it simply allocates a block of memory for future use.

- Later, we can use feed_dict to feed the data into the placeholder

Placeholders are a TensorFlow 1 feature, so this example uses the compat.v1 API

In [19]:
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior() 

a = tf.placeholder("float", None)
b = a * 4
print(a)

with tf.Session() as session:
    print(session.run(b, feed_dict={a: [1,2,3]}))

Tensor("Placeholder_2:0", dtype=float32)
[ 4.  8. 12.]


## Lab Exercise: Halley's Method

In [3]:
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior() 

def compute_halley_method(x, ratio, tol):
    # polynomial coefficients a0..a4 of f(x) = a0 + a1*x + a2*x^2 + a3*x^3 + a4*x^4

    coefficient = tf.constant(
      [5.0, 3.0, 7.0, 8.0, 9.0]
    )
    
    a0 = coefficient[0]
    a1 = coefficient[1]
    a2 = coefficient[2]
    a3 = coefficient[3]
    a4 = coefficient[4]
    
    ddf_x = 2 * a2 + 6 * a3 * x + 12 * a4 * x**2
    df_x = a1 + 2 * a2 * x + 3 * a3 * x**2 + 4 * a4 * x**3
    f_x = a0 + a1 * x + a2 * x**2 + a3 * x**3 + a4 * x**4
  
    # Halley's formula
    num = (2 * f_x * df_x)
    den = tf.subtract((2 * df_x * df_x), (f_x * ddf_x))
    ratio = tf.divide(num, den) 
    x = tf.subtract(x, ratio)
    x = tf.Print(x, [x], 'x=')
    return x, ratio, tol

def cond(x, ratio, tol):
    diff = tf.abs(ratio)
    return tf.less(tol, diff)

with tf.Session() as sess:
    # initial guess
    x = 1.0
    
    # initial delta/ratio
    ratio = 3.0
    
    # error
    tol = 0.001
    
    halley_result = tf.while_loop(cond, compute_halley_method, [x, ratio, tol])
    # 1. Full 3 parameters
    result = sess.run(halley_result)
    print(result)

    # 2. One Parameter
    print(halley_result[0].eval())

(-0.29887062, 0.00010644918, 0.001)
-0.29887062
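
For comparison, the same update rule, x_next = x - 2*f*f' / (2*f'^2 - f*f''), can be written as an ordinary Python loop. This is a minimal sketch, not the lab's reference solution: the `halley` function, its `max_iter` guard, and the test polynomial x^3 - 2 are assumptions chosen because the root (the cube root of 2) is easy to check.

```python
def halley(f, df, ddf, x0, tol=1e-3, max_iter=50):
    """Iterate Halley's update until the step size drops below tol."""
    x = x0
    for _ in range(max_iter):
        fx, dfx, ddfx = f(x), df(x), ddf(x)
        den = 2 * dfx * dfx - fx * ddfx
        if den == 0:
            break  # update is undefined; give up at the current x
        step = 2 * fx * dfx / den
        x -= step
        if abs(step) < tol:
            break
    # Return the residual too: a small step does not by itself prove f(x) is near 0.
    return x, f(x)

root, residual = halley(
    f=lambda x: x**3 - 2,
    df=lambda x: 3 * x**2,
    ddf=lambda x: 6 * x,
    x0=1.0,
    tol=1e-9,
)
print(root, residual)
```

Like the `cond` function in the lab, the loop stops on the size of the last step. Checking the returned residual is a cheap way to confirm that the point reached is actually a root.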


## Debugging TensorFlow Programs

- Debugging a TensorFlow program can be tricky because of the lazy evaluation paradigm.

- This is one of the reasons why eager execution (TF Eager) can be helpful when developing TensorFlow programs

- If there is a shape error, you can coerce the shape with the reshape method

- Another common class of errors that you will run into when developing TensorFlow programs is data type errors

### Debugging Whole Program

- TensorFlow also has a dynamic, interactive debugger called tfdbg (tf_debug). You run it from the command line

- tfdbg is an interactive debugger that you can run from a terminal and attach to a local or remote TensorFlow session.

- TensorBoard is a visual monitoring tool. You can look at evaluation metrics, look for over-fitting, layers that are dead, etc. Higher level debugging of neural networks, in other words

### The flow of debugging

1. Read the call trace.
2. Read the error message and find out where the problem is.
3. Having found the problem, fix it and make sure that it works on your fake data.
4. Try it again on your full, large dataset, where hopefully everything works.

# Week 2

What you will learn

- How to create production-ready machine learning models the easy way
- How to train on large datasets that do not fit in memory
- How to monitor your training metrics in TensorBoard

### Estimator API

- Estimators automatically surface key metrics during training that you can visualize in TensorBoard
- Estimators are designed to work with the Dataset API, which handles datasets that do not fit in memory
- Estimators come with the necessary cluster execution code already built in
- Do you need checkpoints to pause and resume your training? Estimators have them
- Estimators wrap your model to make it ready for ML-Engine's hyper-parameter tuning, and maybe also push it to production behind ML-Engine's managed and autoscaled prediction service.

### Pre Made Estimator

- How can we pack our data into the single input vector that a linear regressor expects? The answer depends on what data we are packing, and that is where the feature columns API comes in handy
- There are many more feature column types to choose from: columns for continuous values that you want to bucketize, word embeddings, column crosses, and so on
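
What "packing into a single input vector" means can be sketched in plain Python. This is a conceptual illustration only: the helper functions are hypothetical, the bucket boundaries are made up, and the real tf.feature_column implementation differs (for example, it handles categorical columns sparsely). The feature names `sqr_foot` and `type` are borrowed from the house pricing demo in this note.

```python
def numeric_column(value):
    """A continuous value is passed through as a single float."""
    return [float(value)]

def one_hot_column(value, vocabulary):
    """A categorical value becomes a one-hot vector over the vocabulary."""
    return [1.0 if value == item else 0.0 for item in vocabulary]

def bucketized_column(value, boundaries):
    """A continuous value becomes a one-hot vector over the ranges between boundaries."""
    bucket = sum(value >= b for b in boundaries)
    return [1.0 if i == bucket else 0.0 for i in range(len(boundaries) + 1)]

def pack(example):
    """Concatenate all column encodings into one flat input vector."""
    return (numeric_column(example["sqr_foot"])
            + one_hot_column(example["type"], ["house", "apt"])
            + bucketized_column(example["sqr_foot"], [500, 2500]))

print(pack({"sqr_foot": 2000, "type": "house"}))
# [2000.0, 1.0, 0.0, 0.0, 1.0, 0.0]
```

Each column type contributes a fixed number of slots, so every example ends up as a vector of the same length regardless of its raw values.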

### Demo House Pricing Predictor

In [1]:
import tensorflow.compat.v1 as tf
import numpy as np
import shutil

In [2]:
tf.disable_v2_behavior() # disable v2

Instructions for updating:
non-resource variables are not supported in the long term


In [5]:
shutil.rmtree("outdir", ignore_errors = True) # start fresh each time / ignore checkpoint

In [9]:
def train_input_fn():
    features = {"sqr_foot": [100, 2000, 3000, 400, 300, 9000],
                "type": ["apt", "house", "house", "house", "apt", "house" ]}
    
    labels = [400, 700, 900, 300, 400, 1300]
    
    return features, labels

featcols = [
    tf.feature_column.numeric_column("sqr_foot"),
    tf.feature_column.categorical_column_with_vocabulary_list("type", ["house","apt"])
]

model = tf.estimator.LinearRegressor(featcols, model_dir="outdir")  # checkpoints are written to outdir

model.train(train_input_fn, steps=10000)

INFO:tensorflow:Using default config.
INFO:tensorflow:Using config: {'_model_dir': 'outdir', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x643f27c50>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_

INFO:tensorflow:global_step/sec: 540.392
INFO:tensorflow:loss = 472445.8, step = 8201 (0.186 sec)
INFO:tensorflow:global_step/sec: 546.953
INFO:tensorflow:loss = 471439.88, step = 8301 (0.182 sec)
INFO:tensorflow:global_step/sec: 566.803
INFO:tensorflow:loss = 470441.75, step = 8401 (0.177 sec)
INFO:tensorflow:global_step/sec: 660.393
INFO:tensorflow:loss = 469451.06, step = 8501 (0.151 sec)
INFO:tensorflow:global_step/sec: 537.978
INFO:tensorflow:loss = 468467.88, step = 8601 (0.186 sec)
INFO:tensorflow:global_step/sec: 618.851
INFO:tensorflow:loss = 467491.94, step = 8701 (0.161 sec)
INFO:tensorflow:global_step/sec: 613.049
INFO:tensorflow:loss = 466523.25, step = 8801 (0.163 sec)
INFO:tensorflow:global_step/sec: 558.603
INFO:tensorflow:loss = 465561.75, step = 8901 (0.179 sec)
INFO:tensorflow:global_step/sec: 607.471
INFO:tensorflow:loss = 464607.16, step = 9001 (0.164 sec)
INFO:tensorflow:global_step/sec: 620.747
INFO:tensorflow:loss = 463659.25, step = 9101 (0.161 sec)
INFO:tensor

<tensorflow_estimator.python.estimator.canned.linear.LinearRegressor at 0x643f27910>

In [10]:
def predict_input_fn():
    features = {"sqr_foot": [100, 1000, 2000, 300, 400, 9000],
                "type": ["apt", "house", "house", "house", "apt", "house" ]}
    return features

predictions = model.predict(predict_input_fn)

# model.predict returns a generator; iterate to print each prediction
for prediction in predictions:
    print(prediction)

INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from outdir/model.ckpt-12000
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
{'predictions': array([98.66314], dtype=float32)}
{'predictions': array([239.65883], dtype=float32)}
{'predictions': array([397.38232], dtype=float32)}
{'predictions': array([129.25238], dtype=float32)}
{'predictions': array([145.9802], dtype=float32)}
{'predictions': array([1501.4469], dtype=float32)}


### Checkpointing

- Why are checkpoints important? They allow you to continue training, resume after a failure, and predict from a trained model. You get checkpoints for free: just specify a model directory. Let's take a look at the code

In [19]:
## Create Checkpoint
# The model will continue from the checkpoint

model = tf.estimator.LinearRegressor(featcols, "./model_trained") # directory

model.train(train_input_fn, steps=10000)

predictions = model.predict(predict_input_fn)

# model.predict returns a generator; iterate to print each prediction
for prediction in predictions:
    print(prediction)

INFO:tensorflow:Using default config.
INFO:tensorflow:Using config: {'_model_dir': './model_trained', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x6440cfd10>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calli

INFO:tensorflow:loss = 484066.25, step = 7101 (0.189 sec)
INFO:tensorflow:global_step/sec: 571.963
INFO:tensorflow:loss = 482964.7, step = 7201 (0.175 sec)
INFO:tensorflow:global_step/sec: 609.489
INFO:tensorflow:loss = 481872.75, step = 7301 (0.164 sec)
INFO:tensorflow:global_step/sec: 497.062
INFO:tensorflow:loss = 480790.1, step = 7401 (0.201 sec)
INFO:tensorflow:global_step/sec: 498.331
INFO:tensorflow:loss = 479716.8, step = 7501 (0.201 sec)
INFO:tensorflow:global_step/sec: 564.668
INFO:tensorflow:loss = 478652.38, step = 7601 (0.177 sec)
INFO:tensorflow:global_step/sec: 573.74
INFO:tensorflow:loss = 477596.84, step = 7701 (0.175 sec)
INFO:tensorflow:global_step/sec: 582.364
INFO:tensorflow:loss = 476549.94, step = 7801 (0.172 sec)
INFO:tensorflow:global_step/sec: 638.507
INFO:tensorflow:loss = 475511.56, step = 7901 (0.157 sec)
INFO:tensorflow:global_step/sec: 609.418
INFO:tensorflow:loss = 474481.56, step = 8001 (0.165 sec)
INFO:tensorflow:global_step/sec: 464.233
INFO:tensorflo
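
The resume-from-checkpoint behaviour can be sketched without TensorFlow. This toy version is an assumption-heavy stand-in: it stores state in a hypothetical `checkpoint.json` file, whereas an Estimator writes its own `model.ckpt-*` files. The point is only that a second call to `train` with the same directory continues from the saved step instead of starting over.

```python
import json
import os
import tempfile

def train(model_dir, steps):
    """Run `steps` toy updates, resuming from the checkpoint in model_dir if one exists."""
    path = os.path.join(model_dir, "checkpoint.json")
    state = {"step": 0, "weight": 0.0}
    if os.path.exists(path):
        with open(path) as fh:
            state = json.load(fh)  # resume from the last saved state
    for _ in range(steps):
        state["weight"] += 0.1 * (1.0 - state["weight"])  # toy update toward 1.0
        state["step"] += 1
    with open(path, "w") as fh:
        json.dump(state, fh)  # save a checkpoint at the end of training
    return state

model_dir = tempfile.mkdtemp()
first = train(model_dir, steps=100)
second = train(model_dir, steps=100)  # picks up where the first call stopped
print(first["step"], second["step"])  # 100 200
```

This is why step counts accumulate across runs that share a model directory, and why the demo removes `outdir` first when it wants to start fresh.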

### Training on in-memory datasets

- If your data fits in memory as NumPy arrays or pandas DataFrames, the Estimator API has convenience functions for feeding it into your model
- Typically, training works best when each training step is performed on a mini-batch of input data, not a single data item and not the entire dataset either
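
Mini-batching itself is simple to sketch in plain Python. The `mini_batches` generator below is a hypothetical helper, not one of the Estimator convenience functions. It reuses the toy house data from the demo and an arbitrary batch size of 4.

```python
def mini_batches(features, labels, batch_size):
    """Yield (features, labels) slices of at most batch_size items each."""
    for start in range(0, len(features), batch_size):
        yield features[start:start + batch_size], labels[start:start + batch_size]

sqr_foot = [100, 2000, 3000, 400, 300, 9000]
prices = [400, 700, 900, 300, 400, 1300]

batches = list(mini_batches(sqr_foot, prices, batch_size=4))
for batch_features, batch_labels in batches:
    # One training step would be performed per batch.
    print(batch_features, batch_labels)
```

The last batch is smaller when the dataset size is not a multiple of the batch size. Real input pipelines usually let you choose whether to keep or drop that remainder.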

### Scaling TensorFlow using Batching

- TensorFlow has the Dataset API (tf.data), which can handle data that does not fit in memory and feed your model while progressively loading the data from disk

### Big jobs, Distributed training

- To make a big training job faster, you have to distribute it on a cluster. The function that implements distributed training is tf.estimator.train_and_evaluate

- With the Estimator API and ML Engine managing the cluster automatically, you get distribution out of the box

- The traditional distribution model for training neural networks is called data parallelism: your model is replicated on multiple workers

- In distributed training, even with a well-shuffled dataset on disk, if all your workers load straight from this dataset, they will see the same batch of data at the same time and produce the same gradients.

- With Dataset.shuffle, the shuffling happens independently on each worker using a different random seed, so please use it
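
The problem and the fix can be sketched with Python's `random` module. This is a plain-Python illustration under stated assumptions: `worker_batches` is a hypothetical helper, and seeding each worker with its id is just one way to give workers different shuffles. It does not use tf.data.

```python
import random

def worker_batches(dataset, batch_size, seed):
    """Shuffle a copy of the dataset with a worker-specific seed, then batch it."""
    rng = random.Random(seed)
    order = list(dataset)
    rng.shuffle(order)
    return [order[i:i + batch_size] for i in range(0, len(order), batch_size)]

dataset = list(range(12))

# Without per-worker shuffling, every worker reads the same first batch.
unshuffled = [dataset[:4] for _ in range(3)]
print(unshuffled)

# With an independent shuffle per worker, first batches generally differ.
shuffled = [worker_batches(dataset, 4, seed=worker_id)[0] for worker_id in range(3)]
print(shuffled)
```

Each worker still sees the whole dataset over an epoch, only in a different order, so the gradients computed at the same step come from different examples.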

### Monitoring with TensorBoard

- TensorBoard is a tool that lets you visualize the training and evaluation metrics that your model writes to disk

- A model that is ready for deployment needs not only a checkpoint of well-trained parameters, but also an extra input function that maps between the JSON received by the REST API and the features expected by the model. This one is called the serving input function

- Serving-time and training-time inputs are often very different

- At training time this is done through the training input function. We use the Dataset API there to make an input node that can progressively read from CSV files and send batches of training data into the model.

- We will use a similar pattern for our deployed model. The serving input function lets us add a set of TensorFlow transformations between the JSON our REST API receives and the features expected by our model.
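
The role of the serving input function can be sketched in plain Python. This is a conceptual stand-in only: a real serving input function operates on tensors inside the graph. The `"instances"` key in the request body and the string-to-float and lowercase transformations are assumptions made for this illustration.

```python
import json

def serving_input_fn(request_body):
    """Map a JSON request body to the feature dict the model expects."""
    instances = json.loads(request_body)["instances"]
    return {
        # Transformations between the raw JSON and the model's features:
        "sqr_foot": [float(item["sqr_foot"]) for item in instances],
        "type": [item["type"].lower() for item in instances],
    }

body = ('{"instances": ['
        '{"sqr_foot": "1500", "type": "House"}, '
        '{"sqr_foot": 800, "type": "apt"}]}')

features = serving_input_fn(body)
print(features)
# {'sqr_foot': [1500.0, 800.0], 'type': ['house', 'apt']}
```

The output has the same keys as the dictionary returned by the training input function, which is what lets one trained model accept both CSV-derived batches and REST requests.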