### grp

## Hands-On Machine Learning with Scikit-Learn & TensorFlow

## CHAPTER 9: Up and Running with TensorFlow

### What is TensorFlow?:
-  open source library for numerical computation
-  well suited for large-scale ML
-  you define a graph of computations in Python; TF optimizes that graph and runs it with efficient C++ code under the hood
-  can break a graph into chunks and run them in parallel across multiple CPUs [**tensorflow**] or GPUs [**tensorflow-gpu**]
-  can split neural network computations across hundreds of servers via distributed computing
-  runs on Windows, Linux, and macOS, as well as on mobile platforms such as iOS and Android
-  Python APIs:
    -  TF.Learn (**tensorflow.contrib.learn**): compatible with SKLearn; trains various types of neural networks
    -  TF-slim (**tensorflow.contrib.slim**): simplifies building, training, and evaluating neural networks
    -  Keras (**tensorflow.contrib.keras**): high-level API built independently on top of TensorFlow
-  provides advanced optimization nodes to search for the parameters that minimize a cost function
-  automatically computes the gradients of the functions you define (**automatic differentiation**, or autodiff)
-  visualization tool (**TensorBoard**) to view computation graphs / learning curves

### What is TensorBoard?:
-  displays interactive visualizations of training stats (learning curves)
-  ***use a different log directory for every run of the program; otherwise TensorBoard merges the stats from different runs***

#### Documentation:
-  https://tensorflow.org/
-  https://github.com/jtoy/awesome-tensorflow
-  http://stackoverflow.com => #tensorflow
-  http://goo.gl/N7kRF9 => Google group

### TensorFlow Architecture:
1.  Construction Phase:
    -  builds a computation graph representing ML model + computations for training
2.  Execution Phase:
    -  executes an iterative process to train and evaluate the model repeatedly
    -  gradually improves model parameters
3.  Operations [ops]:
    -  can take any number of inputs and produce any number of outputs
    -  ex => addition, multiplication
4.  Source Operations [source ops]:
    -  take no inputs
    -  ex => constants, variables
5.  Tensors:
    -  the inputs and outputs of ops are multidimensional arrays
    -  have a **type** and a **shape**, like NumPy arrays
    -  in the Python API, tensor values are represented as NumPy ndarrays

#### Running Session:
1.  create graph
2.  open a TF session (**which places operations onto devices [CPUs / GPUs] and runs them**) and initialize the variables
3.  evaluate computation
4.  close TF session

### Node / Variable Lifecycle:
-  all node values are dropped between graph runs, except variable values
-  variable values are maintained by the session across graph runs; a variable starts its life when its initializer is run and ends it when the session is closed

### Single Process TF vs Distributed TF:
-  Single => multiple sessions do not share any variable state [each session has its own copy of every variable]
-  Distributed => variable state is stored on servers NOT sessions [multiple sessions can share same variables]
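
The lifecycle and sharing rules above can be pictured with a plain-Python analogy. This is not TensorFlow code: `ToySession` and its methods are hypothetical names invented for illustration, and the sketch only mimics how each single-process session keeps its own copy of a variable from initialization until the session is closed.

```python
class ToySession:
    """Hypothetical stand-in for a single-process session (not a TF API)."""

    def __init__(self):
        self._values = {}      # variable name -> current value, private to this session
        self._closed = False

    def initialize(self, name, initial_value):
        # A variable's life begins when its initializer runs in this session.
        self._values[name] = initial_value

    def assign_add(self, name, delta):
        self._values[name] += delta
        return self._values[name]

    def read(self, name):
        if self._closed:
            raise RuntimeError("session is closed")
        return self._values[name]

    def close(self):
        # Closing the session ends the life of every variable it holds.
        self._values.clear()
        self._closed = True


sess1, sess2 = ToySession(), ToySession()
sess1.initialize("x", 0)
sess2.initialize("x", 42)

sess1.assign_add("x", 1)       # only sess1's copy changes
x_in_sess1 = sess1.read("x")
x_in_sess2 = sess2.read("x")
print(x_in_sess1, x_in_sess2)  # 1 42

sess1.close()
sess2.close()
```

In the distributed case the dictionary would conceptually live on a server rather than inside each session object, which is why several sessions can then see the same value.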

### TF Gradient Descent:
-  uses TF's autodiff feature to compute gradients automatically
-  **important to normalize input feature vectors via something like SKLearn StandardScaler**
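
To make the normalization note concrete, here is a minimal sketch of what standardization does to each feature column: subtract the column mean and divide by the population standard deviation, which is the convention `StandardScaler` uses by default. The function name `standardize_columns` and the toy data are assumptions for illustration, not part of the notebook.

```python
from statistics import fmean, pstdev

def standardize_columns(rows):
    """Return rows with every column rescaled to zero mean and unit variance."""
    columns = list(zip(*rows))
    means = [fmean(col) for col in columns]
    # Fall back to 1.0 for constant columns so we never divide by zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    return [
        [(value - mu) / sigma for value, mu, sigma in zip(row, means, stds)]
        for row in rows
    ]

# Two features on very different scales.
data = [[1.0, 1000.0], [2.0, 3000.0], [3.0, 5000.0]]
scaled = standardize_columns(data)

scaled_means = [fmean(col) for col in zip(*scaled)]
scaled_stds = [pstdev(col) for col in zip(*scaled)]
```

After scaling, both columns have mean 0 and standard deviation 1, so a single learning rate suits every coordinate of theta much better than it would on the raw features.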

### TF Autodiff + Optimizer:
-  **tf.gradients()** takes an op (e.g. the MSE) and a list of variables, and creates a list of ops (one per variable) that compute the gradient of the op with regard to each variable
-  provides many optimizers out of the box (Gradient Descent Optimizer, Momentum Optimizer)
-  **example exercise** => the _gradients_ node computes the gradient vector of the MSE with regard to theta
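
What the _gradients_ node computes can be checked by hand on a tiny problem. The sketch below, in plain Python rather than TensorFlow, compares the analytic gradient of the MSE, (2/m) · Xᵀ(Xθ − y), against a central finite-difference estimate. The helper names and the toy data are assumptions made for illustration; autodiff produces the analytic values directly, without the approximation step.

```python
def predict_errors(theta, X, y):
    """Residuals X @ theta - y, computed row by row."""
    return [sum(t * x for t, x in zip(theta, row)) - target for row, target in zip(X, y)]

def mse(theta, X, y):
    errors = predict_errors(theta, X, y)
    return sum(e * e for e in errors) / len(X)

def analytic_gradient(theta, X, y):
    """Gradient of the MSE with regard to theta: (2/m) * X^T (X theta - y)."""
    m = len(X)
    errors = predict_errors(theta, X, y)
    return [2.0 / m * sum(e * row[j] for e, row in zip(errors, X)) for j in range(len(theta))]

def numeric_gradient(theta, X, y, eps=1e-6):
    """Central finite differences, one coordinate of theta at a time."""
    grads = []
    for j in range(len(theta)):
        up = theta[:j] + [theta[j] + eps] + theta[j + 1:]
        down = theta[:j] + [theta[j] - eps] + theta[j + 1:]
        grads.append((mse(up, X, y) - mse(down, X, y)) / (2 * eps))
    return grads

X = [[1.0, 0.5], [1.0, -1.5], [1.0, 2.0]]   # first column is the bias term
y = [1.0, -2.0, 3.0]
theta = [0.1, -0.3]

exact = analytic_gradient(theta, X, y)
approx = numeric_gradient(theta, X, y)
max_gap = max(abs(a - b) for a, b in zip(exact, approx))
```

Finite differences need two evaluations of the cost per parameter, which is one reason an autodiff pass is preferred once theta has more than a handful of entries.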

### TF Placeholder Nodes:
-  these nodes do not perform any computation
-  they just output the data you feed them at runtime (via **feed_dict**), typically the training data

### Saving and Restoring Models:
-  ability to save model w/ parameters and variables to disk
-  ability to save at regular intervals via **checkpoint** for recovery
-  .ckpt => checkpoint
-  .meta => graph structure
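
The checkpoint idea is independent of TensorFlow. As a rough analogy (this is not `tf.train.Saver`), the sketch below writes a parameter dictionary to disk at regular intervals and restores the latest copy. `save_checkpoint`, `restore_checkpoint`, and the JSON file layout are hypothetical choices made only for illustration.

```python
import json
import os
import tempfile

def save_checkpoint(path, step, params):
    # Write to a temporary file first, then rename, so a crash never leaves a half-written file.
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump({"step": step, "params": params}, f)
    os.replace(tmp_path, path)

def restore_checkpoint(path):
    with open(path) as f:
        state = json.load(f)
    return state["step"], state["params"]

checkpoint_dir = tempfile.mkdtemp()
checkpoint_path = os.path.join(checkpoint_dir, "toy_model.json")

params = {"theta": [0.0, 0.0]}
for step in range(1, 251):
    # Stand-in for a real training step.
    params["theta"] = [value + 0.01 for value in params["theta"]]
    if step % 100 == 0:            # save at regular intervals
        save_checkpoint(checkpoint_path, step, params)

restored_step, restored_params = restore_checkpoint(checkpoint_path)
print(restored_step)  # 200: the last checkpoint written before the loop ended at step 250
```

If the process had crashed at step 250, training could resume from step 200 instead of starting over, which is the recovery use case noted above.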

### Name Scopes:
-  useful for grouping related nodes together when graph contains thousands of nodes and becomes difficult to read

### Modularity:
-  **Rectified Linear Units** (ReLU) => compute a linear function of the inputs and output the result if it is positive, and 0 otherwise
-  https://en.wikipedia.org/wiki/Rectifier_(neural_networks)
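
As a minimal plain-Python sketch of that definition (the function signature and the sample numbers are assumptions; the notebook's TensorFlow version appears in the exercises), a ReLU unit is a weighted sum plus bias, clipped from below at a threshold of 0:

```python
def relu(inputs, weights, bias=0.0, threshold=0.0):
    """Linear function of the inputs, clipped from below at `threshold`."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return max(z, threshold)

positive_case = relu([1.0, 2.0], [0.5, 0.25])     # z = 1.0, passed through unchanged
negative_case = relu([1.0, 2.0], [-0.5, -0.25])   # z = -1.0, clipped to 0.0
```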

## _Exercises_

In [1]:
import tensorflow as tf
print(tf.__version__)

import sklearn
print(sklearn.__version__)

1.13.1
0.20.0


### create computation graph

In [50]:
x = tf.Variable(3, name = "x")
y = tf.Variable(4, name = "y")
f = x*x*y + y + 2

### create session

In [3]:
sess = tf.Session()
sess.run(x.initializer)
sess.run(y.initializer)

### evaluate

In [4]:
result = sess.run(f)
print(result)

42


### close session

In [5]:
sess.close()

### with block session

In [6]:
with tf.Session() as sess:
    x.initializer.run()
    y.initializer.run()
    result = f.eval()
# the session is closed automatically at the end of the "with" block

### global variable session initializer

In [7]:
init = tf.global_variables_initializer()
with tf.Session() as sess:
    init.run() # initializes all vars
    result = f.eval()

### interactive session

In [8]:
sess = tf.InteractiveSession() # automatically sets itself as the default session
init.run()
result = f.eval()
print(result)
sess.close() # an InteractiveSession must be closed manually

42


### create node

In [9]:
x1 = tf.Variable(1)
x1.graph is tf.get_default_graph() # nodes are automatically added to default graph

True

### create independent graph

In [10]:
graph = tf.Graph()
with graph.as_default():
    x2 = tf.Variable(2)
print(x2.graph is graph) # confirm the new graph is independent of the default graph
print(x2.graph is tf.get_default_graph())

True
False


### reset default graph

In [11]:
tf.reset_default_graph()

### node dependencies

In [12]:
w = tf.constant(3)
x = w + 2
y = x + 5
z = x * 3

with tf.Session() as sess:
    print(y.eval()) # 10
    print(z.eval()) # 15
    
print("="*10)

with tf.Session() as sess:
    y_val, z_val = sess.run([y, z]) # evaluates y and z in a single graph run, so w and x are not evaluated twice
    print(y_val) # 10
    print(z_val) # 15

10
15
==========
10
15
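
Why the single `sess.run([y, z])` call is cheaper can be seen with a toy evaluator. The sketch is plain Python, not TensorFlow: `ToyNode` and `run` are hypothetical, and caching values only for the duration of one `run` call mimics node values being dropped between graph runs.

```python
class ToyNode:
    """Hypothetical graph node: a function applied to the values of its inputs."""

    def __init__(self, fn, *inputs):
        self.fn = fn
        self.inputs = inputs
        self.evaluations = 0   # how many times this node has been computed

def run(fetches):
    cache = {}                 # lives only for this run, then is discarded

    def evaluate(node):
        if node not in cache:
            node.evaluations += 1
            cache[node] = node.fn(*(evaluate(i) for i in node.inputs))
        return cache[node]

    return [evaluate(node) for node in fetches]

w = ToyNode(lambda: 3)
x = ToyNode(lambda w_val: w_val + 2, w)
y = ToyNode(lambda x_val: x_val + 5, x)
z = ToyNode(lambda x_val: x_val * 3, x)

run([y])
run([z])                       # two separate runs: x is computed twice
separate_runs = x.evaluations

x.evaluations = 0
y_val, z_val = run([y, z])     # one run: x is computed once and reused
single_run = x.evaluations
```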


### tf linear regression
-  https://en.wikipedia.org/wiki/Linear_least_squares#Derivation_of_the_normal_equations
-  http://mlwiki.org/index.php/Normal_Equation
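
The next cell computes the closed-form least-squares solution (the normal equation from the links above). With $X$ the data matrix including the bias column of ones and $y$ the column vector of targets, the expression built from `tf.transpose`, `tf.matmul`, and `tf.matrix_inverse` corresponds to:

$$
\hat{\theta} = (X^\top X)^{-1} X^\top y
$$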

In [13]:
import numpy as np
from sklearn.datasets import fetch_california_housing

tf.reset_default_graph()

housing = fetch_california_housing()
m, n = housing.data.shape
housing_data_plus_bias = np.c_[np.ones((m, 1)), housing.data]

X = tf.constant(housing_data_plus_bias, dtype=tf.float32, name="X")
y = tf.constant(housing.target.reshape(-1, 1), dtype=tf.float32, name="y")
XT = tf.transpose(X)
theta = tf.matmul(tf.matmul(tf.matrix_inverse(tf.matmul(XT, X)), XT), y)

with tf.Session() as sess:
    theta_value = theta.eval()

In [14]:
theta_value

array([[-3.6959320e+01],
       [ 4.3698898e-01],
       [ 9.4245886e-03],
       [-1.0791138e-01],
       [ 6.4842808e-01],
       [-3.9986235e-06],
       [-3.7866351e-03],
       [-4.2142656e-01],
       [-4.3467718e-01]], dtype=float32)

In [15]:
import numpy as np
X = housing_data_plus_bias
y = housing.target.reshape(-1, 1)
theta_numpy = np.linalg.inv(X.T.dot(X)).dot(X.T).dot(y)

print(theta_numpy)

[[-3.69419202e+01]
 [ 4.36693293e-01]
 [ 9.43577803e-03]
 [-1.07322041e-01]
 [ 6.45065694e-01]
 [-3.97638942e-06]
 [-3.78654265e-03]
 [-4.21314378e-01]
 [-4.34513755e-01]]


In [51]:
from sklearn.linear_model import LinearRegression
lin_reg = LinearRegression()
lin_reg.fit(housing.data, housing.target.reshape(-1, 1))

print(np.r_[lin_reg.intercept_.reshape(-1, 1), lin_reg.coef_.T])

[[-3.69419202e+01]
 [ 4.36693293e-01]
 [ 9.43577803e-03]
 [-1.07322041e-01]
 [ 6.45065694e-01]
 [-3.97638942e-06]
 [-3.78654265e-03]
 [-4.21314378e-01]
 [-4.34513755e-01]]


### sklearn standardscaler

In [17]:
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
scaled_housing_data = scaler.fit_transform(housing.data)
scaled_housing_data_plus_bias = np.c_[np.ones((m, 1)), scaled_housing_data]

print(scaled_housing_data_plus_bias.mean(axis=0))
print(scaled_housing_data_plus_bias.mean(axis=1))
print(scaled_housing_data_plus_bias.mean())
print(scaled_housing_data_plus_bias.shape)

[ 1.00000000e+00  6.60969987e-17  5.50808322e-18  6.60969987e-17
 -1.06030602e-16 -1.10161664e-17  3.44255201e-18 -1.07958431e-15
 -8.52651283e-15]
[ 0.38915536  0.36424355  0.5116157  ... -0.06612179 -0.06360587
  0.01359031]
0.11111111111111005
(20640, 9)


### tf gradient descent manual approach

In [18]:
tf.reset_default_graph()

n_epochs = 1000
learning_rate = 0.01

X = tf.constant(scaled_housing_data_plus_bias, dtype=tf.float32, name="X")
y = tf.constant(housing.target.reshape(-1, 1), dtype=tf.float32, name="y")
theta = tf.Variable(tf.random_uniform([n + 1, 1], -1.0, 1.0, seed=42), name="theta")
y_pred = tf.matmul(X, theta, name="predictions")
error = y_pred - y
mse = tf.reduce_mean(tf.square(error), name="mse")
gradients = 2/m * tf.matmul(tf.transpose(X), error)
training_op = tf.assign(theta, theta - learning_rate * gradients)

init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)

    for epoch in range(n_epochs): # each epoch runs one training step on the full dataset
        if epoch % 100 == 0:
            print("Epoch", epoch, "MSE =", mse.eval()) # mse gets lower at every iteration
        sess.run(training_op)
    
    best_theta = theta.eval()

Epoch 0 MSE = 2.754427
Epoch 100 MSE = 0.63222194
Epoch 200 MSE = 0.5727803
Epoch 300 MSE = 0.5585008
Epoch 400 MSE = 0.54907006
Epoch 500 MSE = 0.542288
Epoch 600 MSE = 0.5373791
Epoch 700 MSE = 0.533822
Epoch 800 MSE = 0.53124255
Epoch 900 MSE = 0.5293705


### tf gradient descent autodiff

In [19]:
gradients = tf.gradients(mse, [theta])[0] # op => mse; variables => theta
print(gradients)

Tensor("gradients/predictions_grad/MatMul_1:0", shape=(9, 1), dtype=float32)


### tf gradient descent optimizer

In [20]:
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
training_op = optimizer.minimize(mse)
print(optimizer)
print(training_op)

<tensorflow.python.training.gradient_descent.GradientDescentOptimizer object at 0x1297e3160>
name: "GradientDescent"
op: "NoOp"
input: "^GradientDescent/update_theta/ApplyGradientDescent"



### placeholder nodes

In [21]:
tf.reset_default_graph()

A = tf.placeholder(tf.float32, shape=(None, 3))
B = A + 5
with tf.Session() as sess:
    B_val_1 = B.eval(feed_dict={A: [[1, 2, 3]]})
    B_val_2 = B.eval(feed_dict={A: [[4, 5, 6], [7, 8, 9]]})

print(B_val_1)
print("="*10)
print(B_val_2)

[[6. 7. 8.]]
==========
[[ 9. 10. 11.]
 [12. 13. 14.]]


### tf mini-batch gradient descent

In [22]:
tf.reset_default_graph()

learning_rate = 0.01

X = tf.placeholder(tf.float32, shape=(None, n + 1), name="X")
y = tf.placeholder(tf.float32, shape=(None, 1), name="y")

theta = tf.Variable(tf.random_uniform([n + 1, 1], -1.0, 1.0, seed=42), name="theta")
y_pred = tf.matmul(X, theta, name="predictions")
error = y_pred - y
mse = tf.reduce_mean(tf.square(error), name="mse")
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
training_op = optimizer.minimize(mse)

init = tf.global_variables_initializer()
n_epochs = 10

batch_size = 100
n_batches = int(np.ceil(m / batch_size))

def fetch_batch(epoch, batch_index, batch_size):
    np.random.seed(epoch * n_batches + batch_index)
    indices = np.random.randint(m, size=batch_size)
    X_batch = scaled_housing_data_plus_bias[indices]
    y_batch = housing.target.reshape(-1, 1)[indices]
    return X_batch, y_batch

with tf.Session() as sess:
    sess.run(init)

    for epoch in range(n_epochs):
        for batch_index in range(n_batches):
            X_batch, y_batch = fetch_batch(epoch, batch_index, batch_size)
            sess.run(training_op, feed_dict={X: X_batch, y: y_batch})

    best_theta = theta.eval()

In [23]:
best_theta

array([[ 2.070016  ],
       [ 0.8204561 ],
       [ 0.1173173 ],
       [-0.22739051],
       [ 0.3113402 ],
       [ 0.00353193],
       [-0.01126994],
       [-0.91643935],
       [-0.8795008 ]], dtype=float32)
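
`fetch_batch` above draws indices with `np.random.randint`, i.e. with replacement, so an epoch does not necessarily visit every instance exactly once. A common alternative, sketched here in plain Python under the assumption that one shuffled pass over the data per epoch is wanted, is to permute the indices and slice them into batches. `iter_batches` is a hypothetical helper, not part of the notebook.

```python
import random

def iter_batches(n_instances, batch_size, seed):
    """Yield lists of indices that together cover every instance exactly once."""
    rng = random.Random(seed)          # a local generator leaves the global random state untouched
    indices = list(range(n_instances))
    rng.shuffle(indices)
    for start in range(0, n_instances, batch_size):
        yield indices[start:start + batch_size]

batches = list(iter_batches(n_instances=20640, batch_size=100, seed=0))
n_batches_seen = len(batches)          # ceil(20640 / 100) = 207
last_batch_size = len(batches[-1])     # 20640 - 206 * 100 = 40
all_covered = sorted(i for batch in batches for i in batch) == list(range(20640))
```

Passing a different seed per epoch (for example the epoch number) would give a fresh permutation each time while keeping runs reproducible.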

### save / restore model

In [24]:
tf.reset_default_graph()

n_epochs = 1000
learning_rate = 0.01

X = tf.constant(scaled_housing_data_plus_bias, dtype=tf.float32, name="X")
y = tf.constant(housing.target.reshape(-1, 1), dtype=tf.float32, name="y")
theta = tf.Variable(tf.random_uniform([n + 1, 1], -1.0, 1.0, seed=42), name="theta")
y_pred = tf.matmul(X, theta, name="predictions")
error = y_pred - y
mse = tf.reduce_mean(tf.square(error), name="mse")
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
training_op = optimizer.minimize(mse)

init = tf.global_variables_initializer()
saver = tf.train.Saver()

with tf.Session() as sess:
    sess.run(init)

    for epoch in range(n_epochs):
        if epoch % 100 == 0:
            print("Epoch", epoch, "MSE =", mse.eval())
            save_path = saver.save(sess, "/Users/grp/mlNotebooks/my_model.ckpt")
        sess.run(training_op)
    
    best_theta = theta.eval()
    save_path = saver.save(sess, "/Users/grp/mlNotebooks/my_model_final.ckpt")

Epoch 0 MSE = 2.754427
Epoch 100 MSE = 0.63222194
Epoch 200 MSE = 0.5727803
Epoch 300 MSE = 0.5585009
Epoch 400 MSE = 0.54907006
Epoch 500 MSE = 0.542288
Epoch 600 MSE = 0.5373791
Epoch 700 MSE = 0.533822
Epoch 800 MSE = 0.53124255
Epoch 900 MSE = 0.5293704


In [25]:
best_theta

array([[ 2.06855249e+00],
       [ 7.74078071e-01],
       [ 1.31192386e-01],
       [-1.17845066e-01],
       [ 1.64778143e-01],
       [ 7.44078017e-04],
       [-3.91945094e-02],
       [-8.61356676e-01],
       [-8.23479772e-01]], dtype=float32)

In [26]:
! ls /Users/grp/mlNotebooks/*my_model*

/Users/grp/mlNotebooks/my_model.ckpt.data-00000-of-00001
/Users/grp/mlNotebooks/my_model.ckpt.index
/Users/grp/mlNotebooks/my_model.ckpt.meta
/Users/grp/mlNotebooks/my_model_final.ckpt.data-00000-of-00001
/Users/grp/mlNotebooks/my_model_final.ckpt.index
/Users/grp/mlNotebooks/my_model_final.ckpt.meta


In [27]:
with tf.Session() as sess:
    saver.restore(sess, "/Users/grp/mlNotebooks/my_model_final.ckpt")
    best_theta_restored = theta.eval()

Instructions for updating:
Use standard file APIs to check for files with this prefix.
INFO:tensorflow:Restoring parameters from /Users/grp/mlNotebooks/my_model_final.ckpt


### save specific variable

In [28]:
saver = tf.train.Saver({"weights": theta})

### restore graph structure and variable

In [29]:
tf.reset_default_graph()

saver = tf.train.import_meta_graph("/Users/grp/mlNotebooks/my_model_final.ckpt.meta")  # loads the graph structure
theta = tf.get_default_graph().get_tensor_by_name("theta:0")

with tf.Session() as sess:
    saver.restore(sess, "/Users/grp/mlNotebooks/my_model_final.ckpt")  # restores the graph's state
    best_theta_restored = theta.eval()

INFO:tensorflow:Restoring parameters from /Users/grp/mlNotebooks/my_model_final.ckpt


**"This means that you can import a pretrained model without having to have the corresponding Python code to build the graph. This is very handy when you keep tweaking and saving your model: you can load a previously saved model without having to search for the version of the code that built it."** - Aurelien Geron [Hands on ML w SKLearn & TF]

In [30]:
np.allclose(best_theta, best_theta_restored)

True

### tensorboard logs

In [31]:
tf.reset_default_graph()

from datetime import datetime

now = datetime.utcnow().strftime("%Y%m%d%H%M%S")
root_logdir = "tf_logs"
logdir = "{}/run-{}/".format(root_logdir, now)

In [32]:
learning_rate = 0.01

X = tf.placeholder(tf.float32, shape=(None, n + 1), name="X")
y = tf.placeholder(tf.float32, shape=(None, 1), name="y")
theta = tf.Variable(tf.random_uniform([n + 1, 1], -1.0, 1.0, seed=42), name="theta")
y_pred = tf.matmul(X, theta, name="predictions")
error = y_pred - y
mse = tf.reduce_mean(tf.square(error), name="mse")
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
training_op = optimizer.minimize(mse)

init = tf.global_variables_initializer()

In [33]:
mse_summary = tf.summary.scalar('MSE', mse) # writes MSE value to tensorboard log summary
file_writer = tf.summary.FileWriter(logdir, tf.get_default_graph()) # writes summaries to log events file

In [34]:
n_epochs = 10
batch_size = 100
n_batches = int(np.ceil(m / batch_size))

with tf.Session() as sess:
    sess.run(init)

    for epoch in range(n_epochs):
        for batch_index in range(n_batches):
            X_batch, y_batch = fetch_batch(epoch, batch_index, batch_size)
            if batch_index % 10 == 0:
                summary_str = mse_summary.eval(feed_dict={X: X_batch, y: y_batch})
                step = epoch * n_batches + batch_index
                file_writer.add_summary(summary_str, step)
            sess.run(training_op, feed_dict={X: X_batch, y: y_batch})

    best_theta = theta.eval()

In [35]:
file_writer.close()

In [36]:
best_theta

array([[ 2.070016  ],
       [ 0.8204561 ],
       [ 0.1173173 ],
       [-0.22739051],
       [ 0.3113402 ],
       [ 0.00353193],
       [-0.01126994],
       [-0.91643935],
       [-0.8795008 ]], dtype=float32)

In [54]:
! ls /Users/grp/mlNotebooks/tf_logs/run-20190321031209 # list the event file written by the FileWriter

events.out.tfevents.1553137929.garretts-MBP


### launch tensorboard via terminal

In [38]:
# garretts-MBP:~ grp$ tensorboard --logdir /Users/grp/mlNotebooks/tf_logs/
# TensorBoard 1.13.1 at http://garretts-MBP:6006 (Press CTRL+C to quit)

### name scopes

In [39]:
with tf.name_scope("loss") as scope:
    error = y_pred - y
    mse = tf.reduce_mean(tf.square(error), name="mse")

print(error.op.name)
print(mse.op.name)

loss/sub
loss/mse


### relu + shared variable via ...
-  input parameter
-  python dictionary
-  python class
-  get_variable()

In [40]:
tf.reset_default_graph()
n_features = 3

def relu(X, threshold): # shared variable threshold for all relus
    with tf.name_scope("relu"):
        w_shape = (int(X.get_shape()[1]), 1)
        w = tf.Variable(tf.random_normal(w_shape), name="weights")
        b = tf.Variable(0.0, name="bias")
        z = tf.add(tf.matmul(X, w), b, name="z")
        return tf.maximum(z, threshold, name="max")

threshold = tf.Variable(0.0, name="threshold")
X = tf.placeholder(tf.float32, shape=(None, n_features), name="X")
relus = [relu(X, threshold) for i in range(5)]
output = tf.add_n(relus, name="output")

### get_variable() method

In [41]:
tf.reset_default_graph()

with tf.variable_scope("relu"):
    threshold = tf.get_variable("threshold", shape=(), # shape=() declares a scalar
                                initializer=tf.constant_initializer(0.0))

In [42]:
with tf.variable_scope("relu", reuse=True): # when wanting to reuse variable
    threshold = tf.get_variable("threshold")

In [43]:
with tf.variable_scope("relu") as scope:
    scope.reuse_variables() # when wanting to reuse variable
    threshold = tf.get_variable("threshold")
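
The create-or-reuse behavior of the three cells above can be mimicked with a small registry. In TF 1.x, `get_variable()` raises an exception when asked to create a variable that already exists without `reuse` set, and when asked to reuse one that does not exist. `ToyVariableStore` below is a hypothetical plain-Python stand-in for that rule, not a TensorFlow API.

```python
class ToyVariableStore:
    """Hypothetical registry mimicking the create/reuse rules of get_variable()."""

    def __init__(self):
        self._variables = {}

    def get_variable(self, name, initial_value=None, reuse=False):
        if reuse:
            if name not in self._variables:
                raise ValueError(f"{name} does not exist, cannot reuse it")
            return self._variables[name]
        if name in self._variables:
            raise ValueError(f"{name} already exists, pass reuse=True to share it")
        self._variables[name] = [initial_value]   # a one-element list acts as a mutable box
        return self._variables[name]

store = ToyVariableStore()
threshold = store.get_variable("relu/threshold", initial_value=0.0)
shared = store.get_variable("relu/threshold", reuse=True)
same_object = shared is threshold                 # both names refer to one variable

try:
    store.get_variable("relu/threshold", initial_value=1.0)   # creating it twice is an error
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True
```

Failing loudly in both directions is the point of the design: a typo in a variable name cannot silently create a second, unshared variable.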

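1.  define the relu() function
2.  create the relu/threshold variable (a scalar initialized to 0.0)
3.  build five relus by calling relu()
4.  inside each call, relu() reuses the existing relu/threshold variable
5.  each call then creates the remaining relu nodes (weights, bias, z, max)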

### relu pipeline w/ five relus sharing threshold variable

In [44]:
tf.reset_default_graph()

def relu(X):
    with tf.variable_scope("relu", reuse=True):
        threshold = tf.get_variable("threshold")
        w_shape = int(X.get_shape()[1]), 1
        w = tf.Variable(tf.random_normal(w_shape), name="weights")
        b = tf.Variable(0.0, name="bias")
        z = tf.add(tf.matmul(X, w), b, name="z")
        return tf.maximum(z, threshold, name="max")

X = tf.placeholder(tf.float32, shape=(None, n_features), name="X")
with tf.variable_scope("relu"):
    threshold = tf.get_variable("threshold", shape=(),
                                initializer=tf.constant_initializer(0.0))
relus = [relu(X) for relu_index in range(5)]
output = tf.add_n(relus, name="output")

In [45]:
file_writer = tf.summary.FileWriter("tf_logs/relu", tf.get_default_graph())
file_writer.close()

In [46]:
! ls /Users/grp/mlNotebooks/tf_logs/relu # list the event file written by the FileWriter

events.out.tfevents.1553137930.garretts-MBP


### relu pipeline w/ shared variables living in 1st relu

In [47]:
tf.reset_default_graph()

def relu(X):
    threshold = tf.get_variable("threshold", shape=(),
                                initializer=tf.constant_initializer(0.0))
    w_shape = (int(X.get_shape()[1]), 1)
    w = tf.Variable(tf.random_normal(w_shape), name="weights")
    b = tf.Variable(0.0, name="bias")
    z = tf.add(tf.matmul(X, w), b, name="z")
    return tf.maximum(z, threshold, name="max")

X = tf.placeholder(tf.float32, shape=(None, n_features), name="X")
relus = []
for relu_index in range(5):
    with tf.variable_scope("relu", reuse=(relu_index >= 1)) as scope:
        relus.append(relu(X))
output = tf.add_n(relus, name="output")

In [48]:
file_writer = tf.summary.FileWriter("tf_logs/relu1", tf.get_default_graph())
file_writer.close()

In [49]:
! ls /Users/grp/mlNotebooks/tf_logs/relu1 # list the event file written by the FileWriter

events.out.tfevents.1553137930.garretts-MBP


### additional exercises:

https://github.com/ageron/handson-ml/blob/master/09_up_and_running_with_tensorflow.ipynb

1. What are the main benefits of creating a computation graph rather than directly executing the computations? What are the main drawbacks?
2. Is the statement a_val = a.eval(session=sess) equivalent to a_val = sess.run(a)?
3. Is the statement a_val, b_val = a.eval(session=sess), b.eval(session=sess) equivalent to a_val, b_val = sess.run([a, b])?
4. Can you run two graphs in the same session?
5. If you create a graph g containing a variable w, then start two threads and open a session in each thread, both using the same graph g, will each session have its own copy of the variable w or will it be shared?
6. When is a variable initialized? When is it destroyed?
7. What is the difference between a placeholder and a variable?
8. What happens when you run the graph to evaluate an operation that depends on a placeholder but you don’t feed its value? What happens if the operation does not depend on the placeholder?
9. When you run a graph, can you feed the output value of any operation, or just the value of placeholders?
10. How can you set a variable to any value you want (during the execution phase)?
11. How many times does reverse-mode autodiff need to traverse the graph in order to compute the gradients of the cost function with regards to 10 variables? What about forward-mode autodiff? And symbolic differentiation?
12. Implement Logistic Regression with Mini-batch Gradient Descent using TensorFlow. Train it and evaluate it on the moons dataset (introduced in Chapter 5). Try adding all the bells and whistles:
    -  Define the graph within a logistic_regression() function that can be reused easily.
    -  Save checkpoints using a Saver at regular intervals during training, and save the final model at the end of training.
    -  Restore the last checkpoint upon startup if training was interrupted.
    -  Define the graph using nice scopes so the graph looks good in TensorBoard.
    -  Add summaries to visualize the learning curves in TensorBoard.
    -  Try tweaking some hyperparameters such as the learning rate or the mini-batch size and look at the shape of the learning curve.
