We now know the basics of TensorFlow and can create a model (or at least copy and paste the code from the last notebook), then choose an optimizer and a loss function and write and run the training step. We now want to study some utilities and extended libraries that build on this process.

In [6]:
import tensorflow as tf
import tensorflow.keras as K
import pandas as pd

One thing we skipped in the basic TensorFlow notebook was what to do after training a model. We usually want to save the model so that we can load it and work with it in another program.

A Keras model consists of multiple components:

- An architecture, or configuration, which specifies what layers the model contains and how they're connected.
- A set of weights values (the "state of the model").
- An optimizer (defined by compiling the model).
- A set of losses and metrics (defined by compiling the model or calling add_loss() or add_metric()).

The Keras API makes it possible to save all of these pieces to disk at once, or to selectively save only some of them:

- Saving everything into a single archive in the TensorFlow SavedModel format (or in the older Keras H5 format). This is the standard practice.
- Saving the architecture / configuration only, typically as a JSON file. (Not covered here.)
- Saving the weights values only. This is generally used when training the model. (Not covered here.)
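
Although the last two options are not explored further in this notebook, the split between architecture and weights can be sketched in memory without writing anything to disk. The sketch below is only an illustration: `small_model` is a hypothetical toy model (not the MNIST model), and it assumes `to_json()`, `model_from_json()`, `get_weights()` and `set_weights()` behave the same way in the installed Keras version.

```python
import numpy as np
import tensorflow as tf

# Hypothetical toy model, used only for illustration
small_model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(3, activation='relu'),
    tf.keras.layers.Dense(2),
])

# Architecture only: a JSON string describing the layers and their connections
architecture_json = small_model.to_json()

# Weights only: a list of NumPy arrays (the "state of the model")
weight_values = small_model.get_weights()

# Rebuilding from the JSON gives the same architecture with fresh weights,
# so the saved weights have to be copied over separately
rebuilt_model = tf.keras.models.model_from_json(architecture_json)
rebuilt_model.set_weights(weight_values)

# Both models should now give the same output for the same input
x = np.ones((1, 4), dtype=np.float32)
same_output = np.allclose(small_model(x).numpy(), rebuilt_model(x).numpy())
print(same_output)
```

Neither piece contains the optimizer or the losses, which is why saving everything at once is the more convenient route when training should be resumed later.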

Let's use our model from the last notebook.

In [4]:
#CREATE SAME MODEL AS LAST NOTEBOOK
(mnist_images, mnist_labels), _ = K.datasets.mnist.load_data()

dataset = tf.data.Dataset.from_tensor_slices(
  (tf.cast(mnist_images[...,tf.newaxis]/255, tf.float32),
   tf.cast(mnist_labels,tf.int64)))

dataset = dataset.shuffle(1000).batch(32)

optimizer = K.optimizers.Adam()
loss_object = K.losses.SparseCategoricalCrossentropy(from_logits=True)

loss_history = []

In [3]:
mnist_model = K.Sequential([
  K.layers.Conv2D(16,[3,3], activation='relu',
                         input_shape=(None, None, 1)),
  K.layers.Conv2D(16,[3,3], activation='relu'),
  K.layers.GlobalAveragePooling2D(),
  K.layers.Dense(10)
])

In [5]:
def train_step(images, labels):
  with tf.GradientTape() as tape:
    logits = mnist_model(images, training=True)
    
    tf.debugging.assert_equal(logits.shape, (32, 10))
    
    loss_value = loss_object(labels, logits)

  loss_history.append(loss_value.numpy().mean())
  grads = tape.gradient(loss_value, mnist_model.trainable_variables)
  optimizer.apply_gradients(zip(grads, mnist_model.trainable_variables))

In [6]:
epochs = 1
for epoch in range(epochs):
    for (batch, (images, labels)) in enumerate(dataset):
        train_step(images, labels)
    print('Epoch {} finished'.format(epoch))

Epoch 0 finished


We can save and load the whole model in the TensorFlow SavedModel format with:
- model.save(path + 'name')
- K.models.load_model(path + 'name')

This creates a folder called name that contains two folders (assets and variables) and one file (saved_model.pb):
- assets  
- saved_model.pb  
- variables

The model architecture and training configuration (including the optimizer, losses, and metrics) are stored in saved_model.pb; the weights are stored in the variables folder.

In [7]:
mnist_model.save('mnist_model')

Instructions for updating:
This property should not be used in TensorFlow 2.0, as updates are applied automatically.
INFO:tensorflow:Assets written to: mnist_model\assets


In [11]:
%ls mnist_model

 Volume in drive C has no label.
 Volume Serial Number is DE56-6F87

 Directory of C:\Users\Jonathan\PycharmProjects\ML-notebook\Tensorflow\mnist_model

2020-08-29  20:36    <DIR>          .
2020-08-29  20:36    <DIR>          ..
2020-08-29  20:36    <DIR>          assets
2020-08-29  20:36            73 090 saved_model.pb
2020-08-29  20:36    <DIR>          variables
               1 File(s)         73 090 bytes
               4 Dir(s)  18 282 639 360 bytes free


Sometimes when training a model, the data is too big to fit in memory all at once, or reading/loading a sample takes longer than running the model on it. TensorFlow has a built-in API, tf.data, that handles the input pipeline.
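
To make the motivation concrete, here is a minimal sketch of a pipeline that produces samples lazily instead of holding them all in memory. It is an illustration under assumptions, not part of the original notebook: `sample_generator` is a hypothetical stand-in for slow loading code, and `from_generator` with `output_signature` and `tf.data.AUTOTUNE` assume a reasonably recent TensorFlow 2 release.

```python
import numpy as np
import tensorflow as tf

def sample_generator(num_samples=10):
    # Hypothetical stand-in for reading samples from disk one at a time
    for i in range(num_samples):
        features = np.full((4,), i, dtype=np.float32)
        label = i % 2
        yield features, label

# Samples are produced on demand, so the full data never sits in memory
streamed = tf.data.Dataset.from_generator(
    sample_generator,
    output_signature=(
        tf.TensorSpec(shape=(4,), dtype=tf.float32),
        tf.TensorSpec(shape=(), dtype=tf.int64),
    ),
)

# prefetch lets the pipeline prepare the next batch while the current one is used
streamed = streamed.batch(4).prefetch(tf.data.AUTOTUNE)

batch_shapes = [features.numpy().shape for features, _ in streamed]
print(batch_shapes)
```

The last batch is smaller than the others because 10 samples do not divide evenly into batches of 4.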

In [4]:
#Create from a list; NumPy arrays etc. also work
dataset = tf.data.Dataset.from_tensor_slices([8, 3, 0, 8, 2, 1])
print(dataset)
print(type(dataset))

<TensorSliceDataset shapes: (), types: tf.int32>
<class 'tensorflow.python.data.ops.dataset_ops.TensorSliceDataset'>


The basis of the input pipeline is the dataset. We can construct a dataset from data in memory with
- tf.data.Dataset.from_tensors() or tf.data.Dataset.from_tensor_slices()

This creates a Python iterable object.
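
The two constructors differ in how they split the input, which a small sketch can show. The variable names `data`, `whole` and `sliced` are made up for this illustration.

```python
import tensorflow as tf

data = [[1, 2], [3, 4], [5, 6]]

# from_tensors: the whole input becomes one single element
whole = tf.data.Dataset.from_tensors(data)

# from_tensor_slices: one element per slice along the first axis
sliced = tf.data.Dataset.from_tensor_slices(data)

whole_elements = [element.numpy().tolist() for element in whole]
sliced_elements = [element.numpy().tolist() for element in sliced]

print(len(whole_elements), len(sliced_elements))
```

For training data we almost always want `from_tensor_slices`, since each sample should be its own element.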

In [5]:
#iterate like always
for elem in dataset:
  print(elem.numpy())

print('---------------------' +'\n')

it = iter(dataset)

print(next(it).numpy())

8
3
0
8
2
1
---------------------

8


In [18]:
df = pd.read_csv('input/GS.csv')

#Drop the object-typed column (alternatively, convert it to a numeric or string type)
print(df.dtypes)
del df['Date']

#Build a dataset of (index, feature row) pairs
dataset = tf.data.Dataset.from_tensor_slices((df.index,df.values))

it = iter(dataset)
print(next(it))

#Each element has shapes ((), (40,))
print(dataset.element_spec)

Date             object
Open            float64
High            float64
Low             float64
Close           float64
Volume            int64
USD3MTD156N     float64
BAMLH0A0HYM2    float64
T10YIE          float64
UNRATE          float64
VIXCLS          float64
NIKKEI225       float64
NASDAQCOM       float64
MS_Close        float64
MS_Volume       float64
JPM_Close       float64
JPM_Volume      float64
WFC_Close       float64
WFC_Volume      float64
C_Close         float64
C_Volume        float64
BAC_Close       float64
BAC_Volume      float64
BCS_Close       float64
BCS_Volume      float64
HSBC_Close      float64
HSBC_Volume     float64
ma7             float64
ma21            float64
26ema           float64
12ema           float64
MACD            float64
20sd            float64
upper_band      float64
lower_band      float64
ema             float64
momentum        float64
fft3            float64
fft6            float64
fft9            float64
ARIMA           float64
dtype: object
(<
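
Since input/GS.csv is not available to every reader, the same idea can be sketched on a tiny made-up DataFrame. Everything here is hypothetical (`toy_df` and its values), and `.to_numpy()` is used instead of passing the pandas objects directly, only to keep the conversion to tensors explicit.

```python
import pandas as pd
import tensorflow as tf

# Hypothetical stand-in for the CSV data: 3 rows, 3 numeric columns
toy_df = pd.DataFrame({
    'Open': [1.0, 2.0, 3.0],
    'Close': [1.5, 2.5, 3.5],
    'Volume': [10, 20, 30],
})

# Each element is a pair: (row index, row of feature values)
toy_dataset = tf.data.Dataset.from_tensor_slices(
    (toy_df.index.to_numpy(), toy_df.to_numpy()))

# The index is a scalar and the feature row has one entry per column
index_spec, row_spec = toy_dataset.element_spec
print(index_spec.shape, row_spec.shape)

first_index, first_row = next(iter(toy_dataset))
print(first_index.numpy(), first_row.numpy())
```

With the 40 numeric columns of the real file, the second shape would be (40,) instead of (3,).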

We saw how to create a dataset from data from memory. To 