# TensorBoard

I modified the notebook originally created by JM Portilla (Pierian Data Inc.).

TensorBoard is a dashboard that visualizes how the network is trained, e.g., how the loss and the weight values evolve over the epochs.

Official tutorial: https://www.tensorflow.org/tensorboard/get_started

**Install**:
`pip install tensorboard` (or `pip3 install tensorboard`)

**Usage**:
1. We instantiate a `TensorBoard` callback and pass it to `model.fit()`; the callback writes log files that can contain many different kinds of data (losses, weight histograms, the model graph, etc.)
2. Then, we launch TensorBoard in the terminal: `tensorboard --logdir=path_to_your_logs`
3. We open the TensorBoard dashboard in a browser at: http://localhost:6006/


## 1. Data

In [27]:
import pandas as pd
import numpy as np

In [28]:
df = pd.read_csv('../data/cancer_classification.csv')

### Train Test Split

In [29]:
X = df.drop('benign_0__mal_1',axis=1).values
y = df['benign_0__mal_1'].values

In [30]:
from sklearn.model_selection import train_test_split

In [31]:
X_train, X_test, y_train, y_test = train_test_split(X,y,test_size=0.25,random_state=101)


### Scaling Data

In [32]:
from sklearn.preprocessing import MinMaxScaler

In [33]:
scaler = MinMaxScaler()

In [34]:
scaler.fit(X_train)

MinMaxScaler(copy=True, feature_range=(0, 1))

In [35]:
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)
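
The scaler above is fitted on the training split only and then applied to both splits. A minimal, dependency-free sketch of that idea follows; the helper names `fit_min_max` and `transform_min_max` and the toy data are hypothetical and only illustrate the principle, they are not how `MinMaxScaler` is implemented.

```python
def fit_min_max(rows):
    """Return per-column (min, max) computed from the training rows only."""
    columns = list(zip(*rows))
    return [(min(col), max(col)) for col in columns]


def transform_min_max(rows, stats):
    """Scale each column using the stored training statistics."""
    scaled = []
    for row in rows:
        scaled.append([
            # Constant columns are mapped to 0.0 to avoid dividing by zero
            (value - lo) / (hi - lo) if hi > lo else 0.0
            for value, (lo, hi) in zip(row, stats)
        ])
    return scaled


# Toy data (made up for illustration)
train = [[1.0, 10.0], [3.0, 30.0], [5.0, 20.0]]
test = [[2.0, 40.0]]

stats = fit_min_max(train)                      # statistics come from train only
train_scaled = transform_min_max(train, stats)
test_scaled = transform_min_max(test, stats)    # may fall outside [0, 1]
```

Because the test split reuses the training minimum and maximum, test values can land outside the `[0, 1]` range; that is expected and avoids leaking test information into the preprocessing.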

## 2. Creating the Model

In [36]:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation,Dropout

In [37]:
# We import the TensorBoard callback
from tensorflow.keras.callbacks import EarlyStopping,TensorBoard

In [38]:
early_stop = EarlyStopping(monitor='val_loss', mode='min', verbose=1, patience=25)
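
To make the effect of `patience` more concrete, here is a simplified sketch of the stopping rule for `mode='min'`. The function `stopping_epoch` is a hypothetical helper written for illustration; it ignores other `EarlyStopping` options such as `min_delta` and is not the Keras implementation.

```python
def stopping_epoch(val_losses, patience):
    """Return the 0-based epoch at which training would stop, or None.

    Simplified rule: stop once the monitored value has failed to improve
    on its best value for `patience` consecutive epochs.
    """
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            wait = 0          # improvement: reset the counter
        else:
            wait += 1         # no improvement this epoch
            if wait >= patience:
                return epoch
    return None


# Toy validation-loss history (made up for illustration)
history = [0.9, 0.8, 0.7, 0.75, 0.72, 0.71]
stop_at = stopping_epoch(history, patience=3)
```

With `patience=25`, as in the cell above, training is allowed 25 consecutive epochs without a new best `val_loss` before it is interrupted.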

### Creating the Tensorboard Callback

Arguments to instantiate `TensorBoard` (from the help docstring):
- `log_dir`: directory of log files used by TensorBoard
- `histogram_freq`: frequency (in epochs) at which to compute activation and
weight histograms for the layers of the model. If set to 0, histograms
won't be computed. Validation data (or split) must be specified for
histogram visualizations.
- `write_graph`: whether to visualize the graph in TensorBoard. The log file
can become quite large when write_graph is set to True.
- `write_images`: whether to write model weights to visualize as image in
TensorBoard.
- `update_freq`: `'batch'` or `'epoch'` or integer. When using `'batch'`,
writes the losses and metrics to TensorBoard after each batch. The same
applies for `'epoch'`. If using an integer, let's say `1000`, the
callback will write the metrics and losses to TensorBoard every 1000
samples. Note that writing too frequently to TensorBoard can slow down
your training.
- `profile_batch`: Profile the batch to sample compute characteristics. By
default, it will profile the second batch. Set `profile_batch=0` to
disable profiling. Must run in TensorFlow eager mode.
- `embeddings_freq`: frequency (in epochs) at which embedding layers will
be visualized. If set to 0, embeddings won't be visualized.       

In [46]:
from datetime import datetime

In [47]:
timestamp = datetime.now().strftime("%Y-%m-%d--%H%M")
timestamp

'2021-02-06--1558'

In [52]:
# WINDOWS: Use "logs\\fit"
# MACOS/LINUX: Use "logs/fit"

# Path where log files are stored needs to be specified
# Log files are necessary for the visualizations done in tensorboard
# Always use `logs/fit` and then what you want (eg, a timestamp) 
log_directory = 'logs/fit/'+ timestamp
# Later, when we launch tensorboard in the Terminal:
# --logdir=logs/fit/<timestamp>

board = TensorBoard(log_dir=log_directory,histogram_freq=1,
    write_graph=True,
    write_images=True,
    update_freq='epoch',
    profile_batch=2,
    embeddings_freq=1)
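
The Windows vs. macOS/Linux path note above can be sidestepped by letting `pathlib` build the path. This is a minimal sketch; the helper `make_log_dir` is a hypothetical name, and the timestamp format is the one used in the cell above.

```python
from datetime import datetime
from pathlib import Path


def make_log_dir(root="logs/fit", now=None):
    """Build <root>/<timestamp> with the separator of the current OS."""
    now = now or datetime.now()
    timestamp = now.strftime("%Y-%m-%d--%H%M")
    return Path(root) / timestamp


# A string is the safest thing to hand over as the `log_dir` argument
log_directory = str(make_log_dir())
```

Using one sub-directory per run under `logs/fit` keeps the runs separate, so they can later be inspected individually or compared by pointing `--logdir` at the parent folder.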

### Network / Model

In [49]:
model = Sequential()
model.add(Dense(units=30,activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(units=15,activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(units=1,activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam')

### Train the Model

In [None]:
# We pass the early stop and the (tensor-)board as callbacks
model.fit(x=X_train, 
          y=y_train, 
          epochs=600,
          validation_data=(X_test, y_test), verbose=1,
          callbacks=[early_stop,board]
          )
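
After training, it can be useful to check that log files were actually written before launching the dashboard. The sketch below lists the run folders under `logs/fit` and the event files in each; `list_runs` is a hypothetical helper, and the `events.out.tfevents.*` file-name pattern is an assumption about how the event files are named.

```python
from pathlib import Path


def list_runs(root="logs/fit"):
    """Map each run directory under `root` to the event files found inside it."""
    root = Path(root)
    runs = {}
    if not root.is_dir():
        return runs  # nothing logged yet
    for run_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        # Search recursively, since event files may sit in sub-folders
        events = sorted(
            str(f.relative_to(run_dir))
            for f in run_dir.rglob("events.out.tfevents.*")
        )
        runs[run_dir.name] = events
    return runs


for run, events in list_runs().items():
    print(run, len(events), "event file(s)")
```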

## 3. Launch TensorBoard

Open a terminal and start TensorBoard, passing the path to the log files:

```bash
cd <path/to/our/project>
# General form:
tensorboard --logdir=<path/to/your/logs>
# For the log directory created in this notebook:
tensorboard --logdir=logs/fit/<timestamp>
```

Open the TensorBoard dashboard in the browser at [http://localhost:6006/](http://localhost:6006/)
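
If launching from Python is more convenient than typing the command, the same call can be assembled as an argument list. This is a sketch under assumptions: `tensorboard_command` is a hypothetical helper, and it presumes the `tensorboard` executable is on the `PATH`; `--port` is passed explicitly so the URL above stays predictable.

```python
def tensorboard_command(logdir, port=6006):
    """Build the argument list for the `tensorboard` command-line tool."""
    return ["tensorboard", "--logdir", str(logdir), "--port", str(port)]


cmd = tensorboard_command("logs/fit")

# To actually start the server (left commented out so the sketch has no side effects):
# import subprocess
# process = subprocess.Popen(cmd)
# ... later: process.terminate()
```

Inside a Jupyter notebook, the `%load_ext tensorboard` and `%tensorboard --logdir logs/fit` magics are an alternative that embeds the dashboard in the notebook.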

More info:
https://www.tensorflow.org/tensorboard/

Some comments on the TensorBoard dashboard:
- The loss is plotted (smoothed or not) for the train & validation splits
- Images: with `write_images=True`, the model weights are rendered as images at different stages of the network; this makes most sense for CNNs processing images
- The graph of the model is visualized
- The ranges of the weights (& biases) are visualized over the epochs
- Histograms of the weights (& biases) are visualized over the epochs
- Projector: high-dimensional data is projected to a low-dimensional view for visualization