# Lightning Tour

Introduces the main ways of using Saber.

### Table of contents

1. [Training](#Training)
    1. [Transfer learning](#Transfer-learning)
    2. [Multi-task learning](#Multi-task-learning)
2. [Saving and loading](#Saving-and-loading)
    1. [Saving a model](#Saving-a-model)
    2. [Loading a model](#Loading-a-model)
3. [Performing predictions](#Performing-predictions)
4. [Visualizations](#Visualizations)

## Training

Import the required modules:

In [None]:
from saber.config import Config
from saber.sequence_processor import SequenceProcessor

Create a `SequenceProcessor` object. This object coordinates training, annotation, saving and loading of models and datasets.

In [None]:
sp = SequenceProcessor()

If no config is specified, `SequenceProcessor` uses the default `config.ini`. Alternatively, you can write your own config file and specify it as follows:

In [None]:
sp = SequenceProcessor(Config('path/to/your/config.ini'))

See [here](https://github.com/BaderLab/saber/blob/master/saber/config.ini) for the default `config` file.
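As a rough sketch of the custom-config route, the snippet below writes a small `.ini` file with Python's standard `configparser` module. The section names (`mode`, `data`), the helper `write_config`, and all values are assumptions made for illustration; only the parameter names `model_name`, `dataset_folder`, and `pretrained_embeddings` come from this tour. Copy the section layout from the default `config.ini` linked above rather than relying on this one.

```python
import configparser
import tempfile
from pathlib import Path


def write_config(path, model_name, dataset_folder, pretrained_embeddings=""):
    """Write a minimal INI file; the section names used here are hypothetical."""
    parser = configparser.ConfigParser()
    parser["mode"] = {"model_name": model_name}
    parser["data"] = {
        "dataset_folder": dataset_folder,
        "pretrained_embeddings": pretrained_embeddings,
    }
    with open(path, "w") as handle:
        parser.write(handle)
    return Path(path)


config_path = write_config(
    Path(tempfile.mkdtemp()) / "my_config.ini",
    model_name="your-model-name",  # placeholder, not a real Saber model name
    dataset_folder="path/to/datasets/NCBI-Disease",
)

# Read the file back to confirm what was written.
check = configparser.ConfigParser()
check.read(config_path)
print(check["data"]["dataset_folder"])
```

The resulting path could then be passed along as `SequenceProcessor(Config(str(config_path)))`, following the pattern shown earlier.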

> Note that if you are calling Saber from the command line, you can provide any parameter in the `config` file as a flag, e.g. `--config_filepath`.

We can then load the dataset (specified by the `dataset_folder` parameter). If pre-trained token embeddings were specified with `pretrained_embeddings`, load them here as well by uncommenting the `load_embeddings()` call.

In [None]:
sp.load_dataset()
# sp.load_embeddings()

> Note, you can actually pass the filepath to the dataset straight to the `load_dataset()` method, e.g. `sp.load_dataset('path/to/NCBI-Disease')`. The same is true for the `load_embeddings()` method.

Lastly, we create the model we would like to use (specified by the `model_name` parameter).

In [None]:
sp.create_model()

> Again, `model_name` can alternatively be passed directly to `create_model()`.

We are then ready to train:

In [None]:
sp.fit()

### Transfer learning

Transfer learning is as easy as training a model, saving it, loading it, and then continuing training on a new dataset. Here is an example:

In [None]:
# Create and train a model on GENIA corpus
sp = SequenceProcessor()
sp.load_dataset('path/to/datasets/GENIA')
sp.create_model()
sp.fit()
sp.save('pretrained_models/GENIA')

# Load that model
del sp
sp = SequenceProcessor()
sp.load('pretrained_models/GENIA')

# Use transfer learning to continue training on a new dataset
sp.load_dataset('path/to/datasets/CRAFT')
sp.fit()

> Note that there is currently no way to easily do this with the command line interface, but I am working on it!

### Multi-task learning

Multi-task learning is as easy as specifying multiple dataset paths, either in the `config` file, at the command line via the flag `--dataset_folder`, or as an argument to `load_dataset()`. The number of datasets is arbitrary.

Here is an example using the last method:

In [None]:
sp = SequenceProcessor()

# Simply pass multiple dataset paths to load_dataset to use multi-task learning. 
sp.load_dataset('path/to/datasets/NCBI-Disease', 'path/to/datasets/Linnaeus')

sp.create_model()
sp.fit()
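Before passing several paths to `load_dataset()`, it can help to confirm that each folder actually exists. The helper below is a minimal sketch using only the standard library; `check_dataset_folders` is a hypothetical name and not part of Saber, and the temporary directories merely stand in for real corpora.

```python
import tempfile
from pathlib import Path


def check_dataset_folders(*folders):
    """Return the folders as strings, raising if any is not a directory."""
    missing = [str(f) for f in folders if not Path(f).is_dir()]
    if missing:
        raise FileNotFoundError("Dataset folders not found: " + ", ".join(missing))
    return [str(f) for f in folders]


# Demo with two throwaway directories standing in for real corpora.
root = Path(tempfile.mkdtemp())
(root / "NCBI-Disease").mkdir()
(root / "Linnaeus").mkdir()

folders = check_dataset_folders(root / "NCBI-Disease", root / "Linnaeus")
print(len(folders))

# With Saber installed, the checked paths could then be unpacked into the call:
# sp.load_dataset(*folders)
```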

## Saving and loading

In the following sections we introduce the saving and loading of models.

### Saving a model

Assuming the model has already been created (see above), we can save it like so:

In [None]:
path_to_saved_model = 'path/to/pretrained_models/PRGE'

sp.save(path_to_saved_model)

> Currently, `sp.save()` will save the weights from the last training epoch by default.

### Loading a model

Let's illustrate loading a model with a new `SequenceProcessor` object:

In [None]:
# Delete our previous SequenceProcessor object (if it exists)
if 'sp' in locals(): del sp

# Create a new SequenceProcessor object
sp = SequenceProcessor()

# Load a previous model
sp.load(path_to_saved_model)

## Performing predictions

### Library

If you are using Saber as a Python library, you can perform predictions on raw text with the `annotate()` method. Passing the argument `jupyter=True` renders the result directly in the notebook:

In [None]:
# Assuming sp is a `SequenceProcessor` object with a loaded (and trained) model
sp.annotate('Viral-mediated noisy gene expression reveals biphasic E2f1 response to MYC Gene expression.', 
            jupyter=True)
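To annotate several texts in one go, a thin loop around `annotate()` is enough. The sketch below assumes only that `annotate()` accepts a single string, as in the call above, and makes no assumption about what it returns. `annotate_all` and `EchoProcessor` are hypothetical names; the latter is a stand-in so the example runs without Saber or a trained model.

```python
def annotate_all(processor, texts, **kwargs):
    """Call processor.annotate() once per non-empty text and collect the results."""
    results = []
    for text in texts:
        text = text.strip()
        if not text:
            continue  # skip blank entries
        results.append((text, processor.annotate(text, **kwargs)))
    return results


class EchoProcessor:
    """Stand-in used only to make this sketch runnable without Saber."""

    def annotate(self, text, **kwargs):
        return {"text": text}


pairs = annotate_all(
    EchoProcessor(),
    ["MYC regulates E2f1.", "   ", "p53 is a tumour suppressor."],
)
print(len(pairs))
```

In practice you would pass your own `sp` object in place of `EchoProcessor()`.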

### Web-service

Saber's web-service is invoked locally from the shell with:

In [None]:
$ python -m saber.app

> You have to run this from your terminal, not in the notebook!

See [here](https://baderlab.github.io/saber-api-docs/) for the API docs of Saber's web-service.
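Once the web-service is running, it can be called from Python over HTTP. The following is a hedged sketch using only the standard library. The port (`5000`), the endpoint path (`/annotate/text`), and the `{"text": ...}` payload shape are assumptions, as is the helper name `build_annotate_request`; confirm all of them against the API docs linked above. The request is only built here, not sent, so the snippet runs without a server.

```python
import json
import urllib.request


def build_annotate_request(text, base_url="http://localhost:5000"):
    """Build (but do not send) a JSON POST request; URL and path are assumptions."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/annotate/text",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_annotate_request(
    "The phosphorylation of Hdm2 by MK2 promotes the ubiquitination of p53."
)
print(request.get_method(), request.full_url)

# With the web-service running, the request could be sent like this:
# with urllib.request.urlopen(request) as response:
#     annotation = json.loads(response.read().decode("utf-8"))
```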

## Visualizations

_Note: This is less a feature and more a by-product of the fact that the model is implemented in [Keras](https://keras.io)._

We can easily create an image depicting our model, which is useful if you plan on modifying the architecture of the model. First, install the [graphviz graph library](http://www.graphviz.org/) and the [Python interface](https://pypi.python.org/pypi/graphviz).

> More info can be found [here](https://machinelearningmastery.com/visualize-deep-learning-neural-network-model-keras/).

In [None]:
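# Assuming sp is a `SequenceProcessor` object with a created (or loaded) model;
# a freshly constructed SequenceProcessor has no model yet.

# Set this variable equal to your Keras model object.
model_ = sp.model.model[0]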

We can either create and save an image on our local machine,

In [None]:
from keras.utils import plot_model
plot_model(model_, to_file='model.png', show_shapes=True, show_layer_names=True)

or, visualize it directly in the notebook

In [None]:
from IPython.display import SVG
from keras.utils.vis_utils import model_to_dot

SVG(model_to_dot(model_, show_shapes=True, show_layer_names=True).create(prog='dot', format='svg'))