# What is Ktrain?

[Ktrain](https://github.com/amaiya/ktrain) is a small and lightweight helper library that tries to streamline the process of applying [Keras](https://keras.io/) models to datasets.

Its main purpose is to hide much of the syntactic clutter involved in building _default_ pipelines for preprocessing data, training a predictor, and deploying it.

In more detail, in the author's own words:

"*ktrain* is a lightweight wrapper for the deep learning library [Keras](https://keras.io/) to help build, train, and deploy neural networks.  With only a few lines of code, ktrain allows you to easily and quickly:

- estimate an optimal learning rate for your model given your data using a Learning Rate Finder
- utilize learning rate schedules such as the [triangular policy](https://arxiv.org/abs/1506.01186), the [1cycle policy](https://arxiv.org/abs/1803.09820), and [SGDR](https://arxiv.org/abs/1608.03983) to effectively minimize loss and improve generalization
- employ fast and easy-to-use pre-canned models for:
  - **text classification** (e.g., [BERT](https://arxiv.org/abs/1810.04805), [NBSVM](https://www.aclweb.org/anthology/P12-2018), [fastText](https://arxiv.org/abs/1607.01759), GRUs with [pretrained word vectors](https://fasttext.cc/docs/en/english-vectors.html))
  - **image classification** (e.g., [ResNet](https://arxiv.org/abs/1512.03385), [Wide ResNet](https://arxiv.org/abs/1605.07146), [Inception](https://www.cs.unc.edu/~wliu/papers/GoogLeNet.pdf))
  - **text sequence labeling** (e.g., [Bidirectional LSTM-CRF](https://arxiv.org/abs/1603.01360) with optional pretrained word embeddings)
  - **graph node classification** (e.g., [GraphSAGE](https://cs.stanford.edu/people/jure/pubs/graphsage-nips17.pdf))
- perform multilingual text classification (e.g., [Chinese Sentiment Analysis with BERT](https://nbviewer.jupyter.org/github/amaiya/ktrain/blob/master/examples/text/ChineseHotelReviews-BERT.ipynb), [Arabic Sentiment Analysis with NBSVM](https://nbviewer.jupyter.org/github/amaiya/ktrain/blob/master/examples/text/ArabicHotelReviews-nbsvm.ipynb))
- load and preprocess text and image data from a variety of formats 
- inspect data points that were misclassified and [provide explanations](https://eli5.readthedocs.io/en/latest/) to help improve your model
- leverage a simple prediction API for saving and deploying both models and data-preprocessing steps to make predictions on new raw data
"

One of the main advantages of Ktrain is that it follows the state of the art fairly closely, at least in NLP, so it is surprisingly easy to get hold of very recent models such as BERT, and to use current training techniques (learning rate estimation and the "one cycle" policy).

One noteworthy conceptual element of Ktrain is the **usage of standard pipelines**, which makes deployment much less risky. It is generally an essential practice to **make the data preparation steps easily reproducible**, not just the model itself.
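
To illustrate why bundling preprocessing with the model matters, here is a minimal sketch in plain Python. It is not Ktrain code: `TinyPredictor`, its vocabulary and its weights are hypothetical stand-ins, and `pickle` is used only to keep the example self-contained.

```python
import os
import pickle
import tempfile


class TinyPredictor:
    """Hypothetical stand-in for a deployable predictor: preprocessing + model."""

    def __init__(self, vocabulary, weights):
        self.vocabulary = vocabulary  # word -> feature index (preprocessing state)
        self.weights = weights        # one weight per feature (the "model")

    def preprocess(self, text):
        # Turn raw text into a bag-of-words count vector using the stored vocabulary.
        counts = [0] * len(self.vocabulary)
        for word in text.lower().split():
            if word in self.vocabulary:
                counts[self.vocabulary[word]] += 1
        return counts

    def predict(self, text):
        # Score raw text directly; callers never touch the preprocessing step.
        score = sum(w * c for w, c in zip(self.weights, self.preprocess(text)))
        return "pos" if score >= 0 else "neg"


predictor = TinyPredictor({"good": 0, "bad": 1}, [1.0, -1.0])

# Save preprocessing state and model weights together as a single artifact.
path = os.path.join(tempfile.mkdtemp(), "predictor.pkl")
with open(path, "wb") as f:
    pickle.dump(predictor, f)

# Later (e.g. in a deployment), reload the artifact and predict on raw text.
with open(path, "rb") as f:
    restored = pickle.load(f)

prediction = restored.predict("A good movie")
```

Because the vocabulary travels with the weights, the deployed predictor cannot silently drift away from the preprocessing used in training. In Ktrain itself, the corresponding calls are `ktrain.get_predictor(learner.model, preproc)`, `predictor.save(path)` and `ktrain.load_predictor(path)`.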

## What is Ktrain not?

First and foremost, Ktrain is a "one man show": for now it is not a mature, large-scale project, and for all its convenience it is not well documented. The code tries to be self-explanatory, though.




# Usage example for NLP

Text Classification of [IMDb Movie Reviews](https://ai.stanford.edu/~amaas/data/sentiment/) Using [BERT](https://arxiv.org/pdf/1810.04805.pdf)

## Loading / preprocessing:

```
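import ktrain
from ktrain import text as txt

# Load the IMDb reviews from disk and preprocess them for BERT.
# `preproc` holds the preprocessing steps and is reused when building the model.
(x_train, y_train), (x_test, y_test), preproc = txt.texts_from_folder(
    'data/aclImdb',
    maxlen=500,
    preprocess_mode='bert',
    train_test_names=['train', 'test'],
    classes=['pos', 'neg'])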
```
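
The call above expects one sub-folder per split (`train`, `test`) and, inside each, one sub-folder per class (`pos`, `neg`) containing the review files. A small sanity check of that layout can be sketched as follows. `count_documents` is a hypothetical helper and not part of Ktrain, and the `.txt` extension is an assumption that matches the IMDb download.

```python
from pathlib import Path


def count_documents(data_dir, splits=("train", "test"), classes=("pos", "neg")):
    """Count .txt files per split/class folder (hypothetical helper, not part of ktrain)."""
    counts = {}
    for split in splits:
        for label in classes:
            folder = Path(data_dir) / split / label
            # A missing folder is reported as 0 documents instead of raising.
            counts[(split, label)] = len(list(folder.glob("*.txt"))) if folder.is_dir() else 0
    return counts
```

Calling `count_documents('data/aclImdb')` before loading makes it easy to spot a wrong path or an incomplete download. The IMDb archive also ships an `unsup` folder with unlabeled reviews under `train`, which is why restricting `classes` to `pos` and `neg` is useful here.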

## Loading the (pretrained) model

```
# load model
model = txt.text_classifier('bert', (x_train, y_train), preproc=preproc)

# wrap model and data in ktrain.Learner object
learner = ktrain.get_learner(model, 
                             train_data=(x_train, y_train), 
                             val_data=(x_test, y_test), 
                             batch_size=6)

```

## Using "learning rate finder" functionality to search for optimal training parameters

```
learner.lr_find()             # briefly simulate training to find good learning rate
learner.lr_plot()             # visually identify best learning rate
```

<img src="http://drive.google.com/uc?export=view&id=1hhYK-WhofcrK26G_XqnN0Bleurw8vX65">
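
The general idea behind such a learning rate range test can be sketched in a few lines of plain Python. This is a toy illustration on a one-dimensional quadratic loss, not Ktrain's implementation: the function names, the stopping threshold and the "largest drop" heuristic are all assumptions made for the example.

```python
def lr_range_test(grad_fn, loss_fn, w0, start_lr=1e-5, end_lr=10.0, num_steps=100):
    """Take one gradient step per learning rate, raising the rate exponentially."""
    factor = (end_lr / start_lr) ** (1.0 / (num_steps - 1))
    w, lr = w0, start_lr
    best_loss = loss_fn(w)
    history = []
    for _ in range(num_steps):
        w = w - lr * grad_fn(w)
        loss = loss_fn(w)
        history.append((lr, loss))
        if loss > 4 * best_loss:  # assumed threshold: stop once the loss clearly diverges
            break
        best_loss = min(best_loss, loss)
        lr *= factor
    return history


def suggest_lr(history):
    """Heuristic: pick the learning rate at which the loss dropped most in one step."""
    best_lr, best_drop = history[0][0], 0.0
    for (_, prev_loss), (lr, loss) in zip(history, history[1:]):
        drop = prev_loss - loss
        if drop > best_drop:
            best_lr, best_drop = lr, drop
    return best_lr


# Toy problem: minimise (w - 3)^2 starting from w = 0.
history = lr_range_test(lambda w: 2 * (w - 3), lambda w: (w - 3) ** 2, w0=0.0)
suggested = suggest_lr(history)
```

Plotting the recorded losses against the learning rates on a logarithmic axis gives the same kind of curve as the plot above. In practice the choice is usually made by eye, picking a rate in the region where the loss is still falling steeply.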


## Training 

```
# using 1cycle learning rate schedule for 3 epochs
learner.fit_onecycle(2e-5, 3) 
```

<img style="float: left;" src="http://drive.google.com/uc?export=view&id=1SHxeCRIFCWgSOuVoa7BZJJOFcrW9S6uT" width=70% >

<div>
&nbsp;

&nbsp;

&nbsp;

&nbsp;
</div>
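
For intuition, a one-cycle style schedule raises the learning rate linearly up to a maximum and then lowers it again over the course of training. The sketch below shows only this triangular shape. It is a simplification: the starting rate of `max_lr / 10` is an assumed ratio, the momentum cycling described in the 1cycle paper is left out, and Ktrain's `fit_onecycle` may differ in its details.

```python
def one_cycle_lr(step, total_steps, max_lr, start_lr=None):
    """Triangular one-cycle schedule: linear warm-up to max_lr, then linear decay."""
    if total_steps < 2:
        raise ValueError("total_steps must be at least 2")
    if start_lr is None:
        start_lr = max_lr / 10.0  # assumed ratio, chosen for illustration only
    half = (total_steps - 1) / 2.0
    if step <= half:
        fraction = step / half                      # warm-up phase: 0 -> 1
    else:
        fraction = (total_steps - 1 - step) / half  # decay phase: 1 -> 0
    return start_lr + (max_lr - start_lr) * fraction


# Learning rate for every step of a (hypothetical) run of 101 training steps.
schedule = [one_cycle_lr(step, 101, max_lr=2e-5) for step in range(101)]
```

With `max_lr=2e-5`, as in the training call above, the rate starts at `2e-6`, peaks at `2e-5` halfway through, and returns to `2e-6` at the end.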


## "Explaining" model performance

One of the interesting things to look at when assessing performance is the set of examples with the highest loss, i.e. the ones the model got most wrong:
```
learner.view_top_losses(n=1, preproc=preproc)

```

Output

```
----------
id:8244 | loss:11.57 | true:neg | pred:pos)

is not like mickey rourke ever really disappeared he has had a steady string of appearances before he burst back on the scene he was memorable in domino sin city man on fire once upon a time in mexico and get carter but in his powerful dramatic performance in the wrestler 2008 we see a full blown presentation of the character only hinted at in get carter whenever we get to know him rourke remains a cool but sleazy muscle bound slim ball br br this is an leonard story and production leonard wrote such notable movies as western thriller 3 10 to yuma be cool jackie brown get 52 pick up and joe this means that we get tough guys some good some not so good br br it also means we get tight realistic plots with characters doing what is best for them in each situation weaving complications into violent conclusions is no different tough slim ball killer rourke stalks unhappily married witness lane think history of violence meets no country for old men it is not as intense bloody or gory as those two but it is almost as good if you like those two including david equally wonderful eastern promises you will like also br br director john has not done a lot of movies his last few were enjoyable if not successful proof captain and shakespeare in love br br diana lane hasn't had a powerful movie role since she and richard gere gave incredible performances in unfaithful lately she is charming and appealing in romantic stories such as nights in must love dogs and under the sun here she is right on mark balancing her sexy appeal with reserved tension br br this is a small part for rosario dawson yet dawson does a good job with it you see a lot more of lane including an underwear scene to rival sigourney weaver in aliens and nicole kidman in eyes wide shut br br while you are in the crime drama section also pick up kiss kiss bang bang and gone baby gone and before the devil knows your dead the last has wonderful performances by phillip seymour hoffman ethan hawke marisa tomei and albert finney br br flopped at the box office more is our luck it is certainly worth a 3 4 dollar rental if you like this genre 6 20 200
```
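
Conceptually, a "top losses" view computes the loss of every validation example separately and sorts the examples by it. The following is a minimal sketch of that idea, assuming a per-example cross-entropy loss with natural logarithm. `top_losses` and the toy probabilities are hypothetical and not taken from Ktrain.

```python
import math


def top_losses(true_labels, predicted_probs, n=1):
    """Return (index, loss, true, predicted) for the n highest-loss examples."""
    rows = []
    for idx, (true, probs) in enumerate(zip(true_labels, predicted_probs)):
        # Cross-entropy for one example: -log(probability given to the true class).
        loss = -math.log(max(probs[true], 1e-12))
        pred = max(probs, key=probs.get)
        rows.append((idx, loss, true, pred))
    return sorted(rows, key=lambda row: row[1], reverse=True)[:n]


# Toy validation set: true labels and the class probabilities a model assigned.
examples = ["neg", "pos", "neg"]
probabilities = [
    {"neg": 0.90, "pos": 0.10},
    {"neg": 0.40, "pos": 0.60},
    {"neg": 0.00001, "pos": 0.99999},
]
worst = top_losses(examples, probabilities, n=1)
```

Under that assumption, a loss of 11.57 as in the output above means the model gave the true class (`neg`) a probability of roughly 0.00001: it was almost certain the review was positive.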

Another very nice feature is that Ktrain uses [LIME](https://arxiv.org/abs/1602.04938) as an explanation toolkit to reason about why a class was assigned. (More on LIME [here](https://towardsdatascience.com/understanding-model-predictions-with-lime-a582fdff3a3b) and on how it works in Ktrain [here](https://nbviewer.jupyter.org/github/amaiya/ktrain/blob/master/tutorial-05-explaining-predictions.ipynb).)

<img src="http://drive.google.com/uc?export=view&id=14IvXNpPl4VY-XqWOkRNqYLlbxHy5YOTg">
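
To give a feel for perturbation-based explanations, the sketch below scores each word by how much the predicted probability changes when that word is removed. Note that this is plain leave-one-word-out occlusion, a simpler relative of LIME and not LIME itself: LIME samples many random perturbations and fits a weighted linear model to them. The toy classifier and its word weights are hypothetical.

```python
def word_importance(text, predict_proba):
    """Score each word by how much removing it lowers the predicted probability."""
    words = text.split()
    baseline = predict_proba(" ".join(words))
    scores = []
    for i, word in enumerate(words):
        reduced = " ".join(words[:i] + words[i + 1:])
        scores.append((word, baseline - predict_proba(reduced)))
    return scores


def toy_positive_probability(text):
    # Hypothetical classifier: a fixed word list standing in for a trained model.
    weights = {"wonderful": 0.3, "good": 0.2, "boring": -0.3}
    score = 0.5 + sum(weights.get(w, 0.0) for w in text.lower().split())
    return min(max(score, 0.0), 1.0)


scores = word_importance("a wonderful but boring film", toy_positive_probability)
```

Words with a positive score push the prediction towards the positive class, words with a negative score push it away, and words scoring zero do not matter to this toy model. This is the same kind of reading as the highlighted words in the picture above.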