ktrain is a lightweight wrapper for the deep learning library Keras to help build, train, and deploy neural networks. With only a few lines of code, ktrain allows you to easily and quickly:
- estimate an optimal learning rate for your model given your data using a Learning Rate Finder
- utilize learning rate schedules such as the triangular policy, the 1cycle policy, and SGDR to effectively minimize loss and improve generalization
- employ fast and easy-to-use pre-canned models for both text classification (e.g., BERT, NBSVM, fastText, GRUs with pretrained word vectors) and image classification (e.g., ResNet, Wide ResNet, Inception)
- load and preprocess text and image data from a variety of formats
- inspect data points that were misclassified to help improve your model
- leverage a simple prediction API for saving and deploying both models and data-preprocessing steps to make predictions on new raw data
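To make the learning rate schedules named above more concrete, here is a minimal sketch of how each one maps a training step to a learning rate. It is not ktrain's implementation: the function names and parameters (`half_cycle`, `div`, `cycle_len`) are hypothetical, and the 1cycle version is simplified to a single rise-and-fall cycle, leaving out the final decay phase and momentum cycling.

```python
import math


def triangular_lr(step, base_lr, max_lr, half_cycle):
    """Triangular policy: rise linearly to max_lr, fall back to base_lr, repeat."""
    cycle_pos = step % (2 * half_cycle)
    # frac is 0 at the start/end of a cycle and 1 at its midpoint
    frac = 1.0 - abs(cycle_pos / half_cycle - 1.0)
    return base_lr + (max_lr - base_lr) * frac


def onecycle_lr(step, total_steps, max_lr, div=10.0):
    """Simplified 1cycle: one triangular cycle spanning the whole run.

    Assumption: the starting rate is max_lr / div (an illustrative default).
    """
    base_lr = max_lr / div
    half = total_steps / 2.0
    frac = 1.0 - abs(step / half - 1.0)
    return base_lr + (max_lr - base_lr) * frac


def sgdr_lr(step, min_lr, max_lr, cycle_len):
    """SGDR: cosine decay from max_lr to min_lr, restarting every cycle_len steps."""
    pos = step % cycle_len
    return min_lr + 0.5 * (max_lr - min_lr) * (1.0 + math.cos(math.pi * pos / cycle_len))
```

All three keep the learning rate moving during training instead of holding it fixed; they differ in the shape of the curve and in whether the cycle repeats.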
Please see the following tutorial notebooks for a guide on how to use ktrain in your projects:
- Tutorial 1: Introduction
- Tutorial 2: Tuning Learning Rates
- Tutorial 3: Image Classification
- Tutorial 4: Text Classification
- Tutorial A1: Additional tricks, which covers topics such as examining misclassifications, inspecting intermediate output of Keras models for debugging, and built-in callbacks.
A broad overview of ktrain is also available as a Medium post.
Using ktrain on Google Colab? See this demo of Multiclass Text Classification with BERT.
Tasks such as text classification and image classification can be accomplished easily with only a few lines of code.
Example: Text Classification of IMDb Movie Reviews Using BERT

    import ktrain
    from ktrain import text as txt

    # load data
    (x_train, y_train), (x_test, y_test), preproc = txt.texts_from_folder(
        'data/aclImdb', maxlen=500,
        preprocess_mode='bert',
        train_test_names=['train', 'test'],
        classes=['pos', 'neg'])

    # load model
    model = txt.text_classifier('bert', (x_train, y_train))

    # wrap model and data in ktrain.Learner object
    learner = ktrain.get_learner(model,
                                 train_data=(x_train, y_train),
                                 val_data=(x_test, y_test),
                                 batch_size=6)

    # find good learning rate
    learner.lr_find()  # briefly simulate training to find good learning rate
    learner.lr_plot()  # visually identify best learning rate

    # train using 1cycle learning rate schedule for 3 epochs
    learner.fit_onecycle(2e-5, 3)
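The "briefly simulate training" step can be pictured as a learning rate range test: train for a few steps while growing the learning rate exponentially, record the loss at each step, and stop once the loss blows up. The sketch below illustrates the idea on a toy one-dimensional problem. It is not the code behind `learner.lr_find()`: the names `lr_range_test` and `suggest_lr`, the stopping factor, and the "steepest drop" heuristic are all assumptions made for illustration.

```python
def lr_range_test(loss_fn, grad_fn, w0, start_lr=1e-5, end_lr=10.0,
                  steps=100, stop_factor=4.0):
    """Run gradient descent while growing lr exponentially; return [(lr, loss), ...]."""
    mult = (end_lr / start_lr) ** (1.0 / (steps - 1))
    w, lr = w0, start_lr
    best = float("inf")
    history = []
    for _ in range(steps):
        loss = loss_fn(w)
        history.append((lr, loss))
        best = min(best, loss)
        if loss > stop_factor * best:
            break  # loss has diverged, so larger rates are not worth trying
        w = w - lr * grad_fn(w)
        lr *= mult
    return history


def suggest_lr(history):
    """Hypothetical heuristic: the lr at which the loss dropped the most in one step."""
    drops = [(history[i][1] - history[i + 1][1], history[i][0])
             for i in range(len(history) - 1)]
    return max(drops)[1]


# Toy problem: minimize (w - 3)^2 starting from w = 0.
history = lr_range_test(lambda w: (w - 3.0) ** 2,
                        lambda w: 2.0 * (w - 3.0),
                        w0=0.0)
print("steps run:", len(history))
print("suggested lr:", suggest_lr(history))
```

In practice the curve is usually inspected by eye, as `learner.lr_plot()` does above, and a rate is chosen from the region where the loss is still falling.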
Example: Classifying Images of Dogs and Cats Using a Pretrained ResNet50 Model

    import ktrain
    from ktrain import vision as vis

    # load data
    (train_data, val_data, preproc) = vis.images_from_folder(
        datadir='data/dogscats',
        data_aug=vis.get_data_aug(horizontal_flip=True),
        train_test_names=['train', 'valid'],
        target_size=(224,224), color_mode='rgb')

    # load model
    model = vis.image_classifier('pretrained_resnet50', train_data, val_data,
                                 freeze_layers=80)

    # wrap model and data in ktrain.Learner object
    learner = ktrain.get_learner(model=model, train_data=train_data, val_data=val_data,
                                 workers=8, use_multiprocessing=False, batch_size=64)

    # find good learning rate
    learner.lr_find()  # briefly simulate training to find good learning rate
    learner.lr_plot()  # visually identify best learning rate

    # train using triangular policy with ModelCheckpoint and implicit ReduceLROnPlateau and EarlyStopping
    learner.autofit(1e-4, checkpoint_folder='/tmp')
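The last comment in the example mentions ReduceLROnPlateau and EarlyStopping. As a rough picture of how these two mechanisms interact, the sketch below replays a list of per-epoch validation losses, halving the learning rate after a short run without improvement and stopping after a longer one. This is an illustration, not what `autofit` or the Keras callbacks execute: the function name `plateau_controller` and the patience and factor values are assumptions.

```python
def plateau_controller(val_losses, lr, reduce_patience=2, stop_patience=5, factor=0.5):
    """Replay validation losses; return (epochs_run, final_lr, best_epoch)."""
    best = float("inf")
    best_epoch = -1
    since_reduce = 0  # epochs without improvement since the last lr cut
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            # new best: this is where a checkpoint would be written
            best, best_epoch = loss, epoch
            since_reduce = 0
            continue
        since_reduce += 1
        if epoch - best_epoch >= stop_patience:
            return epoch + 1, lr, best_epoch  # early stop
        if since_reduce >= reduce_patience:
            lr *= factor  # reduce the learning rate on plateau
            since_reduce = 0
    return len(val_losses), lr, best_epoch


losses = [1.0, 0.8, 0.7, 0.71, 0.72, 0.73, 0.74, 0.75, 0.76, 0.5]
epochs_run, final_lr, best_epoch = plateau_controller(losses, lr=1e-4)
print(epochs_run, final_lr, best_epoch)
```

With these toy losses, training stops before the late improvement at the final epoch is ever seen, which is the usual trade-off of early stopping: a shorter patience saves time but can miss a late recovery.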
Additional examples can be found here.
To install ktrain, run:

    pip3 install ktrain
This code was tested on Ubuntu 18.04 LTS using Keras 2.2.4 with a TensorFlow 1.10 backend. There are a few portions of the code that may explicitly depend on TensorFlow, but such dependencies are kept to a minimum.
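To compare a local environment against the tested versions above, a small check along these lines may help. It is only a sketch: the helper `installed_version` is hypothetical, it relies on the standard-library `importlib.metadata` module (Python 3.8 or later), and the distribution names listed are assumed to match how the packages were installed.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string for a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Assumed distribution names; adjust if your packages were installed under other names.
for name in ("ktrain", "Keras", "tensorflow"):
    print(name, installed_version(name) or "not installed")
```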
Creator: Arun S. Maiya
Email: arun [at] maiya [dot] net