Smooth Imitation Learning for Online Sequence Prediction [SIMILE]

This is my implementation of the Smooth Imitation Learning algorithm for Online Sequence Prediction (Hoang M. Le et al., 2016). The algorithm trains policies that are constrained to make smooth predictions in a continuous action space, given sequential input from an exogenous environment and the previous actions taken by the policy.
I previously used this algorithm to train a policy for automated video editing, where smooth predictions are important for producing aesthetically pleasing videos. You can find more details about this project and its results on my project page.
This implementation is intended to make training policies with the SIMILE algorithm available to any application.

Installation

Clone the repository, create a virtual environment, then go to the base folder and run:

    pip install -r requirements.txt

Once the command finishes running, you're ready to use the library.

Getting Started

To run Simile in training or prediction mode, use train_simile.py or test_simile.py. Both scripts only require a config file as input. These config files contain the parameters used for training and testing, as well as the paths to your data. Setting up these config files properly is key to using this library successfully. I included two reference config files, config_train.ini and config_test.ini, which you can modify according to your application and needs. However, a good understanding of how the algorithm works is very important for making parameter choices that best fit your data, so please consider reading the paper before anything else.

A good way to get started is to make sure the code runs properly on your machine. For that purpose only, I included a pre-trained model with this repository (in the ReleaseModel directory) and a test case example (in the Data directory). The parameters for this specific test case are defined in config_test.ini, so all you need to do is run:

    python test_simile.py

If everything is working as expected, you should find all resulting plots in the ReleaseModel/Plots directory.

Usage

Training

You will need to modify config_train.ini with parameters of your choice, and run:

    python train_simile.py

Alternatively, to use a config file located in a different directory, run:

    python train_simile.py -config Path/to/config_file

Test

You will need to modify config_test.ini with parameters of your choice and the path to a trained model, and run:

    python test_simile.py

Alternatively, to use a config file located in a different directory, run:

    python test_simile.py -config Path/to/config_file

Preparing config files

The library takes config files as input. In the reference config files (config_train.ini and config_test.ini), you'll find headers defined between [] and their respective variables listed right below. Here you can find documentation about headers and variable names to be used with these config files.

You can also find a more detailed discussion about Simile and how it was a good fit for my application on my project page.
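
Before editing a config file, it can help to list which headers and variables it defines. The following is a minimal sketch using Python's standard configparser module, not the library's own loading code. The header name DATA, the helper summarize_config, and the path Data/train.xml are hypothetical and used only for illustration. The variable names train_file and test_file and the path Data/test.xml come from this README.

```python
import configparser

# Hypothetical config contents for illustration only. The real header and
# variable names are described in the documentation for the reference configs.
SAMPLE_CONFIG = """
[DATA]
train_file = Data/train.xml
test_file = Data/test.xml
"""


def summarize_config(text):
    """Return {header: {variable: value}} for an INI-style config string."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {section: dict(parser[section]) for section in parser.sections()}


summary = summarize_config(SAMPLE_CONFIG)
for header, variables in summary.items():
    print(f"[{header}]")
    for name, value in variables.items():
        print(f"  {name} = {value}")
```

To inspect one of the reference files instead of the sample string, read its contents from disk and pass them to summarize_config.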

Preparing Data

The data is fed to the library in XML format. The paths to these files are defined in the config files by train_file, valid_file, and test_file.

These XML files should contain a list of paths to your data episodes, with the data arranged in the following format:

1. Each episode should be arranged as a numpy array in which each row contains both the environment features and the corresponding expert demonstration:

   row : [ X : environment features | Y : expert demonstration (labels) ]

Moreover, each row of the array corresponds to a single time step. For example, row 0 contains the environment features (X) and expert demonstration (Y) at time=0, and each subsequent row contains the features and demonstration at the next time step, so that the final numpy array has shape (number_time_frames, n_env_features + n_labels). In other words, row indices in the numpy array correspond to time frames.

2. Each episode should be arranged in a single numpy array, and saved in a single dedicated pickle file.

3. The path to each pickle file (episode) should be listed in an XML file, and this XML file is what you feed to the library to train and test your model.
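
The episode layout in steps 1 and 2 can be sketched as follows. This is an illustrative example under stated assumptions, not code from the library: the sizes, the random data, and the file name episode_000.pkl are made up, and it assumes the array is written directly with pickle.dump. Compare against the example episodes in Data/Files before relying on it.

```python
import os
import pickle
import tempfile

import numpy as np

# Illustrative sizes (assumptions, not values required by the library).
n_frames, n_env_features, n_labels = 5, 3, 1

# X: environment features, Y: expert demonstration, one row per time step.
X = np.random.rand(n_frames, n_env_features)
Y = np.random.rand(n_frames, n_labels)

# Each row becomes [ X | Y ], giving shape (n_frames, n_env_features + n_labels).
episode = np.hstack([X, Y])

# One episode per pickle file (hypothetical file name, written to a temp dir).
path = os.path.join(tempfile.mkdtemp(), "episode_000.pkl")
with open(path, "wb") as f:
    pickle.dump(episode, f)

# Loading the episode back and splitting it into features and labels.
with open(path, "rb") as f:
    loaded = pickle.load(f)
X_loaded = loaded[:, :n_env_features]
Y_loaded = loaded[:, n_env_features:]
print(loaded.shape)
```

The column split on load only works if every episode uses the same number of feature columns, so it is worth keeping n_env_features consistent across all episodes.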

Notes:

- You can find an example XML file at Data/test.xml and example episode arrays in Data/Files.
- The helper script Helpers/create_xml.py lists all pickle files inside a specified directory and saves them to an XML file.

Reference

Hoang M. Le, Andrew Kang, Yisong Yue, Peter Carr. Smooth Imitation Learning for Online Sequence Prediction. ICML, 2016.

Author

Luciana Cendon
- Research Engineer working in the fields of Machine Learning and Computer Vision.
- Contact: luciana.hpcendon@gmail.com
- Linkedin: https://www.linkedin.com/in/luciana-cendon/
