lucianacendon/simile

Smooth Imitation Learning for Online Sequence Prediction [SIMILE]

This is my implementation of the Smooth Imitation Learning algorithm for Online Sequence Prediction (Hoang M. Le et al., 2016). The algorithm trains policies that are constrained to make smooth predictions in a continuous action space, given sequential input from an exogenous environment and the actions previously taken by the policy.
I previously used this algorithm to train a policy for automated video editing, where smooth predictions are important for producing aesthetically pleasing videos. You can find more details about this project and its results on my project page.
This implementation is intended to make training policies with the Simile algorithm available to any application.

Installation

Clone the repository, create a virtual environment, then go to the base folder and run:

    pip install -r requirements.txt

Once the command finishes running, you're ready to use the library.

Getting Started

To run Simile in training or prediction mode, use train_simile.py or test_simile.py. Both scripts require only a config file as input. These config files contain the parameters used for training and testing, as well as the paths to your data. Setting up these config files properly is key to using this library successfully. I included two reference config files, config_train.ini and config_test.ini, which you can modify according to your application and needs. However, a good understanding of how the algorithm works is very important for making parameter choices that best fit your data, so please consider reading the paper before anything else.

A good way to get started is to make sure the code runs properly on your machine. For that purpose only, I included a pre-trained model with this repository (in the ReleaseModel directory) and an example test case (in the Data directory). The parameters for this specific test case are defined in config_test.ini, so all you need to do is run:

    python test_simile.py

If everything is working as expected, you should find all resulting plots inside directory ReleaseModel/Plots.

Usage

Training

You will need to modify config_train.ini with parameters of your choice, and run:

    python train_simile.py

Alternatively, you can use a config file located in a different directory by running:

    python train_simile.py -config Path/to/config_file

Test

You will need to modify config_test.ini with parameters of your choice and the path to a trained model, and run:

    python test_simile.py

Alternatively, you can use a config file located in a different directory by running:

    python test_simile.py -config Path/to/config_file

Preparing config files

The library takes config files as input. In the reference config files (config_train.ini and config_test.ini), you'll find headers defined between [] and their respective variables listed right below. Here you can find documentation about headers and variable names to be used with these config files.
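
As a rough illustration of the header/variable layout described above, the sketch below parses INI-style text with Python's standard configparser module and lists each header with its variables. The [DATA] header and the file paths are hypothetical placeholders and are not taken from the reference config files; only the variable names train_file and valid_file come from this README.

```python
import configparser

# Hypothetical config text: the [DATA] header and the paths are placeholders,
# not copied from config_train.ini.
EXAMPLE_CONFIG = """
[DATA]
train_file = Data/train.xml
valid_file = Data/valid.xml
"""


def list_config(text):
    """Return {header: {variable: value}} for INI-formatted text."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {section: dict(parser[section]) for section in parser.sections()}


if __name__ == "__main__":
    # Print every header followed by its variables, mirroring the file layout.
    for header, variables in list_config(EXAMPLE_CONFIG).items():
        print(f"[{header}]")
        for name, value in variables.items():
            print(f"  {name} = {value}")
```

Listing a config this way can be a quick check that a header or variable name was not misspelled before starting a long training run.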

You can also find a more detailed discussion about Simile and how it was a good fit for my application on my project page.

Preparing Data

The data is fed to the library through XML files. The paths to these files are defined in the config files by train_file, valid_file, and test_file.

These XML files should contain a list of paths to your data episodes, with the data arranged in the following format:

1. Each episode should be arranged in a numpy array such that each row contains both the features and the corresponding expert demonstration:

   row : [ X : environment features | Y : expert demonstration (labels) ]

Moreover, each row of the array corresponds to a single time step. For example, row 0 contains the environment features (X) and expert demonstration (Y) at time=0, and each subsequent row contains the environment features (X) and expert demonstration (Y) at the next time step, so that the final numpy array has shape (number_time_frames, n_env_features + n_labels). In other words, row numbers in the numpy array correspond to their respective time frames.

2. Each episode should be arranged in a single numpy array, and saved in a single dedicated pickle file.

3. The path to each pickle file (episode) should be listed in an XML file, and this XML file is what you feed to the library to train and test your model.

Notes:

  • You can find an example XML file at Data/test.xml and example episode arrays at Data/Files.
  • You can find a helper script at Helpers/create_xml.py that lists all pickle files inside a specified directory and saves them to an XML file.
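
The episode layout described in steps 1 and 2 can be sketched as follows. This is a minimal example under stated assumptions: the function names, array sizes, and file name are hypothetical, the file is written with a plain pickle.dump call, and the labels are assumed to occupy the last columns as in the [ X | Y ] row format above. Check the example arrays in Data/Files for the exact conventions the library expects.

```python
import os
import pickle
import tempfile

import numpy as np

# Hypothetical sizes, chosen only for illustration.
N_FRAMES = 100        # number of time frames in the episode
N_ENV_FEATURES = 4    # number of environment feature columns (X)
N_LABELS = 1          # number of expert demonstration columns (Y)


def build_episode(features, labels):
    """Stack X and Y column-wise so that row t is [X_t | Y_t]."""
    features = np.asarray(features)
    labels = np.asarray(labels)
    if labels.ndim == 1:
        # Turn a flat label vector into a single column.
        labels = labels.reshape(-1, 1)
    if features.shape[0] != labels.shape[0]:
        raise ValueError("X and Y must have one row per time frame")
    return np.hstack([features, labels])


def save_episode(episode, path):
    """Save one episode array to its own pickle file."""
    with open(path, "wb") as f:
        pickle.dump(episode, f)


def load_episode(path, n_labels):
    """Load an episode and split it back into (X, Y)."""
    with open(path, "rb") as f:
        episode = pickle.load(f)
    return episode[:, :-n_labels], episode[:, -n_labels:]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(N_FRAMES, N_ENV_FEATURES))
    Y = np.cumsum(rng.normal(size=N_FRAMES))  # synthetic stand-in for expert labels
    episode = build_episode(X, Y)

    path = os.path.join(tempfile.mkdtemp(), "episode_000.pkl")  # hypothetical name
    save_episode(episode, path)
    X_loaded, Y_loaded = load_episode(path, N_LABELS)
    print(episode.shape, X_loaded.shape, Y_loaded.shape)
```

Loading the file back and checking the shapes before listing it in the XML file is a cheap way to catch a transposed array or a missing label column.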

Reference

Hoang M. Le, Andrew Kang, Yisong Yue, Peter Carr. Smooth Imitation Learning for Online Sequence Prediction. ICML, 2016.
