This is my implementation of the Smooth Imitation Learning algorithm for Online Sequence Prediction (Hoang M. Le et al., 2016). This algorithm allows one to train policies that are constrained to make smooth predictions in a continuous action space, given sequential input from an exogenous environment and the previous actions taken by the policy.
I previously used this algorithm to train a policy for automated video editing. When editing videos, the predictions must be smooth in order to produce aesthetically pleasing results. You can find more details about this project and its results on my project page.
This implementation is intended to make training policies with the Simile algorithm available to any application.
Clone the repository, create a virtual environment, then go to the base folder and run:
pip install -r requirements.txt
Once the command finishes running, you're ready to use the library.
To run Simile in training or prediction mode, use train_simile.py or test_simile.py. Both scripts only require a config file as input. These config files contain the parameters used for training and testing, as well as the paths to your data. Setting up these config files properly is key to using this library successfully. I included two reference config files, config_train.ini and config_test.ini, which you can modify according to your application and needs. However, a good understanding of how the algorithm works is very important for making parameter choices that best fit your data, so please consider reading the paper before anything else.
A good way to get started is to make sure the code is running properly on your machine. Only for that purpose, I included a pre-trained model with this repository (inside directory ReleaseModel) and a test case example (inside directory Data). The parameters for this specific test case are defined in config_test.ini, so all you need to do is run:
python test_simile.py
If everything is working as expected, you should find all resulting plots inside directory ReleaseModel/Plots.
You will need to modify config_train.ini with parameters of your choice, and run:
python train_simile.py
Alternatively, you can use a config file located in a different directory by running:
python train_simile.py -config Path/to/config_file
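The -config flag above takes the path to an alternative config file. Purely as an illustration, a script could accept such a flag with argparse as sketched below. This is not the repository's actual code: the parse_args helper and the default value are assumptions made for the example.

```python
import argparse


def parse_args(argv=None):
    # Hypothetical parser: the real train_simile.py may be implemented differently.
    parser = argparse.ArgumentParser(description="Train a Simile policy (sketch)")
    parser.add_argument(
        "-config",
        default="config_train.ini",  # assumed default: the reference config in the base folder
        help="path to the config file",
    )
    return parser.parse_args(argv)


# Equivalent to: python train_simile.py -config Path/to/config_file
args = parse_args(["-config", "Path/to/config_file"])
print(args.config)
```

When the flag is omitted, this sketch falls back to the reference config file in the current directory.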
You will need to modify config_test.ini with parameters of your choice and the path to a trained model, and run:
python test_simile.py
Alternatively, you can use a config file located in a different directory by running:
python test_simile.py -config Path/to/config_file
The library takes config files as input. In the reference config files (config_train.ini and config_test.ini), you'll find headers defined between [] and their respective variables listed right below. Here you can find documentation about headers and variable names to be used with these config files.
You can also find a more detailed discussion about Simile and how it was a good fit for my application on my project page.
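To get a feel for the header and variable layout of these .ini files, the sketch below parses a small config with Python's standard configparser module. The section names ([data], [training]), the n_epochs variable, and the Data/train.xml and Data/valid.xml paths are hypothetical placeholders; only train_file and valid_file are variable names taken from this README, so check the reference config files for the real headers.

```python
import configparser

# Hypothetical config contents: section names, n_epochs, and the paths are
# placeholders. Only the train_file / valid_file keys come from this README.
EXAMPLE_CONFIG = """
[data]
train_file = Data/train.xml
valid_file = Data/valid.xml

[training]
n_epochs = 10
"""

parser = configparser.ConfigParser()
parser.read_string(EXAMPLE_CONFIG)

# List every header and the variables defined right below it.
for section in parser.sections():
    for name, value in parser[section].items():
        print(f"[{section}] {name} = {value}")

# Values are stored as strings; typed getters convert them on request.
train_file = parser.get("data", "train_file")
n_epochs = parser.getint("training", "n_epochs")
```

Running the same loop over config_train.ini (with parser.read instead of parser.read_string) is a quick way to inspect which headers and variables a config actually defines.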
The data is fed to the library in XML format. The paths to these files are defined in the config files by train_file, valid_file, and test_file.
These XML files should contain a list of paths to your data episodes, with the data arranged in the following format:
1. Each episode should be arranged in a numpy array such that each row contains both the features and the corresponding expert demonstration:
row : [ X : environment features | Y : expert demonstration (labels) ]
Moreover, each row of the array corresponds to a single time step. For example, row 0 contains the environment features (X) and expert demonstration (Y) at time=0, and each subsequent row contains the features and demonstration at the next time step, forming a final numpy array of shape (number_time_frames, n_env_features + n_labels). In other words, row numbers in the numpy array correspond to their respective time frames.
2. Each episode should be arranged in a single numpy array, and saved in a single dedicated pickle file.
3. The path to each pickle file (episode) should be listed in an XML file, and this XML file will be fed to the library to train and test your model.
- You can find an example XML file at Data/test.xml and example episode arrays at Data/Files.
- You can find a helper script that lists all pickle files inside a specified directory and saves them to an XML file at Helpers/create_xml.py.
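The episode layout described above can be sketched as follows. This is a minimal example under stated assumptions: the array sizes, the random data, and the file name episode_000.pkl are hypothetical, and it assumes the array is written with pickle.dump directly. Compare against the example episodes in Data/Files before relying on it.

```python
import os
import pickle
import tempfile

import numpy as np

# Hypothetical sizes, chosen only for illustration.
n_frames, n_env_features, n_labels = 100, 4, 1

rng = np.random.default_rng(0)
X = rng.normal(size=(n_frames, n_env_features))  # environment features, one row per time frame
Y = rng.normal(size=(n_frames, n_labels))        # expert demonstration, one row per time frame

# Row t of the episode is [ X_t | Y_t ].
episode = np.hstack([X, Y])

# One episode per pickle file (hypothetical file name, written to a temp directory).
episode_path = os.path.join(tempfile.mkdtemp(), "episode_000.pkl")
with open(episode_path, "wb") as f:
    pickle.dump(episode, f)

# Loading the episode back and splitting it into features and labels.
with open(episode_path, "rb") as f:
    loaded = pickle.load(f)
X_loaded = loaded[:, :n_env_features]
Y_loaded = loaded[:, n_env_features:]

print(loaded.shape)
```

The path of each file written this way would then be listed in the XML file, for instance with Helpers/create_xml.py.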
Hoang M. Le, Andrew Kang, Yisong Yue, Peter Carr. Smooth Imitation Learning for Online Sequence Prediction. ICML, 2016.
- Luciana Cendon
- Research Engineer working in the fields of Machine Learning and Computer Vision.
- Contact: luciana.hpcendon@gmail.com
- Linkedin: https://www.linkedin.com/in/luciana-cendon/