Active Learning: Predictors, Recommenders and Labellers

Acton - A scientific research assistant

Acton is a modular Python library for active learning. Acton is also the suburb of Canberra where the Australian National University is located.

Dependencies

Most dependencies will be installed by pip. You will need to manually install:

Setup

Install Acton using pip3:

pip3 install git+https://github.com/chengsoonong/acton.git

This provides access to a command-line tool acton as well as the acton Python library.

Acton CLI

The command-line interface to Acton is available through the acton command. This takes a dataset of features and labels and simulates an active learning experiment on that dataset.
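As a rough illustration of what simulating an active learning experiment on a fully labelled dataset involves, the following is a minimal sketch in plain Python. It is not Acton's implementation: the `simulate` function, the nearest-class-mean predictor, the one-dimensional toy data, and scoring on the whole dataset are all assumptions made for this example.

```python
import random


def simulate(features, labels, n_epochs, rng):
    """Label one extra point per epoch and return the accuracy at each epoch."""
    indices = list(range(len(features)))
    # Start with one labelled example per class so the predictor can be fitted.
    labelled = [labels.index(0), labels.index(1)]
    accuracies = []
    for _ in range(n_epochs):
        # Predictor: nearest class mean, fitted on the labelled points only.
        means = {}
        for c in (0, 1):
            xs = [features[i] for i in labelled if labels[i] == c]
            means[c] = sum(xs) / len(xs)

        def predict(x):
            return min(means, key=lambda c: abs(x - means[c]))

        correct = sum(predict(features[i]) == labels[i] for i in indices)
        accuracies.append(correct / len(indices))

        # Recommender: a random choice among the still-unlabelled points.
        pool = [i for i in indices if i not in labelled]
        if not pool:
            break
        # Labeller: the dataset already holds the true label, so "labelling"
        # a point just means making that label visible to the predictor.
        labelled.append(rng.choice(pool))
    return accuracies


# Hypothetical toy data: one feature whose mean depends on the binary label.
rng = random.Random(0)
labels = [i % 2 for i in range(50)]
features = [rng.gauss(3.0 * y, 1.0) for y in labels]
accuracies = simulate(features, labels, n_epochs=10, rng=rng)
print(accuracies)
```

The list of per-epoch accuracies is the kind of data a learning curve is drawn from.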

Input

Acton supports three dataset formats: ASCII, pandas, and HDF5. ASCII tables can be any file readable by astropy.io.ascii.read, which covers many common plain-text table formats such as CSV. pandas tables are supported if they were written to a file with DataFrame.to_hdf. HDF5 tables are either an HDF5 file with one dataset per feature and a dataset for labels, or an HDF5 file with one multidimensional dataset for features and one dataset for labels.

Output

Acton outputs a file containing predictions for each epoch of the simulation. These are encoded as specified in this notebook.

Quickstart

You will need a dataset. Acton currently supports ASCII tables (anything that can be read by astropy.io.ascii.read), HDF5 tables, and pandas tables saved as HDF5. Here's a simple classification dataset that you can use.
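If you would rather generate your own toy dataset, the sketch below writes a headerless, whitespace-delimited ASCII table with two feature columns and one binary label column, using only the standard library. The function names and the file name toy_classification.txt are hypothetical. The sketch also assumes that astropy.io.ascii.read names the columns of a headerless table col1, col2, col3, which matches the col10/col20 style of names used in this quickstart.

```python
import random


def make_toy_rows(n_rows=100, seed=0):
    """Return (feature_1, feature_2, label) tuples with alternating binary labels."""
    rng = random.Random(seed)
    rows = []
    for i in range(n_rows):
        label = i % 2
        # Shift the feature means by the label so the classes are roughly separable.
        x1 = rng.gauss(2.0 * label, 1.0)
        x2 = rng.gauss(-2.0 * label, 1.0)
        rows.append((x1, x2, label))
    return rows


def format_table(rows):
    """Format rows as a headerless, whitespace-delimited plain-text table."""
    return "".join(f"{x1:.4f} {x2:.4f} {label}\n" for x1, x2, label in rows)


if __name__ == "__main__":
    # Hypothetical output file name, chosen for this example only.
    with open("toy_classification.txt", "w") as f:
        f.write(format_table(make_toy_rows()))
```

Under the column-naming assumption above, such a file would be passed to acton with --feature col1 --feature col2 --label col3.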

To run Acton to generate a passive learning curve with logistic regression:

acton --data classification.txt --label col20 --feature col10 --feature col11 -o passive.pb --recommender RandomRecommender --predictor LogisticRegression

This command uses columns col10 and col11 as features and col20 as labels, with a logistic regression predictor and random recommendations. It writes all predictions for test data points, which are selected randomly from the input data, to passive.pb; this file can then be used to construct a plot. To output an active learning curve using uncertainty sampling instead, change RandomRecommender to UncertaintyRecommender.
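The difference between the two recommendation strategies can be sketched in a few lines of plain Python. This is a generic illustration of random selection versus uncertainty sampling for binary labels, not the code behind Acton's RandomRecommender or UncertaintyRecommender; the function names and the probabilities in `pool` are made up for the example.

```python
import random


def random_recommender(probabilities, rng):
    """Passive baseline: pick any unlabelled index uniformly at random."""
    return rng.choice(sorted(probabilities))


def uncertainty_recommender(probabilities):
    """Uncertainty sampling for binary labels: pick the index whose predicted
    probability of the positive class is closest to 0.5."""
    return min(probabilities, key=lambda i: abs(probabilities[i] - 0.5))


# Hypothetical predicted probabilities for four unlabelled points, keyed by index.
pool = {0: 0.95, 1: 0.52, 2: 0.10, 3: 0.70}

print(uncertainty_recommender(pool))  # 1, since 0.52 is closest to 0.5
print(random_recommender(pool, random.Random(0)) in pool)  # True
```

Random selection ignores the predictor entirely, whereas uncertainty sampling asks for labels where the current predictor is least sure.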

To show the learning curve, use acton.plot:

python3 -m acton.plot passive.pb

See the examples directory for more examples.

Acknowledgements

Matthew Alger was funded in late 2016 by CAASTRO.