NEXT is a system that makes it easy to develop, evaluate, and apply active learning.
Talks give a good brief introduction to NEXT at the highest level. For scientists and developers, we recommend the PyData Ann Arbor talk most; it is an enhanced and refined version of the SciPy talk.
|Talk|Audience|Length|Link|
|---|---|---|---|
|PyData Ann Arbor|Scientists and developers|1 hour|https://www.youtube.com/watch?v=rTyu4QTXZTc|
|SciPy 2017|Scientific Python developers|30 minutes|https://www.youtube.com/watch?v=blPjDYCvppY|
|Simons Institute conference on Interactive Learning|Machine learning researchers|30 minutes|https://youtu.be/ESXgbZQ1ZTk?t=1732|
The SciPy 2017 proceedings give more detail on launching experiments and getting set up: http://conference.scipy.org/proceedings/scipy2017/pdfs/scott_sievert.pdf
We have an experimental AMI that can be used to run NEXT in a purely application-based environment rather than a development environment. The AMI includes a basic version of our frontend. It is still highly experimental, and we give no guarantee that it is up to date with the current code. For more info please visit here.
NEXT/next. Tests will be run from your local machine but
will ping an EC2 server to simulate a client.
Individual files can also be run with
will only run
test_api.py and allow relative imports (which allows
from next.utils import timeit).
stdout can be captured with the
-s flag for
pytest is installable with
pip install pytest and has a strict backwards
Getting the code
You can download the latest version of NEXT from GitHub with the following clone command:
$ git clone https://github.com/nextml/NEXT.git
We are actively working to develop and improve NEXT, but users should be aware of the following caveats:
- NEXT currently supports only UNIX-based operating systems (i.e. Windows compatibility is not yet available).
- An Amazon Web Services account is needed to launch NEXT on EC2; we have worked hard to make this process as simple as possible, at the cost of making it harder to run the full NEXT stack on a local machine. We plan to make NEXT usable on a personal computer in the future.
Launching NEXT on EC2
First, you must set your Amazon Web Services (AWS) account credentials as environment variables. If you don't already have an AWS account, you can follow our AWS account quickstart here or the official AWS account set-up guide here for an in-depth introduction. Make sure you have access to:
- AWS access key id
- AWS secret access key
- Key Pair (pem file)
Make sure to note down the region that your key pair was made in. By default, the script assumes the region is Oregon (us-west-2). If you choose to use a different region, make sure to specify it every time you use the next_ec2.py script (e.g. --region=us-west-2). For example, after selecting the region "Oregon," the region us-west-2 is specified on the EC2 dashboard. If another region is used, an --ami option has to be included. For ease, we recommend using the Oregon region.
Export your AWS credentials as environment variables using:
$ export AWS_SECRET_ACCESS_KEY=[your_secret_aws_access_key_here]
$ export AWS_ACCESS_KEY_ID=[your_aws_access_key_id_here]
Note that you'll need your AWS_ACCESS_KEY_ID again later, so save your credentials in a secure place for convenient reference.
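Before moving on, it can help to confirm that the credentials are actually visible to child processes. The following is a minimal sketch, not part of NEXT: `missing_env_vars` is a hypothetical helper, and the only names taken from this guide are the two environment variables exported above.

```python
import os

# Variable names taken from the export commands above.
REQUIRED_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_env_vars(required=REQUIRED_VARS, environ=None):
    """Return the names in `required` that are unset or empty.

    Hypothetical helper for illustration; `environ` defaults to the
    current process environment and can be replaced with a dict.
    """
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]


missing = missing_env_vars()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
else:
    print("All required AWS environment variables are set.")
```

Running this in the same terminal where you exported the variables should report nothing missing; a new terminal will need the exports repeated.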
Install the local python packages needed for NEXT:
$ cd NEXT
$ sudo pip install -r local_requirements.txt
Throughout the rest of this tutorial, we will be using the startup script heavily. For more options and instructions, run python next_ec2.py without any arguments. Additionally, python next_ec2.py -h will provide helper options.
For persistent data storage, we first need to create a bucket in AWS S3 using:
$ cd ec2
$ python next_ec2.py --key-pair=[keypair] --identity-file=[key-file] createbucket [cluster-name]
- [keypair] is the name of your EC2 key pair
- [key-file] is the private key file for your key pair
- [cluster-name] is the custom name you create and assign to your cluster
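Every next_ec2.py call in this guide follows the same shape: the key pair and identity file options, an optional region, then an action and the cluster name. The sketch below assembles that argument list in Python. `build_next_ec2_command` is a hypothetical helper, not part of NEXT, and the key pair, key file, and cluster names in the example are placeholders; only the flags shown in this guide are used.

```python
def build_next_ec2_command(action, cluster_name, key_pair, identity_file,
                           region=None):
    """Assemble the argument list for one next_ec2.py call.

    Hypothetical helper for illustration. Options come first, followed by
    the action and the cluster name, mirroring the commands in this guide.
    """
    cmd = [
        "python", "next_ec2.py",
        "--key-pair=" + key_pair,
        "--identity-file=" + identity_file,
    ]
    if region is not None:
        # Only needed when the key pair was not created in the default region.
        cmd.append("--region=" + region)
    cmd += [action, cluster_name]
    return cmd


# Placeholder values; substitute your own key pair, key file, and cluster name.
print(" ".join(build_next_ec2_command(
    "createbucket", "my-cluster", "my-keypair", "my-keypair.pem")))
```

The returned list can be passed to `subprocess.run(cmd, cwd="ec2", check=True)` from the repository root, assuming the same environment variables are exported in the calling process.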
This will print out another environment variable command
export AWS_BUCKET_NAME=[bucket_uid]. Copy and paste this command into your terminal.
You will also need to use your bucket_uid later, so save it in a file alongside your AWS_ACCESS_KEY_ID for later reference.
Now you are ready to fire up the NEXT system using our
launch command. This
command will create a new EC2 instance, pull the NEXT repository to that
instance, install all of the relevant Docker images, and finally run all Docker
WARNING: Users should note that this script launches a single m3.large machine, the current default NEXT EC2 instance type. This instance type costs $0.14 per hour to run. For more detailed EC2 pricing information, refer to this AWS page. You can specify the instance type you want with the
$ python next_ec2.py --key-pair=[keypair] --identity-file=[key-file] launch [cluster-name]
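Because the instance bills for as long as it runs, a rough cost estimate is worth doing before leaving an experiment up. This is a back-of-the-envelope sketch only: the $0.14 hourly rate is the figure quoted in the warning above and may not match current AWS pricing, and `estimated_cost` is a hypothetical helper.

```python
# Hourly price quoted in this guide for m3.large; an assumption that
# should be checked against current AWS pricing before relying on it.
HOURLY_RATE_USD = 0.14


def estimated_cost(hours, instances=1, hourly_rate=HOURLY_RATE_USD):
    """Return the approximate cost in USD, rounded to cents."""
    if hours < 0 or instances < 0:
        raise ValueError("hours and instances must be non-negative")
    return round(hours * instances * hourly_rate, 2)


# One instance left running for a day and for a 30-day month.
print(estimated_cost(24))
print(estimated_cost(24 * 30))
```

The estimate covers only instance hours; storage and data transfer are billed separately and are not modeled here.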
Once your terminal shows a stream of many multi-colored docker appliances, you are successfully running the NEXT system!
Replicating NEXT adaptive learning experiments
Because NEXT aims to make it easy to reproduce empirical active learning results, we provide a simple command to initialize the experiments performed in this study.
First, in a new terminal, export your AWS credentials and use
get-master to obtain your public EC2 DNS.
$ export AWS_BUCKET_NAME=[your_aws_bucket_name_here]
$ cd NEXT/ec2
$ python next_ec2.py --key-pair=[keypair] --identity-file=[key-file] get-master [cluster-name]
Then export this public EC2 DNS.
$ export NEXT_BACKEND_GLOBAL_HOST=[your_public_ec2_DNS_here]
$ export NEXT_BACKEND_GLOBAL_PORT=8000
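The two variables above together identify where the NEXT backend is listening. As an illustration of how a script might combine them, here is a minimal sketch. `backend_base_url` is a hypothetical helper, and both the http scheme and the fallback port of 8000 are assumptions based on the export commands above, not confirmed behavior of NEXT.

```python
import os


def backend_base_url(environ=None):
    """Build a base URL from the NEXT_BACKEND_GLOBAL_* variables.

    Hypothetical helper for illustration. Raises KeyError if the host
    variable is not set; the port falls back to 8000 (an assumption).
    """
    env = os.environ if environ is None else environ
    host = env["NEXT_BACKEND_GLOBAL_HOST"]
    port = env.get("NEXT_BACKEND_GLOBAL_PORT", "8000")
    # The http scheme is an assumption; adjust if your deployment differs.
    return "http://{}:{}".format(host, port)


# Placeholder host name; use the public DNS printed by get-master.
print(backend_base_url({
    "NEXT_BACKEND_GLOBAL_HOST": "ec2-host.example.com",
    "NEXT_BACKEND_GLOBAL_PORT": "8000",
}))
```

Calling `backend_base_url()` with no argument reads the values from the current environment, so it fails fast with a KeyError if the host export was skipped.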
Now you can execute
run_examples.py to initialize and launch the NEXT experiments.
$ cd ../examples
$ python run_examples.py
Once initialized, this script will return a link that you can distribute yourself or post as a HIT on Mechanical Turk. Visit:
[exp_key] are unique identifiers for each of the
Dueling Bandits Pure Exploration, Active Non-Metric Multidimensional
Scaling (MDS), and Tuple Bandits Pure Exploration experiments, respectively. See
for a little more information.
Navigate to the
strange_fruit_triplet query link (the last one that printed
out to your terminal) and answer some questions! Doing so will provide the
system with data you can view and interact with in the next step.
Accessing NEXT experiment results, dashboards, and data visualizations
You can access interactive experiment dashboards and data visualizations by clicking experiments at:
To obtain all logs for an experiment through our RESTful API, visit:
[exp_uid] corresponds to the unique Experiment ID shown on the experiment dashboard pages.
If you'd like to back up your database to access your data later, refer to this wiki for detailed steps.
Finally, you can terminate your EC2 instance and shut down NEXT using:
$ cd ../ec2
$ python next_ec2.py --key-pair=[keypair] --identity-file=[key-file] destroy [cluster-name]