Score essays automatically with an easy web interface.


Scan is an easy-to-use server that lets you score essays automatically. Follow the quickstart instructions to get everything up and running.

Note: This is a relatively new project. It has good unit test coverage, and has some manual testing (not just by me), but please test it yourself before using in anything critical.



The easiest way to get started is with a Vagrant virtual machine:

First, install VirtualBox.

Next, install Vagrant

Then clone this repo. If you are unfamiliar with git, please first install git and then look at the basics of cloning a repo.

git clone

Then we have to navigate to the directory and start up vagrant from the command line:

cd scan
vagrant up

This should take 20-30 minutes to download and install dependencies on newer machines.

Congrats! Visiting in your browser will now let you use Scan.

If the VirtualBox VM runs out of memory (for example, if models fail to build), increase the available memory by editing this line in the Vagrantfile (the value is in megabytes):

v.memory = 2048


Linux is currently the best-supported platform, but it is also possible to install on Windows.


sudo xargs -a apt-packages.txt apt-get install -y
pip install -r pre-requirements.txt
pip install -r requirements.txt

Windows (untested)

  1. Install the scipy stack from here.
  2. Install scikit-learn from the same place. Full install instructions are here.
  3. pip install -r requirements.txt
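
After installing on either platform, a quick import check can confirm that the main dependencies are visible to Python. This is a minimal sketch, not part of Scan: the module names below are assumptions inferred from the tools mentioned in this README (the scipy stack, scikit-learn, celery, nosetests), not a reading of requirements.txt.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed import names (hypothetical list, adjust to match requirements.txt).
EXPECTED = ["numpy", "scipy", "sklearn", "celery", "nose"]

if __name__ == "__main__":
    missing = missing_modules(EXPECTED)
    if missing:
        print("Missing modules: " + ", ".join(missing))
    else:
        print("All expected modules were found.")
```

If anything is reported missing, re-run the matching pip install step before starting the web server or the task worker.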


Running the web server:


Running task worker, which does things like model creation and scoring:

celery -A app.celery worker --loglevel=debug -B

Running tests. Test coverage of the core algorithm is high, but not of the web portion.

nosetests --with-coverage --cover-package="core" --logging-level="INFO"

First Steps

Once you have everything set up and running, here are some steps to try:

  1. Create an account
  2. Log in with your account
  3. Click on "questions" to get to the questions list
  4. Create a question using the form.
  5. Click on "view essays" under the question to see more details.
  6. Add essays using a CSV file upload (there are two sample sets of essays at data/test/censorship to use. train_2.csv will take some time to build a model; train_2_short will be much faster.)
  7. Click on "create model and score essays"
  8. It may take some time to create the model, but you will get a status prompt that auto-refreshes.
  9. Add more essays and score them using the "score essay" button on each essay. You will have to manually refresh the page after doing this to see the score.

Note: You will want to test results on essays that do not have an actual score. Predicted scores on essays that have a score entered upfront will be misleadingly high!
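
One way to follow the note above is to hold back some scored essays, upload them with the score left blank, and compare the predictions against the scores you kept aside. The sketch below only illustrates that split. The column names `text` and `score` and the sample rows are hypothetical; check the sample files in data/test/censorship for the layout Scan actually expects.

```python
import csv
import io
import random


def split_essays(rows, holdout_fraction=0.2, seed=0):
    """Shuffle rows and return (train, holdout) lists."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_holdout = max(1, int(len(rows) * holdout_fraction))
    return rows[n_holdout:], rows[:n_holdout]


def to_csv(rows, fieldnames, blank=None):
    """Serialize rows to CSV text, optionally emptying one column."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        out = dict(row)  # copy so the true score stays in the original row
        if blank:
            out[blank] = ""
        writer.writerow(out)
    return buf.getvalue()


# Hypothetical sample data standing in for a real essay file.
sample = "text,score\n" + "".join(f"essay {i},{i % 4}\n" for i in range(10))
rows = list(csv.DictReader(io.StringIO(sample)))

train, holdout = split_essays(rows)
train_csv = to_csv(train, ["text", "score"])                      # build the model from this
holdout_csv = to_csv(holdout, ["text", "score"], blank="score")   # score these, then compare
```

The true scores of the held-out essays remain available in `holdout`, so they can be compared with the predictions once scoring finishes.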



Contributions are very welcome. Please fork and pull request to contribute.


Please open a GitHub issue if you see a bug. If you have a general question, feel free to contact
