NDSB competition repository for scripting, note-taking, and writing submissions.

This repository contains working code, scripts, and notes related to the Neuroglycerin entry for the NDSB competition.

## Installing tools

Ensure that neukrill-net-tools is installed using one of the following (an editable/develop install, so you can edit the code without reinstalling):

    pip install -e .
    python setup.py develop

See the tools repo for more details.

## Startup Script for Theano

Theano requires some environment variables to be set. We have a script for doing this (in future, they could instead be set in the Python code). First, make sure it's executable:

    chmod +x start_script.sh

Then source it (don't run it):

    source start_script.sh <gpu number 0-3>
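As noted above, these variables could in future be set from the Python code instead. A minimal sketch of that alternative, using Theano's documented `THEANO_FLAGS` interface (the actual flags set by the start script are not shown here, so the values below are illustrative only):

```python
import os

def configure_theano(gpu_number=0):
    """Set Theano environment variables before Theano is imported.

    The flag names follow Theano's THEANO_FLAGS interface; the real
    start script may set additional or different values.
    """
    os.environ["THEANO_FLAGS"] = ",".join([
        "device=gpu{}".format(gpu_number),
        "floatX=float32",
    ])

configure_theano(2)
```

Note that this must run before `import theano`, since Theano reads its configuration at import time.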

## Main Scripts

`train.py` - fit, cross-validate, and dump a classifier model

`test.py` - read a pickled classifier model and output a submission CSV covering `test_data`

## Settings

The settings are used to find the data, among other things. Before using the scripts, add your data path to settings.json. Specifically, this should be the directory where you unzipped the files, containing the train and test directories.

Alternatively, you can put the data in the data directory of your repository, as this path is already in settings.json.
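To illustrate how the scripts might read the data path back out of settings.json, here is a minimal sketch. The key name `"data_dir"` is an assumption for illustration; check the settings.json shipped in the repository for the actual field names:

```python
import json
import os

def load_data_dir(settings_path="settings.json", key="data_dir"):
    """Return the configured data directory from a settings JSON file.

    The key name is assumed for illustration; the real settings.json
    may use a different field for the data path.
    """
    with open(settings_path) as f:
        settings = json.load(f)
    # The directory should contain the unzipped train/ and test/ folders.
    return os.path.expanduser(settings[key])
```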