My work for the Spectrm challenge
Word Frequency Model.ipynb

Spectrm Challenge

My entry to the Spectrm Challenge. It uses a very simple word-frequency model and achieves recall rates of around 5-8%. For an explanation of how it works, see the Jupyter notebook.


spectrm-challenge-ryan only requires basic scientific Python dependencies (numpy, scipy, pandas, matplotlib, nltk). I recommend using a pre-packaged distribution like Anaconda for multiprocessing/memory efficiency. Alternatively, you can use pip to set up the dependencies:

$ pip install -r requirements.txt

Running the model

Note: Although the matrices are pretty sparse, I used the regular (dense) numpy matrix implementation. That means generating a model can be quite memory intensive. All my tests were run on a desktop with 10 cores and 16GB of memory. A nice improvement would be to have this code use the scipy sparse matrix implementation.
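
As a rough illustration of that improvement, here is a minimal sketch of building a word-count matrix with scipy's sparse matrices instead of a dense numpy array. It is not taken from the notebook: the function name `build_count_matrix`, the whitespace tokenization, and the toy documents are all assumptions made for the example.

```python
import numpy as np
from scipy import sparse


def build_count_matrix(docs):
    """Return (counts, vocab), where counts[i, j] is how often word j occurs in docs[i].

    Hypothetical helper for illustration only; the notebook's model may build
    its matrices differently.
    """
    vocab = {}          # word -> column index
    rows, cols = [], []
    for i, doc in enumerate(docs):
        for word in doc.lower().split():   # naive whitespace tokenization (assumption)
            j = vocab.setdefault(word, len(vocab))
            rows.append(i)
            cols.append(j)
    data = np.ones(len(rows), dtype=np.int32)
    # Build in COO format; duplicate (row, col) entries are summed on conversion to CSR.
    coo = sparse.coo_matrix((data, (rows, cols)), shape=(len(docs), len(vocab)))
    return coo.tocsr(), vocab


docs = ["hello how are you", "hello hello there"]
counts, vocab = build_count_matrix(docs)
dense = counts.toarray()   # only do this for small matrices
```

Only the non-zero counts are stored, so memory use scales with the number of distinct (document, word) pairs rather than with documents times vocabulary size.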

To run the model on the unlabeled examples, simply run:

$ python

It knows the default locations. If you want to run on other datasets, for instance the training set, you can specify:

$ python challenge_data/train_dialogs.txt challenge_data/train_missing.txt
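
For readers curious how a script can "know" default locations while still accepting other datasets, the following is a minimal sketch using argparse with optional positional arguments. The default paths, the argument names, and the `parse_args` function are placeholders invented for this example, not the script's actual values.

```python
import argparse

# Placeholder defaults (assumption): the real default locations are not stated here.
DEFAULT_DIALOGS = "path/to/default_dialogs.txt"
DEFAULT_MISSING = "path/to/default_missing.txt"


def parse_args(argv=None):
    """Parse two optional positional paths, falling back to the defaults above."""
    parser = argparse.ArgumentParser(description="Run the word-frequency model.")
    parser.add_argument("dialogs", nargs="?", default=DEFAULT_DIALOGS,
                        help="file containing the dialogs")
    parser.add_argument("missing", nargs="?", default=DEFAULT_MISSING,
                        help="file containing the missing lines")
    return parser.parse_args(argv)


# No arguments: fall back to the defaults.
default_args = parse_args([])
# Two arguments: run on another dataset, e.g. the training set.
train_args = parse_args(["challenge_data/train_dialogs.txt",
                         "challenge_data/train_missing.txt"])
```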