A Python project inspired by the research of Chloé Kiddon and Yuriy Brun. Part of the Funniest Computer Ever Open Source initiative
bot jsonp-ified the request
data including all necessary data by permission of Kiddon and Brun
faq updated to mention pip install
nao replacing whatever my nao behaviour.xar file was with whatever is in …
world_model Added Ben's initial world model efforts
.gitignore adding .gitignore file
README.markdown using the new extractor module have gotten all the tests to pass, yay
comedy_faq.py adding in some other comedy options
generateTrainTestData.py first commit without data directory
getVocabList.py first commit without data directory
joke.py adding in some other comedy options
manage.py adding django app support
preprocessData.py first commit without data directory
processSentence.py first commit without data directory
requirements.txt adding django app support
sentenceFeatures.py first commit without data directory
svmutil.patch added patch for libsvm and updated README
testTokeniseContents.py first commit without data directory
tokeniseContents.py first commit without data directory
train.py first commit without data directory
trial-faq.py adding in some other comedy options
twss.py using the new extractor module have gotten all the tests to pass, yay

README.markdown

TWSS

A Python project inspired by the research of Chloé Kiddon and Yuriy Brun. Paper available here.

Thanks to Henry Garner for the original coding initiative, to Marcin Tolysz for awesome super-fast server modifications, and to Andrew Ng for the ML course.

Part of the Funniest Computer Ever Open Source Initiative

This code supported Zarquon Squelchmama III in the chatbotbattles contest. We came nowhere near winning (our coding was all a bit last minute), but we did get one TWSS in there:

Judge: Hello Zarquon
Zarquon: Greetings.
Judge: How are you?
Zarquon: How do you suppose?
Judge: I suppose you are good but I don't know
Zarquon: That's what she said!
Judge: Hehe. Very funny

Note that we now have some other chatbot-related initiatives in the faq and world_model directories. Ultimately we hope these will all be integrated into a single coherent whole :-)

Getting started

  1. Install libsvm with its Python bindings: http://www.csie.ntu.edu.tw/~cjlin/libsvm

  2. Apply the patch that allows svm_predict to produce quiet output: `cp svmutil.patch LIBSVM_HOME/python && cd LIBSVM_HOME/python && patch < svmutil.patch`

[N.B. here's how I add libsvm to my PYTHONPATH: export PYTHONPATH="/Users/samueljoseph/Code/libsvm-3.12/python/:$PYTHONPATH"]

  3. Download the TWSS source data into the data directory of the current project.

  4. You can run some limited unit tests like so: `python testTokeniseContents.py`

  5. Run `python preprocessData.py` to tokenise the files and create a shared vocabulary, which is saved in data/vocab.txt. The resulting vocabulary contains about 20k words. preprocessData will also split sentences and save the results in pickle files.

  6. Run `python generateTrainTestData.py` to create training and test data sets, saved in data/train.pk and data/test.pk. Each takes the form of an array of dictionaries X and a vector y, where X is instances x features and y has one entry per instance: 1 for TWSS and -1 for non-TWSS instances.

  7. Run `python train.py` from the command line (this may take a few minutes).

  8. Run `python twss.py "<insert your sentence>"` to have a little chat with the resulting system.
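
As a rough illustration of the data shapes produced by the preprocessing and train/test generation steps, here is a minimal sketch in plain Python. It is not the project's code: the function names (`tokenise`, `build_vocab`, `to_features`), the regex tokeniser, the word-count features, and the toy sentences are all assumptions made for illustration. Only the overall layout (X as a list of feature dictionaries, y as a vector of 1 / -1 labels) comes from the steps above.

```python
import re
from collections import Counter


def tokenise(sentence):
    """Lowercase a sentence and split it into word tokens (hypothetical tokeniser)."""
    return re.findall(r"[a-z']+", sentence.lower())


def build_vocab(sentences):
    """Map each distinct word to a 1-based index, in sorted order."""
    words = sorted({w for s in sentences for w in tokenise(s)})
    return {w: i for i, w in enumerate(words, start=1)}


def to_features(sentence, vocab):
    """Return a sparse {feature_index: count} dict, skipping unknown words."""
    counts = Counter(tokenise(sentence))
    return {vocab[w]: c for w, c in counts.items() if w in vocab}


# Toy data, invented for illustration only.
twss = ["that is a big one", "it is so hard"]
non_twss = ["the meeting is at noon", "please send the report"]

vocab = build_vocab(twss + non_twss)

# X: one sparse feature dictionary per instance.
X = [to_features(s, vocab) for s in twss + non_twss]

# y: 1 for TWSS instances, -1 for non-TWSS instances.
y = [1] * len(twss) + [-1] * len(non_twss)
```

Sparse dictionaries keep each instance small even with a vocabulary of around 20k words, since only the words that actually occur in a sentence are stored.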

Licence

Code is MIT Licence. Data is released under its own licence.
