A suite of representation learning models for sentence embedding, and some tasks to evaluate them on.
RunExperiments Partial new results, patching up bad runs. Nov 9, 2015
alcir-data Misc cleanup, new ALCIR feature. May 14, 2015
config Paper work Oct 5, 2015
join-table/data Sentence classification w/ TreeNNs May 3, 2015
layer-fns Add sentiment bigram support. Jul 15, 2015
minFunc Quick test implementation of 'First Past' lattice. May 12, 2015
pragbank-data Experiment updates, and nicer stats tables. May 26, 2015
propositionallogic
quantifiers More reverts May 3, 2015
sat-data Delete SAT instances; logging; paper push. Jun 14, 2015
sick-data
synset-relations More reverts May 3, 2015
testing Sentence classification w/ TreeNNs May 3, 2015
utils Add sentiment bigram support. Jul 15, 2015
word-relations More reverts. May 3, 2015
writing arXiv prep Nov 9, 2015
.gitignore Camera ready. Jun 19, 2015
AdaDeltaUpdate.m TODO tidying. Jun 1, 2015
AdaGradUpdate.m Working on GPU; still no faster than parallel CPU. May 6, 2015
Asymmetrize.m Sentence classification w/ TreeNNs May 3, 2015
CollectEmbeddingGradients.m Speedups (some in progress) May 11, 2015
ComputeBatchEntailmentCostAndGrad.m Add sentiment bigram support. Jul 15, 2015
ComputeBatchSentenceClassificationCostAndGrad.m Misc updates. May 27, 2015
ComputeEntailmentExampleCostAndGrad.m Updating paper to match details of rc3. May 23, 2015
ComputeLabelRanges.m Sentence classification w/ TreeNNs May 3, 2015
ComputeSentenceClassificationExampleCostAndGrad.m Optimizations. May 28, 2015
ComputeUnbatchedCostAndGrad.m Incremental paper update. May 30, 2015
FlushLogs.m Misc. cleanup Feb 6, 2015
GetMacroF1.m Cleanup for release. Dec 5, 2014
InitializeLSTMLayer.m Updating paper to match details of rc3. May 23, 2015
InitializeModel.m
InitializeNNLayer.m Working on GPU; still no faster than parallel CPU. May 6, 2015
InitializeNTNLayer.m Working on GPU; still no faster than parallel CPU. May 6, 2015
InitializeVocabFromFile.m Working on GPU; still no faster than parallel CPU. May 6, 2015
Lattice.m TODO tidying. Jun 1, 2015
LatticeBatch.m Misc updates. May 27, 2015
LoadAllDatasets.m Add sentiment bigram support. Jul 15, 2015
LoadEntailmentData.m Working on GPU; still no faster than parallel CPU. May 6, 2015
LoadSSTData.m Better memory management in SequenceBatch, etc. May 10, 2015
LoadSentenceClassificationData.m Misc cleanup Jun 10, 2015
LoadSentimentBigramData.m Add sentiment bigram support. Jul 15, 2015
LoadWordMap.m Misc cleanup, new ALCIR feature. May 14, 2015
LoadWordPairData.m Sentence classification w/ TreeNNs May 3, 2015
Log.m Batched LSTMs and RMSProp Apr 9, 2015
README.md
RMSPropUpdate.m Working on GPU; still no faster than parallel CPU. May 6, 2015
Sequence.m Delete SAT instances; logging; paper push. Jun 14, 2015
SequenceBatch.m Add sentiment bigram support. Jul 15, 2015
Symmetrize.m Sentence classification w/ TreeNNs May 3, 2015
TestAndLog.m Add sentiment bigram support. Jul 15, 2015
TestModel.m Working on GPU; still no faster than parallel CPU. May 6, 2015
TestModelCrossEnt.m Add sentiment bigram support. Jul 15, 2015
TiledEye.m Cleanup for release. Dec 5, 2014
TrainModel.m Add sentiment bigram support. Jul 15, 2015
TrainOnDataset.m Working on GPU; still no faster than parallel CPU. May 6, 2015
TrainSGD.m Add sentiment bigram support. Jul 15, 2015
TransferInitialization.m
Tree.m Optimizations. May 28, 2015
Uninformativize.m Sentence classification w/ TreeNNs May 3, 2015
fNormrnd.m
fOnes.m Working on GPU; still no faster than parallel CPU. May 6, 2015
fRand.m Working on GPU; still no faster than parallel CPU. May 6, 2015
fZeros.m Working on GPU; still no faster than parallel CPU. May 6, 2015
param2stack.m Batched LSTMs and RMSProp Apr 9, 2015
run.sh Long-delayed sync. Nov 9, 2015
stack2param.m More tidying up for release. Dec 23, 2013

README.md

vector-entailment

The code for the experiments reported in Bowman (2014) and Bowman, Potts and Manning (2014).

WARNING: I have made every effort to ensure that this code is as inefficient as possible and violates every convention of MATLAB style. If the code could be made worse on either count, please contact me (sbowman@stanford.edu) and I will remedy the error.

MORE WARNING: Trees are represented as new-style objects, so you will need a fairly recent copy of MATLAB. R2012b works.

To get started, have a look at the job launch commands in RunExperiments (ideally from a release, since this changes all the time) and the config files in config/.

If you don't want to run jobs using PBS (or can't), you can replace the escaped commas (\,) with plain commas in the commands in RunF14Experiments and pipe the commands into MATLAB, as here:

echo "cd quant; dataflag = 'fold5'; lambda = 0.001; dim = 15; td = 1; penult = 75; dropout = 1; tot = 0; name='tj'; relu = 1; TrainModel('', 1, @Join, name, dataflag, dim, penult, td, lambda, tot, relu, dropout, 32);" | matlab
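The comma replacement described above can be scripted rather than done by hand. The following is a minimal sketch, assuming the launch commands escape commas as `\,`; the `launch_line` value and the `unescape_commas` helper are hypothetical illustrations, not text or code from this repository.

```shell
#!/bin/sh
# Hypothetical launch line written in the PBS style, with escaped commas.
# The values are illustrative only and are not copied from the repository.
launch_line="name='tj'; TrainModel(''\, 1\, @Join\, name);"

# Replace every escaped comma (\,) with a plain comma.
# unescape_commas is a helper defined here for illustration.
unescape_commas() {
  sed 's/\\,/,/g'
}

plain=$(printf '%s\n' "$launch_line" | unescape_commas)

# Show the command that MATLAB would receive.
printf '%s\n' "$plain"

# To run it, pipe the result into MATLAB (assumes matlab is on your PATH):
# printf '%s\n' "$plain" | matlab
```

The MATLAB call is left commented out so the snippet can be checked on a machine without MATLAB installed; uncomment it once the printed command looks right.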

The SICK data is from the SemEval 2014 SICK challenge: http://alt.qcri.org/semeval2014/task1/

minFunc is from Mark Schmidt, here: http://www.cs.ubc.ca/~schmidtm/Software/minFunc.html

The Denotation Graph data (used in this repo, but not distributed with it) is from here: http://shannon.cs.illinois.edu/DenotationGraph/

sick-data/wcmac_data.txt was collected as part of Bill MacCartney's 2009 Stanford dissertation.

sst-data/ contains data from the Stanford Sentiment Treebank. More information can be found in the README.txt file there.

Maintainer and lead author: Samuel Bowman, sbowman@stanford.edu