CCG Supertagging: 2014

This is the software used for the following publication:

Weakly-Supervised Bayesian Learning of a CCG Supertagger
Dan Garrette, Chris Dyer, Jason Baldridge, and Noah A. Smith
In Proceedings of CoNLL 2014

Data setup

Put the English, Chinese, and Italian data into the following directories:

 data/ccgbank
 data/ccgbank-chinese
 data/ccgbank-italian

The files should be arranged as follows:

$ ls data/ccgbank/AUTO
00  02  04  06  08  10  12  14  16  18  20  22  24
01  03  05  07  09  11  13  15  17  19  21  23
$ ls data/ccgbank-chinese/AUTO
00  02  04  06  08  10  20  22  24  26  28  30
01  03  05  07  09  11  21  23  25  27  29  31
$ ls data/ccgbank-italian/pro
civil_law.pro.txt   jrc_acquis.pro.txt  newspaper.pro.txt    
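
Before compiling, it can be useful to confirm that the three corpus directories are where the code expects them. The following is a minimal sketch, assuming a POSIX shell; `check_layout` is a hypothetical helper and is not part of this repository. It only checks that the directories listed above exist, not that their contents are complete.

```shell
#!/bin/sh
# Sketch: verify the data layout described above.
# check_layout is a hypothetical helper, not shipped with this repository.
# Argument 1 is the data root (defaults to "data", as in the listing above).
check_layout() {
  root="${1:-data}"
  missing=0
  for dir in ccgbank/AUTO ccgbank-chinese/AUTO ccgbank-italian/pro; do
    if [ ! -d "$root/$dir" ]; then
      # Report every missing directory rather than stopping at the first.
      echo "missing: $root/$dir"
      missing=1
    fi
  done
  # Exit status 0 if everything is present, 1 otherwise.
  return "$missing"
}
```

Calling `check_layout` from the repository root prints one line per missing directory and returns a non-zero status if any are absent.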

Running the experiments

First, compile the code and generate the run script:

$ ./compile

Then run any of the configurations below. The commands use two placeholders:

  • $tagset was tested with the following values: ccgfeat, ctbfeat, tutfeat
  • $tdcut was tested with the following values: 0.1, 0.01, 0.001, 0.0

  1. Uniform

    target/start dhg.ccg.run.Conll2014Run em   $tagset  --iterations 50             --tdcut $tdcut  --trinit un  --eminit un   --ccgtrinit-comb false --ccgtrinit-catprior false
    
  2. B08

    target/start dhg.ccg.run.Conll2014Run em   $tagset  --iterations 50             --tdcut $tdcut  --trinit un  --eminit un   --ccgtrinit-comb 0.95  --ccgtrinit-catprior cplx,w0.5
    
  3. FFBS B08

    target/start dhg.ccg.run.Conll2014Run ffbs $tagset  --samples 200 --burnin 100  --tdcut $tdcut  --trinit un  --eminit un   --ccgtrinit-comb 0.95  --ccgtrinit-catprior cplx,w0.5
    
  4. FFBS B08 + catgram

    target/start dhg.ccg.run.Conll2014Run ffbs $tagset  --samples 200 --burnin 100  --tdcut $tdcut  --trinit un  --eminit un   --ccgtrinit-comb 0.95  --ccgtrinit-catprior tdipcfg,t0.6,f0.5,m0.8,w0.5,l1000
    
  5. FFBS B08 + catgram + trinit

    target/start dhg.ccg.run.Conll2014Run ffbs $tagset  --samples 200 --burnin 100  --tdcut $tdcut  --trinit tde --eminit un   --ccgtrinit-comb 0.95  --ccgtrinit-catprior tdipcfg,t0.6,f0.5,m0.8,w0.5,l1000
    
  6. FFBS B08 + catgram + trinit + eminit

    target/start dhg.ccg.run.Conll2014Run ffbs $tagset  --samples 200 --burnin 100  --tdcut $tdcut  --trinit tde --eminit tde  --ccgtrinit-comb 0.95  --ccgtrinit-catprior tdipcfg,t0.6,f0.5,m0.8,w0.5,l1000
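
Each configuration can be run once per combination of `$tagset` and `$tdcut`. The following is a minimal sketch of such a sweep for the Uniform configuration, assuming a POSIX shell; `print_uniform_grid` is a hypothetical helper, not part of this repository. It only prints the commands (a dry run) using the values listed as tested above. The other configurations can be swept the same way by substituting their arguments.

```shell
#!/bin/sh
# Sketch: print one Uniform (EM) command per (tagset, tdcut) pair.
# print_uniform_grid is a hypothetical helper, not shipped with this repository.
# The tagset and tdcut values are the ones this README lists as tested.
print_uniform_grid() {
  for tagset in ccgfeat ctbfeat tutfeat; do
    for tdcut in 0.1 0.01 0.001 0.0; do
      # echo makes this a dry run; remove it to actually launch each job.
      echo target/start dhg.ccg.run.Conll2014Run em "$tagset" \
        --iterations 50 --tdcut "$tdcut" --trinit un --eminit un \
        --ccgtrinit-comb false --ccgtrinit-catprior false
    done
  done
}
```

With three tag sets and four cutoff values, `print_uniform_grid` emits twelve commands. Piping its output to `sh` from the repository root, after `./compile`, would run them sequentially.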