Yonatan Bisk edited this page Dec 5, 2015 · 2 revisions
Command Explanation
Grammar
ignorePunctuation=false Whether to ignore punctuation in the data
TAGSET=src/main/resources/english.pos.map POS tag file location
tagType=Fine Parse with {Coarse,Fine,Universal,Induced} tags
hasUniversalTags=true Add column for NAACL Shared Task input/output
NF=Full Normal form (NF) to parse with: {Full,Full_noPunct,Eisner,Eisner_Orig,None}
trainingRegimen=[readTrainingFiles, HDPArgumentModel, IO, Test] Operations for experiment
typeRaising=false Allow type-raising
lexTROnly=false Restrict type-raising to lexical items
allowXbXbX=false Allow for (X/X)\X and (X\X)/X
Grammar Induction
uniformPrior=false Initialize EM with uniform trees
maxArity=3 Maximum lexical arity
maxModArity=2 Maximum lexical arity for Modifiers
induceValidOnly=true
complexArgs=false Allow complex arguments
complexTOP=false Allow TOP to take complex arguments
ALPHA_SCHEME=false Should hyper-parameters be used as constants or X^a schemes?
alphaPower=[1000.0, 1000.0, 1000.0] variational hyperparameter
discount=0.0 Pitman-Yor (PY) discount factor, 0 <= d < 1
typeChangingRules=null Special Unary Type-Changing rules
Tagset Induction
BMMMClusters=null New BMMM Tag mapping
Training
source=supervised Training setups: induction, supervised
viterbi=false
maxItr=2000 Maximum number of EM/BW iterations
threshold=0.01 EM/BW convergence threshold
trainK=1 Number of top-K parses to compute during training
smallRule=-25.0 Minimum probability/value allowed for a rule
NumClusters=45 Number of clusters to induce with HMM
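As a rough illustration of how maxItr and threshold typically interact, here is a minimal Python sketch of an iteration loop that stops on convergence or on the iteration cap. It is not taken from the parser's source: the `run_em` function, the `step` callable, and the choice of an absolute (rather than relative) change in the objective are all assumptions made for this example.

```python
def run_em(step, max_itr=2000, threshold=0.01):
    """Call `step` until the objective changes by less than `threshold`.

    `step` is a hypothetical callable that performs one EM/BW iteration
    and returns the current objective (for example, a log-likelihood).
    Returns (iterations_run, last_objective).
    """
    previous = None
    for iteration in range(1, max_itr + 1):
        current = step()
        # Assumption: convergence is judged on the absolute change
        # between consecutive iterations.
        if previous is not None and abs(current - previous) < threshold:
            return iteration, current
        previous = current
    # Iteration cap reached without meeting the threshold.
    return max_itr, previous
```

With the defaults listed above, such a loop would run at most 2000 iterations and stop earlier once two consecutive objective values differ by less than 0.01.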
Training Data
trainFile=[english.AUTO.example] Training file(s), comma delimited
shortestSentence=1 Shortest sentence to consider
longestSentence=200 Longest sentence to consider
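The two length limits can be pictured as a simple filter over tokenized sentences. The sketch below is illustrative only: `filter_by_length` is a hypothetical helper, and it assumes that length is counted in tokens and that both bounds are inclusive, neither of which is stated on this page.

```python
def filter_by_length(sentences, shortest=1, longest=200):
    """Keep sentences whose token count lies in [shortest, longest].

    `sentences` is a list of token lists. Inclusive bounds are an
    assumption made for this example.
    """
    return [s for s in sentences if shortest <= len(s) <= longest]
```

For example, with shortest=2 and longest=3, a one-token sentence and a four-token sentence would both be dropped.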
Misc
Folder=ExperimentOutput Folder for output files
saveModelFile=ExperimentOutput2//Model file to write the model
loadModelFile=ExperimentOutput2/Model0 file to read the model
savedLexicon=Lexicon.txt.gz Lexicon to load
CondProb_threshold=0.01 Threshold for discarding categories based on cond prob
threadCount=2 Number of threads to use
api_key=key.txt API Key for push notification from notifymyandroid.com
Induction's Lexical Learning
lexFreq=5.0 Number or percentage of words to learn
nounFreq=0.0 Number or percentage of nouns to learn
verbFreq=0.0 Number or percentage of verbs to learn
funcFreq=0.0 Number or percentage of function words to learn
Testing
testFile=[english.AUTO.example] Test file(s), comma delimited
TEX_LANGUAGE=other Are the sentences in Chinese or other?
CONLL_DEPENDENCIES=CC_X1___CC_X2 Whether CoNLL-style Viterbi parses should be printed and how conjunction should be treated
longestTestSentence=100 Longest allowable test sentence (else: right branch)
testK=1 Number of parses to produce at test time
AUTO_TYPE=CCGBANK CCGBANK vs CANDC auto files
AUTO Conversion
AUTOFileToConvert=null Input file [one AUTO per line] to convert
ConvertAUTO=TEX Format to convert AUTOs to
Knowledge Graph
hardBracketConstraints=false Use Hard Entity constraints when parsing
hardEntityNConstraints=false Use Hard Entity as N constraints when parsing
softBracketConstraints=true Use Soft Entity constraints when scoring
softEntityNConstraints=false Use Soft Entity as N constraints when scoring
softBracketWeighting=0.9 1 - penalty for violating entity constraints when scoring
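To compare these defaults against a local setup, the listing on this page can be read into a dictionary. The Python sketch below parses lines in the `key=default description` shape used here; `parse_listing` is a hypothetical helper written for this page's layout and is not part of the parser, and it makes no claim about the format of the parser's own configuration files.

```python
def parse_listing(lines):
    """Map each parameter name to a (default, description) pair.

    Lines without '=' (section headings) are skipped. Bracketed list
    defaults such as [a, b, c] are kept together as one value.
    """
    params = {}
    for line in lines:
        key, sep, rest = line.strip().partition("=")
        if not sep or " " in key:
            continue  # heading, or prose that merely contains '='
        if rest.startswith("[") and "]" in rest:
            end = rest.find("]")
            value, description = rest[: end + 1], rest[end + 1:]
        else:
            # The default runs up to the first space, if any.
            value, _, description = rest.partition(" ")
        params[key] = (value, description.strip())
    return params
```

Parameters listed without a description, such as viterbi=false, come back with an empty description string.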
