Commit 6f5e1ce

AngledLuffa authored and Stanford NLP committed
This value builds a model of comparable size to the old ctb7 models
1 parent f288b3a commit 6f5e1ce

2 files changed: +2 -2 lines changed

scripts/chinese-segmenter/Makefile

Lines changed: 1 addition & 1 deletion
@@ -65,7 +65,7 @@ ctb9.train.chris6.ser.gz: dict-chris6.ser.gz
 
 # train on train CTB9 + extras, with all external lexicons, without training lexicon, use the threshold to make it smaller
 ctb9.train-small.chris6.ser.gz: dict-chris6.ser.gz
-	time java -mx60g edu.stanford.nlp.ie.crf.CRFClassifier -prop ctb9-chris6.prop -serDictionary $+ -sighanCorporaDict /u/nlp/data/chinese-segmenter/gale2007/ctb6/ -featureDiffThresh 0.015 -trainFile $(CTB9_ALL) -serializeTo $@ > $@.log 2> $@.err
+	time java -mx60g edu.stanford.nlp.ie.crf.CRFClassifier -prop ctb9-chris6.prop -serDictionary $+ -sighanCorporaDict /u/nlp/data/chinese-segmenter/gale2007/ctb6/ -featureDiffThresh 0.005 -trainFile $(CTB9_ALL) -serializeTo $@ > $@.log 2> $@.err
 
 # train on all CTB7, with all external lexicons, without training lexicon
 bolt.chris6.ser.gz: dict-chris6.ser.gz

scripts/chinese-segmenter/ctb9-chris6.prop

Lines changed: 1 addition & 1 deletion
@@ -84,4 +84,4 @@ sighanPostProcessing = true
 
 # This would make the resulting model smaller
 # It can also be set as a command line arg, which is what the Makefile does
-# featureDiffThresh=0.015
+# featureDiffThresh=0.005
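
A rough illustration of why the threshold value affects model size. This sketch assumes, without confirmation from this commit, that `featureDiffThresh` drops any feature whose weights differ across labels by no more than the threshold. The function `prune_features` and the toy weights are hypothetical and are not part of CoreNLP.

```python
from typing import Dict, List


def prune_features(weights: Dict[str, List[float]], thresh: float) -> Dict[str, List[float]]:
    """Keep only features whose per-label weights span more than `thresh`.

    Assumed pruning rule (not taken from the commit): a feature is dropped
    when max(weight) - min(weight) across labels is <= thresh.
    """
    return {name: w for name, w in weights.items() if max(w) - min(w) > thresh}


# Hypothetical toy model: feature name -> one weight per label.
toy_weights = {
    "f_strong": [0.90, -0.40],   # spread of about 1.3
    "f_medium": [0.012, 0.002],  # spread of about 0.010
    "f_weak": [0.003, 0.001],    # spread of about 0.002
}

kept_old = prune_features(toy_weights, 0.015)  # threshold before this commit
kept_new = prune_features(toy_weights, 0.005)  # threshold after this commit

print("0.015 keeps:", sorted(kept_old))
print("0.005 keeps:", sorted(kept_new))
```

Under that assumption, the lower threshold retains features with a medium weight spread that the higher threshold discards, so the 0.005 setting would be expected to produce a somewhat larger model than 0.015.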
