Commit

Merge branch 'master' of github.com:ottokart/punctuator2
ottokart committed Apr 4, 2017
2 parents 95a941e + 406dfa4 commit 3f7c794
Showing 1 changed file with 2 additions and 0 deletions.
2 changes: 2 additions & 0 deletions README.md
@@ -16,6 +16,8 @@ The model can be trained in two stages (second stage is optional):

Remember that all the scores given below are on _unsegmented_ text and we did not use prosodic features, so, among other things, the model has to detect sentence boundaries in addition to the boundary type (?QUESTIONMARK, .PERIOD or !EXCLAMATIONMARK) based entirely on textual features. The scores are computed on the test set.

Training speed with default settings, an optimal Theano installation and a modern GPU should be around 10000 words per second.

## English TED talks
Training set size: 2.1M words. First stage only. More details can be found in [this paper](http://www.isca-speech.org/archive/Interspeech_2016/pdfs/1517.PDF).
For comparison, our [previous model](https://github.com/ottokart/punctuator) got an overall F1-score of 50.8.