Missing space at end of strings in NUM_WORDS #759

Derek-Jones opened this Issue Jan 20, 2017 · 1 comment




The following code in spacy/orth.pyx

NUM_WORDS = set('zero one two three four five six seven eight nine ten'
'eleven twelve thirteen fourteen fifteen sixteen seventeen'
'eighteen nineteen twenty thirty forty fifty sixty seventy'
'eighty ninety hundred thousand million billion trillion'
'quadrillion gajillion bazillion'.split())

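is missing a space character after ten, seventeen, seventy and trillion. Python joins adjacent string literals with no separator, so the last word on each of those lines is fused with the first word on the next line.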

At the moment ten is not recognised as a number, but teneleven is treated as like_number.
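The effect can be reproduced outside spaCy with plain Python. The sketch below is only an illustration: the names `BUGGY_NUM_WORDS` and `FIXED_NUM_WORDS` are hypothetical, and the trailing-space fix is one possible correction, not necessarily what the actual fix commit does.

```python
# Reproduction of the reported behaviour: adjacent string literals are
# concatenated with no separator before .split() runs.
BUGGY_NUM_WORDS = set('zero one two three four five six seven eight nine ten'
                      'eleven twelve thirteen fourteen fifteen sixteen seventeen'
                      'eighteen nineteen twenty thirty forty fifty sixty seventy'
                      'eighty ninety hundred thousand million billion trillion'
                      'quadrillion gajillion bazillion'.split())

# Assumed fix (hypothetical): end every line except the last with a space.
FIXED_NUM_WORDS = set('zero one two three four five six seven eight nine ten '
                      'eleven twelve thirteen fourteen fifteen sixteen seventeen '
                      'eighteen nineteen twenty thirty forty fifty sixty seventy '
                      'eighty ninety hundred thousand million billion trillion '
                      'quadrillion gajillion bazillion'.split())

# Words that exist only because two neighbours were fused together.
fused = sorted(BUGGY_NUM_WORDS - FIXED_NUM_WORDS)
# Words that went missing from the buggy set.
missing = sorted(FIXED_NUM_WORDS - BUGGY_NUM_WORDS)

print('fused:  ', fused)
print('missing:', missing)
```

Comparing the two sets shows the four fused entries (such as teneleven) and the eight real number words that are lost as a result.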

ines added the bug and english labels Jan 20, 2017
ines commented Jan 20, 2017

Thanks – will be pushing the fix and regression test in a second! Also, now that I see it, this data should probably be moved to the English language data at some point in the future.

ines closed this in 09ecc39 Jan 20, 2017
ines added a commit that referenced this issue Jan 20, 2017
Add regression test for #759 (5f6f48e)