Parse and convert numbers written in French into their digit representation.

text2num is a Python package that provides functions and parser classes for:

  • parsing numbers expressed as words in French and converting them to integer values;
  • detecting ordinal, cardinal and decimal numbers in a stream of French words and getting their decimal digit representations.


Tested on Python 3.6 and 3.7.


This software is distributed under the MIT license, of which you should have received a copy (see the LICENSE file in this repository).


text2num does not depend on any third-party package.

To install text2num in your (virtual) environment:

pip install text2num

That's all folks!

Usage examples

Parse and convert

>>> from text_to_num import text2num
>>> text2num('quatre-vingt-quinze')
95

>>> text2num('nonante-cinq')
95

>>> text2num('mille neuf cent quatre-vingt dix-neuf')
1999

>>> text2num('dix-neuf cent quatre-vingt dix-neuf')
1999

>>> text2num("cinquante et un million cinq cent soixante dix-huit mille trois cent deux")
51578302

>>> text2num('mille mille deux cents')
ValueError: invalid literal for text2num: 'mille mille deux cent'
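
Since invalid input raises ValueError, as in the last example above, callers that process untrusted text may prefer a fallback value over an exception. The following is a minimal sketch of one way to do that, not part of text2num: the helper name parse_or_none is hypothetical, and the converter is passed in as an argument so the helper makes no assumption about the package beyond the ValueError shown above.

```python
from typing import Callable, Optional


def parse_or_none(convert: Callable[[str], int], text: str) -> Optional[int]:
    """Return convert(text), or None when convert raises ValueError.

    `convert` is any callable that maps a string to an int and signals
    invalid input with ValueError (hypothetical helper, for illustration).
    """
    try:
        return convert(text)
    except ValueError:
        # Invalid input: report "no value" instead of propagating the error.
        return None
```

With the package installed, parse_or_none(text2num, 'mille mille deux cents') would then give None rather than raising.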

Find and transcribe

alpha2digit finds any numbers in a text, ordinals included, and replaces them with their digit representation.

>>> from text_to_num import alpha2digit
>>> sentence = (
...         "Huit cent quarante-deux pommes, vingt-cinq chiens, mille trois chevaux, "
...         "douze mille six cent quatre-vingt-dix-huit clous.\n"
...         "Quatre-vingt-quinze vaut nonante-cinq. On tolère l'absence de tirets avant les unités : "
...         "soixante seize vaut septante six.\n"
...         "Nombres en série : douze quinze zéro zéro quatre vingt cinquante-deux cent trois cinquante deux "
...         "trente et un.\n"
...         "Ordinaux: cinquième troisième vingt et unième centième mille deux cent trentième.\n"
...         "Décimaux: douze virgule quatre-vingt dix-neuf, cent vingt virgule zéro cinq ; "
...         "mais soixante zéro deux."
...     )
>>> print(alpha2digit(sentence))
842 pommes, 25 chiens, 1003 chevaux, 12698 clous.
95 vaut 95. On tolère l'absence de tirets avant les unités : 76 vaut 76.
Nombres en série : 12 15 004 20 52 103 52 31.
Ordinaux: 5ème 3ème 21ème 100ème 1230ème.
Décimaux: 12,99, 120,05 ; mais 60 02.
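
To transcribe a longer text one line at a time, a small generator can wrap the call shown above. This is only a sketch under stated assumptions: transcribe_lines is a hypothetical helper, not something text2num provides, and it takes the transcription function as a parameter (for instance alpha2digit called with a single string, as in the example above).

```python
from typing import Callable, Iterable, Iterator


def transcribe_lines(
    transcribe: Callable[[str], str], lines: Iterable[str]
) -> Iterator[str]:
    """Yield each line after passing it through `transcribe`.

    `transcribe` is any str -> str callable (hypothetical helper,
    for illustration only).
    """
    for line in lines:
        # Strip the trailing newline so the callable sees only the text itself.
        yield transcribe(line.rstrip("\n"))
```

For example, iterating over transcribe_lines(alpha2digit, f) for an open text file f would yield the transcribed lines lazily, without loading the whole file into memory.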

Read the complete documentation on ReadTheDocs.


