pymtranslate

A probabilistic foreign-language translator based on the IBM Model 1 machine translation algorithm. It takes two parallel texts, an English corpus and a foreign-language corpus with the same content, and computes the probability that a specific English word maps to a specific foreign word under the statistical model.

Install

Works with Python 2.7 (not tested with Python 3).

$ pip install pymtranslate

Usage:

The script requires:

An English corpus of text
A matching foreign-language corpus of text

Then use translate:

$ python
>>> from pymtranslate.translator import Translator
>>>
>>> english = 'pymtranslate/data/short.en'
>>> foreign = 'pymtranslate/data/short.de'
>>>
>>> t = Translator()
>>> t.train(english, foreign) # initialize translation probabilities and estimate them with expectation-maximization
>>>
>>> t.translate('dog') # print probable translations
[('der', 0.1287760647333088), ('Hund', 0.8712239352666912)]
>>> t.translate('man') # print probable translations
no matches found

Example English Corpus

the dog
the cat
the bus

Example Foreign Corpus

le chien
le chat
l' autobus

If you attempt to translate a word that is not in our statistical model, the script will tell you that no match was found.
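
To make the idea concrete, here is a minimal sketch of IBM Model 1 training with expectation-maximization, run on the example corpora above. It is not pymtranslate's actual implementation: the names `train_ibm1` and `translate`, the fixed iteration count, and the omission of the NULL word are all assumptions made for illustration.

```python
from collections import defaultdict


def train_ibm1(english_sentences, foreign_sentences, iterations=10):
    """Estimate t(foreign | english) with EM (hypothetical helper, no NULL word)."""
    pairs = [(e.split(), f.split())
             for e, f in zip(english_sentences, foreign_sentences)]
    foreign_vocab = {f for _, fs in pairs for f in fs}
    uniform = 1.0 / len(foreign_vocab)
    # t[(f, e)] starts uniform for every word pair that co-occurs in a sentence pair.
    t = defaultdict(lambda: uniform)

    for _ in range(iterations):
        count = defaultdict(float)  # expected count of (f, e) alignments
        total = defaultdict(float)  # expected count of alignments to e
        # E-step: share each foreign word among the English words in its sentence.
        for es, fs in pairs:
            for f in fs:
                norm = sum(t[(f, e)] for e in es)
                for e in es:
                    delta = t[(f, e)] / norm
                    count[(f, e)] += delta
                    total[e] += delta
        # M-step: re-estimate the translation probabilities from the counts.
        for (f, e), c in count.items():
            t[(f, e)] = c / total[e]
    return t


def translate(t, english_word):
    """Return (foreign word, probability) pairs, most probable first."""
    candidates = [(f, p) for (f, e), p in t.items() if e == english_word]
    return sorted(candidates, key=lambda item: item[1], reverse=True)


english = ["the dog", "the cat", "the bus"]
foreign = ["le chien", "le chat", "l' autobus"]
model = train_ibm1(english, foreign)
print(translate(model, "dog"))  # 'chien' ranks above 'le'
print(translate(model, "man"))  # empty list: the word is not in the model
```

Because "le" also appears alongside "the" in other sentences, EM gradually assigns it to "the", which leaves "chien" as the most probable translation of "dog". A word that never appears in the English corpus has no entries in the table, which corresponds to the "no matches found" case.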

Notes:

English/foreign corpus files of various sizes are provided in the data folder. Use files of matching size (e.g. 2kcorpus.en with 2kcorpus.de); otherwise your results will be skewed. Also remember that the larger the corpus, the more CPU time and memory training consumes. Machine translation can be a RAM-intensive task, but a larger corpus often gives more meaningful results.
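
One way to guard against mismatched files is to compare line counts before training. The sketch below assumes that "same sized" means one sentence per line with the same number of lines in each file, and that the files are UTF-8 encoded; `count_lines` and `check_parallel` are hypothetical helpers, not part of pymtranslate.

```python
import io


def count_lines(path):
    """Count the lines in a text file (UTF-8 is an assumption)."""
    with io.open(path, encoding="utf-8") as handle:
        return sum(1 for _ in handle)


def check_parallel(english_path, foreign_path):
    """Raise ValueError if the two corpora differ in line count."""
    english_lines = count_lines(english_path)
    foreign_lines = count_lines(foreign_path)
    if english_lines != foreign_lines:
        raise ValueError(
            "corpora differ in size: %d vs %d lines"
            % (english_lines, foreign_lines)
        )
    return english_lines
```

Calling `check_parallel(english, foreign)` before `t.train(english, foreign)` would surface a mismatch early instead of producing skewed probabilities.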
