
Pytropic


A Python package with a lot of entropy

Train and predict string entropy based on character n-grams

Features

  • Train a model on a corpus of text
  • Support multiple n-gram sizes
  • Name models for easy reference

Example

>>> from pytropic import pytropic

>>> en = pytropic.Model(name='English 3-gram', size=3)
>>> fr = pytropic.Model(name='French 3-gram', size=3)

>>> with open('./corpora/bible-english.txt') as f:
...     en.train(f)
>>> with open('./corpora/bible-french.txt') as f:
...     fr.train(f)

>>> t = {'en': en, 'fr': fr}

>>> min(t, key=lambda x: t[x].entropy("this is a test"))
'en'

>>> min(t, key=lambda x: t[x].entropy("c'est un test"))
'fr'
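The mechanism behind the example above can be sketched in a few lines of plain Python: count character n-grams during training, then score a string by its average bits per n-gram. This is an illustrative sketch, not pytropic's actual implementation; the function names and the add-one smoothing are assumptions made for the example.

```python
from collections import Counter
from math import log2

def ngrams(text, n):
    """All overlapping character n-grams of a string."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def train(lines, n=3):
    """Count character n-grams over an iterable of lines."""
    counts = Counter()
    for line in lines:
        counts.update(ngrams(line, n))
    return counts

def entropy(text, counts, n=3):
    """Average bits per n-gram under the trained counts, with add-one
    smoothing so unseen n-grams get a small nonzero probability."""
    total = sum(counts.values())
    vocab = len(counts)
    grams = ngrams(text, n)
    if not grams:
        return 0.0
    bits = -sum(log2((counts[g] + 1) / (total + vocab)) for g in grams)
    return bits / len(grams)
```

Training one counter per language and picking the language whose model assigns the lowest entropy to a string reproduces the classification trick shown in the example.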
