v0.1.43

@kylepjohnson kylepjohnson released this Sep 30, 2016 · 68 commits to master since this release

#386 Port the Collatinus Latin decliner by @PonteIneptique (Thibaut Clérice). (See more at #385.)
#390 Small addition to Latin word tokenizer by @diyclassics
#383 Docs added for Greek Accentuation package
#380 Docs for Greek alphabet by @michaalbert
#377 Added Sefaria corpus for download
#376 Add docstrings by @souravsingh
#372 English exception words to Latin tokenizer by @diyclassics
#371 Fix to path for cltk_data by @diyclassics
#370 Speed improvement to Latin macronizer by @TylerKirby
#364 Update punjabi.rst by @RatulGhosh
#363 Punjabi docs by @nimitbhardwaj
#362 New Bengali corpus from @RatulGhosh
#356 Remove most corpus downloads from test by @kylepjohnson


v0.1.42

@kylepjohnson kylepjohnson released this Aug 13, 2016 · 94 commits to master since this release

Fixes from @ryanfb #351 and @TylerKirby #350


v0.1.41

@kylepjohnson kylepjohnson released this Aug 11, 2016 · 98 commits to master since this release

Behind-the-scenes changes for easy-to-use corpus reader by @diyclassics (Patrick Burns), from PR #347


v0.1.40

@kylepjohnson kylepjohnson released this Aug 2, 2016 · 106 commits to master since this release

This release adds Tyler Kirby's (@TylerKirby) Latin macronizer. Usage example:

In [1]: from cltk.prosody.latin.macronizer import Macronizer

In [2]: macronizer = Macronizer('tag_ngram_123_backoff')

In [3]: text = 'Quo usque tandem, O Catilina, abutere nostra patientia?'

In [4]: macronizer.macronize_text(text)
Out[4]: 'quō usque tandem , ō catilīnā , abūtēre nostrā patientia ?'

In [5]: macronizer.macronize_tags(text)
Out[5]: [('quo', 'd--------', 'quō'), ('usque', 'd--------', 'usque'), ('tandem', 'd--------', 'tandem'), (',', 'u--------', ','), ('o', 'e--------', 'ō'), ('catilina', 'n-s---mb-', 'catilīnā'), (',', 'u--------', ','), ('abutere', 'v2sfip---', 'abūtēre'), ('nostra', 'a-s---fb-', 'nostrā'), ('patientia', 'n-s---fn-', 'patientia'), ('?', None, '?')]
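
The (token, tag, macronized form) triples returned above can be post-processed with plain Python. The following is a minimal sketch, not part of CLTK: the helper names `changed_tokens` and `rebuild_text` are hypothetical, and the data is simply the sample output copied from the session above.

```python
# Sample output copied from the macronize_tags() call shown above.
tagged = [
    ('quo', 'd--------', 'quō'),
    ('usque', 'd--------', 'usque'),
    ('tandem', 'd--------', 'tandem'),
    (',', 'u--------', ','),
    ('o', 'e--------', 'ō'),
    ('catilina', 'n-s---mb-', 'catilīnā'),
    (',', 'u--------', ','),
    ('abutere', 'v2sfip---', 'abūtēre'),
    ('nostra', 'a-s---fb-', 'nostrā'),
    ('patientia', 'n-s---fn-', 'patientia'),
    ('?', None, '?'),
]


def changed_tokens(triples):
    """Hypothetical helper: (original, macronized) pairs that differ."""
    return [(token, marked) for token, _tag, marked in triples if token != marked]


def rebuild_text(triples):
    """Hypothetical helper: join the macronized forms with single spaces."""
    return ' '.join(marked for _token, _tag, marked in triples)


print(changed_tokens(tagged))
print(rebuild_text(tagged))
```

Note that the tag can be `None` (as for the final `?`), so any code that inspects the tag should allow for that.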


v0.1.39

@kylepjohnson kylepjohnson released this Jul 29, 2016 · 113 commits to master since this release

This release allows users to define repositories/corpora that are not hosted at https://github.com/cltk. To do so, put a file at ~/cltk_data/distributed_corpora.yaml with an entry like the following for each repo:

example_distributed_latin_corpus:
    git_remote: git@github.com:kylepjohnson/latin_corpus_newton_example.git
    language: latin
    type: text
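
An entry in this layout could also be generated from a script. This is a rough sketch using only the standard library: `corpus_entry` and `write_distributed_corpora` are hypothetical helpers, not CLTK functions, and the demo writes to a temporary directory rather than the real ~/cltk_data so that it has no side effects.

```python
import tempfile
from pathlib import Path


def corpus_entry(name, git_remote, language, corpus_type):
    """Hypothetical helper: render one entry in the layout shown above."""
    return (
        f"{name}:\n"
        f"    git_remote: {git_remote}\n"
        f"    language: {language}\n"
        f"    type: {corpus_type}\n"
    )


def write_distributed_corpora(data_dir, entries):
    """Hypothetical helper: write entries to <data_dir>/distributed_corpora.yaml."""
    data_dir = Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)
    target = data_dir / "distributed_corpora.yaml"
    # Separate entries with a blank line for readability.
    target.write_text("\n".join(entries), encoding="utf-8")
    return target


# Demo: a temporary directory stands in for the home directory.
with tempfile.TemporaryDirectory() as tmp:
    entry = corpus_entry(
        "example_distributed_latin_corpus",
        "git@github.com:kylepjohnson/latin_corpus_newton_example.git",
        "latin",
        "text",
    )
    path = write_distributed_corpora(Path(tmp) / "cltk_data", [entry])
    print(path.read_text(encoding="utf-8"))
```

To target the real location, pass `Path.home() / "cltk_data"` as `data_dir`; be aware that this overwrites any existing file.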


v0.1.38

@kylepjohnson kylepjohnson released this Jul 16, 2016 · 146 commits to master since this release

Add syllabifier and tokenizer for Indian languages

By @soumyag213, who ported code from @anoopkunchukuttan's indic_nlp_library.

#245

Thank you to both!!!


v0.1.37

@kylepjohnson kylepjohnson released this Jul 11, 2016 · 149 commits to master since this release

Update tonos_oxia_converter() function.


v0.1.36

@kylepjohnson kylepjohnson released this Jun 24, 2016 · 159 commits to master since this release

mv lapos note


v0.1.34

@kylepjohnson kylepjohnson released this May 6, 2016 · 201 commits to master since this release

Triggering this release to stay up to date with what's on PyPI. Nothing was added except a small bug fix.
