Skip to content

libindic/indicngram

master
Switch branches/tags

Name already in use

A tag already exists with the provided branch name. Many Git commands accept both tag and branch names, so creating this branch may cause unexpected behavior. Are you sure you want to create this branch?
Code

Latest commit

 

Git stats

Files

Permalink
Failed to load latest commit information.
Type
Name
Latest commit message
Commit time
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

LibIndic N-gram Generator

Build Status Coverage Status

An n-gram generator for indic languages.

What is Ngram?

An n-gram model is a type of probabilistic model for predicting the next item in a sequence. n-grams are used in various areas of statistical natural language processing and genetic sequence analysis.

An n-gram is a subsequence of n items from a given sequence. The items in question can be phonemes, syllables, letters, words or base pairs according to the application.

An n-gram of size 1 is referred to as a "unigram"; size 2 is a "bigram" (or, less commonly, a "digram"); size 3 is a "trigram"; and size 4 or more is simply called an "n-gram".

Installation

  1. Clone the repository git clone https://github.com/libindic/indicngram.git
  2. Change to the cloned directory cd indicngram
  3. Run setup.py to create installable source python setup.py sdist
  4. Install using pip pip install dist/libindic-ngram*.tar.gz

Usage

Input Parameters: Text and value of N (default value 2)
Output: List of grams


>>> from libindic.ngram import Ngram
>>> ngram_generator = Ngram()
>>> ngram_gerator(<text>, <window size>)
Example
>>> from libindic.ngram import Ngram
>>> ngram_generator = Ngram()
>>> text = "Languages"
>>> grams = ngram_generator.letterNgram(text, 3)
>>> print(grams)
['Lan', 'ang', 'ngu', 'gua', 'uag', 'age', 'ges']
>>> for gram in grams:
...     print("".join(gram))

Lan
ang
ngu
gua
uag
age
ges

Tests

Run tests with python setup.py test

Read the docs for more.

About

n-gram genereator for indic languages

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published