# Odia with CLTK

You can now analyse Odia texts with CLTK!


## Odia Alphabet

The Odia alphabet has 14 vowels, 25 structured consonants, and 11 unstructured consonants. CLTK exposes these, together with the ten Odia numerals, as lists:

In [1]:
from cltk.corpus.odia.alphabet import VOWELS, STRUCTURED_CONSONANTS, UNSTRUCTURED_CONSONANTS, NUMERALS
print("Vowels: ", VOWELS)
print("Structured consonants: ", STRUCTURED_CONSONANTS)
print("Unstructured consonants: ", UNSTRUCTURED_CONSONANTS)
print("Numerals: ", NUMERALS)

Vowels:  ['ଅ', 'ଆ', 'ଇ', 'ଈ', 'ଉ', 'ଊ', 'ଋ', 'ୠ', 'ଌ', 'ୡ', 'ଏ', 'ଐ', 'ଓ', 'ଔ']
Structured consonants:  ['କ', 'ଖ', 'ଗ', 'ଘ', 'ଙ', 'ଚ', 'ଛ', 'ଜ', 'ଝ', 'ଞ', 'ଟ', 'ଠ', 'ଡ', 'ଢ', 'ଣ', 'ତ', 'ଥ', 'ଦ', 'ଧ', 'ନ', 'ପ', 'ଫ', 'ବ', 'ଭ', 'ମ']
Unstructured consonants:  ['ଯ', 'ୟ', 'ର', 'ଲ', 'ଳ', 'ୱ', 'ଶ', 'ଷ', 'ସ', 'ହ', 'କ୍ଷ']
Numerals:  ['୦', '୧', '୨', '୩', '୪', '୫', '୬', '୭', '୮', '୯']
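
The lists above can be used to label the characters of a word. The following is a minimal sketch, not part of CLTK: the inventories are copied from the output above so that the block runs without CLTK installed, and `classify_char` is a hypothetical helper name.

```python
# Letter inventories copied from the output above (assumption: they match
# what cltk.corpus.odia.alphabet exports), so this runs without CLTK.
VOWELS = ['ଅ', 'ଆ', 'ଇ', 'ଈ', 'ଉ', 'ଊ', 'ଋ', 'ୠ', 'ଌ', 'ୡ', 'ଏ', 'ଐ', 'ଓ', 'ଔ']
STRUCTURED_CONSONANTS = ['କ', 'ଖ', 'ଗ', 'ଘ', 'ଙ', 'ଚ', 'ଛ', 'ଜ', 'ଝ', 'ଞ',
                         'ଟ', 'ଠ', 'ଡ', 'ଢ', 'ଣ', 'ତ', 'ଥ', 'ଦ', 'ଧ', 'ନ',
                         'ପ', 'ଫ', 'ବ', 'ଭ', 'ମ']
UNSTRUCTURED_CONSONANTS = ['ଯ', 'ୟ', 'ର', 'ଲ', 'ଳ', 'ୱ', 'ଶ', 'ଷ', 'ସ', 'ହ', 'କ୍ଷ']
NUMERALS = ['୦', '୧', '୨', '୩', '୪', '୫', '୬', '୭', '୮', '୯']


def classify_char(ch):
    """Hypothetical helper: name the inventory a single character belongs to."""
    if ch in VOWELS:
        return "vowel"
    if ch in STRUCTURED_CONSONANTS:
        return "structured consonant"
    if ch in UNSTRUCTURED_CONSONANTS:
        return "unstructured consonant"
    if ch in NUMERALS:
        return "numeral"
    # Dependent vowel signs, the virama, punctuation, and so on land here.
    return "other"


# Label every code point of an example word.
for ch in "ସୁଗନ୍ଧ":
    print(repr(ch), classify_char(ch))
```

Because the lookup works one code point at a time, the conjunct `'କ୍ଷ'` (three code points) is never matched as a unit, and dependent vowel signs are reported as `"other"`.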


## Transliterations

Odia text can be transliterated into the scripts of other Indic languages. Let us take an example Odia sentence and transliterate it into the Devanagari script used for Hindi:

In [3]:
odia_text = "ମୁଁ ତୁମକୁ ସାହାଯ୍ୟ କରିପାରେ କି? "
print(odia_text)

ମୁଁ ତୁମକୁ ସାହାଯ୍ୟ କରିପାରେ କି? 


In [7]:
from cltk.corpus.sanskrit.itrans.unicode_transliterate import UnicodeIndicTransliterator
# "or" and "hi" are the language codes for Odia (source) and Hindi (target)
UnicodeIndicTransliterator.transliterate(odia_text, "or", "hi")

'मुँ तुमकु साहाय्य़ करिपारे कि? '
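
One reason script-to-script transliteration between Indic languages is tractable is that Unicode lays out the Odia block (U+0B00 to U+0B7F) and the Devanagari block (U+0900 to U+097F) in parallel, so most letters sit at the same offset in both. The sketch below illustrates that idea only; it is not claimed to be how `UnicodeIndicTransliterator` is implemented, and `odia_to_devanagari_by_offset` is a hypothetical name.

```python
ODIA_START, ODIA_END = 0x0B00, 0x0B7F  # Unicode Oriya (Odia) block
DEVANAGARI_START = 0x0900              # Unicode Devanagari block


def odia_to_devanagari_by_offset(text):
    """Shift each Odia code point to the same offset in the Devanagari block.

    Characters outside the Odia block (spaces, punctuation, Latin letters)
    are passed through unchanged. This is a simplification: a few Odia code
    points have no true Devanagari counterpart at the same offset.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if ODIA_START <= cp <= ODIA_END:
            out.append(chr(cp - ODIA_START + DEVANAGARI_START))
        else:
            out.append(ch)
    return "".join(out)


print(odia_to_devanagari_by_offset("କି?"))
```

A production transliterator also has to handle letters that exist in only one of the two scripts, which plain offset arithmetic cannot do.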

We can also romanize Odia text using the ITRANS scheme:

In [11]:
from cltk.corpus.sanskrit.itrans.unicode_transliterate import ItransTransliterator
odia_text_two = "ଘାଗୁଡି"
ItransTransliterator.to_itrans(odia_text_two, 'or')


'ghaaguDi'

Similarly, we can convert text given in ITRANS romanization back into Odia script:

In [9]:
odia_text_itrans = 'sundara'
ItransTransliterator.from_itrans(odia_text_itrans,'or')

'ସୁନ୍ଦର'

## Syllabifier

We can use the `indian_syllabifier` module to syllabify Odia words. It depends on the `sanskrit_models_cltk` corpus, which must be imported first as shown below. The import might take some time.

In [6]:
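from cltk.corpus.utils.importer import CorpusImporter
# The syllabifier's phonetic models are distributed with the Sanskrit corpora
phonetics_model_importer = CorpusImporter('sanskrit')
# Optional: inspect the corpora available for import
phonetics_model_importer.list_corpora
phonetics_model_importer.import_corpus('sanskrit_models_cltk')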

Now we import the syllabifier and syllabify as follows:

In [13]:
%%capture
from cltk.stem.sanskrit.indian_syllabifier import Syllabifier
odia_syllabifier = Syllabifier('oriya')
odia_syllables = odia_syllabifier.orthographic_syllabify('ସୁଗନ୍ଧ')

The syllables of the word ସୁଗନ୍ଧ are:

In [14]:
print(odia_syllables)

['ସୁ', 'ଗ', 'ନ୍ଧ']
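
To make the grouping above more concrete, here is a minimal sketch of one way orthographic syllables can be formed using only the Python standard library. It is an illustrative simplification, not CLTK's algorithm, and `simple_orthographic_syllabify` is a hypothetical name. The rule assumed here is that combining marks (vowel signs, the virama, and similar) attach to the preceding letter, and that a letter following a virama stays in the same conjunct.

```python
import unicodedata

VIRAMA = "\u0b4d"  # Odia sign virama (halanta); joins consonants into a conjunct


def simple_orthographic_syllabify(word):
    """Split a single word into orthographic syllables (simplified rule)."""
    syllables = []
    prev = ""
    for ch in word:
        # Unicode categories Mn/Mc/Me are combining marks.
        is_mark = unicodedata.category(ch).startswith("M")
        if syllables and (is_mark or prev == VIRAMA):
            # Marks, and letters right after a virama, extend the current syllable.
            syllables[-1] += ch
        else:
            # Any other character starts a new syllable.
            syllables.append(ch)
        prev = ch
    return syllables


print(simple_orthographic_syllabify("ସୁଗନ୍ଧ"))
```

The sketch is meant for single words: spaces and punctuation would each be returned as their own "syllable", and it does not use the phonetic models that the CLTK syllabifier loads.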
