enhance "hindi" alphabet.py (#751)
* add vocal categories and ॐ (omm) symbol

Indian languages derived from Sanskrit arrange their alphabets according to vocal properties, and the groups are clearly distinct. I have added those categories. The symbol "ॐ" is also widely used in Indian languages, so it is included as well.

* fix extra space

Fix the extra space I left by mistake in the previous commit.

* Update alphabet.py
blue-atom authored and kylepjohnson committed Apr 2, 2019
1 parent 9828ef7 commit 254ca70
Showing 1 changed file with 21 additions and 1 deletion.
22 changes: 21 additions & 1 deletion cltk/corpus/hindi/alphabet.py
@@ -21,7 +21,7 @@


#the Semivowels are also in the script of hindi
-SEMIVOWELS = ['य ','र ','ल' ,'व']
+SEMIVOWELS = ['य','र','ल' ,'व']

#There are three sibilants:
SIBILANTS = ['श','ष','स']
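
The `SEMIVOWELS` change matters because list membership compares whole strings, so an entry with a trailing space never matches the bare character. A minimal sketch of the effect, with both versions of the list copied from the two `SEMIVOWELS` lines above; the names `OLD_SEMIVOWELS`, `NEW_SEMIVOWELS`, and `is_semivowel` are hypothetical and not part of the commit:

```python
# Hypothetical names; the values are copied from the old and new SEMIVOWELS lines.
OLD_SEMIVOWELS = ['य ', 'र ', 'ल', 'व']   # first two entries carry a trailing space
NEW_SEMIVOWELS = ['य', 'र', 'ल', 'व']


def is_semivowel(char, table):
    """Return True if char appears in table (exact string comparison)."""
    return char in table


# Check each bare semivowel against both versions of the list.
old_hits = [c for c in 'यरलव' if is_semivowel(c, OLD_SEMIVOWELS)]
new_hits = [c for c in 'यरलव' if is_semivowel(c, NEW_SEMIVOWELS)]

print(old_hits)  # the entries with a trailing space are missed
print(new_hits)  # all four semivowels are found
```

With the old list, only the two entries without a trailing space match, which is the kind of silent lookup failure the fix removes.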
@@ -31,3 +31,23 @@
# Anusvara is used for final velar nasal sound, Visarga adds voiceless breath after vowel and Candrabindu is used to nasalize vowels

MODIFIERS = ['◌্','◌ঁ','◌ং','◌ঃ']
+
+# classification of alphabets according to how their sound is produced
+
+VELAR_CONSONANTS = [ 'क' , 'ख' , 'ग' , 'घ' , 'ङ' ]
+
+PALATAL_CONSONANTS = ['च' , 'छ' , 'ज' , 'झ' , 'ञ' ]
+
+RETROFLEX_CONSONANTS = ['ट' , 'ठ' , 'ड' , 'ढ' , 'ण']
+
+DENTAL_CONSONANTS = ['त' , 'थ' , 'द' , 'ध' , 'न' ]
+
+LABIAL_CONSONANTS = ['प' , 'फ' , 'ब' , 'भ' , 'म']
+
+SONORANT_CONSONANTS = ['य' , 'र' , 'ल' , 'व']
+
+SIBILANT_CONSONANTS = ['श' , 'ष' , 'स']
+
+GUTTURAL_CONSONANT = ['ह']
+
+SIGNS= ['ॐ']
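
One way the new category lists might be used is to build a reverse lookup from a character to its class. The sketch below is an illustration only: the lists are copied from the diff so the block is self-contained, and `ARTICULATION`, `CHAR_TO_CLASS`, `classify`, and `classify_word` are hypothetical names that do not exist in `alphabet.py`.

```python
# Lists copied from the diff above so the sketch runs on its own.
VELAR_CONSONANTS = ['क', 'ख', 'ग', 'घ', 'ङ']
PALATAL_CONSONANTS = ['च', 'छ', 'ज', 'झ', 'ञ']
RETROFLEX_CONSONANTS = ['ट', 'ठ', 'ड', 'ढ', 'ण']
DENTAL_CONSONANTS = ['त', 'थ', 'द', 'ध', 'न']
LABIAL_CONSONANTS = ['प', 'फ', 'ब', 'भ', 'म']
SONORANT_CONSONANTS = ['य', 'र', 'ल', 'व']
SIBILANT_CONSONANTS = ['श', 'ष', 'स']
GUTTURAL_CONSONANT = ['ह']

# Hypothetical grouping: class label -> list of characters.
ARTICULATION = {
    'velar': VELAR_CONSONANTS,
    'palatal': PALATAL_CONSONANTS,
    'retroflex': RETROFLEX_CONSONANTS,
    'dental': DENTAL_CONSONANTS,
    'labial': LABIAL_CONSONANTS,
    'sonorant': SONORANT_CONSONANTS,
    'sibilant': SIBILANT_CONSONANTS,
    'guttural': GUTTURAL_CONSONANT,
}

# Reverse index: character -> class label. Each character is in exactly one list.
CHAR_TO_CLASS = {
    char: label
    for label, chars in ARTICULATION.items()
    for char in chars
}


def classify(char):
    """Return the class label for a consonant, or None if it is not listed."""
    return CHAR_TO_CLASS.get(char)


def classify_word(word):
    """Pair every code point in word with its class label (None if unlisted)."""
    return [(char, classify(char)) for char in word]


print(classify('क'))
print(classify_word('कमल'))
```

Characters outside these lists, such as vowel signs or `ॐ`, come back as `None`, so a caller can tell consonants apart from everything else without a separate check.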
