Devanagari conjunct list for comparing fonts #354

Closed
drdhaval2785 opened this issue Aug 12, 2021 · 3 comments
Labels
Documentation How TXT , XML work

Comments

@drdhaval2785
Contributor

Reference - sanskrit-lexicon/PWG#5

A discussion was started there about checking how conjuncts render in the Siddhanta and Adhishila fonts.
A working list of conjuncts from MW was also used there to generate a test display.

For a fuller list of conjuncts for font testing, I have extracted the individual consonants and the conjuncts from vcp.txt and put them in the file attached below. The entries are in conjunct:frequency format, sorted from most to least frequent.

The entries at the end of the list (the rarest ones) may be errors, but entries occurring more than 5 times are most probably genuine Sanskrit conjuncts.
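The frequency cutoff mentioned above can be applied mechanically. This is only a minimal sketch, assuming every line of the list has the conjunct:frequency form described here; the function name `frequent_conjuncts` and the sample lines are hypothetical and are not taken from the attached file.

```python
def frequent_conjuncts(lines, threshold=5):
    """Return (conjunct, frequency) pairs occurring more than `threshold` times."""
    kept = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Split on the last colon so the conjunct part stays whole.
        conjunct, _, freq = line.rpartition(':')
        # Skip malformed lines instead of failing on them.
        if freq.isdigit() and int(freq) > threshold:
            kept.append((conjunct, int(freq)))
    return kept


# Made-up sample lines in conjunct:frequency format (not real counts).
sample = ["tra:1200", "kza:5", "ktva:6", "bad line"]
print(frequent_conjuncts(sample))
```

Raising or lowering `threshold` trades completeness against the risk of keeping erroneous clusters.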

We may check the fonts further using this dataset.
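To look at the entries in an actual font, they first have to be turned into Devanagari. The sketch below assumes the entries are written in SLP1 transliteration (the consonant letters in the regex match the SLP1 consonant set) and that each entry is a consonant cluster optionally followed by `a`. The names `SLP1_CONSONANTS` and `cluster_to_devanagari` are illustrative and not part of any existing tool.

```python
# SLP1 consonant letters mapped to the corresponding Devanagari consonants.
SLP1_CONSONANTS = {
    'k': 'क', 'K': 'ख', 'g': 'ग', 'G': 'घ', 'N': 'ङ',
    'c': 'च', 'C': 'छ', 'j': 'ज', 'J': 'झ', 'Y': 'ञ',
    'w': 'ट', 'W': 'ठ', 'q': 'ड', 'Q': 'ढ', 'R': 'ण',
    't': 'त', 'T': 'थ', 'd': 'द', 'D': 'ध', 'n': 'न',
    'p': 'प', 'P': 'फ', 'b': 'ब', 'B': 'भ', 'm': 'म',
    'y': 'य', 'r': 'र', 'l': 'ल', 'v': 'व',
    'S': 'श', 'z': 'ष', 's': 'स', 'h': 'ह',
}
VIRAMA = '\u094d'  # DEVANAGARI SIGN VIRAMA, joins consonants into a conjunct


def cluster_to_devanagari(entry):
    """Convert an SLP1 consonant cluster (optionally ending in 'a') to Devanagari."""
    # The inherent vowel 'a' needs no sign in Devanagari, so it is simply dropped.
    cluster = entry[:-1] if entry.endswith('a') else entry
    return VIRAMA.join(SLP1_CONSONANTS[ch] for ch in cluster)


# Illustrative clusters only, chosen by hand.
for entry in ['kza', 'tra', 'jYa']:
    print(entry, cluster_to_devanagari(entry))
```

Pasting the printed strings into a page styled with each font under test gives a side-by-side view of how the conjuncts are shaped.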

conjunct_frequency.txt

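Tech note:
The text was split with conjuncts = re.split(r'[^kKgGNcCjJYwWqQRtTdDnpPbBmyrlvSzsh]', data), i.e. on every character that is not one of the listed consonant letters, so each remaining piece is an unbroken run of consonants.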
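A small illustration of what this split does, as a sketch on a made-up sample string rather than on vcp.txt. The helper name `consonant_runs` is hypothetical. Empty strings, which `re.split` returns wherever two separators are adjacent, are discarded here.

```python
import re

CONSONANTS = 'kKgGNcCjJYwWqQRtTdDnpPbBmyrlvSzsh'
# Any character outside the consonant set acts as a separator.
splitter = re.compile('[^' + CONSONANTS + ']')


def consonant_runs(text):
    """Return the non-empty runs of consonant letters found in text."""
    return [run for run in splitter.split(text) if run]


# Hand-written sample in the same transliteration as the regex.
print(consonant_runs('kzatriya Darma'))
```

Single consonants such as `y` and `D` come out alongside true conjuncts such as `kz`, `tr`, and `rm`, which is why the resulting list contains individual letters as well as conjuncts.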

@drdhaval2785
Contributor Author

program

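import re
from collections import Counter

# Read the whole dictionary file as UTF-8 text.
with open('/var/www/html/cologne/csl-orig/v02/vcp/vcp.txt', encoding='utf-8') as fin:
    data = fin.read()

# Split on every character that is not one of the listed consonant letters.
conjuncts = re.split(r'[^kKgGNcCjJYwWqQRtTdDnpPbBmyrlvSzsh]', data)

# Drop the empty strings that re.split yields between adjacent separators.
conj_counter = Counter(c for c in conjuncts if c)

# Print in conjunct:frequency format, most frequent first ('a' is appended to each cluster).
for conj, freq in conj_counter.most_common():
    print(conj + 'a:' + str(freq))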

@gasyoun
Member

gasyoun commented Aug 12, 2021

Simple, yet lovely. I made such a list for the Rigveda before, and Ulrich Stiehl did so even 10 years before I did. It remains interesting; thanks, Dhaval.

@drdhaval2785
Contributor Author

Served its purpose.
