eroux/Dunhuang-unicode

Dunhuang-unicode

This repository contains a Unicode version of the current state of the Tibetan manuscripts from Dunhuang. It comes from two main sources:

The tags in the text have been removed and the safest option has been taken, but the text still contains mistakes and unknown blocks.

The aim is to be able to treat the Dunhuang texts as a digital corpus that can be analyzed. The project was initiated, and the material provided, by Nathan Hill for this purpose.

This work is released under the CC0 license, roughly equivalent to the public domain.

Archaic forms

The text still contains some archaic forms that make it difficult to analyze automatically, but are straightforward to "update":

  • འི འོ འང འམ are often (not always) separated from the syllable by a tsheg (་)
  • some reversed gigus appear instead of normal gigus
  • some anusvaras appear instead of the མ suffix
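
The three "updates" listed above could be sketched in Python roughly as follows. This is only an illustration and not part of the repository: the function name `normalize_archaic` is hypothetical, and the code rests on some assumptions. It assumes every reversed gigu (U+0F80) is an orthographic variant of the normal gigu (U+0F72), which would be wrong for Sanskrit transliteration. It assumes every anusvara (U+0F7E) is the last code point of its syllable, so it can be swapped for a མ suffix in place. It assumes a འི, འོ, འང or འམ that forms a whole syllable after a tsheg is a detached particle, which will wrongly merge genuine words such as the first syllable of འོ་ན.

```python
import re

TSHEG = "\u0f0b"          # ་  syllable separator
GIGU = "\u0f72"           # ི  normal gigu (vowel sign i)
REVERSED_GIGU = "\u0f80"  # ྀ  reversed gigu
ANUSVARA = "\u0f7e"       # ཾ  anusvara
MA = "\u0f58"             # མ  letter ma

# A tsheg between a Tibetan letter/sign and one of འི འོ འང འམ, where the
# particle is a complete syllable (not followed by another letter or sign,
# so that words such as འོད are left alone).
_DETACHED_PARTICLE = re.compile(
    "(?<=[\u0f40-\u0fbc])" + TSHEG
    + "(\u0f60[\u0f72\u0f7c\u0f44\u0f58])"
    + "(?![\u0f40-\u0fbc])"
)


def normalize_archaic(text: str) -> str:
    """Apply the three hedged 'updates' to a string of Tibetan Unicode text."""
    # Assumption: reversed gigu is always a variant spelling of gigu.
    text = text.replace(REVERSED_GIGU, GIGU)
    # Assumption: anusvara is the last code point of its syllable.
    text = text.replace(ANUSVARA, MA)
    # Drop the tsheg that separates the particle from the preceding syllable.
    return _DETACHED_PARTICLE.sub(r"\1", text)


# bla ma, then a detached genitive particle written with a reversed gigu
sample = "\u0f56\u0fb3\u0f0b\u0f58\u0f0b\u0f60\u0f80\u0f0b"
print(normalize_archaic(sample))
```

Since the separation is described as occurring "often (not always)", a pass like this is best run on a copy of the corpus and its output spot-checked against the original.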
