Convert Brill's Encyclopaedia of Islam (and possibly others) into proper Unicode

Moarc/brilldecode

What is Brillcode?

A very apt name for the encoding we're dealing with.

The Brill Encyclopædia of Islam CD-ROM edition (2003) uses a custom font in which regular characters are replaced with the glyphs needed for transcription. The articles themselves are encoded in Windows-1252, and CSS switches to this special font (Baskerville for Brill 02) wherever it is needed. With modern stuff like webfonts, the custom font could be bundled for machines that don't have it installed, but copy-pasting from the articles would still result in a mess, and it just doesn't feel right. So I wrote a shitty script to convert the articles into proper Unicode (it recently became a bit less shitty).
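To make the idea concrete, here is a minimal sketch of that kind of conversion. It is not the script from this repository. The table entries are placeholders, since the real mapping depends on the glyphs in the Brill font, and the names `EXAMPLE_TABLE`, `decode_article` and `convert_span` are made up for illustration.

```python
# Hypothetical excerpt of a conversion table. The pairs below are
# placeholders, NOT the actual Baskerville for Brill 02 mapping.
EXAMPLE_TABLE = str.maketrans({
    "ä": "ā",  # assumed: the a-umlaut slot shows a-macron in the custom font
    "ç": "ḥ",  # assumed: the c-cedilla slot shows h-with-dot-below
})


def decode_article(raw: bytes) -> str:
    """Decode the raw article bytes, which are stored as Windows-1252."""
    return raw.decode("cp1252")


def convert_span(text: str, table=EXAMPLE_TABLE) -> str:
    """Translate text that was styled with the custom font into Unicode.

    Only text inside spans that use the custom font should be passed in;
    text in the regular font is already correct after decoding.
    """
    return text.translate(table)
```

For example, `convert_span(decode_article(b"\xe4b\xe7"))` decodes the bytes to `äbç` and then applies the placeholder table to each character.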

Further goals

  • iterating over the entire encyclopedia
  • (possibly) reading directly from the CD or an image of the CD
  • output to something like slob
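The first goal could look roughly like the sketch below. It assumes the articles have already been copied to a directory as individual files and that a conversion function is available. The `*.htm` pattern, the directory layout, and the name `convert_tree` are assumptions, not details of the actual CD.

```python
from pathlib import Path
from typing import Callable


def convert_tree(
    src: Path,
    dst: Path,
    convert: Callable[[str], str],
    pattern: str = "*.htm",  # assumed file extension
) -> int:
    """Convert every matching file under src and write UTF-8 copies to dst.

    Returns the number of files converted.
    """
    count = 0
    for path in sorted(src.rglob(pattern)):
        # The source articles are Windows-1252 encoded.
        text = path.read_bytes().decode("cp1252")
        out = dst / path.relative_to(src)
        out.parent.mkdir(parents=True, exist_ok=True)
        out.write_text(convert(text), encoding="utf-8")
        count += 1
    return count
```

Passing the conversion function in as a parameter keeps the directory walk independent of whichever conversion table ends up being used.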

Postscript

I'm not sure whether the Ba00/Ba01 fonts include anything other than regular Windows-1252 characters. It didn't seem so to me, since the Ö's etc. are displayed properly with other fonts. If they do, I'll create another "conversion table" for those too.
