English #2

gerv opened this Issue Oct 29, 2010 · 0 comments



gerv commented Oct 29, 2010

My understanding is that each language file should list every character required to write any text in that language, not just the most common texts. Is that right? You want an "English" web page, whatever it says, to be renderable with a font which supports only the "English" set of characters.

If so, the list of characters required to write English text is much more than just a-z A-Z.

  • Numerals
  • Accented letters. Definitely é (café, résumé), è (blessèd), ï (naïve), ç (soupçon, façade) and almost certainly more. These are English words, even if they have been adopted from other languages. The correct English spelling of résumé is not "resume" - that's a different word.
  • Ligatures like æ (encyclopædia) and œ (fœtus) - these are not used in American English, but are in British English.
  • Apostrophe. You say you don't want to include punctuation, but "it's" and "its" are two different words, differentiated only by the apostrophe. And Mr O'Reilly is not Mr OReilly.
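
To make the point concrete, here is a minimal Python sketch of the coverage check implied above. Everything in it is an assumption for illustration: `ENGLISH_CHARS` and `missing_chars` are hypothetical names, not part of this project, and the set contains only the characters named in this issue (plus the typographic apostrophe U+2019), not a complete inventory.

```python
import string
import unicodedata

# Hypothetical "English" set built only from the categories listed above.
# The accented letters and ligatures are the ones named in this issue;
# a real language file would need many more (including uppercase forms).
ENGLISH_CHARS = (
    set(string.ascii_letters)
    | set(string.digits)                          # numerals
    | set(unicodedata.normalize("NFC", "éèïç"))   # accented letters
    | set("æœ")                                   # ligatures
    | set("'\u2019")                              # ASCII and typographic apostrophe
)

def missing_chars(text, charset=ENGLISH_CHARS):
    """Return the sorted non-whitespace characters of text that charset lacks."""
    # Normalise to NFC so "e" + combining accent compares equal to "é".
    text = unicodedata.normalize("NFC", text)
    return sorted({ch for ch in text if not ch.isspace() and ch not in charset})

print(missing_chars("Mr O'Reilly's résumé"))
print(missing_chars("naïve façade", set(string.ascii_letters)))
```

With the extended set the first call reports nothing missing, while the second call, restricted to a-z A-Z, reports the ï and ç that such a font could not render.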

