Foreign Wikipedia accent issues with French #77

Closed
braddockcg opened this issue Sep 4, 2013 · 2 comments
@braddockcg

The encodings of the Wikipedia ZIM files are inconsistent (sometimes UTF-8, sometimes Latin, sometimes god knows what), and this is causing problems.

A field report from Haiti:
"PS a few of the most advanced kids played with and searched thru article titles with IIAB's French Wikipedia on their XO-1s, which works but French accents are quite badly mangled (each accented character shows up as 2 "random" characters, rendering the text almost unreadable in many cases). Apparently regardless whether the XO's set to English, French or Haitian Creole."

@braddockcg

"For example the very main page of http://schoolserver/iiab/zim/iframe/wikipedia_fr_all_09_2012 shows in both schools here in Haiti as:

Wikipédia est une encyclopédie multilingue, universelle et librement diffusable.

Which should appear as:

Wikipédia est une encyclopédie multilingue, universelle et librement diffusable."
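The symptom in the field report (each accented character turning into two "random" characters) matches what happens when UTF-8 bytes are decoded with a single-byte codec. The sketch below reproduces that effect in Python. It is an assumption about the cause, not something the report confirms, and the choice of Latin-1 as the wrong codec is only for illustration.

```python
# Hypothetical illustration; not taken from the IIAB or Kiwix code.
correct = "Wikipédia est une encyclopédie"

# In UTF-8, each "é" is stored as the two bytes 0xC3 0xA9.
raw = correct.encode("utf-8")

# A single-byte codec such as Latin-1 maps every byte to its own character,
# so each "é" comes out as two characters.
mangled = raw.decode("latin-1")
print(mangled)  # WikipÃ©dia est une encyclopÃ©die

# If this really is the cause, the damage is reversible:
repaired = mangled.encode("latin-1").decode("utf-8")
print(repaired == correct)  # True
```

If the text on the XO-1s looks like the `mangled` string, the bytes are probably fine and only the decoding step (or the declared charset) is wrong.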

@braddockcg

It is due to inconsistent character encodings in the Wikipedia dumps from the Kiwix project. We need to hack around it (and have indeed tried, but obviously not well enough).
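One possible shape for such a workaround is to try strict UTF-8 first and fall back to Latin-1 only when the bytes are not valid UTF-8. This is a minimal sketch, not the code IIAB actually uses: the function name `decode_zim_text` is hypothetical, and Latin-1 as the fallback is an assumption based on the "sometimes Latin" remark above.

```python
def decode_zim_text(raw: bytes) -> str:
    """Decode article bytes, preferring UTF-8 and falling back to Latin-1.

    Hypothetical helper: the name and the fallback codec are assumptions.
    """
    try:
        # Strict decoding raises on byte sequences that are not valid UTF-8.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Latin-1 assigns a character to every byte value, so this cannot fail.
        return raw.decode("latin-1")


# Both encodings of the same title decode to the same text.
print(decode_zim_text("Wikipédia".encode("utf-8")))    # Wikipédia
print(decode_zim_text("Wikipédia".encode("latin-1")))  # Wikipédia
```

This works because accented Latin-1 text is almost never valid UTF-8 by accident, but it cannot help with a dump in some third encoding, which would be decoded as Latin-1 and still come out wrong.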
