Handle non-ascii characters #177

jfly · 2014-10-02T20:19:53Z

Sébastien just tried to generate scrambles for "Sébra Open", and the PDF title was "SÃ©bra Open".
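For reference, the garbled title matches what you get when UTF-8 bytes are decoded as windows-1252 (or ISO-8859-1). The sketch below only reproduces the symptom; it does not claim to show where in TNoodle the wrong decoding happens, and the function name is made up for illustration.

```python
def misdecode_utf8_as_cp1252(text: str) -> str:
    """Encode text as UTF-8, then (wrongly) decode those bytes as windows-1252.

    Hypothetical helper, only meant to reproduce the symptom from the report.
    """
    return text.encode("utf-8").decode("cp1252")


# "é" is the two bytes C3 A9 in UTF-8; windows-1252 reads them as "Ã" and "©".
garbled_title = misdecode_utf8_as_cp1252("Sébra Open")
assert garbled_title == "SÃ©bra Open"
```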

jfly · 2014-10-16T18:27:22Z

Olivér just tried to use a password that contains an accented character, and he couldn't open the zip. I don't know if this is a bug on our side, or (less likely) a bug with his zip program (Total Commander).
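One possible explanation, not confirmed anywhere in this thread: zip encryption works on the password's bytes, so two programs that encode the same typed password differently end up with different keys. A minimal sketch, using a made-up password and three encodings chosen only as examples:

```python
# Hypothetical password, for illustration only.
password = "sécret"

# The same text yields different byte sequences under different encodings.
utf8_bytes = password.encode("utf-8")      # b"s\xc3\xa9cret"
cp1252_bytes = password.encode("cp1252")   # b"s\xe9cret"
cp437_bytes = password.encode("cp437")     # b"s\x82cret"

# If the writer and the reader of the zip disagree on the encoding,
# they derive the key from different bytes and decryption fails.
all_distinct = len({utf8_bytes, cp1252_bytes, cp437_bytes}) == 3
```

A pure ASCII password encodes to the same bytes in all three encodings, which would explain why only accented passwords are affected.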

campos20 · 2016-12-07T19:00:48Z

I think that if #159 is implemented, it may fix this as well.

campos20 · 2017-11-14T22:30:15Z

When the internationalization files were encoded in ISO-8859-1, everything worked fine. After switching them to UTF-8, this very same problem appeared in the FMC translation.

[Screenshots: ISO, UTF8]
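This behaviour is what you would expect if the code reading the translation files assumes ISO-8859-1 regardless of how the file was saved. That is an assumption; the thread does not show the loader. A small sketch with a throwaway file (the file name and the sample word are invented):

```python
import os
import tempfile


def read_with(path: str, encoding: str) -> str:
    """Read a text file using an explicitly chosen encoding."""
    with open(path, encoding=encoding) as f:
        return f.read()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "fmc_pt.txt")  # hypothetical translation file
    with open(path, "w", encoding="utf-8") as f:
        f.write("Solução")

    read_as_utf8 = read_with(path, "utf-8")         # matches what was written
    read_as_latin1 = read_with(path, "iso-8859-1")  # each UTF-8 byte becomes its own character

assert read_as_utf8 == "Solução"
assert read_as_latin1 == "SoluÃ§Ã£o"
```

Passing the encoding explicitly on both the writing and the reading side avoids depending on a platform or library default.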

jfly · 2017-11-15T02:57:19Z

@campos20, yikes! Let's continue this discussion over on #242. This issue is for non-ascii characters in the filename, whereas you're running into trouble with non-ascii characters in the pdfs.

campos20 · 2019-03-23T04:49:59Z

I revisit this from time to time whenever I have a new idea. There is a lot of chained code to dig through, but I think I found something that may be useful. When I open http://localhost:2014 (the legacy UI, let's say) and save the .html to my computer, this is what I get:

<!DOCTYPE html>
<!-- saved from url=(0188)http://localhost:2014/scramble-legacy/#competitionName=T%C3%A9st&rounds=i('eventID'-'333'_'round'-'1'_'scrambleSetCount'-1_'scrambleCount'-5_'extraScrambleCount'-2_'copies'-1)!&version=1.0 -->
<html slick-uniqueid="3"><head><meta http-equiv="Content-Type" content="text/html; charset=windows-1252">
<title>WCA Scramble Program</title>

There's a charset=windows-1252 in there. I tried to find where it is generated, but I haven't found anything yet. I believe the charset should be set to utf-8.
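To make the link between the URL and the charset concrete: `T%C3%A9st` in the saved URL is the percent-encoded UTF-8 form of "Tést". Decoding those bytes as windows-1252 instead gives the distorted name. The sketch uses Python's standard `urllib.parse.unquote` purely as an illustration; which component of TNoodle does the decoding is not established here.

```python
from urllib.parse import unquote

# Value of competitionName taken from the saved URL above.
encoded_name = "T%C3%A9st"

# unquote() decodes the percent-escaped bytes as UTF-8 by default.
name_as_utf8 = unquote(encoded_name)

# Forcing windows-1252 turns the two bytes C3 A9 into two separate characters.
name_as_cp1252 = unquote(encoded_name, encoding="cp1252")

assert name_as_utf8 == "Tést"
assert name_as_cp1252 == "TÃ©st"
```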

Edit to include some details about what I've tried: I've put System.out.println in some places for debugging, and it looks like TNoodle's backend already receives the distorted version. It's a bit strange that the name is not distorted in the downloaded .zip, but it is distorted if you put a println here.
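One caveat when debugging with print statements: console output goes through the console's own encoding, so a string can look distorted on screen while being intact in memory. This may or may not be what happened here. Dumping code points instead of the text avoids that ambiguity. A minimal sketch, with helper names invented for the example:

```python
from typing import List, Optional


def code_points(text: str) -> List[str]:
    """Render each character as U+XXXX, which is ASCII-only and console-safe."""
    return [f"U+{ord(c):04X}" for c in text]


def undo_cp1252_mojibake(text: str) -> Optional[str]:
    """Try to reverse a UTF-8-read-as-windows-1252 mix-up.

    Returns None when the text cannot be the result of that mix-up.
    """
    try:
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return None


# An intact "é" is a single code point; the distorted form is two.
intact = code_points("é")        # ["U+00E9"]
distorted = code_points("Ã©")    # ["U+00C3", "U+00A9"]
repaired = undo_cp1252_mojibake("TÃ©st")
```

If the dump at the backend shows `U+00C3 U+00A9`, the string really was damaged before it arrived; if it shows `U+00E9`, only the console output is misleading.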

campos20 · 2019-03-24T00:39:28Z

This code "fixed" the encoding for PDFs, using the legacy UI.

gregorbg · 2020-01-20T08:06:09Z

This is fixed for the PDF library and file encodings now, but interestingly the ZIP filenames can still be broken under certain circumstances. Keeping this open to continue the investigation.
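Some background that may help with the ZIP investigation: the ZIP format stores entry names as raw bytes, and bit 11 (0x800) of an entry's general purpose flags marks the name as UTF-8. Readers typically fall back to a legacy code page when the bit is not set. The sketch below uses Python's `zipfile` module to inspect that flag. It says nothing about how TNoodle's own zip library behaves, and the entry names are made up.

```python
import io
import zipfile
from typing import Tuple

UTF8_NAME_FLAG = 0x800  # general purpose bit 11: entry name is UTF-8


def roundtrip_entry(name: str) -> Tuple[str, bool]:
    """Write one empty entry to an in-memory zip and read it back.

    Returns the name as read back and whether the UTF-8 flag was set.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(name, b"")
    buf.seek(0)
    with zipfile.ZipFile(buf) as zf:
        info = zf.infolist()[0]
        return info.filename, bool(info.flag_bits & UTF8_NAME_FLAG)


# Python sets the flag only when the name is not pure ASCII.
accented = roundtrip_entry("Tést.pdf")
plain = roundtrip_entry("Test.pdf")
```

Checking this flag on a generated archive is a quick way to tell whether a broken name comes from the writer (flag missing or wrong bytes) or from the program used to open the zip.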

gregorbg · 2020-02-23T00:43:20Z

Fixed as part of the Kotlin migrations -- strings are handled consistently as UTF-8 there and the translation files were converted accordingly.

jfly mentioned this issue Nov 15, 2017

Internationalization support in FMC #242

Merged

campos20 added a commit to campos20/tnoodle that referenced this issue Mar 23, 2019

windows-1252 to utf-8 for PDFs fix thewca#177

3e0a07e

campos20 mentioned this issue Mar 23, 2019

windows-1252 to utf-8 for PDFs fix #177 #398

Closed

gregorbg mentioned this issue Jan 20, 2020

Competition titles encoding can get mangled #410

Closed

gregorbg closed this as completed Feb 23, 2020
