Non-ASCII text content is displayed garbled when loaded from a Gephi project file.
The issue has occurred since Project API 0.8.0.1. In fact, it appeared right after I updated the Project API plugin from 0.8.0 to 0.8.0.1.
This is reproducible on my Mac and possibly affects other, if not all, OS X systems.
Saving seems to work properly; the issue affects only loading. It does not matter whether the project was saved with the previous plugin version or the new one.
The issue is 100 % reproducible on my Mac. Just create a new project with a single node labeled
and save. As soon as the project is loaded again, the label says
My guess is that the issue affects systems whose Java runtime does NOT use UTF-8 as its default character encoding (e.g. the Java runtime on Mac OS).
So it might be reproducible on any Mac OS system, possibly even on a few other OSs.
My guess is that the issue might have been introduced with the fix for issue #465.
I have looked at the changes in 8f9dfd5 for a bit. The commit introduces an intermediate InputStreamReader (see line 103) whose purpose is to filter out certain characters. It now sits between the original input stream and the XMLStreamReader.
The original file input stream is not handed directly to the XMLStreamReader anymore. Instead it’s now the InputStreamReader’s duty to decode bytes to characters – and to use the proper character encoding (XMLStreamReader did both before the change).
However, the constructor in line 103 – InputStreamReader(java.io.InputStream) – by specification just asks the Java runtime for the default encoding, which happens to be MacRoman on the Mac.
If my guesswork is correct, the actual XML should be UTF-8 … so the InputStreamReader is given the wrong encoding and produces incorrect characters, which it then hands down to the XMLStreamReader.
Perhaps this could be fixed by just using InputStreamReader(java.io.InputStream, java.nio.charset.Charset) instead, a version of the constructor which takes the character encoding as an argument.
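To illustrate what I mean, here is a minimal sketch, not the actual Gephi code. The class name CharsetDemo and the helper readAll are made up for this example, and ISO-8859-1 stands in for MacRoman because MacRoman is not guaranteed to be available on every Java runtime. The same UTF-8 bytes are decoded once with the wrong charset and once with the charset passed explicitly to the InputStreamReader constructor.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it is not part of Gephi.
public class CharsetDemo {

    // Reads the whole stream, decoding bytes with the given charset.
    static String readAll(InputStream in, Charset charset) {
        try (Reader reader = new InputStreamReader(in, charset)) {
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[1024];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // What InputStreamReader(InputStream) would pick up on this machine.
        System.out.println("Default charset: " + Charset.defaultCharset());

        // "Zoë" as it would be stored in a UTF-8 encoded XML file.
        byte[] utf8 = "Zo\u00eb".getBytes(StandardCharsets.UTF_8);

        // Wrong: decoding with a non-UTF-8 charset (stand-in for MacRoman).
        String garbled = readAll(new ByteArrayInputStream(utf8),
                StandardCharsets.ISO_8859_1);

        // Right: naming the charset explicitly.
        String correct = readAll(new ByteArrayInputStream(utf8),
                StandardCharsets.UTF_8);

        System.out.println("Garbled length: " + garbled.length());
        System.out.println("Correct length: " + correct.length());
    }
}
```

With the wrong charset, the two bytes that encode the single character ë are decoded as two separate characters, so the label comes out longer and garbled. Hard-coding UTF-8 is itself an assumption; if project files can declare another encoding in the XML prolog, that would need to be honored instead.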
The issue went away when I downgraded to version 0.8.0 of the Project API plugin.
Issue #474 (November 15, 2011) mentions a similar problem with GEXF files.
Fix issue #488
Fix Issue #488 "Some text appears garbled when loaded from a project …
Thank you for your help.
I just committed the fix.
Just saw this bug report. That's a great example of collaboration. Well done!
Yeah, great work claui :D
By the way, I tried to merge with your commit here claui/gephi@92809b3 but I could not find how to do it with git (because it is not a branch of gephi master). Is it possible?
I think it would be better to create a pull request for that. claui could just select which commits to include.
Hi mbastian and eduramiba,
Aww, thanks a lot for the kind words =)
Thanks for the pointer.
Should I also commit this change
to my fork … before I make the pull request?
I don’t really understand how this merging (magic ^^) works and am a bit anxious that I might break something …
Oh, don't worry. You don't have to do anything now.
I was just wondering how to include your changes without rewriting them and keep your authorship visible in the log (git handles this a little differently than bazaar).
Maybe I could have pulled it without request and pushed it to master.
The way to do this is using pull requests (http://help.github.com/send-pull-requests/).
Another way would be the following.
Thanks, I think I understand now