Garbled Japanese characters in GRASS plugin #14461

Closed
qgib opened this issue Nov 17, 2011 · 13 comments
Labels
Bug Either a bug report, or a bug fix. Let's hope for the latter! GRASS

@qgib
Contributor

qgib commented Nov 17, 2011

Author Name: Masaru Narazaki Narazaki (Masaru Narazaki Narazaki)
Original Redmine Issue: 4547
Affected QGIS version: master
Redmine category:grass
Assignee: Giuseppe Sucameli


When the GRASS plugin is used in a Japanese environment, Japanese characters come out garbled when files are added, so the correct Japanese text cannot be read.
This problem reportedly began with QGIS version 1.0.
Please correct it.


@qgib
Contributor Author

qgib commented Nov 17, 2011

Author Name: Giovanni Manghi (@gioman)


  • fixed_version_id was changed from Version 1.6.0 to Version 1.8.0
  • subject was changed from Garbring Japanes character in GRASS plugin to Garbled Japanese characters in GRASS plugin

@qgib
Contributor Author

qgib commented May 12, 2012

Author Name: Alexander Bruy (@alexbruy)


Looks like a duplicate of #13224 (same issue for Cyrillic).


  • version was configured as master
  • crashes_corrupts_data was configured as 0

@qgib
Contributor Author

qgib commented Sep 4, 2012

Author Name: Paolo Cavallini (@pcav)


  • fixed_version_id was changed from Version 1.8.0 to Version 2.0.0

@qgib
Contributor Author

qgib commented Oct 25, 2012

Author Name: Paolo Cavallini (@pcav)


So, this turned out to be a practically unsolvable problem in GRASS. Quoting Glynn Clements:

===
There are two issues for which there is no viable solution:

  1. OEM encoding.
  2. Shift-JIS.

Regarding #1: GRASS neither knows nor cares whether a string is in
ANSI or OEM encoding. Much of it doesn't care about encodings at all,
and just treats strings as sequences of bytes. Anything which needs to
care about the encoding (e.g. the GUI) will just use "the locale's
encoding", which on Windows means "the ANSI codepage". If you use the
OEM codepage for anything, you lose.

Suggestions as to how to determine whether a string uses the ANSI or
OEM page are welcome, if unlikely.

Regarding #2: On Windows, any byte within the range 0-127 is assumed
to represent the corresponding ASCII character. For encodings which
assign other characters to any byte within that range (either
individually or as part of a multi-byte sequence), that is likely to
cause problems.

The most obvious example is that any occurrence of the byte 0x5C
within a filename is assumed to be a directory separator.
Unfortunately, Shift-JIS uses 0x5C as the second byte of a multi-byte
sequence, meaning that Japanese filenames may be parsed incorrectly.

Neither EUC-JP nor UTF-8 has this problem (as these only re-purpose
codes above 128), but unfortunately Windows doesn't provide locales
which use either of these encodings.

And I can't think of any solution which doesn't involve re-writing all
code which handles pathnames.

Similar issues may exist with the other punctuation characters which
are "mingled" with the alphabetic characters, i.e. "[\]^_{|}~" (e.g. |
is commonly used as a field separator, so tabular data which includes
Japanese text may be parsed incorrectly).

While such cases are probably less common than the pathname issue, a
fix is even less viable (i.e. fixing all string-handling code).

-- Glynn Clements glynn@gclements.plus.com

So the solution seems to be simply switching to EN, on Windows only.
That seems an easy fix.
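
The byte-level problem described in the quote can be reproduced with a short sketch. It assumes Python's `cp932` codec as a stand-in for the Japanese Windows ANSI code page; the characters "ソ" and "ポ" and the sample path are illustrative choices, not taken from this report.

```python
def ascii_range_bytes(text, encoding):
    """Return the bytes below 0x80 produced when `text` is encoded."""
    return [b for b in text.encode(encoding) if b < 0x80]

# "ソ" is a single katakana character, so no ASCII byte is expected at all.
for enc in ("cp932", "euc_jp", "utf-8"):
    print(enc, [hex(b) for b in ascii_range_bytes("ソ", enc)])

# Byte-wise path splitting, as code that treats strings as plain bytes would do.
# The trail byte 0x5C of "ソ" is mistaken for a directory separator.
path = ("C:\\data\\" + "ソ" + ".txt").encode("cp932")
parts = path.split(b"\\")
print(parts)

# The same happens with "|" (0x7C) used as a field separator:
# the trail byte of "ポ" splits the first field in two.
row = ("ポ" + "|1").encode("cp932")
fields = row.split(b"|")
print(fields)
```

Only the Shift-JIS family places a byte from the ASCII range inside a multi-byte character here, which is why the path splits into four parts instead of three and the row into three fields instead of two.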

@qgib
Contributor Author

qgib commented Oct 25, 2012

Author Name: Minoru Akagi (@minorua)


In a Japanese Windows environment, GRASS commands output the XML text of the interface description, which begins with the following line.

@<?xml version="1.0" encoding="CP932"?>@

QDomDocument is able to detect the encoding, but it doesn't recognize most codepage names of the form "CPxxx". See http://qt-project.org/doc/qt-4.8/QTextCodec.html

I think it is better not to rely on the current encoding conversion ability of QDomDocument. Since GRASS commands usually output text in the system default encoding, we should probably treat any encoding name that Qt doesn't recognize as the system encoding.
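
The fallback rule proposed above can be sketched as follows. This is only an illustration in Python, not the attached patch: `resolve_codec` is a hypothetical helper, and `CP99999` is a made-up name standing in for a label the toolkit does not know (Python itself does recognize `CP932`, unlike the Qt behaviour described here).

```python
import codecs
import locale

def resolve_codec(declared_name):
    """Return a usable codec name for an encoding label.

    If the label is not recognized, assume the text is in the
    system default encoding, as GRASS commands usually output that.
    """
    try:
        return codecs.lookup(declared_name).name
    except LookupError:
        system_name = locale.getpreferredencoding(False)
        return codecs.lookup(system_name).name

print(resolve_codec("UTF-8"))    # recognized label
print(resolve_codec("CP99999"))  # unrecognized label: system encoding is used
```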


  • 5020 was configured as grassplugin1.patch

@qgib
Contributor Author

qgib commented Oct 25, 2012

Author Name: Paolo Cavallini (@pcav)


  • pull_request_patch_supplied was changed from 0 to 1

@qgib
Contributor Author

qgib commented Oct 25, 2012

Author Name: Marco Hugentobler (@mhugent)


  • assigned_to_id was configured as Radim Blazek

@qgib
Contributor Author

qgib commented Nov 1, 2012

Author Name: Paolo Cavallini (@pcav)


May be a duplicate of #13224. Please close it if this is the case.

@qgib
Contributor Author

qgib commented Nov 1, 2012

Author Name: Minoru Akagi (@minorua)


Okay, I've attached a new patch that also includes the fix for #13224.


  • 5030 was configured as grassplugin2.patch

@qgib
Contributor Author

qgib commented Nov 3, 2012

Author Name: Giuseppe Sucameli (@brushtyler)


Hi Minoru,
the patch looks good to me.

I'm adding a check so that, if we are not able to get the encoding from the XML declaration (using UTF-8 and the regular expression), we'll let Qt detect the encoding of the XML (current behaviour).

This will keep it working even if the encoding name is not found, e.g. when the encoding attribute is missing (though we are quite sure GRASS won't remove it) or when the XML content is a UTF-16 or UTF-32 encoded string (in which case the regexp doesn't match the text).
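
The check described above might look roughly like this. It is a Python sketch under stated assumptions, not the code in the branch: the pattern and the helper name `declared_encoding` are hypothetical, and the sketch matches raw bytes instead of decoding as UTF-8 first.

```python
import re

# Hypothetical pattern: an XML declaration at the very start of the output,
# carrying an encoding attribute in single or double quotes.
_DECLARATION = re.compile(
    rb'^\s*<\?xml[^>]*\bencoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)

def declared_encoding(xml_bytes):
    """Return the declared encoding name, or None if it cannot be read.

    None means "leave detection to the XML parser", which covers a missing
    encoding attribute as well as UTF-16 input, where the bytes of the
    declaration do not match the pattern.
    """
    match = _DECLARATION.match(xml_bytes)
    return match.group(1).decode("ascii") if match else None

print(declared_encoding(b'<?xml version="1.0" encoding="CP932"?><task/>'))
print(declared_encoding(b'<?xml version="1.0"?><task/>'))
print(declared_encoding('<?xml version="1.0" encoding="UTF-16"?>'.encode("utf-16")))
```

Returning `None` instead of guessing keeps the existing behaviour as the fallback, so output that the pattern cannot handle is treated exactly as before.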

Since I cannot test it with Japanese lang, please, could you try the branch "grass_jp_enc":https://github.com/brushtyler/Quantum-GIS/tree/grass_jp_enc from my repo and report if it works?

@qgib
Contributor Author

qgib commented Nov 3, 2012

Author Name: Minoru Akagi (@minorua)


Giuseppe Sucameli wrote:

Since I cannot test it with Japanese lang, please, could you try the branch "grass_jp_enc":https://github.com/brushtyler/Quantum-GIS/tree/grass_jp_enc from my repo and report if it works?

I've just tested your branch and got a good result. Thanks!

@qgib
Contributor Author

qgib commented Nov 4, 2012

Author Name: Giuseppe Sucameli (@brushtyler)


Fixed in changeset "c53c85813f4723b75d4e9326d2565fb51eaa8355".


  • status_id was changed from Open to Closed

@qgib
Contributor Author

qgib commented Nov 4, 2012

Author Name: Giuseppe Sucameli (@brushtyler)


Thanks Minoru Akagi!
I hope we haven't broken other languages :)

Now that the change is in master, please could other people test it and report here?


  • assigned_to_id was changed from Radim Blazek to Giuseppe Sucameli

@qgib qgib added Bug Either a bug report, or a bug fix. Let's hope for the latter! GRASS labels May 24, 2019
@qgib qgib added this to the Version 2.0.0 milestone May 24, 2019
@qgib qgib closed this as completed May 24, 2019