Odd conversion from UTF-8 to ISO-8859-1 #33

JoakimLofgren · 2016-02-25T22:15:52Z

Compatible ISO-8859-1 characters like åäö are converted into numeric HTML entities when they don't need to be...

In my opinion, only characters that are unsupported by the target charset or that clash with XML (e.g. <, > and others) should be converted into numeric HTML entities.

If I remember correctly, this was not the case in version 3.0.1 of this library.
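
The rule proposed above can be sketched roughly as follows. This is only an illustration of the idea, not the library's actual code: `encode_for_latin1_xml()` is a hypothetical helper name, and the sketch assumes PHP 7.2+ with the mbstring extension available.

```php
<?php
// Illustrative sketch only; this is NOT the phpxmlrpc implementation.
// encode_for_latin1_xml() is a hypothetical helper name.
function encode_for_latin1_xml(string $utf8): string
{
    // Split the input into whole UTF-8 characters.
    // preg_split() returns false when the input is not valid UTF-8.
    $chars = preg_split('//u', $utf8, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }

    // Characters that clash with XML markup always need escaping.
    $xmlSpecial = [
        '&' => '&amp;',
        '<' => '&lt;',
        '>' => '&gt;',
        '"' => '&quot;',
        "'" => '&apos;',
    ];

    $out = '';
    foreach ($chars as $char) {
        if (isset($xmlSpecial[$char])) {
            $out .= $xmlSpecial[$char];
            continue;
        }
        $codePoint = mb_ord($char, 'UTF-8');
        if ($codePoint <= 0xFF) {
            // ISO-8859-1 maps code points 0-255 to the same byte values,
            // so characters such as å, ä and ö become a single byte.
            $out .= chr($codePoint);
        } else {
            // Not representable in ISO-8859-1: fall back to a numeric entity.
            $out .= '&#' . $codePoint . ';';
        }
    }

    return $out;
}
```

With this sketch, å, ä and ö come out as the single ISO-8859-1 bytes 0xE5, 0xE4 and 0xF6, `<` becomes `&lt;`, and a character outside ISO-8859-1 such as € becomes `&#8364;`.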

JoakimLofgren · 2016-02-25T22:19:40Z

I've started playing with some possible solutions (I referenced the commit to this issue).

@gggeek Any thoughts or concerns on my WIP stuff?

gggeek · 2016-02-28T19:19:51Z

Thanks for providing test cases.

Just to be sure that I understand the situation correctly and can reproduce the problem:

- Does the problem arise when the lib receives latin-1 chars from external sources and passes utf8 to the app, or vice versa, or both ways? Is there a charset declaration in the xml prolog and/or a BOM involved?
- You are providing test code; did you investigate the root of the problem?
- Are the 2 test classes in your code alternatives to do the exact same thing?

I ask because a few fixes were indeed made to the code in the area of charset handling, to make sure that some corner cases are properly handled, and I would expect the lib to actually fare better...

JoakimLofgren · 2016-02-28T19:37:05Z

The problem is when you have a UTF-8 file with å and then you try to send it as ISO-8859-1 (e.g. it goes into this case in the Charset file).

Well, neither the charset declaration nor the BOM is the issue, as there are no "garbled" characters.

I ran into the issue when trying to send XML with the Client. It seems that it HTML-entity-encodes characters that need not be encoded.

The CharsetFixTest file is my attempt at writing a fix (with the code in the test file) and should be removed before being merged.

DianonForce · 2016-03-15T09:47:35Z

Hi guys,
is there a possible solution or a workaround for this problem?
I need to connect to an xmlrpc server that requires user information from me. Some idiot decided that my username should contain 'ö' instead of 'oe', so the 'ö' is converted to '&#246;', which gives me a 'no such user' error in the response from the xmlrpc server.

gggeek · 2016-03-16T12:43:01Z

Sorry, I have been too busy to work on this. I will do my best to find enough time this coming weekend.

gggeek · 2016-03-26T22:37:47Z

@DianonForce the simplest workaround, it seems to me, is to set
$client->request_charset_encoding = 'UTF-8';
Did you try it?
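
For context, the workaround might be applied along these lines. This is a minimal sketch that assumes phpxmlrpc 4.x installed via Composer; the endpoint URL, the method name `user.getInfo` and the username are placeholders, not anything taken from this issue.

```php
<?php
// Assumes phpxmlrpc 4.x was installed with Composer.
require __DIR__ . '/vendor/autoload.php';

use PhpXmlRpc\Client;
use PhpXmlRpc\Request;
use PhpXmlRpc\Value;

// Placeholder endpoint: replace with the real server URL.
$client = new Client('https://example.com/xmlrpc');

// Send the request body as UTF-8, so that non-ASCII characters such as 'ö'
// do not have to be converted for an ISO-8859-1 payload.
$client->request_charset_encoding = 'UTF-8';

// 'user.getInfo' is a hypothetical method name; "J\u{F6}rg" is "Jörg" in UTF-8.
$request = new Request('user.getInfo', [new Value("J\u{F6}rg", 'string')]);
$response = $client->send($request);

if ($response->faultCode() !== 0) {
    echo 'Fault: ', $response->faultString(), "\n";
} else {
    var_dump($response->value());
}
```

This only helps if the server accepts UTF-8 request bodies.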

gggeek · 2016-03-27T00:20:41Z

Fixed in version 4.0.1.
Thanks @JoakimLofgren, I incorporated your test case, albeit slightly reduced.
As for version 3.0.1, I checked the code for character encoding conversion and it was the same as in version 4, so if this was not happening before, it must have been a long time ago.

JoakimLofgren · 2016-03-27T13:42:51Z

Thanks @gggeek

I'll check my use case on Tuesday. :)

JoakimLofgren added a commit to JoakimLofgren/phpxmlrpc that referenced this issue Feb 25, 2016

WIP fix for charset encodeEntities

838391e

Test out some solutions for convertion UTF8/Latin1 Possible solution(s) for gggeek#33

gggeek added a commit that referenced this issue Mar 27, 2016

Fix for issue #33: excessive usage of numeric charset entities when c…

d8e180b

…onverting utf8 to latin-1

gggeek closed this as completed Mar 27, 2016

JoakimLofgren mentioned this issue Mar 27, 2016

Added option to handle other local charsets than ISO-8859-1 klarna/php-xmlrpc#12

Closed
