Problem with < character wrongly replaced with &lt; #502

Closed
iBotPeaches opened this Issue Mar 18, 2015 · 12 comments

Owner

iBotPeaches commented Mar 18, 2015

Original issue 391 created by huexxx on 2013-01-09T13:59:02.000Z:

What steps will reproduce the problem?

  1. Decompile an apk with apktool 1.5.1
  2. Look at /res/values xml files

What is the expected output? What do you see instead?

The expected output is the < character instead of an XML-encoded &lt;

What version of the product are you using? On what operating system?

apktool 1.5.1 over windows 7 x64

Please provide any additional information below.

The problem appeared while trying to modify an LG settings.apk file. After modding it, the strings in some menus inside the app were malformed.

Looking into it, I discovered the following: some < are replaced with &lt; when the characters are part of xml code.

For example:

<item>&lt;font size="16" align="middle">Small&lt;/font></item>

... where it should be ...

<item><font size="16" align="middle">Small</font></item>

This example is from /res/values/arrays.xml, but the problem extends to many arrays.xml and strings.xml files.

I'm batch-replacing the characters, but it is not an easy solution.

Can you take a look at this? You can download the apk and the needed framework files from here: http://dl.dropbox.com/u/4629711/IZS/apks.zip

Thanks and best regards.

Owner

iBotPeaches commented Mar 18, 2015

Comment #1 originally posted by huexxx on 2013-01-09T14:12:14.000Z:

Batch-replacing is very hard, because there are two different situations:

  • When < is part of xml code, it should appear as < instead of &lt;
  • When < is part of a string, it should be kept as &lt; to avoid a malformed XML.

Regards.
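
The two situations above can be told apart only heuristically. As a rough sketch of a manual workaround (not something Apktool does), the following unescapes `&lt;` only when it opens or closes a tag from a small allow-list. The class name, method name, and the allow-list (`font`, `b`, `i`, `u`) are assumptions made for illustration; a real APK may need a different list, and text such as `&lt;i 5 > 3` would still be misread as a tag.

```java
import java.util.regex.Pattern;

class SelectiveUnescape {
    // Hypothetical allow-list of tag names treated as markup; adjust per APK.
    private static final Pattern ESCAPED_TAG =
            Pattern.compile("&lt;(/?(?:font|b|i|u)\\b[^<>]*>)");

    // Turns "&lt;font ...>" back into "<font ...>" but leaves a lone "&lt;" untouched.
    static String unescapeKnownTags(String line) {
        return ESCAPED_TAG.matcher(line).replaceAll("<$1");
    }

    public static void main(String[] args) {
        // Markup case from the report: both escaped tags are restored.
        System.out.println(unescapeKnownTags(
                "<item>&lt;font size=\"16\" align=\"middle\">Small&lt;/font></item>"));
        // Plain-text case: the comparison sign stays escaped so the XML remains well-formed.
        System.out.println(unescapeKnownTags("<string name=\"cmp\">1 &lt; 2</string>"));
    }
}
```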

Owner

iBotPeaches commented Mar 18, 2015

Comment #2 originally posted by connor.tumbleson on 2013-01-09T14:18:57.000Z:

It must not be re-encoding the entities back to the right form.

A bit confusing though, as it says that the java will read the xml in the "escaped" form back to normal.

"Because the fromHtml(String) method will format all HTML entities, be sure to escape any possible HTML characters in the strings you use with the formatted text, using htmlEncode(String). For instance, if you'll be passing a string argument to String.format() that may contain characters such as "<" or "&", then they must be escaped before formatting, so that when the formatted string is passed through fromHtml(String), the characters come out the way they were originally written."

http://developer.android.com/guide/topics/resources/string-resource.html

Ahh, but this is for a string-array. It probably doesn't take effect there. This doesn't seem like the intended use of <item>'s, but I'll see what I can do.

Owner

iBotPeaches commented Mar 18, 2015

Comment #3 originally posted by huexxx on 2013-01-09T14:31:02.000Z:

Aha... then it seems that I cannot do anything ATM... because it's a re-encoding issue.

If I have understood you well, the decoded xml is correct with all those &lt;, but you have to modify the re-encoding procedure to correctly de-escape the '&lt;' characters from inside a string-array.

Right, I'll wait for your solution. Thanks!

Owner

iBotPeaches commented Mar 18, 2015

Comment #4 originally posted by Vitos.Laszlo on 2013-01-15T09:25:38.000Z:

FWIW, I think you guys are off. It's up to the application's developer to escape the strings based on the application's needs and the way the resource string is used in the app. When decompiling/compiling, you must not change the strings, or you get the problem described in this report.

Owner

iBotPeaches commented Mar 18, 2015

Comment #5 originally posted by connor.tumbleson on 2013-01-23T14:51:43.000Z:

The only bug I found in Apktool related to this is that & and < are double escaped on each run.

So & -> &amp; -> &amp;amp; -> &amp;amp;amp;

on each run (same with <). If I prevent double encoding it'll fix some encoding problems I found in arrays.xml. Will work on that after I finish up some other bugs.
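
The compounding described above can be reproduced with a minimal sketch. The class and method names here are hypothetical, and this is not Apktool's actual code or the eventual fix: `escape` is a naive escaper that always rewrites `&` and `<`, while `escapeOnce` skips any `&` that already begins one of the five predefined XML entities, so applying it repeatedly does not change the result.

```java
class DoubleEscapeDemo {
    // Naive escaper: rewrites every '&' and '<', even inside existing entities.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;");
    }

    // Guarded escaper: leaves '&' alone when it already starts a predefined entity.
    static String escapeOnce(String s) {
        return s.replaceAll("&(?!(?:amp|lt|gt|quot|apos);)", "&amp;")
                .replace("<", "&lt;");
    }

    public static void main(String[] args) {
        String naive = "&";
        String guarded = "&";
        for (int run = 1; run <= 3; run++) {
            naive = escape(naive);        // grows by one "amp;" per run
            guarded = escapeOnce(guarded); // stable after the first run
            System.out.println("run " + run + ": " + naive + " vs " + guarded);
        }
    }
}
```

One trade-off of the guard: it cannot tell an already-escaped entity from text that was meant to contain the literal characters `&amp;`, so it is a heuristic rather than a lossless round trip.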

Owner

iBotPeaches commented Mar 18, 2015

Comment #6 originally posted by Vitos.Laszlo on 2013-01-23T16:32:37.000Z:

I'm still not seeing how this could possibly work. In ResArrayValue.serializeToResValuesXml() it calls for all items encodeAsResXmlItemValue() which for strings seems to end up in ResXmlEncoders.encodeAsXmlValue(string) which in turn goes on to escape all kinds of characters. What if the characters were not escaped in the original xml - just like in the original issue report? They will get encoded (at least once) which will result in the application (in this case, the LG settings.apk) misinterpreting them.

Instead - in case the strings get escaped when decompiling, the serialization process should unescape them (and not escape again), otherwise it should leave them alone.

Owner

iBotPeaches commented Mar 18, 2015

Comment #7 originally posted by huexxx on 2013-01-23T16:38:29.000Z:

IMHO the problem is the following:

  • '<' is correctly escaped on decoding.
  • '&lt;' is not de-escaped on re-encoding (maybe because it is inside an
    <item> line of a string-array).
  • If you decompile again, in the sources you can see that &lt; has been
    replaced by &amp;lt;, so it is escaping '&', which shows us that the
    problem is in the encoding (leaving '<' in its escaped form '&lt;').

Where is the problem? I don't know. How to solve it? I don't know.

Owner

iBotPeaches commented Mar 18, 2015

Comment #8 originally posted by connor.tumbleson on 2013-01-23T16:44:11.000Z:

It has nothing to do with what the default is in the application.

XML can't handle those characters that have special meaning (& and <). Our two options are CDATA or escaping them. CDATA is not an option, so we must escape to &amp; etc.

Look in AOSP -> https://github.com/android/platform_packages_apps_settings/blob/master/res/values/strings.xml#L583

It has & escaped so it can be built. If developers are using hacky methods instead of escaping special characters that is their fault and not Apktool's. I will fix the double encoding problem.
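
To illustrate the well-formedness point with a generic XML parser, here is a small sketch using the JDK's standard `DocumentBuilder` (not aapt or Apktool, whose handling of resource XML may differ). The class name `XmlTextDemo` and helper `rootText` are made up for this example. It returns the text of the root element, or `null` when the input is not well-formed.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class XmlTextDemo {
    // Parses the snippet and returns the root element's text, or null if parsing fails.
    static String rootText(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            return null; // not well-formed (or parser setup failed)
        }
    }

    public static void main(String[] args) {
        // A bare '&' is rejected by the parser.
        System.out.println(rootText("<string>Tom & Jerry</string>"));
        // Escaped and CDATA forms both parse and yield the literal characters.
        System.out.println(rootText("<string>Tom &amp; Jerry</string>"));
        System.out.println(rootText("<string><![CDATA[1 < 2]]></string>"));
        // A properly nested child element is also well-formed XML.
        System.out.println(rootText("<item><font size=\"16\">Small</font></item>"));
    }
}
```

With the JDK's default parser, the failing case may also print a fatal-error message to standard error before `null` is returned.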

Owner

iBotPeaches commented Mar 18, 2015

Comment #9 originally posted by Vitos.Laszlo on 2013-01-23T17:00:22.000Z:

Guys, just look at http://developer.android.com/guide/topics/resources/string-resource.html carefully.

In the "Styling with HTML markup" section it clearly states that "You can add styling to your strings with HTML markup" and then it goes on to give examples - notice that there's no escaping. Never once does it mention that XML can't handle < and it must be escaped (it might still be the case for & though).

Only after that it says that sometimes you must escape but only when it will be used as a format string or otherwise passed through fromHtml() in the application itself.

Owner

iBotPeaches commented Mar 18, 2015

Comment #10 originally posted by connor.tumbleson on 2013-03-11T14:12:39.000Z:

As for the issue in the OP: that isn't a problem with Apktool, it is that application's fault. Nowhere on the Android website does it say that using tags is acceptable in <item>'s of an array. You should use styles.xml for styling text, not hardcode tags like that.

Apktool is a disassembler; it's not going to "fix broken apks". The developer didn't write valid code, in my opinion.

Not to mention, I don't think that's an allowed HTML element. I think the only allowed ones are <b>, <i>, <u>, and that's only in strings.xml. I have no clue what the limits or bounds are for arrays.

At comment # 9, I think you didn't read it fully :p

" For instance, if you'll be passing a string argument to String.format() that may contain characters such as "<" or "&", then they must be escaped before formatting, so that when the formatted string is passed through fromHtml(String), the characters come out the way they were originally written. "

The only problem here is the double escaping of < and & which I found out is due to our XMLWriter escaping elements before writing the XML along with our ResDecoder during decompilation. So here it is

& (orig) -> &amp; (res decoder) -> &amp;amp; (xml writer) -> Thus on the phone, when it's passed into fromHtml, it comes in as "&amp;amp;" and is then treated as &amp; instead of & (thus the bug).

I only do this in quick free blocks of time, so if anyone has more information pertaining to XML and arrays, then please share. I use AOSP and the dev docs as references, so please stick to those two when making points.

Owner

iBotPeaches commented Mar 18, 2015

Comment #11 originally posted by connor.tumbleson on 2013-05-05T13:15:14.000Z:

Fixed in commit: f93a312

Will be included in Apktool 2.0

Owner

iBotPeaches commented Mar 18, 2015

Comment #12 originally posted by connor.tumbleson on 2013-10-12T20:35:07.000Z:

Issue 525 has been merged into this issue.
