UTF8 string too large on v53 #25

Closed
aeongdesu opened this issue Aug 31, 2022 · 12 comments · Fixed by #55
Labels
bug Something isn't working

Comments

@aeongdesu
aeongdesu commented Aug 31, 2022

ubuntu@fv48:~$ sh dex-tools-2.2-SNAPSHOT/d2j-dex2jar.sh pico.apk
dex2jar pico.apk -> ./pico-dex2jar.jar
GLITCH: 0004 L_a/a/_a;.<init>()V | not enough space for reading instruction
GLITCH: 000c L_/1;.<init>()V | not enough space for reading instruction
Applying workaround to method L_a/a/_a;#___a with original signature null by changing its types to java.lang.Object.
java.lang.IllegalArgumentException
        at org.objectweb.asm.ByteVector.putUTF8(ByteVector.java:213)
        at org.objectweb.asm.ClassWriter.newUTF8(ClassWriter.java:1092)
        at org.objectweb.asm.MethodWriter.<init>(MethodWriter.java:469)
        at org.objectweb.asm.ClassWriter.visitMethod(ClassWriter.java:793)
        at org.objectweb.asm.ClassVisitor.visitMethod(ClassVisitor.java:305)
        at org.objectweb.asm.commons.RemappingClassAdapter.visitMethod(RemappingClassAdapter.java:99)
        at org.objectweb.asm.ClassVisitor.visitMethod(ClassVisitor.java:305)
        at com.googlecode.d2j.dex.Dex2Asm.collectBasicMethodInfo(Dex2Asm.java:285)
        at com.googlecode.d2j.dex.Dex2Asm.convertMethod(Dex2Asm.java:611)
        at com.googlecode.d2j.dex.Dex2Asm.convertClass(Dex2Asm.java:469)
        at com.googlecode.d2j.dex.Dex2Asm.convertClass(Dex2Asm.java:380)
        at com.googlecode.d2j.dex.Dex2Asm.convertDex(Dex2Asm.java:508)
        at com.googlecode.d2j.dex.Dex2jar.doTranslate(Dex2jar.java:180)
        at com.googlecode.d2j.dex.Dex2jar.to(Dex2jar.java:280)
        at com.googlecode.dex2jar.tools.Dex2jarCmd.doCommandLine(Dex2jarCmd.java:112)
        at com.googlecode.dex2jar.tools.BaseCmd.doMain(BaseCmd.java:290)
        at com.googlecode.dex2jar.tools.Dex2jarCmd.main(Dex2jarCmd.java:33)

ubuntu@fv48:~$ java --version
openjdk 11.0.16.1 2022-08-12
OpenJDK Runtime Environment Temurin-11.0.16.1+1 (build 11.0.16.1+1)
OpenJDK 64-Bit Server VM Temurin-11.0.16.1+1 (build 11.0.16.1+1, mixed mode)

apk file: https://cdn.discordapp.com/attachments/757985854588190731/1014526587410075708/vr-_01.20.00.apk

I tried to convert dex -> java, but it showed the errors above.
It didn't work with pxb1988's dex2jar either.

I don't know much about this, but is it possible to fix?

aeongdesu changed the title from "not enough space for reading instruction on v53" to "IllegalArgumentException on v53" on Aug 31, 2022
@ThexXTURBOXx
Owner

ThexXTURBOXx commented Aug 31, 2022

Error reproducible on my end. However, I have a slightly different stacktrace (are you really using v53?):

dex2jar vr-_01.20.00.apk -> .\vr-_01.20.00-dex2jar.jar
GLITCH: 0004 L_a/a/_a;-><init>()V | not enough space for reading instruction
GLITCH: 000c L_/1;-><init>()V | not enough space for reading instruction
java.lang.IllegalArgumentException: UTF8 string too large
        at org.objectweb.asm.ByteVector.putUTF8(ByteVector.java:255)
        at org.objectweb.asm.SymbolTable.addConstantUtf8(SymbolTable.java:774)
        at org.objectweb.asm.MethodWriter.<init>(MethodWriter.java:601)
        at org.objectweb.asm.ClassWriter.visitMethod(ClassWriter.java:468)
        at org.objectweb.asm.ClassVisitor.visitMethod(ClassVisitor.java:365)
        at org.objectweb.asm.commons.ClassRemapper.visitMethod(ClassRemapper.java:187)
        at org.objectweb.asm.ClassVisitor.visitMethod(ClassVisitor.java:365)
        at com.googlecode.d2j.dex.Dex2Asm.collectBasicMethodInfo(Dex2Asm.java:352)
        at com.googlecode.d2j.dex.Dex2Asm.convertMethod(Dex2Asm.java:746)
        at com.googlecode.d2j.dex.Dex2Asm.convertClass(Dex2Asm.java:549)
        at com.googlecode.d2j.dex.Dex2Asm.convertClass(Dex2Asm.java:450)
        at com.googlecode.d2j.dex.Dex2Asm.convertDex(Dex2Asm.java:615)
        at com.googlecode.d2j.dex.Dex2jar.doTranslate(Dex2jar.java:146)
        at com.googlecode.d2j.dex.Dex2jar.to(Dex2jar.java:246)
        at com.googlecode.dex2jar.tools.Dex2jarCmd.doCommandLine(Dex2jarCmd.java:103)
        at com.googlecode.dex2jar.tools.BaseCmd.doMain(BaseCmd.java:297)
        at com.googlecode.dex2jar.tools.Dex2jarCmd.main(Dex2jarCmd.java:16)

I will see if I can do anything about that

@aeongdesu
Author

@ThexXTURBOXx oh well... i was confused with original snapshot version 😔 anyways thank you, I'll wait

aeongdesu changed the title from "IllegalArgumentException on v53" to "UTF8 string too large on v53" on Aug 31, 2022
ThexXTURBOXx added the bug label on Sep 26, 2022
@stefan123t
stefan123t commented Sep 22, 2023

@ThexXTURBOXx thanks for maintaining this toolchain!
I found that your repo already has an open issue on the "UTF8 string too large" error, which is not the case upstream.

I used this APK com.hm.hemaiInstall1 v1.1.10:
https://apkpure.com/s-miles-installer/com.hm.hemaiInstall1/versions

As the application I want to decompile was built in China and contains quite a few UTF-8 Unicode strings, I guess that the handling of UTF-8 Unicode is not yet working. Also, when further decompiling the resulting jar files with jd-gui, I see quite a few Chinese characters, but they cannot be copied from the resulting source. Some non-space characters may be included in such strings.

$ sh d2j-dex2jar.sh -f s-miles.apk 
dex2jar s-miles.apk -> ./s-miles-dex2jar.jar
java.lang.IllegalArgumentException: UTF8 string too large
	at org.objectweb.asm.ByteVector.putUTF8(ByteVector.java:255)
	at org.objectweb.asm.SymbolTable.addConstantUtf8(SymbolTable.java:774)
	at org.objectweb.asm.SymbolTable.addConstantUtf8Reference(SymbolTable.java:1007)
	at org.objectweb.asm.SymbolTable.addConstantString(SymbolTable.java:604)
	at org.objectweb.asm.SymbolTable.addConstant(SymbolTable.java:474)
	at org.objectweb.asm.MethodWriter.visitLdcInsn(MethodWriter.java:1280)
	at org.objectweb.asm.MethodVisitor.visitLdcInsn(MethodVisitor.java:562)
	at org.objectweb.asm.commons.MethodRemapper.visitLdcInsn(MethodRemapper.java:196)
	at org.objectweb.asm.tree.LdcInsnNode.accept(LdcInsnNode.java:75)
	at org.objectweb.asm.tree.InsnList.accept(InsnList.java:144)
	at org.objectweb.asm.tree.MethodNode.accept(MethodNode.java:749)
	at com.googlecode.d2j.dex.ExDex2Asm.convertCode(ExDex2Asm.java:36)
	at com.googlecode.d2j.dex.Dex2jar$2.convertCode(Dex2jar.java:126)
	at com.googlecode.d2j.dex.Dex2Asm.convertMethod(Dex2Asm.java:821)
	at com.googlecode.d2j.dex.Dex2Asm.convertClass(Dex2Asm.java:567)
	at com.googlecode.d2j.dex.Dex2Asm.convertClass(Dex2Asm.java:468)
	at com.googlecode.d2j.dex.Dex2Asm.convertDex(Dex2Asm.java:633)
	at com.googlecode.d2j.dex.Dex2jar.doTranslate(Dex2jar.java:181)
	at com.googlecode.d2j.dex.Dex2jar.doTranslate(Dex2jar.java:53)
	at com.googlecode.d2j.dex.Dex2jar.to(Dex2jar.java:281)
	at com.googlecode.dex2jar.tools.Dex2jarCmd.doCommandLine(Dex2jarCmd.java:104)
	at com.googlecode.dex2jar.tools.BaseCmd.doMain(BaseCmd.java:297)
	at com.googlecode.dex2jar.tools.Dex2jarCmd.main(Dex2jarCmd.java:16)

@ThexXTURBOXx
Owner

I have taken a closer look at this issue, and string constants are in fact limited to an encoded size of 65535 bytes: https://gitlab.ow2.org/asm/asm/-/blob/master/asm/src/main/java/org/objectweb/asm/ByteVector.java?ref_type=heads#L254
This limit also comes from the Java class file format, which stores the length of a UTF-8 constant pool entry in two bytes. I have pushed a workaround that still gives proper output for the rest of the files, but skips all files that have problems.
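
To make the limit concrete, here is a minimal sketch (not code from dex2jar or ASM) that measures how many bytes a string needs in the class file's modified UTF-8 encoding. The class and method names are made up for illustration. Because characters above U+07FF take three bytes each, a string of Chinese text can exceed the limit after only 21845 characters, which may be why the APK above triggers the error.

```java
/** Illustrative helper (hypothetical name): size of a string as a class-file UTF-8 constant. */
final class Utf8ConstantSize {
    /** Largest value that fits in the two-byte length field of a UTF-8 constant. */
    static final int MAX_ENCODED_BYTES = 65535;

    /** Bytes needed for one UTF-16 code unit in modified UTF-8. */
    static int charLength(char c) {
        if (c >= 0x0001 && c <= 0x007F) {
            return 1;
        }
        if (c <= 0x07FF) {
            return 2; // also covers '\u0000', which modified UTF-8 stores in two bytes
        }
        return 3; // everything else, including each half of a surrogate pair
    }

    /** Total encoded size of the string in bytes. */
    static long encodedLength(String s) {
        long bytes = 0;
        for (int i = 0; i < s.length(); i++) {
            bytes += charLength(s.charAt(i));
        }
        return bytes;
    }

    /** True if the string can be stored as a single constant. */
    static boolean fitsInConstantPool(String s) {
        return encodedLength(s) <= MAX_ENCODED_BYTES;
    }

    public static void main(String[] args) {
        // 65535 ASCII characters still fit; 21846 three-byte characters do not.
        System.out.println(fitsInConstantPool("a".repeat(65535)));
        System.out.println(fitsInConstantPool("\u4e2d".repeat(21846)));
    }
}
```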

@Rabbit0w0
Please update the link. It's no longer working.
PS: Anyone with a sample is welcome to send it to my email.

@ThexXTURBOXx
Owner

@Rabbit0w0 Still works on my end: https://apkpure.com/s-miles-installer/com.hm.hemaiInstall1/downloading/V1.1.10

@stefan123t
stefan123t commented Oct 20, 2024

@ThexXTURBOXx @Rabbit0w0 could we maybe switch from String to Stream to get around the length limitation? I have no idea where in the source this would go; it is just my thought on how to handle such lengthy in-memory strings in Java.

@ThexXTURBOXx
Owner

The core of the problem is that the final destination within the Java bytecode is affected by the length limitation. A real fix would need to somehow get around this.
MAYBE, splitting a large string into multiple strings, each with length <= 65535, works. However, this might introduce even more errors regarding the limit of data within a single Java bytecode class file.
The same is true for converting the strings to byte arrays first and converting these to strings at runtime.
Something that definitely works is splitting strings across multiple classes and concatenating them at runtime, but this would require a major rewrite of the whole dex2jar system, which I do not have the time for, sadly.
Maybe @Rabbit0w0 comes up with some better solution, though!
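
The splitting idea mentioned above could look roughly like the following sketch. It is only an illustration under assumptions: `ConstantSplitter` and its methods are hypothetical names, nothing here is taken from dex2jar, and it does not address the concern about the overall size limits of a single class file. It greedily cuts a string into pieces whose encoded size stays within a byte budget.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative splitter (hypothetical name), not part of dex2jar. */
final class ConstantSplitter {
    /** Bytes needed for one UTF-16 code unit in the class file's modified UTF-8. */
    static int charLength(char c) {
        if (c >= 0x0001 && c <= 0x007F) {
            return 1;
        }
        return c <= 0x07FF ? 2 : 3;
    }

    /** Splits s into consecutive pieces, each at most maxBytes when encoded. */
    static List<String> split(String s, int maxBytes) {
        if (maxBytes < 3) {
            throw new IllegalArgumentException("maxBytes must be at least 3");
        }
        List<String> chunks = new ArrayList<>();
        int start = 0; // index where the current piece begins
        int used = 0;  // encoded bytes already in the current piece
        for (int i = 0; i < s.length(); i++) {
            int size = charLength(s.charAt(i));
            if (used + size > maxBytes) {
                // The next character would overflow the budget: close the piece here.
                chunks.add(s.substring(start, i));
                start = i;
                used = 0;
            }
            used += size;
        }
        chunks.add(s.substring(start)); // last piece (the only one for short input)
        return chunks;
    }

    public static void main(String[] args) {
        List<String> parts = split("a".repeat(70000), 65535);
        System.out.println(parts.size());
    }
}
```

A cut may land between the two halves of a surrogate pair. That should be harmless here, since modified UTF-8 encodes each half on its own and the pieces are joined again unchanged at runtime.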

@Rabbit0w0
Rabbit0w0 commented Oct 20, 2024

I see. There is a long constant string which ASM cannot write to standard Java bytecode.
There is no need to rewrite the whole project. We just need to add a preprocessor which splits the long string and then concatenates the pieces using StringBuilder (the way Java 8 and lower compile string concatenation).
I will try to write a workaround for this.
@ThexXTURBOXx Could you please add a submodule/package for preprocessors? I am afraid that my coding style does not fit this project.
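
At the source level, the effect of such a preprocessor might be sketched as below. This is an assumption about the approach, not the code that was eventually merged; `LongStringRebuild` and its methods are hypothetical names. It uses the simplest safe cut: 21845 characters per piece, since even at three bytes per character that stays within 65535 bytes.

```java
/** Illustrative only (hypothetical name): split a long constant, rebuild it at runtime. */
final class LongStringRebuild {
    // 21845 chars * 3 bytes (worst case per char) = 65535 bytes, so every piece fits.
    static final int SAFE_CHUNK_CHARS = 65535 / 3;

    /** Cuts s into pieces of at most SAFE_CHUNK_CHARS characters. */
    static String[] chunk(String s) {
        int count = Math.max(1, (s.length() + SAFE_CHUNK_CHARS - 1) / SAFE_CHUNK_CHARS);
        String[] chunks = new String[count];
        for (int i = 0; i < count; i++) {
            int from = i * SAFE_CHUNK_CHARS;
            int to = Math.min(s.length(), from + SAFE_CHUNK_CHARS);
            chunks[i] = s.substring(from, to);
        }
        return chunks;
    }

    /**
     * Source-level equivalent of what the rewritten method would do in place of
     * loading one oversized constant: append each piece to a StringBuilder.
     */
    static String rebuild(String... chunks) {
        StringBuilder sb = new StringBuilder();
        for (String piece : chunks) {
            sb.append(piece);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String big = "\u4e2d".repeat(50000); // 150000 bytes when encoded
        String[] parts = chunk(big);
        System.out.println(parts.length);
        System.out.println(rebuild(parts).equals(big));
    }
}
```

The fixed character count wastes some space for mostly-ASCII strings but avoids having to measure the encoded size of each piece.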

@ThexXTURBOXx
Owner

I am a bit hesitant to add more submodules. In my opinion, it is fine if you just write the code, create a PR and I fix the style afterwards :)

@Rabbit0w0
Rabbit0w0 commented Oct 20, 2024 via email

@stefan123t
You are both awesome!
