Improve unstrip support #108

Open
wants to merge 18 commits into
base: master

@burnedram burnedram commented Sep 20, 2023

What

  • Fixes some instances of "Method unstripping failed"
  • Corrects other instances where no translation, or a wrong one, was made
  • Fails on yet another set of instances where a translation is needed but none currently exists
    (I.e. a few methods now fail to unstrip, but they crashed the CLR anyway due to faulty IL)
  • Improves debuggability of the entire unstrip pass
  • Saves a log of what went wrong during unstripping of each method
  • Fixes a few minor bugs that I came across

Why

I'm creating an Il2Cpp plugin for Dave the Diver and wanted to use IMGUI, and came across a few "Method unstripping failed" errors.
So I set out to find out why the methods I needed didn't work, and this is the result.

Translation logging

"Method unstripping failed" is a very opaque error.
This PR adds logging (to disk only) to the translation process describing why a method couldn't be translated:

  • Names of namespaces, classes, methods and scopes for easier filtering
  • A general error category
  • The instruction that couldn't be translated
  • A message describing what went wrong, if applicable

The log is saved as a .json.gz in the output's parent folder.
(I.e. it is saved in the same folder that BepInEx saves LogOutput.log)

I chose JSON because it is very easy to use tools such as jq to query the rather large number of methods that can't be translated currently.
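For anyone who would rather stay in C# than use jq, here is a minimal sketch of how the per-category summary could be computed. It assumes the decompressed file is a single JSON array of objects shaped like the excerpt in this description; the actual file layout may differ. `UnstripLogSummary`, `CountByResult` and `CountByResultInFile` are hypothetical names for illustration, not part of Il2CppInterop.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Text.Json;

// Hypothetical helper, not part of Il2CppInterop.
public static class UnstripLogSummary
{
    // Counts entries per "result" category.
    // Assumption: the stream contains one JSON array of log objects.
    public static SortedDictionary<string, int> CountByResult(Stream json)
    {
        var options = new JsonDocumentOptions
        {
            // Tolerate annotation comments in hand-edited excerpts.
            CommentHandling = JsonCommentHandling.Skip,
            AllowTrailingCommas = true
        };
        using var doc = JsonDocument.Parse(json, options);

        var counts = new SortedDictionary<string, int>(StringComparer.Ordinal);
        foreach (var entry in doc.RootElement.EnumerateArray())
        {
            // Entries without a string "result" field are ignored.
            if (entry.TryGetProperty("result", out var result)
                && result.ValueKind == JsonValueKind.String)
            {
                var key = result.GetString() ?? "";
                counts[key] = counts.TryGetValue(key, out var n) ? n + 1 : 1;
            }
        }
        return counts;
    }

    // Convenience wrapper that decompresses a .json.gz file first.
    public static SortedDictionary<string, int> CountByResultInFile(string path)
    {
        using var file = File.OpenRead(path);
        using var gzip = new GZipStream(file, CompressionMode.Decompress);
        return CountByResult(gzip);
    }
}
```

Calling `UnstripLogSummary.CountByResultInFile("unstrip.json.gz")` and printing each key/value pair would give a listing in the same shape as the category summaries in this description, provided the layout assumption holds.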

Examples

Excerpt

{
  "name": "UnityEngine.Experimental.Audio.AudioSampleProvider::get_sampleRate",
  "fullName": "System.UInt32 UnityEngine.Experimental.Audio.AudioSampleProvider::get_sampleRate()",
  "scope": "UnityEngine.AudioModule.dll",
  "namespace": "UnityEngine.Experimental.Audio",
  "type": "AudioSampleProvider",
  "method": "get_sampleRate",
  "instruction": { // The instruction that couldn't be translated
    "description": "IL_0001: ldfld System.UInt32 UnityEngine.Experimental.Audio.AudioSampleProvider::<sampleRate>k__BackingField",
    "opCode": "ldfld",
    "operandType": "InlineField",
    "operandValueType": "FieldDefinition",
    "operand": "System.UInt32 UnityEngine.Experimental.Audio.AudioSampleProvider::<sampleRate>k__BackingField"
  },
  "result": "FieldProxy", // A general error category
  // A message describing what went wrong
  "reason": "Could not find getter for proxy property System.UInt32 UnityEngine.Experimental.Audio.AudioSampleProvider::<sampleRate>k__BackingField"
}

Logs from my development

The entire log file before I made any fixes/changes: unstrip_original.json.gz
Summary of error categories:

  • "FieldProxy": 2215
  • "Unimplemented": 198
  • "Unresolved": 638

The entire log file with all commits in this PR: unstrip.json.gz
Summary of error categories:

  • "FieldProxy": 2342
  • "Unimplemented": 50
  • "Unresolved": 525
  • "NonBlittableStruct": 4
  • "Stack": 18

These logs are of course very specific to the game and Unity version I've been using, but they document the progress that I've made.

Results

Il2CppInterop now correctly translates the IMGUI methods I need in Dave The Diver v1.0.0.1055.steam!
That was the metric I went by, at least, but there's a whole slew of methods that now work correctly, including:

  • Any method call where a string was used for a System.Object parameter.
    (E.g. any method calling Debug.Log and associates)
    System.Object is (and was) translated into Il2CppSystem.Object, but string is obviously not such an object.
    An Il2CppSystem.String::op_Implicit(string) call is injected after the instruction that pushes the string onto the stack.
  • Any method using array initializers.
    Very common, as string concatenation using + is compiled into this.
    E.g. "test" + 1 + "array" becomes:
    string.Concat(new string[] {[0] = "test", [1] = 1.ToString(), [2] = "array"})
    The stelem instruction family only works on actual arrays (e.g. string[]), but all arrays are translated into one of
    • Il2CppReferenceArray<T>
    • Il2CppStructArray<T>
    • Il2CppStringArray
      Thus stelem should be translated into callvirt instance Il2CppArrayBase<T>::set_Item(int, T)
  • Any method using string concatenation.
    Even if we correctly construct an Il2CppStringArray and correctly fill it with items,
    string.Concat(string[]) still takes a string[].
    Fixed by translating method calls on primitive and string types to redirect them to the Il2Cpp equivalent, e.g:
    Il2CppSystem.String.Concat(Il2CppStringArray)

Additionally, I've added some box/unbox support, ld(s)flda support, and branch retargeting tracking.
The branch retargeting mostly worked before, but largely by coincidence.
This PR allows a branch target to be a completely different instruction (which is very common, since the process is a translation), and it allows us to emit/insert more than one instruction in the new method while keeping the branches intact.
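The retargeting idea can be sketched on a toy instruction model. This is not the code from this PR: `SketchInstruction` and `BranchRetargetSketch` are hypothetical names, and expanding every `ldstr` into two instructions is a deliberate simplification used only to force a one-to-many translation. The point is the bookkeeping: remember the first new instruction emitted for each original one, then fix up branch operands in a second pass so forward branches resolve too.

```csharp
using System.Collections.Generic;

// Hypothetical toy instruction; a branch stores its target instruction as the operand.
public sealed class SketchInstruction
{
    public SketchInstruction(string opCode, object operand = null)
    {
        OpCode = opCode;
        Operand = operand;
    }

    public string OpCode { get; }
    public object Operand { get; set; }
}

public static class BranchRetargetSketch
{
    public static List<SketchInstruction> Translate(IReadOnlyList<SketchInstruction> original)
    {
        var translated = new List<SketchInstruction>();
        // Original instruction -> first instruction emitted for it (reference equality).
        var firstNewFor = new Dictionary<SketchInstruction, SketchInstruction>();

        foreach (var old in original)
        {
            int start = translated.Count;
            translated.Add(new SketchInstruction(old.OpCode, old.Operand));
            if (old.OpCode == "ldstr")
            {
                // Simplification: one original instruction becomes two new ones.
                translated.Add(new SketchInstruction(
                    "call", "Il2CppSystem.String::op_Implicit(string)"));
            }
            firstNewFor[old] = translated[start];
        }

        // Second pass: operands that still point at original instructions
        // are redirected to their translated counterparts.
        foreach (var ins in translated)
        {
            if (ins.Operand is SketchInstruction oldTarget)
                ins.Operand = firstNewFor[oldTarget];
        }
        return translated;
    }
}
```

Doing the fix-up in a separate pass is what lets a branch point forward to an instruction that has not been translated yet when the branch itself is emitted.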

I more than welcome any comments, as I have made many assumptions about how the library is supposed to work!

Stelem -> Il2CppArrayBase::set_Item
Ldelem -> Il2CppArrayBase::get_Item
Ldlen -> Il2CppArrayBase::get_Length
Added due to the use of long branches in retargeter
E.g. string::Join becomes Il2CppSystem.String::Join
This is correct since we translate string[] to Il2CppStringArray, etc
ByReferenceType::ElementData is what you want when IsByReference is true.

Example: ref int[] is represented as ByReferenceType(ArrayType(typeof(int)))
Using TypeReference::GetElementType() on the outer ByReferenceType
returns typeof(int), not ArrayType(typeof(int)).

ElementData always points to what is ByReference.
Until ResolveTypeInNewAssemblies supports generics, short-circuit
generic variables and generic reference parameters
I blame Visual Studio's AI completion feature

akarnokd commented Dec 22, 2023

Hi. Sorry for pinging.

I found this PR based on the exception message I get for Canvas.renderMode in a Unity 2023.1.9f1 IL2CPP translated game.

Since this PR isn't merged and isn't released with the BepInEx nightly, how can I try these changes to see if it fixes my problem?

Edit

Okay, figured it out. Check out your fork, run build.bat, open BepInEx\core and replace the 4 Il2CppInterop DLLs of the nightly build with the newly built ones. Reset the BepInEx install, run the game. The method above no longer throws by itself, but now I get a different runtime exception:

set_renderMode_Injected(MarshalledUnityObject.MarshalNullCheck((_0021_00210)this), value);
System.MissingMethodException: Method not found: 'IntPtr MarshalledUnityObject.MarshalNullCheck(!!0)'.
   at UnityEngine.Canvas.set_renderMode(RenderMode value)

Edit 2

Managed to get it working. I have to manually call that inject method with a properly cast target object:

var panel = new GameObject("MyOverlay");
var canvas = panel.AddComponent<Canvas>();
canvas.sortingOrder = 1000;
Canvas.set_renderMode_Injected(
    MarshalledUnityObject.MarshalNullCheck(canvas), 
    RenderMode.ScreenSpaceOverlay
);
