
[Marshal Methods] Consider reducing code duplication in the generated marshal methods LLVM IR #10583

@simonrozsival

Android framework version

net10.0-android (Preview)

Affected platform version

.NET 10 GA

Description

The generated marshal methods code contains a lot of duplication. For each Java method, we need to generate one matching native method in LLVM IR. All of these methods have a very similar shape, so it looks like much of the repeated code could be factored out fairly easily.

Note: This is a low-priority issue. I have only a very rough idea of what the size savings for a non-trivial MAUI app might be, and they likely won't be groundbreaking: a first-party app with a non-trivial codebase has a marshal_methods.arm64-v8.o of 500 kB, which compresses down to 100 kB. I want to see whether Copilot can handle this task on its own and whether the changes stay small and low-risk. There's also a chance that the LLVM compiler will inline all the calls anyway, leaving no size difference in the end.


Copilot analysis of marshal_methods.x86_64.ll for a medium-sized MAUI app:

Looking at this LLVM IR code, I can see significant duplication in the pattern used across these SSL session methods. Let me analyze the structure and sketch out a more optimized version.

Current Pattern Analysis

Each method follows this identical pattern:

  1. Load a function pointer from a global variable
  2. Check if it's null
  3. If null, call get_function_pointer to initialize it
  4. Call the loaded function pointer with the provided arguments
  5. Return the result

The only differences between methods are:

  • The global variable name (e.g., @native_cb_getLocalCertificates_0_1_6000382)
  • The three integer constants passed to get_function_pointer
  • The return type and function signature
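
The five steps above can be modeled in C to make the repeated per-method body easier to see. This is only a sketch under stated assumptions: the typedefs jni_env, jni_class, generic_fn and resolver_fn, the fake_resolver function, and the resolver_calls counter are hypothetical stand-ins introduced for illustration, and the resolver signature is inferred from the call shape in the IR, not taken from the runtime headers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the JNI types; real code would use <jni.h>. */
typedef void *jni_env;
typedef void *jni_class;

/* Generic function pointer slot, cast to the real signature before calling. */
typedef void (*generic_fn)(void);

/* Assumed resolver shape: three integer constants identify the managed
   method, and the resolved pointer is written into *slot. */
typedef void (*resolver_fn)(uint32_t a, uint32_t b, uint32_t c, generic_fn *slot);

static int resolver_calls = 0;   /* instrumentation for this sketch only */

/* Placeholder for the managed method the runtime would hand back. */
static void *managed_get_local_certificates(jni_env env, jni_class klass)
{
    (void)env;
    return klass;   /* echo an argument so the call is observable */
}

/* Fake resolver standing in for the runtime's get_function_pointer. */
static void fake_resolver(uint32_t a, uint32_t b, uint32_t c, generic_fn *slot)
{
    (void)a; (void)b; (void)c;
    resolver_calls++;
    *slot = (generic_fn)managed_get_local_certificates;
}

static resolver_fn get_function_pointer = fake_resolver;

/* One cache slot per Java method, initially null. */
static generic_fn native_cb_getLocalCertificates = NULL;

/* The body that is currently repeated for every marshal method. */
void *n_getLocalCertificates(jni_env env, jni_class klass)
{
    generic_fn cb = native_cb_getLocalCertificates;          /* 1. load  */
    if (cb == NULL) {                                         /* 2. check */
        get_function_pointer(212, 1, 100664194,               /* 3. init  */
                             &native_cb_getLocalCertificates);
        cb = native_cb_getLocalCertificates;
    }
    return ((void *(*)(jni_env, jni_class))cb)(env, klass);  /* 4-5. call and return */
}
```

Calling n_getLocalCertificates twice invokes the resolver only once; every other method would repeat the same body with a different slot, different constants, and possibly a different signature.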

Optimization Strategy

Yes, we can dramatically reduce code size by creating shared trampolines grouped by signature. Here's a revised version:

; Shared trampoline for signature: ptr(ptr env, ptr klass)
define internal ptr @shared_trampoline_ptr_ptr_ptr(ptr noundef %env, ptr noundef %klass, ptr noundef %callback_global, i32 noundef %param1, i32 noundef %param2, i32 noundef %param3) local_unnamed_addr #3
{
	%cb1 = load ptr, ptr %callback_global, align 8, !tbaa !3
	%isNull = icmp eq ptr %cb1, null
	br i1 %isNull, label %loadCallback, label %callbackLoaded

loadCallback:
	%get_func_ptr = load ptr, ptr @get_function_pointer, align 8, !tbaa !3
	call void %get_func_ptr(i32 noundef %param1, i32 noundef %param2, i32 noundef %param3, ptr nonnull noundef align(8) dereferenceable(8) %callback_global)
	%cb2 = load ptr, ptr %callback_global, align 8, !tbaa !3
	br label %callbackLoaded

callbackLoaded:
	%fn = phi ptr [%cb2, %loadCallback], [%cb1, %0]
	%result = tail call noundef ptr %fn(ptr noundef %env, ptr noundef %klass)
	ret ptr %result
}

; Now each method becomes a simple wrapper:
define ptr @Java_xamarin_android_net_ServerCertificateCustomValidator_1TrustManager_1FakeSSLSession_n_1getLocalCertificates(ptr noundef %env, ptr noundef %klass) local_unnamed_addr #3
{
	%result = tail call ptr @shared_trampoline_ptr_ptr_ptr(
		ptr %env, 
		ptr %klass, 
		ptr @native_cb_getLocalCertificates_0_1_6000382, 
		i32 212, 
		i32 1, 
		i32 100664194)
	ret ptr %result
}

define ptr @Java_xamarin_android_net_ServerCertificateCustomValidator_1TrustManager_1FakeSSLSession_n_1getPeerCertificateChain(ptr noundef %env, ptr noundef %klass) local_unnamed_addr #3
{
	%result = tail call ptr @shared_trampoline_ptr_ptr_ptr(
		ptr %env, 
		ptr %klass, 
		ptr @native_cb_getPeerCertificateChain_0_1_6000383, 
		i32 212, 
		i32 1, 
		i32 100664195)
	ret ptr %result
}

define ptr @Java_xamarin_android_net_ServerCertificateCustomValidator_1TrustManager_1FakeSSLSession_n_1getPeerCertificates(ptr noundef %env, ptr noundef %klass) local_unnamed_addr #3
{
	%result = tail call ptr @shared_trampoline_ptr_ptr_ptr(
		ptr %env, 
		ptr %klass, 
		ptr @native_cb_getPeerCertificates_0_1_6000384, 
		i32 212, 
		i32 1, 
		i32 100664196)
	ret ptr %result
}

; Shared trampoline for signature: i32(ptr env, ptr klass)
define internal i32 @shared_trampoline_i32_ptr_ptr(ptr noundef %env, ptr noundef %klass, ptr noundef %callback_global, i32 noundef %param1, i32 noundef %param2, i32 noundef %param3) local_unnamed_addr #3
{
	%cb1 = load ptr, ptr %callback_global, align 8, !tbaa !3
	%isNull = icmp eq ptr %cb1, null
	br i1 %isNull, label %loadCallback, label %callbackLoaded

loadCallback:
	%get_func_ptr = load ptr, ptr @get_function_pointer, align 8, !tbaa !3
	call void %get_func_ptr(i32 noundef %param1, i32 noundef %param2, i32 noundef %param3, ptr nonnull noundef align(8) dereferenceable(8) %callback_global)
	%cb2 = load ptr, ptr %callback_global, align 8, !tbaa !3
	br label %callbackLoaded

callbackLoaded:
	%fn = phi ptr [%cb2, %loadCallback], [%cb1, %0]
	%result = tail call noundef i32 %fn(ptr noundef %env, ptr noundef %klass)
	ret i32 %result
}

; Shared trampoline for signature: i64(ptr env, ptr klass)
define internal i64 @shared_trampoline_i64_ptr_ptr(ptr noundef %env, ptr noundef %klass, ptr noundef %callback_global, i32 noundef %param1, i32 noundef %param2, i32 noundef %param3) local_unnamed_addr #3
{
	%cb1 = load ptr, ptr %callback_global, align 8, !tbaa !3
	%isNull = icmp eq ptr %cb1, null
	br i1 %isNull, label %loadCallback, label %callbackLoaded

loadCallback:
	%get_func_ptr = load ptr, ptr @get_function_pointer, align 8, !tbaa !3
	call void %get_func_ptr(i32 noundef %param1, i32 noundef %param2, i32 noundef %param3, ptr nonnull noundef align(8) dereferenceable(8) %callback_global)
	%cb2 = load ptr, ptr %callback_global, align 8, !tbaa !3
	br label %callbackLoaded

callbackLoaded:
	%fn = phi ptr [%cb2, %loadCallback], [%cb1, %0]
	%result = tail call noundef i64 %fn(ptr noundef %env, ptr noundef %klass)
	ret i64 %result
}

; Shared trampoline for signature: i1(ptr env, ptr klass)
define internal i1 @shared_trampoline_i1_ptr_ptr(ptr noundef %env, ptr noundef %klass, ptr noundef %callback_global, i32 noundef %param1, i32 noundef %param2, i32 noundef %param3) local_unnamed_addr #3
{
	%cb1 = load ptr, ptr %callback_global, align 8, !tbaa !3
	%isNull = icmp eq ptr %cb1, null
	br i1 %isNull, label %loadCallback, label %callbackLoaded

loadCallback:
	%get_func_ptr = load ptr, ptr @get_function_pointer, align 8, !tbaa !3
	call void %get_func_ptr(i32 noundef %param1, i32 noundef %param2, i32 noundef %param3, ptr nonnull noundef align(8) dereferenceable(8) %callback_global)
	%cb2 = load ptr, ptr %callback_global, align 8, !tbaa !3
	br label %callbackLoaded

callbackLoaded:
	%fn = phi ptr [%cb2, %loadCallback], [%cb1, %0]
	%result = tail call noundef i1 %fn(ptr noundef %env, ptr noundef %klass)
	ret i1 %result
}

; Shared trampoline for signature: void(ptr env, ptr klass)
define internal void @shared_trampoline_void_ptr_ptr(ptr noundef %env, ptr noundef %klass, ptr noundef %callback_global, i32 noundef %param1, i32 noundef %param2, i32 noundef %param3) local_unnamed_addr #3
{
	%cb1 = load ptr, ptr %callback_global, align 8, !tbaa !3
	%isNull = icmp eq ptr %cb1, null
	br i1 %isNull, label %loadCallback, label %callbackLoaded

loadCallback:
	%get_func_ptr = load ptr, ptr @get_function_pointer, align 8, !tbaa !3
	call void %get_func_ptr(i32 noundef %param1, i32 noundef %param2, i32 noundef %param3, ptr nonnull noundef align(8) dereferenceable(8) %callback_global)
	%cb2 = load ptr, ptr %callback_global, align 8, !tbaa !3
	br label %callbackLoaded

callbackLoaded:
	%fn = phi ptr [%cb2, %loadCallback], [%cb1, %0]
	tail call void %fn(ptr noundef %env, ptr noundef %klass)
	ret void
}
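
For readers less familiar with LLVM IR, the shared-trampoline idea for the ptr(ptr env, ptr klass) signature might look roughly like this in C. It is a sketch, not the generator's output: the typedefs, fake_resolver, managed_a, managed_b and the tag variables are hypothetical, and `__attribute__((noinline))` is a GCC/Clang extension used here only to illustrate one way of keeping the optimizer from folding the trampoline back into each wrapper.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the JNI types; real code would use <jni.h>. */
typedef void *jni_env;
typedef void *jni_class;
typedef void (*generic_fn)(void);
typedef void (*resolver_fn)(uint32_t, uint32_t, uint32_t, generic_fn *);
typedef void *(*ptr_ptr_ptr_fn)(jni_env, jni_class);

static int resolver_calls = 0;   /* instrumentation for this sketch only */
static int tag_a, tag_b;         /* distinct addresses to tell callbacks apart */

static void *managed_a(jni_env e, jni_class k) { (void)e; (void)k; return &tag_a; }
static void *managed_b(jni_env e, jni_class k) { (void)e; (void)k; return &tag_b; }

/* Fake resolver: picks a callback based on the third constant. */
static void fake_resolver(uint32_t a, uint32_t b, uint32_t token, generic_fn *slot)
{
    (void)a; (void)b;
    resolver_calls++;
    *slot = (token == 100664194u) ? (generic_fn)managed_a : (generic_fn)managed_b;
}

static resolver_fn get_function_pointer = fake_resolver;

/* Shared by every method with this signature: the lazy-init logic lives here once. */
__attribute__((noinline))
static void *shared_trampoline_ptr_ptr_ptr(jni_env env, jni_class klass,
                                           generic_fn *slot,
                                           uint32_t p1, uint32_t p2, uint32_t p3)
{
    generic_fn cb = *slot;
    if (cb == NULL) {
        get_function_pointer(p1, p2, p3, slot);
        cb = *slot;
    }
    return ((ptr_ptr_ptr_fn)cb)(env, klass);
}

static generic_fn native_cb_getLocalCertificates = NULL;
static generic_fn native_cb_getPeerCertificateChain = NULL;

/* Each exported method shrinks to a call that supplies its slot and constants. */
void *n_getLocalCertificates(jni_env env, jni_class klass)
{
    return shared_trampoline_ptr_ptr_ptr(env, klass,
        &native_cb_getLocalCertificates, 212, 1, 100664194);
}

void *n_getPeerCertificateChain(jni_env env, jni_class klass)
{
    return shared_trampoline_ptr_ptr_ptr(env, klass,
        &native_cb_getPeerCertificateChain, 212, 1, 100664195);
}
```

Each wrapper still has to materialize four extra arguments before the call, so the per-method saving is smaller than the trampoline body alone suggests, and it disappears if the compiler inlines the trampoline.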

Size Reduction Estimate

Current: ~40 instructions × N methods = massive duplication
Optimized: ~5 instructions per method wrapper + shared trampolines

For this file with hundreds of similar methods, you could reduce code size by 60-80% for the marshaling layer. Each method becomes just a tail call with constants.
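
The estimate can be turned into a small back-of-the-envelope calculation. The helper below is hypothetical and only counts instructions: the ~40 and ~5 figures come from the estimate above, while the per-trampoline cost and the method and signature counts are assumptions the caller must supply. Instruction counts are not the same as bytes in the object file, so treat the result as a rough bound.

```c
/* Estimated fractional reduction in instruction count when `methods`
   marshal methods are spread over `signatures` shared trampolines.
   Returns 0.0 when there is nothing to measure. */
double estimated_reduction(unsigned methods, unsigned signatures,
                           unsigned per_method, unsigned per_wrapper,
                           unsigned per_trampoline)
{
    double current = (double)per_method * methods;
    if (current == 0.0) {
        return 0.0;
    }
    double optimized = (double)per_wrapper * methods
                     + (double)per_trampoline * signatures;
    return 1.0 - optimized / current;
}
```

For example, estimated_reduction(100, 20, 40, 5, 40) compares 4000 instructions against 1300 and yields 0.675, whereas a single method with its own trampoline, estimated_reduction(1, 1, 40, 5, 40), comes out negative. The saving therefore depends on how many methods share each signature.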

Implementation Note

This optimization would need to be done at the code generator level (in the .NET Android binding generator that produces this LLVM IR), not by hand-editing the IR. The LLVM optimizer might do some of this already with inlining heuristics, but making it explicit would guarantee the savings.

Steps to Reproduce

  1. Create a new MAUI app: dotnet new maui
  2. Build with Marshal Methods enabled (the default configuration in Release mode): dotnet build -c Release
  3. Inspect the generated code in marshal_methods.arm64-v8.ll and the corresponding object file

Did you find any workaround?

No response

Relevant log output

Metadata

Labels

Area: Marshal Methods (Issues that only occur when enabling Marshal Methods.)

Projects

No projects

Relationships

None yet

Development

No branches or pull requests
