Add support for compressed assemblies in APK #4686

Merged
merged 2 commits into from
May 26, 2020

grendello
Contributor

@grendello grendello commented May 13, 2020

Currently, Xamarin.Android supports managed assembly compression in
the APK archive if the application is bundled (with Mono's mkbundle) into
a native shared library. Managed assemblies are compressed using gzip
and placed in an array inside the data section of the shared
library. However, support for mkbundle may be removed, and we realized
it is a feature some developers appreciate, since the produced APKs are
smaller and the impact on startup time isn't big enough to worry
about.

This commit aims to be a replacement for mkbundle with a handful of
improvements thrown in. First of all, the compression is performed
using the managed implementation of the excellent LZ4
algorithm. This gives us a decent compression ratio and a much faster
(de)compression speed than gzip/zlib offer. Also, assemblies are stored
directly in the APK in their usual directory, which allows us to mmap
them at run time directly from the APK. The build process calculates
the size required to store the decompressed assemblies and adds a data
section to libxamarin-app.so, which makes Android allocate all the
required memory when the DSO is loaded, thus removing the need for
dynamic memory allocation and making startup faster.

Compression is supported only in Release builds and is enabled by
default, but it can be turned off by setting the
$(AndroidEnableAssemblyCompression) MSBuild property to False.
Compression can also be turned off for an individual assembly by
adding the AndroidSkipCompression metadata item to the assembly in
question, using code similar to this in the application's project file:

<AndroidCustomMetaDataForReferences Include="MyAssembly.dll">
   <AndroidSkipCompression>true</AndroidSkipCompression>
</AndroidCustomMetaDataForReferences>

The compressed assemblies still use their original name (e.g.
Mono.Android.dll) so that we don't have to perform any string matching
at run time in order to detect whether the assembly we are asked to
load is compressed or not. Instead, the compression code prepends a
short header to each .dll file (in pseudo-code):

uint32 magic = 0x5A4C4158; // 'XALZ', little-endian
uint32 index; // Index into an internal assembly descriptor table
uint32 uncompressed_length;

The decompression code looks at the mmapped data and checks whether the
above header is present. If yes, the assembly is decompressed,
otherwise it's loaded as-is.
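
The header check described above can be sketched in Python as follows. This is only an illustration, not the runtime's actual code: the names `XALZ_MAGIC`, `HEADER` and `parse_header` are hypothetical, the payload bytes are placeholders, and the sketch assumes the three fields are packed back to back as little-endian 32-bit integers. Decompression itself is left out.

```python
import struct

XALZ_MAGIC = 0x5A4C4158          # b"XALZ" read as a little-endian uint32
HEADER = struct.Struct("<III")   # magic, descriptor index, uncompressed length


def parse_header(data: bytes):
    """Return (index, uncompressed_length, payload) if data starts with the
    XALZ header, or None if the assembly should be loaded as-is."""
    if len(data) < HEADER.size:
        return None
    magic, index, uncompressed_length = HEADER.unpack_from(data, 0)
    if magic != XALZ_MAGIC:
        return None
    return index, uncompressed_length, data[HEADER.size:]


# A made-up compressed entry: header followed by placeholder payload bytes.
compressed = HEADER.pack(XALZ_MAGIC, 3, 4096) + b"\x00" * 16
# A regular PE file starts with "MZ", so it does not match the magic.
plain = b"MZ\x90\x00" + b"\x00" * 60

print(parse_header(compressed)[:2])  # (3, 4096)
print(parse_header(plain))           # None
```

Because the check only compares the first four bytes, no file name matching is needed to tell compressed and uncompressed assemblies apart.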

It is important to remember that the assemblies are compressed at
build time using LZ4 block compression, which requires the assembly
data to be loaded entirely into memory before compression (we do this
instead of using the LZ4 frame format to make decompression at run time
faster). The compression output also requires a separate buffer,
so memory consumption will be roughly 1.5x the assembly size.
However, since we use a byte buffer pool, memory consumption will not be
the sum of all the assemblies but rather is determined by the size of
the biggest one in the set.

~ Application Size ~

A Xamarin.Forms "Hello World" application APK shrank by 27% with this
commit:

| Before (bytes) | After (bytes) | Δ       |
|----------------|---------------|---------|
| 23305194       | 16813034      | -27.85% |

Size comparison between this commit and APKs created with
$(BundleAssemblies) == True depends on the number of enabled ABI
targets in the application. For each ABI, $(BundleAssemblies) == True
creates a separate shared library, so the amount of space consumed
increases by the size of the bundle shared library. The new compression
scheme shares the compressed assemblies among all the enabled ABIs, thus
effectively creating smaller multi-ABI APKs.

In the tables below, Before refers to the APK created with
$(BundleAssemblies) == True, After refers to the APK built with the
new compression scheme.

All ABIs enabled:

| Before (bytes) | After (bytes) | Δ       |
|----------------|---------------|---------|
| 27130240       | 16813034      | -38.03% |

Single ABI enabled:

| Before (bytes) | After (bytes) | Δ       |
|----------------|---------------|---------|
| 7783449        | 8746878       | +11.01% |
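
The Δ values in the size tables can be cross-checked from the byte counts. The sketch below assumes Δ means "change relative to the Before column"; the helper name `relative_change` is hypothetical. Under that assumption the first two rows agree with the tables to within rounding, while the single-ABI row comes out near +12.4%; the quoted 11.01% is what you get when dividing by the larger (After) size instead.

```python
def relative_change(before: int, after: int) -> float:
    """Percentage change from before to after, relative to before."""
    return (after - before) / before * 100.0


# Byte counts taken from the tables above.
sizes = {
    "uncompressed vs LZ4": (23305194, 16813034),
    "mkbundle vs LZ4, all ABIs": (27130240, 16813034),
    "mkbundle vs LZ4, single ABI": (7783449, 8746878),
}

for label, (before, after) in sizes.items():
    print(f"{label}: {relative_change(before, after):+.2f}%")
```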

~ Startup Performance ~

Startup time of the same application isn't affected too much by
decompression (comparison between uncompressed application and one
compressed using the new scheme):

~ Before ~

App configuration: Release

Xamarin.Android

  • Version: 10.4.100-12
  • Branch: master
  • Commit: 3f438e4

~ After ~

App configuration: Release

Xamarin.Android

  • Version: 10.4.100-18
  • Branch: compress-assemblies
  • Commit: cec90e9

Device

  • Model: Pixel 3 XL
  • Native architecture: arm64-v8a
  • SDK version: 29

~ Application Displayed Time ~

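| Before (ms) | After (ms) | Δ        | Notes                          |
|-------------|------------|----------|--------------------------------|
| 795.800     | 793.800    | -0.25% ✓ | preload enabled; 32-bit build  |
| 777.100     | 780.500    | +0.44% ✗ | preload disabled; 32-bit build |
| 779.000     | 791.500    | +1.58% ✗ | preload enabled; 64-bit build  |
| 776.000     | 781.400    | +0.69% ✗ | preload disabled; 64-bit build |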

Comparison of startup times between the $(BundleAssemblies) == True
scheme and the new one with the same device and application as
above (once again Before refers to the $(BundleAssemblies)
application):

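| Before (ms) | After (ms) | Δ        | Notes                          |
|-------------|------------|----------|--------------------------------|
| 855.600     | 793.800    | -7.22% ✓ | preload enabled; 32-bit build  |
| 843.000     | 780.500    | -7.41% ✓ | preload disabled; 32-bit build |
| 849.400     | 791.500    | -6.82% ✓ | preload enabled; 64-bit build  |
| 841.600     | 781.400    | -7.15% ✓ | preload disabled; 64-bit build |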

@grendello grendello force-pushed the compress-assemblies branch 7 times, most recently from 7225496 to a339792 Compare May 15, 2020 18:54
@jonpryor
Member

@JonDouglas chimed in today and believes that we should have assembly compression enabled by default.

@grendello grendello force-pushed the compress-assemblies branch 2 times, most recently from 5d80083 to ea8f74f Compare May 19, 2020 08:21
@grendello
Contributor Author

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@grendello grendello force-pushed the compress-assemblies branch 3 times, most recently from 3503e1c to f48cd41 Compare May 20, 2020 10:58
@grendello grendello changed the title Compress assemblies Add support for compressed assemblies in APK May 20, 2020
@grendello grendello marked this pull request as ready for review May 20, 2020 11:01
@grendello grendello force-pushed the compress-assemblies branch 4 times, most recently from 4f07ef4 to 61d4e74 Compare May 20, 2020 17:43
@brendanzagaeski
Contributor

Draft release notes

During the finalization of this PR before merge, when you get a chance, you can add a release note for it to Documentation/release-notes/4686.md as part of the PR. Here's a rough draft that has some ?? items to indicate missing info. The other info might also not be 100% accurate, so feel free to edit liberally. Thanks!

### Smaller app package sizes

This version introduces compression of managed assemblies by default for Release
configuration builds, resulting in significantly smaller APK and App Bundle
sizes.  Assemblies are compressed with the [LZ4][lz4] algorithm during builds
and then decompressed on device during app startup.

For a small example Xamarin.Forms application, this reduced the APK size from
about 23 megabytes to about 17 megabytes while only increasing the time to
display the first page of the app from about 780 milliseconds to about 790
milliseconds in the least favorable configuration.

If needed, the new behavior can be disabled for a particular project by
setting the `AndroidEnableAssemblyCompression` MSBuild property to `false` in
the _.csproj_ file:

```xml
<PropertyGroup>
  <AndroidEnableAssemblyCompression>false</AndroidEnableAssemblyCompression>
</PropertyGroup>
```

> [!NOTE]
> This feature is intended to replace the older **Bundle assemblies into native
> code** Visual Studio Enterprise feature for purposes of app size savings.  The
> `AndroidEnableAssemblyCompression` property takes precedence if both features
> are enabled.  Project authors who no longer need the **Bundle assemblies into
> native code** feature enabled can now disable it or remove the
> `BundleAssemblies` MSBuild property from the _.csproj_ file:
>
> ```diff
>  <PropertyGroup Condition=" '$(Configuration)|$(Platform)' == 'Release|AnyCPU' ">
>    <DebugSymbols>True</DebugSymbols>
>    <DebugType>portable</DebugType>
>    <Optimize>True</Optimize>
>    <OutputPath>bin\Release\</OutputPath>
>    <DefineConstants>TRACE</DefineConstants>
>    <ErrorReport>prompt</ErrorReport>
>    <WarningLevel>4</WarningLevel>
>    <AndroidManagedSymbols>true</AndroidManagedSymbols>
>    <AndroidUseSharedRuntime>False</AndroidUseSharedRuntime>
>    <AndroidLinkMode>SdkOnly</AndroidLinkMode>
>    <EmbedAssembliesIntoApk>True</EmbedAssembliesIntoApk>
> -  <BundleAssemblies>true</BundleAssemblies>
>  </PropertyGroup>
> ```

#### Background information

For comparison, for the small test Xamarin.Forms application, the **Bundle
assemblies into native code** feature reduces the APK size from about 23
megabytes to about ?? megabytes while ??increasing?? the time to display the
first page of the app from about 780 milliseconds to about ?? milliseconds in
the least favorable configuration.

[lz4]: https://github.com/lz4/lz4

// }

data.DestinationPath = $"{data.SourcePath}.lz4";
data.SourceSize = (uint)fi.Length;
Member

Yet another place I think we should consider using checked()

@@ -5,13 +5,13 @@
"Size": 3684
},
"classes.dex": {
- "Size": 2200624
+ "Size": 2198984
Member

Somewhat odd that this shrank

Contributor Author

@grendello grendello May 22, 2020

¯\_(ツ)_/¯

@jonpryor
Member

I'm not quite sure I understand the use/interaction of @(AndroidCustomMetaDataForReferences) item group. I had thought that the %(AndroidSkipCompression) metadata would be provided via e.g. %(Reference.AndroidSkipCompression) or %(ProjectReference.AndroidSkipCompression). Can those also be used? Or is @(AndroidCustomMetaDataForReferences) needed?

@grendello
Contributor Author

> I'm not quite sure I understand the use/interaction of @(AndroidCustomMetaDataForReferences) item group. I had thought that the %(AndroidSkipCompression) metadata would be provided via e.g. %(Reference.AndroidSkipCompression) or %(ProjectReference.AndroidSkipCompression). Can those also be used? Or is @(AndroidCustomMetaDataForReferences) needed?

Either can be used. The item group simply makes this possible for assemblies you don't control (those coming from NuGet packages, for instance), as you wouldn't be able to add metadata items to them in any other way.

@jonpryor
Member

> Single ABI enabled

I find it surprising that $(BundleAssemblies) results in a smaller .apk size when only a single ABI is included in the .apk. Not only that, but $(BundleAssemblies) is 11% smaller!

Any idea why this is the case?

byte[] sourceBytes = null;
byte[] destBytes = null;
try {
sourceBytes = bytePool.Rent ((int)fi.Length);
Member

Another thought on the checked() front, 4GB is "too big", but what about "2GB+1"? That would be enough for this (int) cast to return a negative value, which can't be good.

Presumably bytePool.Rent() would throw, but given that Compress() could return CompressionResult.InputTooBig, an "unhandled exception" might be less than ideal.
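
The wrap-around concern can be illustrated outside C#. The Python sketch below only mimics the two behaviours and is not the PR's code: `unchecked_int32` and `checked_int32` are hypothetical helpers standing in for an unchecked `(int)` cast and for `checked((int)value)` respectively.

```python
import ctypes

INT32_MIN = -2**31
INT32_MAX = 2**31 - 1


def unchecked_int32(value: int) -> int:
    """Mimic an unchecked (int) cast: keep the low 32 bits, reinterpret as signed."""
    return ctypes.c_int32(value & 0xFFFFFFFF).value


def checked_int32(value: int) -> int:
    """Mimic checked((int)value): raise instead of silently wrapping."""
    if not INT32_MIN <= value <= INT32_MAX:
        raise OverflowError(f"{value} does not fit in a 32-bit signed integer")
    return value


file_length = 2**31 + 1                  # the "2GB+1" case, in bytes
print(unchecked_int32(file_length))      # -2147483647

try:
    checked_int32(file_length)
except OverflowError as exc:
    print(f"rejected: {exc}")
```

Validating the length up front lets the caller report an "input too big" result itself rather than relying on whatever a negative size does further down.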

@grendello
Contributor Author

> Single ABI enabled
>
> I find it surprising that $(BundleAssemblies) results in a smaller .apk size when only a single ABI is included in the .apk. Not only that, but $(BundleAssemblies) is 11% smaller!
>
> Any idea why this is the case?

gzip's compression ratio is better than lz4's, but gzip is much slower: at level -9 the former compresses at around 15 MB/s and the latter at around 46 MB/s. On the same machine, gzip decompresses at around 235 MB/s and lz4 at around 550 MB/s.

Weird intermixing of spaces following tabs…
@jonpryor
Member

jonpryor commented May 26, 2020

Draft commit message:

Currently, Xamarin.Android supports compression of managed assemblies
within the `.apk` if the app is built with
[`$(BundleAssemblies)`=True][0], with the compressed assembly data
stored within `libmonodroid_bundle_app.so` using gzip compression and
placed in an array inside the data section of the shared library.

There are two problems with this approach:

 1. `mkbundle` emits C code, which requires a C compiler which requires
    the full Android NDK, and thus requires Visual Studio Enterprise.

 2. Reliance on Mono's `mkbundle` results in possible issues around
    [filename globbing][1] such that
    `Xamarin.AndroidX.AppCompat.Resources.dll` is improperly treated
    as a [satellite assembly][2].

Because of (2), we are planning on [removing support][3] for
`$(BundleAssemblies)` in .NET 6 ([née .NET 5][4]), which resulted in
[some pushback][5] because `.apk` size is very important for some
customers, and the startup overheads we believed to be inherent to
`$(BundleAssemblies)` turned out to be somewhat over-estimated.

To resolve the above issues, add an assembly compression mechanism
that doesn't rely on `mkbundle` and the NDK: separately compress the
assemblies and store the compressed data within the `.apk`.
Compression is performed using the [managed implementation][6] of the
excellent [LZ4][7] algorithm.  This gives us a decent compression ratio
and a much faster (de)compression speed than gzip/zlib offer.  Also,
assemblies are stored directly in the APK in their usual directory,
which allows us to [**mmap**(2)][8] them in the runtime directly from
the `.apk`.  The build process calculates the size required to store
the decompressed assemblies and adds a data section to
`libxamarin-app.so` which causes *Android* to allocate all the required
memory when the DSO is loaded, thus removing the need for dynamic memory
allocation and making the startup faster.

Compression is supported only in `Release` builds and is enabled by
default, but it can be turned off by setting the
`$(AndroidEnableAssemblyCompression)` MSBuild property to `False`.
Compression can be disabled for an individual assembly by setting the
`%(AndroidSkipCompression)` MSBuild item metadata to True for the
assembly in question, e.g. via:

	<AndroidCustomMetaDataForReferences Include="MyAssembly.dll">
	  <AndroidSkipCompression>true</AndroidSkipCompression>
	</AndroidCustomMetaDataForReferences>

The compressed assemblies still use their original name, e.g.
`Mono.Android.dll`, so that we don't have to perform any string
matching at runtime in order to detect whether the assembly we are
asked to load is compressed or not.  Instead, the compression code
*prepends* a short header to each `.dll` file (in pseudo C code):

	struct CompressedAssemblyHeader {
	    uint32_t magic;                 // 0x5A4C4158; 'XALZ', little-endian
	    uint32_t descriptor_index;      // Index into an internal assembly descriptor table
	    uint32_t uncompressed_length;   // Size of assembly, uncompressed
	};

The decompression code looks at the `mmap`ed data and checks whether
the above header is present.  If yes, the assembly is decompressed,
otherwise it's loaded as-is.

It is important to remember that the assemblies are compressed at
build time using LZ4 block compression, which requires assembly data
to be entirely loaded into memory before compression; we do this
instead of using the LZ4 frame format to make decompression at runtime
faster.  The compression output also requires a separate buffer, thus
memory consumption at *build* time will be roughly 1.5x the size of the
largest assembly, which is reused across all assemblies.


~~ Application Size ~~

A Xamarin.Forms "Hello World" application `.apk` shrinks by 27% with
this approach:

|    Before (bytes) |   LZ4 (bytes) |     Δ     |
|------------------:|--------------:|:---------:|
|        23,305,194 |    16,813,034 |  -27.85%  |

Size comparison between this commit and `.apk`s created with
`$(BundleAssemblies)`=True depends on the number of enabled ABI
targets in the application.  For each ABI, `$(BundleAssemblies)`=True
creates a separate shared library, so the amount of space consumed
increases by the size of the bundle shared library.

The new compression scheme shares the compressed assemblies among all
the enabled ABIs, thus effectively creating smaller multi-ABI `.apk`s.

In the table below, `mkbundle` refers to the `.apk` created with
`$(BundleAssemblies)`=True, and `LZ4` refers to the `.apk` built with
the new compression scheme:

|                                  ABIs |  mkbundle (bytes) |   LZ4 (bytes) |    Δ    |
|--------------------------------------:|------------------:|--------------:|---------|
|   armeabi-v7a, arm64-v8a, x86, x86_64 |        27,130,240 |    16,813,034 | -38.03% |
|                             arm64-v8a |         7,783,449 |     8,746,878 | +11.01% |

The single ABI case is ~11% larger because gzip offers better
compression, at the cost of higher runtime startup overhead.


~~ Startup Performance ~~

When launching the Xamarin.Forms "Hello World" application on a
Pixel 3 XL, the use of LZ4-compressed assemblies has at worst a ~1.58%
increase in the Activity Displayed time (64-bit app w/ assembly
preload enabled) and is slightly faster for the 32-bit app with
preload enabled. It is *always* faster than the `mkbundle` startup
time, in all configurations:

|                                   |           |               |           |  LZ4 vs  |   LZ4 vs   |
|                       Description | None (ms) | mkbundle (ms) |  LZ4 (ms) |  None Δ  | mkbundle Δ |
|----------------------------------:|----------:|--------------:|----------:|:--------:|:----------:|
|     preload enabled; 32-bit build |     795.8 |         855.6 |     793.8 | -0.25% ✓ |  -7.22% ✓  |
|    preload disabled; 32-bit build |     777.1 |         843.0 |     780.5 | +0.44% ✗ |  -7.41% ✓  |
|     preload enabled; 64-bit build |     779.0 |         849.4 |     791.5 | +1.58% ✗ |  -6.82% ✓  |
|    preload disabled; 64-bit build |     776.0 |         841.6 |     781.4 | +0.69% ✗ |  -7.15% ✓  |


[0]: https://docs.microsoft.com/en-us/xamarin/android/deploy-test/release-prep/?tabs=windows#bundle-assemblies-into-native-code
[1]: https://github.com/xamarin/AndroidX/issues/64
[2]: https://github.com/mono/mono/blob/9b4736d4c271e9d4e04cafa258ddd58961f1a39f/mcs/tools/mkbundle/mkbundle.cs#L1315-L1317
[3]: https://github.com/xamarin/AndroidX/issues/64#issuecomment-609970584
[4]: https://devblogs.microsoft.com/dotnet/announcing-net-5-preview-4-and-our-journey-to-one-net/
[5]: https://github.com/xamarin/AndroidX/issues/64#issuecomment-610002467
[6]: https://www.nuget.org/packages/K4os.Compression.LZ4/
[7]: https://github.com/lz4/lz4
[8]: https://linux.die.net/man/2/mmap

@jonpryor jonpryor merged commit d236af5 into dotnet:master May 26, 2020
@grendello grendello deleted the compress-assemblies branch May 26, 2020 19:01
jonpryor pushed a commit that referenced this pull request May 26, 2020
…#4686)

@scottkdavis

By enabling this by default, you have broken every post-package obfuscator. Mine was spitting out a cryptic "invalid assembly" error. It took 3 days to figure out this was a problem with the packaging of the APK, not a problem with my obfuscator. I use a robust obfuscator that renames the public interfaces/methods of my two .NET Standard DLLs, and propagates those name changes up to the referencing DLLs above. It is more complex than simple obfuscators and sadly only runs post-packaging. It would be really nice if Microsoft would incorporate security directly into the platform, rather than forcing me to use an external tool. It is easy to forget about external tools that might suffer from a packaging change. It's unfortunate when those tools are for something as important as securing your apps. Everything is working again after I discovered the tag. This should be part of the Visual Studio project UI, with an info bubble that says, "This may need to be disabled to allow for post-packaging obfuscation or instrumentation tools."

@jonathanpeppers
Member

@scottkdavis can you file a new issue describing how you hook into MSBuild? Include a diagnostic MSBuild log.

It is possible you are referencing a "private" MSBuild target that is prefixed with an underscore. This means it could be renamed/reordered, etc. We have some documented extension points that should be used instead.

@scottkdavis

Hello @jonathanpeppers thanks for the response. This is post-package. My obfuscator takes the APK as the input, obfuscates the DLLs, then repackages and resigns the APK. I have a very robust obfuscator that renames all public methods and classes and maps those changes into the upstream DLLs. I tried to make this part of the build process years ago, but my project is structured like this:

1: Android csproj -> 2: Xamarin Forms UI csproj -> 3: Core logic and service layer csproj

When I tried to incorporate the public renaming into the build process, DLL3's public interfaces are renamed when DLL2 is compiled, then renamed again when DLL1 is compiled. The naming map is different (random) and therefore the double renaming breaks things. When incorporated into the build process, the obfuscator doesn’t know if another project is also going to do renaming as part of the compile since they run in separate processes. That’s why the obfuscator must run after packaging so a single obfuscation process can handle the public renaming and mapping across all DLLs at once.

I had written a long response about app security, my past meetings with the Xamarin product team about the lack of good security practices among Xamarin app developers, and how to help developers write secure apps. I deleted it; that is a whole other conversation. Happy to help any way I can, but the point of my first comment was to alert the team to the fact that this change breaks post-packaging tools and it can be a very expensive process to discover what happened. I realize this impacts very few developers since in all the conference talks I've given on app security, I’ve only met one other Xamarin developer who is obfuscating their app. I've never met anyone who understood the importance of stopping a hacker from using the public methods on their DLLs to automate activity, or robo-call their servers. It's unlikely most developers will ever care about securing their apps until it is just part of Visual Studio.

No further action needed for me. I have my solution working. I was simply passing on a side-effect you may not have anticipated. Best wishes.

@jonathanpeppers
Member

@scottkdavis if your product post processes APK files, you should be able to use the same K4os.Compression.LZ4 NuGet package used here:

https://github.com/xamarin/xamarin-android/blob/b17a8134af6be692ad7dccf75c96cfd58a0f4ae0/src/Xamarin.Android.Build.Tasks/Utilities/AssemblyCompression.cs#L81

You should be able to lz4-decompress, run your obfuscation, lz4-compress and put things back the way they were. I would recommend adding this feature, as it is enabled for Release builds of Xamarin.Android apps by default going forward.

Does your product obfuscate AOT'd assemblies? I would think you would hit similar problems there. Does it support Android App Bundles, as well?

@scottkdavis

Hi Jonathan, I should have been more clear. When I say "my obfuscator" I meant the one I have purchased. I don't have access to their pipeline for processing files. I have, however, forwarded the compression information to them.

I sincerely appreciate all your responses to help me find an optimal solution. I'm fine with distributing uncompressed assemblies for now. The ROI of changing my build process again is negative.

Thank you for the attention you have given this.
