Massive allocations in unmanaged memory on Dictionary creation #54688

Takoooooo · 2021-06-24T15:41:26Z

Description

I wanted to use the https://github.com/thecoderok/Unidecode.NET project in my code, but I realized that it allocates a lot of memory. After some investigation, I found that the cause of all those allocations is this file:
https://github.com/thecoderok/Unidecode.NET/blob/master/src/Unidecoder.Characters.cs
At peak, the Dictionary initialization consumes ~65 MB of memory (.NET 5, x64, Release), and ~99% of that memory is unmanaged, which seems quite strange to me. For comparison, a "Hello World" console app (.NET 5, x64, Release) consumes ~8 MB of memory on my machine.
Steps to reproduce: initialize the dictionary with the data from https://github.com/thecoderok/Unidecode.NET/blob/master/src/Unidecoder.Characters.cs

Configuration

.NET 5, x64, Windows 10

Data

(With dictionary initialization)

(A plain "hello world" app, for comparison)

Analysis

From dotMemory, I can see that as soon as the app starts, it begins allocating memory on the heap (probably initializing the dictionary) and also begins allocating large amounts of unmanaged memory. When the heap allocation ends, some of the unmanaged memory is released and usage stabilizes at 30 MB.

dotnet-issue-labeler · 2021-06-24T15:41:29Z

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

huoyaoyuan · 2021-06-24T17:10:48Z

It seems that diagnostic tools report string content as unmanaged memory.
However, the type has only ~1.5K strings, and a single string certainly shouldn't consume 30KB of memory.

GrabYourPitchforks · 2021-06-24T21:27:53Z

I temporarily pathed this to the VM because those folks probably have the best awareness of what might be reporting this. Or at least can route it appropriately. :)

jkotas · 2021-06-25T01:31:37Z

The dictionary initialization method in the library is huge. The JITed code for it is about 1.5MB.

The unmanaged memory allocation that you are seeing is coming from the JIT. The JIT needs the unmanaged memory to create an intermediate representation of the method. It is expected that the intermediate representation for a 1.5MB method is going to take tens of MB.

Large auto-generated collection initializers are a known source of bad performance and crashes. See for example: #8980.

You should open an issue against the library instead. The library should be fixed to use a data-driven approach for initialization of the Dictionary.
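
To illustrate why a large collection initializer turns into a large method: the C# compiler lowers each entry of a collection initializer into its own `Add` call, so the method body grows with the number of entries. The sketch below shows the two equivalent shapes; the type name `InitializerShapes` and the sample entries are hypothetical and are not taken from Unidecode.NET.

```csharp
using System.Collections.Generic;

public static class InitializerShapes
{
    // What the source looks like: one initializer entry per key.
    public static Dictionary<int, string[]> WithInitializer() =>
        new Dictionary<int, string[]>
        {
            { 0, new[] { "a", "b" } },
            { 1, new[] { "c", "d" } },
        };

    // Roughly what the compiler emits: one Add call (plus the array
    // construction code) per entry, all inside a single method body.
    public static Dictionary<int, string[]> Expanded()
    {
        var d = new Dictionary<int, string[]>();
        d.Add(0, new[] { "a", "b" });
        d.Add(1, new[] { "c", "d" });
        return d;
    }
}
```

With thousands of entries, all of that code lands in one method that the JIT has to compile in a single pass.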

jkotas · 2021-06-25T01:40:26Z

Looks like there is an issue on this already: thecoderok/Unidecode.NET#14

Takoooooo · 2021-06-25T07:50:14Z

> The dictionary initialization method in the library is huge. The JITed code for it is about 1.5MB.
>
> The unmanaged memory allocation that you are seeing is coming from the JIT. The JIT needs the unmanaged memory to create an intermediate representation of the method. It is expected that the intermediate representation for a 1.5MB method is going to take tens of MB.
>
> Large auto-generated collection initializers are a known source of bad performance and crashes. See for example: #8980.
>
> You should open an issue against the library instead. The library should be fixed to use a data-driven approach for initialization of the Dictionary.

I'm sorry, but what do you mean by a "data-driven approach for initialization of the Dictionary"?

huoyaoyuan · 2021-06-25T08:12:20Z

Store the corresponding data in an embedded binary file, a primitive array of constants, or a ReadOnlySpan<byte> that is backed by a constant array. Then use a loop to convert the data into a dictionary.
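
A minimal sketch of that idea, assuming a simplified `Dictionary<int, string>` table and a made-up record layout (2-byte little-endian key, 1-byte length, ASCII bytes). The type name `TransliterationTable`, the layout, and the three sample entries are all hypothetical; they are not the library's actual format or data.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class TransliterationTable
{
    // Hypothetical sample data. Because every element is a constant byte,
    // the compiler can store this as a data blob in the assembly instead of
    // emitting one instruction sequence per element.
    private static ReadOnlySpan<byte> Packed => new byte[]
    {
        0xC0, 0x00, 1, (byte)'A',              // U+00C0 -> "A"
        0xDF, 0x00, 2, (byte)'s', (byte)'s',   // U+00DF -> "ss"
        0xE9, 0x00, 1, (byte)'e',              // U+00E9 -> "e"
    };

    // A small loop decodes the blob, so the method size stays constant
    // no matter how many records the data contains.
    public static Dictionary<int, string> Load()
    {
        var table = new Dictionary<int, string>();
        ReadOnlySpan<byte> data = Packed;
        int pos = 0;
        while (pos < data.Length)
        {
            int key = data[pos] | (data[pos + 1] << 8);
            int length = data[pos + 2];
            table[key] = Encoding.ASCII.GetString(data.Slice(pos + 3, length));
            pos += 3 + length;
        }
        return table;
    }
}
```

The strings are still allocated on the managed heap at load time, but the code the JIT has to compile is only the short loop.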

Takoooooo · 2021-06-25T09:39:49Z

> Store the corresponding data in an embedded binary file, a primitive array of constants, or a ReadOnlySpan<byte> that is backed by a constant array. Then use a loop to convert the data into a dictionary.

I tried creating one array of integers and a jagged array of strings and adding them to the dictionary in a for loop, but it doesn't really solve the issue. Still ~65 MB.

huoyaoyuan · 2021-06-25T12:04:24Z

> a jagged array of strings

This is still code-heavy. String is not considered primitive in this case, so the compiler still emits instructions for every string element.

You can inspect the output assembly with ILSpy, and examine the IL size of method body.
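
As a rough alternative to opening the assembly in ILSpy, the IL size can also be read at run time through reflection. The sketch below assumes the table is built in a static initializer (so the code lives in the type's static constructor); `IlSize` and `SampleTable` are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public static class IlSize
{
    // Returns the IL body size, in bytes, of the type's static constructor,
    // or -1 if the type has none or the body is unavailable.
    public static int OfTypeInitializer(Type type)
    {
        ConstructorInfo cctor = type.TypeInitializer;
        MethodBody body = cctor == null ? null : cctor.GetMethodBody();
        byte[] il = body == null ? null : body.GetILAsByteArray();
        return il == null ? -1 : il.Length;
    }
}

// Hypothetical stand-in for a type with a generated table.
public static class SampleTable
{
    public static readonly Dictionary<int, string[]> Characters =
        new Dictionary<int, string[]>
        {
            { 0, new[] { "a", "b" } },
            { 1, new[] { "c", "d" } },
        };
}
```

Calling `IlSize.OfTypeInitializer(typeof(SampleTable))` before and after a change gives a quick indication of whether the initializer actually got smaller.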

Takoooooo added the tenet-performance Performance related issue label Jun 24, 2021

dotnet-issue-labeler bot added the untriaged New issue has not been triaged by the area owner label Jun 24, 2021

GrabYourPitchforks added the area-VM-coreclr label Jun 24, 2021

jkotas closed this as completed Jun 25, 2021

jkotas added area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI and removed area-VM-coreclr labels Jun 25, 2021

ghost locked as resolved and limited conversation to collaborators Jul 25, 2021
