crash with 0x8007000E error on .NET 7.0 #85556

Open
AlphaBs opened this issue Apr 29, 2023 · 18 comments

@AlphaBs

AlphaBs commented Apr 29, 2023

Description

The dotnet --version and dotnet build commands crash with error 0x8007000E.

ubuntu@localhost:~$ dotnet --version
GC heap initialization failed with error 0x8007000E
Failed to create CoreCLR, HRESULT: 0x8007000E

Reproduction Steps

Install the .NET 7.0 SDK with the dotnet-install.sh script, then run dotnet --version in a terminal.

Expected behavior

Prints the dotnet version.

Actual behavior

ubuntu@localhost:~$ dotnet --version
GC heap initialization failed with error 0x8007000E
Failed to create CoreCLR, HRESULT: 0x8007000E

I also ran the command under strace:
stracelog.txt

Regression?

Works perfectly on the .NET 6.0.4 SDK.

Known Workarounds

I worked around the problem by setting DOTNET_GCHeapHardLimit:

ubuntu@localhost:~/.dotnet$ DOTNET_GCHeapHardLimit=1C0000000 dotnet --version
7.0.203

However, why does this occur on .NET 7.0? On .NET 6.0 it works without any problem.

Configuration

Ubuntu jammy, ARM64, running on Termux (Android 13)

Other information

ubuntu@localhost:~/.dotnet$ ulimit -v
unlimited
ubuntu@localhost:~/.dotnet$ cat /proc/meminfo
MemTotal:        7475488 kB
MemFree:          214228 kB
MemAvailable:    1438284 kB
Buffers:             756 kB
Cached:          1464412 kB
SwapCached:        22804 kB
Active:          1207476 kB
Inactive:        1771804 kB
Active(anon):     639508 kB
Inactive(anon):  1270248 kB
Active(file):     567968 kB
Inactive(file):   501556 kB
Unevictable:       19004 kB
Mlocked:           16752 kB
RbinTotal:        327680 kB
RbinAlloced:        7168 kB
RbinPool:              0 kB
RbinFree:             80 kB
RbinCached:       320432 kB
ZeroedFree:            0 kB
SwapTotal:       4194300 kB
SwapFree:        1015972 kB
Dirty:               620 kB
Writeback:             0 kB
AnonPages:       1845132 kB
Mapped:           902216 kB
Shmem:             60352 kB
KReclaimable:     315004 kB
Slab:             589340 kB
SReclaimable:     146800 kB
SUnreclaim:       442540 kB
KernelStack:       97984 kB
ShadowCallStack:   24532 kB
PageTables:       209516 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:     7768204 kB
Committed_AS:   238931928 kB
VmallocTotal:   263061440 kB
VmallocUsed:      257124 kB
VmallocChunk:          0 kB
Percpu:            12544 kB
AnonHugePages:         0 kB
ShmemHugePages:        0 kB
ShmemPmdMapped:        0 kB
FileHugePages:         0 kB
FilePmdMapped:         0 kB
HugepagePool:          0 kB
CmaTotal:         487424 kB
CmaFree:            4976 kB
dma_heap_pool:    104732 kB
system:           569356 kB
kgsl_pool:         63656 kB
KgslSharedmem:   1044500 kB
zram0:            850656 kB
ubuntu@localhost:~/.dotnet$ dotnet --info
GC heap initialization failed with error 0x8007000E
Failed to create CoreCLR, HRESULT: 0x8007000E

Host:
  Version:      7.0.5
  Architecture: arm64
  Commit:       8042d61b17

.NET SDKs installed:
  7.0.203 [/home/ubuntu/.dotnet/sdk]

.NET runtimes installed:
  Microsoft.AspNetCore.App 7.0.5 [/home/ubuntu/.dotnet/shared/Microsoft.AspNetCore.App]
  Microsoft.NETCore.App 7.0.5 [/home/ubuntu/.dotnet/shared/Microsoft.NETCore.App]

Other architectures found:
  None

Environment variables:
  Not set

global.json file:
  Not found

Learn more:
  https://aka.ms/dotnet/info

Download .NET:
  https://aka.ms/dotnet/download
ubuntu@localhost:~/.dotnet$ dotnet --version
GC heap initialization failed with error 0x8007000E
Failed to create CoreCLR, HRESULT: 0x8007000E

@ghost added the untriaged label Apr 29, 2023
@dotnet-issue-labeler bot added the needs-area-label label Apr 29, 2023
@danmoseley added the area-GC-coreclr label and removed the needs-area-label label Apr 29, 2023
@ghost

ghost commented Apr 29, 2023

Tagging subscribers to this area: @dotnet/gc
See info in area-owners.md if you want to be subscribed.


@EgorBo
Member

EgorBo commented May 1, 2023

mmap(NULL, 274877915136, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)

Looks like it tries to reserve 256 GB of memory?
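
As a quick check of that figure, the size passed to mmap can be split into whole GiB plus a remainder. This is only arithmetic on the number in the strace line above, with Python used as a calculator; the helper name describe_reservation is made up for illustration and is not part of any runtime or tool.

```python
GIB = 1 << 30  # bytes per gibibyte

def describe_reservation(size_bytes):
    """Split a byte count into (whole GiB, leftover bytes)."""
    return divmod(size_bytes, GIB)

# Size taken from the failing mmap call quoted above.
whole_gib, extra = describe_reservation(274877915136)
print(f"{whole_gib} GiB + {extra} bytes")  # prints "256 GiB + 8192 bytes"
```

So the request is a 256 GiB reservation plus 8 KiB on top.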

@danmoseley
Member

Does trying with 8.0 preview 3 give the same result?

@mangod9
Member

mangod9 commented May 1, 2023

It's possible Termux doesn't support a large reservation size; something similar was hit with RISC-V recently. Setting the hard limit to something smaller makes the GC reserve a smaller size.

@mangod9 removed the untriaged label May 1, 2023
@mangod9 added this to the 8.0.0 milestone May 1, 2023
@mangod9
Member

mangod9 commented May 1, 2023

@janvorli to check if there is a way to figure out the max reservation size for an OS.

@AlphaBs
Author

AlphaBs commented May 1, 2023

> Does trying with 8.0 preview 3 give the same result?

Yes, same result with the same error on .NET 8.0.100-preview.3.23178.7:
mmap(NULL, 274877911040, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)

@janvorli
Member

janvorli commented May 2, 2023

The max reservation size is also influenced by the virtual memory limit. @AlphaBs can you please check it using the ulimit -a bash command? See the "virtual memory" line.

@AlphaBs
Author

AlphaBs commented May 2, 2023

> The max reservation size is also influenced by the virtual memory limit. @AlphaBs can you please check it using the ulimit -a bash command? See the "virtual memory" line.

ubuntu@localhost:~$ ulimit -a
real-time non-blocking time  (microseconds, -R) unlimited
core file size              (blocks, -c) 0
data seg size               (kbytes, -d) unlimited
scheduling priority                 (-e) 20
file size                   (blocks, -f) unlimited
pending signals                     (-i) 16382
max locked memory           (kbytes, -l) 64
max memory size             (kbytes, -m) unlimited
open files                          (-n) 1024
pipe size                (512 bytes, -p) 8
POSIX message queues         (bytes, -q) 819200
real-time priority                  (-r) 0
stack size                  (kbytes, -s) 8192
cpu time                   (seconds, -t) unlimited
max user processes                  (-u) 25045
virtual memory              (kbytes, -v) unlimited
file locks                          (-x) unlimited

@janvorli
Member

janvorli commented May 2, 2023

@mangod9 /proc/meminfo reports VmallocTotal, which is the total size of the vmalloc virtual address space. From the info dumped in this issue, it is ~250 GB on this device, while on my Linux box it is 32 TB. Maybe we could somehow base our maximum reservation limit on that, although it is a kernel-side allocation limit.
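
To make the idea concrete, here is a minimal sketch of reading VmallocTotal out of /proc/meminfo-style text. It is written in Python purely for illustration and is not how the runtime does (or would do) this; the helper name meminfo_kb is hypothetical, and the sample text is a three-line excerpt of the dump posted in this issue. Whether VmallocTotal is actually the right quantity to base the limit on is still only the suggestion made above.

```python
def meminfo_kb(text, field):
    """Return the value (in kB) of one field from /proc/meminfo-style text."""
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        if name.strip() == field:
            return int(rest.split()[0])
    raise KeyError(field)

# Excerpt of the /proc/meminfo output posted in this issue.
sample = """\
MemTotal:        7475488 kB
VmallocTotal:   263061440 kB
VmallocUsed:      257124 kB
"""

vmalloc_kb = meminfo_kb(sample, "VmallocTotal")
vmalloc_gib = vmalloc_kb / (1 << 20)  # kB -> GiB
print(f"VmallocTotal is about {vmalloc_gib:.1f} GiB")
```

On a real system the text would come from reading /proc/meminfo instead of the embedded sample.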

@mangod9
Member

mangod9 commented May 2, 2023

Hmm, yeah, I guess we need to limit it based on the VM alloc limits. @AlphaBs, I assume you have a workaround to manually set the heap hard limit for now?

@AlphaBs
Author

AlphaBs commented May 3, 2023

> Hmm, yeah, I guess we need to limit it based on the VM alloc limits. @AlphaBs, I assume you have a workaround to manually set the heap hard limit for now?

Yes. I just added export DOTNET_GCHeapHardLimit=1C0000000 to .bashrc. It works well.

While tweaking that variable, I found that the failure threshold was roughly b26000000 (47882174464 bytes, about 47 GB).

I also found that this limit is not fixed, but varies from run to run (perhaps depending on the machine's memory usage at the time).

Here is the result of running the command repeatedly with DOTNET_GCHeapHardLimit=b26000000.

ubuntu@localhost:~$ dotnet --version
GC heap initialization failed with error 0x8007000E
Failed to create CoreCLR, HRESULT: 0x8007000E
ubuntu@localhost:~$ free
               total        used        free      shared  buff/cache   available
Mem:         7475488     5316836      345408       44060     1813244     2088116
Swap:        4194300     2724232     1470068
ubuntu@localhost:~$ dotnet --version
7.0.203
ubuntu@localhost:~$ free
               total        used        free      shared  buff/cache   available
Mem:         7475488     5313228      348764       44060     1813496     2091708
Swap:        4194300     2724232     1470068
ubuntu@localhost:~$ dotnet --version
GC heap initialization failed with error 0x8007000E
Failed to create CoreCLR, HRESULT: 0x8007000E
ubuntu@localhost:~$ dotnet --version
GC heap initialization failed with error 0x8007000E
Failed to create CoreCLR, HRESULT: 0x8007000E
ubuntu@localhost:~$ free
               total        used        free      shared  buff/cache   available
Mem:         7475488     5325684      262436       46936     1887368     1957008
Swap:        4194300     2826736     1367564
ubuntu@localhost:~$ dotnet --version
7.0.203
ubuntu@localhost:~$

Memory usage during execution:

Screenshot_20230503_153440_Termux.jpg

As you can see, the error doesn't always occur with the b26000000 limit. But I don't know what this magic number b26000000 means.

Any ideas??

@AlphaBs
Author

AlphaBs commented May 3, 2023

Is this normal? A very simple console program consumes 220 GB of virtual memory.

Screenshot_20230503_154050_Termux.jpg

testcsharp.cs

Console.WriteLine("Hello, World!");
int a = 0;
while (true) a++;

@janvorli
Member

janvorli commented May 3, 2023

> Is this normal? A very simple console program consumes 220 GB of virtual memory.

It just reserves virtual address space. Virtual address space is per process, which means that each process in the system can reserve up to those 220 GB (on your device). So unless you set a ulimit for virtual memory, this is essentially "free". The application can then map physical memory into the reserved range as it needs it. The amount of physical memory it has used can be seen in the "RES" column. You can see in your screenshot above that it has used about 23 MB of memory.

We reserve the virtual memory so that the GC can have a contiguous range of address space that other memory allocations in the process will not touch.

The reason you started seeing this issue in .NET 7 but not in .NET 6 is that we substantially enlarged the amount of reserved virtual address space because of a significant new enhancement of the GC implementation.

> But I don't know what this magic number b26000000 means

This is a hexadecimal number representing the number of bytes the GC heap can reserve. In decimal, it is 47882174464 bytes.

> I also found that this limit is not fixed, but varies from run to run

There might be some variation depending on other allocations made by the application and the third-party native libraries it uses. I would recommend using e.g. 2/3 of this value to make it reliable. That would mean setting the env variable to e.g. 700000000 (which is 30064771072 in decimal).
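
The hex values above can be double-checked with a few lines of arithmetic. This is only a sketch in Python; the helper name to_heap_limit is made up for illustration, and it merely formats a byte count as the un-prefixed hex string used throughout this thread.

```python
def to_heap_limit(size_bytes):
    """Format a byte count as an un-prefixed upper-case hex string."""
    return format(size_bytes, "X")

observed = int("b26000000", 16)   # threshold reported earlier in the thread
two_thirds = observed * 2 // 3    # the "2/3 of this value" rule of thumb
suggested = int("700000000", 16)  # rounded-down value recommended above

print(observed)                   # prints 47882174464
print(suggested)                  # prints 30064771072
print(to_heap_limit(two_thirds))  # exact 2/3 of the threshold, in hex
```

The recommended 700000000 sits a little below the exact two-thirds point, so it is slightly more conservative than the rule of thumb.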

@stevefan1999-personal

stevefan1999-personal commented Jul 17, 2023

Yes. I also see this problem when running dotnet under proot on Android. Because of that, the C# Dev Kit extension for VS Code on Android does not work, since we can't supply DOTNET_GCHeapHardLimit to it.

@woachk

woachk commented Oct 15, 2023

Android devices ship with a 39-bit VA size across the board.
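
To put that number in context, here is a small arithmetic sketch in Python comparing a 39-bit address space with the reservation size from the failing mmap call quoted earlier in this thread. It treats the full 2^39 bytes as available to the process, which is a simplifying assumption; existing mappings such as the executable, shared libraries and stacks take up part of that range.

```python
GIB = 1 << 30

VA_BITS = 39                # figure quoted above for Android devices
va_space = 1 << VA_BITS     # bytes addressable with 39 bits
reservation = 274877915136  # size from the failing mmap call in this thread

print(va_space // GIB, "GiB of virtual address space")  # prints 512 GiB ...
print(f"reservation is {reservation / va_space:.1%} of it")
```

A single contiguous reservation of slightly more than half the address space leaves little room for fragmentation, which may be why the attempt fails on such devices.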

@am11
Member

am11 commented Jan 29, 2024

I am also seeing this error with valgrind --tool=massif when profiling the heap memory of a NativeAOT app. Setting DOTNET_GCHeapHardLimit=1C0000000 fixes the issue (and I can visualize the output). Adapting to environment constraints at run time would certainly improve the user experience in these chroot-like scenarios.

PS: un-prefixed hex is a confusing choice for an environment variable, IMHO. DOTNET_GCHeapHardLimit should accept both 0x<HEX> and <DECIMAL> values, and show a helpful error message for an invalid value.

@janvorli
Member

@am11 the fact that the values are always in hex is historical, so it is hard to change without breaking someone. However, prefixing the value with 0x already works too. We call strtoul to convert the string to a value. I didn't know this until a few days ago, when someone told me it works; re-reading the strtoul documentation revealed that when you pass in base 16, the string may start with 0x, which is simply skipped.
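
The base-16 prefix convention described here can be illustrated with Python's built-in int(), which follows the same rule when given base 16. This is only an analogy to show the parsing behavior: it is not the runtime's code, it does not call strtoul, and it says nothing about how a given .NET build actually treats the variable. The helper name parse_hex_setting is hypothetical.

```python
def parse_hex_setting(text):
    """Parse a hexadecimal setting, with or without a leading 0x."""
    return int(text, 16)

plain = parse_hex_setting("1C0000000")
prefixed = parse_hex_setting("0x1C0000000")

print(plain)              # prints 7516192768
print(plain == prefixed)  # prints True
```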

@am11
Member

am11 commented Jan 29, 2024

@janvorli, I tried this simple repro:

FROM --platform=linux/aarch64 alpine:latest

RUN apk add build-base curl clang llvm-dev valgrind bash zlib-dev icu-libs
RUN curl -sSL https://dot.net/v1/dotnet-install.sh | bash /dev/stdin --quality daily --channel "9.0" --install-dir "$HOME/.dotnet9"

RUN ~/.dotnet9/dotnet new console --aot -n consoleapp1
WORKDIR consoleapp1
RUN cat > "$HOME/.nuget/NuGet/NuGet.Config" <<EOF
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <packageSources>
    <add key="nuget.org" value="https://api.nuget.org/v3/index.json" protocolVersion="3" />
    <add key="dotnet9" value="https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet9/nuget/v3/index.json" />
  </packageSources>
</configuration>
EOF

RUN ~/.dotnet9/dotnet publish -o dist -c Release

# cache commands

# Fails
RUN echo 'valgrind --tool=massif dist/consoleapp1; echo $?' > run.sh

# Fails
RUN echo 'DOTNET_GCHeapHardLimit=0x1C0000000 valgrind --tool=massif dist/consoleapp1; echo $?' >> run.sh

# Works
RUN echo 'DOTNET_GCHeapHardLimit=1C0000000 valgrind --tool=massif dist/consoleapp1; echo $?' >> run.sh

ENTRYPOINT ["/bin/sh", "run.sh"]

The first two valgrind commands always fail with exit code 255; only the last form (without 0x) succeeds:

# build and tag once
$ docker build -t consoleapp1-valgrind .
# run
$ docker run --rm consoleapp1-valgrind

==7== Massif, a heap profiler
==7== Copyright (C) 2003-2017, and GNU GPL'd, by Nicholas Nethercote
==7== Using Valgrind-3.21.0 and LibVEX; rerun with -h for copyright info
==7== Command: dist/consoleapp1
==7== 
==7== 
255
==8== Massif, a heap profiler
==8== Copyright (C) 2003-2017, and GNU GPL'd, by Nicholas Nethercote
==8== Using Valgrind-3.21.0 and LibVEX; rerun with -h for copyright info
==8== Command: dist/consoleapp1
==8== 
==8== 
255
==9== Massif, a heap profiler
==9== Copyright (C) 2003-2017, and GNU GPL'd, by Nicholas Nethercote
==9== Using Valgrind-3.21.0 and LibVEX; rerun with -h for copyright info
==9== Command: dist/consoleapp1
==9== 
Hello, World!
==9== 
0

(A build of .NET 9 from a few weeks ago showed GC heap initialization failed with error 0x8007000E; today's build prints no error message, just exit code 255 🤔)
