The createdump tool creates double crash dumps when running in a Linux container on Kubernetes #78956

Closed
kristjanjogi-msft opened this issue Nov 29, 2022 · 6 comments

kristjanjogi-msft commented Nov 29, 2022

Description

The createdump tool creates two crash dumps for a single crash when running in a Linux container on Kubernetes.

Reproduction Steps

https://github.com/kristjanjogi-msft/createdump-doubledump-bug

Prerequisites

  1. .NET 7 SDK
  2. Docker Desktop with Linux containers and Kubernetes support

Steps:

  1. Clone the repository:
    git clone https://github.com/kristjanjogi-msft/createdump-doubledump-bug.git

  2. cd createdump-doubledump-bug

  3. Build and publish the crashing application:
    dotnet publish -c Release -o published CrashingApplication/CrashingApplication.csproj

  4. Build the Docker image:
    docker build -t crashingapplication:1.0 .

  5. Run a Kubernetes pod in which the createdump tool creates two crash dumps for a single crash:
    kubectl apply -f doubledump.yaml

  6. View the pod logs, which show that two crash dumps are created instead of one:
    kubectl logs doubledump

Hello, World!
Unhandled exception. System.Exception: Crash
   at Program.<Main>$(String[] args) in [redacted]\createdump-doubledump-bug\CrashingApplication\Program.cs:line 4
[createdump] Gathering state for process 1 dotnet
[createdump] Crashing thread 00000001 signal 00000006
[createdump] Writing full dump to file /tmp/crashdump.1.1669709005
[createdump] Written 220954624 bytes (53944 pages) to core file
[createdump] Target process is alive
[createdump] Dump successfully written in 186ms
[createdump] Gathering state for process 1 dotnet
[createdump] Crashing thread 00000001 signal 0000000b
[createdump] Writing full dump to file /tmp/crashdump.1.1669709006
[createdump] Written 220954624 bytes (53944 pages) to core file
[createdump] Target process is alive
[createdump] Dump successfully written in 222ms
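
The two "Crashing thread" lines above report different signal numbers, printed in hexadecimal. A minimal sketch for decoding them follows. It assumes Linux signal numbering (6 is SIGABRT, 11 is SIGSEGV); the function name `crash_signals` and the lookup table are hypothetical helpers, not part of createdump.

```python
import re

# Assumption: Linux signal numbering. Only the two signals seen in this log are listed.
SIGNAL_NAMES = {6: "SIGABRT", 11: "SIGSEGV"}

# Matches lines such as "[createdump] Crashing thread 00000001 signal 0000000b".
CRASH_LINE = re.compile(
    r"\[createdump\] Crashing thread ([0-9a-fA-F]+) signal ([0-9a-fA-F]+)"
)


def crash_signals(log_text):
    """Return one signal name per 'Crashing thread' line, in log order."""
    names = []
    for match in CRASH_LINE.finditer(log_text):
        number = int(match.group(2), 16)  # the log prints the signal in hex
        names.append(SIGNAL_NAMES.get(number, f"signal {number}"))
    return names
```

Feeding the output of `kubectl logs doubledump` to `crash_signals` gives one entry per dump, which makes it easy to see that the two dumps were triggered by different signals.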

Expected behavior

One crash dump is created for a single crash.

Actual behavior

Two crash dumps are created for a single crash.

Regression?

No response

Known Workarounds

No response

Configuration

root@doubledump:/app# dotnet --info

Host:
  Version:      7.0.0
  Architecture: x64
  Commit:       d099f075e4

.NET SDKs installed:
  No SDKs were found.

.NET runtimes installed:
  Microsoft.AspNetCore.App 7.0.0 [/usr/share/dotnet/shared/Microsoft.AspNetCore.App]
  Microsoft.NETCore.App 7.0.0 [/usr/share/dotnet/shared/Microsoft.NETCore.App]

Other architectures found:
  None

Environment variables:
  Not set

global.json file:
  Not found

Learn more:
  https://aka.ms/dotnet/info

Download .NET:
  https://aka.ms/dotnet/download

Other information

No response

@ghost ghost added the untriaged New issue has not been triaged by the area owner label Nov 29, 2022

ghost commented Nov 29, 2022

Tagging subscribers to this area: @tommcdon
See info in area-owners.md if you want to be subscribed.

Author: kristjanjogi-msft
Assignees: -
Labels:

area-Diagnostics-coreclr

Milestone: -

@tommcdon tommcdon added this to the 8.0.0 milestone Nov 30, 2022
@ghost ghost removed the untriaged New issue has not been triaged by the area owner label Nov 30, 2022
@tommcdon
Member

Related to #69380

@tommcdon
Member

tommcdon commented Dec 5, 2022

@hoyosjs discovered that the first dump is caused by SIGABRT (signal 6) and the second by SIGSEGV (signal 11, shown as 0000000b in the log).

@mikem8361
Member

@kristjanjogi-msft, would it be possible to attach or send me the two dumps that were generated (i.e. /tmp/crashdump.1.1669709005 and /tmp/crashdump.1.1669709006)? The second one might help quickly track down where the SIGSEGV is coming from, which might be somewhere in runtime shutdown.

Thanks.

@kristjanjogi-msft
Author

@mikem8361 Sent response via email.

mikem8361 added a commit to mikem8361/runtime that referenced this issue Jan 11, 2023
Issue: dotnet#78956

After a core dump is generated because of an unhandled managed exception,
abort() is called, but a SIGSEGV is generated in libpthread.so, which is
caught by the runtime, and a second core dump is generated. The fix is
to uninstall/uninitialize all the signal handlers, not just SIGABRT.
mikem8361 added a commit that referenced this issue Jan 11, 2023
@mikem8361
Member

Fixed with PR #80474.
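
The idea in the commit message (uninstall every crash-signal handler before aborting, not only the SIGABRT one) can be illustrated with Python's `signal` module. This is only an analogy under stated assumptions: the runtime's actual change is native code in PR #80474 and is not reproduced here, and the names `CRASH_SIGNALS`, `install_handlers` and `uninstall_handlers` are hypothetical.

```python
import signal

# Assumption: only the two signals seen in this issue are tracked.
CRASH_SIGNALS = (signal.SIGABRT, signal.SIGSEGV)


def install_handlers(handler):
    """Install the same Python-level handler for every crash signal."""
    for sig in CRASH_SIGNALS:
        signal.signal(sig, handler)


def uninstall_handlers(signals=CRASH_SIGNALS):
    """Reset the given signals to the default action.

    Passing only (signal.SIGABRT,) mimics the behaviour before the fix:
    the SIGSEGV handler stays installed and could still run after abort.
    Calling it with no argument mimics the fix: all handlers are removed.
    """
    for sig in signals:
        signal.signal(sig, signal.SIG_DFL)
```

With only SIGABRT reset, `signal.getsignal(signal.SIGSEGV)` still returns the custom handler, which corresponds to the second dump in this issue; resetting all of `CRASH_SIGNALS` leaves no handler behind to trigger another dump.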

github-actions bot pushed a commit that referenced this issue Jan 16, 2023
carlossanlop pushed a commit that referenced this issue Jan 17, 2023

Co-authored-by: Mike McLaughlin <mikem@microsoft.com>
mikem8361 added a commit to mikem8361/runtime that referenced this issue Feb 7, 2023
@ghost ghost locked as resolved and limited conversation to collaborators Feb 10, 2023