Process.MainModule.FileName incorrect with kernel 5.0.0-27-generic (causing MSBuild to be broken) #30774

Closed
qmfrederik opened this issue Sep 5, 2019 · 35 comments

Comments

@qmfrederik
Contributor

See dotnet/core#3312 and dotnet/core#3309.

There are various reports of Process.GetCurrentProcess().MainModule.FileName returning an incorrect value when running .NET Core inside a container on Ubuntu 19.04 using kernel 5.0.0-27-generic. Reverting back to kernel 5.0.0-25-generic fixes this issue.

The same issue also reproduces on the Azure variant of Ubuntu, with kernel version 5.0.0-1018.

The simplest steps to repro that I know of are in dotnet/core#3309.

@qmfrederik
Contributor Author

Opening an issue here as the MSBuild team believes this is not related to MSBuild itself. I don't have time to dig into this myself (I just reverted to the previous kernel) but thought it'd be good to have a corefx issue open for this.

@danmoseley
Member

Interesting. Perhaps @tmds has thoughts about what we may be doing wrong here.

@danmoseley changed the title from "Process.MainModule.FileName incorrect with kernel 5.0.0-27-generic" to "Process.MainModule.FileName incorrect with kernel 5.0.0-27-generic (causing MSBuild to be broken)" on Sep 5, 2019
@tmds
Member

tmds commented Sep 5, 2019

We read this from the /proc/<pid>/maps file.
Using kernel 5.0.0-27-generic on Ubuntu 18.04 in a container, every pathname is / instead of the mapped file path.

5.0.0-25-generic (working):

$ sudo docker run ubuntu cat /proc/1/maps
56324a3f5000-56324a3fd000 r-xp 00000000 fc:01 665673                     /bin/cat
56324a5fc000-56324a5fd000 r--p 00007000 fc:01 665673                     /bin/cat
56324a5fd000-56324a5fe000 rw-p 00008000 fc:01 665673                     /bin/cat
56324a8eb000-56324a90c000 rw-p 00000000 00:00 0                          [heap]
7f8ab5f91000-7f8ab6178000 r-xp 00000000 fc:01 666073                     /lib/x86_64-linux-gnu/libc-2.27.so
7f8ab6178000-7f8ab6378000 ---p 001e7000 fc:01 666073                     /lib/x86_64-linux-gnu/libc-2.27.so
7f8ab6378000-7f8ab637c000 r--p 001e7000 fc:01 666073                     /lib/x86_64-linux-gnu/libc-2.27.so
7f8ab637c000-7f8ab637e000 rw-p 001eb000 fc:01 666073                     /lib/x86_64-linux-gnu/libc-2.27.so
7f8ab637e000-7f8ab6382000 rw-p 00000000 00:00 0 
7f8ab6382000-7f8ab63a9000 r-xp 00000000 fc:01 666055                     /lib/x86_64-linux-gnu/ld-2.27.so
7f8ab6583000-7f8ab65a7000 rw-p 00000000 00:00 0 
7f8ab65a9000-7f8ab65aa000 r--p 00027000 fc:01 666055                     /lib/x86_64-linux-gnu/ld-2.27.so
7f8ab65aa000-7f8ab65ab000 rw-p 00028000 fc:01 666055                     /lib/x86_64-linux-gnu/ld-2.27.so
7f8ab65ab000-7f8ab65ac000 rw-p 00000000 00:00 0 
7ffc25954000-7ffc25975000 rw-p 00000000 00:00 0                          [stack]
7ffc259f1000-7ffc259f4000 r--p 00000000 00:00 0                          [vvar]
7ffc259f4000-7ffc259f5000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]

5.0.0-27-generic (broken):

$ sudo docker run ubuntu cat /proc/1/maps
55de0e052000-55de0e05a000 r-xp 00000000 fc:01 665673                     /
55de0e259000-55de0e25a000 r--p 00007000 fc:01 665673                     /
55de0e25a000-55de0e25b000 rw-p 00008000 fc:01 665673                     /
55de0fcb3000-55de0fcd4000 rw-p 00000000 00:00 0                          [heap]
7f7a8d881000-7f7a8da68000 r-xp 00000000 fc:01 666073                     /
7f7a8da68000-7f7a8dc68000 ---p 001e7000 fc:01 666073                     /
7f7a8dc68000-7f7a8dc6c000 r--p 001e7000 fc:01 666073                     /
7f7a8dc6c000-7f7a8dc6e000 rw-p 001eb000 fc:01 666073                     /
7f7a8dc6e000-7f7a8dc72000 rw-p 00000000 00:00 0 
7f7a8dc72000-7f7a8dc99000 r-xp 00000000 fc:01 666055                     /
7f7a8de73000-7f7a8de97000 rw-p 00000000 00:00 0 
7f7a8de99000-7f7a8de9a000 r--p 00027000 fc:01 666055                     /
7f7a8de9a000-7f7a8de9b000 rw-p 00028000 fc:01 666055                     /
7f7a8de9b000-7f7a8de9c000 rw-p 00000000 00:00 0 
7ffc744bd000-7ffc744de000 rw-p 00000000 00:00 0                          [stack]
7ffc7452d000-7ffc74530000 r--p 00000000 00:00 0                          [vvar]
7ffc74530000-7ffc74531000 r-xp 00000000 00:00 0                          [vdso]
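
To check a host for the same symptom, the pathname column of the two listings above can be compared programmatically. The following is a minimal Python sketch, not the parsing code .NET uses; `MapEntry`, `parse_maps_line` and `looks_affected` are hypothetical names chosen for illustration.

```python
from typing import Iterable, NamedTuple, Optional


class MapEntry(NamedTuple):
    start: int
    end: int
    perms: str
    offset: int
    dev: str
    inode: int
    pathname: str  # empty string for anonymous mappings


def parse_maps_line(line: str) -> Optional[MapEntry]:
    """Parse one line of /proc/<pid>/maps; return None if it is malformed."""
    # Columns: address perms offset dev inode [pathname].
    # The pathname may contain spaces, so split at most five times.
    parts = line.split(None, 5)
    if len(parts) < 5:
        return None
    addr, perms, offset, dev, inode = parts[:5]
    start, _, end = addr.partition("-")
    pathname = parts[5].strip() if len(parts) == 6 else ""
    return MapEntry(int(start, 16), int(end, 16), perms,
                    int(offset, 16), dev, int(inode), pathname)


def looks_affected(lines: Iterable[str]) -> bool:
    """True if every file-backed mapping (inode != 0) reports "/" as its path."""
    entries = (parse_maps_line(line) for line in lines)
    file_backed = [e for e in entries if e is not None and e.inode != 0]
    return bool(file_backed) and all(e.pathname == "/" for e in file_backed)
```

Inside a container this could be called as `looks_affected(open("/proc/self/maps"))`. Pseudo-mappings such as `[heap]` and `[stack]` have inode 0 and are ignored by the check.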

@stephentoub
Member

@tmds, presumably then this is a Linux bug?

@olcayseker

If it is, then the sdk and runtime containers should be updated accordingly to prevent further duplicate issues.

@danmoseley
Member

@MichaelSimons

@tmds
Member

tmds commented Sep 6, 2019

> @tmds, presumably then this is a Linux bug?

Yes, I think so too, specific to the 5.0.0-27-generic on Ubuntu.
I don't have issues on Fedora with 5.1.15 kernel.

@qmfrederik
Contributor Author

OK, I opened https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1843018 . I hope that's the right location to report Ubuntu kernel bugs. If anyone knows about a better way of contacting the Ubuntu team, let me know :).

@qmfrederik
Contributor Author

PS: There's a "Does this bug affect you?" link at the top of the page (green text, below the bug title) that those who are interested can use to upvote this issue in Launchpad.

@olcayseker

I can confirm that Linux host users should downgrade or upgrade their kernel to build successfully. I upgraded my Ubuntu host to kernel 5.2.12 and everything is fine now. There is actually nothing to do for the sdk and runtime containers.

@MichaelSimons
Member

I logged a Docker issue for this as well because they are involved - moby/moby#39875

@danmoseley
Member

@MichaelSimons can we close this issue now then?

@mikeharder

@danmosemsft, @MichaelSimons: Just so you know, I believe this is still an issue on Azure Ubuntu VMs.

@danmoseley
Member

@MichaelSimons I imagine the only reason for us to have an issue at this point is if we need to adjust the kernel versions of any docker images we own, which would be better tracked in your repo.

@MichaelSimons
Member

@danmosemsft - A Docker container has no kernel inside it; it is simply installed and started on the kernel used by the host. Because of this, I don't see a workaround that can be applied to our Docker images. Are there any potential alternative implementations of Process.MainModule.FileName?

Users affected by this, please chime in on the respective issues https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1843018 and moby/moby#39875.

@qmfrederik
Contributor Author

@MichaelSimons /proc/PID/exe should be a link to the executable of the current process.

@qmfrederik
Contributor Author

I just checked, this seems to work on the latest Ubuntu kernel:

vagrant@vagrant:~$ uname -r
5.0.0-27-generic
vagrant@vagrant:~$ sudo docker run ubuntu ls -l /proc/1/exe
lrwxrwxrwx 1 root root 0 Sep  9 15:59 /proc/1/exe -> /bin/ls

Perhaps Process.MainModule.FileName could use this as a fallback?
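
A rough Python sketch of what such a fallback could look like, assuming the caller already has the pathname obtained from /proc/<pid>/maps. This only illustrates the suggestion and is not how Process.MainModule.FileName is implemented; `main_module_path` and its `proc_root` parameter are hypothetical.

```python
import os
from typing import Optional


def main_module_path(pid: int,
                     maps_pathname: Optional[str],
                     proc_root: str = "/proc") -> Optional[str]:
    """Prefer the maps-derived pathname; fall back to the exe symlink.

    A maps pathname of "/" (the symptom seen on the affected kernel) or an
    empty/None value is treated as unusable.
    """
    if maps_pathname and maps_pathname != "/":
        return maps_pathname
    try:
        # /proc/<pid>/exe is a symbolic link to the process's executable.
        return os.readlink(os.path.join(proc_root, str(pid), "exe"))
    except OSError:
        # Process gone, permission denied, or procfs not available.
        return None
```

For the current process, `main_module_path(os.getpid(), "/")` would resolve the symlink instead of returning the bogus `/`. Reading another process's `exe` link can fail with a permission error, which the sketch reports as `None`.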

@danmoseley
Member

Can someone more familiar with the Linux ecosystem please answer this: why would we invest in a workaround for a single kernel version that seems to have a significant, subsequently fixed bug in it? It takes time to do, would exist in the code indefinitely, and presumably would need backporting to several .NET Core releases to be useful.

Perhaps the Linux community will backport a fix to this kernel. Meanwhile I think the answer here if someone hits this is "that kernel has a serious bug. please use a different kernel version". Right?

@danmoseley
Member

BTW, historically, changes to what we parse from the proc filesystem have had a bug tail because we've discovered edge cases, e.g., behaviors varying by kernel version, or, in a recent case, we were pulling the process name but it turned out it would not return more than 15 characters. So I'd rather not change what we do here.
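
To illustrate the 15-character edge case: Linux stores a task's `comm` name in a 16-byte buffer including the terminating NUL, so procfs reports at most 15 bytes of it. The Python sketch below shows one possible way to recover a longer name from the command line. It is an assumption-laden illustration, not the fix that went into .NET; `full_process_name` is a hypothetical helper.

```python
import os

TASK_COMM_MAX = 15  # comm is held in a 16-byte kernel buffer, including the NUL


def full_process_name(comm: str, cmdline: bytes) -> str:
    """Return the process name, widening a possibly truncated comm value.

    comm    -- contents of /proc/<pid>/comm (may end with a newline)
    cmdline -- raw contents of /proc/<pid>/cmdline (NUL-separated arguments)
    """
    comm = comm.rstrip("\n")
    # A name shorter than the limit cannot have been truncated.
    if len(comm) < TASK_COMM_MAX or not cmdline:
        return comm
    argv0 = cmdline.split(b"\0", 1)[0].decode(errors="replace")
    candidate = os.path.basename(argv0)
    # Only trust argv[0] if it is consistent with the truncated name.
    return candidate if candidate.startswith(comm) else comm
```

The `startswith` guard matters because a process can rewrite its own `argv[0]` or `comm`, so the two values are not guaranteed to agree.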

@tmds
Member

tmds commented Sep 9, 2019

imho this should be fixed in the Ubuntu kernel, because the issue is unique to that kernel.

@stephentoub
Member

stephentoub commented Sep 9, 2019

I agree. We shouldn't do unnatural things to try to work around this in Process.

@danmoseley
Member

OK, I'm going to close this then. Thanks @tmds, @qmfrederik

@qmfrederik
Contributor Author

Those hitting this issue, please note that the 5.0.0-27-generic kernel (the current Ubuntu kernel) contains a couple of security fixes, see https://usn.ubuntu.com/4114-1/ and https://usn.ubuntu.com/4115-1/. You may want to consider that when reverting your kernel.

@marcwittke

To those having the mentioned error under their azure vm running Ubuntu:
Kernel 5.0.0-1018-azure is affected as well. 5.0.0-1016-azure works.

I enabled the grub timeout following this guide, used the serial console in the azure portal to select the working kernel from the boot menu, and then uninstalled the bad kernel version by calling apt purge linux-image-5.0.0-1018-azure and apt purge linux-headers-5.0.0-1018-azure.

@ScottGuymer

I'm also seeing this issue on 5.0.0-1020-azure on an Ubuntu 18.04-LTS VM created today, so it's not fixed yet.

Does this mean that this would be broken in all azure VMs created with that image?

@wayne-o

wayne-o commented Sep 26, 2019

I am getting this on @github actions! Building my docker images is now broken using github actions :/

@rainersigwald
Member

I've escalated internally to the folks who own the GitHub Actions images, but don't yet have anything concrete to share.

@ferrywlto

ferrywlto commented Sep 27, 2019

> I am getting this on @github actions! Building my docker images is now broken using github actions :/

Google pointed me here after I tore my hair out trying to figure out why my GitHub Actions were broken. :(

Changing my dockerimage.yml to runs-on: ubuntu-16.04 temporarily solved the problem.

@DocBrown101

DocBrown101 commented Sep 27, 2019

I updated my Ubuntu host (18.04.3 LTS) to kernel version 5.2.17 and now it works!

With these instructions it was very easy!
https://www.omgubuntu.co.uk/2017/02/ukuu-easy-way-to-install-mainline-kernel-ubuntu

@danmoseley
Member

> To those having the mentioned error under their azure vm running Ubuntu:
> Kernel 5.0.0-1018-azure is affected as well. 5.0.0-1016-azure works.

> Im also seeing this issue on 5.0.0-1020-azure on a Ubuntu 18.04-LTS VM created today so its not fixed yet.

If someone finds a later version does work, that would be good to share here.

@Jan-H-Hu

Hello everyone,

I had the same problem, and a kernel update helped me too.
With Linux 5.2.2-050202-generic it works again.

How to Install Linux Kernel 5.2.2 in Ubuntu:
http://ubuntuhandbook.org/index.php/2019/07/install-linux-kernel-5-2-ubuntu-linux-mint/

@wayne-o

wayne-o commented Sep 28, 2019 via email

@rainersigwald
Member

No ETA as of yet, but:

@daniel-lerch

The new Linux Kernel 5.0.0-31 on Ubuntu 18.04.3 LTS with HWE enabled fixes the bug for me.
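
For anyone scripting a pre-build check, here is a small Python sketch that compares the running kernel release against the versions reported as broken in this thread. The list comes only from the comments above and is almost certainly incomplete; `REPORTED_AFFECTED` and `is_reported_affected` are hypothetical names.

```python
import platform

# Kernel releases that commenters in this thread reported as affected.
# Assumption: exact string match against `uname -r`; other builds may be
# affected too and are simply not listed here.
REPORTED_AFFECTED = {
    "5.0.0-27-generic",
    "5.0.0-1018-azure",
    "5.0.0-1020-azure",
}


def is_reported_affected(release: str) -> bool:
    """True if the given `uname -r` string was reported as broken above."""
    return release in REPORTED_AFFECTED


if __name__ == "__main__":
    release = platform.release()  # same value as `uname -r` on Linux
    if is_reported_affected(release):
        print(f"kernel {release} was reported as affected")
    else:
        print(f"kernel {release} is not in the reported list")
```

Since a container shares the host's kernel, running this inside the container reports the host kernel release.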

@rolandoldengarm

For anyone who has the same issue running Docker builds on Azure Hosted Linux agents, switching to Ubuntu 16.04 fixes the issue.
In our case, ubuntu:latest has had this issue since today.

@msftgits msftgits transferred this issue from dotnet/corefx Feb 1, 2020
@msftgits msftgits added this to the 3.0 milestone Feb 1, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Dec 12, 2020