Unable to exec inside a running task with exec driver #13538
Comments
This seems to be an issue specific to the entries I've listed in
|
Hi @mr-karan! I tried to reproduce this problem and I wasn't able to, but this was on a machine running Ubuntu 22.04 rather than Pop_OS because I couldn't find a Vagrant box I trusted with Pop. The only obvious difference was that for my
That
But according to the kernel docs for the devpts filesystem:
If I look at
And then I'll look inside my container:
The shared executor mounts That changing the
|
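For anyone who wants to compare the devpts mounts on the host and inside the task as discussed above, here is a minimal Python sketch. It only parses text in the /proc/mounts format and prints the mode and ptmxmode options for each devpts entry. The function names are made up for illustration and none of this is part of Nomad.

```python
import os


def parse_devpts_mounts(mounts_text):
    """Return a list of (mount_point, options) for each devpts entry.

    mounts_text is text in the /proc/mounts format:
    device, mount point, filesystem type, options, dump, pass.
    """
    results = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[2] != "devpts":
            continue
        options = {}
        for opt in fields[3].split(","):
            # Options are either bare flags ("rw") or key=value ("ptmxmode=000").
            key, _, value = opt.partition("=")
            options[key] = value
        results.append((fields[1], options))
    return results


def main(mounts_path="/proc/mounts"):
    # Hypothetical helper: print the devpts options visible from this process.
    if not os.path.exists(mounts_path):
        print(mounts_path, "not found; is this a Linux host?")
        return
    with open(mounts_path) as f:
        for mount_point, options in parse_devpts_mounts(f.read()):
            print(mount_point,
                  "mode=" + options.get("mode", "(unset)"),
                  "ptmxmode=" + options.get("ptmxmode", "(unset)"))


if __name__ == "__main__":
    main()
```

Running it once on the host and once from inside the task should make any difference in the devpts options easy to spot.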
This one has been open for a bit without the information we'd need to figure things out. I'm going to close this for now but please feel free to reopen if you have more info! |
@tgross Hi, can we please re-open this? Sorry this skipped my attention. I am still facing the above issue. This time, I was able to reproduce this on Ubuntu 22.04 (Minimal version) as well.
2a. With the default
2b. With the modified
Strangely, this issue isn't limited to just But here, I get a proper error message instead of simply EOF:
Contents of /dev inside this raw_exec app
So, I found a bit of an oddity here (not sure whether what I am seeing is correct or not). The
NOTE: This seems to be an intermittent issue, doesn't always happen so it can be a bit hard to debug this. I just restarted alloc and I am able to exec in it normally. These are the commands I ran from within the container (
Please let me know any additional details you'd like me to mention, I'll be happy to provide. |
Re-opening, but possible dupe of #12877? |
Possible but as I noted above, this isn't just limited to |
I've had a look at #14372 and I'm reasonably confident after a conversation with my colleagues that this issue will be covered by that fix as well. |
@tgross Awesome, good to know that! We can close this if you want :) |
Let's keep open until we've verified that. |
Bump. Was hoping a fix for this would arrive in 1.4.x. Not being able to exec inside tasks is quite a bummer for us. Is there any more debugging information I can help to provide here? Thanks! |
Just curious, are there any updates on this issue, or is there any workaround that does not involve restarting the allocation? |
Hi, unfortunately no. I still haven't been able to reproduce it either, but we have a fairly strong suspicion it's related to cgroups v2 issues. We've got someone planning to dig into that as part of our next major release cycle. |
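Since cgroups v2 is the suspected factor, it may help to state which cgroup layout a client is running when reporting back. A minimal sketch, assuming the usual /sys/fs/cgroup mount point: the cgroup.controllers file exists at the root of a cgroup v2 (unified) hierarchy and not under v1. The function name is hypothetical.

```python
import os


def cgroup_mode(root="/sys/fs/cgroup"):
    """Guess the cgroup layout from the contents of the cgroup mount root."""
    # cgroup.controllers is only present at the root of a v2 (unified) hierarchy.
    if os.path.exists(os.path.join(root, "cgroup.controllers")):
        return "v2"
    return "v1 or hybrid"


if __name__ == "__main__":
    print("cgroup layout:", cgroup_mode())
```

This only inspects the host filesystem; it says nothing about how Nomad itself detects or uses cgroups.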
I suspect this issue #17200 has revealed the problem we're running into here, although I never got a reproduction so it's hard to be sure. Unfortunately the |
This might be fixed by #17535, but because I never was able to repro this I can't really be sure. |
Nomad version
Operating system and Environment details
Issue
I've a fairly simple job that is running with the exec driver. However, alloc exec fails with an EOF error.
Reproduction steps
sleep.nomad (job file below)
Expected Result
Since the alloc status is running, you should be able to exec inside it:
Actual Result
However, when you actually try to exec:
Job file (if appropriate)
Nomad Client logs (if appropriate)
Looks like it's trying to open /dev/ptmx for some reason that I am unsure of. nomad is running as sudo here (as noted in the above command to run the server+client node).
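To check independently whether /dev/ptmx can be opened from a given environment, here is a small Python sketch. It simply attempts the open and reports the errno name on failure. This is an assumption about what is useful to test, not a reproduction of what the Nomad executor does internally, and the function name is made up for illustration.

```python
import errno
import os


def try_open_ptmx(path="/dev/ptmx"):
    """Try to open the pty multiplexer device and return (ok, message)."""
    try:
        # O_NOCTTY avoids making the new pty the controlling terminal.
        fd = os.open(path, os.O_RDWR | os.O_NOCTTY)
    except OSError as exc:
        name = errno.errorcode.get(exc.errno, "unknown")
        return False, "open(%s) failed: %s (%s)" % (path, name, exc.strerror)
    os.close(fd)
    return True, "open(%s) succeeded" % path


if __name__ == "__main__":
    ok, message = try_open_ptmx()
    print(message)
```

Running this both on the host and from inside the task's filesystem (where possible) could help narrow down whether the failure is a permissions problem or a missing device.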