[intermittent]: podman run -a stdin&stderr: read unixpacket: connection reset by peer #3302

Closed
edsantiago opened this issue Jun 11, 2019 · 27 comments · Fixed by #4818
Labels: do-not-close, locked - please file new issue/PR, stale-issue

@edsantiago
Collaborator

This is a combination that I can find no realistic use for, but even so I think it merits attention, because it might present itself in other, more real-world situations:

# echo true | podman run -a stdin -a stderr --tty=false alpine sh
Error: error attaching to container 22e7b45a00508b0e60a27288bb6c745100a2ee42335800de1d8e246cfe24bd48: read unixpacket @->/var/run/libpod/socket/22e7b45a00508b0e60a27288bb6c745100a2ee42335800de1d8e246cfe24bd48/attach: read: connection reset by peer

Reproduces maybe one in five attempts; the rest of the time it succeeds. It also fails with echo ls / and echo ls /sdf, but does not (seem to) fail with </dev/null (redirection, not pipe).
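Since it reproduces only about one time in five, a retry loop along these lines should flush it out within a handful of iterations (a sketch, not part of the original report; it watches stderr for the error message rather than relying on the exit status):

# re-run the original command until the attach error shows up on stderr
i=0
while :; do
    i=$((i+1))
    err=$(echo true | podman run -a stdin -a stderr --tty=false alpine sh 2>&1 >/dev/null)
    if printf '%s\n' "$err" | grep -q 'connection reset by peer'; then
        echo "hit the attach error on attempt $i:"
        printf '%s\n' "$err"
        break
    fi
done
# note: containers accumulate across iterations; clean up with podman rm -a afterwards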

--log-level=debug adds nothing of value AFAICT (both pass and fail look identical to my eye), but I will provide it on request.

Seen with podman-1.4.0-2 on fc29 and fc30; I think I've seen it before, but never took the time to pursue it. I can look in logs if necessary.

@rhatdan
Member

rhatdan commented Jun 12, 2019

Must be a race condition.

@edsantiago
Collaborator Author

Still present in podman-1.4.2-1.fc29 and fc30.

There's another failure which I think must be related. Instead of sh, run something that outputs to stderr:

# echo hi | podman run -a stdin -a stderr --interactive --tty=false  alpine ls /nonesuch
ls: /nonesuch: No such file or directory

At this point it hangs - AFAICT forever. ^D has no effect. podman ps in a separate window shows the container running. ^C yields:

^CERRO[0000] container not running
container not running
ERRO[0178] Error forwarding signal 2 to container 0af91336dd3700b2acd9889e92cd91e558a016510efc9dc5f5bd669ef921d69a: error sending signal to container 0af91336dd3700b2acd9889e92cd91e558a016510efc9dc5f5bd669ef921d69a: `/usr/bin/runc kill 0af91336dd3700b2acd9889e92cd91e558a016510efc9dc5f5bd669ef921d69a 2` failed: exit status 1

"container not running" even though a few seconds ago podman ps showed it running and, right now, after the ^C, podman ps still shows it running. But:

# podman exec 0af9 date
ERRO[0000] exec failed: cannot exec a container that has stopped
exec failed: cannot exec a container that has stopped
Error: exit status 1
# podman ps
CONTAINER ID  IMAGE                            COMMAND       CREATED        STATUS            PORTS  NAMES
0af91336dd37  docker.io/library/alpine:latest  ls /nonesuch  6 minutes ago  Up 6 minutes ago         gifted_zhukovsky

This one happens less frequently -- one in ten times, maybe -- but is still worth someone looking at, pretty please?
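For what it's worth, the wedged container left behind by the hang can usually be cleaned up with a forced removal (assuming the state is merely out of sync, as it appears to be here):

# force-remove the stuck container by ID prefix
podman rm -f 0af9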

@mheon
Member

mheon commented Jun 19, 2019

The exec thing will hopefully not be a problem after the wide-reaching refactor by @haircommander lands, which should be by the end of the week. We're going to force it in if we need to.

For the rest... I'm going to bet it has something to do with the lack of a terminal. Terminal-less attach is poorly tested and rarely used.

@edsantiago
Collaborator Author

Excellent - thank you.

@rhatdan
Member

rhatdan commented Aug 5, 2019

Well, the refactor has happened, thanks to @haircommander. @edsantiago, could you confirm whether it works so we can close this issue?

@edsantiago
Collaborator Author

I still see it - albeit only after tens of iterations - with podman 1.4.4-4.fc30 (rpm) and master @ b5618d9 (hand-built). Do I need a new conmon?

@rhatdan
Member

rhatdan commented Aug 8, 2019

@haircommander Could you answer @edsantiago's question?

@haircommander
Collaborator

The first issue as posted is not helped by a new conmon, and I have also observed a failure with:
echo hi | podman run -a stdin -a stderr --interactive --tty=false --rm alpine ls /nonesuch
However, the failure is similar to the issue as posted: it is intermittent:

Error: error attaching to container 417eda8d842a3037d9be62108df9d20e85f03a963641b486a2b5ac467c31784b: read unixpacket @->/run/user/1000/libpod/tmp/socket/417eda8d842a3037d9be62108df9d20e85f03a963641b486a2b5ac467c31784b/attach: read: connection reset by peer

I did not observe a hang like you did, however.

@edsantiago
Collaborator Author

Issue still present in podman-1.5.0-2.fc30.x86_64 with, presumably, new conmon and everything.

@fkaempfer

fkaempfer commented Aug 31, 2019

I get the same error consistently when running

podman run -it  --rm --name=alpine alpine:latest cat

And while that container is running:

echo hi | podman exec -i alpine cat
Error: read unixpacket @->/run/user/1000/libpod/tmp/socket/e7e03cd4fdc245d1dd391c443ad3fa532a4b92d18f62ba53ba79112968a1af3a/attach: read: connection reset by peer

It works fine when you run a command without a pipe.

Is this the same issue? Can somebody reproduce this? I was wondering if something is wrong on my end (Fedora 30, rootless, with podman from both master and the official package).

EDIT: This seems to be a regression. On another machine it worked fine with f29 and podman 1.1.2, but as soon as I dnf update to podman 1.5.1, the error appears.
EDIT2: It also works fine with 1.4.4.

@rhatdan
Member

rhatdan commented Sep 1, 2019

$ echo hi | podman run -i alpine cat
hi

But I get the same error with exec:

$ podman run -d --name alpine alpine sleep 1000
8377f8b78b4d93d437f132d85ce6e2b656bba508e1ca979b77bc3d0678544367
$ echo hi | podman exec -i alpine cat
Error: read unixpacket @->/run/user/3267/libpod/tmp/socket/9d2580cac3db7a5251b34e0beb6b6412cb14482ea230bd65d8f77e7ef353b3aa/attach: read: connection reset by peer

@rhatdan
Member

rhatdan commented Sep 1, 2019

@mheon @haircommander PTAL

@tinyzimmer

tinyzimmer commented Sep 11, 2019

I don't know if it's addressable (it might even be desired), because sometimes the command inside the container is actually succeeding, just with nothing on stdin.

But I thought I might mention that when this happens, the exit code is 0. That was very confusing to track down in an automated workflow.
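As a stopgap for automated workflows, one option (sketched here with a thread-style command, not an official recommendation) is to treat the attach error on stderr as a failure even when the exit status is 0:

# hypothetical CI guard around a piped podman invocation
out=$(echo hi | podman run -a stdin -a stderr --interactive --tty=false --rm alpine cat 2>&1)
status=$?
if [ "$status" -eq 0 ] && printf '%s\n' "$out" | grep -q 'connection reset by peer'; then
    echo "attach failed even though podman exited 0" >&2
    exit 1
fi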

@edsantiago
Collaborator Author

ping, still seeing this in podman-1.6.0-2.fc30

@maflcko

maflcko commented Sep 25, 2019

This can be worked around by downgrading the package to 1.4.4, it seems:

$ podman --version
podman version 1.5.1
$ podman run -d --name alpine alpine sleep 1000
$ echo hi | podman exec -i alpine cat
Error: read unixpacket @->/run/user/1000/libpod/tmp/socket/a87821f16138588f14f583486683fc1bce72258be4bde94d549db5ec2f544d6f/attach: read: connection reset by peer

$ podman --version
podman version 1.4.4
$ podman run -d --name alpine alpine sleep 1000
$ echo hi | podman exec -i alpine cat
hi
$ rpm -q podman
podman-1.4.4-4.fc30.x86_64
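For reference, the downgrade itself can presumably be done with dnf, pinning the NVR shown above (assuming that build is still available in the Fedora repos):

# roll back to the last known-good build
sudo dnf downgrade podman-1.4.4-4.fc30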

@edsantiago
Collaborator Author

"This can be worked around by downgrading the package to 1.4.4, it seems"

Unlikely: I filed against 1.4.0, and the problem has never been fixed. I suspect if you try enough times, you'll run into it again. After all, it's an intermittent problem.

@mheon
Member

mheon commented Sep 25, 2019 via email

@mheon
Member

mheon commented Oct 1, 2019

Both @haircommander and I are looking at this one. We're pretty sure it has something to do with the attach socket - it seems like it's being closed prematurely, potentially on Conmon's side?

@mheon
Member

mheon commented Oct 3, 2019

Debugging further:

We're getting an EOF on the container STDERR fd in Conmon, but I'm not sure if it's actually the container - the only things I see being copied there are Conmon debug logs?

@haircommander
Collaborator

@mheon I think that EOF comes from the piped input. If you run the same command but without the pipe (cat hello), the goroutine redirecting stdin to conmon never gets past a buf.Read(), because it is waiting for enough bytes, whereas the EOF from the pipe makes the read terminate. I think stdin actually gets passed down through the os.Exec() call, and the goroutine reading stdin is just a way to figure out when to terminate podman (which it seems to be doing prematurely). These are all suspicions; I haven't had much more time to dig in.
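One quick way to poke at that suspicion (a sketch; it assumes a running container named alpine, as created earlier in this thread) is to vary only how stdin reaches podman, so the reader's buf.Read() would see EOF at different points:

# piped stdin that closes immediately: the read should return EOF right away (if the suspicion is right)
printf '' | podman exec -i alpine cat

# redirected stdin: EOF is available as soon as the fd is read
podman exec -i alpine cat < /dev/null

# terminal stdin: the read blocks until something is typed, so there is no early EOF
podman exec -i alpine cat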

@mheon
Member

mheon commented Oct 3, 2019

@haircommander I think that Conmon might not be handling -i properly in this case - it seems like the exec session is finishing immediately (cat decided to output nothing and exit), when it should be waiting for input.

@haircommander
Collaborator

there's an option --leave-stdin-open that doesn't close stdin immediately when it thinks input is done, but that didn't help. I would not be surprised if -i wasn't being handled right. That said, I'm not even sure we pass -i down to conmon in exec as is.

@github-actions

github-actions bot commented Nov 3, 2019

This issue had no activity for 30 days. In the absence of activity or the "do-not-close" label, the issue will be automatically closed within 7 days.

@rhatdan
Member

rhatdan commented Nov 3, 2019

@haircommander Did your merge fix this issue?
@edsantiago Could you check if this issue is fixed now?

@haircommander
Collaborator

I'm not sure this issue is fixed yet

@edsantiago
Collaborator Author

Still present in podman-1.6.2-2.fc30 and in master @ efc7f15

@maflcko

maflcko commented Nov 12, 2019

It would be nice to get this fixed in the next couple of months, because it prevents affected users from upgrading to Fedora 31: as far as I know, there haven't been any compiled and tested pre-1.5 podman builds for Fedora 31.
