util: fix race condition in WaitForFile #3162

Merged

giuseppe
Member

Enable polling even when inotify is in use. Polling is generally useful
because inotify can lose notifications under high load. It also fixes a
race condition in which the file is created while the watcher is still
being configured; WaitForFile would then wait until the timeout and fail.

Closes: #2942

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>

@openshift-ci-robot
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: giuseppe

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the approved and size/M labels May 20, 2019
@AkihiroSuda
Collaborator

Is this for slirp4netns API socket? slirp4netns --ready-fd does not work?

@giuseppe
Member Author

Is this for slirp4netns API socket? slirp4netns --ready-fd does not work?

IIRC, the ready-fd is written by the slirp4netns child, while the API socket is managed by the parent, so we cannot assume it is ready when ready-fd returned.

@rhatdan
Member

rhatdan commented May 20, 2019

LGTM
@mheon @vrothberg @baude @TomSweeneyRedHat PTAL

Member

@vrothberg vrothberg left a comment

I think we can simplify the code by using case <-time.After(...) in line 129. Then we could delete lines 108-122 entirely.

@giuseppe
Member Author

I think we can simplify the code by using case <-time.After(...) in line 129. Then we could delete lines 108-122 entirely.

Yes, good point. I'll change it.

Member

@TomSweeneyRedHat TomSweeneyRedHat left a comment

LGTM assuming happy tests

@giuseppe
Member Author

/hold

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold label May 20, 2019
@AkihiroSuda
Collaborator

we cannot assume it is ready when ready-fd returned

I feel this is a bug. I opened rootless-containers/slirp4netns#90

@giuseppe
Member Author

/hold cancel

Pushed a new version that simplifies it even further and doesn't use any goroutine.

@openshift-ci-robot openshift-ci-robot removed the do-not-merge/hold label May 20, 2019
Member

@vrothberg vrothberg left a comment

LGTM, thanks @giuseppe

// also useful when using inotify as if for any reasons we missed
// a notification, we won't hang the process.
_, err := os.Stat(path)
if err == nil {
Member

Should we be checking if os.IsNotExist(err) and blowing up if the err is something else?

Member Author

I had moved the existing code, but yes that is better. I've pushed a new version where it checks the error type.

@rhatdan
Member

rhatdan commented May 20, 2019

LGTM.
Let's merge when tests pass.

@mheon
Member

mheon commented May 20, 2019

Tests are very angry

@mheon
Member

mheon commented May 20, 2019

Looks like exec broke

@giuseppe giuseppe force-pushed the fix-hang-waitforfile branch 3 times, most recently from 541f7f7 to 390f29c Compare May 20, 2019 15:52
Enable polling even when inotify is in use. Polling is generally useful
because inotify can lose notifications under high load. It also fixes a
race condition in which the file is created while the watcher is still
being configured; WaitForFile would then wait until the timeout and fail.

Closes: containers#2942

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
let the writer of the channel close it.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
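
The second commit message names a common Go convention: the goroutine that sends on a channel is the one that closes it, since a send on an already closed channel panics. The affected code is not shown in this conversation, so the sketch below only illustrates the convention; produce and its parameter are hypothetical.

```go
package main

import "fmt"

// produce is a hypothetical writer. It owns ch, is the only sender, and is
// therefore the only place where ch is closed.
func produce(n int) <-chan int {
	ch := make(chan int)
	go func() {
		// Closed by the writer once it has finished sending.
		defer close(ch)
		for i := 0; i < n; i++ {
			ch <- i
		}
	}()
	return ch
}

func main() {
	sum := 0
	// The reader never closes the channel; it just drains it until the
	// writer's close ends the range loop.
	for v := range produce(4) {
		sum += v
	}
	fmt.Println(sum)
}
```

Returning a receive-only channel makes the ownership explicit: callers cannot close it by mistake.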
@giuseppe
Member Author

@mheon tests are passing now

@rhatdan
Member

rhatdan commented May 20, 2019

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm label May 20, 2019
@openshift-merge-robot openshift-merge-robot merged commit a791242 into containers:master May 20, 2019
@rh-atomic-bot rh-atomic-bot mentioned this pull request May 20, 2019
@github-actions github-actions bot added the locked - please file new issue/PR label Sep 26, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 26, 2023

Successfully merging this pull request may close these issues.

Rootless podman won't start when exposing ports
8 participants