Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix possibly endless loop in ReadDir iterator #50630

Merged
merged 1 commit into from
Jun 26, 2018
Merged

Fix possibly endless loop in ReadDir iterator #50630

merged 1 commit into from
Jun 26, 2018

Conversation

sharkdp
Copy link
Contributor

@sharkdp sharkdp commented May 10, 2018

Certain directories in /proc can cause the ReadDir iterator to loop indefinitely. We get an error code (22) when calling libc's readdir_r on these directories, but entry_ptr is NULL at the same time, signalling the end of the directory stream.

This change introduces an internal state to the iterator such that the Some(Err(..)) value will only be returned once when calling next. Subsequent calls will return None.

fixes #50619

@rust-highfive
Copy link
Collaborator

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @sfackler (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label May 10, 2018
@ishitatsuyuki
Copy link
Contributor

Can we add a test for this?

@@ -51,6 +51,7 @@ pub struct FileAttr {
pub struct ReadDir {
dirp: Dir,
root: Arc<PathBuf>,
end_of_stream: bool,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it might be better to just wrap Dir into an Option (and mark the pointer within Dir as NonZero) as that would both be more semantically accurate, more efficient space wise and more future-proof in terms of preventing mistakes.

@nagisa
Copy link
Member

nagisa commented May 11, 2018

Adding a test for this would be fairly difficult, as the original reproducer relies on a fairly specific state of the system that would be pretty hard to emulate in way that wouldn’t spuriously fail all the time.

@sharkdp
Copy link
Contributor Author

sharkdp commented May 12, 2018

I have updated the code to use Option<Dir> and NonNull<libc::DIR> instead of the additional bool flag, as per @nagisa's excellent suggestion.

However, I'm not too happy about my code changes in the next function. It is slightly awkward to pattern-match on dirp: Option<Dir> while having to mutate dirp at the same time. If someone has a better suggestion on how to write this function, I'd be happy to change this again.

@ishitatsuyuki Concerning tests: I would love to write a test for this, but I'm really not sure how to. I also share the same view as @nagisa. Note that there is a way to reproduce the bug in the attached ticket (#50619).

}
if ret.name_bytes() != b"." && ret.name_bytes() != b".." {
return Some(Ok(ret))
if self.dirp.is_none() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we make this check again just below on 259?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, absolutely - I will remove this if clause.

@@ -49,11 +49,11 @@ pub struct FileAttr {
}

pub struct ReadDir {
dirp: Dir,
dirp: Option<Dir>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this change related to the infinite loop issue?

Copy link
Contributor Author

@sharkdp sharkdp May 21, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Option is used to store whether we are at the end of the directory stream (Some(..)) or not (None).

if readdir64_r(dirp.as_mut().unwrap().0.as_mut(),
&mut ret.entry,
&mut entry_ptr) != 0 {
if entry_ptr.is_null() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the actual change being made, right?

Copy link
Contributor Author

@sharkdp sharkdp May 21, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, exactly. If readdir64_r returns a non-empty exit code and signals the end of the directory stream by writing nullptr to entry_ptr, we still want to return the OS error in this iteration but None on the next iteration.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this change can be made a bit simpler by just using a null pointer instead of the whole option business. Then the diff just becomes a if self.dirp.0.is_null() { return None; } at the top of Next, and an if entry_ptr.is_null() { self.dirp.0 = ptr::null_mut(); } in the error case here.

@pietroalbini
Copy link
Member

Ping from triage @sfackler! The author pushed some new commits.

@bors
Copy link
Contributor

bors commented May 31, 2018

☔ The latest upstream changes (presumably #51050) made this pull request unmergeable. Please resolve the merge conflicts.

@nagisa
Copy link
Member

nagisa commented Jun 1, 2018 via email

@sharkdp
Copy link
Contributor Author

sharkdp commented Jun 1, 2018

Option ensures that other areas of code properly handle the null case as well.

Exactly. Also, I'm personally fine with the solution suggested by @sfackler but I'm not sure if we should actively use nullptr to modify control flow ("actively" meaning that we set dirp to nullptr in cases where the C code does not).

Let me know whether I should go forward with @sfackler's suggestion or resolve the merge conflicts in the current version.

@nagisa
Copy link
Member

nagisa commented Jun 1, 2018

Just rebase it, I’ll r+ it.

@sharkdp
Copy link
Contributor Author

sharkdp commented Jun 2, 2018

Just rebase it

I tried and failed. The whole ReadDir (or now: InnerReadDir) struct has been moved inside an Arc (in this PR: #51050), which means that I can not mutate parts of it(?).

To be honest, I don't really understand why it has been moved into the Arc (Note that the underlying DIR pointer will still be mutated via readdir64_r).

As it stands, I either need some help to make this work again (my attempt at resolving the conflicts can be found here: https://gist.github.com/sharkdp/fd60040daf25630f67f095d0fa616b82 - it currently does not pass the borrow-checker) or I move forward with @sfackler's suggestion, which should still work.

@nagisa
Copy link
Member

nagisa commented Jun 2, 2018

Well, I assume that Dir is now within an Arc specifically so that it is freed only when both ReadDir and all DirEntryies are out of scope (as calling metadata on DirEntry requires a valid open Dir).

The problem is, of course, that now we have two distinct, and colliding requirements. First, we must somehow maintain information about when Dir has completed reading (preferably by setting the Dir field of ReadDir to None), but must as well keep Dir around until all of the DirEntryies are dropped (so that DirEntry is able to call the metadata method). One option would be a

struct DirEntry { 
    dir: Option<Arc<Dir>>,
    root: Arc<PathBuf>,
}

but that seems to increasingly stray away from what should be a 0-cost abstraction.

Quick and dirty alternative is to just add a boolean flag to the struct now, but I’d prefer not to land code like that if there’s any way to avoid it.

@TimNN TimNN added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 12, 2018
@TimNN
Copy link
Contributor

TimNN commented Jun 12, 2018

Friendly ping from triage, @sharkdp, it looks like this PR needs some additional design / implementation work. Do you think you'll be able to look into that?

Certain directories in `/proc` can cause the `ReadDir`
iterator to loop indefinitely. We get an error code (22) when
calling libc's `readdir_r` on these directories, but `entry_ptr`
is `NULL` at the same time, signalling the end of the directory
stream.

This change introduces an internal state to the iterator such
that the `Some(Err(..))` value will only be returned once when
calling `next`. Subsequent calls will return `None`.

fixes #50619
@sharkdp
Copy link
Contributor Author

sharkdp commented Jun 12, 2018

@TimNN I am happy to work on this, but I will probably need some help, as mentioned above.

I have now pushed a new commit which resolves the merge conflicts and handles the problem in the same way as in my first commit - with an additional flag. This is a rather minimal change to fix this bug, but I agree with @nagisa that it is probably not optimal. However, I don't see how the other two discussed solutions (using Option or nullptr to store the state) can still work with the new changes on master (where the inner part of ReadDir is wrapped in an Arc).

@pietroalbini
Copy link
Member

Ping from triage @sfackler! The author needs a bit of help, could you check in on this?

@pietroalbini pietroalbini added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jun 25, 2018
@pietroalbini
Copy link
Member

Ping from triage @sfackler! The author needs a bit of help, could you check in on this?

@sfackler
Copy link
Member

Oops, sorry. This seems good to me!

@bors r+

@bors
Copy link
Contributor

bors commented Jun 26, 2018

📌 Commit af75314 has been approved by sfackler

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 26, 2018
@bors
Copy link
Contributor

bors commented Jun 26, 2018

⌛ Testing commit af75314 with merge 7d2fa4a...

bors added a commit that referenced this pull request Jun 26, 2018
Fix possibly endless loop in ReadDir iterator

Certain directories in `/proc` can cause the `ReadDir` iterator to loop indefinitely. We get an error code (22) when calling libc's `readdir_r` on these directories, but `entry_ptr` is `NULL` at the same time, signalling the end of the directory stream.

This change introduces an internal state to the iterator such that the `Some(Err(..))` value will only be returned once when calling `next`. Subsequent calls will return `None`.

fixes #50619
@bors
Copy link
Contributor

bors commented Jun 26, 2018

☀️ Test successful - status-appveyor, status-travis
Approved by: sfackler
Pushing 7d2fa4a to master...

@bors bors merged commit af75314 into rust-lang:master Jun 26, 2018
sharkdp added a commit to sharkdp/ripgrep that referenced this pull request Sep 17, 2018
This upgrades the minimum required version of Rust to 1.29 in order to
fix BurntSushi#916 (Zombie processes cause `rg --files` to hang in `/proc`).

See also:
- Rust compiler bug ticket: rust-lang/rust#50619
- Rust compiler PR with the fix: rust-lang/rust#50630

closes BurntSushi#916
sharkdp added a commit to sharkdp/fd that referenced this pull request Sep 17, 2018
This upgrades the minimum required version of Rust to 1.29 in order to
fix #288.

See also:
- Rust compiler bug ticket: rust-lang/rust#50619
- Rust compiler PR with the fix: rust-lang/rust#50630

closes #288
sharkdp added a commit to sharkdp/fd that referenced this pull request Sep 18, 2018
This upgrades the minimum required version of Rust to 1.29 in order to
fix #288.

See also:
- Rust compiler bug ticket: rust-lang/rust#50619
- Rust compiler PR with the fix: rust-lang/rust#50630

closes #288
bors pushed a commit to rust-lang-ci/rust that referenced this pull request Dec 18, 2022
Bug rust-lang#50619 was fixed by adding an end_of_stream flag in rust-lang#50630.
Unfortunately, that fix only applied to the readdir_r() path.  When I
switched Linux to use readdir() in rust-lang#92778, I inadvertently reintroduced
the bug on that platform.  Other platforms that had always used
readdir() were presumably never fixed.

This patch enables end_of_stream for all platforms, and adds a
Linux-specific regression test that should hopefully prevent the bug
from being reintroduced again.
bors added a commit to rust-lang-ci/rust that referenced this pull request Dec 18, 2022
…imulacrum

fs: Fix rust-lang#50619 (again) and add a regression test

Bug rust-lang#50619 was fixed by adding an end_of_stream flag in rust-lang#50630.
Unfortunately, that fix only applied to the readdir_r() path.  When I
switched Linux to use readdir() in rust-lang#92778, I inadvertently reintroduced
the bug on that platform.  Other platforms that had always used
readdir() were presumably never fixed.

This patch enables end_of_stream for all platforms, and adds a
Linux-specific regression test that should hopefully prevent the bug
from being reintroduced again.
Aaron1011 pushed a commit to Aaron1011/rust that referenced this pull request Jan 6, 2023
…imulacrum

fs: Fix rust-lang#50619 (again) and add a regression test

Bug rust-lang#50619 was fixed by adding an end_of_stream flag in rust-lang#50630.
Unfortunately, that fix only applied to the readdir_r() path.  When I
switched Linux to use readdir() in rust-lang#92778, I inadvertently reintroduced
the bug on that platform.  Other platforms that had always used
readdir() were presumably never fixed.

This patch enables end_of_stream for all platforms, and adds a
Linux-specific regression test that should hopefully prevent the bug
from being reintroduced again.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

fs::read_dir iterator loops forever on same entry
8 participants