
Correct the use of ::read() #1025

Merged
merged 9 commits into from
Aug 24, 2020
Conversation

tobim
Member

@tobim tobim commented Aug 21, 2020

No description provided.

@tobim tobim added the bug Incorrect behavior label Aug 21, 2020
@tobim tobim requested a review from a team August 21, 2020 12:01
Member

@dominiklohmann dominiklohmann left a comment

I did not run this on my machine, since it is very hard to test for this case, but from looking at the code and the man page this seems to be correct.

Member

@mavam mavam left a comment

I'm a bit on the fence. There are two changes here (which would have been nice to make explicit, either via commit message or comment):

  1. Do not fail when read(2) returns 0.
  2. Busy-loop until all requested bytes are fetched.

Is this correct?

@dominiklohmann
Member

> Do not fail when read(2) returns 0.

Maybe we need a check for total != 0 at the end of the function?

Member

@mavam mavam left a comment

There seems to be a bug though: you are overwriting the same buffer multiple times instead of advancing the pointer by taken.
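
To illustrate the point about advancing the pointer, here is a minimal sketch of a retry loop around read(2). It is not the code from this PR: the name `read_all` and its return convention are assumptions made for this example. It treats a return value of 0 as end-of-file rather than as an error, and it writes each new chunk behind the bytes already received.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <unistd.h>

// Hypothetical helper, not the function from this PR: reads until `count`
// bytes have arrived, end-of-file is reached, or an error occurs.
// Returns the number of bytes stored in `buf`, or -1 on error (errno set).
inline ssize_t read_all(int fd, void* buf, std::size_t count) {
  assert(buf != nullptr || count == 0);
  auto* out = static_cast<unsigned char*>(buf);
  std::size_t total = 0;
  while (total < count) {
    // Write behind the bytes we already have; passing `buf` again would
    // overwrite them on every iteration.
    ssize_t taken = ::read(fd, out + total, count - total);
    if (taken < 0) {
      if (errno == EINTR)
        continue; // interrupted before any data arrived: retry
      return -1;
    }
    if (taken == 0)
      break; // end-of-file: not an error, report what we have
    total += static_cast<std::size_t>(taken);
  }
  return static_cast<ssize_t>(total);
}
```

A caller can compare the result against `count` to tell a short read from a complete one.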

@tobim tobim force-pushed the story/ch18990/fix-detail-read branch from 6a74f84 to aa3832e Compare August 21, 2020 12:44
@mavam
Member

mavam commented Aug 21, 2020

It looks correct now. However, the fact that the first review didn't catch this immediately indicates that we need a more objective way to verify the functionality. How would you suggest we test this? A unit test that reads a large file?

@mavam
Member

mavam commented Aug 21, 2020

I think the function signature is not good, actually. It should return expected<size_t> and not use an optional got output argument. The caller should easily determine whether a partial read occurs. This would be ideal:

auto xs = span<byte>{...};
auto n = read(xs);
if (!n)
  // failure
else if (*n != xs.size())
  // partial read

The span migration is too heavy-weight for this PR, and we should apply it uniformly to all related functions, but I think the return value should be changed now that we're touching this function.
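
A rough sketch of how such a return value lets the caller distinguish the three cases, under loud assumptions: `std::optional<std::size_t>` stands in for the `expected<size_t>` proposed above (so the error cause stays in `errno`), and `read_n`, `outcome`, and `classify` are hypothetical names that do not appear in the PR.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <optional>
#include <unistd.h>

// Hypothetical stand-in for a function returning `expected<size_t>`:
// an empty optional signals failure (errno holds the cause), otherwise
// the value is the number of bytes actually read.
inline std::optional<std::size_t> read_n(int fd, void* buf,
                                         std::size_t count) {
  assert(buf != nullptr || count == 0);
  auto* out = static_cast<unsigned char*>(buf);
  std::size_t total = 0;
  while (total < count) {
    ssize_t taken = ::read(fd, out + total, count - total);
    if (taken < 0) {
      if (errno == EINTR)
        continue; // interrupted: retry
      return std::nullopt;
    }
    if (taken == 0)
      break; // end-of-file
    total += static_cast<std::size_t>(taken);
  }
  return total;
}

enum class outcome { failure, partial, complete };

// Mirrors the three-way check from the comment above.
inline outcome classify(const std::optional<std::size_t>& n,
                        std::size_t requested) {
  if (!n)
    return outcome::failure;
  if (*n != requested)
    return outcome::partial;
  return outcome::complete;
}
```

With this shape, the requested size acts as an upper bound: a file shorter than the buffer yields a partial read, and the caller decides whether that is a problem.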

@tobim tobim force-pushed the story/ch18990/fix-detail-read branch 3 times, most recently from a017bc6 to e5a4621 Compare August 21, 2020 19:13
@mavam
Member

mavam commented Aug 21, 2020

Why are we changing the implementation of the POSIX wrappers so dramatically with the latest changes? Wasn't it almost working already? fread is a buffered libc function, whereas read is just a syscall, no? But we provide the buffer, so now we have two buffers in the mix, one managed by fread and the other being our application-level buffer.


FIXTURE_SCOPE(chunk_tests, fixtures::filesystem)

// This test takes almost one minute on macOS 10.14.
Member

In CI or in general?

Member

Or rather: with HFS or APFS?

Member Author

In general with APFS on my office iMac.

@tobim
Member Author

tobim commented Aug 24, 2020

@mavam As discussed, I figured out how to use the level 2 API correctly on macOS. The code can now be reviewed again.

Member

@mavam mavam left a comment

I like the current solution.

One last thing: please add a changelog entry about the bugfix since this was a user-reported error.

mavam
mavam previously requested changes Aug 24, 2020
tobim and others added 9 commits August 24, 2020 14:04
We now take care to not call `::read` and `::write` with more than
`INT_MAX` bytes.
Co-authored-by: Matthias Vallentin <matthias@tenzir.com>
We now let the caller decide whether reading a smaller-than-expected
file is a problem or not.
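
The `INT_MAX` limit mentioned in the first commit message can be sketched as follows. This is an illustration under assumptions, not the merged code: `read_chunked` and its `max_chunk` parameter are hypothetical, and the parameter exists only so the clamping can be exercised with small values. The motivation is that some platforms reject oversized requests; macOS, for instance, documents `EINVAL` for read(2) when the byte count exceeds `INT_MAX`.

```cpp
#include <algorithm>
#include <cassert>
#include <cerrno>
#include <climits>
#include <cstddef>
#include <unistd.h>

// Hypothetical helper: never asks ::read for more than `max_chunk` bytes
// at once (INT_MAX by default) and loops until `count` bytes have arrived,
// end-of-file is reached, or an error occurs.
// Returns the number of bytes read, or -1 on error (errno set).
inline ssize_t read_chunked(int fd, void* buf, std::size_t count,
                            std::size_t max_chunk = INT_MAX) {
  assert(max_chunk > 0);
  auto* out = static_cast<unsigned char*>(buf);
  std::size_t total = 0;
  while (total < count) {
    // Clamp the size of each individual request.
    std::size_t request = std::min(count - total, max_chunk);
    ssize_t taken = ::read(fd, out + total, request);
    if (taken < 0) {
      if (errno == EINTR)
        continue; // interrupted: retry
      return -1;
    }
    if (taken == 0)
      break; // end-of-file
    total += static_cast<std::size_t>(taken);
  }
  return static_cast<ssize_t>(total);
}
```

The same clamping would apply symmetrically to a loop around `::write`.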
@tobim tobim force-pushed the story/ch18990/fix-detail-read branch from 0fbe4f7 to 0577cfc Compare August 24, 2020 12:08
Member

@lava lava left a comment

I hate to be piling on here, but imho read/write are not good names for these functions. There is already a well-known function called read(int, void*, size_t) that everyone is familiar with, and from looking at the name and signature I'd expect a function called posix::read() to be just a thin wrapper around that. As is, everybody glancing at that will assume there is a bug in the calling code because we don't retry in case of a partial read.

I don't mind committing this now to get the bugfix out to DCSO, but imho renaming this to something like read_all()/safe_read()/read_n()/etc. (that can maybe call the actual read internally) should be the near-term goal.

@lava
Member

lava commented Aug 24, 2020

Addendum to the above: I just realized that the API change is bigger than I initially thought. Usually one passes an upper bound to read(), i.e. the size of the buffer. After the changes here, the size also becomes a lower bound, i.e. one has to know exactly how many bytes are to be read in order to call this function.

@tobim
Member Author

tobim commented Aug 24, 2020

> After the changes here, the size also becomes a lower bound, i.e. one has to know exactly how many bytes are to be read in order to call this function.

That is not the case, see https://github.com/tenzir/vast/pull/1025/files#diff-4d6b0fc0d187a8a24760ed369528e14fR272-R273.

@tobim tobim dismissed mavam’s stale review August 24, 2020 14:58

The requested changes have been applied

@tobim tobim merged commit 6a90b63 into master Aug 24, 2020
@tobim tobim deleted the story/ch18990/fix-detail-read branch August 24, 2020 14:58