Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[C++] ReadAt/WriteAt are inconsistent with moving the files position #19211

Closed
asfimport opened this issue Jul 12, 2018 · 10 comments
Closed

[C++] ReadAt/WriteAt are inconsistent with moving the files position #19211

asfimport opened this issue Jul 12, 2018 · 10 comments

Comments

@asfimport
Copy link

Right now, there is inconsistent behaviour regarding moving the files position pointer after calling ReadAt or WriteAt. For example, the default implementation of ReadAt seeks to the desired offset and calls Read which moves the position pointer. MemoryMappedFile::ReadAt, however, doesn't change the position. WriteableFile::WriteAt seem to move the position in the current implementation, but there is no docstring which prescribes this behaviour.

Antoine suggested that *At methods shouldn't touch the position and it makes more sense, IMHO. The change isn't huge and doesn't seem to break anything internally, but it might break the existing user code.

Reporter: Dimitri Vorona / @alendit
Assignee: Antoine Pitrou / @pitrou

PRs and other links:

Note: This issue was originally created as ARROW-2835. Please see the migration documentation for further details.

@asfimport
Copy link
Author

Antoine Pitrou / @pitrou:
Unfortunately, things are bit more complicated as in some cases, ReadAt / WriteAt are forced to update the file position anyway (on Windows, see implementation of FileRead in io-util.cc).

@asfimport
Copy link
Author

Dimitri Vorona / @alendit:
Ok, then we should at least be consistent across implementation, i.e. advance the position in MemoryMappedFile::ReadAt, right?

@asfimport
Copy link
Author

Antoine Pitrou / @pitrou:
In that case it would mean only advance the position if we're on Windows :) I don't think there's a nice way out of this. If we want to keep the parallelization benefits of ReadAt and WriteAt, we'll need to live with the platform specifics, I think. @wesm

@asfimport
Copy link
Author

Wes McKinney / @wesm:
I agree that we may in general have to live with the platform specific behavior. When there is a choice (rather than being a platform specific detail), I would support having ReadAt/WriteAt set the position to the end of the read segment (or the end of the file, as the case may be). Does that sound like a reasonable scope for this JIRA?

@asfimport
Copy link
Author

Wes McKinney / @wesm:
Moving this off 0.10.0 for now. I will focus on getting ARROW-2701 merged

@asfimport
Copy link
Author

Wes McKinney / @wesm:
Do we want to make any changes here for 0.11?

@asfimport
Copy link
Author

Wes McKinney / @wesm:
Bumping this issue. It seems this is mostly a documentation issue, but if we want to make MemoryMappedFile::WriteAt/ReadAt behave a certain way (e.g advancing the file position), feel free to propose changes

@asfimport
Copy link
Author

Antoine Pitrou / @pitrou:
I see two other ways around this:

  1. As soon as ReadAt or WriteAt is called, change the internal file state so that any implicitly-positioning operation (such as Read, Write or Tell) fails until Seek is called first.

or 2) Have an internal "positioning" lock that ensures that we can have several ReadAt or WriteAt calls simultaneously, but that implicitly positioning operations wait for the last *At call to end and to restore the file pointer.

I'm not sure how easy #2 is, but should be doable.

@asfimport
Copy link
Author

Antoine Pitrou / @pitrou:
I'm leaning towards #1, because it helps expose bugs.

As a sidenote, Tensorflow's RandomAccessFile only supports ReadAt-like operation, not implicit positioning with Seek() or Tell(). See https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/platform/file_system.h#L231

@asfimport
Copy link
Author

Antoine Pitrou / @pitrou:
Issue resolved by pull request 4417
#4417

@asfimport asfimport added this to the 0.14.0 milestone Jan 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants