Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Retain references to buffers while doing blocking I/O #742

Merged
merged 8 commits into from Oct 30, 2019

Conversation

aantron
Copy link
Collaborator

@aantron aantron commented Oct 30, 2019

@olafhering, #726 was actually a symptom of a small family of pretty serious bugs in Lwt. Some Lwt APIs were not retaining references to user buffers that were passed in for I/O, and the buffers were occasionally being released too early.

These bugs are fairly difficult to trigger, as they require:

  • Doing I/O on a blocking fd (Lwt is primarily used with sockets and pipes in non-blocking mode).
  • The user not retaining a reference to the buffer (most programs reuse their buffers, so they retain the references).
  • The I/O not completing right away (in most cases, blocking I/O calls do complete right away).
  • An intervening GC cycle between when the I/O is scheduled and when it runs.

When triggered, Lwt could access dead (or reallocated) memory when writing, and unexpectedly write to memory of other objects when reading. The buggy APIs were Lwt_unix.writev, Lwt_unix.readv, Lwt_bytes.write, Lwt_bytes.read. Lwt_bytes.wait_mincore also had the issue, even though it's a different kind of function.

I added a new kind of tests specifically for buffer retention, and reviewed the arguments of all functions in lwt.unix for whether they have this bug or not.

I ran the same stress test that I used to reproduce #726 locally, and did not observe the problem with these patches. I'll run it a few more times, and then merge.

@olafhering, thanks again for being thorough and reporting the issue.

Fixes #726.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

test fail in writev: basic blocking
1 participant