Replace Use of Kthread-blocking Epoll with Poller Read, Remove Per-Event LStats on Linux #433
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
NOTE: Please do not open a pull request that adds or changes the public API without giving consideration to all supported operating systems.
What does this pull request do?
This PR:
Change #1
Where should the reviewer start?
The changes themselves are fairly trivial, but the concepts are not, so I'd recommend familiarizing yourself with the conceptual background. I've provided a few sources with notes on takeaways, and my explanation and rundown of the Go source code as it pertains to this change.
References
Golang I/O Rundown
In Golang, I/O is blocking, but we can't directly use blocking syscalls, as this would block whichever kernel thread the Goroutine was running in. As a solution, Go I/O is performed via the internal poller package (https://pkg.go.dev/internal/poll), which parks goroutines while waiting for I/O, and uses the platform-specific asynchronous I/O mechanism (e.g. epoll) to detect FDs for which I/O can be performed.
I went through the code to make sure that this is the case, here's a small rundown:
Here's the implementation of read in *os.FIle:
we can see this method calls .read(), which is defined as follows for posix:
It calls pfd.Read(), which we can see refers to poll.FD:
The implementation of poll.FD's read for unix is as follows:
We first execute the read syscall. If the file descriptor is non-blocking, then it will return quickly with EAGAIN and wait for waitRead to return. If it's blocking, then it'll cause the current kernel thread to sleep, and will not call waitRead(), as the fd isn't pollable and doesn't return EAGAIN.
waitRead is as follows:
which calls pd.Wait:
which calls runtime_pollWait() (note that this is the internal/poll package)
which calls netpollblock:
I'll assume that the comment, " returns true if IO is ready, or false if timedout or closed" is true. The current goroutine is parked until I/O is ready.
gpp is set to ready within the function netpollunblock, which is called by netpollready :
netpollready is called by the platform-specific implementation of netpoll(); here's the implementation for Linux, which uses epoll to determine which FDs are ready for I/O, and notifies the corresponding poll descriptor.
How should this be manually tested?
I think that the changes I've made are fairly easy to reason about, and automated tests should validate that the behavior is correct. I tested this change on Linux, which is the affected platform. I did not test on other platforms.
Change #2
Where should the reviewer start?
I would read #404. My argument is that since inotify guarantees that events will be sent in order, and by the time the client receives the event for which we're calling lstat the file may have been removed anyway, there's no need to perform LStat for so many file events.
How should this be manually tested?
I think the automated tests should cover it. I tested this change on Linux, which is the affected platform. I did not test on other platforms.