NO_EPOLL=1 kills CPU, busywaits. #90

Closed
srh opened this issue Nov 20, 2012 · 4 comments

@srh
Contributor

srh commented Nov 20, 2012

To reproduce:

  1. Build rethinkdb with make NO_EPOLL=1.
  2. Run htop or some other CPU monitor in another terminal.
  3. Run ./build/debug-noepoll/rethinkdb

Put a debugf in front of the poll(2) call in arch/runtime/event_queue/poll.cc and you'll see that it spins around the loop, busywaiting, with many calls per millisecond while the server should be idling. Compare that with epoll, where you'll see one call per 5 milliseconds (because of the timerfd timer).
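The difference between a spinning loop and a sleeping one can be measured outside the server. The following is a minimal sketch, not RethinkDB's event loop: `count_poll_returns` is a hypothetical helper that watches an idle pipe and counts how often poll(2) returns within a fixed window. The 5 ms timeout used in the usage note is an assumption that stands in for the timerfd-driven wakeup described above.

```cpp
#include <poll.h>
#include <unistd.h>

#include <cassert>
#include <chrono>

// Hypothetical helper: counts how many times poll(2) returns within
// `duration_ms` while watching the read end of an idle pipe.
// Returns -1 if the pipe cannot be created.
// `timeout_ms` must be >= 0: with -1, poll would block forever on an idle pipe.
long count_poll_returns(int timeout_ms, int duration_ms) {
    assert(timeout_ms >= 0 && duration_ms > 0);

    int fds[2];
    if (pipe(fds) != 0) {
        return -1;
    }

    struct pollfd pfd = {};
    pfd.fd = fds[0];      // read end; nothing is ever written to it
    pfd.events = POLLIN;

    using clock = std::chrono::steady_clock;
    const auto deadline = clock::now() + std::chrono::milliseconds(duration_ms);

    long calls = 0;
    while (clock::now() < deadline) {
        poll(&pfd, 1, timeout_ms);  // timeout 0 returns immediately
        ++calls;
    }

    close(fds[0]);
    close(fds[1]);
    return calls;
}
```

Calling `count_poll_returns(0, 100)` should report a very large number of returns, which is the busywait signature, while `count_poll_returns(5, 100)` should report roughly one return per 5 ms.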

The OS X port (#5) is waiting on this.

@ghost ghost assigned srh Nov 20, 2012
@coffeemug
Contributor

You should take this issue. I believe this is a regression -- we had a similar issue before which was fixed. I don't remember what the cause was, but I believe it was pretty simple.

Another possibility is this -- NO_EPOLL might have been only meant for builds with LEGACY_LINUX turned on. Try turning on both NO_EPOLL and LEGACY_LINUX -- do you experience the same problem? If that turns out to be the case, we should probably fix NO_EPOLL to work even in the absence of LEGACY_LINUX, because that's a sensible thing to do.

@srh
Contributor Author

srh commented Nov 20, 2012

Confirmed: we don't get the problem with LEGACY_LINUX=1. That option disables epoll compilation. This will help narrow down the problem.

@srh
Contributor Author

srh commented Nov 21, 2012

It turns out that the problem is not particularly deep. We pass 0 as the timeout parameter to poll, which makes it return immediately, when -1 is what would give us an infinite timeout. Under LEGACY_LINUX we use ppoll, where the infinite timeout is passed correctly.
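For illustration, here is a minimal, Linux-only sketch of poll(2) with a timeout of -1 blocking until a file descriptor becomes readable. It is not the actual fix that was pushed: `block_until_timer` is a hypothetical helper, and the one-shot timerfd is only an assumed stand-in for whatever event source the real event queue watches.

```cpp
#include <poll.h>
#include <sys/timerfd.h>
#include <unistd.h>

#include <chrono>

// Hypothetical helper: arms a one-shot timerfd for `delay_ms`, then blocks in
// poll(2) with an infinite timeout (-1) until the timer fires.
// Returns the elapsed time in milliseconds, or -1 on error.
long block_until_timer(int delay_ms) {
    if (delay_ms <= 0) {
        return -1;  // a zero it_value would disarm the timer and block forever
    }

    int tfd = timerfd_create(CLOCK_MONOTONIC, 0);
    if (tfd < 0) {
        return -1;
    }

    using clock = std::chrono::steady_clock;
    const auto start = clock::now();

    struct itimerspec spec = {};
    spec.it_value.tv_sec = delay_ms / 1000;
    spec.it_value.tv_nsec = (delay_ms % 1000) * 1000000L;
    if (timerfd_settime(tfd, 0, &spec, nullptr) != 0) {
        close(tfd);
        return -1;
    }

    struct pollfd pfd = {};
    pfd.fd = tfd;
    pfd.events = POLLIN;

    // -1 means "wait indefinitely"; 0 would return at once with no events.
    const int ready = poll(&pfd, 1, -1);

    const auto elapsed = std::chrono::duration_cast<std::chrono::milliseconds>(
        clock::now() - start);
    close(tfd);

    return ready == 1 ? static_cast<long>(elapsed.count()) : -1;
}
```

With -1 the thread sleeps in the kernel for the whole delay instead of spinning, so `block_until_timer(20)` should return a value of about 20 after a single poll call.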

@srh
Contributor Author

srh commented Nov 21, 2012

This has been fixed and pushed.


2 participants