New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make post_stat async #4157

Merged
merged 1 commit into from Jan 3, 2017

Conversation

Projects
None yet
5 participants
@wisechengyi
Contributor

wisechengyi commented Jan 3, 2017

Problem

Currently posting stat is synchronous so if user cannot reach the stat server, the whole operation will have to wait for timeout.

Solution

This PR changes the post action to be asynchronous. The return value is discarded (also status quo)

Existing test can be found here

Result

It is async and does not block user actions.

@wisechengyi wisechengyi requested review from stuhood, benjyw and mateor Jan 3, 2017

@mateor

mateor approved these changes Jan 3, 2017

@benjyw

benjyw approved these changes Jan 3, 2017

It's too bad that requests doesn't support async io out of the box. There are various integrations into async frameworks, but that's overkill. This is totally fine for this one-off thing.

@wisechengyi wisechengyi merged commit 44b1565 into pantsbuild:master Jan 3, 2017

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
@baroquebobcat

This comment has been minimized.

Show comment
Hide comment
@baroquebobcat

baroquebobcat Jan 6, 2017

Contributor

This has the unfortunate effect of moving the post to a non main thread, which is not interrupted by Ctrl+C. If you are not running the daemon, this is a problem. I didn't test with the daemon, but I suspect the behavior is ok because I think the thread will run in a different process.

I tested this by starting a local http server that lets requests time out and changing the url to that with a large timeout. This causes the post to hang.

Without this change, Ctrl+C on the hanging request kills the process. With it, Ctrl+C does not interrupt the hanging request, and pants will wait the full length of the timeout.

The result looks something like this:

$ ./pants list -h
<list options here>
^C^C^C^CWARNING: Failed to upload stats to http://0.0.0.0:53101 due to Error: HTTPConnectionPool(host='0.0.0.0', port=53101): Read timed out. (read timeout=20)
Exception KeyboardInterrupt in <module 'threading' from '/.../2.7.6_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.pyc'> ignored
Contributor

baroquebobcat commented Jan 6, 2017

This has the unfortunate effect of moving the post to a non main thread, which is not interrupted by Ctrl+C. If you are not running the daemon, this is a problem. I didn't test with the daemon, but I suspect the behavior is ok because I think the thread will run in a different process.

I tested this by starting a local http server that lets requests time out and changing the url to that with a large timeout. This causes the post to hang.

Without this change, Ctrl+C on the hanging request kills the process. With it, Ctrl+C does not interrupt the hanging request, and pants will wait the full length of the timeout.

The result looks something like this:

$ ./pants list -h
<list options here>
^C^C^C^CWARNING: Failed to upload stats to http://0.0.0.0:53101 due to Error: HTTPConnectionPool(host='0.0.0.0', port=53101): Read timed out. (read timeout=20)
Exception KeyboardInterrupt in <module 'threading' from '/.../2.7.6_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.pyc'> ignored
@wisechengyi

This comment has been minimized.

Show comment
Hide comment
@wisechengyi

wisechengyi Jan 6, 2017

Contributor

iiuc without being to interrupt the side thread, pants has to wait for the full timeout and is unable to exit, which is be worse than the status quo in sync fashion, correct?

I can see a couple of options:

  1. Does https://pypi.python.org/pypi/grequests look promising?

Optionally, in the event of a timeout or any other exception during the connection of the request, you can add an exception handler that will be called with the request and exception inside the main thread

  1. How's the approach to post_stat in a separate process by forking?
Contributor

wisechengyi commented Jan 6, 2017

iiuc without being to interrupt the side thread, pants has to wait for the full timeout and is unable to exit, which is be worse than the status quo in sync fashion, correct?

I can see a couple of options:

  1. Does https://pypi.python.org/pypi/grequests look promising?

Optionally, in the event of a timeout or any other exception during the connection of the request, you can add an exception handler that will be called with the request and exception inside the main thread

  1. How's the approach to post_stat in a separate process by forking?
@stuhood

This comment has been minimized.

Show comment
Hide comment
@stuhood

stuhood Jan 6, 2017

Member

Marking the thread as a daemon thread, and then explicitly joining it during a healthy exit would probably do the right thing. During an unhealthy exit due to ctrl+c, you wouldn't get stats, but during a healthy exit, you would be certain not to interrupt.

Member

stuhood commented Jan 6, 2017

Marking the thread as a daemon thread, and then explicitly joining it during a healthy exit would probably do the right thing. During an unhealthy exit due to ctrl+c, you wouldn't get stats, but during a healthy exit, you would be certain not to interrupt.

@baroquebobcat

This comment has been minimized.

Show comment
Hide comment
@baroquebobcat

baroquebobcat Jan 6, 2017

Contributor

Stu's got a good idea there. If we could pass a thunk to the exiter to join the thread just prior to calling sys.exit, I think that would work nicely. I'm not quite sure what it would look like though.

Contributor

baroquebobcat commented Jan 6, 2017

Stu's got a good idea there. If we could pass a thunk to the exiter to join the thread just prior to calling sys.exit, I think that would work nicely. I'm not quite sure what it would look like though.

@wisechengyi

This comment has been minimized.

Show comment
Hide comment
@wisechengyi

wisechengyi Jan 6, 2017

Contributor

Thanks for the idea, TIL daemon thread! Please see #4170

Contributor

wisechengyi commented Jan 6, 2017

Thanks for the idea, TIL daemon thread! Please see #4170

@wisechengyi

This comment has been minimized.

Show comment
Hide comment
@wisechengyi

wisechengyi Jan 6, 2017

Contributor

then explicitly joining it during a healthy exit

Since post_stat's return value is ignored regardless, would it still be necessary? Would join also delay user like the sync case?

Contributor

wisechengyi commented Jan 6, 2017

then explicitly joining it during a healthy exit

Since post_stat's return value is ignored regardless, would it still be necessary? Would join also delay user like the sync case?

@wisechengyi

This comment has been minimized.

Show comment
Hide comment
@wisechengyi

wisechengyi Jan 6, 2017

Contributor

hmm okay. the daemon thread will get killed half way when the program exits without join.

which circles back to the original issue:
when user is offline, pants run succeeds, but has to wait post_stat to timeout.
What I would propose instead:
Option 1)

  • At the beginning of the pants run, try to make connection with the server in async fashion
  • At the end of run, post_stat in sync if the connection was made, otherwise skip.

Options 2)
fork to post (easier approach)

Contributor

wisechengyi commented Jan 6, 2017

hmm okay. the daemon thread will get killed half way when the program exits without join.

which circles back to the original issue:
when user is offline, pants run succeeds, but has to wait post_stat to timeout.
What I would propose instead:
Option 1)

  • At the beginning of the pants run, try to make connection with the server in async fashion
  • At the end of run, post_stat in sync if the connection was made, otherwise skip.

Options 2)
fork to post (easier approach)

lenucksi added a commit to lenucksi/pants that referenced this pull request Apr 25, 2017

Make post_stat async (#4157)
Make post_stat async so user does not have to wait for it (to timeout if offline) at every Pants invocation.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment