
Throttling, concurrency, structure #59

Closed
banksJeremy opened this issue May 7, 2014 · 2 comments

Comments

@banksJeremy
Collaborator

I've been thinking a bit about what to work on next, after Message (and possibly User and Room) are implemented. Here are some rough thoughts I've had. (I'll probably create more specific tickets for associated work, and implementation details, but these topics are pretty related so it would be useful to have initial discussion in one place.)

Throttling

When we post a new message, or edit one, the requests are throttled and retried when appropriate. This is good, but it doesn't apply to any of the other requests we make. We should generalize the existing code and make it easy to apply to different types of requests. (There would need to be some code specific to recognizing success/temporary error/fatal error for different types of requests.) Ignoring the implementation for a sec, what behaviour do we want?

It might be reasonable to have two request queues, one for read requests and one for write requests. That way we can keep seeing updates even while our chat messages are throttled and being retried. By default they could limit us to one request per five seconds, or perhaps start with a smaller limit that increases if we keep sending a lot of requests. Or maybe the read queue could allow a couple of requests to be in flight at once, while writing is limited to a single request.
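A minimal sketch of what the two-queue idea could look like (names like ThrottledQueue and the specific intervals are illustrative, not part of the existing code):

```python
import threading
import time

class ThrottledQueue:
    """Serializes requests and enforces a minimum interval between them."""

    def __init__(self, interval=5.0):
        self._interval = interval
        self._lock = threading.Lock()
        self._last_request = 0.0  # monotonic timestamp of the last request

    def request(self, func, *args, **kwargs):
        """Run func, waiting first so we make at most one request per interval."""
        with self._lock:
            wait = self._last_request + self._interval - time.monotonic()
            if wait > 0:
                time.sleep(wait)
            self._last_request = time.monotonic()
            return func(*args, **kwargs)

# Separate queues so reads keep flowing while writes are throttled harder.
read_queue = ThrottledQueue(interval=0.5)
write_queue = ThrottledQueue(interval=5.0)
```

The lock also gives the "single in-flight request" behaviour for writes; allowing a couple of concurrent reads would need a semaphore instead of a plain lock.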

Concurrency

The concurrency model of this code might have been sane before I touched it, but given the work I've done, I'm sure it isn't any more, and there are probably many possible race conditions that could result in bugs.

For example, users on two different threads could both make requests and read from and write to a Message at the same time, possibly resulting in errors.

I propose that Wrapper ensure that nothing modifies its data from outside of a single main worker thread. Anything that could modify data will need to be passed in through a queue, which will be processed by that single thread. Wrapper will also manage our connections for throttling. Any public (non-prefixed) methods on Wrapper should be safe to use from any thread.

Given that you have a message, and you access the missing message.user_name:

  • it calls message.scrape_transcript()
  • which calls wrapper.scrape_transcript_for_message_id(...)
  • which queues a request for the worker thread, then blocks on a response queue
  • the worker thread makes the HTTP request through Browser (it uses the throttling mechanism, so the request may be queued and not take place instantly)
  • once the worker thread gets the response, it updates all of the Messages and other objects that it has learned about
  • the worker thread returns a value through the response queue
  • execution resumes in the initial thread, with the message.user_name value now populated
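The flow above could be sketched roughly like this (the inner fetch is a hypothetical stand-in for the real Browser call; error handling is omitted):

```python
import queue
import threading

class Wrapper:
    """All mutating work funnels through one worker thread via a queue."""

    def __init__(self):
        self._requests = queue.Queue()
        self._worker = threading.Thread(target=self._work, daemon=True)
        self._worker.start()

    def _work(self):
        # Only this thread ever touches shared state, so no locks are
        # needed on the Message/User/Room objects it updates.
        while True:
            func, args, response = self._requests.get()
            response.put(func(*args))

    def scrape_transcript_for_message_id(self, message_id):
        def fetch(mid):
            # Stand-in for the real (throttled) HTTP request via Browser.
            return {"message_id": mid, "user_name": "example"}
        response = queue.Queue()
        self._requests.put((fetch, (message_id,), response))
        return response.get()  # block until the worker thread replies
```

The per-call response queue is what lets the calling thread block until its result is ready while other callers' requests queue up behind it.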

The Wrapper should also de-duplicate requests made at the same time, when possible. For example, if two different threads both call request_transcript around the same time because a field is missing, wrapper should notice that they want the same information and only make a single request.
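One possible shape for that de-duplication, sketched under the assumption that requests can be keyed (e.g. by message id); the first caller does the work and later concurrent callers wait for its result:

```python
import threading

class Deduplicator:
    """Coalesces identical concurrent requests into a single call."""

    def __init__(self):
        self._lock = threading.Lock()
        self._in_flight = {}  # key -> (done Event, one-slot result holder)

    def request(self, key, func):
        with self._lock:
            pending = self._in_flight.get(key)
            owner = pending is None
            if owner:
                pending = (threading.Event(), [])
                self._in_flight[key] = pending
        event, holder = pending
        if owner:
            holder.append(func())  # only the first caller does the work
            with self._lock:
                del self._in_flight[key]
            event.set()
        else:
            event.wait()  # later callers block, then reuse the result
        return holder[0]
```

(A real version would also need to propagate exceptions to the waiters; this sketch assumes func succeeds.)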

Structure

I'd like to clearly define a division of responsibilities between wrapper and client. One possibility is as follows:

chatexchange.Browser (possible alternative name: Connection)

  • provides a clean interface for operations with the chat server
  • contains everything that directly touches the network
  • contains everything that handles raw HTML
  • returns JSON-style objects (str/int/float/dict/list), either as provided by Stack Exchange or scraped from soup
  • does not manage retries or throttling, except to raise appropriate exceptions
  • nothing thread-related
  • could be used by third parties who want to implement their own chat library, without dealing with soup or URLs
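A rough sketch of what that Browser surface might look like; the method names and the string-splitting "parser" are purely illustrative stand-ins (the real code would parse actual transcript HTML with soup):

```python
class ShouldRetryError(Exception):
    """Signals a temporary server error; Browser raises, callers decide how to retry."""

class Browser:
    """Network/HTML layer: speaks in plain JSON-style values, never rich objects."""

    def scrape_message(self, html):
        # Stand-in for real HTML scraping; the point is the return type:
        # only str/int/float/dict/list, so third parties can build their
        # own object layer on top.
        user_name, _, content = html.partition(": ")
        return {"user_name": user_name, "content": content}
```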

chatexchange.Client (suggested rename from Wrapper¹)

  • higher-level interface to chat
  • returns nice objects like Event, Message, User, Room
  • all public methods (on Client and on objects returned from public methods) are safe to use from any thread (though they may block)
  • retrying and throttling logic

At this point, it might make sense to delete asyncwrapper.


¹ "Wrapper" sounds like more of an implementation description than an explanation of what it provides, so I'd prefer a different name if one makes sense.

@Manishearth
Owner

For throttling we should probably have a mechanism for waiting and retrying, with the interval increasing on each retry. Queue everything up, basically. Message posting already works this way, but we might want some way to generalize it.
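The wait-and-retry-with-growing-interval idea is essentially exponential backoff; a small sketch (the exception name and parameters are illustrative assumptions, not existing code):

```python
import time

class ShouldRetryError(Exception):
    """Hypothetical marker for a temporary, retryable failure."""

def retry_with_backoff(func, attempts=5, initial=1.0, factor=2.0, sleep=time.sleep):
    """Call func, retrying on ShouldRetryError with an interval that grows each time."""
    wait = initial
    for attempt in range(attempts):
        try:
            return func()
        except ShouldRetryError:
            if attempt == attempts - 1:
                raise  # out of attempts; let the caller see the failure
            sleep(wait)
            wait *= factor  # interval increases on each retry
```

Injecting `sleep` makes the backoff schedule testable without actually waiting.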

The concurrency bit makes sense, though I'd have to think a bit on the implementation details.

I agree with the structure part. Oh, and asyncwrapper was obsolete, and should have been removed a long time ago :)

banksJeremy pushed a commit that referenced this issue May 8, 2014
banksJeremy pushed a commit to jeremyBanks/stack.chat that referenced this issue May 14, 2014
banksJeremy pushed a commit to jeremyBanks/stack.chat that referenced this issue May 14, 2014
banksJeremy pushed a commit to jeremyBanks/stack.chat that referenced this issue May 15, 2014
The message scraping methods haven't been moved yet; they'll need to be
disentangled from Message logic.

PEP-8 formatting changes in browser.

ref Manishearth#59
banksJeremy pushed a commit that referenced this issue May 15, 2014
Made pinning/starring methods more intelligent with known attributes.

ref #59
@banksJeremy
Collaborator Author

Closing this ticket as redundant with the more-specific descendants #68 and #69.
