
Throttling, concurrency, structure #59

Closed
banksJeremy opened this issue May 7, 2014 · 2 comments

Comments

@banksJeremy
Collaborator

I've been thinking a bit about what to work on next, after Message (and possibly User and Room) are implemented. Here are some rough thoughts I've had. (I'll probably create more specific tickets for associated work, and implementation details, but these topics are pretty related so it would be useful to have initial discussion in one place.)

Throttling

When we post a new message, or edit one, the requests are throttled and retried when appropriate. This is good, but it doesn't apply to any of the other requests we make. We should generalize the existing code and make it easy to apply to different types of requests. (There would need to be some code specific to recognizing success/temporary error/fatal error for different types of requests.) Ignoring the implementation for a sec, what behaviour do we want?

It might be reasonable to have two request queues, one for read requests and one for write requests. That way we can keep seeing updates even while our chat messages are throttled and being retried. By default they could limit us to one request per five seconds, or perhaps start with a smaller limit that increases if we keep sending a lot of requests. Or maybe the read queue could allow a couple of requests to be in flight at once, while writing is limited to a single request.
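A minimal sketch of what the two-queue idea could look like (names like ThrottledQueue and the specific intervals are illustrative, not part of the existing code):

```python
import threading
import time

class ThrottledQueue:
    """Serializes requests and enforces a minimum interval between them."""

    def __init__(self, interval=5.0):
        self._interval = interval
        self._lock = threading.Lock()
        self._last_request = 0.0  # monotonic timestamp of the last request

    def request(self, func, *args, **kwargs):
        """Run func, waiting first so we make at most one request per interval."""
        with self._lock:
            wait = self._last_request + self._interval - time.monotonic()
            if wait > 0:
                time.sleep(wait)
            self._last_request = time.monotonic()
            return func(*args, **kwargs)

# Separate queues so reads keep flowing while writes are throttled harder.
read_queue = ThrottledQueue(interval=0.5)
write_queue = ThrottledQueue(interval=5.0)
```

The lock also gives the "single in-flight request" behaviour for writes; allowing a couple of concurrent reads would need a semaphore instead of a plain lock.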

Concurrency

The concurrency model of this code might have been sane before I touched it, but given the work I've done, I'm sure it isn't any more, and there are probably many possible race conditions that could result in bugs.

For example, users on two different threads could both make requests and read from and write to a Message at the same time, possibly resulting in errors.

I propose that Wrapper ensure that nothing modifies its data from outside of a single main worker thread. Anything that could modify data will need to be passed in through a queue, which will be processed by that single thread. Wrapper will also manage our connections for throttling. Any public (non-prefixed) methods on Wrapper should be safe to use from any thread.

Given that you have a message, and you access the missing message.user_name:

  • it calls message.scrape_transcript()
  • which calls wrapper.scrape_transcript_for_message_id(...)
  • which queues a request for the worker thread, then blocks on a response queue
  • the worker thread makes the HTTP request through Browser (it uses the throttling mechanism, so the request may be queued and not take place instantly)
  • once the worker thread gets the response, it updates all of the Messages and other objects that it has learned about
  • the worker thread returns a value through the response queue
  • execution resumes in the initial thread, with the message.user_name value now populated
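The flow above could be sketched roughly like this (the inner fetch is a hypothetical stand-in for the real Browser call; error handling is omitted):

```python
import queue
import threading

class Wrapper:
    """All mutating work funnels through one worker thread via a queue."""

    def __init__(self):
        self._requests = queue.Queue()
        self._worker = threading.Thread(target=self._work, daemon=True)
        self._worker.start()

    def _work(self):
        # Only this thread ever touches shared state, so no locks are
        # needed on the Message/User/Room objects it updates.
        while True:
            func, args, response = self._requests.get()
            response.put(func(*args))

    def scrape_transcript_for_message_id(self, message_id):
        def fetch(mid):
            # Stand-in for the real (throttled) HTTP request via Browser.
            return {"message_id": mid, "user_name": "example"}
        response = queue.Queue()
        self._requests.put((fetch, (message_id,), response))
        return response.get()  # block until the worker thread replies
```

The per-call response queue is what lets the calling thread block until its result is ready while other callers' requests queue up behind it.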

The Wrapper should also de-duplicate requests made at the same time, when possible. For example, if two different threads both call request_transcript around the same time because a field is missing, wrapper should notice that they want the same information and only make a single request.
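One possible shape for that de-duplication, sketched under the assumption that requests can be keyed (e.g. by message id); the first caller does the work and later concurrent callers wait for its result:

```python
import threading

class Deduplicator:
    """Coalesces identical concurrent requests into a single call."""

    def __init__(self):
        self._lock = threading.Lock()
        self._in_flight = {}  # key -> (done Event, one-slot result holder)

    def request(self, key, func):
        with self._lock:
            pending = self._in_flight.get(key)
            owner = pending is None
            if owner:
                pending = (threading.Event(), [])
                self._in_flight[key] = pending
        event, holder = pending
        if owner:
            holder.append(func())  # only the first caller does the work
            with self._lock:
                del self._in_flight[key]
            event.set()
        else:
            event.wait()  # later callers block, then reuse the result
        return holder[0]
```

(A real version would also need to propagate exceptions to the waiters; this sketch assumes func succeeds.)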

Structure

I'd like to clearly define a division of responsibilities between wrapper and client. One possibility is as follows:

chatexchange.Browser (possible alternative name: Connection)

  • provides a clean interface for operations with the chat server
  • contains everything that directly touches the network
  • contains everything that handles raw HTML
  • returns JSON-style objects (str/int/float/dict/list), either as provided by Stack Exchange or scraped from soup
  • does not manage retries or throttling, except to raise appropriate exceptions
  • nothing thread-related
  • could be used by third parties who want to implement their own chat library, without dealing with soup or URLs
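A rough sketch of what that Browser surface might look like; the method names and the string-splitting "parser" are purely illustrative stand-ins (the real code would parse actual transcript HTML with soup):

```python
class ShouldRetryError(Exception):
    """Signals a temporary server error; Browser raises, callers decide how to retry."""

class Browser:
    """Network/HTML layer: speaks in plain JSON-style values, never rich objects."""

    def scrape_message(self, html):
        # Stand-in for real HTML scraping; the point is the return type:
        # only str/int/float/dict/list, so third parties can build their
        # own object layer on top.
        user_name, _, content = html.partition(": ")
        return {"user_name": user_name, "content": content}
```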

chatexchange.Client (suggested rename from Wrapper¹)

  • higher-level interface to chat
  • returns nice objects like Event, Message, User, Room
  • all public methods (on Client and on objects returned from public methods) are safe to use from any thread (though they may block)
  • retrying and throttling logic

At this point, it might make sense to delete asyncwrapper.


¹ "Wrapper" sounds like more of an implementation description than an explanation of what it provides, so I'd prefer a different name if one makes sense.

@Manishearth
Owner

For throttling we should probably have a mechanism for waiting and retrying, with the interval increasing on each retry. Queue everything up, basically. Message posting already works this way, but we might want some way to generalize it.
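The wait-and-retry-with-growing-interval idea is essentially exponential backoff; a small sketch (the exception name and parameters are illustrative assumptions, not existing code):

```python
import time

class ShouldRetryError(Exception):
    """Hypothetical marker for a temporary, retryable failure."""

def retry_with_backoff(func, attempts=5, initial=1.0, factor=2.0, sleep=time.sleep):
    """Call func, retrying on ShouldRetryError with an interval that grows each time."""
    wait = initial
    for attempt in range(attempts):
        try:
            return func()
        except ShouldRetryError:
            if attempt == attempts - 1:
                raise  # out of attempts; let the caller see the failure
            sleep(wait)
            wait *= factor  # interval increases on each retry
```

Injecting `sleep` makes the backoff schedule testable without actually waiting.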

The concurrency bit makes sense, though I'd have to think a bit on the implementation details.

I agree with the structure part. Oh, and asyncwrapper was obsolete, and should have been removed a long time ago :)

banksJeremy pushed a commit that referenced this issue May 8, 2014
banksJeremy pushed a commit to jeremyBanks/stack.chat that referenced this issue May 14, 2014
banksJeremy pushed a commit to jeremyBanks/stack.chat that referenced this issue May 14, 2014
banksJeremy pushed a commit to jeremyBanks/stack.chat that referenced this issue May 15, 2014
The message scraping methods haven't been moved yet; they'll need to be
disentangled from Message logic.

PEP-8 formatting changes in browser.

ref Manishearth#59
banksJeremy pushed a commit that referenced this issue May 15, 2014
Made pinning/starring methods more intelligent with known attributes.

ref #59
@banksJeremy
Collaborator Author

Closing this ticket as redundant with the more-specific descendants #68 and #69.
