Race conditions with batching #46

Open
lukasgraf opened this issue Nov 18, 2015 · 1 comment
Comments

@lukasgraf
Member

(I'm just dumping this here so the conversation in #45 doesn't get too convoluted - for now I see this as low to medium priority.)

Once we implement some sort of batching / pagination, there are some inherent race conditions that can occur:

Imagine a search query. Because fetching a batch page happens in a separate request, the extent and order of the resultset for a given query can change between batch pages if another client modifies the DB in between. When a consumer simply iterates over all entries in all batch pages, this can lead to duplicate entries, or to entries being silently dropped between pages.
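To make that concrete, here's a trivial illustration of naive offset-based paging going wrong, with plain Python lists standing in for the resultset:

# Plain lists standing in for the catalog resultset.
results = ["a", "b", "c", "d"]   # resultset as of the first batch request
page1 = results[0:2]             # -> ["a", "b"]

results.insert(0, "new")         # another client adds a matching object

page2 = results[2:4]             # -> ["b", "c"]: "b" shows up twice
assert page1 + page2 == ["a", "b", "b", "c"]
# Deleting an early entry instead would silently drop one between pages.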

ElasticSearch addresses this in a rather elegant way with its Scroll API:

  • The first request just creates a server-side, persistent search context that has a certain time to live (TTL).
  • That request is answered with a response that basically just contains a _scroll_id that uniquely identifies the resultset created by the query at that point in time.
  • To fetch the results, the client issues subsequent requests for a particular batch page from that search context, referencing it via _scroll_id. On each of those requests the TTL for the search context is reset, so it is kept alive for another $TTL minutes.
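For reference, the client-side flow against Elasticsearch looks roughly like this (a sketch using the requests library; the index name and query are made up, and the exact endpoints may differ between ES versions):

import requests

ES = "http://localhost:9200"

# 1) The initial search creates the scroll context (kept alive for 1 minute).
resp = requests.post(
    ES + "/myindex/_search?scroll=1m",
    json={"query": {"match": {"portal_type": "Document"}}, "size": 20},
).json()
scroll_id = resp["_scroll_id"]

# 2) Subsequent requests page through the frozen resultset; each one
#    also resets the context's TTL.
while True:
    resp = requests.post(
        ES + "/_search/scroll",
        json={"scroll": "1m", "scroll_id": scroll_id},
    ).json()
    hits = resp["hits"]["hits"]
    if not hits:
        break
    scroll_id = resp["_scroll_id"]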

I could see a similar concept working for us in order to provide stable resultsets for batched sequences, particularly search results.


I'm just brainstorming here, but maybe something along these lines could work:

POST /Plone/search

{"portal_type": "Document"}

This would create a server-side, persistent search context. In terms of search results, this could mean persisting a list of brain RIDs [1] for the resultset that matched the query at that point in time.

Returns a response with a scroll_id:

{"scroll_id": "f40dba5"}

The client can then retrieve result batches via GET requests:

GET /Plone/search?scroll_id=f40dba5&page=1&per_page=20

The link to the first batch page can also be provided in a hypermedia fashion as part of the response to the POST that creates the search context.
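The matching GET handler could then slice the persisted RIDs and resolve only the requested page back into brains. Again just a sketch, reusing the SCROLL_CONTEXTS store from above; note that resolving a RID to a brain by indexing the inner catalog relies on a ZCatalog implementation detail:

import json
import time

def get_batch(catalog, scroll_id, page=1, per_page=20):
    # GET /Plone/search?scroll_id=...&page=N&per_page=M
    expiry, rids = SCROLL_CONTEXTS[scroll_id]  # would 404 on unknown/expired ids
    SCROLL_CONTEXTS[scroll_id] = (time.time() + SCROLL_TTL, rids)  # reset TTL

    start = (page - 1) * per_page
    page_rids = rids[start:start + per_page]
    # Resolving RIDs back to brains; catalog._catalog[rid] is a ZCatalog
    # implementation detail, a supported API may be preferable.
    brains = [catalog._catalog[rid] for rid in page_rids]

    body = {"items": [brain.getPath() for brain in brains]}
    if start + per_page < len(rids):
        body["next"] = "/Plone/search?scroll_id=%s&page=%d&per_page=%d" % (
            scroll_id, page + 1, per_page)
    return json.dumps(body)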

Search contexts that have exceeded their TTL would be destroyed with the next POST. In addition, they could be actively cleared by the client using DELETE or PURGE.

Compared to a simple, stateless GET implementation, I see these pros/cons:

Advantages:

  • Stable resultsets
  • Appropriate use of HTTP methods (IMHO)
  • Allows for complex queries by using JSON in POST body
  • Still allows for hypermedia batching links because those requests are GET with query string params

Disadvantages:

  • Stateful - REST / HATEOAS?
  • Requires at least two requests for even the most trivial search
  • DB write for search / query operations
  • The returned metadata from the brains would still be up to date (not frozen in time). This could lead to surprising results if an object that matched at query time is included in the resultset, but has been changed since and, according to its metadata, wouldn't match the query any more.

[1] Is there a way to get the brain RIDs from a catalog resultset (LazyMap) without destroying its laziness? If not, that would at least partly defeat the purpose of batching 😢
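(One possible answer, if I read the ZCatalog Lazy implementation correctly: LazyMap keeps the raw RID sequence it maps over in its private _seq attribute, so something like the following might avoid waking any brains - but it depends on private attributes:)

def lazy_rids(results):
    # LazyMap stores the raw RID sequence in `_seq` - a private attribute,
    # so this is fragile across ZCatalog versions.
    seq = getattr(results, "_seq", None)
    if seq is not None:
        return list(seq)
    # Fallback: calling getRID() on each brain works, but instantiates
    # every brain and thereby defeats the laziness.
    return [brain.getRID() for brain in results]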
