Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[batch] make create, delete, close not linear in jobs #6569

Merged
merged 3 commits into from Jul 10, 2019

Conversation

jigold
Copy link
Collaborator

@jigold jigold commented Jul 8, 2019

Also only start jobs once a batch is closed.

batch = await Batch.from_db(batch_id, user)
if not batch:
abort(404)
await batch.mark_deleted()
asyncio.ensure_future(batch.mark_deleted())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the caller not getting feedback on whether mark_deleted succeeded the design we want? Similar for the others. Are we going to give the caller a way to track whether the operations they requested succeeded or not (say subscribing to an event stream)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed it so we've done all of the work that is not O(n) (updating the jobs), but still return success only when we've updated the batch state in the database.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, that sounds about right.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

await j.cancel()
log.info(f'batch {self.id} cancelled')

async def close(self):
await db.batch.update_record(self.id, closed=True)
self.closed = True

for job in await self.get_jobs():
if job._state == 'Ready':
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since start_job already checks for 'Ready', do we need the check here?

As an aside, it looks like we spawn 16 consumers, but in the same thread. Could we use a thread-safe Queue implementation, and spawn N threads? In particular, collections.dequeue appears to be thread safe for push/pop operations, and has performance very similar to lists (should be significantly better than Queue). This doesn't need to addressed for this PR, just something I thought could help future work.

I found this exploration by David Beazley (creator of Curio) helpful: https://www.youtube.com/watch?v=MCs5OvhV9S4

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, since start_job awaits on the get, do we need to await on the put? Could use put_nowait, provided the queue size is infinite.

Still, to put less pressure on the queue, I would prefer pushing (eventually, not in this PR), a single collection of jobs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To recap: only question regarding the "Ready" check is for this PR, queue comment is related to broader batch

  • related to "Ready" question: do we potentially wait longer than necessary to start jobs that aren't ready but will be ready soon (Pending), or even never start these jobs (if close is never called again)? If so, it would be potentially slightly better to not check _state here at all, push the entire job list into the queue (easier to do as list in the future), and when the queue is ready to process that job in start_job, then the job is inspected for Ready state. This could protect against transitions happening in between close being called and start_job getting the item.

Copy link
Contributor

@akotlar akotlar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few small comments

@akotlar
Copy link
Contributor

akotlar commented Jul 10, 2019

@danking danking merged commit 09a5183 into hail-is:master Jul 10, 2019
jigold added a commit that referenced this pull request Jul 10, 2019
@jigold jigold mentioned this pull request Jul 10, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants