Enhancement/Improve annotation creation performance #390

@c-w
Contributor

commented Oct 8, 2019

I noticed slowdowns in the annotation workflow with large documents, similar to what was described in #144.

The slowdowns seem to be linked to the performance of two requests:

  • GET /v1/projects/<project_id>/statistics
  • POST /v1/projects/<project_id>/docs/<doc_id>/annotations

This pull request improves the performance of the annotation creation workflow via two main optimizations:

  • Avoid computing the label and user keys in the statistics response after a new annotation has been created. These properties are used on the stats page, not the annotations page, so skipping them saves a database aggregation query.

  • Avoid pulling the full document back from the database on annotation creation. The annotation object only holds a link to the document, so the document id is enough to create the annotation and there is no need to hydrate a full document object. This saves a database read that can be expensive for large documents, where thousands of lines of text may be pulled from the database into the Django server.
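
For illustration only, here is a minimal sketch of both ideas using the standard-library `sqlite3` module rather than the project's actual Django models. The table layout, the function names, the `include` parameter, and the `total` key are all assumptions made for this example; they are not taken from the diff. The point is that an annotation row can be written from the document id alone, and that the statistics helper only runs the aggregation queries the caller asks for.

```python
import sqlite3


def make_db():
    """Build a throwaway in-memory schema (hypothetical, not the project's)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE document (
            id INTEGER PRIMARY KEY, project_id INTEGER, text TEXT
        );
        CREATE TABLE annotation (
            id INTEGER PRIMARY KEY,
            document_id INTEGER REFERENCES document(id),
            label TEXT,
            username TEXT
        );
        """
    )
    # One large document, standing in for "thousands of lines of text".
    conn.execute(
        "INSERT INTO document VALUES (1, 7, ?)", ("line of text\n" * 5000,)
    )
    return conn


def create_annotation(conn, doc_id, label, username):
    """Insert an annotation using only the document id.

    The document row (and its large text column) is never read.
    """
    cur = conn.execute(
        "INSERT INTO annotation (document_id, label, username) VALUES (?, ?, ?)",
        (doc_id, label, username),
    )
    return cur.lastrowid


def project_statistics(conn, project_id, include=("total", "label", "user")):
    """Compute only the requested keys; each key costs one query."""
    stats = {}
    if "total" in include:
        stats["total"] = conn.execute(
            "SELECT COUNT(*) FROM document WHERE project_id = ?", (project_id,)
        ).fetchone()[0]
    # Map each optional key to the annotation column it aggregates over.
    for key, column in (("label", "label"), ("user", "username")):
        if key in include:
            rows = conn.execute(
                f"SELECT a.{column}, COUNT(*) FROM annotation a "
                "JOIN document d ON d.id = a.document_id "
                f"WHERE d.project_id = ? GROUP BY a.{column}",
                (project_id,),
            ).fetchall()
            stats[key] = dict(rows)
    return stats


if __name__ == "__main__":
    conn = make_db()
    create_annotation(conn, 1, "PERSON", "alice")
    # The annotation page would skip the label and user aggregations:
    print(project_statistics(conn, 7, include=("total",)))
```

In Django's ORM the analogous move is usually to set the foreign key through its `_id` attribute (for example `document_id=...`) instead of passing a loaded model instance, which avoids the extra read.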

@c-w c-w force-pushed the CatalystCode:enhancement/improve-annotation-creation-performance branch from 33bb417 to a87ed71 Oct 8, 2019