Optimizations to annotate_content_models for a 10x speedup #5201

merged 2 commits into from Jul 13, 2016



jamalex commented Jul 11, 2016

This is to ameliorate the slowness identified in #5200.

On my laptop, with the full set of videos:

  • Old code: 1 hour 8 minutes
  • New code: 6 minutes 24 seconds

So, the speedup looks to be around 10x.
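
For reference, a quick check of that figure from the two timings above (plain arithmetic, nothing project-specific):

```python
# Timings quoted above, converted to seconds.
old_seconds = 1 * 3600 + 8 * 60   # 1 hour 8 min
new_seconds = 6 * 60 + 24         # 6 min 24 s

speedup = old_seconds / new_seconds
print("speedup: {:.1f}x".format(speedup))
```

That comes out slightly above 10x, consistent with the rounded claim.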

The main sources of the speedup:

  • .annotate was super slow in peewee; now we use list comprehensions
  • we recalculated shared ancestors numerous times (non-deduped recursion); now we queue up the affected nodes and work upward level by level, updating each node at most once
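
To illustrate the second point, here is a minimal sketch of the deduplicated, level-by-level idea. It is not the code from this PR: the function names, the `parent_of` mapping, and the "count of available leaves" annotation are all hypothetical stand-ins, and it uses plain dicts rather than peewee models. It assumes a tree where every node has at most one parent.

```python
from collections import defaultdict


def annotate_level_by_level(parent_of, available_leaves):
    """Count available leaf descendants per node, updating each ancestor once.

    parent_of: hypothetical mapping of node id -> parent id (None for the root).
    Returns (counts, number_of_ancestor_updates).
    """
    children_of = defaultdict(list)
    for node, parent in parent_of.items():
        if parent is not None:
            children_of[parent].append(node)

    def depth(node):
        d = 0
        while parent_of[node] is not None:
            node = parent_of[node]
            d += 1
        return d

    counts = {leaf: 1 for leaf in available_leaves}

    # Queue parents of the changed leaves, bucketed by depth; sets dedupe them.
    queue = defaultdict(set)
    for leaf in available_leaves:
        parent = parent_of[leaf]
        if parent is not None:
            queue[depth(parent)].add(parent)

    updates = 0
    while queue:
        level = max(queue)  # deepest pending level first
        for node in queue.pop(level):
            counts[node] = sum(counts.get(c, 0) for c in children_of[node])
            updates += 1
            parent = parent_of[node]
            if parent is not None:
                queue[level - 1].add(parent)
    return counts, updates


def count_naive_updates(parent_of, available_leaves):
    """Ancestor updates done when every leaf walks to the root on its own."""
    updates = 0
    for leaf in available_leaves:
        node = parent_of[leaf]
        while node is not None:
            updates += 1
            node = parent_of[node]
    return updates


# Hypothetical toy tree: root -> A, B; A -> v1, v2; B -> v3, sub; sub -> v4.
tree = {"root": None, "A": "root", "B": "root", "v1": "A", "v2": "A",
        "v3": "B", "sub": "B", "v4": "sub"}
counts, updates = annotate_level_by_level(tree, ["v1", "v2", "v4"])
naive_updates = count_naive_updates(tree, ["v1", "v2", "v4"])
```

Processing the deepest level first matters here: a node is only recomputed after all of its queued children are done, so it never has to be revisited. On this toy tree the queued version touches 4 ancestors while the per-leaf walk touches 7, and the gap grows with the number of leaves sharing each ancestor.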

@jamalex jamalex added the has PR label Jul 11, 2016

@aronasorman aronasorman added this to the 0.16.7 milestone Jul 13, 2016

@aronasorman aronasorman self-assigned this Jul 13, 2016


aronasorman commented Jul 13, 2016

No concerns. Merging.

@aronasorman aronasorman merged commit dc1ecf1 into learningequality:0.16.x Jul 13, 2016

1 check failed

ci/circleci Your tests failed on CircleCI

@aronasorman aronasorman deleted the jamalex:content_annotation_speedup branch Jul 13, 2016

@aronasorman aronasorman removed the has PR label Jul 13, 2016
