Optimizations to annotate_content_models for a 10x speedup #5201

Merged
merged 2 commits into learningequality:0.16.x from jamalex:content_annotation_speedup on Jul 13, 2016

2 participants
@jamalex
Member

jamalex commented Jul 11, 2016

This is to ameliorate the slowness identified in #5200.

On my laptop, with the full set of videos:

  • Old code: 1 hour 8 minutes
  • New code: 6 minutes 24 seconds

So the speedup is roughly 10x (68 min vs. 6.4 min, about 10.6x).

The main sources of the speedup:

  • .annotate was very slow in peewee; now we use list comprehensions instead
  • shared ancestors were recalculated many times (the recursion was not deduplicated); now we queue nodes up and process them level by level, updating each node at most once
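
To illustrate the second point, here is a minimal sketch of a level-by-level update that touches each node once. It is not the code from this PR: the function name `annotate_availability`, the `parent_of` / `leaf_available` dictionaries, and the rule "a topic is available if any child is available" are all assumptions made for the example.

```python
from collections import defaultdict


def annotate_availability(parent_of, leaf_available):
    """Return {node: bool}, computing each node exactly once.

    parent_of:      {node_id: parent_id or None}  (hypothetical layout)
    leaf_available: {leaf_id: bool}               (hypothetical layout)
    """
    # Build a parent -> children index once, up front.
    children = defaultdict(list)
    for node, parent in parent_of.items():
        if parent is not None:
            children[parent].append(node)

    # Group nodes into levels, starting from the roots and walking down.
    levels = []
    frontier = [n for n, p in parent_of.items() if p is None]
    while frontier:
        levels.append(frontier)
        frontier = [c for n in frontier for c in children[n]]

    # Process the deepest level first, so every child is finished
    # before its parent is computed. No ancestor is recomputed.
    available = {}
    for level in reversed(levels):
        for node in level:
            kids = children[node]
            if kids:
                available[node] = any(available[k] for k in kids)
            else:
                available[node] = leaf_available.get(node, False)
    return available


tree = {
    "root": None,
    "math": "root",
    "science": "root",
    "v1": "math",
    "v2": "math",
    "v3": "science",
}
result = annotate_availability(tree, {"v1": True, "v2": False, "v3": False})
```

With the naive approach, each leaf would walk up and recompute `math` and `root` separately; here `root` is computed a single time, after both of its children are done.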

@jamalex jamalex added the has PR label Jul 11, 2016

@aronasorman aronasorman added this to the 0.16.7 milestone Jul 13, 2016

@aronasorman aronasorman self-assigned this Jul 13, 2016

Member

aronasorman commented Jul 13, 2016

No concerns. Merging.

@aronasorman aronasorman merged commit dc1ecf1 into learningequality:0.16.x Jul 13, 2016

1 check failed

ci/circleci Your tests failed on CircleCI

@aronasorman aronasorman deleted the jamalex:content_annotation_speedup branch Jul 13, 2016

@aronasorman aronasorman removed the has PR label Jul 13, 2016
