Performance optimizations for alignments tracks, particularly those with many short reads #2523
Codecov Report
@@ Coverage Diff @@
## main #2523 +/- ##
==========================================
+ Coverage 61.09% 61.26% +0.16%
==========================================
Files 543 543
Lines 25141 25311 +170
Branches 5900 5942 +42
==========================================
+ Hits 15361 15506 +145
- Misses 9457 9482 +25
Partials 323 323
note that getting rid of the "color" config slot removes the ability to specify a jexl callback for alignment feature color. if we want to keep that, it could be restored.
this now restores the ability for the user to customize the color using a color callback. the color is magenta by default, and we can check against that default to avoid a readConfObject call on each feature.
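A minimal sketch of the default-check idea described above, not the code from this PR. The `Read` type, the `DEFAULT_COLOR` value, the built-in colors, and `evaluateColorConfig` (a stand-in for an expensive per-feature config read) are all hypothetical names chosen for illustration.

```typescript
// Hypothetical minimal read type for the example
interface Read {
  strand: number
}

type ColorSetting = string | ((read: Read) => string)

// Assumed sentinel value; the PR only says the default is "a magenta color"
const DEFAULT_COLOR = '#f0f'

// Stand-in for an expensive per-feature config evaluation (hypothetical);
// the counter lets us see how often it runs
let expensiveCalls = 0
function evaluateColorConfig(setting: ColorSetting, read: Read): string {
  expensiveCalls++
  return typeof setting === 'function' ? setting(read) : setting
}

// Cheap built-in coloring used when the user has not customized anything
// (the specific colors here are placeholders)
function builtinColor(read: Read): string {
  return read.strand === -1 ? 'blue' : 'red'
}

function colorReads(reads: Read[], setting: ColorSetting): string[] {
  // Decide once per render, not once per read
  const customized = setting !== DEFAULT_COLOR
  return reads.map(read =>
    customized ? evaluateColorConfig(setting, read) : builtinColor(read),
  )
}
```

With the default setting, the expensive path is never taken; it only runs when the user supplies their own color or callback.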
found a pretty significant performance improvement, especially for viewing many short reads, by restoring the old granular rect layout.

it was replaced with rbush in an attempt to simplify the codebase and address a bug that I thought was caused by layout, but that ended up being related to block observability.

now it seems rbush has bad algorithmic characteristics for this use: we query the data structure many times on insert, looking for a place where we can insert a rect for nice layout packing. the rbush query time is probably at least something like O(log(n)), so with up to n queries per insert that is probably O(n log(n)) for inserting a single feature, and then inserting many features is like O(n^2 log(n)).

so, revert back to granular rect layout, which has essentially O(1) query time and O(n) insert for a single feature.

I added a deep sequencing track on volvox to demonstrate. loading this with rbush layout: 70s, much time taken on layout
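To illustrate the kind of grid-based packing described above, here is a simplified sketch. It is not the actual granular rect layout from the codebase: the class name `SimpleGranularLayout`, the `pitchX` parameter, and the single-row-height assumption are all made up for the example. Each row is a set of occupied x cells, so asking whether a cell is free is a constant-time lookup, and placing one feature scans rows from the top until it fits.

```typescript
// Simplified grid-based row packing (hypothetical; single-height rects only)
class SimpleGranularLayout {
  // rows[y] holds the set of occupied x cells in row y
  private rows: Set<number>[] = []
  // remembers the row already assigned to each feature id
  private placed = new Map<string, number>()
  private pitchX: number

  constructor(pitchX = 10) {
    this.pitchX = pitchX
  }

  // Returns the row index assigned to the feature spanning [left, right)
  addRect(id: string, left: number, right: number): number {
    const existing = this.placed.get(id)
    if (existing !== undefined) {
      return existing
    }
    const startCell = Math.floor(left / this.pitchX)
    const endCell = Math.max(startCell, Math.ceil(right / this.pitchX) - 1)

    // Scan rows from the top until one has all needed cells free
    let row = 0
    while (this.collides(row, startCell, endCell)) {
      row++
    }

    let cells = this.rows[row]
    if (!cells) {
      cells = new Set<number>()
      this.rows[row] = cells
    }
    for (let x = startCell; x <= endCell; x++) {
      cells.add(x)
    }
    this.placed.set(id, row)
    return row
  }

  private collides(row: number, startCell: number, endCell: number): boolean {
    const cells = this.rows[row]
    if (!cells) {
      return false
    }
    for (let x = startCell; x <= endCell; x++) {
      if (cells.has(x)) {
        return true // a single occupied cell is enough to reject this row
      }
    }
    return false
  }
}
```

For example, with `pitchX = 10`, a rect at 0..100 lands in row 0, an overlapping rect at 50..150 is pushed to row 1, and a rect at 100..200 fits back into row 0.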
Looks good to me, merge if you feel it's ready
on production build, viewing a largish 100kb region with ~25x coverage short reads
http://localhost:3000/?config=test_data%2Fconfig_demo.json&session=share-amxdajf5sI&password=pSD6x
numbers from production build, basically same on dev
before 32s
after 25s
so, maybe about 20-25% faster on some datasets
removes the 'color' configuration variable on the reads, among other things, to avoid calling expensive functions on every read
see #969
main standouts in the performance trace now
each of these could be targeted for some additional performance improvement