Skip to content

Fix data_table sort performance#14262

Merged
bryevdv merged 4 commits into
branch-3.7from
bv/14261-table-sort-performance
Jan 29, 2025
Merged

Fix data_table sort performance#14262
bryevdv merged 4 commits into
branch-3.7from
bv/14261-table-sort-performance

Conversation

@bryevdv

@bryevdv bryevdv commented Jan 29, 2025

Copy link
Copy Markdown
Member

This PR removes unnecessary array scans in a tight inner loop that introduced quadratic runtime behavior for table sorts.

Here is a chrome profile showing that a sort on the OP code with N=500000 now takes ~1s.

Screenshot 2025-01-28 at 21 11 18

I do actually wonder if getRecords could be improved, and also wonder why sort is called so many times for a single click. But I don't think it is worth investing much more effort until/unless we do an actual top-to-bottom evaluation of our slickgrid usage wrt to current best practices.

Update

OK argsort is not quite the right solution here in general since in the presence of CDS views the indices may not be contiguous. I am beginning to see why I did things the way I did in 2017, even though I was surely aware of argsort at the time. However, I think constructing a manual reverse lookup will do just as well. @mattpap if there's any better typescript way to write this please let me know. I have gone ahead and left the argsort util since it is tiny and has tests and may be useful somewhere.

Comment thread bokehjs/src/lib/core/util/array.ts
Comment thread bokehjs/src/lib/core/util/array.ts
Comment thread bokehjs/src/lib/models/widgets/tables/data_table.ts Outdated
Co-authored-by: Mateusz Paprocki <mattpap@gmail.com>
@hageldave

Copy link
Copy Markdown

I'm not aware of the bokeh internals and where the argsort function is used, so this question may be irrelevant. But argsort only considers number arrays currently. Do you also allow sorting table columns of strings? And would it then not work because of array[a] - array[b]? So a more general comparator would be needed?

@bryevdv

bryevdv commented Jan 29, 2025

Copy link
Copy Markdown
Member Author

@hageldave It's not used at all currently. I added to use in this PR before realizing an explicit reverse-lookup was needed instead, due to the possibility of CDS views. But since it is very tiny and already includes tests, I decided to leave it in case its useful in the future to have a numeric argsort. If we were ever to need an argsort for more types we would need to generalize it, but no need to bother until a need is demonstrated.

@bryevdv bryevdv merged commit 7bc4140 into branch-3.7 Jan 29, 2025
@bryevdv bryevdv deleted the bv/14261-table-sort-performance branch January 29, 2025 16:49
bryevdv added a commit that referenced this pull request Jan 29, 2025
* add argsort function to BokehJS

* use argsort on table index once for comparator

* construct explicit lookup to accommodate views

* Apply suggestions from code review

Co-authored-by: Mateusz Paprocki <mattpap@gmail.com>

---------

Co-authored-by: Mateusz Paprocki <mattpap@gmail.com>
@bryevdv bryevdv mentioned this pull request Jan 30, 2025
21 tasks
@bryevdv bryevdv modified the milestones: 3.7, 3.6.3 Jan 30, 2025
bryevdv added a commit that referenced this pull request Feb 2, 2025
* add argsort function to BokehJS

* use argsort on table index once for comparator

* construct explicit lookup to accommodate views

* Apply suggestions from code review

Co-authored-by: Mateusz Paprocki <mattpap@gmail.com>

---------

Co-authored-by: Mateusz Paprocki <mattpap@gmail.com>
@github-actions

Copy link
Copy Markdown

This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions Bot locked as resolved and limited conversation to collaborators May 14, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

DataTable column sorting unacceptably slow

3 participants