[PROPOSAL] Count performance on index page #58

Closed
areski opened this issue May 28, 2020 · 2 comments

Collaborator

areski commented May 28, 2020

In relation to #57, I wanted to share a few other points to keep track of for future improvements.

I have tested Kaffy with 10 million records.
In my use case I will often have some big tables. In those cases the count query can become a huge pain in the neck, and unfortunately it is unavoidable.

Here is one good hack I use with django-admin:
https://stackoverflow.com/questions/39851915/faster-django-admin-paginator-cannot-get-this-django-snippet-to-work#39852663
This gives you an estimated count, which is very cheap although not exact. It will not work with filters, of course, but most of the time the admin opens an index page and starts from there, so improving that first view would be a big win.
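To make the idea concrete, here is a minimal sketch of an estimated count, assuming PostgreSQL and a DB-API cursor such as psycopg2's. It reads the planner statistic `reltuples` from `pg_class` instead of running `count(*)`. The names `estimated_row_count`, `fast_count`, and the `threshold` default are hypothetical and are not taken from the linked snippet or from Kaffy.

```python
# Sketch only: assumes PostgreSQL and a DB-API cursor that uses %s placeholders.
# reltuples is a planner statistic refreshed by VACUUM/ANALYZE, so it is approximate.
ESTIMATE_SQL = "SELECT reltuples::bigint FROM pg_class WHERE oid = %s::regclass"


def estimated_row_count(cursor, table_name):
    """Return the planner's row estimate for table_name, or None if unavailable."""
    cursor.execute(ESTIMATE_SQL, (table_name,))
    row = cursor.fetchone()
    if row is None or row[0] is None:
        return None
    return int(row[0])


def fast_count(cursor, table_name, exact_count, threshold=100_000):
    """Return (count, is_estimate).

    exact_count is a zero-argument callable that runs the real count(*) query.
    The estimate is used only when it suggests a large table; small tables and
    tables without usable statistics fall back to the exact count.
    """
    estimate = estimated_row_count(cursor, table_name)
    if estimate is not None and estimate >= threshold:
        return estimate, True
    return exact_count(), False
```

The fallback matters because a table that has never been analyzed can report a zero or negative estimate, in which case only the exact count is trustworthy.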

It would be great if we could override the paginator with our own, or have an option to select a different one for some views. It doesn't make sense for all tables, so it is definitely something you would want to use only when tables are huge.

For more information about count performance, here is a good read:
https://www.citusdata.com/blog/2016/10/12/count-performance/

@areski areski added the enhancement New feature or request label May 28, 2020
@aesmail aesmail added this to the v0.8.0 milestone May 28, 2020
Owner

aesmail commented Jun 1, 2020

I have used your original PR and slightly modified it to use a built-in cache based on ETS and GenServers. The result is:

  • The count will be cached if the table has more than 100,000 records.
  • Filtered results (using search and column filters) will not be cached, so the select count(*) query will be performed. Only the total number of records will be cached. We might add custom result caching later.
  • The cached result count will expire after 10 minutes by default. After that, the count will be calculated again, and cached again if it is more than 100,000 records.
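
The three rules above can be sketched as follows. This is an illustration of the caching logic only, written in Python for brevity; it is not Kaffy's implementation, which uses ETS and GenServers in Elixir. The class name `CountCache`, its parameters, and the `filtered` flag are hypothetical.

```python
import time


class CountCache:
    """Cache unfiltered total counts for large tables, with a time-to-live."""

    def __init__(self, threshold=100_000, ttl_seconds=600, clock=time.monotonic):
        self.threshold = threshold      # cache only counts above this value
        self.ttl_seconds = ttl_seconds  # 10 minutes by default
        self._clock = clock             # injectable so expiry can be tested
        self._entries = {}              # table name -> (count, expires_at)

    def count(self, table, run_count, filtered=False):
        """Return the row count, calling run_count() only when needed.

        run_count is a zero-argument callable that performs the real
        select count(*) query for this request.
        """
        if filtered:
            # Search and column filters always hit the database.
            return run_count()

        now = self._clock()
        entry = self._entries.get(table)
        if entry is not None and entry[1] > now:
            return entry[0]

        count = run_count()
        if count > self.threshold:
            self._entries[table] = (count, now + self.ttl_seconds)
        else:
            # Small tables are cheap to count, so nothing is cached.
            self._entries.pop(table, None)
        return count
```

With this shape, the first visit to a large table's index page pays for one full count, and later unfiltered visits within the next 10 minutes reuse it.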

I tested this quickly and it seems to work. Let me know if you find any weird behavior.

@aesmail aesmail closed this as completed Jun 1, 2020
Collaborator Author

areski commented Jun 1, 2020

Works like a charm, thanks!
