dockets aren't indexed for date_modified #1276

Closed
saizai opened this issue May 14, 2020 · 2 comments
@saizai
saizai commented May 14, 2020

E.g. https://www.courtlistener.com/api/rest/v3/dockets/?date_modified__gte=2014-10-30T07%3A30%3A44.561840Z&fields%21=resource_uri&order_by=date_modified%2Cid&page=9 responds at ~11 records per minute, which is slower even than the 26 rec/min I get for a high page index, e.g. https://www.courtlistener.com/api/rest/v3/dockets/?fields%21=resource_uri&order_by=date_modified%2Cid&page=9120.

Contrast e.g. clusters & fjc-integrated-database, which load quickly for the same query (approx. 300-400 rec/min, and no slower with date_modified__gte). opinions runs at about 60-75 rec/min either way (maybe slightly faster with the date_modified__gte constraint).
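For anyone wanting to reproduce figures like these, here is a minimal sketch of how a records-per-minute number could be measured. It assumes the paginated response is JSON with `results` and `next` keys; `records_per_minute` and `fetch_page` are hypothetical names for illustration, not part of the CourtListener API.

```python
import time
from typing import Callable, Optional


def records_per_minute(
    fetch_page: Callable[[str], dict],
    url: str,
    max_pages: int = 3,
    clock: Callable[[], float] = time.monotonic,
) -> float:
    """Follow `next` links for up to max_pages pages and return records/minute.

    fetch_page is any callable that takes a URL and returns the decoded JSON
    page (assumed to carry `results` and `next` keys).
    """
    records = 0
    pages = 0
    next_url: Optional[str] = url
    start = clock()
    while next_url and pages < max_pages:
        page = fetch_page(next_url)
        records += len(page.get("results", []))
        next_url = page.get("next")
        pages += 1
    elapsed = clock() - start
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive to compute a rate")
    return records * 60.0 / elapsed
```

The fetcher is passed in so the timing logic can be checked without hitting the network; in practice it could be something like `lambda u: json.load(urllib.request.urlopen(u))`, plus whatever authentication the API requires.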

I think it would also help performance to increase the page size. One query returning 100 records is probably less of a hit than five queries returning 20 each, given the high per-query search penalty. I'd suggest the max page size be based on the payload size & RAM, i.e. large enough to amortize the search cost, but not so large that the server runs out of memory or the payload gets cut off due to network I/O loss. Not sure what exactly that number would be, but pretty sure it's not as low as 20.
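To make the page-size argument concrete, here is a rough cost model, assuming each request pays a fixed search overhead plus a small per-record cost. The function names and the numbers in the example are made up for illustration; they are not measurements from the API.

```python
import math


def requests_needed(total_records: int, page_size: int) -> int:
    """Number of paginated requests required to fetch total_records."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    return math.ceil(total_records / page_size)


def estimated_seconds(
    total_records: int,
    page_size: int,
    per_query_overhead_s: float,
    per_record_s: float,
) -> float:
    """Fixed overhead per request plus a per-record serialization cost."""
    n_requests = requests_needed(total_records, page_size)
    return n_requests * per_query_overhead_s + total_records * per_record_s


# Illustrative numbers only: 5 s of search overhead per request,
# 0.25 s per record returned.
small_pages = estimated_seconds(100, 20, 5.0, 0.25)   # 5 requests
large_page = estimated_seconds(100, 100, 5.0, 0.25)   # 1 request
```

Under this model the per-record cost is the same either way, so the whole difference comes from paying the per-request overhead five times instead of once.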

@mlissner
Member

I don't think we can change the page size without doing a major new version of the API (though we could add a parameter for it, somehow, I suppose).

Performance of date_modified probably has more to do with the size of the table than with the absence of an index:

https://github.com/freelawproject/courtlistener/blob/master/cl/search/models.py#L376-L383

Not sure where that leaves us though.
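One way to check whether a filter like this uses an index is to look at the query plan. The sketch below uses an in-memory SQLite table as a stand-in (the production database is not SQLite, and the `docket` table, `idx_docket_date_modified` index and `query_plan` helper are hypothetical names for illustration), so it shows the technique rather than the real schema.

```python
import sqlite3


def query_plan(conn: sqlite3.Connection, sql: str, params: tuple = ()) -> list:
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE docket (id INTEGER PRIMARY KEY, date_modified TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO docket (date_modified) VALUES (?)",
    [(f"2014-10-{day:02d}T00:00:00Z",) for day in range(1, 31)],
)

# Same shape as the API query: filter, order by (date_modified, id), page 9.
sql = (
    "SELECT id FROM docket WHERE date_modified >= ? "
    "ORDER BY date_modified, id LIMIT 20 OFFSET 160"
)
params = ("2014-10-30T07:30:44Z",)

plan_without_index = query_plan(conn, sql, params)

conn.execute("CREATE INDEX idx_docket_date_modified ON docket (date_modified)")
plan_with_index = query_plan(conn, sql, params)

for line in plan_with_index:
    print(line)
```

Even when the plan does use the index, an OFFSET-based page still has to step past every skipped row, so deep pages on a very large table can stay slow regardless.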

@mlissner
Member

mlissner commented Jul 7, 2022

This is just an issue with the size of the table.

mlissner closed this as not planned Jul 7, 2022