-
Notifications
You must be signed in to change notification settings - Fork 31
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
get_entries() query too complicated #123
Comments
Alternative: Don't "build" the query; always shove everything into it and disable various bits through boolean parameters, and let the optimizer sort it out. Cons:
|
Summarizing the possibilities a bit:
Whatever solution we use, it needs to support:
For now, we'll do nothing; we can re-assess after we implement full text search, use indexes, and maybe implement random sort. |
Version 2 of the query builder, with support for WITH, JOIN, and UNION, and enforcing the correct keyword order. ~110 lines base, ~140 lines with extensions (no type annotations). There are full examples of the get_entries() and search_entries() queries. |
Counts for _search.py + _storage.py and _sql_utils.py (the query builder, for after):
It's not much worse. There's less duplication between get_entries() and search_entries(), and it should now be easier to paginate most of the get_...() queries. Also, it should now be much easier to tweak the queries. fb38d91 is a good example of this (the search_entries_read is ~5x faster because 1/5 of the entries are read). Before: $ python -u bench.py time 'search_entries_*' -n1 -r100 \
> | grep -e stat -e min -e p50 | grep -v -e ' 32' -e ' 64'
stat number repeat num_entries search_entries_all search_entries_read
min 1 100 128 0.008 0.003
p50 1 100 128 0.008 0.003
min 1 100 256 0.014 0.005
p50 1 100 256 0.015 0.005
min 1 100 512 0.030 0.011
p50 1 100 512 0.030 0.011
min 1 100 1024 0.062 0.024
p50 1 100 1024 0.062 0.024
min 1 100 2048 0.168 0.046
p50 1 100 2048 0.168 0.046 After: $ python -u bench.py time 'search_entries_*' -n1 -r100 \
> | grep -e stat -e min -e p50 | grep -v -e ' 32' -e ' 64'
stat number repeat num_entries search_entries_all search_entries_read
min 1 100 128 0.009 0.001
p50 1 100 128 0.009 0.001
min 1 100 256 0.016 0.001
p50 1 100 256 0.016 0.001
min 1 100 512 0.031 0.002
p50 1 100 512 0.032 0.002
min 1 100 1024 0.064 0.004
p50 1 100 1024 0.064 0.004
min 1 100 2048 0.171 0.007
p50 1 100 2048 0.172 0.007 |
reader/src/reader/core/storage.py
Line 511 in be336e0
The text was updated successfully, but these errors were encountered: