-
Notifications
You must be signed in to change notification settings - Fork 79
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Poor performance when reading as_of a date with many early versions deleted #1384
Labels
bug
Something isn't working
Comments
This will be a failure to short-circuit on fast-tombstone all keys, as the logic is a bit more complex than when searching by exact version number. |
IvoDD
added a commit
that referenced
this issue
Mar 20, 2024
- We fail to exit early when following the version chain with both LOAD_DOWNTO and LOAD_FROM_TIME. Adds a cpp test to show that - Fix will be done in follow up commits
IvoDD
added a commit
that referenced
this issue
Mar 20, 2024
- Adds "_UNDELETED" suffix to LOAD_DOWNTO and LOAD_FROM_TIME to make it obvious that they should only work on undeleted versions - No behavior change, just plain rename
IvoDD
added a commit
that referenced
this issue
Mar 20, 2024
LOAD_DOWNTO - previously we did the check for tombstone_all only on LOAD_UNDELETED and LOAD_LATEST_UNDELETED - Renamed the functions in version_utils which we use to check when to stop following the version chain to less confusing names - Google test VersionMap.FollowingVersionChainEndEarly now passes
5 tasks
IvoDD
added a commit
that referenced
this issue
Mar 22, 2024
LOAD_DOWNTO WORK IN PROGRESS - previously we did the check for tombstone_all only on LOAD_UNDELETED and LOAD_LATEST_UNDELETED - Renamed the functions in version_utils which we use to check when to stop following the version chain to less confusing names - Google test VersionMap.FollowingVersionChainEndEarly now passes
IvoDD
added a commit
that referenced
this issue
Mar 26, 2024
- We fail to exit early when following the version chain with both LOAD_DOWNTO and LOAD_FROM_TIME. Adds a cpp test to show that - Fix will be done in follow up commits
IvoDD
added a commit
that referenced
this issue
Apr 16, 2024
- We fail to exit early when following the version chain with both LOAD_DOWNTO and LOAD_FROM_TIME. Adds a cpp test to show that - Fix will be done in follow up commits
IvoDD
added a commit
that referenced
this issue
May 21, 2024
- We fail to exit early when following the version chain with both LOAD_DOWNTO and LOAD_FROM_TIME. Adds a cpp test to show that - Fix will be done in follow up commits
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Describe the bug
There is a feature called tombstone all that is supposed to prevent version search having to walk the entire historical version list when the early versions have all been deleted.
It works for as_of = a version number (it will return quickly not having found the version)
However when as_of = a date is used it can be slow. This is much more apparent using AWS where the latency is higher.
When there are thousands of versions the read can take several minutes.
Steps/Code to Reproduce
import arcticdb as adb
import pandas as pd
import numpy as np
from datetime import datetime, timedelta
arctic = adb.Arctic()
lib = arctic.get_library('adb_bugs', create_if_missing=True)
N = 3
df = pd.DataFrame(
index=pd.date_range("20240101", periods=N),
data={'col': np.arange(0., N)}
)
write 500 versions
sym1 = 'asof_slow_read'
for i in range(500):
lib.write(sym1, df)
remove early versions
lib.delete(sym1)
add one more version
lib.write(sym1, df)
this is slow (12s in my test)
as_of = datetime.now() - timedelta(days=1)
lib.read(sym1, as_of=as_of)
this is fast (171ms in my test)
lib.read(sym1, as_of=499)
Expected Results
Results are as expected. This is a performance issue.
OS, Python Version and ArcticDB Version
Python 3.10
Linux Linux version 5.15.133.1-microsoft-standard-WSL2
arcticdb 4.3.1
Backend storage used
AWS S3
Additional Context
This is possibly related to this issue (failure to observe tombstone correctly). It may be easier to solve the two together
#1385
The text was updated successfully, but these errors were encountered: