Scan keys less often on ingest #1868

Merged
merged 7 commits into from
Feb 22, 2024

@lferran lferran commented Feb 22, 2024

Description

Scanning keys to get field ids is no longer needed (a migration was run a long time ago), since we now materialize the list of field ids in a separate key in the KV store.

This will reduce the number of scan_keys calls that we make to tikv, which doesn't seem to handle them very well.
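
To illustrate the difference between the two approaches, here is a minimal sketch using an in-memory stand-in for the KV store. Everything in it (`ToyKV`, the key layout, `add_field`, `field_ids_by_scan`, `field_ids_from_key`) is hypothetical and only meant to show the idea; it is not the actual nucliadb code or its key schema.

```python
import json


class ToyKV:
    """In-memory stand-in for a KV store; counts scans and point reads."""

    def __init__(self):
        self.data = {}
        self.scans = 0
        self.gets = 0

    def set(self, key, value):
        self.data[key] = value

    def get(self, key):
        self.gets += 1
        return self.data.get(key)

    def scan_keys(self, prefix):
        self.scans += 1
        return sorted(k for k in self.data if k.startswith(prefix))


def add_field(kv, rid, field_type, field_id):
    # Hypothetical key layout, for illustration only.
    kv.set(f"/r/{rid}/f/{field_type}/{field_id}", b"payload")
    # Keep the materialized list of field ids up to date on every write.
    raw = kv.data.get(f"/r/{rid}/allfields")
    ids = json.loads(raw) if raw else []
    if [field_type, field_id] not in ids:
        ids.append([field_type, field_id])
    kv.set(f"/r/{rid}/allfields", json.dumps(ids))


def field_ids_by_scan(kv, rid):
    # Scan-based approach: derive the ids from a prefix scan over field keys.
    prefix = f"/r/{rid}/f/"
    return [tuple(k[len(prefix):].split("/")) for k in kv.scan_keys(prefix)]


def field_ids_from_key(kv, rid):
    # Materialized approach: a single point read of the stored list.
    raw = kv.get(f"/r/{rid}/allfields")
    return [tuple(pair) for pair in json.loads(raw)] if raw else []
```

In this sketch both functions return the same set of ids, but the second one replaces a prefix scan with a single point read, which is the kind of saving this PR is after.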

How was this PR tested?

Existing tests should cover this. I also tested against stage to check that every field ids key corresponds to the valid list of fields.

codecov bot commented Feb 22, 2024

Codecov Report

Attention: 1 line in your changes is missing coverage. Please review.

Comparison is base (757469a) 84.34% compared to head (be33baa) 84.30%.

| Files | Patch % | Lines |
|---|---|---|
| nucliadb/nucliadb/ingest/orm/resource.py | 92.85% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1868      +/-   ##
==========================================
- Coverage   84.34%   84.30%   -0.04%     
==========================================
  Files         328      328              
  Lines       18747    18747              
==========================================
- Hits        15812    15805       -7     
- Misses       2935     2942       +7     
| Flag | Coverage Δ |
|---|---|
| ingest | 69.46% <92.85%> (-0.29%) ⬇️ |
| sdk | 87.85% <ø> (ø) |
| utils | 81.81% <ø> (ø) |


lferran commented Feb 22, 2024

[sc-8994]

This pull request has been linked to Shortcut Story #8994: reduce tikv scans.

@lferran lferran requested a review from a team February 22, 2024 10:55
# Get all fields
if len(self.all_fields_keys) == 0 or force:
@lferran lferran Feb 22, 2024

This was a bug: resources that had no fields would fetch the field ids list every time, no matter what.
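
A minimal sketch of one way to avoid this kind of refetching, assuming the cache uses `None` to mean "not fetched yet" so that an empty list can mean "fetched, no fields". `ToyResource` and `_fetch_ids` are hypothetical names for illustration; the actual change in this PR may differ.

```python
import asyncio
from typing import Optional


class ToyResource:
    """Hypothetical stand-in; not the actual Resource class from this PR."""

    def __init__(self, stored_ids):
        self._stored_ids = stored_ids
        self.fetches = 0
        # None means "not fetched yet"; [] means "fetched, no fields".
        self.all_fields_keys: Optional[list] = None

    async def _fetch_ids(self):
        self.fetches += 1  # stands in for a round trip to the KV store
        return list(self._stored_ids)

    async def get_fields_ids(self, force: bool = False):
        # Checking for None (rather than an empty list) lets an empty
        # result be cached like any other.
        if self.all_fields_keys is None or force:
            self.all_fields_keys = await self._fetch_ids()
        return self.all_fields_keys
```

With a `len(...) == 0` check instead, a resource with no fields would go back to the store on every call, since its cached list is always empty.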

Comment on lines -647 to -651
# backward compatibility if all fields key is not set
all_fields = PBAllFieldIDs()
async for (field_type, field_id) in self._scan_fields_ids():
    result.append((field_type, field_id))
    all_fields.fields.append(FieldID(field_type=field_type, field=field_id))
Contributor Author

This backward-compatible code can be removed, as the migration ran in prod a long time ago.
I've checked in stage, and every field ids key matches the correct number of fields.
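
For reference, a check of this kind could look roughly like the sketch below. It assumes the scanned and materialized field ids have already been collected into plain dicts keyed by resource id; `find_mismatches` and both inputs are hypothetical and not the script that was actually run against stage.

```python
def find_mismatches(scanned, materialized):
    """Return resource ids whose materialized field ids differ from the scanned ones.

    Both arguments map a resource id to a list of (field_type, field_id) tuples.
    """
    mismatched = []
    for rid in sorted(set(scanned) | set(materialized)):
        # A resource missing from either side is treated as having no fields.
        if set(scanned.get(rid, [])) != set(materialized.get(rid, [])):
            mismatched.append(rid)
    return mismatched
```

An empty result would mean every materialized list agrees with what a key scan finds, which is the condition that makes removing the fallback safe.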

@lferran lferran merged commit bce99c6 into main Feb 22, 2024
83 checks passed
@lferran lferran deleted the dont-scan-keys-for-field-ids branch February 22, 2024 16:07