Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

avoid DataLayer insert ancestors precheck #15728

Merged
merged 6 commits into from
Jul 17, 2023

Conversation

altendky
Copy link
Contributor

@altendky altendky commented Jul 10, 2023

Purpose:

The expectation is that we will not trigger collisions usually so the except block won't be triggered and we'll save an sql query.

extracted from #15613

Current Behavior:

New Behavior:

Testing Notes:

@altendky altendky added the Changed Required label for PR that categorizes merge commit message as "Changed" for changelog label Jul 10, 2023
@altendky altendky requested a review from a team as a code owner July 10, 2023 16:57
altendky added a commit that referenced this pull request Jul 10, 2023
@almogdepaz
Copy link
Contributor

almogdepaz commented Jul 17, 2023

now we always write and we read if this is a known row
previously we used to always read and write only if its a new row

it looks to me like this change has performance tradeoffs, whats the average expected case ?

@altendky
Copy link
Contributor Author

Yes, this change is intended to boost performance by reducing the number of queries. I can't say that a detailed analysis has been done but evidence so far is that we tend to insert new rows, not re-insert the exact same info, so cutting the query count in half (for this function) when doing that expected normal case is an improvement.

The check is just in case we end up inserting the same keys (SQLite checked) with different values in other columns. It is a double check for errors elsewhere, but I wanted to get the optimization in place without working through whether the double check was critical or important or slightly useful or.... We think this does that.

almogdepaz
almogdepaz previously approved these changes Jul 17, 2023
emlowe
emlowe previously approved these changes Jul 17, 2023
@github-actions github-actions bot added merge_conflict Branch has conflicts that prevent merge to main labels Jul 17, 2023
@github-actions
Copy link
Contributor

This pull request has conflicts, please resolve those before we can evaluate the pull request.

1 similar comment
@github-actions
Copy link
Contributor

This pull request has conflicts, please resolve those before we can evaluate the pull request.

@altendky altendky dismissed stale reviews from emlowe and almogdepaz via c472436 July 17, 2023 18:21
@github-actions github-actions bot removed the merge_conflict Branch has conflicts that prevent merge to main label Jul 17, 2023
@github-actions
Copy link
Contributor

Conflicts have been resolved. A maintainer will review the pull request shortly.

@wallentx wallentx merged commit aa3e156 into main Jul 17, 2023
196 checks passed
@wallentx wallentx deleted the avoid_datalayer_ancestor_insert_precheck branch July 17, 2023 20:37
wallentx pushed a commit that referenced this pull request Jul 19, 2023
* Cache root changes in insert_batch loop

* Reduce selects for root and generation

* remove sql logging, extracted to #15690

* remove optimization extracted to #15691

* remove optimization extracted to #15728

* Delete test_batch_speed.py

benchmark provided in another already merged pr

* add `Root.to_row()`

* shift optimization to `DataStore.get_keys_values_dict()`

* rename to `intermediate_root`

* less tuple returning

* Revert "shift optimization to `DataStore.get_keys_values_dict()`"

This reverts commit b12ae08.

* add `test_data_store.test_insert_batch_reference_and_side()`

---------

Co-authored-by: Kyle Altendorf <sda@fstab.net>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Changed Required label for PR that categorizes merge commit message as "Changed" for changelog
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants