satellite/overlay: avoid large statement for piece counts #3001

Merged (6 commits) on Sep 11, 2019

Conversation

egonelbre
Member

@egonelbre egonelbre commented Sep 11, 2019

What: Use array arguments instead of constructing a large single query.

Why: Concatenated SQL statements are harder to audit and can often end up slower.

Benchmark                                      old ns/op      new ns/op     delta
BenchmarkDB_PieceCounts/Sqlite/Update-32       2403696100     136937262     -94.30%
BenchmarkDB_PieceCounts/Sqlite/All-32          16454915       17416226      +5.84%
BenchmarkDB_PieceCounts/Postgres/Update-32     562646350      149697062     -73.39%
BenchmarkDB_PieceCounts/Postgres/All-32        13657257       14630751      +7.13%

PS: Each commit is separately reviewable.

Please describe the tests:

  • Test 1:
  • Test 2:

Please describe the performance impact:

Code Review Checklist (to be filled out by reviewer)

  • Does the PR describe what changes are being made?
  • Does the PR describe why the changes are being made?
  • Does the code follow our style guide?
  • Does the code follow our testing guide?
  • Is the PR appropriately sized? (If it could be broken into smaller PRs it should be)
  • Does the new code have enough tests? (every PR should have tests or justification otherwise. Bug-fix PRs especially)
  • Does the new code have enough documentation that answers "how do I use it?" and "what does it do?"? (both source documentation and higher level, diagrams?)
  • Does any documentation need updating?
  • Do the database access patterns make sense?

@cla-bot cla-bot bot added the cla-signed label Sep 11, 2019
@egonelbre egonelbre marked this pull request as ready for review September 11, 2019 09:13
@egonelbre egonelbre requested review from a team and stefanbenten September 11, 2019 09:13
@ghost ghost requested review from kaloyan-raev and navillasa and removed request for a team September 11, 2019 09:14
@egonelbre egonelbre added the "Request Code Review" (Code review requested) and "Reviewer Can Merge" (If all checks have passed, non-owner can merge PR) labels Sep 11, 2019
@egonelbre egonelbre changed the title from "satellite/overlay: avoid single large statement" to "satellite/overlay: avoid single large statement for piece counts" Sep 11, 2019
@egonelbre egonelbre changed the title from "satellite/overlay: avoid single large statement for piece counts" to "satellite/overlay: avoid large statement for piece counts" Sep 11, 2019
Contributor

@stefanbenten stefanbenten left a comment

I am also not sure why this is written directly into the nodes table, rather than being cached and written to the table together with the other metrics.

initialCounts, err := overlaydb.AllPieceCounts(ctx)
require.NoError(t, err)
require.Empty(t, initialCounts)
// TODO: make it actually return everything

Contributor

Is it fine to leave the TODO here and remove the code below?
Or would it be better to fix it directly? 👍

Member Author

I didn't want to fix it directly at this moment, otherwise the PR would grow too large. But I certainly can remove it.

for _, count := range counts {
_, err := tx.Tx.ExecContext(ctx, query, count.Count, count.ID)
if err != nil {
return Error.Wrap(err)

Contributor

Wouldn't this mean that if a single update fails, we return out of the entire function and discard the other updates?
And the piece count for some nodes could already be updated by then, as I don't see that we roll back in an error case?

Contributor

I am not entirely up to speed on how critical this number is.

Member Author

> Wouldn't this mean that if a single update fails, we return out of the entire function and discard the other updates?

Yes, but this would also mean that something is already critically wrong: either we ended up creating piece count updates for storage nodes that aren't in our table, or somebody deleted the node from the table. Either would be problematic.

There could, of course, also be an issue with the database.

That said, we don't care that much here, since this is for SQLite, which is mainly used for testing.

> And the piece count for some nodes could already be updated by then, as I don't see that we roll back in an error case?

When WithTx returns an error, it will roll back the transaction.

Contributor

Thanks, that should be ok then 👍

@egonelbre
Member Author

@stefanbenten this is not for metrics; this number is used for calculating the sizes of the garbage collection bloom filters.

_, err = cache.db.DB.ExecContext(ctx, cache.db.Rebind(sqlQuery), args...)
return err
}
sort.Slice(counts, func(i, k int) bool {

Contributor

Just curious: why not use sort.SliceStable here?

Member Author

Because the node ids are guaranteed to be unique, so there wouldn’t be a difference in the result.

Collaborator

@ethanadams ethanadams left a comment

LGTM

@egonelbre egonelbre merged commit 3d410ad into master Sep 11, 2019
@egonelbre egonelbre deleted the ee/piece-counts branch September 11, 2019 21:38