Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better query plan viewing experience #4285

Merged
merged 40 commits into from
Jan 24, 2024

Conversation

dullbananas
Copy link
Collaborator

  • Write auto_explain output to log file when running tests
  • Don't delete database after running tests so the log can still be seen
  • Make some insertions batched so the log is small enough (more insertions should be changed later to improve performance)
  • Add SQL comments to some queries so they are labeled and easily searchable in the log
  • Disable auto_explain for migrations

@Nutomic
Copy link
Member

Nutomic commented Dec 18, 2023

Good idea! Query batching would also be useful for the sent_activity table`. All federation activities are going through there now, with each activity being written, then read one by one. So it would be good to collect a few seconds of activities and handle them together.

Edit: That would probably solve #4282

@dullbananas
Copy link
Collaborator Author

Instead of showing query plans when running tests, I will add a separate bin crate that inserts a huge amount of stuff and does 1 query specified by cli arg, with auto_explain disabled during insertions.

Small amounts of rows result in query plans that would be inefficient for realistic amounts.

I might keep SQL comments because they can be useful when auto_explain is used in production.

Batched insertions will be kept so they don't take an extremely long time to run.

@dullbananas
Copy link
Collaborator Author

Query batching would also be useful for the sent_activity table`. All federation activities are going through there now, with each activity being written, then read one by one. So it would be good to collect a few seconds of activities and handle them together.

This should not be done with too many rows at a time because one time an error said something like "too large to send to database" when i tried a huge batch insert

@dullbananas dullbananas marked this pull request as ready for review December 21, 2023 05:53
@dullbananas
Copy link
Collaborator Author

Things other than PostView can be added later

Copy link
Collaborator

@phiresky phiresky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The language batch changes are great because they should also reduce the duration the connection is allocated and not available to the pool.

The diesel comment addition is very interesting. Looks like it shouldn't negatively affect anything except for the complexity and maybe the danger of there being a bug in the ast construction for some edge cases?

The post_aggregates_post trigger change I don't fully understand. I'm not sure when the FOR EACH STATEMENT change is relevant since aren't posts always inserted as one post per individual statement anyways? I also see that the change means you had to add more triggers which isn't that great maybe?

@dullbananas
Copy link
Collaborator Author

I'm not sure when the FOR EACH STATEMENT change is relevant since aren't posts always inserted as one post per individual statement anyways?

The db_perf program sets up the posts table way too slowly with FOR EACH ROW triggers

.collect::<Vec<_>>();

insert_into(local_user_language)
.values(forms)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, this should be much better.

scripts/db_perf.sh Show resolved Hide resolved
SelectableExpression,
};

#[derive(QueryId)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is all this for, and what is it doing?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added documentation on ValuesFromSeries to explain it

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move this into db_perf crate as its only used there.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

}

Ok(())
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont see the purpose of this wrapper, Rust should handle errors from main just fine.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also can you run this code from CI with low number of posts and communities to make sure it doesnt break in the future?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The custom error handling makes it possible to show the log path when an error happens. I now changed it to be cleaner and return the correct exit code.

I also made it run in CI.

Copy link
Member

@dessalines dessalines left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lots of great additions here. @phiresky take a look when you get a chance.

@dessalines dessalines merged commit 759f6d8 into LemmyNet:main Jan 24, 2024
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants