refactor: remove order by clause when counting rows to improve performance #2030

Merged
merged 1 commit into from Jan 10, 2024

Conversation

shogo-nakano-desu
Contributor

@shogo-nakano-desu shogo-nakano-desu commented Dec 21, 2023

Purpose of this PR

To improve performance, this PR removes the ORDER BY clause from the query built in the num_items method of Paginator, since ordering is not needed to count rows.

Performance Measurement Results

Running queries against test data available to me, I observed roughly a threefold speedup when the ORDER BY clause was removed. (Unfortunately, I cannot share the data used here; I hope you understand.) The testing environment is PostgreSQL 14.7.

@shogo-nakano-desu shogo-nakano-desu changed the title refactor: add clear_order_by for num_items refactor: add clear_order_by for num_items to improve performance Dec 21, 2023
@shogo-nakano-desu shogo-nakano-desu changed the title refactor: add clear_order_by for num_items to improve performance refactor: remove order by clause when counting rows to improve performance Jan 3, 2024
@shogo-nakano-desu
Contributor Author

@tyt2y3
First and foremost, I would like to express my gratitude for creating and maintaining such an excellent crate. Thank you very much.

At your convenience, could you please review this PR? We are currently using sea-orm in our production environment, and the changes proposed in this PR are expected to significantly contribute to resolving performance bottlenecks. Your assistance in reviewing would be greatly appreciated. Thank you for your time and consideration.

@tyt2y3
Member

tyt2y3 commented Jan 9, 2024

Thank you for the contribution. I think this is the right thing to do. Just curious: what performance gains did you see on a typical vs. worst-case query?

@shogo-nakano-desu
Contributor Author

shogo-nakano-desu commented Jan 9, 2024

Thank you so much for your review.

Table info (Our testing environment)

  • About 15 columns
  • About 8M rows
  • The field used in the ORDER BY clause does not have an index.
    *In our production environment, there are approximately 50M rows and the number is increasing rapidly.

Result in our testing environment

Worst case

Execution time: 2740 ms -> 840 ms (about 3.3x faster)

Typical case

In production we expect to handle a significantly larger volume of data, so the worst case in the test environment, in terms of data volume, should correspond to a typical case in production. For that reason, we did not measure a typical case in the test environment.

@tyt2y3
Member

tyt2y3 commented Jan 9, 2024

Interesting, good to know!

The field used in the ORDER BY clause does not have an index.

That's usually the culprit haha

@shogo-nakano-desu
Contributor Author

I am now pushing a commit to fix a Clippy error. It will show up once GitHub's status is back to normal: https://www.githubstatus.com/


@tyt2y3
Member

tyt2y3 commented Jan 9, 2024

GitHub is back now

@shogo-nakano-desu
Contributor Author

Great, I pushed my commits. Now I think this PR will pass CI.

Comment on lines -74 to +79
-    self.query.clone().reset_limit().reset_offset().to_owned(),
+    self.query
+        .clone()
+        .reset_limit()
+        .reset_offset()
+        .clear_order_by()
+        .to_owned(),
Contributor Author

Only these lines are the main change. The others fix Clippy and compile errors.
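
The change in the diff above can be illustrated with a minimal, self-contained sketch. It does not use SeaORM or SeaQuery: SelectQuery, from_table, to_sql and count_sql are hypothetical stand-ins written for this example, and only the method names reset_limit, reset_offset and clear_order_by are taken from the diff. The COUNT(*) wrapper and its aliases are also assumptions; the SQL that SeaORM actually generates may differ.

```rust
// Toy model of a paginated SELECT. These types are hypothetical and exist
// only for this illustration; they are NOT sea-query's real types.
#[derive(Clone, Debug, Default)]
struct SelectQuery {
    table: String,
    order_by: Vec<String>,
    limit: Option<u64>,
    offset: Option<u64>,
}

impl SelectQuery {
    fn from_table(table: &str) -> Self {
        SelectQuery {
            table: table.to_string(),
            ..Default::default()
        }
    }

    // Builder-style setters, returning &mut Self so calls can be chained.
    fn order_by(&mut self, column: &str) -> &mut Self {
        self.order_by.push(column.to_string());
        self
    }
    fn limit(&mut self, n: u64) -> &mut Self {
        self.limit = Some(n);
        self
    }
    fn offset(&mut self, n: u64) -> &mut Self {
        self.offset = Some(n);
        self
    }

    // The three "strip" operations named in the diff.
    fn reset_limit(&mut self) -> &mut Self {
        self.limit = None;
        self
    }
    fn reset_offset(&mut self) -> &mut Self {
        self.offset = None;
        self
    }
    fn clear_order_by(&mut self) -> &mut Self {
        self.order_by.clear();
        self
    }

    // Render the query as SQL text (no escaping; illustration only).
    fn to_sql(&self) -> String {
        let mut sql = format!("SELECT * FROM {}", self.table);
        if !self.order_by.is_empty() {
            sql.push_str(&format!(" ORDER BY {}", self.order_by.join(", ")));
        }
        if let Some(n) = self.limit {
            sql.push_str(&format!(" LIMIT {}", n));
        }
        if let Some(n) = self.offset {
            sql.push_str(&format!(" OFFSET {}", n));
        }
        sql
    }
}

// Build the counting query from a page query. Once LIMIT and OFFSET are
// removed, the row order cannot affect the count, so ORDER BY is dropped too
// and the database is not asked to sort rows it only needs to count.
fn count_sql(page_query: &SelectQuery) -> String {
    let inner = page_query
        .clone() // work on a copy so the caller's page query is untouched
        .reset_limit()
        .reset_offset()
        .clear_order_by()
        .to_sql();
    format!("SELECT COUNT(*) AS num_items FROM ({}) AS sub_query", inner)
}
```

For a page query on a hypothetical orders table ordered by created_at with LIMIT 20 OFFSET 40, count_sql wraps a bare SELECT * FROM orders, while the original query still carries its ORDER BY, LIMIT and OFFSET because the stripping happens on a clone.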

@shogo-nakano-desu shogo-nakano-desu marked this pull request as draft January 9, 2024 23:48
@shogo-nakano-desu
Contributor Author

shogo-nakano-desu commented Jan 9, 2024

I have to fix the "error: associated function accepts is never used" Clippy errors in src/driver/sqlx_xxx.rs.
Fixed in 571b51d.

@shogo-nakano-desu shogo-nakano-desu marked this pull request as ready for review January 9, 2024 23:58
@tyt2y3
Member

tyt2y3 commented Jan 10, 2024

Sad, those clippy warnings seem to be very confusing

@tyt2y3
Member

tyt2y3 commented Jan 10, 2024

Can you reset hard this PR to 3ef07ab ? I will merge that first.
We can work on the clippy fixes later.

@shogo-nakano-desu
Contributor Author

Sure, that's a good idea. I will create another PR to fix clippy errors.

@shogo-nakano-desu
Contributor Author

Can you reset hard this PR to 3ef07ab ?

Done

@tyt2y3 tyt2y3 merged commit 3dc66aa into SeaQL:master Jan 10, 2024
30 of 31 checks passed

🎉 Released In 0.12.11 🎉

Thank you everyone for the contribution!
This feature is now available in the latest release. Now is a good time to upgrade!
Your participation is what makes us unique; your adoption is what drives us forward.
You can support SeaQL 🌊 by starring our repos, sharing our libraries and becoming a sponsor ⭐.
