Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve planning speed using impl Into<Arc<str>> to create Arc<str> rather than &str #9916

Merged
merged 1 commit into from
Apr 2, 2024

Conversation

alamb
Copy link
Contributor

@alamb alamb commented Apr 2, 2024

Which issue does this PR close?

Related to #9764. Follow on to #9824

Rationale for this change

On #9824 @comphead saw that using Arc<String> made a significant difference in planning benchmarks compared to Arc<str> - #9824 (comment)

I believe this is due to the fact that the original data often comes as a String so making an Arc<str>from a &str results in a second copy.

It turns out that the Rust compiler is smart enough to avoid copying when using impl Into<Arc<str>> -- I thought it was, even though the docs seem to suggest that is not the case.

What changes are included in this PR?

Change from

    pub fn bare(table: &str) -> TableReference {

To

    pub fn bare(table: impl Into<Arc<str>>) -> TableReference {

Are these changes tested?

yes, by existing CI

Are there any user-facing changes?

Significantly faster performance (5x in at least one case it seems, on select *)

Benchmark Results

++ critcmp main perf_experiment
group                                         main                                    perf_experiment
-----                                         ----                                    ---------------
logical_aggregate_with_join                   1.07  1287.6±101.41µs        ? ?/sec    1.00  1201.6±70.47µs        ? ?/sec
logical_plan_tpch_all                         1.02     17.3±0.16ms        ? ?/sec     1.00     16.9±0.17ms        ? ?/sec
logical_select_all_from_1000                  4.93     95.3±0.41ms        ? ?/sec     1.00     19.3±0.10ms        ? ?/sec
logical_select_one_from_700                   1.00    741.9±9.77µs        ? ?/sec     1.07   795.3±13.53µs        ? ?/sec
logical_trivial_join_high_numbered_columns    1.07   791.2±11.13µs        ? ?/sec     1.00    739.9±9.52µs        ? ?/sec
logical_trivial_join_low_numbered_columns     1.04    754.9±8.61µs        ? ?/sec     1.00   729.0±21.65µs        ? ?/sec
physical_plan_tpch_all                        1.11    136.9±0.88ms        ? ?/sec     1.00    123.8±1.07ms        ? ?/sec
physical_plan_tpch_q1                         1.05      7.8±0.07ms        ? ?/sec     1.00      7.4±0.07ms        ? ?/sec
physical_plan_tpch_q10                        1.14      6.5±0.05ms        ? ?/sec     1.00      5.7±0.06ms        ? ?/sec
physical_plan_tpch_q11                        1.02      5.1±0.04ms        ? ?/sec     1.00      5.0±0.07ms        ? ?/sec
physical_plan_tpch_q12                        1.05      4.2±0.04ms        ? ?/sec     1.00      4.0±0.03ms        ? ?/sec
physical_plan_tpch_q13                        1.03      2.7±0.03ms        ? ?/sec     1.00      2.6±0.02ms        ? ?/sec
physical_plan_tpch_q14                        1.02      3.5±0.03ms        ? ?/sec     1.00      3.4±0.04ms        ? ?/sec
physical_plan_tpch_q16                        1.08      5.4±0.05ms        ? ?/sec     1.00      5.0±0.04ms        ? ?/sec
physical_plan_tpch_q17                        1.04      5.0±0.17ms        ? ?/sec     1.00      4.8±0.03ms        ? ?/sec
physical_plan_tpch_q18                        1.09      5.6±0.06ms        ? ?/sec     1.00      5.1±0.05ms        ? ?/sec
physical_plan_tpch_q19                        1.06     10.3±0.07ms        ? ?/sec     1.00      9.8±0.06ms        ? ?/sec
physical_plan_tpch_q2                         1.15     12.5±0.07ms        ? ?/sec     1.00     10.8±0.16ms        ? ?/sec
physical_plan_tpch_q20                        1.05      6.6±0.07ms        ? ?/sec     1.00      6.3±0.08ms        ? ?/sec
physical_plan_tpch_q21                        1.14      9.7±0.07ms        ? ?/sec     1.00      8.5±0.06ms        ? ?/sec
physical_plan_tpch_q22                        1.05      4.7±0.05ms        ? ?/sec     1.00      4.5±0.04ms        ? ?/sec
physical_plan_tpch_q3                         1.07      4.2±0.04ms        ? ?/sec     1.00      4.0±0.03ms        ? ?/sec
physical_plan_tpch_q4                         1.15      3.4±0.03ms        ? ?/sec     1.00      2.9±0.03ms        ? ?/sec
physical_plan_tpch_q5                         1.08      6.2±0.05ms        ? ?/sec     1.00      5.7±0.06ms        ? ?/sec
physical_plan_tpch_q6                         1.02      2.1±0.03ms        ? ?/sec     1.00      2.0±0.02ms        ? ?/sec
physical_plan_tpch_q7                         1.14      8.8±0.07ms        ? ?/sec     1.00      7.8±0.06ms        ? ?/sec
physical_plan_tpch_q8                         1.25     12.4±0.09ms        ? ?/sec     1.00      9.9±0.08ms        ? ?/sec
physical_plan_tpch_q9                         1.26      9.4±0.07ms        ? ?/sec     1.00      7.5±0.06ms        ? ?/sec
physical_select_all_from_1000                 5.46    693.9±1.52ms        ? ?/sec     1.00    127.1±1.17ms        ? ?/sec
physical_select_one_from_700                  1.04      4.2±0.02ms        ? ?/sec     1.00      4.1±0.03ms        ? ?/sec

@github-actions github-actions bot added sql SQL Planner logical-expr Logical plan and expressions core Core DataFusion crate substrait labels Apr 2, 2024
@@ -131,7 +131,7 @@ impl TableReference {
/// As described on [`TableReference`] this does *NO* parsing at
/// all, so "Foo.Bar" stays as a reference to the table named
/// "Foo.Bar" (rather than "foo"."bar")
pub fn bare(table: &str) -> TableReference {
pub fn bare(table: impl Into<Arc<str>>) -> TableReference {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR tests the theory that

    pub fn bare(table: impl Into<Arc<str>>) -> TableReference {

will be faster than

    pub fn bare(table: &str) -> TableReference {

@alamb

This comment was marked as outdated.

@github-actions github-actions bot removed logical-expr Logical plan and expressions core Core DataFusion crate substrait labels Apr 2, 2024
@alamb alamb changed the title [WIP] Test using impl Into<Arc<str>> Improve planning speed using impl Into<Arc<str>> to create Arc<str> rather than &str Apr 2, 2024
@alamb
Copy link
Contributor Author

alamb commented Apr 2, 2024

This PR actually seems quite a bit more effective at squeezing even more performance out of @comphead 's #9824

@@ -530,18 +530,18 @@ pub(crate) fn idents_to_table_reference(
match taker.0.len() {
1 => {
let table = taker.take(enable_normalization);
Ok(OwnedTableReference::bare(&table))
Ok(OwnedTableReference::bare(table))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

somehow it seems the compiler can avoid copying / allocating by getting this as an owned item

@alamb alamb marked this pull request as ready for review April 2, 2024 21:16
Copy link
Contributor

@comphead comphead left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm thanks @alamb 🚀

@comphead
Copy link
Contributor

comphead commented Apr 2, 2024

I'm planning to deprecate Owned* types, so will wait this PR to be merged to avoid conflicts

@alamb alamb merged commit d2ba901 into apache:main Apr 2, 2024
27 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sql SQL Planner
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants