Skip to content

Conversation

Amruta-Ranade
Copy link
Contributor

@Amruta-Ranade Amruta-Ranade commented Aug 8, 2019

Closes #4692

Note: The TC build's failing because of a link that points to a file in Eric's open PR. (https://github.com/cockroachdb/docs/pull/5075/files)

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@cockroach-teamcity
Copy link
Member

@Amruta-Ranade Amruta-Ranade requested review from jseldess and removed request for jordanlewis August 12, 2019 18:53
Copy link
Contributor

@jseldess jseldess left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having read everything yet, but my main question is why we're using explain(opt) vs standard explain? Let's chat about this.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @Amruta-Ranade and @jseldess)


_includes/sidebar-data-v19.2.json, line 497 at r1 (raw file):

          },
          {
            "title": "Use EXPLAIN to Troubleshoot SQL Performance",

We use too many naming variations across url, title, sidenav:

  • Use EXPLAIN to Troubleshoot SQL Performance
  • sql-tuning-with-explain.html
  • Troubleshoot SQL Performance

I might just go with "SQL Tuning with EXPLAIN" and maybe put this in the Tutorials section and find the right cross-references from troubleshooting pages. Thoughts?


v19.2/sql-tuning-with-explain.md, line 11 at r1 (raw file):

The following examples use [MovR](movr.html), a fictional vehicle-sharing application, to demonstrate CockroachDB SQL statements. Run [`cockroach demo movr`](cockroach-demo.html) to open an interactive SQL shell to a temporary, in-memory cluster with the `movr` database preloaded and set as the [current database](sql-name-resolution.html#current-database).

## Issue: Full table scans (Filtering by a non-indexed column)

I'd remove (Filtering by a non-indexed column) from the heading.


v19.2/sql-tuning-with-explain.md, line 13 at r1 (raw file):

## Issue: Full table scans (Filtering by a non-indexed column)

The most common reason for slow queries is sub-optimal `SELECT` statements which includes full table scans and incorrect use of indexes:

which includes > that include.

Also, it's weird to end both this and the next sentence with colons.


v19.2/sql-tuning-with-explain.md, line 31 at r1 (raw file):

~~~

To understand why this query performs poorly, use [`EXPLAIN (OPT)`](explain.html) the query plan:

Something's missing or not right in ... use EXPLAIN the query plan...


v19.2/sql-tuning-with-explain.md, line 88 at r1 (raw file):

~~~

To understand why performance improved from 4.059ms (without index) to 1.457ms (with index), use [`EXPLAIN (OPT)`](explain.html) to see the new query plan:

Wonder if we should leave out the specific latency measurements and just say, To understand why performance improved, use...?


v19.2/sql-tuning-with-explain.md, line 120 at r1 (raw file):

~~~

This shows you that CockroachDB starts with the secondary index (`users@users_name_idx`). Because it is sorted by `name`, the query can jump directly to the relevant value (`/'Cheyenne Smith' - /'Cheyenne Smith'`). However, the query needs to return values not in the secondary index, so CockroachDB grabs the primary key (`city`/`id`) stored with the `name` value (the primary key is always stored with entries in a secondary index), jumps to that value in the primary index, and then returns the full row.

The explain(opt) output doesn't map well to this story. You can't see the access of the primary index, whereas you can in the normal explain output:

root@127.0.0.1:51365/movr> EXPLAIN SELECT * FROM users WHERE name = 'Cheyenne Smith';
     tree    | field |                  description
+------------+-------+-----------------------------------------------+
  index-join |       |
   │         | table | users@primary
   └── scan  |       |
             | table | users@users_name_idx
             | spans | /"Cheyenne Smith"-/"Cheyenne Smith"/PrefixEnd
(5 rows)

What's the benefit of using opt here?


v19.2/sql-tuning-with-explain.md, line 166 at r1 (raw file):

~~~
                                  text                                   

Again, I think the normal explain output is much more useful here.


v19.2/sql-tuning-with-explain.md, line 308 at r1 (raw file):

-->

Reading from bottom up, you can see that CockroachDB does a full table scan (`scan rides`) first on `rides` to get all rows with a `start_time` in the specified range and then does another full table scan on `users` to find matching rows and calculate the count.

This description doesn't match the opt output. Again, do we really want opt over the standard?

Copy link
Contributor Author

@Amruta-Ranade Amruta-Ranade left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @Amruta-Ranade and @jseldess)


_includes/sidebar-data-v19.2.json, line 497 at r1 (raw file):

Previously, jseldess (Jesse Seldess) wrote…

We use too many naming variations across url, title, sidenav:

  • Use EXPLAIN to Troubleshoot SQL Performance
  • sql-tuning-with-explain.html
  • Troubleshoot SQL Performance

I might just go with "SQL Tuning with EXPLAIN" and maybe put this in the Tutorials section and find the right cross-references from troubleshooting pages. Thoughts?

I agree with the name-change, but am not sure about placing the doc in the Tutorials section because it ended up not being a tutorial :( I don't mind placing it in the Tutorials section -- just confirming if that's where it belongs.


v19.2/sql-tuning-with-explain.md, line 11 at r1 (raw file):

Previously, jseldess (Jesse Seldess) wrote…

I'd remove (Filtering by a non-indexed column) from the heading.

Done.


v19.2/sql-tuning-with-explain.md, line 13 at r1 (raw file):

Previously, jseldess (Jesse Seldess) wrote…

which includes > that include.

Also, it's weird to end both this and the next sentence with colons.

Done and done.


v19.2/sql-tuning-with-explain.md, line 31 at r1 (raw file):

Previously, jseldess (Jesse Seldess) wrote…

Something's missing or not right in ... use EXPLAIN the query plan...

Done.


v19.2/sql-tuning-with-explain.md, line 88 at r1 (raw file):

Previously, jseldess (Jesse Seldess) wrote…

Wonder if we should leave out the specific latency measurements and just say, To understand why performance improved, use...?

Done.


v19.2/sql-tuning-with-explain.md, line 120 at r1 (raw file):

Previously, jseldess (Jesse Seldess) wrote…

The explain(opt) output doesn't map well to this story. You can't see the access of the primary index, whereas you can in the normal explain output:

root@127.0.0.1:51365/movr> EXPLAIN SELECT * FROM users WHERE name = 'Cheyenne Smith';
     tree    | field |                  description
+------------+-------+-----------------------------------------------+
  index-join |       |
   │         | table | users@primary
   └── scan  |       |
             | table | users@users_name_idx
             | spans | /"Cheyenne Smith"-/"Cheyenne Smith"/PrefixEnd
(5 rows)

What's the benefit of using opt here?

Made all the EXPLAIN (OPT) > EXPLAIN changes that we discussed.


v19.2/sql-tuning-with-explain.md, line 308 at r1 (raw file):

Previously, jseldess (Jesse Seldess) wrote…

This description doesn't match the opt output. Again, do we really want opt over the standard?

As discussed, used EXPLAIN instead of EXPLAIN (OPT) throughout (except for inefficient joins)

@cockroach-teamcity
Copy link
Member

Copy link
Contributor

@jseldess jseldess left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, with some final comments.

Also, let's add a "See also" section linking to:

- [SQL Best Practices](performance-best-practices-overview.html)
- [SQL Troubleshooting](query-behavior-troubleshooting.html)

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @Amruta-Ranade and @jseldess)


_includes/sidebar-data-v19.2.json, line 497 at r1 (raw file):

Previously, Amruta-Ranade (Amruta Ranade) wrote…

I agree with the name-change, but am not sure about placing the doc in the Tutorials section because it ended up not being a tutorial :( I don't mind placing it in the Tutorials section -- just confirming if that's where it belongs.

I'd put it under Tutorials. It's true there aren't steps, but it's a guided experience, and you call it a tutorial in the intro sentence.


_includes/sidebar-data-v19.2.json, line 497 at r2 (raw file):

          },
          {
            "title": "SQL Tuning with EXPLAIN",

Need to put backticks around EXPLAIN here.


v19.2/sql-tuning-with-explain.md, line 2 at r2 (raw file):

---
title: Troubleshoot SQL Performance

I'd change the page title to "SQL Tuning with EXPLAIN" as well.


v19.2/sql-tuning-with-explain.md, line 9 at r2 (raw file):

This tutorial walks you through the common reasons for slow SQL statements and describes how to use [`EXPLAIN`](explain.html) to troubleshoot the issues.

The following examples use [MovR](movr.html), a fictional vehicle-sharing application, to demonstrate CockroachDB SQL statements. Run [`cockroach demo movr`](cockroach-demo.html) to open an interactive SQL shell to a temporary, in-memory cluster with the `movr` database preloaded and set as the [current database](sql-name-resolution.html#current-database).

What about making this a ## Setup section? Also, you might look at Eric's include for this type of thing. Not sure if it'll work seamlessly though.


v19.2/sql-tuning-with-explain.md, line 62 at r2 (raw file):

-->

The row with `scan users` shows you that, without a secondary index on the `name` column, CockroachDB scans every row of the `users` table, ordered by the primary key (`city`/`id`), until it finds the row with the correct `name` value.

scan users doesn't map to the new output. Need to adjust.


v19.2/sql-tuning-with-explain.md, line 311 at r2 (raw file):

-->

Reading from bottom up, you can see that CockroachDB does a full table scan (`scan rides`) first on `rides` to get all rows with a `start_time` in the specified range and then does another full table scan on `users` to find matching rows and calculate the count.

scan rides doesn't map to the new output. Might be a few other misalignments. Check the original performance tuning doc for ideas about how to adjust.


v19.2/sql-tuning-with-explain.md, line 324 at r2 (raw file):

~~~

Adding the secondary index reduced the query time from 1573ms to 61.56ms:

These measurements don't match what's show in your docs. Wonder if we should remove them and just say "reduced the query time significantly" or something.


v19.2/sql-tuning-with-explain.md, line 393 at r2 (raw file):

[Hash joins](joins.html#hash-joins) are more expensive and require more memory than [lookup joins](joins.html#lookup-joins). Hence the [cost-based optimizer](cost-based-optimizer.html) uses a lookup join whenever possible.

For the following query, the cost-based optimizer can’t perform a lookup join because the query doesn’t have a prefix of the `rides` table’s primary key available and thus has to read the entire table and search for a match, resulting in a slow query.

End with a colon.


v19.2/sql-tuning-with-explain.md, line 397 at r2 (raw file):

{% include copy-clipboard.html %}
~~~ sql
> EXPLAIN (OPT) SELECT * FROM VEHICLES JOIN rides on rides.vehicle_id = vehicles.id limit 1;

We're using OPT here for a reason, I assume? Fine with that, but just want to check, and we should consider calling out the difference.


v19.2/sql-tuning-with-explain.md, line 415 at r2 (raw file):

~~~

### Solution: Provide primary key to allow lookup join

We should probably add a sentence or two explanation here?

Copy link
Contributor Author

@Amruta-Ranade Amruta-Ranade left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jseldess)


_includes/sidebar-data-v19.2.json, line 497 at r1 (raw file):

Previously, jseldess (Jesse Seldess) wrote…

I'd put it under Tutorials. It's true there aren't steps, but it's a guided experience, and you call it a tutorial in the intro sentence.

Done.


_includes/sidebar-data-v19.2.json, line 497 at r2 (raw file):

Previously, jseldess (Jesse Seldess) wrote…

Need to put backticks around EXPLAIN here.

Done.


v19.2/sql-tuning-with-explain.md, line 166 at r1 (raw file):

Previously, jseldess (Jesse Seldess) wrote…

Again, I think the normal explain output is much more useful here.

Done.


v19.2/sql-tuning-with-explain.md, line 2 at r2 (raw file):

Previously, jseldess (Jesse Seldess) wrote…

I'd change the page title to "SQL Tuning with EXPLAIN" as well.

Done.


v19.2/sql-tuning-with-explain.md, line 9 at r2 (raw file):

Previously, jseldess (Jesse Seldess) wrote…

What about making this a ## Setup section? Also, you might look at Eric's include for this type of thing. Not sure if it'll work seamlessly though.

Previously, I had used Eric's include. But his include gives the user the option to either use cockroach demo or set up a cluster and then use cockroach workload. And as per our discussion, we decided to go with cockroach demo. So changed it to the current para.

## Setup section seems unnecessary to me, but don't mind adding it either.


v19.2/sql-tuning-with-explain.md, line 62 at r2 (raw file):

Previously, jseldess (Jesse Seldess) wrote…

scan users doesn't map to the new output. Need to adjust.

Done.


v19.2/sql-tuning-with-explain.md, line 311 at r2 (raw file):

Previously, jseldess (Jesse Seldess) wrote…

scan rides doesn't map to the new output. Might be a few other misalignments. Check the original performance tuning doc for ideas about how to adjust.

Done.


v19.2/sql-tuning-with-explain.md, line 324 at r2 (raw file):

Previously, jseldess (Jesse Seldess) wrote…

These measurements don't match what's show in your docs. Wonder if we should remove them and just say "reduced the query time significantly" or something.

Agreed. Done.


v19.2/sql-tuning-with-explain.md, line 393 at r2 (raw file):

Previously, jseldess (Jesse Seldess) wrote…

End with a colon.

Done.


v19.2/sql-tuning-with-explain.md, line 397 at r2 (raw file):

Previously, jseldess (Jesse Seldess) wrote…

We're using OPT here for a reason, I assume? Fine with that, but just want to check, and we should consider calling out the difference.

Yeah, the reason was that the OPT shows better joins, but I checked the normal EXPLAIN output and I prefer it in this case as well. Reverting to normal EXPLAIN.


v19.2/sql-tuning-with-explain.md, line 415 at r2 (raw file):

Previously, jseldess (Jesse Seldess) wrote…

We should probably add a sentence or two explanation here?

Done.

@cockroach-teamcity
Copy link
Member

@cockroach-teamcity
Copy link
Member

@Amruta-Ranade Amruta-Ranade merged commit b07c79e into master Aug 14, 2019
@Amruta-Ranade Amruta-Ranade deleted the explain_project branch August 14, 2019 23:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Using logical plans / EXPLAIN
3 participants