2 changes: 2 additions & 0 deletions docs/_data/menu-sql.yaml
@@ -166,6 +166,8 @@
url: sql-ref-syntax-qry-select-tvf.html
- text: Inline Table
url: sql-ref-syntax-qry-select-inline-table.html
- text: Common Table Expression
url: sql-ref-syntax-qry-select-cte.html
- text: EXPLAIN
url: sql-ref-syntax-qry-explain.html
- text: Auxiliary Statements
109 changes: 108 additions & 1 deletion docs/sql-ref-syntax-qry-select-cte.md
@@ -19,4 +19,111 @@ license: |
limitations under the License.
---

**This page is under construction**
### Description

A common table expression (CTE) defines a temporary result set that can be referenced, possibly multiple times, within the scope of a SQL statement. CTEs are used mainly in SELECT statements.
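
As a minimal sketch of the "referenced multiple times" point, the query below defines a CTE once and joins it to itself. The names `nums`, `left_n`, and `right_n` are illustrative only and do not come from the original page; no output is shown because the query was not run as part of this change.

```sql
-- Illustrative names: nums, left_n, right_n.
-- The CTE `nums` is defined once and referenced twice in the FROM clause.
WITH nums(n) AS (
    SELECT 1
    UNION ALL
    SELECT 2
)
SELECT a.n AS left_n, b.n AS right_n
FROM nums a
CROSS JOIN nums b
ORDER BY left_n, right_n;
```

Without the CTE, the same subquery would have to be written out once for each side of the join.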

### Syntax

{% highlight sql %}
WITH common_table_expression [ , ... ]
{% endhighlight %}

where `common_table_expression` is defined as:
{% highlight sql %}
expression_name [ ( column_name [ , ... ] ) ] [ AS ] ( [ common_table_expression ] query )
{% endhighlight %}

### Parameters

<dl>
<dt><code><em>expression_name</em></code></dt>
<dd>
Specifies a name for the common table expression.
</dd>
</dl>
<dl>
<dt><code><em>query</em></code></dt>
<dd>
A <a href="sql-ref-syntax-qry-select.html">SELECT</a> statement.
</dd>
</dl>

### Examples

{% highlight sql %}
-- CTE with multiple column aliases
WITH t(x, y) AS (SELECT 1, 2)
SELECT * FROM t WHERE x = 1 AND y = 2;
+---+---+
| x| y|
+---+---+
| 1| 2|
+---+---+

-- CTE in CTE definition
WITH t AS (
WITH t2 AS (SELECT 1)
SELECT * FROM t2
)
SELECT * FROM t;
+---+
| 1|
+---+
| 1|
+---+

-- CTE in subquery
SELECT max(c) FROM (
WITH t(c) AS (SELECT 1)
SELECT * FROM t
);
+------+
|max(c)|
+------+
| 1|
+------+

-- CTE in subquery expression
SELECT (
WITH t AS (SELECT 1)
SELECT * FROM t
);
+----------------+
|scalarsubquery()|
+----------------+
| 1|
+----------------+

-- CTE in CREATE VIEW statement
CREATE VIEW v AS
WITH t(a, b, c, d) AS (SELECT 1, 2, 3, 4)
SELECT * FROM t;
SELECT * FROM v;
+---+---+---+---+
| a| b| c| d|
+---+---+---+---+
| 1| 2| 3| 4|
+---+---+---+---+

-- If a name conflict is detected in nested CTEs, an AnalysisException is thrown by default.
Member

Do we still need to describe this? I personally think users can refer to the migration guide for these kinds of legacy behaviours.

Contributor Author

Thanks for your comment.
I prefer to document the default behavior. I had actually assumed the default was "inner CTE definitions take precedence over outer definitions". When I tried the example, I was surprised to see the exception; I then looked at the code and found that I need to set spark.sql.legacy.ctePrecedencePolicy to make it work. So I think the default is worth mentioning here.

-- With spark.sql.legacy.ctePrecedencePolicy set to CORRECTED (which is recommended),
-- inner CTE definitions take precedence over outer definitions.
SET spark.sql.legacy.ctePrecedencePolicy = CORRECTED;
WITH
t AS (SELECT 1),
t2 AS (
WITH t AS (SELECT 2)
SELECT * FROM t
)
SELECT * FROM t2;
+---+
| 2|
+---+
| 2|
+---+
{% endhighlight %}
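
For contrast with the `CORRECTED` example above, here is a hedged sketch of the default case that the name-conflict comment describes. It assumes `EXCEPTION` is the name of the default policy value; that name is not stated on this page, and the exact error text is omitted because it is not given here either.

```sql
-- Sketch only: EXCEPTION is assumed to be the name of the default policy value.
SET spark.sql.legacy.ctePrecedencePolicy = EXCEPTION;

-- The inner `t` conflicts with the outer `t`, so analysis is expected to fail
-- with an AnalysisException instead of returning a row.
WITH
    t AS (SELECT 1),
    t2 AS (
        WITH t AS (SELECT 2)
        SELECT * FROM t
    )
SELECT * FROM t2;
```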

Member

How about adding one more CTE example in CREATE VIEW?

Contributor Author

Will do

### Related Statements

* [SELECT](sql-ref-syntax-qry-select.html)
3 changes: 2 additions & 1 deletion docs/sql-ref-syntax-qry-select.md
@@ -53,7 +53,7 @@ SELECT [ hints , ... ] [ ALL | DISTINCT ] { named_expression [ , ... ] }
<dl>
<dt><code><em>with_query</em></code></dt>
<dd>
Specifies the common table expressions (CTEs) before the main query block.
Specifies the <a href="sql-ref-syntax-qry-select-cte.html">common table expressions (CTEs)</a> before the main query block.
These table expressions are allowed to be referenced later in the FROM clause. This is useful to abstract
out repeated subquery blocks in the FROM clause and improves readability of the query.
</dd>
@@ -159,3 +159,4 @@ SELECT [ hints , ... ] [ ALL | DISTINCT ] { named_expression [ , ... ] }
* [TABLESAMPLE](sql-ref-syntax-qry-sampling.html)
* [JOIN](sql-ref-syntax-qry-select-join.html)
* [SET Operators](sql-ref-syntax-qry-select-setops.html)
* [Common Table Expression](sql-ref-syntax-qry-select-cte.html)