diff --git a/docs/source/user-guide/sql/index.rst b/docs/source/user-guide/sql/index.rst
index 97753d708e1b..4015864b4701 100644
--- a/docs/source/user-guide/sql/index.rst
+++ b/docs/source/user-guide/sql/index.rst
@@ -23,6 +23,7 @@ SQL Reference
    sql_status
    select
+   subqueries
    ddl
    aggregate_functions
    scalar_functions
diff --git a/docs/source/user-guide/sql/subqueries.md b/docs/source/user-guide/sql/subqueries.md
new file mode 100644
index 000000000000..478fab7e7c2d
--- /dev/null
+++ b/docs/source/user-guide/sql/subqueries.md
@@ -0,0 +1,98 @@
+
+
+# Subqueries
+
+DataFusion supports `EXISTS`, `NOT EXISTS`, `IN`, `NOT IN` and Scalar Subqueries.
+
+The examples below are based on the following table.
+
+```sql
+❯ select * from x;
++----------+----------+
+| column_1 | column_2 |
++----------+----------+
+| 1        | 2        |
++----------+----------+
+```
+
+## EXISTS
+
+The `EXISTS` syntax can be used to find all rows in a relation where a correlated subquery produces one or more matches
+for that row. Only correlated subqueries are supported.
+
+```sql
+❯ select * from x y where exists (select * from x where x.column_1 = y.column_1);
++----------+----------+
+| column_1 | column_2 |
++----------+----------+
+| 1        | 2        |
++----------+----------+
+1 row in set.
+```
+
+## NOT EXISTS
+
+The `NOT EXISTS` syntax can be used to find all rows in a relation where a correlated subquery produces zero matches
+for that row. Only correlated subqueries are supported.
+
+```sql
+❯ select * from x y where not exists (select * from x where x.column_1 = y.column_1);
+0 rows in set.
+```
+
+## IN
+
+The `IN` syntax can be used to find all rows in a relation where a given expression's value can be found in the
+results of a correlated subquery.
+
+```sql
+❯ select * from x where column_1 in (select column_1 from x);
++----------+----------+
+| column_1 | column_2 |
++----------+----------+
+| 1        | 2        |
++----------+----------+
+1 row in set.
+```
+
+## NOT IN
+
+The `NOT IN` syntax can be used to find all rows in a relation where a given expression's value can not be found in the
+results of a correlated subquery.
+
+```sql
+❯ select * from x where column_1 not in (select column_1 from x);
+0 rows in set.
+```
+
+## Scalar Subquery
+
+A scalar subquery can be used to produce a single value that can be used in many different contexts in a query. Here
+is an example of a filter using a scalar subquery. Only correlated subqueries are supported.
+
+```sql
+❯ select * from x y where column_1 < (select sum(column_2) from x where x.column_1 = y.column_1);
++----------+----------+
+| column_1 | column_2 |
++----------+----------+
+| 1        | 2        |
++----------+----------+
+1 row in set.
+```
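
The five example queries in `subqueries.md` can be sanity-checked against the one-row table they assume. The sketch below is not part of the patch and does not run DataFusion: it uses Python's built-in `sqlite3` module as a stand-in engine, on the assumption that both engines follow standard SQL semantics for these particular queries. The table definition (two integer columns) is also an assumption, since the patch shows only the table's contents.

```python
import sqlite3

# Stand-in engine: SQLite, NOT DataFusion. The schema is assumed; the patch
# only shows that table x holds the single row (1, 2).
conn = sqlite3.connect(":memory:")
conn.execute("create table x (column_1 integer, column_2 integer)")
conn.execute("insert into x values (1, 2)")

# The queries are copied from the documentation examples in the patch.
queries = {
    "exists": "select * from x y where exists "
              "(select * from x where x.column_1 = y.column_1)",
    "not exists": "select * from x y where not exists "
                  "(select * from x where x.column_1 = y.column_1)",
    "in": "select * from x where column_1 in (select column_1 from x)",
    "not in": "select * from x where column_1 not in (select column_1 from x)",
    "scalar": "select * from x y where column_1 < "
              "(select sum(column_2) from x where x.column_1 = y.column_1)",
}

# Run each query and collect its rows as a list of tuples.
results = {name: conn.execute(sql).fetchall() for name, sql in queries.items()}

for name, rows in results.items():
    print(f"{name}: {len(rows)} row(s) {rows}")
```

Under these assumptions the row counts agree with the documented output: one row for `EXISTS`, `IN`, and the scalar subquery, and zero rows for `NOT EXISTS` and `NOT IN`.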