DOC-8059 changed n1ql subqueries to support 7.0 version (#1905)
* DOC-8059 changed n1ql subqueries to support 7.0 version

* DOC-8059 added n1ql queries and refactored syntax for consistency

* DOC 8059 EX7 changed

* Update modules/n1ql/pages/n1ql-language-reference/subqueries.adoc

* Auto stash before merge of "DOC-8059" and "origin/DOC-8059"

* rechecked the queries for the scope

* DOC-8059 added subqueries example to  support scopes and collection

* Update modules/n1ql/pages/n1ql-language-reference/subquery-examples.adoc

* DOC-8059 updated subquery page with suggestion and structure

* DOC-8059 Update modules/n1ql/pages/n1ql-language-reference/subquery-examples.adoc

* DOC-8059 added warning globally

* DOC-8059 added correlated-subquery page

* DOC-8059 Update modules/n1ql/pages/n1ql-language-reference/correlated-subqueries.adoc

* Minor formatting updates for Correlated Subqueries

* Minor formatting updates for Subqueries

* Minor formatting updates for Examples

* Restoring deleted text to Examples

* Restore deleted text in Correlated Subqueries

* Fix incorrect link in Subqueries
sixthcodebrewer committed Jun 26, 2021
1 parent 6a0120d commit a120788
Showing 3 changed files with 712 additions and 277 deletions.
@@ -28,52 +28,80 @@ The query engine smartly reuses the correlated document already fetched in the o
In N1QL, the way in which a subquery is correlated with its parent queries is very important.
That dictates certain behaviors and limitations in writing nested subqueries, and impacts query performance.

Correlation by Source (or FROM clause-Expression)::
=== Correlation by Source (or FROM clause-Expression)

The data source for a query or subquery is specified by its FROM clause.
When the FROM clause of a subquery refers to any variables (aliases, keyspace names, LET/LETTING variables, or document attributes) in the scope of parent queries, then the correlation is established using the source keyspace in the FROM expression.
Such a subquery is called a *Source Correlated Subquery*, and it offers the following benefits:
+
*Nested Paths in FROM clause*
+

Nested Paths in FROM clause::
Couchbase Server version 4.6.2 introduced powerful subquery functionality where correlated nested paths can be used in a subquery FROM clause.
This provides powerful language expressibility, simplicity, and flexibility to N1QL queries especially when dealing with nested array attributes.
See xref:n1ql-language-reference/subqueries.adoc#nested-path-expr[Nested Paths in Subqueries] for more details.
+
*Better Performance*
+

Better Performance::
When correlation is established through the FROM clause in the subquery (with variables in scope), then the N1QL engine knows that the subquery is referring to the same document that is being processed in one of the outer queries.
Therefore, the subquery avoids fetching the documents used in the subquery.
This significantly improves the performance of such subqueries, as shown in example xref:n1ql-language-reference/subqueries.adoc#Q6[Q6] earlier; in contrast, the example xref:n1ql-language-reference/subqueries.adoc#Q6A[Q6A] cannot take advantage of this optimization.

Correlation by Reference (or non FROM clause-Expression)::
=== Correlation by Reference (or non FROM clause-Expression)

In this case the subqueries refer to xref:n1ql-language-reference/subqueries.adoc#section_onz_3tj_mz[variables in the scope] of outer level queries, in clauses other than the FROM clause of the subquery.
In such a case, the FROM clause will have an independent keyspace identifier that does not reference any variables in the scope.
This kind of subquery execution works like a JOIN query and requires the USE KEYS clause.
For more information, see <<use-keys,USE KEYS in the Subquery>> and xref:n1ql-language-reference/subqueries.adoc#from-clause[FROM clause in Subqueries].
+

In the following example, the subquery appears in the LET clause of the parent query, with correlation introduced in the WHERE clause (`t2.iata = t1.airline`) and the USE KEYS clause of the subquery (referencing `t1` fields).
This query finds the airline and route details of flights that have routes starting from SFO airport.
+

[#Q10]
====
.Q10
[source,n1ql]
----
Example Q10:
SELECT airline_details, t1.destinationairport, t1.stops
FROM `travel-sample` t1
FROM `travel-sample`.inventory.route t1
LET airline_details = (SELECT t2.name, t2.callsign
FROM `travel-sample` t2
FROM `travel-sample`.inventory.airline t2
USE KEYS t1.airlineid
WHERE t2.type = "airline"
AND t2.iata = t1.airline)
WHERE t1.type = "route" AND t1.sourceairport = "SFO"
WHERE t2.iata = t1.airline)
WHERE t1.sourceairport = "SFO"
AND ARRAY_LENGTH(airline_details) > 0
LIMIT 2;
----
.Results
[source,json]
----
[
{
"airline_details": [
{
"callsign": "JETBLUE",
"name": "JetBlue Airways"
}
],
"destinationairport": "AUS",
"stops": 0
},
{
"airline_details": [
{
"callsign": "JETBLUE",
"name": "JetBlue Airways"
}
],
"destinationairport": "BOS",
"stops": 0
}
]
----
====
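
As a rough mental model of Q10, and not a description of how the query engine actually executes it, the USE KEYS correlation can be pictured as a key lookup performed for each outer document.
The Python sketch below assumes small made-up documents: the keys, field values, and the `q10` helper are illustrative only and are not rows from `travel-sample`.

```python
# Hypothetical in-memory stand-ins for the route and airline collections.
# All keys and field values are invented for illustration.
airlines = {
    "airline_1": {"name": "Example Air", "callsign": "EXAMPLE", "iata": "EX"},
    "airline_2": {"name": "Sample Jet", "callsign": "SAMPLE", "iata": "SJ"},
}
routes = [
    {"airlineid": "airline_1", "airline": "EX", "sourceairport": "SFO",
     "destinationairport": "AUS", "stops": 0},
    {"airlineid": "airline_2", "airline": "SJ", "sourceairport": "LAX",
     "destinationairport": "BOS", "stops": 1},
    {"airlineid": "airline_9", "airline": "ZZ", "sourceairport": "SFO",
     "destinationairport": "JFK", "stops": 0},
]


def q10(routes, airlines, limit=2):
    """Model of Q10: one key lookup into `airlines` per qualifying route."""
    results = []
    for t1 in routes:                      # outer query: FROM ... route t1
        if t1["sourceairport"] != "SFO":   # WHERE t1.sourceairport = "SFO"
            continue
        # Subquery: USE KEYS t1.airlineid restricts the airline keyspace to
        # the single document whose key the current route refers to.
        t2 = airlines.get(t1["airlineid"])
        airline_details = []
        if t2 is not None and t2["iata"] == t1["airline"]:  # WHERE t2.iata = t1.airline
            airline_details.append({"name": t2["name"], "callsign": t2["callsign"]})
        if len(airline_details) > 0:       # ARRAY_LENGTH(airline_details) > 0
            results.append({"airline_details": airline_details,
                            "destinationairport": t1["destinationairport"],
                            "stops": t1["stops"]})
        if len(results) == limit:          # LIMIT 2
            break
    return results


print(q10(routes, airlines))
```

In this model, a route whose `airlineid` matches no airline key produces an empty `airline_details` array and is filtered out, which mirrors the `ARRAY_LENGTH` condition in the query.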

[#use-keys]
== FROM clause and USE KEYS in Correlated Subqueries

In the example <<Q10,Q10>>, note the USE KEYS clause (in *bold*) used to establish the correlation with the outer query documents.
In the example <<Q10,Q10>>, note the USE KEYS clause used to establish the correlation with the outer query documents.
Otherwise, it is not possible to identify the documents in the subquery that are related to the specific document being considered by the outer query.

It is important to understand the reasoning to include the USE KEYS clause.
@@ -83,31 +111,30 @@ It entirely depends on how the FROM clause is formulated, which indicates the so
NOTE: When a keyspace name identifier is used in FROM-clause of a subquery, that refers to a collection of documents referenced by the keyspace identifier.
However, when an alias of the keyspace is used in FROM-clause (or any other clauses of the query), that refers to an individual document of the keyspace being considered in the outer query.

FROM clause with Keyspace Identifier::
=== FROM clause with Keyspace Identifier

The USE KEYS clause is mandatory for the primary keyspace of the subquery when the FROM clause has a keyspace identifier that is independent of any of the aliases/variables in scope.
This is needed to establish correlation with the documents/keyspace used in the outer query.
For example:

* FROM clause of the subquery in xref:n1ql-language-reference/subqueries.adoc#Q7[Q7] is an independent keyspace identifier `pass:c[`travel-sample`]` and hence the correlation with parent query is established explicitly using the USE KEYS clause through the referential attribute `t1.airlineid`.
* FROM clause of the subquery in <<Q10,Q10>> is an independent keyspace identifier `pass:c[`travel-sample`]` and hence the correlation with parent query is established explicitly using the USE KEYS clause through the referential attribute `t1.airlineid`.


* Similarly, the subquery in xref:n1ql-language-reference/subqueries.adoc#Q6A[Q6A] has an independent keyspace identifier `pass:c[`travel-sample`]` in FROM clause, but the correlation is self-referencing to the same document.
* Similarly, the subquery in xref:n1ql-language-reference/subqueries.adoc#Q9A[Q9A] has an independent keyspace identifier `pass:c[`travel-sample`.inventory.airport]` in FROM clause, but the correlation is self-referencing to the same document.
Therefore, `USE KEYS meta(t).id` is used.

+
This is exactly the same as the PrimaryKey-ForeignKey relationship required to xref:n1ql-language-reference/join.adoc[join] two documents that are referenced in the outer/inner queries.
Note that, in the `pass:c[`travel-sample`]` keyspace data model, the `"route"` documents refer to the `"airline"` documents using the attribute `airlineid`.
Refer to the xref:learn:data/document-data-model.adoc[Data Model].

FROM clause with Expression::
=== FROM clause with Expression

The USE KEYS clause is not required in the subquery when the FROM clause of the subquery has a generic expression as its data source, rather than a keyspace name identifier.
The FROM clause expression can be:

* Independent constant expression or subquery expression that does not refer to any variables in scope.
* Generic N1QL expression or subquery that refers to any variables in scope.

+
In the example xref:n1ql-language-reference/subqueries.adoc#Q6[Q6], the FROM clause is an expression referring to the variable/alias `t` (in fact the nested path `t.reviews`) that already establishes correlation and hence the subquery does not need explicit USE KEYS clause.
In the example xref:n1ql-language-reference/subqueries.adoc#Q9[Q9], the FROM clause is an expression referring to the variable/alias `t` (in fact the nested path `t.reviews`) that already establishes correlation and hence the subquery does not need explicit USE KEYS clause.
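
To picture why a FROM clause expression such as `t.reviews` needs no USE KEYS clause, the following Python sketch models the subquery source as the array nested in the current outer document.
It is only an analogy under stated assumptions: the documents, the `author` and `overall` fields, and the `high_ratings` helper are hypothetical, and the sketch is not a transcription of Q9.

```python
# Hypothetical outer documents; the nested "reviews" arrays are invented.
hotels = [
    {"name": "Hotel A",
     "reviews": [{"author": "x", "overall": 5}, {"author": "y", "overall": 2}]},
    {"name": "Hotel B", "reviews": []},
]


def high_ratings(hotels, minimum=4):
    """For each outer document, run a 'subquery' over its own nested array."""
    rows = []
    for t in hotels:  # outer query: FROM <keyspace> t
        # Subquery with FROM t.reviews: the data source is already part of
        # the outer document, so correlation is implicit and no key lookup
        # (USE KEYS) is involved.
        authors = [r["author"] for r in t["reviews"] if r["overall"] >= minimum]
        rows.append({"name": t["name"], "authors": authors})
    return rows


print(high_ratings(hotels))
```

The inner loop never consults a separate collection, which is the property that lets a source-correlated subquery skip the document fetch.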

== Correlated Subquery versus JOINs

@@ -119,15 +146,16 @@ In general, N1QL recommends usage of JOIN queries when possible, instead of sema
However, in some cases it may be easier or intuitive to formulate some queries using subqueries (instead of JOINs).
In such a case, it is recommended to compare the EXPLAIN query plans and the performance of both queries.

[#Q10A]
====
.Q10A: Earlier Q10 rewritten with JOIN
[source,n1ql]
----
Example Q7A: Earlier Q7 rewritten with JOIN
SELECT DISTINCT airline.name, airline.callsign, route.destinationairport, route.stops, route.airline
FROM `travel-sample` route
JOIN `travel-sample` airline
FROM `travel-sample`.inventory.route
JOIN `travel-sample`.inventory.airline
ON KEYS route.airlineid
WHERE route.type = "route"
AND airline.type = "airline"
AND route.sourceairport = "SFO"
WHERE route.sourceairport = "SFO"
LIMIT 2;
----
====
