@@ -71,6 +72,9 @@ When querying, if the index name contains a `#` or `_` character, you
 keyspace-ref:: [Required] Specifies the keyspace where the index is created.
 See <<keyspace-ref>>.
 
+index-partition:: (Optional) Specifies index partitions.
+See <<index-partition>>.
+
 index-using:: (Optional) Specifies the index type.
 See <<index-using>>.
 
@@ -160,6 +164,13 @@ collection::
 For example, `airline` indicates the `airline` collection, assuming the query context is set.
 ====
 
+[[index-partition]]
+=== PARTITION BY HASH Clause
+
+Used to partition the index.
+Index partitioning helps increase the query performance by dividing and spreading a large index of documents across multiple nodes, horizontally scaling out an index as needed.
+For more information, see {index-partitioning}[Index Partitioning].
+
 [[index-using]]
 === USING Clause
 
@@ -242,6 +253,9 @@ If the value of this property is not less than the number of index nodes in the
-:description: Index Partitioning enables you to increase aggregate query performance by dividing and spreading a large index of documents across multiple nodes, horizontally scaling out an index as needed.
+:description: Index partitioning enables you to increase aggregate query performance by dividing and spreading a large index of documents across multiple nodes, horizontally scaling out an index as needed.
-image::n1ql-language-reference/index-partition.png["Syntax diagram: refer to source code listing", align=left]
+image::n1ql-language-reference/index-partition.png["Syntax diagram: see source code listing", align=left]
 
-_partition-key-expr_::
+[horizontal]
+partition-key-expr::
 A field or an expression over a field representing a partition key.
-For details and examples, refer to <<partition-keys>>.
+For details and examples, see <<partition-keys>>.
 
 [[index-with,index-with]]
 === WITH Clause
@@ -58,53 +63,96 @@ For details and examples, refer to <<partition-keys>>.
 include::partial$grammar/ddl.ebnf[tag=index-with]
 ----
 
-image::n1ql-language-reference/index-with.png["Syntax diagram: refer to source code listing", align=left]
+image::n1ql-language-reference/index-with.png["Syntax diagram: see source code listing", align=left]
 
 When creating a partitioned index, you can use the WITH clause to specify additional options for the partitions.
 
-_expr_::
-An object with the following properties:
+[horizontal#index-with-args]
+expr::
+An object with the following properties.
+
+[options="header", cols="1a,4a,1a"]
+|===
+| Name | Description | Schema
+
+| **num_partition** +
+__optional__
+| The number of partitions to divide the index into.
+For more information, see <<Number of Partitions>>.
 
-num_partition;;
-[Optional] An integer that defines the number of partitions to divide into.
-The default value is 8.
-For more details and examples, refer to <<Number of Partitions>>.
+**Default:** `8`
+| Integer
 
-nodes;;
-[Optional] An array of strings, specifying a list of nodes.
-The node list to restrict the set of nodes available for placement.
-Refer to the {index-with}[CREATE INDEX] statement for details of the syntax.
-For more details and examples, refer to <<Partition Placement>>.
+| **nodes** +
+__optional__
+| A list of nodes to restrict the set of nodes available for placement.
+For more information, see <<Partition Placement>>.
 
-defer_build;;
-[Optional] Boolean.
-When set to true, the index creation operation queues the task for building the index, but immediately pauses the building of the index.
-Refer to the {index-with}[CREATE INDEX] statement for more details.
+For details of the syntax, see {primary-index-with}[CREATE PRIMARY INDEX], {index-with}[CREATE INDEX], or {vector-index-with}[CREATE VECTOR INDEX].
+| String array
 
-num_replica;;
-[Optional] An integer specifying the number of replicas of the partitioned index to create.
+| **defer_build** +
+__optional__
+| When set to true, the index creation operation queues the task for building the index, but immediately pauses the building of the index.
+
+For more information, see {primary-index-with}[CREATE PRIMARY INDEX], {index-with}[CREATE INDEX], or {vector-index-with}[CREATE VECTOR INDEX].
+| Boolean
+
+| **num_replica** +
+__optional__
+| The number of replicas of the partitioned index to create.
 If this integer is greater than or equal to the number of index nodes in the cluster, then the index creation will fail.
-Refer to the {index-with}[CREATE INDEX] statement for more details.
 
-secKeySize;;
-[Optional] An integer, specifying the average length of the combined index keys.
-For more details and examples, refer to <<sizing-hints>>.
+For more information, see {primary-index-with}[CREATE PRIMARY INDEX], {index-with}[CREATE INDEX], or {vector-index-with}[CREATE VECTOR INDEX].
+| Integer
+
+| **secKeySize** +
+__optional__
+| A sizing hint, specifying the average length of the combined index keys.
+For more information, see <<sizing-hints>>.
+
+**Example:** `20`
+| Integer
+
+| **docKeySize** +
+__optional__
+| A sizing hint, specifying the average length of the document key `meta().id`.
+For more information, see <<sizing-hints>>.
+
+**Example:** `20`
+|Integer
+
+| **arrSize** +
+__optional__
+| A sizing hint, specifying the average length of the array fields.
+Non-array fields will be ignored.
+For more information, see <<sizing-hints>>.
+
+**Example:** `10`
+| Integer
+
+| **numDoc** +
+__optional__
+| A sizing hint, specifying the number of documents in the index.
+For more information, see <<sizing-hints>>.
 
-docKeySize;;
-[Optional] An integer, specifying the average length of the document key.
-For more details and examples, refer to <<sizing-hints>>.
+**Example:** `7303`
+| Integer
 
-arrSize;;
-[Optional] An integer, specifying the average length of the array fields.
-For more details and examples, refer to <<sizing-hints>>.
+| **residentRatio** +
+__optional__
+| A sizing hint, specifying the resident ratio of the index.
+The resident ratio is the memory usage of the index, as a percentage of its estimated data size.
+For more information, see <<sizing-hints>>.
 
-numDoc;;
-[Optional] An integer, specifying the number of documents in the index.
-For more details and examples, refer to <<sizing-hints>>.
+Couchbase recommends setting this property to `10` or higher, to avoid index build failures and other issues.
 
-residentRatio;;
-[Optional] An integer, specifying the resident ratio of the index.
-For more details and examples, refer to <<sizing-hints>>.
+**Example:** `50`
+| Integer
+|===
+
+Composite Vector indexes and Hyperscale Vector indexes support further options.
+See {index-with}[CREATE INDEX] or {vector-index-with}[CREATE VECTOR INDEX].
 
 [[partition-keys]]
 == Partition Keys
@@ -113,13 +161,13 @@ Partition keys are made up of one or more terms, with each term being the docume
 The partition keys are hashed to generate a partition ID for each document.
 The partition ID is then used to identify the partition in which the document's index keys would reside.
 
-The partition keys should be immutable, that is, its values shouldn't change once the document is created.
+The partition keys should be immutable: their values should not change once the document is created.
 For example, in the `landmark` keyspace, the field named `activity` almost never changes, and is therefore a good candidate for partition key.
 If the partition keys have changed, then the corresponding document should be deleted and recreated with the new partition keys.
 
 Each term in the partition keys can be any JSON data type: number, string, boolean, array, object, or NULL.
 If a term in the partition keys is missing in the document, the term will have a {sqlpp} MISSING value.
-Partition keys do not support {sqlpp} array expressions, e.g. `ARRAY` \... `FOR` \... `IN`.
+Partition keys do not support {sqlpp} array expressions, such as `ARRAY` \... `FOR` \... `IN`.
 
 The following table lists some examples of partition keys.
 
@@ -198,7 +246,7 @@ CREATE INDEX idx ON route
 // * NULL value
 
 [#doc-keys-as-partition-key]
-== Using Document Keys as Partition Key
+== Use Document Keys as Partition Key
 
 The simplest way to create a partitioned index is to use the document key as the partition key.
 
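As a minimal sketch of the statement described above, the following creates an index partitioned by document key. The index name `idx_route_docid` and the indexed fields are hypothetical, and the example assumes the query context is set so that `route` resolves to the intended keyspace.

[source,sqlpp]
----
/* Hypothetical index name and index keys, for illustration only. */
CREATE INDEX idx_route_docid ON route (sourceairport, destinationairport)
PARTITION BY HASH (META().id);
----

With no `num_partition` option in a WITH clause, the index uses the default number of partitions.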
@@ -223,7 +271,7 @@ With [.cmd]`meta().id` as the partition key, the index keys are evenly distribut
 Every query will gather the qualifying index keys from all the partitions.
 
 [#partition-keys-range-query]
-== Choosing Partition Keys for Range Query
+== Choose Partition Keys for Range Query
 
 An application has the option to choose the partition key that can minimize latency on a range query for a partitioned index.
 For example, let's say a query has an equality predicate based on the field `sourceairport` and `destinationairport`.
@@ -298,7 +346,7 @@ ORDER BY airline
 ====
 
 As with equality predicate in the previous examples, the query engine can select qualifying partitions using an IN clause with matching partitioned keys.
-The following example scans at most three partitions with `sourceairport "SFO"`, `"SJC"`, or `"OAK"`.
+The following example scans at most 3 partitions with `sourceairport "SFO"`, `"SJC"`, or `"OAK"`.
 
 .Create a partitioned index with partition keys matching query IN clause
 ====
@@ -398,12 +446,12 @@ CREATE INDEX idx ON route
 During index rebalancing, the rebalancer takes into account the data skew among the partitions using runtime statistics.
 It tries to even out resource utilization across the index service nodes by moving the partitions across the nodes when possible.
 
-== Choosing Partition Keys for Aggregate Query
+== Choose Partition Keys for Aggregate Query
 
 As with a range query, when an index is partitioned by document key, an aggregate query can gather the qualifying index keys from all the partitions before performing aggregation in the query engine.
-Whenever aggregate pushdown optimization is allowed, the query engine will push down "partial aggregate" calculation to each partition.
+Whenever aggregate pushdown optimization is allowed, the query engine will push down partial aggregate calculation to each partition.
 The query engine then computes the final aggregate result from the partial aggregates across all the partitions.
-For more details on aggregate query optimization, refer to {gbap}[Group By and Aggregate Performance].
+For more information on aggregate query optimization, see {gbap}[Group By and Aggregate Performance].
@@ -425,7 +473,7 @@ GROUP BY sourceairport, destinationairport;
 ----
 ====
 
-The choice of partition keys can also improve aggregate query performance when the query engine can push down the "full aggregate" calculation to the index node.
+The choice of partition keys can also improve aggregate query performance by enabling the query engine to push down the full aggregate calculation to the index node.
 In this case, the query engine does not have to recompute the final aggregate result from the index nodes.
 In addition, certain pushdown optimizations can only be enabled when a full aggregate result is expected from the index node.
 To enable a full aggregate computation, the index must be created with the following requirements:
@@ -501,40 +549,11 @@ NOTE: To avoid any downtime, before removing the partitioned index, first create
 [[sizing-hints]]
 === Sizing Hints
 
-You can optionally provide sizing hints too.
+You can optionally provide sizing hints to help place the partitions.
 Given the sizing hints, the planner uses a formula to estimate the memory and CPU usage of the index.
 Based on the estimated memory and CPU usage, the planner tries to place the partitions according to the free resources available to each index node.
 
-.Sizing Hints
-[cols="2,5,2"]
-|===
-| Optional Sizing Hint | Description | Example
-
-| *secKeySize*
-| The average length of the combined index keys.
-| `20`
-
-| *docKeySize*
-| The average length of the document key `meta().id`.
-| `20`
-
-| *arrSize*
-| The average length of the array field.
-Non-array fields will be ignored.
-| `10`
-
-| *numDoc*
-| The number of documents in the index.
-| `7303`
-
-| *residentRatio*
-| The memory usage of the index, as a percentage of its estimated data size.
-| `50`
-|===
-
-NOTE: Couchbase recommends setting the residentRatio property value over 10 to avoid issues, for example, index build failures.
-
-To provide sizing estimation, you can use a command similar to the following examples.
+For a list of sizing hints and example values, see <<index-with,WITH Clause>>.
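As an illustration of how sizing hints might be passed, the sketch below supplies several of them in the WITH clause of a partitioned index. The index name `idx_route_sized` and the choice of index keys are hypothetical. The hint values are the example values listed for each property, not recommendations for a real data set, and the query context is assumed to be set so that `route` resolves to the intended keyspace.

[source,sqlpp]
----
/* Hypothetical index; hint values reuse the documented example values. */
CREATE INDEX idx_route_sized ON route (airline, sourceairport)
PARTITION BY HASH (airline)
WITH {"secKeySize": 20, "docKeySize": 20, "numDoc": 7303, "residentRatio": 50};
----

The `arrSize` hint is omitted here because this sketch indexes no array fields.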
@@ -584,26 +603,27 @@ When an index node fails, any in-flight query requests (serviced by the failed n
 Any new query requests requiring the lost partition are then serviced by the partitions in the replica.
 
 [[rebalancing]]
-== Rebalancing
+== Rebalance
 
 When new index nodes are added or removed from the cluster, the rebalance operation attempts to move the index partitions across available index nodes in order to balance resource consumptions.
 At the time of rebalancing, the rebalance operation gathers statistics from each index.
 These statistics are fed to an optimization algorithm to determine the possible placement of each partition in order to minimize the variation of resource consumption across index nodes.
 
 The rebalancer will only attempt to balance resource consumption on a best try basis.
-There are situations where the resource consumption cannot be fully balanced.
+In some situations, the resource consumption cannot be fully balanced.
 For example:
 
 * The index service will not try to move the index if the cost to move an index across nodes is too high.
 * A cluster has a mix of non-partitioned indexes and partitioned indexes.
-* There is data skew in the partitions.
+* The partitions contain skewed data.
 
 ifdef::flag-devex-rest-api[]
 The index redistribution setting enables you to specify how Couchbase Capella redistributes indexes automatically on rebalance.
 endif::flag-devex-rest-api[]
 For more information, see {rebalancing-the-index-service}[Rebalance].
 
-== Repairing Failed Partitions
+[[repairing-failed-partitions]]
+== Repair Failed Partitions
 
 When an index node fails, the index partitions on that node will be lost.
 The lost partitions can be recovered or repaired when:
@@ -615,13 +635,14 @@ The lost partitions cannot be repaired when the number of remaining nodes is les
 
 == Performance Considerations
 
+// Nothing
+
 === Max Parallelism
 
 Along with aggregate pushdown optimization, an application can further enhance the aggregate query performance by computing aggregation in parallel for each partition in the index service.
 This can be controlled by specifying the parameter `max_parallelism` when issuing a query.
 In Couchbase Capella, `max_parallelism` is set by default to match the number of partitions of the index.
-Note that when `max_parallelism` is set to the default value, the index service uses more CPU and memory since the query traffic is increased.
-
+When `max_parallelism` is set to the default value, the index service uses more CPU and memory since the query traffic is increased.
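One way to set this parameter for a session, sketched here under assumptions, is the cbq shell's `\SET` command, which passes `max_parallelism` as a request-level parameter. The value `4` and the aggregate query are illustrative only. The example assumes that a partitioned index with `sourceairport` as its leading key already exists and that the query context is set so that `route` resolves to the intended keyspace.

[source,sqlpp]
----
/* Illustrative value; choose one suited to the index node resources. */
\SET -max_parallelism 4;

SELECT sourceairport, COUNT(*) AS route_count
FROM route
WHERE sourceairport IS NOT MISSING
GROUP BY sourceairport;
----

A lower value trades some aggregate query throughput for reduced CPU and memory use on the index service.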