From 50737652f1f2162f6f512f539c7e85538af387d7 Mon Sep 17 00:00:00 2001 From: Katya Macedo Date: Mon, 18 Dec 2023 13:35:53 -0600 Subject: [PATCH 01/15] Refactor streaming ingestion docs --- docs/api-reference/supervisor-api.md | 286 ++++++++++-- docs/configuration/index.md | 2 +- .../extensions-core/kafka-ingestion.md | 411 ++++++++++-------- .../kafka-supervisor-operations.md | 287 ------------ .../kafka-supervisor-reference.md | 261 ----------- .../extensions-core/kinesis-ingestion.md | 394 ++--------------- docs/ingestion/data-formats.md | 172 ++++++-- docs/ingestion/ingestion-spec.md | 2 +- docs/ingestion/streaming.md | 35 ++ docs/ingestion/supervisor.md | 117 +++++ docs/querying/sql-metadata-tables.md | 2 +- docs/tutorials/tutorial-kafka.md | 4 +- website/redirects.js | 4 +- website/sidebars.json | 4 +- 14 files changed, 821 insertions(+), 1160 deletions(-) delete mode 100644 docs/development/extensions-core/kafka-supervisor-operations.md delete mode 100644 docs/development/extensions-core/kafka-supervisor-reference.md create mode 100644 docs/ingestion/streaming.md create mode 100644 docs/ingestion/supervisor.md diff --git a/docs/api-reference/supervisor-api.md b/docs/api-reference/supervisor-api.md index c5f6c0762709..ddbb67c819e0 100644 --- a/docs/api-reference/supervisor-api.md +++ b/docs/api-reference/supervisor-api.md @@ -36,7 +36,7 @@ The following table lists the properties of a supervisor object: |Property|Type|Description| |---|---|---| |`id`|String|Unique identifier.| -|`state`|String|Generic state of the supervisor. Available states:`UNHEALTHY_SUPERVISOR`, `UNHEALTHY_TASKS`, `PENDING`, `RUNNING`, `SUSPENDED`, `STOPPING`. See [Apache Kafka operations](../development/extensions-core/kafka-supervisor-operations.md) for details.| +|`state`|String|Generic state of the supervisor. Available states:`UNHEALTHY_SUPERVISOR`, `UNHEALTHY_TASKS`, `PENDING`, `RUNNING`, `SUSPENDED`, `STOPPING`. See [Supervisor reference](../ingestion/supervisor.md#status-report) for more information.| |`detailedState`|String|Detailed state of the supervisor. This property contains a more descriptive, implementation-specific state that may provide more insight into the supervisor's activities than the `state` property. See [Apache Kafka ingestion](../development/extensions-core/kafka-ingestion.md) and [Amazon Kinesis ingestion](../development/extensions-core/kinesis-ingestion.md) for supervisor-specific states.| |`healthy`|Boolean|Supervisor health indicator.| |`spec`|Object|Container object for the supervisor configuration.| @@ -1205,9 +1205,7 @@ Host: http://ROUTER_IP:ROUTER_PORT Retrieves the current status report for a single supervisor. The report contains the state of the supervisor tasks and an array of recently thrown exceptions. -For additional information about the status report, see the topic for each streaming ingestion methods: -* [Amazon Kinesis](../development/extensions-core/kinesis-ingestion.md#get-supervisor-status-report) -* [Apache Kafka](../development/extensions-core/kafka-supervisor-operations.md#getting-supervisor-status-report) +For additional information about the status report, see [Supervisor reference](../ingestion/supervisor.md#status-report). #### URL @@ -1309,13 +1307,184 @@ Host: http://ROUTER_IP:ROUTER_PORT ``` +### Get supervisor health + +Retrieves the current health report for a single supervisor. The health of a supervisor is determined by the supervisor's `state` (as returned by the `/status` endpoint) and the `druid.supervisor.*` Overlord configuration thresholds. 
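As a sketch of how these thresholds interact, the following Overlord runtime properties (shown here with their default values) control how many consecutive successful or failed supervisor runs and task completions move a supervisor between healthy and unhealthy states:

```
druid.supervisor.healthinessThreshold=3
druid.supervisor.unhealthinessThreshold=3
druid.supervisor.taskHealthinessThreshold=3
druid.supervisor.taskUnhealthinessThreshold=3
```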
+ +#### URL + +GET /druid/indexer/v1/supervisor/:supervisorId/health + +#### Responses + + + + + +*Supervisor is healthy* + + + + + +*Invalid supervisor ID* + + + + + +*Supervisor is unhealthy* + + + + + +--- + +#### Sample request + +The following example shows how to retrieve the health report for a supervisor with the name `social_media`. + + + + + +```shell +curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/supervisor/social_media/health" +``` + + + + +```HTTP +GET /druid/indexer/v1/supervisor/social_media/health HTTP/1.1 +Host: http://ROUTER_IP:ROUTER_PORT +``` + + + + +#### Sample response + +
+ Click to show sample response + + ```json + { + "healthy": false + } + ``` +
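Because the endpoint reports health through the HTTP status code (`200 OK` when healthy, `503 Service Unavailable` when unhealthy), you can call it directly from scripts or load balancer probes. The following is a minimal sketch that relies only on the exit code of `curl --fail`; the host and supervisor name are placeholders:

```shell
# Exits the "if" branch when the supervisor reports healthy (HTTP 200),
# and the "else" branch on HTTP 4xx/5xx (unhealthy or unknown supervisor).
if curl --silent --fail "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/supervisor/social_media/health" > /dev/null; then
  echo "Supervisor is healthy"
else
  echo "Supervisor is unhealthy or does not exist"
fi
```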
+ +### Get supervisor ingestion stats + +Returns a snapshot of the current ingestion row counters for each task being managed by the supervisor, along with moving averages for the row counters. See [Row stats](../ingestion/tasks.md#row-stats) for more information. + +#### URL + +GET /druid/indexer/v1/supervisor/:supervisorId/stats + +#### Responses + + + + + +*Successfully retrieved supervisor stats* + + + + + +*Invalid supervisor ID* + + + + + +--- + +#### Sample request + +The following example shows how to retrieve the current ingestion row counters for a supervisor with the name `custom_data`. + + + + + + +```shell +curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/supervisor/custom_data/stats" +``` + + + + + +```HTTP +GET /druid/indexer/v1/supervisor/custom_data/stats HTTP/1.1 +Host: http://ROUTER_IP:ROUTER_PORT +``` + + + + +#### Sample response + +
+ Click to show sample response + + ```json + { + "0": { + "index_kafka_custom_data_881d621078f6b7c_ccplchbi": { + "movingAverages": { + "buildSegments": { + "5m": { + "processed": 53.401225142603316, + "processedBytes": 5226.400757148808, + "unparseable": 0.0, + "thrownAway": 0.0, + "processedWithError": 0.0 + }, + "15m": { + "processed": 56.92994990102502, + "processedBytes": 5571.772059828217, + "unparseable": 0.0, + "thrownAway": 0.0, + "processedWithError": 0.0 + }, + "1m": { + "processed": 37.134921285556636, + "processedBytes": 3634.2766230628677, + "unparseable": 0.0, + "thrownAway": 0.0, + "processedWithError": 0.0 + } + } + }, + "totals": { + "buildSegments": { + "processed": 665, + "processedBytes": 65079, + "processedWithError": 0, + "thrownAway": 0, + "unparseable": 0 + } + } + } + } + } + ``` +
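As in the sample above, the report is keyed by task group ID and then by task ID. The following sketch, which assumes you have `jq` installed and uses the same placeholders as the other examples on this page, pulls each task's five-minute moving average of processed rows out of the report:

```shell
# Print one {task, processed5m} object per indexing task.
curl --silent "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/supervisor/custom_data/stats" \
  | jq '.[] | to_entries[] | {task: .key, processed5m: .value.movingAverages.buildSegments["5m"].processed}'
```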

 ## Audit history
 
 An audit history provides a comprehensive log of events, including supervisor configuration, creation, suspension, and modification history.
 
 ### Get audit history for all supervisors
 
-Retrieve an audit history of specs for all supervisors.
+Retrieves an audit history of specs for all supervisors.
 
 #### URL
 
@@ -1325,7 +1494,7 @@ Retrieve an audit history of specs for all supervisors.
 
 
 
-
+
 
 *Successfully retrieved audit history*
 
@@ -1339,7 +1508,7 @@ Retrieve an audit history of specs for all supervisors.
 
 
-
+
 
 ```shell
 curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/supervisor/history"
 ```
 
-
+
 
 ```HTTP
@@ -1686,13 +1855,13 @@ Retrieves an audit history of specs for a single supervisor.
 
 
-
+
 
 *Successfully retrieved supervisor audit history*
 
 
-
+
 
 *Invalid supervisor ID*
 
@@ -1716,7 +1885,7 @@ curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/supervisor/wikipedia_stream/
 ```
 
-
+
 
 ```HTTP
@@ -2046,9 +2215,22 @@ Host: http://ROUTER_IP:ROUTER_PORT
 
 Creates a new supervisor or updates an existing one for the same datasource with a new schema and configuration.
 
+You can define a supervisor spec for [Apache Kafka](../development/extensions-core/kafka-ingestion.md#supervisor-spec) or [Amazon Kinesis](../development/extensions-core/kinesis-ingestion.md#supervisor-spec) streaming ingestion methods. Once created, the supervisor persists in the metadata database.
+
+The following table lists the properties of a supervisor spec:
+
+|Property|Type|Description|Required|
+|--------|----|-----------|--------|
+|`type`|String|The supervisor type. Choose from `kafka` or `kinesis`.|Yes|
+|`spec`|Object|The container object for the supervisor configuration.|Yes|
+|`ioConfig`|Object|The I/O configuration object to define the connection and I/O-related settings for the supervisor and indexing task.|Yes|
+|`dataSchema`|Object|The schema for the indexing task to use during ingestion. See [`dataSchema`](../ingestion/ingestion-spec.md#dataschema) for more information.|Yes|
+|`tuningConfig`|Object|The tuning configuration object to define performance-related settings for the supervisor and indexing tasks.|No|
 
 When you call this endpoint on an existing supervisor for the same datasource, the running supervisor signals its tasks to stop reading and begin publishing, exiting itself. Druid then uses the provided configuration from the request body to create a new supervisor. Druid submits a new schema while retaining existing publishing tasks and starts new tasks at the previous task offsets.
+In this way, configuration changes can be applied without requiring any pause in ingestion.
+
+You can achieve seamless schema migrations by submitting the new schema using the `/druid/indexer/v1/supervisor` endpoint.
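As the sample request below shows, the `dataSchema`, `ioConfig`, and `tuningConfig` objects are nested inside the top-level `spec` object. A skeleton of the request body looks like the following, where the `...` placeholders stand for the method-specific configuration:

```json
{
  "type": "kafka",
  "spec": {
    "dataSchema": { ... },
    "ioConfig": { ... },
    "tuningConfig": { ... }
  }
}
```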
#### URL @@ -2058,13 +2240,13 @@ When you call this endpoint on an existing supervisor for the same datasource, t - + *Successfully created a new supervisor or updated an existing supervisor* - + *Request body content type is not in JSON format* @@ -2080,8 +2262,7 @@ The following example uses JSON input format to create a supervisor spec for Kaf - - + ```shell curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/supervisor" \ @@ -2139,8 +2320,8 @@ curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/supervisor" \ ``` - + ```HTTP POST /druid/indexer/v1/supervisor HTTP/1.1 @@ -2218,6 +2399,7 @@ Content-Length: 1359 ### Suspend a running supervisor Suspends a single running supervisor. Returns the updated supervisor spec, where the `suspended` property is set to `true`. The suspended supervisor continues to emit logs and metrics. +Indexing tasks remain suspended until the supervisor is resumed. #### URL POST /druid/indexer/v1/supervisor/:supervisorId/suspend @@ -2226,19 +2408,19 @@ Suspends a single running supervisor. Returns the updated supervisor spec, where - + *Successfully shut down supervisor* - + *Supervisor already suspended* - + *Invalid supervisor ID* @@ -2254,7 +2436,7 @@ The following example shows how to suspend a running supervisor with the name `s - + ```shell @@ -2262,7 +2444,7 @@ curl --request POST "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/supervisor/so ``` - + ```HTTP @@ -2592,7 +2774,7 @@ Suspends all supervisors. Note that this endpoint returns an HTTP `200 Success` - + *Successfully suspended all supervisors* @@ -2606,7 +2788,7 @@ Suspends all supervisors. Note that this endpoint returns an HTTP `200 Success` - + ```shell @@ -2614,7 +2796,7 @@ curl --request POST "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/supervisor/su ``` - + ```HTTP @@ -2649,19 +2831,19 @@ Resumes indexing tasks for a supervisor. Returns an updated supervisor spec with - + *Successfully resumed supervisor* - + *Supervisor already running* - + *Invalid supervisor ID* @@ -2677,7 +2859,7 @@ The following example resumes a previously suspended supervisor with name `socia - + ```shell @@ -2685,7 +2867,7 @@ curl --request POST "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/supervisor/so ``` - + ```HTTP @@ -3016,7 +3198,7 @@ Resumes all supervisors. Note that this endpoint returns an HTTP `200 Success` c - + *Successfully resumed all supervisors* @@ -3030,7 +3212,7 @@ Resumes all supervisors. Note that this endpoint returns an HTTP `200 Success` c - + ```shell @@ -3038,7 +3220,7 @@ curl --request POST "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/supervisor/re ``` - + ```HTTP @@ -3063,10 +3245,16 @@ Host: http://ROUTER_IP:ROUTER_PORT ### Reset a supervisor -Resets the specified supervisor. This endpoint clears _all_ stored offsets in Kafka or sequence numbers in Kinesis, prompting the supervisor to resume data reading. The supervisor will start from the earliest or latest available position, depending on the platform (offsets in Kafka or sequence numbers in Kinesis). It kills and recreates active tasks to read from valid positions. +The supervisor must be running for this endpoint to be available. + +Resets the specified supervisor. This endpoint clears all stored offsets in Kafka or sequence numbers in Kinesis, prompting the supervisor to resume data reading. The supervisor will start from the earliest or latest available position, depending on the platform (offsets in Kafka or sequence numbers in Kinesis). 
+After clearing all stored offsets in Kafka or sequence numbers in Kinesis, the supervisor kills and recreates active tasks, +so that tasks begin reading from valid positions. Use this endpoint to recover from a stopped state due to missing offsets in Kafka or sequence numbers in Kinesis. Use this endpoint with caution as it may result in skipped messages and lead to data loss or duplicate data. +The indexing service keeps track of the latest persisted offsets in Kafka or sequence numbers in Kinesis to provide exactly-once ingestion guarantees across tasks. Subsequent tasks must start reading from where the previous task completed for the generated segments to be accepted. If the messages at the expected starting offsets in Kafka or sequence numbers in Kinesis are no longer available (typically because the message retention period has elapsed or the topic was removed and re-created) the supervisor will refuse to start and in flight tasks will fail. This endpoint enables you to recover from this condition. + #### URL POST /druid/indexer/v1/supervisor/:supervisorId/reset @@ -3075,13 +3263,13 @@ Use this endpoint to recover from a stopped state due to missing offsets in Kafk - + *Successfully reset supervisor* - + *Invalid supervisor ID* @@ -3097,7 +3285,7 @@ The following example shows how to reset a supervisor with the name `social_medi - + ```shell @@ -3105,7 +3293,7 @@ curl --request POST "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/supervisor/so ``` - + ```HTTP @@ -3128,13 +3316,19 @@ Host: http://ROUTER_IP:ROUTER_PORT ``` -### Reset Offsets for a supervisor +### Reset offsets for a supervisor + +The supervisor must be running for this endpoint to be available. + +Resets the specified offsets for partitions without resetting the entire set. + +This endpoint clears only the specified offsets in Kafka or sequence numbers in Kinesis, prompting the supervisor to resume reading data from the specified offsets. +If there are no stored offsets, the specified offsets are set in the metadata store. -Resets the specified offsets for a supervisor. This endpoint clears _only_ the specified offsets in Kafka or sequence numbers in Kinesis, prompting the supervisor to resume data reading. -If there are no stored offsets, the specified offsets will be set in the metadata store. The supervisor will start from the reset offsets for the partitions specified and for the other partitions from the stored offset. -It kills and recreates active tasks pertaining to the partitions specified to read from valid offsets. +After resetting stored offsets, the supervisor kills and recreates any active tasks pertaining to the specified partitions, +so that tasks begin reading specified offsets. For partitions that are not specified in this operation, the supervisor resumes from the last stored offset. -Use this endpoint to selectively reset offsets for partitions without resetting the entire set. +Use this endpoint with caution as it may result in skipped messages, leading to data loss or duplicate data. #### URL @@ -3180,8 +3374,8 @@ The following table defines the fields within the `partitions` object in the res #### Sample request -The following example shows how to reset offsets for a kafka supervisor with the name `social_media`. Let's say the supervisor is reading -from a kafka topic `ads_media_stream` and has the stored offsets: `{"0": 0, "1": 10, "2": 20, "3": 40}`. +The following example shows how to reset offsets for a Kafka supervisor with the name `social_media`. 
Let's say the supervisor is reading
from a Kafka topic `ads_media_stream` and has the stored offsets: `{"0": 0, "1": 10, "2": 20, "3": 40}`.




@@ -3216,7 +3410,7 @@ Content-Type: application/json
 }
 ```
 
-The above operation will reset offsets only for partitions 0 and 2 to 100 and 650 respectively. After a successful reset,
+The above operation will reset offsets only for partitions `0` and `2` to 100 and 650 respectively. After a successful reset,
 when the supervisor's tasks restart, they will resume reading from `{"0": 100, "1": 10, "2": 650, "3": 40}`.
 
 
diff --git a/docs/configuration/index.md b/docs/configuration/index.md
index 3c4ef3024203..c539096e4ea7 100644
--- a/docs/configuration/index.md
+++ b/docs/configuration/index.md
@@ -1159,7 +1159,7 @@ If autoscaling is enabled, you can set these additional configs:
 |`druid.supervisor.idleConfig.enabled`|If `true`, supervisor can become idle if there is no data on input stream/topic for some time.|false|
 |`druid.supervisor.idleConfig.inactiveAfterMillis`|Supervisor is marked as idle if all existing data has been read from input topic and no new data has been published for `inactiveAfterMillis` milliseconds.|`600_000`|
 
-The `druid.supervisor.idleConfig.*` specified in the Overlord runtime properties defines the default behavior for the entire cluster. See [Idle Configuration in Kafka Supervisor IOConfig](../development/extensions-core/kafka-supervisor-reference.md#supervisor-io-configuration) to override it for an individual supervisor.
+The `druid.supervisor.idleConfig.*` specified in the Overlord runtime properties defines the default behavior for the entire cluster. See [Idle supervisor configuration](../development/extensions-core/kafka-ingestion.md#idle-supervisor-configuration) to override it for an individual supervisor.
 
 #### Overlord dynamic configuration
 
diff --git a/docs/development/extensions-core/kafka-ingestion.md b/docs/development/extensions-core/kafka-ingestion.md
index 329967747bfa..c842deef7c51 100644
--- a/docs/development/extensions-core/kafka-ingestion.md
+++ b/docs/development/extensions-core/kafka-ingestion.md
@@ -26,45 +26,42 @@ description: "Overview of the Kafka indexing service for Druid. Includes example
 
 When you enable the Kafka indexing service, you can configure supervisors on the Overlord to manage the creation and lifetime of Kafka indexing tasks.
 
-Kafka indexing tasks read events using Kafka's own partition and offset mechanism to guarantee exactly-once ingestion. The supervisor oversees the state of the indexing tasks to:
-  - coordinate handoffs
-  - manage failures
-  - ensure that scalability and replication requirements are maintained.
+Kafka indexing tasks read events using Kafka's own partition and offset mechanism to guarantee exactly-once ingestion. The supervisor oversees the state of the indexing tasks to coordinate handoffs, manage failures, and ensure that scalability and replication requirements are maintained.
 
+This topic contains configuration reference information for the Kafka indexing service supervisor for Apache Druid.
 
-This topic covers how to submit a supervisor spec to ingest event data, also known as message data, from Kafka. See the following for more information:
-- For a reference of Kafka supervisor spec configuration options, see the [Kafka supervisor reference](./kafka-supervisor-reference.md).
-- For operations reference information to help run and maintain Apache Kafka supervisors, see [Kafka supervisor operations](./kafka-supervisor-operations.md).
-- For a walk-through, see the [Loading from Apache Kafka](../../tutorials/tutorial-kafka.md) tutorial. +## Setup -## Kafka support +To use the Kafka indexing service, you must first load the `druid-kafka-indexing-service` extension on both the Overlord and the MiddleManager. See [Loading extensions](../../configuration/extensions.md) for more information. + +### Kafka support The Kafka indexing service supports transactional topics introduced in Kafka 0.11.x by default. The consumer for Kafka indexing service is incompatible with older Kafka brokers. If you are using an older version, refer to the [Kafka upgrade guide](https://kafka.apache.org/documentation/#upgrade). Additionally, you can set `isolation.level` to `read_uncommitted` in `consumerProperties` if either: - You don't need Druid to consume transactional topics. -- You need Druid to consume older versions of Kafka. Make sure offsets are sequential, since there is no offset gap check in Druid anymore. - -If your Kafka cluster enables consumer-group based ACLs, you can set `group.id` in `consumerProperties` to override the default auto generated group id. +- You need Druid to consume older versions of Kafka. Make sure offsets are sequential, since there is no offset gap check in Druid. -## Load the Kafka indexing service +If your Kafka cluster enables consumer-group based ACLs, you can set `group.id` in `consumerProperties` to override the default auto generated group ID. -To use the Kafka indexing service, load the `druid-kafka-indexing-service` extension on both the Overlord and the MiddleManagers. See [Loading extensions](../../configuration/extensions.md) for instructions on how to configure extensions. +## Supervisor spec -## Define a supervisor spec +Similar to the ingestion spec for batch ingestion, the [supervisor spec](../../ingestion/supervisor.md#supervisor-spec) configures the data ingestion for Kafka streaming ingestion. -Similar to the ingestion spec for batch ingestion, the supervisor spec configures the data ingestion for Kafka streaming ingestion. A supervisor spec has the following sections: -- `dataSchema` to specify the Druid datasource name, primary timestamp, dimensions, metrics, transforms, and any necessary filters. -- `ioConfig` to configure Kafka connection settings and configure how Druid parses the data. Kafka-specific connection details go in the `consumerProperties`. The `ioConfig` is also where you define the input format (`inputFormat`) of your Kafka data. For supported formats for Kafka and information on how to configure the input format, see [Data formats](../../ingestion/data-formats.md). -- `tuningConfig` to control various tuning parameters specific to each ingestion method. -For a full description of all the fields and parameters in a Kafka supervisor spec, see the [Kafka supervisor reference](./kafka-supervisor-reference.md). +The following table outlines the high-level configuration options for the Kafka supervisor spec: +|Property|Type|Description|Required| +|--------|----|-----------|--------| +|`type`|String|The supervisor type; this should always be `kafka`.|Yes| +|`spec`|Object|The container object for the supervisor configuration.|Yes| +|`ioConfig`|Object|The I/O configuration object to define the connection and I/O-related settings for the supervisor and indexing tasks.|Yes| +|`dataSchema`|Object|The schema for the indexing task to use during ingestion. 
See [`dataSchema`](../../ingestion/ingestion-spec.md#dataschema) for more information.|Yes| +|`tuningConfig`|Object|The tuning configuration object to define performance-related settings for the supervisor and indexing tasks.|No| -The following sections contain examples to help you get started with supervisor specs. +The following example shows a supervisor spec for the Kafka indexing service. -### JSON input format supervisor spec example - -The following example demonstrates a supervisor spec for Kafka that uses the `JSON` input format. In this case Druid parses the event contents in JSON format: +
+ Click to view the example ```json { @@ -130,172 +127,244 @@ The following example demonstrates a supervisor spec for Kafka that uses the `JS } ``` -### Kafka input format supervisor spec example - -If you want to parse the Kafka metadata fields in addition to the Kafka payload value contents, you can use the `kafka` input format. - -The `kafka` input format wraps around the payload parsing input format and augments the data it outputs with the Kafka event timestamp, -the Kafka topic name, the Kafka event headers, and the key field that itself can be parsed using any available InputFormat. - -For example, consider the following structure for a Kafka message that represents a fictitious wiki edit in a development environment: - -- **Kafka timestamp**: `1680795276351` -- **Kafka topic**: `wiki-edits` -- **Kafka headers**: - - `env=development` - - `zone=z1` -- **Kafka key**: `wiki-edit` -- **Kafka payload value**: `{"channel":"#sv.wikipedia","timestamp":"2016-06-27T00:00:11.080Z","page":"Salo Toraut","delta":31,"namespace":"Main"}` - -Using `{ "type": "json" }` as the input format would only parse the payload value. -To parse the Kafka metadata in addition to the payload, use the `kafka` input format. - -You would configure it as follows: - -- `valueFormat`: Define how to parse the payload value. Set this to the payload parsing input format (`{ "type": "json" }`). -- `timestampColumnName`: Supply a custom name for the Kafka timestamp in the Druid schema to avoid conflicts with columns from the payload. The default is `kafka.timestamp`. -- `topicColumnName`: Supply a custom name for the Kafka topic in the Druid schema to avoid conflicts with columns from the payload. The default is `kafka.topic`. This field is useful when ingesting data from multiple topics into same datasource. -- `headerFormat`: The default value `string` decodes strings in UTF-8 encoding from the Kafka header. - Other supported encoding formats include the following: - - `ISO-8859-1`: ISO Latin Alphabet No. 1, that is, ISO-LATIN-1. - - `US-ASCII`: Seven-bit ASCII. Also known as ISO646-US. The Basic Latin block of the Unicode character set. - - `UTF-16`: Sixteen-bit UCS Transformation Format, byte order identified by an optional byte-order mark. - - `UTF-16BE`: Sixteen-bit UCS Transformation Format, big-endian byte order. - - `UTF-16LE`: Sixteen-bit UCS Transformation Format, little-endian byte order. -- `headerColumnPrefix`: Supply a prefix to the Kafka headers to avoid any conflicts with columns from the payload. The default is `kafka.header.`. - Considering the header from the example, Druid maps the headers to the following columns: `kafka.header.env`, `kafka.header.zone`. -- `keyFormat`: Supply an input format to parse the key. Only the first value will be used. - If, as in the example, your key values are simple strings, then you can use the `tsv` format to parse them. - ``` - { - "type": "tsv", - "findColumnsFromHeader": false, - "columns": ["x"] - } - ``` - Note that for `tsv`,`csv`, and `regex` formats, you need to provide a `columns` array to make a valid input format. Only the first one is used, and its name will be ignored in favor of `keyColumnName`. -- `keyColumnName`: Supply the name for the Kafka key column to avoid conflicts with columns from the payload. The default is `kafka.key`. +
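To start ingestion with a spec like the one above, submit it to the `/druid/indexer/v1/supervisor` endpoint described in the [Supervisor API](../../api-reference/supervisor-api.md). For example, assuming the spec is saved in a file named `supervisor-spec.json`:

```shell
curl --request POST "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/supervisor" \
--header 'Content-Type: application/json' \
--data @supervisor-spec.json
```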

### I/O configuration

The following table outlines the configuration options for `ioConfig`:

|Property|Type|Description|Required|Default|
|--------|----|-----------|--------|-------|
|`topic`|String|The Kafka topic to read from. Must be a single, specific topic; this field does not accept topic patterns. To ingest data from multiple topics, see [Ingest from multiple topics](#ingest-from-multiple-topics).|Yes||
|`inputFormat`|Object|The [input format](../../ingestion/data-formats.md#input-format) to define input data parsing.|Yes||
|`consumerProperties`|String, Object|A map of properties to pass to the Kafka consumer. See [Consumer properties](#consumer-properties) for details.|Yes||
|`pollTimeout`|Long|The length of time to wait for the Kafka consumer to poll records, in milliseconds.|No|100|
|`replicas`|Integer|The number of replica sets, where 1 is a single set of tasks (no replication). Druid always assigns replica tasks to different workers to provide resiliency against process failure.|No|1|
|`taskCount`|Integer|The maximum number of reading tasks in a replica set. The maximum number of reading tasks equals `taskCount * replicas`. The total number of tasks, reading and publishing, is greater than this count. See [Capacity planning](../../ingestion/supervisor.md#capacity-planning) for more details. When `taskCount > {numKafkaPartitions}`, the actual number of reading tasks is less than the `taskCount` value.|No|1|
|`taskDuration`|ISO 8601 period|The length of time before tasks stop reading and begin publishing segments.|No|PT1H|
|`startDelay`|ISO 8601 period|The period to wait before the supervisor starts managing tasks.|No|PT5S|
|`period`|ISO 8601 period|Determines how often the supervisor executes its management logic. Note that the supervisor also runs in response to certain events, such as tasks succeeding, failing, and reaching their task duration. The `period` value specifies the maximum time between iterations.|No|PT30S|
|`useEarliestOffset`|Boolean|If a supervisor manages a datasource for the first time, it obtains a set of starting offsets from Kafka. This flag determines whether it retrieves the earliest or latest offsets in Kafka. Under normal circumstances, subsequent tasks start from where the previous segments ended. Druid only uses `useEarliestOffset` on the first run.|No|`false`|
|`completionTimeout`|ISO 8601 period|The length of time to wait before declaring a publishing task as failed and terminating it. If the value is too low, your tasks may never publish. The publishing clock for a task begins roughly after `taskDuration` elapses.|No|PT30M|
|`lateMessageRejectionStartDateTime`|ISO 8601 date time|Configures tasks to reject messages with timestamps earlier than this date time. For example, if this property is set to `2016-01-01T11:00Z` and the supervisor creates a task at `2016-01-01T12:00Z`, Druid drops messages with timestamps earlier than `2016-01-01T11:00Z`. This can prevent concurrency issues if your data stream has late messages and you have multiple pipelines that need to operate on the same segments, such as a realtime and a nightly batch ingestion pipeline.|No||
|`lateMessageRejectionPeriod`|ISO 8601 period|Configures tasks to reject messages with timestamps earlier than this period before the task was created. For example, if this property is set to `PT1H` and the supervisor creates a task at `2016-01-01T12:00Z`, Druid drops messages with timestamps earlier than `2016-01-01T11:00Z`. This may help prevent concurrency issues if your data stream has late messages and you have multiple pipelines that need to operate on the same segments, such as a realtime and a nightly batch ingestion pipeline. Note that you can specify only one of the late message rejection properties.|No||
|`earlyMessageRejectionPeriod`|ISO 8601 period|Configures tasks to reject messages with timestamps later than this period after the task reached its task duration. For example, if this property is set to `PT1H`, the task duration is set to `PT1H` and the supervisor creates a task at `2016-01-01T12:00Z`, Druid drops messages with timestamps later than `2016-01-01T14:00Z`. Tasks sometimes run past their task duration, such as in cases of supervisor failover. Setting `earlyMessageRejectionPeriod` too low may cause Druid to drop messages unexpectedly whenever a task runs past its originally configured task duration.|No||
|`autoScalerConfig`|Object|Defines auto scaling behavior for ingestion tasks. See [Task autoscaler](#task-autoscaler) for more information.|No|null|
|`idleConfig`|Object|Defines how and when the Kafka supervisor can become idle. See [Idle supervisor configuration](#idle-supervisor-configuration) for more details.|No|null|

#### Consumer properties

Consumer properties must contain a property `bootstrap.servers` with a list of Kafka brokers in the form `<BROKER_1>:<PORT_1>,<BROKER_2>:<PORT_2>,...`.
By default, `isolation.level` is set to `read_committed`. If you use older versions of Kafka servers without transactions support or don't want Druid to consume only committed transactions, set `isolation.level` to `read_uncommitted`.

In some cases, you may need to fetch consumer properties at runtime, for example, when `bootstrap.servers` is not known upfront or is not static. To enable SSL connections, you must supply the `keystore`, `truststore`, and `key` passwords as secrets. You can provide configurations at runtime with a dynamic config provider implementation, like the environment variable config provider that comes with Druid. For more information, see [Dynamic config provider](../../operations/dynamic-config-provider.md).

For example, if you are using SASL and SSL with Kafka, set the following environment variables for the Druid user on the machines running the Overlord and the Peon services:

```
export KAFKA_JAAS_CONFIG="org.apache.kafka.common.security.plain.PlainLoginModule required username='admin_user' password='admin_password';"
export SSL_KEY_PASSWORD=mysecretkeypassword
export SSL_KEYSTORE_PASSWORD=mysecretkeystorepassword
export SSL_TRUSTSTORE_PASSWORD=mysecrettruststorepassword
```

Then map each Kafka property to its environment variable in the supervisor's consumer properties using the dynamic config provider:

```json
  "druid.dynamic.config.provider": {
    "type": "environment",
    "variables": {
      "sasl.jaas.config": "KAFKA_JAAS_CONFIG",
      "ssl.key.password": "SSL_KEY_PASSWORD",
      "ssl.keystore.password": "SSL_KEYSTORE_PASSWORD",
      "ssl.truststore.password": "SSL_TRUSTSTORE_PASSWORD"
    }
  }
```

Verify that you've changed the values for all configurations to match your own environment. In the Druid data loader interface, you can use the environment variable config provider syntax in the **Consumer properties** field on the **Connect tab**. When connecting to Kafka, Druid replaces the environment variables with their corresponding values.

#### Task autoscaler

You can optionally configure autoscaling behavior for ingestion tasks using the `autoScalerConfig` property of the `ioConfig` object.

The following table outlines the configuration options for `autoScalerConfig`:

|Property|Description|Required|Default|
|--------|-----------|--------|-------|
|`enableTaskAutoScaler`|Enables the auto scaler. If not specified, Druid disables the auto scaler even when `autoScalerConfig` is not null.|No|`false`|
|`taskCountMax`|Maximum number of ingestion tasks. Set `taskCountMax >= taskCountMin`. If `taskCountMax > {numKafkaPartitions}`, Druid only scales reading tasks up to `{numKafkaPartitions}`. In this case, `taskCountMax` is ignored.|Yes||
|`taskCountMin`|Minimum number of ingestion tasks. When you enable the autoscaler, Druid ignores the value of `taskCount` in `ioConfig` and starts with the `taskCountMin` number of tasks to launch.|Yes||
|`minTriggerScaleActionFrequencyMillis`|Minimum time interval between two scale actions.|No|600000|
|`autoScalerStrategy`|The algorithm of `autoScaler`. Druid only supports the `lagBased` strategy. See [Autoscaler strategy](#autoscaler-strategy) for more information.|No|`lagBased`|

##### Autoscaler strategy

The following table outlines the configuration options for `autoScalerStrategy`:

|Property|Description|Required|Default|
|--------|-----------|--------|-------|
|`lagCollectionIntervalMillis`|The time period during which Druid collects lag metric points.|No|30000|
|`lagCollectionRangeMillis`|The total time window of lag collection. Use with `lagCollectionIntervalMillis` to specify the intervals at which to collect lag metric points.|No|600000|
|`scaleOutThreshold`|The threshold of the scale out action.|No|6000000|
|`triggerScaleOutFractionThreshold`|Enables scale out action if `triggerScaleOutFractionThreshold` percent of lag points is higher than `scaleOutThreshold`.|No|0.3|
|`scaleInThreshold`|The threshold of the scale in action.|No|1000000|
|`triggerScaleInFractionThreshold`|Enables scale in action if `triggerScaleInFractionThreshold` percent of lag points is lower than `scaleInThreshold`.|No|0.9|
|`scaleActionStartDelayMillis`|The number of milliseconds to delay after the supervisor starts before the first scale logic check.|No|300000|
|`scaleActionPeriodMillis`|The frequency in milliseconds to check if a scale action is triggered.|No|60000|
|`scaleInStep`|The number of tasks to reduce at once when scaling down.|No|1|
|`scaleOutStep`|The number of tasks to add at once when scaling out.|No|2|

#### Idle supervisor configuration

:::info
Idle state transitioning is currently designated as experimental.
:::

When the supervisor enters the idle state, no new tasks are launched subsequent to the completion of the currently executing tasks. This strategy may lead to reduced costs for cluster operators while using topics that get sporadic data.

The following table outlines the configuration options for `idleConfig`:

|Property|Description|Required|Default|
|--------|-----------|--------|-------|
|`enabled`|If `true`, the supervisor becomes idle if there is no data on input stream or topic for some time.|No|`false`|
|`inactiveAfterMillis`|The supervisor becomes idle if all existing data has been read from input topic and no new data has been published for `inactiveAfterMillis` milliseconds.|No|`600_000`|

The following example shows a supervisor spec with `lagBased` autoscaler and idle configuration enabled:

<details>
+ Click to view the example -Where the file `supervisor-spec.json` contains your Kafka supervisor spec file. +```json +{ + "type": "kafka", + "spec": { + "dataSchema": { + ... + }, + "ioConfig": { + "topic": "metrics", + "inputFormat": { + "type": "json" + }, + "consumerProperties": { + "bootstrap.servers": "localhost:9092" + }, + "autoScalerConfig": { + "enableTaskAutoScaler": true, + "taskCountMax": 6, + "taskCountMin": 2, + "minTriggerScaleActionFrequencyMillis": 600000, + "autoScalerStrategy": "lagBased", + "lagCollectionIntervalMillis": 30000, + "lagCollectionRangeMillis": 600000, + "scaleOutThreshold": 6000000, + "triggerScaleOutFractionThreshold": 0.3, + "scaleInThreshold": 1000000, + "triggerScaleInFractionThreshold": 0.9, + "scaleActionStartDelayMillis": 300000, + "scaleActionPeriodMillis": 60000, + "scaleInStep": 1, + "scaleOutStep": 2 + }, + "taskCount":1, + "replicas":1, + "taskDuration":"PT1H", + "idleConfig": { + "enabled": true, + "inactiveAfterMillis": 600000 + } + }, + "tuningConfig":{ + ... + } + } +} +``` +
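To make the example concrete: the supervisor keeps `lagCollectionRangeMillis / lagCollectionIntervalMillis = 600000 / 30000 = 20` lag samples at a time. It scales out by `scaleOutStep` (2) tasks, up to `taskCountMax` (6), when more than 30% of those samples exceed `scaleOutThreshold` (6,000,000), and scales in by `scaleInStep` (1) task, down to `taskCountMin` (2), when at least 90% of the samples fall below `scaleInThreshold` (1,000,000). Successive scale actions are separated by at least `minTriggerScaleActionFrequencyMillis` (10 minutes).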

#### Ingest from multiple topics

:::info
If you enable multi-topic ingestion for a datasource, downgrading to a version older than
28.0.0 will cause the ingestion for that datasource to fail.
:::

To ingest data from multiple topics, set `topicPattern` instead of `topic` in the supervisor `ioConfig` object.
You can pass multiple topics as a regex pattern as the value for `topicPattern` in `ioConfig`. For example, to
ingest data from both `clicks` and `impressions`, set `topicPattern` to `clicks|impressions` in `ioConfig`.
Similarly, you can use `metrics-.*` as the value for `topicPattern` if you want to ingest from all the topics that
start with `metrics-`. If new topics are added to the cluster that match the regex, Druid automatically starts
ingesting from those new topics. A topic name that only partially matches the pattern, such as `my-metrics-12`, is not
included for ingestion.

When ingesting data from multiple topics, Druid assigns partitions based on the hashcode of the topic name and the
ID of the partition within that topic. The partition assignment might not be uniform across all the tasks. Druid also
assumes that partitions across individual topics have similar load. If you want to ingest from both high-load and
low-load topics in the same supervisor, it is recommended that you have a higher number of partitions for a high-load
topic and a lower number of partitions for a low-load topic.

### Tuning configuration

The `tuningConfig` object is optional. If you don't specify the `tuningConfig` object, Druid uses the default configuration settings.

The following table outlines the configuration options for `tuningConfig`:

|Property|Type|Description|Required|Default|
|--------|----|-----------|--------|-------|
|`type`|String|The indexing task type. This should always be `kafka`.|Yes||
|`maxRowsInMemory`|Integer|The number of rows to aggregate before persisting. This number represents the post-aggregation rows. It is not equivalent to the number of input events, but the resulting number of aggregated rows. Druid uses `maxRowsInMemory` to manage the required JVM heap size. The maximum heap memory usage for indexing scales is `maxRowsInMemory * (2 + maxPendingPersists)`. Normally, you do not need to set this. However, depending on the nature of your data, if rows are short in terms of bytes, you may not want to store a million rows in memory, and you should set this value accordingly.|No|150000|
|`maxBytesInMemory`|Long|The number of bytes to aggregate in heap memory before persisting. This is based on a rough estimate of memory usage and not actual usage. Normally, this is computed internally. The maximum heap memory usage for indexing is `maxBytesInMemory * (2 + maxPendingPersists)`.|No|One-sixth of max JVM memory|
|`skipBytesInMemoryOverheadCheck`|Boolean|The calculation of `maxBytesInMemory` takes into account overhead objects created during ingestion and each intermediate persist. To exclude the bytes of these overhead objects from the `maxBytesInMemory` check, set `skipBytesInMemoryOverheadCheck` to `true`.|No|`false`|
|`maxRowsPerSegment`|Integer|The number of rows to store in a segment. This number is post-aggregation rows. Handoff occurs when `maxRowsPerSegment` or `maxTotalRows` is reached or every `intermediateHandoffPeriod`, whichever happens first.|No|5000000|
|`maxTotalRows`|Long|The number of rows to aggregate across all segments; this number is post-aggregation rows. 
Handoff happens either if `maxRowsPerSegment` or `maxTotalRows` is reached or every `intermediateHandoffPeriod`, whichever happens earlier.|No|20000000| +|`intermediateHandoffPeriod`|ISO 8601 period|The period that determines how often tasks hand off segments. Handoff occurs if `maxRowsPerSegment` or `maxTotalRows` is reached or every `intermediateHandoffPeriod`, whichever happens first.|No|P2147483647D| +|`intermediatePersistPeriod`|ISO 8601 period|The period that determines the rate at which intermediate persists occur.|No|PT10M| +|`maxPendingPersists`|Integer|Maximum number of persists that can be pending but not started. If a new intermediate persist exceeds this limit, Druid blocks ingestion until the currently running persist finishes. One persist can be running concurrently with ingestion, and none can be queued up. The maximum heap memory usage for indexing scales is `maxRowsInMemory * (2 + maxPendingPersists)`.|No|0| +|`indexSpec`|Object|Defines how Druid indexes the data. See [IndexSpec](#indexspec) for more information.|No|| +|`indexSpecForIntermediatePersists`|Object|Defines segment storage format options to use at indexing time for intermediate persisted temporary segments. You can use `indexSpecForIntermediatePersists` to disable dimension/metric compression on intermediate segments to reduce memory required for final merging. However, disabling compression on intermediate segments might increase page cache use while they are used before getting merged into final segment published. See [IndexSpec](#indexspec) for possible values.|No|Same as `indexSpec`| +|`reportParseExceptions`|Boolean|DEPRECATED. If `true`, Druid throws exceptions encountered during parsing causing ingestion to halt. If `false`, Druid skips unparseable rows and fields. Setting `reportParseExceptions` to `true` overrides existing configurations for `maxParseExceptions` and `maxSavedParseExceptions`, setting `maxParseExceptions` to 0 and limiting `maxSavedParseExceptions` to not more than 1.|No|`false`| +|`handoffConditionTimeout`|Long|Number of milliseconds to wait for segment handoff. Set to a value >= 0, where 0 means to wait indefinitely.|No|900000 (15 minutes)| +|`resetOffsetAutomatically`|Boolean|Controls behavior when Druid needs to read Kafka messages that are no longer available, when `offsetOutOfRangeException` is encountered.
If `false`, the exception bubbles up causing tasks to fail and ingestion to halt. If this occurs, manual intervention is required to correct the situation, potentially using the [Reset Supervisor API](../../api-reference/supervisor-api.md#reset-a-supervisor). This mode is useful for production, since it will make you aware of issues with ingestion.
If `true`, Druid will automatically reset to the earliest or latest offset available in Kafka, based on the value of the `useEarliestOffset` property (earliest if `true`, latest if `false`). Note that this can lead to dropping data (if `useEarliestOffset` is `false`) or duplicating data (if `useEarliestOffset` is `true`) without your knowledge. Druid logs messages indicating that a reset has occurred without interrupting ingestion. This mode is useful for non-production situations since it enables Druid to recover from problems automatically, even if they lead to quiet dropping or duplicating of data. This feature behaves similarly to the Kafka `auto.offset.reset` consumer property.|No|`false`|
|`workerThreads`|Integer|The number of threads that the supervisor uses to handle requests/responses for worker tasks, along with any other internal asynchronous operation.|No|`min(10, taskCount)`|
|`chatAsync`|Boolean|If `true`, use asynchronous communication with indexing tasks, and ignore the `chatThreads` parameter. If `false`, use synchronous communication in a thread pool of size `chatThreads`.|No|`true`|
|`chatThreads`|Integer|The number of threads to use for communicating with indexing tasks. Ignored if `chatAsync` is `true`.|No|`min(10, taskCount * replicas)`|
|`chatRetries`|Integer|The number of times HTTP requests to indexing tasks are retried before considering tasks unresponsive.|No|8|
|`httpTimeout`|ISO 8601 period|The period of time to wait for an HTTP response from an indexing task.|No|PT10S|
|`shutdownTimeout`|ISO 8601 period|The period of time to wait for the supervisor to attempt a graceful shutdown of tasks before exiting.|No|PT80S|
|`offsetFetchPeriod`|ISO 8601 period|Determines how often the supervisor queries Kafka and the indexing tasks to fetch current offsets and calculate lag. If the user-specified value is below the minimum value of `PT5S`, the supervisor ignores the value and uses the minimum value instead.|No|PT30S|
|`segmentWriteOutMediumFactory`|Object|The segment write-out medium to use when creating segments. See [Additional Peon configuration: SegmentWriteOutMediumFactory](../../configuration/index.md#segmentwriteoutmediumfactory) for explanation and available options.|No|If not specified, Druid uses the value from `druid.peon.defaultSegmentWriteOutMediumFactory.type`.|
|`logParseExceptions`|Boolean|If `true`, Druid logs an error message when a parsing exception occurs, containing information about the row where the error occurred.|No|`false`|
|`maxParseExceptions`|Integer|The maximum number of parse exceptions that can occur before the task halts ingestion and fails. Overridden if `reportParseExceptions` is set.|No|unlimited|
|`maxSavedParseExceptions`|Integer|When a parse exception occurs, Druid keeps track of the most recent parse exceptions. `maxSavedParseExceptions` limits the number of saved exception instances. These saved exceptions are available after the task finishes in the [task completion report](../../ingestion/tasks.md#task-reports). Overridden if `reportParseExceptions` is set.|No|0|

#### IndexSpec

The following table outlines the configuration options for `indexSpec`:

|Property|Type|Description|Required|Default|
|--------|----|-----------|--------|-------|
|`bitmap`|Object|Compression format for bitmap indexes. Druid supports roaring and concise bitmap types.|No|Roaring|
|`dimensionCompression`|String|Compression format for dimension columns. 
Choose from `LZ4`, `LZF`, `ZSTD` or `uncompressed`.|No|`LZ4`| +|`metricCompression`|String|Compression format for primitive type metric columns. Choose from `LZ4`, `LZF`, `ZSTD`, `uncompressed` or `none`.|No|`LZ4`| +|`longEncoding`|String|Encoding format for metric and dimension columns with type long. Choose from `auto` or `longs`. `auto` encodes the values using offset or lookup table depending on column cardinality, and store them with variable size. `longs` stores the value as is with 8 bytes each.|No|`longs`| + +## Deployment notes on Kafka partitions and Druid segments + +Druid assigns Kafka partitions to each Kafka indexing task. A task writes the events it consumes from Kafka into a single segment for the segment granularity interval until it reaches one of the following limits: `maxRowsPerSegment`, `maxTotalRows`, or `intermediateHandoffPeriod`. At this point, the task creates a new partition for this segment granularity to contain subsequent events. + +The Kafka indexing task also does incremental hand-offs. Therefore, segments become available as they are ready and you don't have to wait for all segments until the end of the task duration. When the task reaches one of `maxRowsPerSegment`, `maxTotalRows`, or `intermediateHandoffPeriod`, it hands off all the segments and creates a new set of segments for further events. This allows the task to run for longer durations without accumulating old segments locally on MiddleManager services. + +The Kafka indexing service may still produce some small segments. For example, consider the following scenario: +- Task duration is 4 hours. +- Segment granularity is set to an HOUR. +- The supervisor was started at 9:10. +After 4 hours at 13:10, Druid starts a new set of tasks. The events for the interval 13:00 - 14:00 may be split across existing tasks and the new set of tasks which could result in small segments. To merge them together into new segments of an ideal size (in the range of ~500-700 MB per segment), you can schedule re-indexing tasks, optionally with a different segment granularity. +For information on how to optimize the segment size, see [Segment size optimization](../../operations/segment-optimization.md). + +## Learn more + +See the following topics for more information: + +* [Supervisor API](../../api-reference/supervisor-api.md) for how to manage and monitor supervisors using the API. +* [Supervisor](../../ingestion/supervisor.md) for supervisor status and capacity planning. +* [Loading from Apache Kafka](../../tutorials/tutorial-kafka.md) for a tutorial on streaming data from Apache Kafka. +* [Kafka input format](../../ingestion/data-formats.md) to learn about the `kafka` input format. \ No newline at end of file diff --git a/docs/development/extensions-core/kafka-supervisor-operations.md b/docs/development/extensions-core/kafka-supervisor-operations.md deleted file mode 100644 index b76a80f8cb9b..000000000000 --- a/docs/development/extensions-core/kafka-supervisor-operations.md +++ /dev/null @@ -1,287 +0,0 @@ ---- -id: kafka-supervisor-operations -title: "Apache Kafka supervisor operations reference" -sidebar_label: "Apache Kafka operations" -description: "Reference topic for running and maintaining Apache Kafka supervisors" ---- - -import Tabs from '@theme/Tabs'; -import TabItem from '@theme/TabItem'; - - -This topic contains operations reference information to run and maintain Apache Kafka supervisors for Apache Druid. It includes descriptions of how some supervisor APIs work within Kafka Indexing Service. 
- -For all supervisor APIs, see [Supervisor API reference](../../api-reference/supervisor-api.md). - -## Getting Supervisor Status Report - -`GET /druid/indexer/v1/supervisor//status` returns a snapshot report of the current state of the tasks managed by the given supervisor. This includes the latest -offsets as reported by Kafka, the consumer lag per partition, as well as the aggregate lag of all partitions. The -consumer lag per partition may be reported as negative values if the supervisor has not received a recent latest offset -response from Kafka. The aggregate lag value will always be >= 0. - -The status report also contains the supervisor's state and a list of recently thrown exceptions (reported as -`recentErrors`, whose max size can be controlled using the `druid.supervisor.maxStoredExceptionEvents` configuration). -There are two fields related to the supervisor's state - `state` and `detailedState`. The `state` field will always be -one of a small number of generic states that are applicable to any type of supervisor, while the `detailedState` field -will contain a more descriptive, implementation-specific state that may provide more insight into the supervisor's -activities than the generic `state` field. - -The list of possible `state` values are: [`PENDING`, `RUNNING`, `SUSPENDED`, `STOPPING`, `UNHEALTHY_SUPERVISOR`, `UNHEALTHY_TASKS`] - -The list of `detailedState` values and their corresponding `state` mapping is as follows: - -|Detailed State|Corresponding State|Description| -|--------------|-------------------|-----------| -|UNHEALTHY_SUPERVISOR|UNHEALTHY_SUPERVISOR|The supervisor has encountered errors on the past `druid.supervisor.unhealthinessThreshold` iterations| -|UNHEALTHY_TASKS|UNHEALTHY_TASKS|The last `druid.supervisor.taskUnhealthinessThreshold` tasks have all failed| -|UNABLE_TO_CONNECT_TO_STREAM|UNHEALTHY_SUPERVISOR|The supervisor is encountering connectivity issues with Kafka and has not successfully connected in the past| -|LOST_CONTACT_WITH_STREAM|UNHEALTHY_SUPERVISOR|The supervisor is encountering connectivity issues with Kafka but has successfully connected in the past| -|PENDING (first iteration only)|PENDING|The supervisor has been initialized and hasn't started connecting to the stream| -|CONNECTING_TO_STREAM (first iteration only)|RUNNING|The supervisor is trying to connect to the stream and update partition data| -|DISCOVERING_INITIAL_TASKS (first iteration only)|RUNNING|The supervisor is discovering already-running tasks| -|CREATING_TASKS (first iteration only)|RUNNING|The supervisor is creating tasks and discovering state| -|RUNNING|RUNNING|The supervisor has started tasks and is waiting for taskDuration to elapse| -|IDLE|IDLE|The supervisor is not creating tasks since the input stream has not received any new data and all the existing data is read.| -|SUSPENDED|SUSPENDED|The supervisor has been suspended| -|STOPPING|STOPPING|The supervisor is stopping| - -On each iteration of the supervisor's run loop, the supervisor completes the following tasks in sequence: - 1) Fetch the list of partitions from Kafka and determine the starting offset for each partition (either based on the - last processed offset if continuing, or starting from the beginning or ending of the stream if this is a new topic). - 2) Discover any running indexing tasks that are writing to the supervisor's datasource and adopt them if they match - the supervisor's configuration, else signal them to stop. 
- 3) Send a status request to each supervised task to update our view of the state of the tasks under our supervision. - 4) Handle tasks that have exceeded `taskDuration` and should transition from the reading to publishing state. - 5) Handle tasks that have finished publishing and signal redundant replica tasks to stop. - 6) Handle tasks that have failed and clean up the supervisor's internal state. - 7) Compare the list of healthy tasks to the requested `taskCount` and `replicas` configurations and create additional tasks if required in case supervisor is not idle. - -The `detailedState` field will show additional values (those marked with "first iteration only") the first time the -supervisor executes this run loop after startup or after resuming from a suspension. This is intended to surface -initialization-type issues, where the supervisor is unable to reach a stable state (perhaps because it can't connect to -Kafka, it can't read from the Kafka topic, or it can't communicate with existing tasks). Once the supervisor is stable - -that is, once it has completed a full execution without encountering any issues - `detailedState` will show a `RUNNING` -state until it is idle, stopped, suspended, or hits a task failure threshold and transitions to an unhealthy state. - -## Getting Supervisor Ingestion Stats Report - -`GET /druid/indexer/v1/supervisor//stats` returns a snapshot of the current ingestion row counters for each task being managed by the supervisor, along with moving averages for the row counters. - -See [Task Reports: Row Stats](../../ingestion/tasks.md#row-stats) for more information. - -## Supervisor Health Check - -`GET /druid/indexer/v1/supervisor//health` returns `200 OK` if the supervisor is healthy and -`503 Service Unavailable` if it is unhealthy. Healthiness is determined by the supervisor's `state` (as returned by the -`/status` endpoint) and the `druid.supervisor.*` Overlord configuration thresholds. - -## Updating Existing Supervisors - -`POST /druid/indexer/v1/supervisor` can be used to update existing supervisor spec. -Calling this endpoint when there is already an existing supervisor for the same dataSource will cause: - -- The running supervisor to signal its managed tasks to stop reading and begin publishing. -- The running supervisor to exit. -- A new supervisor to be created using the configuration provided in the request body. This supervisor will retain the -existing publishing tasks and will create new tasks starting at the offsets the publishing tasks ended on. - -Seamless schema migrations can thus be achieved by simply submitting the new schema using this endpoint. - -## Suspending and Resuming Supervisors - -You can suspend and resume a supervisor using `POST /druid/indexer/v1/supervisor//suspend` and `POST /druid/indexer/v1/supervisor//resume`, respectively. - -Note that the supervisor itself will still be operating and emitting logs and metrics, -it will just ensure that no indexing tasks are running until the supervisor is resumed. - -## Resetting Supervisors - -The `POST /druid/indexer/v1/supervisor//reset` operation clears stored -offsets, causing the supervisor to start reading offsets from either the earliest or latest -offsets in Kafka (depending on the value of `useEarliestOffset`). After clearing stored -offsets, the supervisor kills and recreates any active tasks, so that tasks begin reading -from valid offsets. - -Use care when using this operation! 
Resetting the supervisor may cause Kafka messages -to be skipped or read twice, resulting in missing or duplicate data. - -The reason for using this operation is to recover from a state in which the supervisor -ceases operating due to missing offsets. The indexing service keeps track of the latest -persisted Kafka offsets in order to provide exactly-once ingestion guarantees across -tasks. Subsequent tasks must start reading from where the previous task completed in -order for the generated segments to be accepted. If the messages at the expected -starting offsets are no longer available in Kafka (typically because the message retention -period has elapsed or the topic was removed and re-created) the supervisor will refuse -to start and in flight tasks will fail. This operation enables you to recover from this condition. - -Note that the supervisor must be running for this endpoint to be available. - -## Resetting Offsets for a Supervisor - -The supervisor must be running for this endpoint to be available. - -The `POST /druid/indexer/v1/supervisor//resetOffsets` operation clears stored -offsets, causing the supervisor to start reading from the specified offsets. After resetting stored -offsets, the supervisor kills and recreates any active tasks pertaining to the specified partitions, -so that tasks begin reading from specified offsets. For partitions that are not specified in this operation, the supervisor -will resume from the last stored offset. - -Use care when using this operation! Resetting offsets for a supervisor may cause Kafka messages to be skipped or read -twice, resulting in missing or duplicate data. - -#### Sample request - -The following example shows how to reset offsets for a kafka supervisor with the name `social_media`. Let's say the supervisor is reading -from two kafka topics `ads_media_foo` and `ads_media_bar` and has the stored offsets: `{"ads_media_foo:0": 0, "ads_media_foo:1": 10, "ads_media_bar:0": 20, "ads_media_bar:1": 40}`. - - - - - - -```shell -curl --request POST "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/supervisor/social_media/resetOffsets" ---header 'Content-Type: application/json' ---data-raw '{"type":"kafka","partitions":{"type":"end","stream":"ads_media_foo|ads_media_bar","partitionOffsetMap":{"ads_media_foo:0": 3, "ads_media_bar:1": 12}}}' -``` - - - - -```HTTP -POST /druid/indexer/v1/supervisor/social_media/resetOffsets HTTP/1.1 -Host: http://ROUTER_IP:ROUTER_PORT -Content-Type: application/json - -{ - "type": "kafka", - "partitions": { - "type": "end", - "stream": "ads_media_foo|ads_media_bar", - "partitionOffsetMap": { - "ads_media_foo:0": 3, - "ads_media_bar:1": 12 - } - } -} -``` - -The above operation will reset offsets for `ads_media_foo` partition 0 and `ads_media_bar` partition 1 to offsets 3 and 12 respectively. After a successful reset, -when the supervisor's tasks restart, they will resume reading from `{"ads_media_foo:0": 3, "ads_media_foo:1": 10, "ads_media_bar:0": 20, "ads_media_bar:1": 12}`. - - - - -#### Sample response - -
- Click to show sample response - - ```json -{ - "id": "social_media" -} - ``` -
- -## Terminating Supervisors - -The `POST /druid/indexer/v1/supervisor//terminate` operation terminates a supervisor and causes all -associated indexing tasks managed by this supervisor to immediately stop and begin -publishing their segments. This supervisor will still exist in the metadata store and its history may be retrieved -with the supervisor history API, but will not be listed in the 'get supervisors' API response nor can it's configuration -or status report be retrieved. The only way this supervisor can start again is by submitting a functioning supervisor -spec to the create API. - -## Capacity Planning - -Kafka indexing tasks run on MiddleManagers and are thus limited by the resources available in the MiddleManager -cluster. In particular, you should make sure that you have sufficient worker capacity (configured using the -`druid.worker.capacity` property) to handle the configuration in the supervisor spec. Note that worker capacity is -shared across all types of indexing tasks, so you should plan your worker capacity to handle your total indexing load -(e.g. batch processing, realtime tasks, merging tasks, etc.). If your workers run out of capacity, Kafka indexing tasks -will queue and wait for the next available worker. This may cause queries to return partial results but will not result -in data loss (assuming the tasks run before Kafka purges those offsets). - -A running task will normally be in one of two states: *reading* or *publishing*. A task will remain in reading state for -`taskDuration`, at which point it will transition to publishing state. A task will remain in publishing state for as long -as it takes to generate segments, push segments to deep storage, and have them be loaded and served by a Historical process -(or until `completionTimeout` elapses). - -The number of reading tasks is controlled by `replicas` and `taskCount`. In general, there will be `replicas * taskCount` -reading tasks, the exception being if taskCount > {numKafkaPartitions} in which case {numKafkaPartitions} tasks will -be used instead. When `taskDuration` elapses, these tasks will transition to publishing state and `replicas * taskCount` -new reading tasks will be created. Therefore to allow for reading tasks and publishing tasks to run concurrently, there -should be a minimum capacity of: - -``` -workerCapacity = 2 * replicas * taskCount -``` - -This value is for the ideal situation in which there is at most one set of tasks publishing while another set is reading. -In some circumstances, it is possible to have multiple sets of tasks publishing simultaneously. This would happen if the -time-to-publish (generate segment, push to deep storage, loaded on Historical) > `taskDuration`. This is a valid -scenario (correctness-wise) but requires additional worker capacity to support. In general, it is a good idea to have -`taskDuration` be large enough that the previous set of tasks finishes publishing before the current set begins. - -## Supervisor Persistence - -When a supervisor spec is submitted via the `POST /druid/indexer/v1/supervisor` endpoint, it is persisted in the -configured metadata database. There can only be a single supervisor per dataSource, and submitting a second spec for -the same dataSource will overwrite the previous one. - -When an Overlord gains leadership, either by being started or as a result of another Overlord failing, it will spawn -a supervisor for each supervisor spec in the metadata database. 
The supervisor will then discover running Kafka indexing -tasks and will attempt to adopt them if they are compatible with the supervisor's configuration. If they are not -compatible because they have a different ingestion spec or partition allocation, the tasks will be killed and the -supervisor will create a new set of tasks. In this way, the supervisors are persistent across Overlord restarts and -fail-overs. - -A supervisor is stopped via the `POST /druid/indexer/v1/supervisor//terminate` endpoint. This places a -tombstone marker in the database (to prevent the supervisor from being reloaded on a restart) and then gracefully -shuts down the currently running supervisor. When a supervisor is shut down in this way, it will instruct its -managed tasks to stop reading and begin publishing their segments immediately. The call to the shutdown endpoint will -return after all tasks have been signaled to stop but before the tasks finish publishing their segments. - -### Schema/Configuration Changes - -Schema and configuration changes are handled by submitting the new supervisor spec via the same -`POST /druid/indexer/v1/supervisor` endpoint used to initially create the supervisor. The Overlord will initiate a -graceful shutdown of the existing supervisor which will cause the tasks being managed by that supervisor to stop reading -and begin publishing their segments. A new supervisor will then be started which will create a new set of tasks that -will start reading from the offsets where the previous now-publishing tasks left off, but using the updated schema. -In this way, configuration changes can be applied without requiring any pause in ingestion. - -## Deployment notes on Kafka partitions and Druid segments - -Druid assigns each Kafka indexing task Kafka partitions. A task writes the events it consumes from Kafka into a single segment for the segment granularity interval until it reaches one of the following: `maxRowsPerSegment`, `maxTotalRows` or `intermediateHandoffPeriod` limit. At this point, the task creates a new partition for this segment granularity to contain subsequent events. - -The Kafka Indexing Task also does incremental hand-offs. Therefore segments become available as they are ready and you do not have to wait for all segments until the end of the task duration. When the task reaches one of `maxRowsPerSegment`, `maxTotalRows`, or `intermediateHandoffPeriod`, it hands off all the segments and creates a new new set of segments will be created for further events. This allows the task to run for longer durations without accumulating old segments locally on Middle Manager processes. - -The Kafka Indexing Service may still produce some small segments. For example, consider the following scenario: -- Task duration is 4 hours -- Segment granularity is set to an HOUR -- The supervisor was started at 9:10 -After 4 hours at 13:10, Druid starts a new set of tasks. The events for the interval 13:00 - 14:00 may be split across existing tasks and the new set of tasks which could result in small segments. To merge them together into new segments of an ideal size (in the range of ~500-700 MB per segment), you can schedule re-indexing tasks, optionally with a different segment granularity. - -For more detail, see [Segment size optimization](../../operations/segment-optimization.md). -There is also ongoing work to support automatic segment compaction of sharded segments as well as compaction not requiring -Hadoop (see [here](https://github.com/apache/druid/pull/5102)). 
diff --git a/docs/development/extensions-core/kafka-supervisor-reference.md b/docs/development/extensions-core/kafka-supervisor-reference.md deleted file mode 100644 index d141b23477ff..000000000000 --- a/docs/development/extensions-core/kafka-supervisor-reference.md +++ /dev/null @@ -1,261 +0,0 @@ ---- -id: kafka-supervisor-reference -title: "Apache Kafka supervisor reference" -sidebar_label: "Apache Kafka supervisor" -description: "Reference topic for Apache Kafka supervisors" ---- - - - -This topic contains configuration reference information for the Apache Kafka supervisor for Apache Druid. - -The following table outlines the high-level configuration options: - -|Property|Type|Description|Required| -|--------|----|-----------|--------| -|`type`|String|The supervisor type. For Kafka streaming, set to `kafka`.|Yes| -|`spec`|Object|The container object for the supervisor configuration.|Yes| -|`ioConfig`|Object|The I/O configuration object to define the Kafka connection and I/O-related settings for the supervisor and indexing task. See [Supervisor I/O configuration](#supervisor-io-configuration).|Yes| -|`dataSchema`|Object|The schema for the Kafka indexing task to use during ingestion.|Yes| -|`tuningConfig`|Object|The tuning configuration object to define performance-related settings for the supervisor and indexing tasks. See [Supervisor tuning configuration](#supervisor-tuning-configuration).|No| - -## Supervisor I/O configuration - -The following table outlines the configuration options for `ioConfig`: - -|Property|Type|Description|Required|Default| -|--------|----|-----------|--------|-------| -|`topic`|String|The Kafka topic to read from. Must be a specific topic. Druid does not support topic patterns.|Yes|| -|`inputFormat`|Object|The input format to define input data parsing. See [Specifying data format](#specifying-data-format) for details about specifying the input format.|Yes|| -|`consumerProperties`|String, Object|A map of properties to pass to the Kafka consumer. See [Consumer properties](#consumer-properties).|Yes|| -|`pollTimeout`|Long|The length of time to wait for the Kafka consumer to poll records, in milliseconds.|No|100| -|`replicas`|Integer|The number of replica sets, where 1 is a single set of tasks (no replication). Druid always assigns replicate tasks to different workers to provide resiliency against process failure.|No|1| -|`taskCount`|Integer|The maximum number of reading tasks in a replica set. The maximum number of reading tasks equals `taskCount * replicas`. The total number of tasks, reading and publishing, is greater than this count. See [Capacity planning](./kafka-supervisor-operations.md#capacity-planning) for more details. When `taskCount > {numKafkaPartitions}`, the actual number of reading tasks is less than the `taskCount` value.|No|1| -|`taskDuration`|ISO 8601 period|The length of time before tasks stop reading and begin publishing segments.|No|PT1H| -|`startDelay`|ISO 8601 period|The period to wait before the supervisor starts managing tasks.|No|PT5S| -|`period`|ISO 8601 period|Determines how often the supervisor executes its management logic. Note that the supervisor also runs in response to certain events, such as tasks succeeding, failing, and reaching their task duration. The `period` value specifies the maximum time between iterations.|No|PT30S| -|`useEarliestOffset`|Boolean|If a supervisor manages a datasource for the first time, it obtains a set of starting offsets from Kafka. 
This flag determines whether it retrieves the earliest or latest offsets in Kafka. Under normal circumstances, subsequent tasks start from where the previous segments ended. Druid only uses `useEarliestOffset` on the first run.|No|`false`| -|`completionTimeout`|ISO 8601 period|The length of time to wait before declaring a publishing task as failed and terminating it. If the value is too low, your tasks may never publish. The publishing clock for a task begins roughly after `taskDuration` elapses.|No|PT30M| -|`lateMessageRejectionStartDateTime`|ISO 8601 date time|Configure tasks to reject messages with timestamps earlier than this date time. For example, if this property is set to `2016-01-01T11:00Z` and the supervisor creates a task at `2016-01-01T12:00Z`, Druid drops messages with timestamps earlier than `2016-01-01T11:00Z`. This can prevent concurrency issues if your data stream has late messages and you have multiple pipelines that need to operate on the same segments, such as a realtime and a nightly batch ingestion pipeline.|No|| -|`lateMessageRejectionPeriod`|ISO 8601 period|Configure tasks to reject messages with timestamps earlier than this period before the task was created. For example, if this property is set to `PT1H` and the supervisor creates a task at `2016-01-01T12:00Z`, Druid drops messages with timestamps earlier than `2016-01-01T11:00Z`. This may help prevent concurrency issues if your data stream has late messages and you have multiple pipelines that need to operate on the same segments, such as a realtime and a nightly batch ingestion pipeline. Note that you can specify only one of the late message rejection properties.|No|| -|`earlyMessageRejectionPeriod`|ISO 8601 period|Configure tasks to reject messages with timestamps later than this period after the task reached its task duration. For example, if this property is set to `PT1H`, the task duration is set to `PT1H` and the supervisor creates a task at `2016-01-01T12:00Z`, Druid drops messages with timestamps later than `2016-01-01T14:00Z`. Tasks sometimes run past their task duration, such as in cases of supervisor failover. Setting `earlyMessageRejectionPeriod` too low may cause Druid to drop messages unexpectedly whenever a task runs past its originally configured task duration.|No|| -|`autoScalerConfig`|Object|Defines auto scaling behavior for Kafka ingest tasks. See [Task autoscaler properties](#task-autoscaler-properties).|No|null| -|`idleConfig`|Object|Defines how and when the Kafka supervisor can become idle. See [Idle supervisor configuration](#idle-supervisor-configuration) for more details.|No|null| - -### Task autoscaler properties - -The following table outlines the configuration options for `autoScalerConfig`: - -|Property|Description|Required|Default| -|--------|-----------|--------|-------| -|`enableTaskAutoScaler`|Enable or disable autoscaling. `false` or blank disables the `autoScaler` even when `autoScalerConfig` is not null.|No|`false`| -|`taskCountMax`|Maximum number of ingestion tasks. Set `taskCountMax >= taskCountMin`. If `taskCountMax > {numKafkaPartitions}`, Druid only scales reading tasks up to the `{numKafkaPartitions}`. In this case, `taskCountMax` is ignored.|Yes|| -|`taskCountMin`|Minimum number of ingestion tasks. 
When you enable the autoscaler, Druid ignores the value of `taskCount` in `ioConfig` and starts with the `taskCountMin` number of tasks.|Yes|| -|`minTriggerScaleActionFrequencyMillis`|Minimum time interval between two scale actions.|No|600000| -|`autoScalerStrategy`|The algorithm of `autoScaler`. Only supports `lagBased`. See [Lag based autoscaler strategy related properties](#lag-based-autoscaler-strategy-related-properties) for details.|No|`lagBased`| - -### Lag based autoscaler strategy related properties - -The following table outlines the configuration options for `autoScalerStrategy`: - -|Property|Description|Required|Default| -|--------|-----------|--------|-------| -|`lagCollectionIntervalMillis`|The time period during which Druid collects lag metric points.|No|30000| -|`lagCollectionRangeMillis`|The total time window of lag collection. Use with `lagCollectionIntervalMillis` to specify the intervals at which to collect lag metric points.|No|600000| -|`scaleOutThreshold`|The threshold of scale out action.|No|6000000| -|`triggerScaleOutFractionThreshold`|Enables scale out action if `triggerScaleOutFractionThreshold` percent of lag points is higher than `scaleOutThreshold`.|No|0.3| -|`scaleInThreshold`|The threshold of scale in action.|No|1000000| -|`triggerScaleInFractionThreshold`|Enables scale in action if `triggerScaleInFractionThreshold` percent of lag points is lower than `scaleOutThreshold`.|No|0.9| -|`scaleActionStartDelayMillis`|The number of milliseconds to delay after the supervisor starts before the first scale logic check.|No|300000| -|`scaleActionPeriodMillis`|The frequency in milliseconds to check if a scale action is triggered.|No|60000| -|`scaleInStep`|The number of tasks to reduce at once when scaling down.|No|1| -|`scaleOutStep`|The number of tasks to add at once when scaling out.|No|2| - -### Ingesting from multiple topics - -To ingest data from multiple topics, you have to set `topicPattern` in the supervisor I/O configuration and not set `topic`. -You can pass multiple topics as a regex pattern as the value for `topicPattern` in the I/O configuration. For example, to -ingest data from clicks and impressions, set `topicPattern` to `clicks|impressions` in the I/O configuration. -Similarly, you can use `metrics-.*` as the value for `topicPattern` if you want to ingest from all the topics that -start with `metrics-`. If new topics are added to the cluster that match the regex, Druid automatically starts -ingesting from those new topics. A topic name that only matches partially such as `my-metrics-12` will not be -included for ingestion. If you enable multi-topic ingestion for a datasource, downgrading to a version older than -28.0.0 will cause the ingestion for that datasource to fail. - -When ingesting data from multiple topics, partitions are assigned based on the hashcode of the topic name and the -id of the partition within that topic. The partition assignment might not be uniform across all the tasks. It's also -assumed that partitions across individual topics have similar load. It is recommended that you have a higher number of -partitions for a high load topic and a lower number of partitions for a low load topic. Assuming that you want to -ingest from both high and low load topic in the same supervisor. - -## Idle supervisor configuration - -:::info - Note that idle state transitioning is currently designated as experimental. 
-::: - -|Property|Description|Required| -|--------|-----------|--------| -|`enabled`|If `true`, the supervisor becomes idle if there is no data on input stream/topic for some time.|No|`false`| -|`inactiveAfterMillis`|The supervisor becomes idle if all existing data has been read from input topic and no new data has been published for `inactiveAfterMillis` milliseconds.|No|`600_000`| - -When the supervisor enters the idle state, no new tasks are launched subsequent to the completion of the currently executing tasks. This strategy may lead to reduced costs for cluster operators while using topics that get sporadic data. - -The following example demonstrates supervisor spec with `lagBased` autoscaler and idle configuration enabled: - -```json -{ - "type": "kafka", - "spec": { - "dataSchema": { - ... - }, - "ioConfig": { - "topic": "metrics", - "inputFormat": { - "type": "json" - }, - "consumerProperties": { - "bootstrap.servers": "localhost:9092" - }, - "autoScalerConfig": { - "enableTaskAutoScaler": true, - "taskCountMax": 6, - "taskCountMin": 2, - "minTriggerScaleActionFrequencyMillis": 600000, - "autoScalerStrategy": "lagBased", - "lagCollectionIntervalMillis": 30000, - "lagCollectionRangeMillis": 600000, - "scaleOutThreshold": 6000000, - "triggerScaleOutFractionThreshold": 0.3, - "scaleInThreshold": 1000000, - "triggerScaleInFractionThreshold": 0.9, - "scaleActionStartDelayMillis": 300000, - "scaleActionPeriodMillis": 60000, - "scaleInStep": 1, - "scaleOutStep": 2 - }, - "taskCount":1, - "replicas":1, - "taskDuration":"PT1H", - "idleConfig": { - "enabled": true, - "inactiveAfterMillis": 600000 - } - }, - "tuningConfig":{ - ... - } - } -} -``` - -## Consumer properties - -Consumer properties must contain a property `bootstrap.servers` with a list of Kafka brokers in the form: `:,:,...`. -By default, `isolation.level` is set to `read_committed`. If you use older versions of Kafka servers without transactions support or don't want Druid to consume only committed transactions, set `isolation.level` to `read_uncommitted`. - -In some cases, you may need to fetch consumer properties at runtime. For example, when `bootstrap.servers` is not known upfront, or is not static. To enable SSL connections, you must provide passwords for `keystore`, `truststore` and `key` secretly. You can provide configurations at runtime with a dynamic config provider implementation like the environment variable config provider that comes with Druid. For more information, see [Dynamic config provider](../../operations/dynamic-config-provider.md). - -For example, if you are using SASL and SSL with Kafka, set the following environment variables for the Druid user on the machines running the Overlord and the Peon services: - -``` -export KAFKA_JAAS_CONFIG="org.apache.kafka.common.security.plain.PlainLoginModule required username='admin_user' password='admin_password';" -export SSL_KEY_PASSWORD=mysecretkeypassword -export SSL_KEYSTORE_PASSWORD=mysecretkeystorepassword -export SSL_TRUSTSTORE_PASSWORD=mysecrettruststorepassword -``` - -``` - "druid.dynamic.config.provider": { - "type": "environment", - "variables": { - "sasl.jaas.config": "KAFKA_JAAS_CONFIG", - "ssl.key.password": "SSL_KEY_PASSWORD", - "ssl.keystore.password": "SSL_KEYSTORE_PASSWORD", - "ssl.truststore.password": "SSL_TRUSTSTORE_PASSWORD" - } - } - } -``` - -Verify that you've changed the values for all configurations to match your own environment. 
You can use the environment variable config provider syntax in the **Consumer properties** field on the **Connect tab** in the **Load Data** UI in the web console. When connecting to Kafka, Druid replaces the environment variables with their corresponding values. - -You can provide SSL connections with [Password provider](../../operations/password-provider.md) interface to define the `keystore`, `truststore`, and `key`, but this feature is deprecated. - -## Specifying data format - -The Kafka indexing service supports both [`inputFormat`](../../ingestion/data-formats.md#input-format) and [`parser`](../../ingestion/data-formats.md#parser) to specify the data format. -Use the `inputFormat` to specify the data format for Kafka indexing service unless you need a format only supported by the legacy `parser`. - -Druid supports the following input formats: - -- `csv` -- `tsv` -- `json` -- `kafka` -- `avro_stream` -- `avro_ocf` -- `protobuf` - -For more information, see [Data formats](../../ingestion/data-formats.md). You can also read [`thrift`](../extensions-contrib/thrift.md) formats using `parser`. - -## Supervisor tuning configuration - -The `tuningConfig` object is optional. If you don't specify the `tuningConfig` object, Druid uses the default configuration settings. - -|Property|Type|Description|Required|Default| -|--------|----|-----------|--------|-------| -|`type`|String|The indexing task type. This should always be `kafka`.|Yes|| -|`maxRowsInMemory`|Integer|The number of rows to aggregate before persisting. This number represents the post-aggregation rows. It is not equivalent to the number of input events, but the resulting number of aggregated rows. Druid uses `maxRowsInMemory` to manage the required JVM heap size. The maximum heap memory usage for indexing scales is `maxRowsInMemory * (2 + maxPendingPersists)`. Normally, you do not need to set this, but depending on the nature of data, if rows are short in terms of bytes, you may not want to store a million rows in memory and this value should be set.|No|150000| -|`maxBytesInMemory`|Long|The number of bytes to aggregate in heap memory before persisting. This is based on a rough estimate of memory usage and not actual usage. Normally, this is computed internally. The maximum heap memory usage for indexing is `maxBytesInMemory * (2 + maxPendingPersists)`.|No|One-sixth of max JVM memory| -|`skipBytesInMemoryOverheadCheck`|Boolean|The calculation of `maxBytesInMemory` takes into account overhead objects created during ingestion and each intermediate persist. To exclude the bytes of these overhead objects from the `maxBytesInMemory` check, set `skipBytesInMemoryOverheadCheck` to `true`.|No|`false`| -|`maxRowsPerSegment`|Integer|The number of rows to store in a segment. This number is post-aggregation rows. Handoff occurs when `maxRowsPerSegment` or `maxTotalRows` is reached or every `intermediateHandoffPeriod`, whichever happens first.|No|5000000| -|`maxTotalRows`|Long|The number of rows to aggregate across all segments; this number is post-aggregation rows. Handoff happens either if `maxRowsPerSegment` or `maxTotalRows` is reached or every `intermediateHandoffPeriod`, whichever happens earlier.|No|20000000| -|`intermediateHandoffPeriod`|ISO 8601 period|The period that determines how often tasks hand off segments. 
Handoff occurs if `maxRowsPerSegment` or `maxTotalRows` is reached or every `intermediateHandoffPeriod`, whichever happens first.|No|P2147483647D| -|`intermediatePersistPeriod`|ISO 8601 period|The period that determines the rate at which intermediate persists occur.|No|PT10M| -|`maxPendingPersists`|Integer|Maximum number of persists that can be pending but not started. If a new intermediate persist exceeds this limit, Druid blocks ingestion until the currently running persist finishes. One persist can be running concurrently with ingestion, and none can be queued up. The maximum heap memory usage for indexing scales is `maxRowsInMemory * (2 + maxPendingPersists)`.|No|0| -|`indexSpec`|Object|Defines how Druid indexes the data. See [IndexSpec](#indexspec) for more information.|No|| -|`indexSpecForIntermediatePersists`|Object|Defines segment storage format options to use at indexing time for intermediate persisted temporary segments. You can use `indexSpecForIntermediatePersists` to disable dimension/metric compression on intermediate segments to reduce memory required for final merging. However, disabling compression on intermediate segments might increase page cache use while they are used before getting merged into final segment published. See [IndexSpec](#indexspec) for possible values.|No|Same as `indexSpec`| -|`reportParseExceptions`|Boolean|DEPRECATED. If `true`, Druid throws exceptions encountered during parsing causing ingestion to halt. If `false`, Druid skips unparseable rows and fields. Setting `reportParseExceptions` to `true` overrides existing configurations for `maxParseExceptions` and `maxSavedParseExceptions`, setting `maxParseExceptions` to 0 and limiting `maxSavedParseExceptions` to not more than 1.|No|`false`| -|`handoffConditionTimeout`|Long|Number of milliseconds to wait for segment handoff. Set to a value >= 0, where 0 means to wait indefinitely.|No|900000 (15 minutes)| -|`resetOffsetAutomatically`|Boolean|Controls behavior when Druid needs to read Kafka messages that are no longer available, when `offsetOutOfRangeException` is encountered.
If `false`, the exception bubbles up causing tasks to fail and ingestion to halt. If this occurs, manual intervention is required to correct the situation, potentially using the [Reset Supervisor API](../../api-reference/supervisor-api.md). This mode is useful for production, since it will make you aware of issues with ingestion.
If `true`, Druid will automatically reset to the earlier or latest offset available in Kafka, based on the value of the `useEarliestOffset` property (earliest if `true`, latest if `false`). Note that this can lead to dropping data (if `useEarliestSequenceNumber` is `false`) or duplicating data (if `useEarliestSequenceNumber` is `true`) without your knowledge. Druid logs messages indicating that a reset has occurred without interrupting ingestion. This mode is useful for non-production situations since it enables Druid to recover from problems automatically, even if they lead to quiet dropping or duplicating of data. This feature behaves similarly to the Kafka `auto.offset.reset` consumer property.|No|`false`| -|`workerThreads`|Integer|The number of threads that the supervisor uses to handle requests/responses for worker tasks, along with any other internal asynchronous operation.|No|`min(10, taskCount)`| -|`chatAsync`|Boolean|If `true`, use asynchronous communication with indexing tasks, and ignore the `chatThreads` parameter. If `false`, use synchronous communication in a thread pool of size `chatThreads`.|No|`true`| -|`chatThreads`|Integer|The number of threads to use for communicating with indexing tasks. Ignored if `chatAsync` is `true`.|No|`min(10, taskCount * replicas)`| -|`chatRetries`|Integer|The number of times HTTP requests to indexing tasks are retried before considering tasks unresponsive.|No|8| -|`httpTimeout`| ISO 8601 period|The period of time to wait for a HTTP response from an indexing task.|No|PT10S| -|`shutdownTimeout`|ISO 8601 period|The period of time to wait for the supervisor to attempt a graceful shutdown of tasks before exiting.|No|PT80S| -|`offsetFetchPeriod`|ISO 8601 period|Determines how often the supervisor queries Kafka and the indexing tasks to fetch current offsets and calculate lag. If the user-specified value is below the minimum value of `PT5S`, the supervisor ignores the value and uses the minimum value instead.|No|PT30S| -|`segmentWriteOutMediumFactory`|Object|The segment write-out medium to use when creating segments. See [Additional Peon configuration: SegmentWriteOutMediumFactory](../../configuration/index.md#segmentwriteoutmediumfactory) for explanation and available options.|No|If not specified, Druid uses the value from `druid.peon.defaultSegmentWriteOutMediumFactory.type`.| -|`logParseExceptions`|Boolean|If `true`, Druid logs an error message when a parsing exception occurs, containing information about the row where the error occurred.|No|`false`| -|`maxParseExceptions`|Integer|The maximum number of parse exceptions that can occur before the task halts ingestion and fails. Overridden if `reportParseExceptions` is set.|No|unlimited| -|`maxSavedParseExceptions`|Integer|When a parse exception occurs, Druid keeps track of the most recent parse exceptions. `maxSavedParseExceptions` limits the number of saved exception instances. These saved exceptions are available after the task finishes in the [task completion report](../../ingestion/tasks.md#task-reports). Overridden if `reportParseExceptions` is set.|No|0| - -### IndexSpec - -The following table outlines the configuration options for `indexSpec`: - -|Property|Type|Description|Required|Default| -|--------|----|-----------|--------|-------| -|`bitmap`|Object|Compression format for bitmap indexes. Druid supports roaring and concise bitmap types.|No|Roaring| -|`dimensionCompression`|String|Compression format for dimension columns. 
Choose from `LZ4`, `LZF`, `ZSTD` or `uncompressed`.|No|`LZ4`| -|`metricCompression`|String|Compression format for primitive type metric columns. Choose from `LZ4`, `LZF`, `ZSTD`, `uncompressed` or `none`.|No|`LZ4`| -|`longEncoding`|String|Encoding format for metric and dimension columns with type long. Choose from `auto` or `longs`. `auto` encodes the values using offset or lookup table depending on column cardinality, and store them with variable size. `longs` stores the value as is with 8 bytes each.|No|`longs`| \ No newline at end of file diff --git a/docs/development/extensions-core/kinesis-ingestion.md b/docs/development/extensions-core/kinesis-ingestion.md index 5071b1533665..d5f27efa1a02 100644 --- a/docs/development/extensions-core/kinesis-ingestion.md +++ b/docs/development/extensions-core/kinesis-ingestion.md @@ -1,7 +1,7 @@ --- id: kinesis-ingestion title: "Amazon Kinesis ingestion" -sidebar_label: "Amazon Kinesis" +sidebar_label: "Amazon Kinesis ingestion" --- import Tabs from '@theme/Tabs'; import TabItem from '@theme/TabItem'; @@ -27,128 +27,33 @@ import TabItem from '@theme/TabItem'; ~ under the License. --> -When you enable the Kinesis indexing service, you can configure supervisors on the Overlord to manage the creation and lifetime of Kinesis indexing tasks. These indexing tasks read events using Kinesis' own shard and sequence number mechanism to guarantee exactly-once ingestion. The supervisor oversees the state of the indexing tasks to coordinate handoffs, manage failures, and ensure that scalability and replication requirements are maintained. +When you enable the Kinesis indexing service, you can configure supervisors on the Overlord to manage the creation and lifetime of Kinesis indexing tasks. These indexing tasks read events using the Kinesis shard and sequence number mechanism to guarantee exactly-once ingestion. The supervisor oversees the state of the indexing tasks to coordinate handoffs, manage failures, and ensure that scalability and replication requirements are maintained. This topic contains configuration reference information for the Kinesis indexing service supervisor for Apache Druid. ## Setup -To use the Kinesis indexing service, you must first load the `druid-kinesis-indexing-service` core extension on both the Overlord and the Middle Manager. See [Loading extensions](../../configuration/extensions.md#loading-extensions) for more information. -Review the [Kinesis known issues](#kinesis-known-issues) before deploying the `druid-kinesis-indexing-service` extension to production. +To use the Kinesis indexing service, you must first load the `druid-kinesis-indexing-service` core extension on both the Overlord and the MiddleManager. See [Loading extensions](../../configuration/extensions.md#loading-extensions) for more information. + +Review [Kinesis known issues](#kinesis-known-issues) before deploying the `druid-kinesis-indexing-service` extension to production. ## Supervisor spec -The following table outlines the high-level configuration options for the Kinesis supervisor object. -See [Supervisor API](../../api-reference/supervisor-api.md) for more information. +The following table outlines the high-level configuration options for the Kinesis [supervisor spec](../../ingestion/supervisor.md#supervisor-spec). 
|Property|Type|Description|Required| |--------|----|-----------|--------| |`type`|String|The supervisor type; this should always be `kinesis`.|Yes| |`spec`|Object|The container object for the supervisor configuration.|Yes| -|`ioConfig`|Object|The [I/O configuration](#supervisor-io-configuration) object for configuring Kinesis connection and I/O-related settings for the supervisor and indexing task.|Yes| +|`ioConfig`|Object|The [I/O configuration](#supervisor-io-configuration) object for configuring Kinesis connection and I/O-related settings for the supervisor and indexing tasks.|Yes| |`dataSchema`|Object|The schema used by the Kinesis indexing task during ingestion. See [`dataSchema`](../../ingestion/ingestion-spec.md#dataschema) for more information.|Yes| |`tuningConfig`|Object|The [tuning configuration](#supervisor-tuning-configuration) object for configuring performance-related settings for the supervisor and indexing tasks.|No| -Druid starts a new supervisor when you define a supervisor spec. -To create a supervisor, send a `POST` request to the `/druid/indexer/v1/supervisor` endpoint. -Once created, the supervisor persists in the configured metadata database. There can only be a single supervisor per datasource, and submitting a second spec for the same datasource overwrites the previous one. - -When an Overlord gains leadership, either by being started or as a result of another Overlord failing, it spawns -a supervisor for each supervisor spec in the metadata database. The supervisor then discovers running Kinesis indexing -tasks and attempts to adopt them if they are compatible with the supervisor's configuration. If they are not -compatible because they have a different ingestion spec or shard allocation, the tasks are killed and the -supervisor creates a new set of tasks. In this way, the supervisors persist across Overlord restarts and failovers. - -The following example shows how to submit a supervisor spec for a stream with the name `KinesisStream`. -In this example, `http://SERVICE_IP:SERVICE_PORT` is a placeholder for the server address of deployment and the service port. +The following example shows a supervisor spec for a stream with the name `KinesisStream`. - - - - -```shell -curl -X POST "http://SERVICE_IP:SERVICE_PORT/druid/indexer/v1/supervisor" \ --H "Content-Type: application/json" \ --d '{ - "type": "kinesis", - "spec": { - "ioConfig": { - "type": "kinesis", - "stream": "KinesisStream", - "inputFormat": { - "type": "json" - }, - "useEarliestSequenceNumber": true - }, - "tuningConfig": { - "type": "kinesis" - }, - "dataSchema": { - "dataSource": "KinesisStream", - "timestampSpec": { - "column": "timestamp", - "format": "iso" - }, - "dimensionsSpec": { - "dimensions": [ - "isRobot", - "channel", - "flags", - "isUnpatrolled", - "page", - "diffUrl", - { - "type": "long", - "name": "added" - }, - "comment", - { - "type": "long", - "name": "commentLength" - }, - "isNew", - "isMinor", - { - "type": "long", - "name": "delta" - }, - "isAnonymous", - "user", - { - "type": "long", - "name": "deltaBucket" - }, - { - "type": "long", - "name": "deleted" - }, - "namespace", - "cityName", - "countryName", - "regionIsoCode", - "metroCode", - "countryIsoCode", - "regionName" - ] - }, - "granularitySpec": { - "queryGranularity": "none", - "rollup": false, - "segmentGranularity": "hour" - } - } - } -}' -``` - - - -```HTTP -POST /druid/indexer/v1/supervisor -HTTP/1.1 -Host: http://SERVICE_IP:SERVICE_PORT -Content-Type: application/json +
Click to view the example +```json { "type": "kinesis", "spec": { @@ -220,17 +125,16 @@ Content-Type: application/json } } ``` - - +
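+
+To start ingestion, submit the spec to the [Supervisor API](../../api-reference/supervisor-api.md). The following sketch assumes the spec above is saved to a local file named `kinesis-supervisor.json` (a placeholder name) and that your Router listens at `ROUTER_IP:ROUTER_PORT`:
+
+```shell
+# Create the supervisor; POSTing a spec for the same datasource again updates it in place.
+curl -X POST "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/supervisor" \
+-H "Content-Type: application/json" \
+-d @kinesis-supervisor.json
+```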
-## Supervisor I/O configuration +### I/O configuration The following table outlines the configuration options for `ioConfig`: |Property|Type|Description|Required|Default| |--------|----|-----------|--------|-------| |`stream`|String|The Kinesis stream to read.|Yes|| -|`inputFormat`|Object|The [input format](../../ingestion/data-formats.md#input-format) to specify how to parse input data. See [Specify data format](#specify-data-format) for more information.|Yes|| +|`inputFormat`|Object|The [input format](../../ingestion/data-formats.md#input-format) to specify how to parse input data.|Yes|| |`endpoint`|String|The AWS Kinesis stream endpoint for a region. You can find a list of endpoints in the [AWS service endpoints](http://docs.aws.amazon.com/general/latest/gr/rande.html#ak_region) document.|No|`kinesis.us-east-1.amazonaws.com`| |`replicas`|Integer|The number of replica sets, where 1 is a single set of tasks (no replication). Druid always assigns replicate tasks to different workers to provide resiliency against process failure.|No|1| |`taskCount`|Integer|The maximum number of reading tasks in a replica set. Multiply `taskCount` and `replicas` to measure the maximum number of reading tasks.
The total number of tasks (reading and publishing) is higher than the maximum number of reading tasks. See [Capacity planning](#capacity-planning) for more details. When `taskCount > {numKinesisShards}`, the actual number of reading tasks is less than the `taskCount` value.|No|1|
|`taskDuration`|ISO 8601 period|The length of time before tasks stop reading and begin publishing segments.|No|PT1H|
|`startDelay`|ISO 8601 period|The period to wait before the supervisor starts managing tasks.|No|PT5S|
|`period`|ISO 8601 period|Determines how often the supervisor executes its management logic. Note that the supervisor also runs in response to certain events, such as tasks succeeding, failing, and reaching their task duration, so this value specifies the maximum time between iterations.|No|PT30S|
|`useEarliestSequenceNumber`|Boolean|If a supervisor is managing a datasource for the first time, it obtains a set of starting sequence numbers from Kinesis. This flag determines whether a supervisor retrieves the earliest or latest sequence numbers in Kinesis. Under normal circumstances, subsequent tasks start from where the previous segments ended so this flag is only used on the first run.|No|`false`|
|`completionTimeout`|ISO 8601 period|The length of time to wait before Druid declares a publishing task has failed and terminates it. If this is set too low, your tasks may never publish. The publishing clock for a task begins roughly after `taskDuration` elapses.|No|PT6H|
-|`lateMessageRejectionPeriod`|ISO 8601 period|Configure tasks to reject messages with timestamps earlier than this period before the task is created. For example, if `lateMessageRejectionPeriod` is set to `PT1H` and the supervisor creates a task at `2016-01-01T12:00Z`, messages with timestamps earlier than `2016-01-01T11:00Z` are dropped. This may help prevent concurrency issues if your data stream has late messages and you have multiple pipelines that need to operate on the same segments, such as a streaming and a nightly batch ingestion pipeline.|No||
-|`earlyMessageRejectionPeriod`|ISO 8601 period|Configure tasks to reject messages with timestamps later than this period after the task reached its `taskDuration`. For example, if `earlyMessageRejectionPeriod` is set to `PT1H`, the `taskDuration` is set to `PT1H` and the supervisor creates a task at `2016-01-01T12:00Z`. Messages with timestamps later than `2016-01-01T14:00Z` are dropped. **Note:** Tasks sometimes run past their task duration, for example, in cases of supervisor failover. Setting `earlyMessageRejectionPeriod` too low may cause messages to be dropped unexpectedly whenever a task runs past its originally configured task duration.|No||
+|`lateMessageRejectionPeriod`|ISO 8601 period|Configures tasks to reject messages with timestamps earlier than this period before the task is created. For example, if `lateMessageRejectionPeriod` is set to `PT1H` and the supervisor creates a task at `2016-01-01T12:00Z`, messages with timestamps earlier than `2016-01-01T11:00Z` are dropped. This may help prevent concurrency issues if your data stream has late messages and you have multiple pipelines that need to operate on the same segments, such as a streaming and a nightly batch ingestion pipeline.|No||
+|`earlyMessageRejectionPeriod`|ISO 8601 period|Configures tasks to reject messages with timestamps later than this period after the task reaches its `taskDuration`. For example, suppose `earlyMessageRejectionPeriod` is set to `PT1H`, the `taskDuration` is set to `PT1H`, and the supervisor creates a task at `2016-01-01T12:00Z`. In this case, messages with timestamps later than `2016-01-01T14:00Z` are dropped.
**Note:** Tasks sometimes run past their task duration, for example, in cases of supervisor failover. Setting `earlyMessageRejectionPeriod` too low may cause messages to be dropped unexpectedly whenever a task runs past its originally configured task duration.|No|| |`recordsPerFetch`|Integer|The number of records to request per call to fetch records from Kinesis.|No| See [Determine fetch settings](#determine-fetch-settings) for defaults.| |`fetchDelayMillis`|Integer|Time in milliseconds to wait between subsequent calls to fetch records from Kinesis. See [Determine fetch settings](#determine-fetch-settings).|No|0| |`awsAssumedRoleArn`|String|The AWS assumed role to use for additional permissions.|No|| |`awsExternalId`|String|The AWS external ID to use for additional permissions.|No|| |`deaggregate`|Boolean|Whether to use the deaggregate function of the Kinesis Client Library (KCL).|No|| -|`autoScalerConfig`|Object|Defines autoscaling behavior for Kinesis ingest tasks. See [Task autoscaler properties](#task-autoscaler-properties) for more information.|No|null| +|`autoScalerConfig`|Object|Defines autoscaling behavior for ingestion tasks. See [Task autoscaler](../../ingestion/supervisor.md#task-autoscaler) for more information.|No|null| + +#### Task autoscaler -### Task autoscaler properties +You can optionally configure autoscaling behavior for ingestion tasks using the `autoScalerConfig` property of the `ioConfig` object. -The following table outlines the configuration options for `autoScalerConfig`: +The following table outlines the autoscaler configuration options: |Property|Description|Required|Default| |--------|-----------|--------|-------| |`enableTaskAutoScaler`|Enables the auto scaler. If not specified, Druid disables the auto scaler even when `autoScalerConfig` is not null.|No|`false`| -|`taskCountMax`|Maximum number of Kinesis ingestion tasks. Must be greater than or equal to `taskCountMin`. If greater than `{numKinesisShards}`, Druid sets the maximum number of reading tasks to `{numKinesisShards}` and ignores `taskCountMax`.|Yes|| -|`taskCountMin`|Minimum number of Kinesis ingestion tasks. When you enable the auto scaler, Druid ignores the value of `taskCount` in `IOConfig` and uses `taskCountMin` for the initial number of tasks to launch.|Yes|| +|`taskCountMax`|Maximum number of ingestion tasks. Must be greater than or equal to `taskCountMin`. If greater than `{numKinesisShards}`, Druid sets the maximum number of reading tasks to `{numKinesisShards}` and ignores `taskCountMax`.|Yes|| +|`taskCountMin`|Minimum number of ingestion tasks. When you enable the autoscaler, Druid ignores the value of `taskCount` in `ioConfig` and starts with the `taskCountMin` number of tasks to launch.|Yes|| |`minTriggerScaleActionFrequencyMillis`|Minimum time interval between two scale actions.| No|600000| -|`autoScalerStrategy`|The algorithm of `autoScaler`. Druid only supports the `lagBased` strategy. See [Lag based autoscaler strategy related properties](#lag-based-autoscaler-strategy-related-properties) for more information.|No|Defaults to `lagBased`.| +|`autoScalerStrategy`|The algorithm of `autoScaler`. Druid only supports the `lagBased` strategy. 
See [Autoscaler strategy](#autoscaler-strategy) for more information.|No|`lagBased`| -### Lag based autoscaler strategy related properties +##### Autoscaler strategy -Unlike the Kafka indexing service, Kinesis reports lag metrics measured in time difference in milliseconds between the current sequence number and latest sequence number, rather than message count. +:::info +Unlike the Kafka indexing service, Kinesis indexing service reports lag metrics measured in time difference in milliseconds between the current sequence number and latest sequence number, rather than message count. +::: The following table outlines the configuration options for `autoScalerStrategy`: @@ -279,107 +187,7 @@ The following table outlines the configuration options for `autoScalerStrategy`: |`scaleInStep`|The number of tasks to reduce at once when scaling down.|No|1| |`scaleOutStep`|The number of tasks to add at once when scaling out.|No|2| -The following example shows a supervisor spec with `lagBased` auto scaler enabled. - -
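+
+For example, the following `ioConfig` sketch enables the `lagBased` autoscaler for a hypothetical stream named `metrics`. The threshold values are illustrative: because Kinesis lag is reported in milliseconds, a `scaleOutThreshold` of 600000 corresponds to roughly ten minutes of lag:
+
+```json
+"ioConfig": {
+  "stream": "metrics",
+  "inputFormat": { "type": "json" },
+  "endpoint": "kinesis.us-east-1.amazonaws.com",
+  "taskCount": 1,
+  "replicas": 1,
+  "taskDuration": "PT1H",
+  "autoScalerConfig": {
+    "enableTaskAutoScaler": true,
+    "taskCountMax": 6,
+    "taskCountMin": 2,
+    "minTriggerScaleActionFrequencyMillis": 600000,
+    "autoScalerStrategy": "lagBased",
+    "lagCollectionIntervalMillis": 30000,
+    "lagCollectionRangeMillis": 600000,
+    "scaleOutThreshold": 600000,
+    "triggerScaleOutFractionThreshold": 0.3,
+    "scaleInThreshold": 100000,
+    "triggerScaleInFractionThreshold": 0.9,
+    "scaleActionStartDelayMillis": 300000,
+    "scaleActionPeriodMillis": 60000,
+    "scaleInStep": 1,
+    "scaleOutStep": 2
+  }
+}
+```
+
+With this configuration, Druid starts with two reading tasks and scales between two and six tasks, checking for a scale action every minute after an initial five-minute delay.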
- Click to view the example - -```json -{ - "type": "kinesis", - "dataSchema": { - "dataSource": "metrics-kinesis", - "timestampSpec": { - "column": "timestamp", - "format": "auto" - }, - "dimensionsSpec": { - "dimensions": [], - "dimensionExclusions": [ - "timestamp", - "value" - ] - }, - "metricsSpec": [ - { - "name": "count", - "type": "count" - }, - { - "name": "value_sum", - "fieldName": "value", - "type": "doubleSum" - }, - { - "name": "value_min", - "fieldName": "value", - "type": "doubleMin" - }, - { - "name": "value_max", - "fieldName": "value", - "type": "doubleMax" - } - ], - "granularitySpec": { - "type": "uniform", - "segmentGranularity": "HOUR", - "queryGranularity": "NONE" - } - }, - "ioConfig": { - "stream": "metrics", - "autoScalerConfig": { - "enableTaskAutoScaler": true, - "taskCountMax": 6, - "taskCountMin": 2, - "minTriggerScaleActionFrequencyMillis": 600000, - "autoScalerStrategy": "lagBased", - "lagCollectionIntervalMillis": 30000, - "lagCollectionRangeMillis": 600000, - "scaleOutThreshold": 600000, - "triggerScaleOutFractionThreshold": 0.3, - "scaleInThreshold": 100000, - "triggerScaleInFractionThreshold": 0.9, - "scaleActionStartDelayMillis": 300000, - "scaleActionPeriodMillis": 60000, - "scaleInStep": 1, - "scaleOutStep": 2 - }, - "inputFormat": { - "type": "json" - }, - "endpoint": "kinesis.us-east-1.amazonaws.com", - "taskCount": 1, - "replicas": 1, - "taskDuration": "PT1H" - }, - "tuningConfig": { - "type": "kinesis", - "maxRowsPerSegment": 5000000 - } -} -``` - -
- -### Specify data format - -The Kinesis indexing service supports both [`inputFormat`](../../ingestion/data-formats.md#input-format) and [`parser`](../../ingestion/data-formats.md#parser) to specify the data format. -Use the `inputFormat` to specify the data format for the Kinesis indexing service unless you need a format only supported by the legacy `parser`. - -Supported values for `inputFormat` include: - -- `csv` -- `delimited` -- `json` -- `avro_stream` -- `avro_ocf` -- `protobuf` - -For more information, see [Data formats](../../ingestion/data-formats.md). You can also read [`thrift`](../extensions-contrib/thrift.md) formats using `parser`. - -## Supervisor tuning configuration +### Tuning configuration The `tuningConfig` object is optional. If you don't specify the `tuningConfig` object, Druid uses the default configuration settings. @@ -410,7 +218,7 @@ The following table outlines the configuration options for `tuningConfig`: |`recordBufferOfferTimeout`|Integer|The number of milliseconds to wait for space to become available in the buffer before timing out.|No|5000| |`recordBufferFullWait`|Integer|The number of milliseconds to wait for the buffer to drain before Druid attempts to fetch records from Kinesis again.|No|5000| |`fetchThreads`|Integer|The size of the pool of threads fetching data from Kinesis. There is no benefit in having more threads than Kinesis shards.|No| `procs * 2`, where `procs` is the number of processors available to the task.| -|`segmentWriteOutMediumFactory`|Object|The segment write-out medium to use when creating segments See [Additional Peon configuration: SegmentWriteOutMediumFactory](../../configuration/index.md#segmentwriteoutmediumfactory) for explanation and available options.|No|If not specified, Druid uses the value from `druid.peon.defaultSegmentWriteOutMediumFactory.type`.| +|`segmentWriteOutMediumFactory`|Object|The segment write-out medium to use when creating segments. See [Additional Peon configuration: SegmentWriteOutMediumFactory](../../configuration/index.md#segmentwriteoutmediumfactory) for explanation and available options.|No|If not specified, Druid uses the value from `druid.peon.defaultSegmentWriteOutMediumFactory.type`.| |`logParseExceptions`|Boolean|If `true`, Druid logs an error message when a parsing exception occurs, containing information about the row where the error occurred.|No|`false`| |`maxParseExceptions`|Integer|The maximum number of parse exceptions that can occur before the task halts ingestion and fails. Overridden if `reportParseExceptions` is set.|No|unlimited| |`maxSavedParseExceptions`|Integer|When a parse exception occurs, Druid keeps track of the most recent parse exceptions. `maxSavedParseExceptions` limits the number of saved exception instances. These saved exceptions are available after the task finishes in the [task completion report](../../ingestion/tasks.md#task-reports). Overridden if `reportParseExceptions` is set.|No|0| @@ -419,7 +227,7 @@ The following table outlines the configuration options for `tuningConfig`: |`offsetFetchPeriod`|ISO 8601 period|Determines how often the supervisor queries Kinesis and the indexing tasks to fetch current offsets and calculate lag. If the user-specified value is below the minimum value of PT5S, the supervisor ignores the value and uses the minimum value instead.|No|PT30S| |`useListShards`|Boolean|Indicates if `listShards` API of AWS Kinesis SDK can be used to prevent `LimitExceededException` during ingestion. 
You must set the necessary `IAM` permissions.|No|`false`| -### IndexSpec +#### IndexSpec The following table outlines the configuration options for `indexSpec`: @@ -430,11 +238,7 @@ The following table outlines the configuration options for `indexSpec`: |`metricCompression`|String|Compression format for primitive type metric columns. Choose from `LZ4`, `LZF`, `uncompressed`, or `none`.|No|`LZ4`| |`longEncoding`|String|Encoding format for metric and dimension columns with type long. Choose from `auto` or `longs`. `auto` encodes the values using sequence number or lookup table depending on column cardinality and stores them with variable sizes. `longs` stores the value as is with 8 bytes each.|No|`longs`| -## Operations - -This section describes how to use the [Supervisor API](../../api-reference/supervisor-api.md) with the Kinesis indexing service. - -### AWS authentication +## AWS authentication Druid uses AWS access and secret keys to authenticate Kinesis API requests. There are a few ways to provide this information to Druid: @@ -454,8 +258,10 @@ druid.kinesis.accessKey=AKIAWxxxxxxxxxx4NCKS druid.kinesis.secretKey=Jbytxxxxxxxxxxx2+555 ``` -> Note: AWS does not recommend providing long-term security credentials in configuration files since it might pose a security risk. +:::info +AWS does not recommend providing long-term security credentials in configuration files since it might pose a security risk. If you use this approach, it takes precedence over all other methods of providing credentials. +::: To ingest data from Kinesis, ensure that the policy attached to your IAM role contains the necessary permissions. The required permissions depend on the value of `useListShards`. @@ -513,125 +319,6 @@ The following is an example policy: ] ``` -### Get supervisor status report - -To retrieve the current status report for a single supervisor, send a `GET` request to the `/druid/indexer/v1/supervisor/:supervisorId/status` endpoint. - -The report contains the state of the supervisor tasks, the latest sequence numbers, and an array of recently thrown exceptions reported as `recentErrors`. You can control the maximum size of the exceptions using the `druid.supervisor.maxStoredExceptionEvents` configuration. - -The two properties related to the supervisor's state are `state` and `detailedState`. The `state` property contains a small number of generic states that apply to any type of supervisor, while the `detailedState` property contains a more descriptive, implementation-specific state that may provide more insight into the supervisor's activities. - -Possible `state` values are `PENDING`, `RUNNING`, `SUSPENDED`, `STOPPING`, `UNHEALTHY_SUPERVISOR`, and `UNHEALTHY_TASKS`. 
- -The following table lists `detailedState` values and their corresponding `state` mapping: - -|Detailed state|Corresponding state|Description| -|--------------|-------------------|-----------| -|`UNHEALTHY_SUPERVISOR`|`UNHEALTHY_SUPERVISOR`|The supervisor encountered errors on previous `druid.supervisor.unhealthinessThreshold` iterations.| -|`UNHEALTHY_TASKS`|`UNHEALTHY_TASKS`|The last `druid.supervisor.taskUnhealthinessThreshold` tasks all failed.| -|`UNABLE_TO_CONNECT_TO_STREAM`|`UNHEALTHY_SUPERVISOR`|The supervisor is encountering connectivity issues with Kinesis and has not successfully connected in the past.| -|`LOST_CONTACT_WITH_STREAM`|`UNHEALTHY_SUPERVISOR`|The supervisor is encountering connectivity issues with Kinesis but has successfully connected in the past.| -|`PENDING` (first iteration only)|`PENDING`|The supervisor has been initialized but hasn't started connecting to the stream.| -|`CONNECTING_TO_STREAM` (first iteration only)|`RUNNING`|The supervisor is trying to connect to the stream and update partition data.| -|`DISCOVERING_INITIAL_TASKS` (first iteration only)|`RUNNING`|The supervisor is discovering already-running tasks.| -|`CREATING_TASKS` (first iteration only)|`RUNNING`|The supervisor is creating tasks and discovering state.| -|`RUNNING`|`RUNNING`|The supervisor has started tasks and is waiting for `taskDuration` to elapse.| -|`SUSPENDED`|`SUSPENDED`|The supervisor is suspended.| -|`STOPPING`|`STOPPING`|The supervisor is stopping.| - -On each iteration of the supervisor's run loop, the supervisor completes the following tasks in sequence: - -1. Fetch the list of shards from Kinesis and determine the starting sequence number for each shard (either based on the last processed sequence number if continuing, or starting from the beginning or ending of the stream if this is a new stream). -2. Discover any running indexing tasks that are writing to the supervisor's datasource and adopt them if they match the supervisor's configuration, else signal them to stop. -3. Send a status request to each supervised task to update the view of the state of the tasks under supervision. -4. Handle tasks that have exceeded `taskDuration` and should transition from the reading to publishing state. -5. Handle tasks that have finished publishing and signal redundant replica tasks to stop. -6. Handle tasks that have failed and clean up the supervisor's internal state. -7. Compare the list of healthy tasks to the requested `taskCount` and `replicas` configurations and create additional tasks if required. - -The `detailedState` property shows additional values (marked with "first iteration only" in the preceding table) the first time the -supervisor executes this run loop after startup or after resuming from a suspension. This is intended to surface -initialization-type issues, where the supervisor is unable to reach a stable state. For example, if the supervisor cannot connect to -Kinesis, if it's unable to read from the stream, or cannot communicate with existing tasks. Once the supervisor is stable; -that is, once it has completed a full execution without encountering any issues, `detailedState` will show a `RUNNING` -state until it is stopped, suspended, or hits a failure threshold and transitions to an unhealthy state. - -### Update existing supervisors - -To update an existing supervisor spec, send a `POST` request to the `/druid/indexer/v1/supervisor` endpoint. 
- -When you call this endpoint on an existing supervisor for the same datasource, the running supervisor signals its tasks to stop reading and begin publishing their segments, exiting itself. Druid then uses the provided configuration from the request body to create a new supervisor with a new set of tasks that start reading from the sequence numbers, where the previous now-publishing tasks left off, but using the updated schema. -In this way, configuration changes can be applied without requiring any pause in ingestion. - -You can achieve seamless schema migrations by submitting the new schema using the `/druid/indexer/v1/supervisor` endpoint. - -### Suspend and resume a supervisor - -To suspend a supervisor, send a `POST` request to the `/druid/indexer/v1/supervisor/:supervisorId/suspend` endpoint. -Suspending a supervisor does not prevent it from operating and emitting logs and metrics. It ensures that no indexing tasks are running until the supervisor resumes. - -To resume a supervisor, send a `POST` request to the `/druid/indexer/v1/supervisor/:supervisorId/resume` endpoint. - -### Reset a supervisor - -The supervisor must be running for this endpoint to be available - -To reset a supervisor, send a `POST` request to the `/druid/indexer/v1/supervisor/:supervisorId/reset` endpoint. This endpoint clears stored -sequence numbers, prompting the supervisor to start reading from either the earliest or the -latest sequence numbers in Kinesis (depending on the value of `useEarliestSequenceNumber`). -After clearing stored sequence numbers, the supervisor kills and recreates active tasks, -so that tasks begin reading from valid sequence numbers. - -This endpoint is useful when you need to recover from a stopped state due to missing sequence numbers in Kinesis. -Use this endpoint with caution as it may result in skipped messages, leading to data loss or duplicate data. - -The indexing service keeps track of the latest -persisted sequence number to provide exactly-once ingestion guarantees across -tasks. -Subsequent tasks must start reading from where the previous task completed -for the generated segments to be accepted. If the messages at the expected starting sequence numbers are -no longer available in Kinesis (typically because the message retention period has elapsed or the topic was -removed and re-created) the supervisor will refuse to start and in-flight tasks will fail. This endpoint enables you to recover from this condition. - -### Resetting Offsets for a supervisor - -To reset partition offsets for a supervisor, send a `POST` request to the `/druid/indexer/v1/supervisor/:supervisorId/resetOffsets` endpoint. This endpoint clears stored -sequence numbers, prompting the supervisor to start reading from the specified offsets. -After resetting stored offsets, the supervisor kills and recreates any active tasks pertaining to the specified partitions, -so that tasks begin reading specified offsets. For partitions that are not specified in this operation, the supervisor will resume from the last -stored offset. - -Use this endpoint with caution as it may result in skipped messages, leading to data loss or duplicate data. - -### Terminate a supervisor - -To terminate a supervisor and its associated indexing tasks, send a `POST` request to the `/druid/indexer/v1/supervisor/:supervisorId/terminate` endpoint. -This places a tombstone marker in the database to prevent the supervisor from being reloaded on a restart and then gracefully -shuts down the currently running supervisor. 
-The tasks stop reading and begin publishing their segments immediately. -The call returns after all tasks have been signaled to stop but before the tasks finish publishing their segments. - -The terminated supervisor continues exists in the metadata store and its history can be retrieved. -The only way to restart a terminated supervisor is by submitting a functioning supervisor spec to `/druid/indexer/v1/supervisor`. - -## Capacity planning - -Kinesis indexing tasks run on Middle Managers and are limited by the resources available in the Middle Manager cluster. In particular, you should make sure that you have sufficient worker capacity, configured using the -`druid.worker.capacity` property, to handle the configuration in the supervisor spec. Note that worker capacity is -shared across all types of indexing tasks, so you should plan your worker capacity to handle your total indexing load, such as batch processing, streaming tasks, and merging tasks. If your workers run out of capacity, Kinesis indexing tasks queue and wait for the next available worker. This may cause queries to return partial results but will not result in data loss, assuming the tasks run before Kinesis purges those sequence numbers. - -A running task can be in one of two states: reading or publishing. A task remains in reading state for the period defined in `taskDuration`, at which point it transitions to publishing state. A task remains in publishing state for as long as it takes to generate segments, push segments to deep storage, and have them loaded and served by a Historical process or until `completionTimeout` elapses. - -The number of reading tasks is controlled by `replicas` and `taskCount`. In general, there are `replicas * taskCount` reading tasks. An exception occurs if `taskCount > {numKinesisShards}`, in which case Druid uses `{numKinesisShards}` tasks. When `taskDuration` elapses, these tasks transition to publishing state and `replicas * taskCount` new reading tasks are created. To allow for reading tasks and publishing tasks to run concurrently, there should be a minimum capacity of: - -```text -workerCapacity = 2 * replicas * taskCount -``` - -This value is for the ideal situation in which there is at most one set of tasks publishing while another set is reading. -In some circumstances, it is possible to have multiple sets of tasks publishing simultaneously. This would happen if the -time-to-publish (generate segment, push to deep storage, load on Historical) is greater than `taskDuration`. This is a valid and correct scenario but requires additional worker capacity to support. In general, it is a good idea to have `taskDuration` be large enough that the previous set of tasks finishes publishing before the current set begins. - ## Shards and segment handoff Each Kinesis indexing task writes the events it consumes from Kinesis shards into a single segment for the segment granularity interval until it reaches one of the following limits: `maxRowsPerSegment`, `maxTotalRows`, or `intermediateHandoffPeriod`. @@ -639,7 +326,7 @@ At this point, the task creates a new shard for this segment granularity to cont The Kinesis indexing task also performs incremental hand-offs so that the segments created by the task are not held up until the task duration is over. When the task reaches one of the `maxRowsPerSegment`, `maxTotalRows`, or `intermediateHandoffPeriod` limits, it hands off all the segments and creates a new set of segments for further events. 
This allows the task to run for longer durations -without accumulating old segments locally on Middle Manager processes. +without accumulating old segments locally on MiddleManager services. The Kinesis indexing service may still produce some small segments. For example, consider the following scenario: @@ -650,7 +337,7 @@ For example, consider the following scenario: After 4 hours at 13:10, Druid starts a new set of tasks. The events for the interval 13:00 - 14:00 may be split across existing tasks and the new set of tasks which could result in small segments. To merge them together into new segments of an ideal size (in the range of ~500-700 MB per segment), you can schedule re-indexing tasks, optionally with a different segment granularity. -For more detail, see [Segment size optimization](../../operations/segment-optimization.md). +For information on how to optimize the segment size, see [Segment size optimization](../../operations/segment-optimization.md). ## Determine fetch settings @@ -719,9 +406,16 @@ If resharding occurs when the supervisor is suspended and `useEarliestSequence` ## Kinesis known issues -Before you deploy the Kinesis extension to production, consider the following known issues: +Before you deploy the `druid-kinesis-indexing-service` extension to production, consider the following known issues: - Avoid implementing more than one Kinesis supervisor that reads from the same Kinesis stream for ingestion. Kinesis has a per-shard read throughput limit and having multiple supervisors on the same stream can reduce available read throughput for an individual supervisor's tasks. Multiple supervisors ingesting to the same Druid datasource can also cause increased contention for locks on the datasource. - The only way to change the stream reset policy is to submit a new ingestion spec and set up a new supervisor. - If ingestion tasks get stuck, the supervisor does not automatically recover. You should monitor ingestion tasks and investigate if your ingestion falls behind. - A Kinesis supervisor can sometimes compare the checkpoint offset to retention window of the stream to see if it has fallen behind. These checks fetch the earliest sequence number for Kinesis which can result in `IteratorAgeMilliseconds` becoming very high in AWS CloudWatch. + +## Learn more + +See the following topics for more information: + +* [Supervisor API](../../api-reference/supervisor-api.md) for how to manage and monitor supervisors using the API. +* [Supervisor](../../ingestion/supervisor.md) for supervisor status and capacity planning. \ No newline at end of file diff --git a/docs/ingestion/data-formats.md b/docs/ingestion/data-formats.md index 7dd1b10c7fa9..da9639e1a6c8 100644 --- a/docs/ingestion/data-formats.md +++ b/docs/ingestion/data-formats.md @@ -398,8 +398,8 @@ For details, see the Schema Registry [documentation](http://docs.confluent.io/cu | url | String | Specifies the URL endpoint of the Schema Registry. | yes | | capacity | Integer | Specifies the max size of the cache (default = Integer.MAX_VALUE). | no | | urls | Array | Specifies the URL endpoints of the multiple Schema Registry instances. | yes (if `url` is not provided) | -| config | Json | To send additional configurations, configured for Schema Registry. This can be supplied via a [DynamicConfigProvider](../operations/dynamic-config-provider.md) | no | -| headers | Json | To send headers to the Schema Registry. 
This can be supplied via a [DynamicConfigProvider](../operations/dynamic-config-provider.md) | no | +| config | Json | To send additional configurations, configured for Schema Registry. This can be supplied via a [DynamicConfigProvider](../operations/dynamic-config-provider.md) | no | +| headers | Json | To send headers to the Schema Registry. This can be supplied via a [DynamicConfigProvider](../operations/dynamic-config-provider.md) | no | For a single schema registry instance, use Field `url` or `urls` for multi instances. @@ -549,52 +549,62 @@ For example: ### Kafka -`kafka` is a special input format that wraps a regular input format (which goes in `valueFormat`) and allows you -to parse the Kafka metadata (timestamp, headers, and key) that is part of Kafka messages. -It should only be used when ingesting from Apache Kafka. +The `kafka` input format lets you parse the Kafka metadata fields in addition to the Kafka payload value contents. +It should only be used when ingesting from Apache Kafka. -Configure the Kafka `inputFormat` as follows: +The `kafka` input format wraps around the payload parsing input format and augments the data it outputs with the Kafka event timestamp, topic name, event headers, and the key field that itself can be parsed using any available input format. -| Field | Type | Description | Required | -|-------|------|-------------|----------| -| `type` | String | Set value to `kafka`. | yes | -| `valueFormat` | [InputFormat](#input-format) | Any [InputFormat](#input-format) to parse the Kafka value payload. For details about specifying the input format, see [Specifying data format](../development/extensions-core/kafka-supervisor-reference.md#specifying-data-format). | yes | -| `timestampColumnName` | String | Name of the column for the kafka record's timestamp.| no (default = "kafka.timestamp") | -| `topicColumnName` | String |Name of the column for the kafka record's topic. It is useful when ingesting data from multiple topics.| no (default = "kafka.timestamp") | -| `headerColumnPrefix` | String | Custom prefix for all the header columns. | no (default = "kafka.header.") | -| `headerFormat` | Object | `headerFormat` specifies how to parse the Kafka headers. Supports String types. Because Kafka header values are bytes, the parser decodes them as UTF-8 encoded strings. To change this behavior, implement your own parser based on the encoding style. Change the 'encoding' type in `KafkaStringHeaderFormat` to match your custom implementation. | no | -| `keyFormat` | [InputFormat](#input-format) | Any [input format](#input-format) to parse the Kafka key. It only processes the first entry of the `inputFormat` field. For details, see [Specifying data format](../development/extensions-core/kafka-supervisor-reference.md#specifying-data-format). | no | -| `keyColumnName` | String | Name of the column for the kafka record's key.| no (default = "kafka.key") | +If there are conflicts between column names in the payload and those created from the metadata, the payload takes precedence. +This ensures that upgrading a Kafka ingestion to use the Kafka input format (by taking its existing input format and setting it as the `valueFormat`) can be done without losing any of the payload data. +Configure the Kafka `inputFormat` as follows: -The Kafka input format augments the payload with information from the Kafka timestamp, headers, and key. +| Field | Type | Description | Required | Default | +|-------|------|-------------|----------|---------| +| `type` | String | Set value to `kafka`. 
| yes ||
+| `valueFormat` | [InputFormat](#input-format) | The [input format](#input-format) to parse the Kafka value payload. | yes ||
+| `timestampColumnName` | String | The name of the column for the Kafka timestamp.| no |`kafka.timestamp`|
+| `topicColumnName` | String |The name of the column for the Kafka topic. This field is useful when ingesting data from multiple topics into the same datasource.| no |`kafka.topic`|
+| `headerColumnPrefix` | String | The custom prefix for all the header columns. | no | `kafka.header.`|
+| `headerFormat` | Object | Specifies how to parse the Kafka headers. Supports String types. Because Kafka header values are bytes, the parser decodes them as UTF-8 encoded strings. To change this behavior, implement your own parser based on the encoding style. Change the `encoding` type in `KafkaStringHeaderFormat` to match your custom implementation. See [Header format](#header-format) for supported encoding formats.| no ||
+| `keyFormat` | [InputFormat](#input-format) | The [input format](#input-format) to parse the Kafka key. It only processes the first entry of the `inputFormat` field. If your key values are simple strings, you can use the `tsv` format to parse them. Note that for the `tsv`, `csv`, and `regex` formats, you need to provide a `columns` array to make a valid input format. Only the first column is used, and its name is ignored in favor of `keyColumnName`. | no ||
+| `keyColumnName` | String | The name of the column for the Kafka key.| no |`kafka.key`|
+
+#### Header format
+
+`headerFormat` supports the following encoding formats:
+
+- `ISO-8859-1`: ISO Latin Alphabet No. 1, that is, ISO-LATIN-1.
+- `US-ASCII`: Seven-bit ASCII. Also known as ISO646-US. The Basic Latin block of the Unicode character set.
+- `UTF-8`: Eight-bit UCS Transformation Format.
+- `UTF-16`: Sixteen-bit UCS Transformation Format, byte order identified by an optional byte-order mark.
+- `UTF-16BE`: Sixteen-bit UCS Transformation Format, big-endian byte order.
+- `UTF-16LE`: Sixteen-bit UCS Transformation Format, little-endian byte order.
+
+Use `headerColumnPrefix` to supply a prefix for the header columns and avoid conflicts with columns from the payload. The default is `kafka.header.`.
+
+#### Example
+
+Using `{ "type": "json" }` as the input format only parses the payload value.
+To parse the Kafka metadata in addition to the payload, use the `kafka` input format.
+
+For example, consider the following structure for a Kafka message that represents an edit in a development environment:
+
+- **Kafka timestamp**: `1680795276351`
+- **Kafka topic**: `wiki-edits`
+- **Kafka headers**:
+  - `env=development`
+  - `zone=z1`
+- **Kafka key**: `wiki-edit`
+- **Kafka payload value**: `{"channel":"#sv.wikipedia","timestamp":"2016-06-27T00:00:11.080Z","page":"Salo Toraut","delta":31,"namespace":"Main"}`
+
-Here is a minimal example that only augments the parsed payload with the Kafka timestamp column and kafka topic column:
-
-```
-"ioConfig": {
-  "inputFormat": {
-    "type": "kafka",
-    "valueFormat": {
-      "type": "json"
-    }
-  },
-  ... 
-} -``` +For example, consider the following structure for a Kafka message that represents an edit in a development environment: -Here is a complete example: +- **Kafka timestamp**: `1680795276351` +- **Kafka topic**: `wiki-edits` +- **Kafka headers**: + - `env=development` + - `zone=z1` +- **Kafka key**: `wiki-edit` +- **Kafka payload value**: `{"channel":"#sv.wikipedia","timestamp":"2016-06-27T00:00:11.080Z","page":"Salo Toraut","delta":31,"namespace":"Main"}` -``` +You would configure it as follows: + +```json "ioConfig": { "inputFormat": { "type": "kafka", "valueFormat": { "type": "json" - } + }, "timestampColumnName": "kafka.timestamp", "topicColumnName": "kafka.topic", "headerFormat": { @@ -608,8 +618,24 @@ Here is a complete example: "columns": ["x"] }, "keyColumnName": "kafka.key", - }, - ... + } +} +``` + +You would parse the example message as follows: + +```json +{ + "channel": "#sv.wikipedia", + "timestamp": "2016-06-27T00:00:11.080Z", + "page": "Salo Toraut", + "delta": 31, + "namespace": "Main", + "kafka.timestamp": 1680795276351, + "kafka.topic": "wiki-edits", + "kafka.header.env": "development", + "kafka.header.zone": "z1", + "kafka.key": "wiki-edit" } ``` @@ -631,6 +657,80 @@ Similarly, if you want to use a timestamp extracted from the Kafka header: } ``` +Finally, add these Kafka metadata columns to the `dimensionsSpec` or set your `dimensionsSpec` to auto-detect columns. + +The following supervisor spec demonstrates how to ingest the Kafka header, key, timestamp, and topic into Druid dimensions: + +
+<summary>Click to view the example</summary>
+
+```json
+{
+  "type": "kafka",
+  "spec": {
+    "ioConfig": {
+      "type": "kafka",
+      "consumerProperties": {
+        "bootstrap.servers": "localhost:9092"
+      },
+      "topic": "wiki-edits",
+      "inputFormat": {
+        "type": "kafka",
+        "valueFormat": {
+          "type": "json"
+        },
+        "headerFormat": {
+          "type": "string"
+        },
+        "keyFormat": {
+          "type": "tsv",
+          "findColumnsFromHeader": false,
+          "columns": ["x"]
+        }
+      },
+      "useEarliestOffset": true
+    },
+    "dataSchema": {
+      "dataSource": "wikiticker",
+      "timestampSpec": {
+        "column": "timestamp",
+        "format": "posix"
+      },
+      "dimensionsSpec": {
+        "useSchemaDiscovery": true,
+        "includeAllDimensions": true
+      },
+      "granularitySpec": {
+        "queryGranularity": "none",
+        "rollup": false,
+        "segmentGranularity": "day"
+      }
+    },
+    "tuningConfig": {
+      "type": "kafka"
+    }
+  }
+}
+```
+</details>
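+
+If you prefer to declare dimensions explicitly instead of using schema discovery, a `dimensionsSpec` along the following lines also works. This is a sketch; the column names are taken from the example message above:
+
+```json
+"dimensionsSpec": {
+  "dimensions": [
+    "channel",
+    "page",
+    "namespace",
+    "kafka.topic",
+    "kafka.header.env",
+    "kafka.header.zone",
+    "kafka.key"
+  ]
+}
+```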
+ +After Druid ingests the data, you can query the Kafka metadata columns as follows: + +```sql +SELECT + "kafka.header.env", + "kafka.key", + "kafka.timestamp", + "kafka.topic" +FROM "wikiticker" +``` + +This query returns: + +| `kafka.header.env` | `kafka.key` | `kafka.timestamp` | `kafka.topic` | +|--------------------|-----------|---------------|---------------| +| `development` | `wiki-edit` | `1680795276351` | `wiki-edits` | + ## FlattenSpec You can use the `flattenSpec` object to flatten nested data, as an alternative to the Druid [nested columns](../querying/nested-columns.md) feature, and for nested input formats unsupported by the feature. It is an object within the `inputFormat` object. diff --git a/docs/ingestion/ingestion-spec.md b/docs/ingestion/ingestion-spec.md index 017b4f38bec5..0dc54ead97f4 100644 --- a/docs/ingestion/ingestion-spec.md +++ b/docs/ingestion/ingestion-spec.md @@ -503,7 +503,7 @@ is: |skipBytesInMemoryOverheadCheck|The calculation of maxBytesInMemory takes into account overhead objects created during ingestion and each intermediate persist. Setting this to true can exclude the bytes of these overhead objects from maxBytesInMemory check.|false| |indexSpec|Defines segment storage format options to use at indexing time.|See [`indexSpec`](#indexspec) for more information.| |indexSpecForIntermediatePersists|Defines segment storage format options to use at indexing time for intermediate persisted temporary segments.|See [`indexSpec`](#indexspec) for more information.| -|Other properties|Each ingestion method has its own list of additional tuning properties. See the documentation for each method for a full list: [Kafka indexing service](../development/extensions-core/kafka-supervisor-reference.md#supervisor-tuning-configuration), [Kinesis indexing service](../development/extensions-core/kinesis-ingestion.md#supervisor-tuning-configuration), [Native batch](native-batch.md#tuningconfig), and [Hadoop-based](hadoop.md#tuningconfig).|| +|Other properties|Each ingestion method has its own list of additional tuning properties. See the documentation for each method for a full list: [Kafka indexing service](../development/extensions-core/kafka-ingestion.md#tuning-configuration), [Kinesis indexing service](../development/extensions-core/kinesis-ingestion.md#tuning-configuration), [Native batch](native-batch.md#tuningconfig), and [Hadoop-based](hadoop.md#tuningconfig).|| ### `indexSpec` diff --git a/docs/ingestion/streaming.md b/docs/ingestion/streaming.md new file mode 100644 index 000000000000..ba15e6de6367 --- /dev/null +++ b/docs/ingestion/streaming.md @@ -0,0 +1,35 @@ +--- +id: streaming +title: "Streaming ingestion" +--- + + + +Apache Druid accepts data streams from the following external streaming sources: + +* Apache Kafka through the bundled [Kafka indexing service](../development/extensions-core/kafka-ingestion.md) extension. +* Amazon Kinesis through the bundled [Kinesis indexing service](../development/extensions-core/kinesis-ingestion.md) extension. + +Each indexing service provides real-time data ingestion with exactly-once stream processing guarantee. +To use either of the streaming ingestion methods, you must first load the associated extension on both the Overlord and the MiddleManager. See [Loading extensions](../configuration/extensions.md#loading-extensions) for more information. + +Streaming ingestion is controlled by a continuously running [supervisor](supervisor.md). 
+The supervisor oversees the state of indexing tasks to coordinate handoffs, manage failures, and ensure that scalability and replication requirements are maintained.
+You start a supervisor by submitting a JSON specification, often referred to as the supervisor spec, either through the Druid web console or using the [Supervisor API](../api-reference/supervisor-api.md).
\ No newline at end of file
diff --git a/docs/ingestion/supervisor.md b/docs/ingestion/supervisor.md
new file mode 100644
index 000000000000..c7785f2757b0
--- /dev/null
+++ b/docs/ingestion/supervisor.md
@@ -0,0 +1,117 @@
+---
+id: supervisor
+title: Supervisor
+sidebar_label: Supervisor
+---
+
+A supervisor manages streaming ingestion from external streaming sources into Apache Druid.
+Supervisors oversee the state of indexing tasks to coordinate handoffs, manage failures, and ensure that the scalability and replication requirements are maintained.
+
+## Supervisor spec
+
+You use a JSON specification, often referred to as the supervisor spec, to define streaming ingestion tasks.
+The supervisor spec specifies how Druid should consume, process, and index streaming data.
+
+Druid starts a new supervisor for a datasource when you create a supervisor spec.
+You can create and manage supervisor specs using the data loader in the Druid web console or by calling the [Supervisor API](../api-reference/supervisor-api.md).
+Once started, the supervisor persists in the configured metadata database. There can only be one supervisor per datasource, and submitting a second supervisor spec for the same datasource overwrites the previous one.
+
+When an Overlord gains leadership, either by being started or as a result of another Overlord failing, it spawns a supervisor for each supervisor spec in the metadata database. The supervisor then discovers running indexing tasks and attempts to adopt them if they are compatible with the supervisor's configuration. If they are not compatible, the tasks are terminated and the supervisor creates a new set of tasks. This way, the supervisors persist across Overlord restarts and failovers.
+
+### Schema and configuration changes
+
+Schema and configuration changes are handled by submitting the new supervisor spec. The Overlord initiates a graceful shutdown of the existing supervisor. The running supervisor signals its tasks to stop reading and begin publishing, exiting itself. Druid then uses the provided configuration to create a new supervisor. Druid submits a new schema while retaining existing publishing tasks and starts new tasks at the previous task offsets.
+This way, configuration changes can be applied without requiring any pause in ingestion.
+
+## Status report
+
+The supervisor status report contains the state of the supervisor tasks and an array of recently thrown exceptions reported as `recentErrors`.
+To retrieve the current status report for a single supervisor, send a `GET` request to the `/druid/indexer/v1/supervisor/:supervisorId/status` endpoint.
+You can control the maximum number of stored exception events using the `druid.supervisor.maxStoredExceptionEvents` configuration.
+
+The two properties related to the supervisor's state are `state` and `detailedState`. The `state` property contains a small number of generic states that apply to any type of supervisor, while the `detailedState` property contains a more descriptive, implementation-specific state that may provide more insight into the supervisor's activities. 
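+
+For example, the relevant fields of a healthy supervisor's status payload might look like the following. This is a truncated, illustrative sketch; the full report also carries task and lag details:
+
+```json
+{
+  "state": "RUNNING",
+  "detailedState": "RUNNING",
+  "healthy": true,
+  "recentErrors": []
+}
+```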
+
+Possible `state` values are `PENDING`, `RUNNING`, `SUSPENDED`, `STOPPING`, `UNHEALTHY_SUPERVISOR`, and `UNHEALTHY_TASKS`.
+
+The following table lists `detailedState` values and their corresponding `state` mapping:
+
+|Detailed state|Corresponding state|Description|
+|--------------|-------------------|-----------|
+|`UNHEALTHY_SUPERVISOR`|`UNHEALTHY_SUPERVISOR`|The supervisor encountered errors on previous `druid.supervisor.unhealthinessThreshold` iterations.|
+|`UNHEALTHY_TASKS`|`UNHEALTHY_TASKS`|The last `druid.supervisor.taskUnhealthinessThreshold` tasks all failed.|
+|`UNABLE_TO_CONNECT_TO_STREAM`|`UNHEALTHY_SUPERVISOR`|The supervisor is encountering connectivity issues with the stream and has not successfully connected in the past.|
+|`LOST_CONTACT_WITH_STREAM`|`UNHEALTHY_SUPERVISOR`|The supervisor is encountering connectivity issues with the stream but has successfully connected in the past.|
+|`PENDING` (first iteration only)|`PENDING`|The supervisor has been initialized but hasn't started connecting to the stream.|
+|`CONNECTING_TO_STREAM` (first iteration only)|`RUNNING`|The supervisor is trying to connect to the stream and update partition data.|
+|`DISCOVERING_INITIAL_TASKS` (first iteration only)|`RUNNING`|The supervisor is discovering already-running tasks.|
+|`CREATING_TASKS` (first iteration only)|`RUNNING`|The supervisor is creating tasks and discovering state.|
+|`RUNNING`|`RUNNING`|The supervisor has started tasks and is waiting for `taskDuration` to elapse.|
+|`IDLE`|`IDLE`|The supervisor is not creating tasks because the input stream has not received any new data and all the existing data has been read.|
+|`SUSPENDED`|`SUSPENDED`|The supervisor is suspended.|
+|`STOPPING`|`STOPPING`|The supervisor is stopping.|
+
+On each iteration of the supervisor's run loop, the supervisor completes the following tasks in sequence:
+
+1. Fetch the list of units of parallelism, such as Kinesis shards or Kafka partitions, and determine the starting sequence number or offset for each unit (based on the last processed sequence number or offset when continuing, or starting from the beginning or end of the stream when it's a new stream).
+2. Discover any running indexing tasks that are writing to the supervisor's datasource and adopt them if they match the supervisor's configuration; otherwise, signal them to stop.
+3. Send a status request to each supervised task to update the view of the state of the tasks under supervision.
+4. Handle tasks that have exceeded `taskDuration` and should transition from the reading to publishing state.
+5. Handle tasks that have finished publishing and signal redundant replica tasks to stop.
+6. Handle tasks that have failed and clean up the supervisor's internal state.
+7. Compare the list of healthy tasks to the requested `taskCount` and `replicas` configurations and create additional tasks if required.
+
+The `detailedState` property shows additional values (marked with "first iteration only" in the preceding table) the first time the
+supervisor executes this run loop after startup or after resuming from a suspension. This is intended to surface
+initialization-type issues that prevent the supervisor from reaching a stable state: for example, the supervisor cannot connect to
+the stream, cannot read from it, or cannot communicate with existing tasks. 
Once the supervisor is stable, that is,
+once it has completed a full execution without encountering any issues, `detailedState` shows a `RUNNING`
+state until it is stopped, suspended, or hits a failure threshold and transitions to an unhealthy state.
+
+:::info
+For the Kafka indexing service, the consumer lag per partition may be reported as negative values if the supervisor hasn't received the latest offset response from Kafka. The aggregate lag value is always >= 0.
+:::
+
+## Capacity planning
+
+Indexing tasks run on MiddleManagers and are limited by the resources available in the MiddleManager cluster. In particular, you should make sure that you have sufficient worker capacity, configured using the
+`druid.worker.capacity` property, to handle the configuration in the supervisor spec. Note that worker capacity is
+shared across all types of indexing tasks, so you should plan your worker capacity to handle your total indexing load, such as batch processing, streaming tasks, and merging tasks. If your workers run out of capacity, indexing tasks queue and wait for the next available worker. This may cause queries to return partial results but will not result in data loss, assuming the tasks run before the stream purges those sequence numbers.
+
+A running task can be in one of two states: reading or publishing. A task remains in reading state for the period defined in `taskDuration`, at which point it transitions to publishing state. A task remains in publishing state for as long as it takes to generate segments, push segments to deep storage, and have them loaded and served by a Historical service or until `completionTimeout` elapses.
+
+The number of reading tasks is controlled by `replicas` and `taskCount`. In general, there are `replicas * taskCount` reading tasks. An exception occurs if `taskCount` is over the number of shards in Kinesis or partitions in Kafka, in which case Druid uses the number of shards or partitions. When `taskDuration` elapses, these tasks transition to publishing state and `replicas * taskCount` new reading tasks are created. To allow for reading tasks and publishing tasks to run concurrently, there should be a minimum capacity of:
+
+```text
+workerCapacity = 2 * replicas * taskCount
+```
+
+For example, with `replicas` set to 2 and `taskCount` set to 3, plan for a worker capacity of at least 12.
+
+This value is for the ideal situation in which there is at most one set of tasks publishing while another set is reading.
+In some circumstances, it is possible to have multiple sets of tasks publishing simultaneously. This would happen if the
+time-to-publish (generate segment, push to deep storage, load on Historical) is greater than `taskDuration`. This is a valid and correct scenario but requires additional worker capacity to support. In general, it is a good idea to have `taskDuration` be large enough that the previous set of tasks finishes publishing before the current set begins.
+
+## Learn more
+
+See the following topics for more information:
+
+* [Supervisor API](../api-reference/supervisor-api.md) for how to manage and monitor supervisors using the API.
+* [Apache Kafka ingestion](../development/extensions-core/kafka-ingestion.md) to learn about ingesting data from an Apache Kafka stream.
+* [Amazon Kinesis ingestion](../development/extensions-core/kinesis-ingestion.md) to learn about ingesting data from an Amazon Kinesis stream. 
\ No newline at end of file diff --git a/docs/querying/sql-metadata-tables.md b/docs/querying/sql-metadata-tables.md index 8e9bce9fad95..829ed9433063 100644 --- a/docs/querying/sql-metadata-tables.md +++ b/docs/querying/sql-metadata-tables.md @@ -299,7 +299,7 @@ The supervisors table provides information about supervisors. |Column|Type|Notes| |------|-----|-----| |supervisor_id|VARCHAR|Supervisor task identifier| -|state|VARCHAR|Basic state of the supervisor. Available states: `UNHEALTHY_SUPERVISOR`, `UNHEALTHY_TASKS`, `PENDING`, `RUNNING`, `SUSPENDED`, `STOPPING`. Check [Kafka Docs](../development/extensions-core/kafka-supervisor-operations.md) for details.| +|state|VARCHAR|Basic state of the supervisor. Available states: `UNHEALTHY_SUPERVISOR`, `UNHEALTHY_TASKS`, `PENDING`, `RUNNING`, `SUSPENDED`, `STOPPING`. See [Supervisor reference](../ingestion/supervisor.md) for more information.| |detailed_state|VARCHAR|Supervisor specific state. (See documentation of the specific supervisor for details, e.g. [Kafka](../development/extensions-core/kafka-ingestion.md) or [Kinesis](../development/extensions-core/kinesis-ingestion.md))| |healthy|BIGINT|Boolean represented as long type where 1 = true, 0 = false. 1 indicates a healthy supervisor| |type|VARCHAR|Type of supervisor, e.g. `kafka`, `kinesis` or `materialized_view`| diff --git a/docs/tutorials/tutorial-kafka.md b/docs/tutorials/tutorial-kafka.md index 9e74f467c4f3..7e03671a6b6a 100644 --- a/docs/tutorials/tutorial-kafka.md +++ b/docs/tutorials/tutorial-kafka.md @@ -295,6 +295,4 @@ Check out the [Querying data tutorial](../tutorials/tutorial-query.md) to run so For more information, see the following topics: -- [Apache Kafka ingestion](../development/extensions-core/kafka-ingestion.md) for more information on loading data from Kafka streams. -- [Apache Kafka supervisor reference](../development/extensions-core/kafka-supervisor-reference.md) for Kafka supervisor configuration information. -- [Apache Kafka supervisor operations reference](../development/extensions-core/kafka-supervisor-operations.md) for information on running and maintaining Kafka supervisors for Druid. +- [Apache Kafka ingestion](../development/extensions-core/kafka-ingestion.md) for information on loading data from Kafka streams and maintaining Kafka supervisors for Druid. 
diff --git a/website/redirects.js b/website/redirects.js index db3160513e66..93cd96390b7c 100644 --- a/website/redirects.js +++ b/website/redirects.js @@ -118,7 +118,9 @@ const Redirects=[ "from": [ "/docs/latest/development/community-extensions/kafka-simple.html", "/docs/latest/development/community-extensions/rabbitmq.html", - "/docs/latest/development/kafka-simple-consumer-firehose.html" + "/docs/latest/development/kafka-simple-consumer-firehose.html", + "/docs/latest/development/extensions-core/kafka-supervisor-operations.html", + "/docs/latest/development/extensions-core/kafka-supervisor-reference.html" ], "to": "/docs/latest/development/extensions-core/kafka-ingestion" }, diff --git a/website/sidebars.json b/website/sidebars.json index 9e4267fd95f8..80dd3bafc9ab 100644 --- a/website/sidebars.json +++ b/website/sidebars.json @@ -80,9 +80,9 @@ "type": "category", "label": "Streaming", "items": [ + "ingestion/streaming", + "ingestion/supervisor", "development/extensions-core/kafka-ingestion", - "development/extensions-core/kafka-supervisor-reference", - "development/extensions-core/kafka-supervisor-operations", "development/extensions-core/kinesis-ingestion" ] }, From c14eb91e74e03827fd73854cb1017f5a5407bef0 Mon Sep 17 00:00:00 2001 From: Katya Macedo Date: Mon, 18 Dec 2023 21:46:28 -0600 Subject: [PATCH 02/15] Update property definition --- docs/development/extensions-core/kafka-ingestion.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/development/extensions-core/kafka-ingestion.md b/docs/development/extensions-core/kafka-ingestion.md index c842deef7c51..98b4bfeeffe6 100644 --- a/docs/development/extensions-core/kafka-ingestion.md +++ b/docs/development/extensions-core/kafka-ingestion.md @@ -323,7 +323,7 @@ The following table outlines the configuration options for `tuningConfig`: |`indexSpecForIntermediatePersists`|Object|Defines segment storage format options to use at indexing time for intermediate persisted temporary segments. You can use `indexSpecForIntermediatePersists` to disable dimension/metric compression on intermediate segments to reduce memory required for final merging. However, disabling compression on intermediate segments might increase page cache use while they are used before getting merged into final segment published. See [IndexSpec](#indexspec) for possible values.|No|Same as `indexSpec`| |`reportParseExceptions`|Boolean|DEPRECATED. If `true`, Druid throws exceptions encountered during parsing causing ingestion to halt. If `false`, Druid skips unparseable rows and fields. Setting `reportParseExceptions` to `true` overrides existing configurations for `maxParseExceptions` and `maxSavedParseExceptions`, setting `maxParseExceptions` to 0 and limiting `maxSavedParseExceptions` to not more than 1.|No|`false`| |`handoffConditionTimeout`|Long|Number of milliseconds to wait for segment handoff. Set to a value >= 0, where 0 means to wait indefinitely.|No|900000 (15 minutes)| -|`resetOffsetAutomatically`|Boolean|Controls behavior when Druid needs to read Kafka messages that are no longer available, when `offsetOutOfRangeException` is encountered.
If `false`, the exception bubbles up causing tasks to fail and ingestion to halt. If this occurs, manual intervention is required to correct the situation, potentially using the [Reset Supervisor API](../../api-reference/supervisor-api.md#reset-a-supervisor). This mode is useful for production, since it will make you aware of issues with ingestion.
If `true`, Druid will automatically reset to the earlier or latest offset available in Kafka, based on the value of the `useEarliestOffset` property (earliest if `true`, latest if `false`). Note that this can lead to dropping data (if `useEarliestSequenceNumber` is `false`) or duplicating data (if `useEarliestSequenceNumber` is `true`) without your knowledge. Druid logs messages indicating that a reset has occurred without interrupting ingestion. This mode is useful for non-production situations since it enables Druid to recover from problems automatically, even if they lead to quiet dropping or duplicating of data. This feature behaves similarly to the Kafka `auto.offset.reset` consumer property.|No|`false`|
+|`resetOffsetAutomatically`|Boolean|Determines how Druid reads Kafka messages when partitions in the topic have `offsetOutOfRangeException`. This feature behaves similarly to the Kafka `auto.offset.reset` consumer property. If `resetOffsetAutomatically` is set to `true`, Druid automatically resets to the earliest or latest offset available in Kafka, based on the value of the `useEarliestOffset` property (earliest if `true`, latest if `false`). Druid logs messages to the ingestion task log file indicating that a reset has occurred without interrupting ingestion. Setting `resetOffsetAutomatically` to `true` can lead to dropping data (if `useEarliestOffset` is `false`) or duplicating data (if `useEarliestOffset` is `true`) without your knowledge.
If only one partition in the topic has `offsetOutOfRangeException`, the offset is reset for that partition only.
If `resetOffsetAutomatically` is `false`, the exception bubbles up causing tasks to fail and ingestion to halt. If this occurs, manual intervention is required to correct the situation, potentially using the [Reset Supervisor API](../../api-reference/supervisor-api.md#reset-a-supervisor). |No|`false`| |`workerThreads`|Integer|The number of threads that the supervisor uses to handle requests/responses for worker tasks, along with any other internal asynchronous operation.|No|`min(10, taskCount)`| |`chatAsync`|Boolean|If `true`, use asynchronous communication with indexing tasks, and ignore the `chatThreads` parameter. If `false`, use synchronous communication in a thread pool of size `chatThreads`.|No|`true`| |`chatThreads`|Integer|The number of threads to use for communicating with indexing tasks. Ignored if `chatAsync` is `true`.|No|`min(10, taskCount * replicas)`| From 423f46539aa95f0418981cb9d6c02053495b7f90 Mon Sep 17 00:00:00 2001 From: Katya Macedo Date: Wed, 17 Jan 2024 16:07:32 -0600 Subject: [PATCH 03/15] Update after review --- docs/api-reference/supervisor-api.md | 27 +++++++++---------- docs/configuration/index.md | 12 ++++----- .../extensions-core/kafka-ingestion.md | 19 +++++++------ .../extensions-core/kinesis-ingestion.md | 10 +++---- 4 files changed, 32 insertions(+), 36 deletions(-) diff --git a/docs/api-reference/supervisor-api.md b/docs/api-reference/supervisor-api.md index ddbb67c819e0..aeb777364362 100644 --- a/docs/api-reference/supervisor-api.md +++ b/docs/api-reference/supervisor-api.md @@ -2213,24 +2213,22 @@ Host: http://ROUTER_IP:ROUTER_PORT ### Create or update a supervisor -Creates a new supervisor or updates an existing one for the same datasource with a new schema and configuration. +Creates a new supervisor spec or updates an existing one with new configuration and schema information. When updating a supervisor spec, the datasource must remain the same as the previous supervisor. -You can define a supervisor spec for [Apache Kafka](../development/extensions-core/kinesis-ingestion.md#supervisor-spec) or [Amazon Kinesis](../development/extensions-core/kinesis-ingestion.md#supervisor-spec) streaming ingestion methods. Once created, the supervisor persists in the metadata database. +You can define a supervisor spec for [Apache Kafka](../development/extensions-core/kinesis-ingestion.md#supervisor-spec) or [Amazon Kinesis](../development/extensions-core/kinesis-ingestion.md#supervisor-spec) streaming ingestion methods. The following table lists the properties of a supervisor spec: |Property|Type|Description|Required| |--------|----|-----------|--------| -|`type`|String|The supervisor type. Choose from `kafka` or `kinesis`.|Yes| +|`type`|String|The supervisor type. One of`kafka` or `kinesis`.|Yes| |`spec`|Object|The container object for the supervisor configuration.|Yes| |`ioConfig`|Object|The I/O configuration object to define the connection and I/O-related settings for the supervisor and indexing task.|Yes| |`dataSchema`|Object|The schema for the indexing task to use during ingestion. See [`dataSchema`](../ingestion/ingestion-spec.md#dataschema) for more information.|Yes| |`tuningConfig`|Object|The tuning configuration object to define performance-related settings for the supervisor and indexing tasks.|No| -When you call this endpoint on an existing supervisor for the same datasource, the running supervisor signals its tasks to stop reading and begin publishing, exiting itself. 
Druid then uses the provided configuration from the request body to create a new supervisor. Druid submits a new schema while retaining existing publishing tasks and starts new tasks at the previous task offsets. -In this way, configuration changes can be applied without requiring any pause in ingestion. - -You can achieve seamless schema migrations by submitting the new schema using the `/druid/indexer/v1/supervisor` endpoint. +When you call this endpoint on an existing supervisor, the running supervisor signals its tasks to stop reading and begin publishing, exiting itself. Druid then uses the provided configuration from the request body to create a new supervisor. Druid submits a new schema while retaining existing publishing tasks and starts new tasks at the previous task offsets. +This way, you can apply configuration changes without a pause in ingestion. #### URL @@ -2399,7 +2397,7 @@ Content-Length: 1359 ### Suspend a running supervisor Suspends a single running supervisor. Returns the updated supervisor spec, where the `suspended` property is set to `true`. The suspended supervisor continues to emit logs and metrics. -Indexing tasks remain suspended until the supervisor is resumed. +Indexing tasks remain suspended until you [resume the supervisor](#resume-a-supervisor). #### URL POST /druid/indexer/v1/supervisor/:supervisorId/suspend @@ -3247,13 +3245,13 @@ Host: http://ROUTER_IP:ROUTER_PORT The supervisor must be running for this endpoint to be available. -Resets the specified supervisor. This endpoint clears all stored offsets in Kafka or sequence numbers in Kinesis, prompting the supervisor to resume data reading. The supervisor will start from the earliest or latest available position, depending on the platform (offsets in Kafka or sequence numbers in Kinesis). +Resets the specified supervisor. This endpoint clears all stored offsets in Kafka or sequence numbers in Kinesis, prompting the supervisor to resume data reading. The supervisor restarts from the earliest or latest available position, depending on the platform: offsets in Kafka or sequence numbers in Kinesis. After clearing all stored offsets in Kafka or sequence numbers in Kinesis, the supervisor kills and recreates active tasks, so that tasks begin reading from valid positions. Use this endpoint to recover from a stopped state due to missing offsets in Kafka or sequence numbers in Kinesis. Use this endpoint with caution as it may result in skipped messages and lead to data loss or duplicate data. -The indexing service keeps track of the latest persisted offsets in Kafka or sequence numbers in Kinesis to provide exactly-once ingestion guarantees across tasks. Subsequent tasks must start reading from where the previous task completed for the generated segments to be accepted. If the messages at the expected starting offsets in Kafka or sequence numbers in Kinesis are no longer available (typically because the message retention period has elapsed or the topic was removed and re-created) the supervisor will refuse to start and in flight tasks will fail. This endpoint enables you to recover from this condition. +The indexing service keeps track of the latest persisted offsets in Kafka or sequence numbers in Kinesis to provide exactly-once ingestion guarantees across tasks. Subsequent tasks must start reading from where the previous task completed for Druid to accept the generated segments. 
If the messages at the expected starting offsets in Kafka or sequence numbers in Kinesis are no longer available, the supervisor refuses to start and in-flight tasks fail. Possible causes for missing messages include the message retention period elapsing or the topic being removed and re-created. Use the `reset` endpoint to recover from this condition. #### URL @@ -3328,7 +3326,7 @@ If there are no stored offsets, the specified offsets are set in the metadata st After resetting stored offsets, the supervisor kills and recreates any active tasks pertaining to the specified partitions, so that tasks begin reading specified offsets. For partitions that are not specified in this operation, the supervisor resumes from the last stored offset. -Use this endpoint with caution as it may result in skipped messages, leading to data loss or duplicate data. +Use this endpoint with caution. It can cause skipped messages, leading to data loss or duplicate data. #### URL @@ -3374,8 +3372,7 @@ The following table defines the fields within the `partitions` object in the res #### Sample request -The following example shows how to reset offsets for a Kafka supervisor with the name `social_media`. Let's say the supervisor is reading -from a Kafka topic `ads_media_stream` and has the stored offsets: `{"0": 0, "1": 10, "2": 20, "3": 40}`. +The following example shows how to reset offsets for a Kafka supervisor with the name `social_media`. For example, the supervisor is reading from a Kafka topic `ads_media_stream` and has the stored offsets: `{"0": 0, "1": 10, "2": 20, "3": 40}`. @@ -3410,8 +3407,8 @@ Content-Type: application/json } ``` -The above operation will reset offsets only for partitions `0` and `2` to 100 and 650 respectively. After a successful reset, -when the supervisor's tasks restart, they will resume reading from `{"0": 100, "1": 10, "2": 650, "3": 40}`. +The example operation resets offsets only for partitions `0` and `2` to 100 and 650 respectively. After a successful reset, +when the supervisor's tasks restart, they resume reading from `{"0": 100, "1": 10, "2": 650, "3": 40}`.
diff --git a/docs/configuration/index.md b/docs/configuration/index.md index c539096e4ea7..8808da458525 100644 --- a/docs/configuration/index.md +++ b/docs/configuration/index.md @@ -510,7 +510,7 @@ These properties specify the JDBC connection and other configuration around the |Property|Description|Default| |--------|-----------|-------| -|`druid.metadata.storage.type`|The type of metadata storage to use. Choose from `mysql`, `postgresql`, or `derby`.|`derby`| +|`druid.metadata.storage.type`|The type of metadata storage to use. One of `mysql`, `postgresql`, or `derby`.|`derby`| |`druid.metadata.storage.connector.connectURI`|The JDBC URI for the database to connect to|none| |`druid.metadata.storage.connector.user`|The username to connect with.|none| |`druid.metadata.storage.connector.password`|The [Password Provider](../operations/password-provider.md) or String password used to connect with.|none| @@ -533,7 +533,7 @@ The configurations concern how to push and pull [Segments](../design/segments.md |Property|Description|Default| |--------|-----------|-------| -|`druid.storage.type`|The type of deep storage to use. Choose from `local`, `noop`, `s3`, `hdfs`, `c*`.|local| +|`druid.storage.type`|The type of deep storage to use. One of `local`, `noop`, `s3`, `hdfs`, `c*`.|local| #### Local deep storage @@ -1101,7 +1101,7 @@ These Overlord static configurations can be defined in the `overlord/runtime.pro |Property|Description|Default| |--------|-----------|-------| |`druid.indexer.runner.type`|Indicates whether tasks should be run locally using `local` or in a distributed environment using `remote`. The recommended option is `httpRemote`, which is similar to `remote` but uses HTTP to interact with Middle Managers instead of ZooKeeper.|`httpRemote`| -|`druid.indexer.storage.type`|Indicates whether incoming tasks should be stored locally (in heap) or in metadata storage. Choose from `local` or `metadata`. `local` is mainly for internal testing while `metadata` is recommended in production because storing incoming tasks in metadata storage allows for tasks to be resumed if the Overlord should fail.|`local`| +|`druid.indexer.storage.type`|Indicates whether incoming tasks should be stored locally (in heap) or in metadata storage. One of `local` or `metadata`. `local` is mainly for internal testing while `metadata` is recommended in production because storing incoming tasks in metadata storage allows for tasks to be resumed if the Overlord should fail.|`local`| |`druid.indexer.storage.recentlyFinishedThreshold`|Duration of time to store task results. Default is 24 hours. If you have hundreds of tasks running in a day, consider increasing this threshold.|`PT24H`| |`druid.indexer.tasklock.forceTimeChunkLock`|_**Setting this to false is still experimental**_
If set, all tasks are forced to use time chunk lock. If not set, each task automatically chooses a lock type to use. This configuration can be overwritten by setting `forceTimeChunkLock` in the [task context](../ingestion/tasks.md#context). See [Task Locking & Priority](../ingestion/tasks.md#context) for more details about locking in tasks.|true|
|`druid.indexer.tasklock.batchSegmentAllocation`| If set to true, Druid performs segment allocate actions in batches to improve throughput and reduce the average `task/action/run/time`. See [batching `segmentAllocate` actions](../ingestion/tasks.md#batching-segmentallocate-actions) for details.|true|

@@ -1133,7 +1133,7 @@ If autoscaling is enabled, you can set these additional configs:

|Property|Description|Default|
|--------|-----------|-------|
-|`druid.indexer.autoscale.strategy`|Sets the strategy to run when autoscaling is required. Choose from `noop`, `ec2` or `gce`.|`noop`|
+|`druid.indexer.autoscale.strategy`|Sets the strategy to run when autoscaling is required. One of `noop`, `ec2`, or `gce`.|`noop`|
|`druid.indexer.autoscale.doAutoscale`|If set to true, autoscaling will be enabled.|false|
|`druid.indexer.autoscale.provisionPeriod`|How often to check whether new MiddleManagers should be added.|`PT1M`|
|`druid.indexer.autoscale.terminatePeriod`|How often to check when MiddleManagers should be removed.|`PT5M`|

@@ -1159,7 +1159,7 @@ If autoscaling is enabled, you can set these additional configs:

|`druid.supervisor.idleConfig.enabled`|If `true`, supervisor can become idle if there is no data on input stream/topic for some time.|false|
|`druid.supervisor.idleConfig.inactiveAfterMillis`|Supervisor is marked as idle if all existing data has been read from input topic and no new data has been published for `inactiveAfterMillis` milliseconds.|`600_000`|

-The `druid.supervisor.idleConfig.*` specified in the Overlord runtime properties defines the default behavior for the entire cluster. See [Idle Configuration in Kafka Supervisor IOConfig](../development/extensions-core/kinesis-ingestion.md#io-configuration) to override it for an individual supervisor.
+The `druid.supervisor.idleConfig.*` properties in the Overlord runtime configuration define the default idle behavior for the entire cluster. See [Idle supervisor configuration](../development/extensions-core/kafka-ingestion.md#idle-supervisor-configuration) to override the idle behavior for an individual supervisor.

#### Overlord dynamic configuration

@@ -1483,7 +1483,7 @@ Additional Peon configs include:

|Property|Description|Default|
|--------|-----------|-------|
-|`druid.peon.mode`|Choose from `local` and `remote`. Setting this property to `local` means you intend to run the Peon as a standalone process which is not recommended.|`remote`|
+|`druid.peon.mode`|One of `local` or `remote`. Setting this property to `local` means you intend to run the Peon as a standalone process, which is not recommended.|`remote`|
|`druid.indexer.task.baseDir`|Base temporary working directory.|`System.getProperty("java.io.tmpdir")`|
|`druid.indexer.task.baseTaskDir`|Base temporary working directory for tasks.|`${druid.indexer.task.baseDir}/persistent/task`|
|`druid.indexer.task.batchProcessingMode`| Batch ingestion tasks have three operating modes to control construction and tracking for intermediary segments: `OPEN_SEGMENTS`, `CLOSED_SEGMENTS`, and `CLOSED_SEGMENTS_SINKS`.
`OPEN_SEGMENTS` uses the streaming ingestion code path and performs a `mmap` on intermediary segments to build a timeline to make these segments available to realtime queries. Batch ingestion doesn't require intermediary segments, so the default mode, `CLOSED_SEGMENTS`, eliminates `mmap` of intermediary segments. `CLOSED_SEGMENTS` mode still tracks the entire set of segments in heap. The `CLOSED_SEGMENTS_SINKS` mode is the most aggressive configuration and should have the smallest memory footprint. It eliminates in-memory tracking and `mmap` of intermediary segments produced during segment creation. `CLOSED_SEGMENTS_SINKS` mode isn't as well tested as the other modes, so it is currently considered experimental. You can use `OPEN_SEGMENTS` mode if problems occur with the two newer modes. |`CLOSED_SEGMENTS`|

diff --git a/docs/development/extensions-core/kafka-ingestion.md b/docs/development/extensions-core/kafka-ingestion.md
index 98b4bfeeffe6..8f289040098d 100644
--- a/docs/development/extensions-core/kafka-ingestion.md
+++ b/docs/development/extensions-core/kafka-ingestion.md
@@ -52,7 +52,7 @@ The following table outlines the high-level configuration options for the Kafka

|Property|Type|Description|Required|
|--------|----|-----------|--------|
-|`type`|String|The supervisor type; this should always be `kafka`.|Yes|
+|`type`|String|The supervisor type; must be `kafka`.|Yes|
|`spec`|Object|The container object for the supervisor configuration.|Yes|
|`ioConfig`|Object|The I/O configuration object to define the connection and I/O-related settings for the supervisor and indexing tasks.|Yes|
|`dataSchema`|Object|The schema for the indexing task to use during ingestion. See [`dataSchema`](../../ingestion/ingestion-spec.md#dataschema) for more information.|Yes|
@@ -288,15 +288,14 @@ If you enable multi-topic ingestion for a datasource, downgrading to a version o
28.0.0 will cause the ingestion for that datasource to fail.
:::

-To ingest data from multiple topics, you set `topicPattern` instead of `topic in the supervisor `ioConfig` object`.
+To ingest data from multiple topics, you set `topicPattern` instead of `topic` in the supervisor `ioConfig` object.
You can pass multiple topics as a regex pattern as the value for `topicPattern` in `ioConfig`. For example, to ingest data from the `clicks` and `impressions` topics, set `topicPattern` to `clicks|impressions` in `ioConfig`. Similarly, you can use `metrics-.*` as the value for `topicPattern` if you want to ingest from all the topics that
-start with `metrics-`. If new topics are added to the cluster that match the regex, Druid automatically starts
-ingesting from those new topics. A topic name that only matches partially such as `my-metrics-12` will not be
-included for ingestion.
+start with `metrics-`. If you add a new topic that matches the regex to the cluster, Druid automatically starts
+ingesting from it. Topic names that match only partially, such as `my-metrics-12`, are not included for ingestion.

-When ingesting data from multiple topics, partitions are assigned based on the hashcode of the topic name and the
+When ingesting data from multiple topics, Druid assigns partitions based on the hashcode of the topic name and the
ID of the partition within that topic. The partition assignment might not be uniform across all the tasks. Druid also assumes that partitions across individual topics have similar load. It is recommended that you have a higher number of partitions for a high load topic and a lower number of partitions for a low load topic.
Assuming that you want to
@@ -310,7 +309,7 @@ The following table outlines the configuration options for `tuningConfig`:

|Property|Type|Description|Required|Default|
|--------|----|-----------|--------|-------|
-|`type`|String|The indexing task type. This should always be `kafka`.|Yes||
+|`type`|String|The indexing task type; must be `kafka`.|Yes||
|`maxRowsInMemory`|Integer|The number of rows to aggregate before persisting. This number represents the post-aggregation rows. It is not equivalent to the number of input events, but the resulting number of aggregated rows. Druid uses `maxRowsInMemory` to manage the required JVM heap size. The maximum heap memory usage for indexing scales with `maxRowsInMemory * (2 + maxPendingPersists)`. Normally, you do not need to set this, but depending on the nature of your data, if rows are short in terms of bytes, you may not want to store a million rows in memory, and this value should be set.|No|150000|
|`maxBytesInMemory`|Long|The number of bytes to aggregate in heap memory before persisting. This is based on a rough estimate of memory usage and not actual usage. Normally, this is computed internally. The maximum heap memory usage for indexing is `maxBytesInMemory * (2 + maxPendingPersists)`.|No|One-sixth of max JVM memory|
|`skipBytesInMemoryOverheadCheck`|Boolean|The calculation of `maxBytesInMemory` takes into account overhead objects created during ingestion and each intermediate persist. To exclude the bytes of these overhead objects from the `maxBytesInMemory` check, set `skipBytesInMemoryOverheadCheck` to `true`.|No|`false`|
@@ -343,9 +342,9 @@ The following table outlines the configuration options for `indexSpec`:

|Property|Type|Description|Required|Default|
|--------|----|-----------|--------|-------|
|`bitmap`|Object|Compression format for bitmap indexes. Druid supports roaring and concise bitmap types.|No|Roaring|
-|`dimensionCompression`|String|Compression format for dimension columns. Choose from `LZ4`, `LZF`, `ZSTD` or `uncompressed`.|No|`LZ4`|
-|`metricCompression`|String|Compression format for primitive type metric columns. Choose from `LZ4`, `LZF`, `ZSTD`, `uncompressed` or `none`.|No|`LZ4`|
-|`longEncoding`|String|Encoding format for metric and dimension columns with type long. Choose from `auto` or `longs`. `auto` encodes the values using offset or lookup table depending on column cardinality, and store them with variable size. `longs` stores the value as is with 8 bytes each.|No|`longs`|
+|`dimensionCompression`|String|Compression format for dimension columns. One of `LZ4`, `LZF`, `ZSTD`, or `uncompressed`.|No|`LZ4`|
+|`metricCompression`|String|Compression format for primitive type metric columns. One of `LZ4`, `LZF`, `ZSTD`, `uncompressed`, or `none`.|No|`LZ4`|
+|`longEncoding`|String|Encoding format for metric and dimension columns with type long. One of `auto` or `longs`. `auto` encodes the values using an offset or a lookup table depending on column cardinality, and stores them with variable sizes.
`longs` stores the value as is with 8 bytes each.|No|`longs`|

## Deployment notes on Kafka partitions and Druid segments

diff --git a/docs/development/extensions-core/kinesis-ingestion.md b/docs/development/extensions-core/kinesis-ingestion.md
index d5f27efa1a02..c8007d4e2e74 100644
--- a/docs/development/extensions-core/kinesis-ingestion.md
+++ b/docs/development/extensions-core/kinesis-ingestion.md
@@ -43,7 +43,7 @@ The following table outlines the high-level configuration options for the Kinesi

|Property|Type|Description|Required|
|--------|----|-----------|--------|
-|`type`|String|The supervisor type; this should always be `kinesis`.|Yes|
+|`type`|String|The supervisor type; must be `kinesis`.|Yes|
|`spec`|Object|The container object for the supervisor configuration.|Yes|
|`ioConfig`|Object|The [I/O configuration](#supervisor-io-configuration) object for configuring Kinesis connection and I/O-related settings for the supervisor and indexing tasks.|Yes|
|`dataSchema`|Object|The schema used by the Kinesis indexing task during ingestion. See [`dataSchema`](../../ingestion/ingestion-spec.md#dataschema) for more information.|Yes|
@@ -195,7 +195,7 @@ The following table outlines the configuration options for `tuningConfig`:

|Property|Type|Description|Required|Default|
|--------|----|-----------|--------|-------|
-|`type`|String|The indexing task type. This should always be `kinesis`.|Yes||
+|`type`|String|The indexing task type; must be `kinesis`.|Yes||
|`maxRowsInMemory`|Integer|The number of rows to aggregate before persisting. This number represents the post-aggregation rows. It is not equivalent to the number of input events, but the resulting number of aggregated rows. Druid uses `maxRowsInMemory` to manage the required JVM heap size. The maximum heap memory usage for indexing scales with `maxRowsInMemory * (2 + maxPendingPersists)`.|No|100000|
|`maxBytesInMemory`|Long| The number of bytes to aggregate in heap memory before persisting. This is based on a rough estimate of memory usage and not actual usage. Normally, this is computed internally. The maximum heap memory usage for indexing is `maxBytesInMemory * (2 + maxPendingPersists)`.|No|One-sixth of max JVM memory|
|`skipBytesInMemoryOverheadCheck`|Boolean|The calculation of `maxBytesInMemory` takes into account overhead objects created during ingestion and each intermediate persist. To exclude the bytes of these overhead objects from the `maxBytesInMemory` check, set `skipBytesInMemoryOverheadCheck` to `true`.|No|`false`|
@@ -234,9 +234,9 @@ The following table outlines the configuration options for `indexSpec`:

|Property|Type|Description|Required|Default|
|--------|----|-----------|--------|-------|
|`bitmap`|Object|Compression format for bitmap indexes. Druid supports roaring and concise bitmap types.|No|Roaring|
-|`dimensionCompression`|String|Compression format for dimension columns. Choose from `LZ4`, `LZF`, or `uncompressed`.|No|`LZ4`|
-|`metricCompression`|String|Compression format for primitive type metric columns. Choose from `LZ4`, `LZF`, `uncompressed`, or `none`.|No|`LZ4`|
-|`longEncoding`|String|Encoding format for metric and dimension columns with type long. Choose from `auto` or `longs`. `auto` encodes the values using sequence number or lookup table depending on column cardinality and stores them with variable sizes. `longs` stores the value as is with 8 bytes each.|No|`longs`|
+|`dimensionCompression`|String|Compression format for dimension columns.
One of `LZ4`, `LZF`, or `uncompressed`.|No|`LZ4`|
+|`metricCompression`|String|Compression format for primitive type metric columns. One of `LZ4`, `LZF`, `uncompressed`, or `none`.|No|`LZ4`|
+|`longEncoding`|String|Encoding format for metric and dimension columns with type long. One of `auto` or `longs`. `auto` encodes the values using an offset or a lookup table depending on column cardinality and stores them with variable sizes. `longs` stores the value as is with 8 bytes each.|No|`longs`|

## AWS authentication

From 07f7931569121f90544a01eb639036be0d797176 Mon Sep 17 00:00:00 2001
From: Katya Macedo
Date: Thu, 18 Jan 2024 15:16:42 -0600
Subject: [PATCH 04/15] Update known issues

---
 docs/development/extensions-core/kinesis-ingestion.md | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/docs/development/extensions-core/kinesis-ingestion.md b/docs/development/extensions-core/kinesis-ingestion.md
index c8007d4e2e74..0111e31e6dd6 100644
--- a/docs/development/extensions-core/kinesis-ingestion.md
+++ b/docs/development/extensions-core/kinesis-ingestion.md
@@ -404,13 +404,11 @@ Note that when the supervisor is running and detects new partitions, tasks read

If resharding occurs when the supervisor is suspended and `useEarliestSequenceNumber` is set to `false`, resuming the supervisor causes tasks to read the new shards from the latest sequence. This is by design so that the consumer can catch up quickly with any lag accumulated while the supervisor was suspended.

-## Kinesis known issues
+## Known issues

Before you deploy the `druid-kinesis-indexing-service` extension to production, consider the following known issues:

-- Avoid implementing more than one Kinesis supervisor that reads from the same Kinesis stream for ingestion. Kinesis has a per-shard read throughput limit and having multiple supervisors on the same stream can reduce available read throughput for an individual supervisor's tasks. Multiple supervisors ingesting to the same Druid datasource can also cause increased contention for locks on the datasource.
-- The only way to change the stream reset policy is to submit a new ingestion spec and set up a new supervisor.
-- If ingestion tasks get stuck, the supervisor does not automatically recover. You should monitor ingestion tasks and investigate if your ingestion falls behind.
+- Kinesis imposes a read throughput limit per shard. If you have multiple supervisors reading from the same Kinesis stream, consider adding more shards to ensure sufficient read throughput for all supervisors.
- A Kinesis supervisor can sometimes compare the checkpoint offset to the retention window of the stream to see if it has fallen behind. These checks fetch the earliest sequence number from Kinesis, which can result in `IteratorAgeMilliseconds` becoming very high in AWS CloudWatch.
## Learn more From 2ac12d8eca096171efc009ca80f397089e2d118b Mon Sep 17 00:00:00 2001 From: Katya Macedo Date: Tue, 30 Jan 2024 09:58:41 -0600 Subject: [PATCH 05/15] Move kinesis and kafka topics to ingestion, add redirects --- docs/api-reference/supervisor-api.md | 4 +-- docs/configuration/extensions.md | 4 +-- docs/configuration/index.md | 4 +-- docs/design/storage.md | 4 +-- docs/development/extensions-core/protobuf.md | 2 +- docs/ingestion/data-formats.md | 8 ++--- docs/ingestion/index.md | 2 +- docs/ingestion/ingestion-spec.md | 6 ++-- .../kafka-ingestion.md | 30 +++++++++---------- .../kinesis-ingestion.md | 22 +++++++------- docs/ingestion/partitioning.md | 4 +-- docs/ingestion/rollup.md | 4 +-- docs/ingestion/standalone-realtime.md | 4 +-- docs/ingestion/streaming.md | 4 +-- docs/ingestion/supervisor.md | 6 ++-- docs/ingestion/tasks.md | 6 ++-- docs/ingestion/tranquility.md | 4 +-- docs/operations/basic-cluster-tuning.md | 2 +- docs/operations/dynamic-config-provider.md | 2 +- docs/operations/metrics.md | 4 +-- docs/querying/arrays.md | 4 +-- docs/querying/multi-value-dimensions.md | 4 +-- docs/querying/nested-columns.md | 2 +- docs/querying/sql-metadata-tables.md | 2 +- docs/tutorials/tutorial-kafka.md | 2 +- website/redirects.js | 6 +++- website/sidebars.json | 4 +-- 27 files changed, 77 insertions(+), 73 deletions(-) rename docs/{development/extensions-core => ingestion}/kafka-ingestion.md (93%) rename docs/{development/extensions-core => ingestion}/kinesis-ingestion.md (93%) diff --git a/docs/api-reference/supervisor-api.md b/docs/api-reference/supervisor-api.md index aeb777364362..a88b1a984a72 100644 --- a/docs/api-reference/supervisor-api.md +++ b/docs/api-reference/supervisor-api.md @@ -37,7 +37,7 @@ The following table lists the properties of a supervisor object: |---|---|---| |`id`|String|Unique identifier.| |`state`|String|Generic state of the supervisor. Available states:`UNHEALTHY_SUPERVISOR`, `UNHEALTHY_TASKS`, `PENDING`, `RUNNING`, `SUSPENDED`, `STOPPING`. See [Supervisor reference](../ingestion/supervisor.md#status-report) for more information.| -|`detailedState`|String|Detailed state of the supervisor. This property contains a more descriptive, implementation-specific state that may provide more insight into the supervisor's activities than the `state` property. See [Apache Kafka ingestion](../development/extensions-core/kafka-ingestion.md) and [Amazon Kinesis ingestion](../development/extensions-core/kinesis-ingestion.md) for supervisor-specific states.| +|`detailedState`|String|Detailed state of the supervisor. This property contains a more descriptive, implementation-specific state that may provide more insight into the supervisor's activities than the `state` property. See [Apache Kafka ingestion](../ingestion/kafka-ingestion.md) and [Amazon Kinesis ingestion](../ingestion/kinesis-ingestion.md) for supervisor-specific states.| |`healthy`|Boolean|Supervisor health indicator.| |`spec`|Object|Container object for the supervisor configuration.| |`suspended`|Boolean|Indicates whether the supervisor is in a suspended state.| @@ -2215,7 +2215,7 @@ Host: http://ROUTER_IP:ROUTER_PORT Creates a new supervisor spec or updates an existing one with new configuration and schema information. When updating a supervisor spec, the datasource must remain the same as the previous supervisor. 
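For illustration, the body of such a create-or-update call is a complete supervisor spec. The following is a minimal Kafka sketch, not a definitive spec; the topic, datasource, broker address, and field names are illustrative assumptions:

```json
{
  "type": "kafka",
  "spec": {
    "ioConfig": {
      "type": "kafka",
      "consumerProperties": { "bootstrap.servers": "localhost:9092" },
      "topic": "social_media",
      "inputFormat": { "type": "json" }
    },
    "dataSchema": {
      "dataSource": "social_media",
      "timestampSpec": { "column": "ts", "format": "iso" },
      "dimensionsSpec": { "dimensions": ["username", "post_title"] },
      "granularitySpec": { "rollup": false, "segmentGranularity": "hour" }
    },
    "tuningConfig": { "type": "kafka" }
  }
}
```

Submitting the same body again with a changed `ioConfig` updates the supervisor in place, as long as the datasource stays the same.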
-You can define a supervisor spec for [Apache Kafka](../development/extensions-core/kinesis-ingestion.md#supervisor-spec) or [Amazon Kinesis](../development/extensions-core/kinesis-ingestion.md#supervisor-spec) streaming ingestion methods. +You can define a supervisor spec for [Apache Kafka](../ingestion/kafka-ingestion.md#supervisor-spec) or [Amazon Kinesis](../ingestion/kinesis-ingestion.md#supervisor-spec) streaming ingestion methods. The following table lists the properties of a supervisor spec: diff --git a/docs/configuration/extensions.md b/docs/configuration/extensions.md index 5fbb20e74efe..3c3c063d2b86 100644 --- a/docs/configuration/extensions.md +++ b/docs/configuration/extensions.md @@ -45,8 +45,8 @@ Core extensions are maintained by Druid committers. |druid-hdfs-storage|HDFS deep storage.|[link](../development/extensions-core/hdfs.md)| |druid-histogram|Approximate histograms and quantiles aggregator. Deprecated, please use the [DataSketches quantiles aggregator](../development/extensions-core/datasketches-quantiles.md) from the `druid-datasketches` extension instead.|[link](../development/extensions-core/approximate-histograms.md)| |druid-kafka-extraction-namespace|Apache Kafka-based namespaced lookup. Requires namespace lookup extension.|[link](../development/extensions-core/kafka-extraction-namespace.md)| -|druid-kafka-indexing-service|Supervised exactly-once Apache Kafka ingestion for the indexing service.|[link](../development/extensions-core/kafka-ingestion.md)| -|druid-kinesis-indexing-service|Supervised exactly-once Kinesis ingestion for the indexing service.|[link](../development/extensions-core/kinesis-ingestion.md)| +|druid-kafka-indexing-service|Supervised exactly-once Apache Kafka ingestion for the indexing service.|[link](../ingestion/kafka-ingestion.md)| +|druid-kinesis-indexing-service|Supervised exactly-once Kinesis ingestion for the indexing service.|[link](../ingestion/kinesis-ingestion.md)| |druid-kerberos|Kerberos authentication for druid processes.|[link](../development/extensions-core/druid-kerberos.md)| |druid-lookups-cached-global|A module for [lookups](../querying/lookups.md) providing a jvm-global eager caching for lookups. 
It provides JDBC and URI implementations for fetching lookup data.|[link](../development/extensions-core/lookups-cached-global.md)|
|druid-lookups-cached-single| Per lookup caching module to support the use cases where a lookup needs to be isolated from the global pool of lookups |[link](../development/extensions-core/druid-lookups.md)|

diff --git a/docs/configuration/index.md b/docs/configuration/index.md
index 8808da458525..b62b6639ff31 100644
--- a/docs/configuration/index.md
+++ b/docs/configuration/index.md
@@ -516,7 +516,7 @@ These properties specify the JDBC connection and other configuration around the
|`druid.metadata.storage.connector.password`|The [Password Provider](../operations/password-provider.md) or String password used to connect with.|none|
|`druid.metadata.storage.connector.createTables`|If Druid requires a table and it doesn't exist, create it?|true|
|`druid.metadata.storage.tables.base`|The base name for tables.|`druid`|
-|`druid.metadata.storage.tables.dataSource`|The table to use to look for datasources created by [Kafka Indexing Service](../development/extensions-core/kafka-ingestion.md).|`druid_dataSource`|
+|`druid.metadata.storage.tables.dataSource`|The table to use to look for datasources created by [Kafka Indexing Service](../ingestion/kafka-ingestion.md).|`druid_dataSource`|
|`druid.metadata.storage.tables.pendingSegments`|The table to use to look for pending segments.|`druid_pendingSegments`|
|`druid.metadata.storage.tables.segments`|The table to use to look for segments.|`druid_segments`|
|`druid.metadata.storage.tables.rules`|The table to use to look for segment load/drop rules.|`druid_rules`|
@@ -1159,7 +1159,7 @@ If autoscaling is enabled, you can set these additional configs:

|`druid.supervisor.idleConfig.enabled`|If `true`, supervisor can become idle if there is no data on input stream/topic for some time.|false|
|`druid.supervisor.idleConfig.inactiveAfterMillis`|Supervisor is marked as idle if all existing data has been read from input topic and no new data has been published for `inactiveAfterMillis` milliseconds.|`600_000`|

-The `druid.supervisor.idleConfig.*` properties in the Overlord runtime configuration define the default idle behavior for the entire cluster. See [Idle supervisor configuration](../development/extensions-core/kafka-ingestion.md#idle-supervisor-configuration) to override the idle behavior for an individual supervisor.
+The `druid.supervisor.idleConfig.*` properties in the Overlord runtime configuration define the default idle behavior for the entire cluster. See [Idle supervisor configuration](../ingestion/kafka-ingestion.md#idle-supervisor-configuration) to override the idle behavior for an individual supervisor.

#### Overlord dynamic configuration

diff --git a/docs/design/storage.md b/docs/design/storage.md
index da0df61f5458..f8d6e806b01d 100644
--- a/docs/design/storage.md
+++ b/docs/design/storage.md
@@ -114,14 +114,14 @@ Druid has an architectural separation between ingestion and querying, as describ

On the ingestion side, Druid's primary [ingestion methods](../ingestion/index.md#ingestion-methods) are all pull-based and offer transactional guarantees. This means that you are guaranteed that ingestion using these methods will publish in an all-or-nothing manner:

-- Supervised "seekable-stream" ingestion methods like [Kafka](../development/extensions-core/kafka-ingestion.md) and [Kinesis](../development/extensions-core/kinesis-ingestion.md).
With these methods, Druid commits stream offsets to its [metadata store](#metadata-storage) alongside segment metadata, in the same transaction. Note that ingestion of data that has not yet been published can be rolled back if ingestion tasks fail. In this case, partially-ingested data is +- Supervised "seekable-stream" ingestion methods like [Kafka](../ingestion/kafka-ingestion.md) and [Kinesis](../ingestion/kinesis-ingestion.md). With these methods, Druid commits stream offsets to its [metadata store](#metadata-storage) alongside segment metadata, in the same transaction. Note that ingestion of data that has not yet been published can be rolled back if ingestion tasks fail. In this case, partially-ingested data is discarded, and Druid will resume ingestion from the last committed set of stream offsets. This ensures exactly-once publishing behavior. - [Hadoop-based batch ingestion](../ingestion/hadoop.md). Each task publishes all segment metadata in a single transaction. - [Native batch ingestion](../ingestion/native-batch.md). In parallel mode, the supervisor task publishes all segment metadata in a single transaction after the subtasks are finished. In simple (single-task) mode, the single task publishes all segment metadata in a single transaction after it is complete. Additionally, some ingestion methods offer an _idempotency_ guarantee. This means that repeated executions of the same ingestion will not cause duplicate data to be ingested: -- Supervised "seekable-stream" ingestion methods like [Kafka](../development/extensions-core/kafka-ingestion.md) and [Kinesis](../development/extensions-core/kinesis-ingestion.md) are idempotent due to the fact that stream offsets and segment metadata are stored together and updated in lock-step. +- Supervised "seekable-stream" ingestion methods like [Kafka](../ingestion/kafka-ingestion.md) and [Kinesis](../ingestion/kinesis-ingestion.md) are idempotent due to the fact that stream offsets and segment metadata are stored together and updated in lock-step. - [Hadoop-based batch ingestion](../ingestion/hadoop.md) is idempotent unless one of your input sources is the same Druid datasource that you are ingesting into. In this case, running the same task twice is non-idempotent, because you are adding to existing data instead of overwriting it. - [Native batch ingestion](../ingestion/native-batch.md) is idempotent unless [`appendToExisting`](../ingestion/native-batch.md) is true, or one of your input sources is the same Druid datasource that you are ingesting into. In either of these two cases, running the same task twice is non-idempotent, because you are adding to existing data instead of overwriting it. diff --git a/docs/development/extensions-core/protobuf.md b/docs/development/extensions-core/protobuf.md index 3c87809f72b5..08b9cc1185f0 100644 --- a/docs/development/extensions-core/protobuf.md +++ b/docs/development/extensions-core/protobuf.md @@ -30,7 +30,7 @@ for [stream ingestion](../../ingestion/index.md#streaming). See corresponding do ## Example: Load Protobuf messages from Kafka -This example demonstrates how to load Protobuf messages from Kafka. Please read the [Load from Kafka tutorial](../../tutorials/tutorial-kafka.md) first, and see [Kafka Indexing Service](./kafka-ingestion.md) documentation for more details. +This example demonstrates how to load Protobuf messages from Kafka. 
Please read the [Load from Kafka tutorial](../../tutorials/tutorial-kafka.md) first, and see [Kafka Indexing Service](../../ingestion/kafka-ingestion.md) documentation for more details. The files used in this example are found at [`./examples/quickstart/protobuf` in your Druid directory](https://github.com/apache/druid/tree/master/examples/quickstart/protobuf). diff --git a/docs/ingestion/data-formats.md b/docs/ingestion/data-formats.md index da9639e1a6c8..c9c23896a286 100644 --- a/docs/ingestion/data-formats.md +++ b/docs/ingestion/data-formats.md @@ -798,8 +798,8 @@ Each entry in the `fields` list can have the following components: ## Parser :::info - The Parser is deprecated for [native batch tasks](./native-batch.md), [Kafka indexing service](../development/extensions-core/kafka-ingestion.md), -and [Kinesis indexing service](../development/extensions-core/kinesis-ingestion.md). + The Parser is deprecated for [native batch tasks](./native-batch.md), [Kafka indexing service](../ingestion/kafka-ingestion.md), +and [Kinesis indexing service](../ingestion/kinesis-ingestion.md). Consider using the [input format](#input-format) instead for these types of ingestion. ::: @@ -1564,8 +1564,8 @@ Multiple Instances: ## ParseSpec :::info - The Parser is deprecated for [native batch tasks](./native-batch.md), [Kafka indexing service](../development/extensions-core/kafka-ingestion.md), -and [Kinesis indexing service](../development/extensions-core/kinesis-ingestion.md). + The Parser is deprecated for [native batch tasks](./native-batch.md), [Kafka indexing service](../ingestion/kafka-ingestion.md), +and [Kinesis indexing service](../ingestion/kinesis-ingestion.md). Consider using the [input format](#input-format) instead for these types of ingestion. ::: diff --git a/docs/ingestion/index.md b/docs/ingestion/index.md index fe3e6e4ec5b5..cefe3be936d2 100644 --- a/docs/ingestion/index.md +++ b/docs/ingestion/index.md @@ -54,7 +54,7 @@ page. There are two available options for streaming ingestion. Streaming ingestion is controlled by a continuously-running supervisor. -| **Method** | [Kafka](../development/extensions-core/kafka-ingestion.md) | [Kinesis](../development/extensions-core/kinesis-ingestion.md) | +| **Method** | [Kafka](../ingestion/kafka-ingestion.md) | [Kinesis](../ingestion/kinesis-ingestion.md) | |---|-----|--------------| | **Supervisor type** | `kafka` | `kinesis`| | **How it works** | Druid reads directly from Apache Kafka. | Druid reads directly from Amazon Kinesis.| diff --git a/docs/ingestion/ingestion-spec.md b/docs/ingestion/ingestion-spec.md index 0dc54ead97f4..32aab2148d15 100644 --- a/docs/ingestion/ingestion-spec.md +++ b/docs/ingestion/ingestion-spec.md @@ -96,8 +96,8 @@ For more examples, refer to the documentation for each ingestion method. You can also load data visually, without the need to write an ingestion spec, using the "Load data" functionality available in Druid's [web console](../operations/web-console.md). Druid's visual data loader supports -[Kafka](../development/extensions-core/kafka-ingestion.md), -[Kinesis](../development/extensions-core/kinesis-ingestion.md), and +[Kafka](../ingestion/kafka-ingestion.md), +[Kinesis](../ingestion/kinesis-ingestion.md), and [native batch](native-batch.md) mode. ## `dataSchema` @@ -503,7 +503,7 @@ is: |skipBytesInMemoryOverheadCheck|The calculation of maxBytesInMemory takes into account overhead objects created during ingestion and each intermediate persist. 
Setting this to true can exclude the bytes of these overhead objects from maxBytesInMemory check.|false| |indexSpec|Defines segment storage format options to use at indexing time.|See [`indexSpec`](#indexspec) for more information.| |indexSpecForIntermediatePersists|Defines segment storage format options to use at indexing time for intermediate persisted temporary segments.|See [`indexSpec`](#indexspec) for more information.| -|Other properties|Each ingestion method has its own list of additional tuning properties. See the documentation for each method for a full list: [Kafka indexing service](../development/extensions-core/kafka-ingestion.md#tuning-configuration), [Kinesis indexing service](../development/extensions-core/kinesis-ingestion.md#tuning-configuration), [Native batch](native-batch.md#tuningconfig), and [Hadoop-based](hadoop.md#tuningconfig).|| +|Other properties|Each ingestion method has its own list of additional tuning properties. See the documentation for each method for a full list: [Kafka indexing service](../ingestion/kafka-ingestion.md#tuning-configuration), [Kinesis indexing service](../ingestion/kinesis-ingestion.md#tuning-configuration), [Native batch](native-batch.md#tuningconfig), and [Hadoop-based](hadoop.md#tuningconfig).|| ### `indexSpec` diff --git a/docs/development/extensions-core/kafka-ingestion.md b/docs/ingestion/kafka-ingestion.md similarity index 93% rename from docs/development/extensions-core/kafka-ingestion.md rename to docs/ingestion/kafka-ingestion.md index 8f289040098d..de2024129834 100644 --- a/docs/development/extensions-core/kafka-ingestion.md +++ b/docs/ingestion/kafka-ingestion.md @@ -32,7 +32,7 @@ This topic contains configuration reference information for the Kafka indexing s ## Setup -To use the Kafka indexing service, you must first load the `druid-kafka-indexing-service` extension on both the Overlord and the MiddleManager. See [Loading extensions](../../configuration/extensions.md) for more information. +To use the Kafka indexing service, you must first load the `druid-kafka-indexing-service` extension on both the Overlord and the MiddleManager. See [Loading extensions](../configuration/extensions.md) for more information. ### Kafka support @@ -46,7 +46,7 @@ If your Kafka cluster enables consumer-group based ACLs, you can set `group.id` ## Supervisor spec -Similar to the ingestion spec for batch ingestion, the [supervisor spec](../../ingestion/supervisor.md#supervisor-spec) configures the data ingestion for Kafka streaming ingestion. +Similar to the ingestion spec for batch ingestion, the [supervisor spec](../ingestion/supervisor.md#supervisor-spec) configures the data ingestion for Kafka streaming ingestion. The following table outlines the high-level configuration options for the Kafka supervisor spec: @@ -55,7 +55,7 @@ The following table outlines the high-level configuration options for the Kafka |`type`|String|The supervisor type; must be `kafka`.|Yes| |`spec`|Object|The container object for the supervisor configuration.|Yes| |`ioConfig`|Object|The I/O configuration object to define the connection and I/O-related settings for the supervisor and indexing tasks.|Yes| -|`dataSchema`|Object|The schema for the indexing task to use during ingestion. See [`dataSchema`](../../ingestion/ingestion-spec.md#dataschema) for more information.|Yes| +|`dataSchema`|Object|The schema for the indexing task to use during ingestion. 
See [`dataSchema`](../ingestion/ingestion-spec.md#dataschema) for more information.|Yes|
|`tuningConfig`|Object|The tuning configuration object to define performance-related settings for the supervisor and indexing tasks.|No|

The following example shows a supervisor spec for the Kafka indexing service.
@@ -136,11 +136,11 @@ The following table outlines the configuration options for `ioConfig`:

|Property|Type|Description|Required|Default|
|--------|----|-----------|--------|-------|
|`topic`|String|The Kafka topic to read from. Must be a specific topic; the `topic` property does not accept patterns. To ingest data from multiple topics, see [Ingest from multiple topics](#ingest-from-multiple-topics).|Yes||
|`inputFormat`|Object|The [input format](../ingestion/data-formats.md#input-format) to define input data parsing.|Yes||
|`consumerProperties`|String, Object|A map of properties to pass to the Kafka consumer. See [Consumer properties](#consumer-properties) for details.|Yes||
|`pollTimeout`|Long|The length of time to wait for the Kafka consumer to poll records, in milliseconds.|No|100|
|`replicas`|Integer|The number of replica sets, where 1 is a single set of tasks (no replication). Druid always assigns replica tasks to different workers to provide resiliency against process failure.|No|1|
|`taskCount`|Integer|The maximum number of reading tasks in a replica set. The maximum number of reading tasks equals `taskCount * replicas`. The total number of tasks, reading and publishing, is greater than this count. See [Capacity planning](../ingestion/supervisor.md#capacity-planning) for more details. When `taskCount > {numKafkaPartitions}`, the actual number of reading tasks is less than the `taskCount` value.|No|1|
|`taskDuration`|ISO 8601 period|The length of time before tasks stop reading and begin publishing segments.|No|PT1H|
|`startDelay`|ISO 8601 period|The period to wait before the supervisor starts managing tasks.|No|PT5S|
|`period`|ISO 8601 period|Determines how often the supervisor executes its management logic. Note that the supervisor also runs in response to certain events, such as tasks succeeding, failing, and reaching their task duration. The `period` value specifies the maximum time between iterations.|No|PT30S|
@@ -149,7 +149,7 @@ The following table outlines the configuration options for `ioConfig`:
|`lateMessageRejectionStartDateTime`|ISO 8601 date time|Configures tasks to reject messages with timestamps earlier than this date time. For example, if this property is set to `2016-01-01T11:00Z` and the supervisor creates a task at `2016-01-01T12:00Z`, Druid drops messages with timestamps earlier than `2016-01-01T11:00Z`.
This can prevent concurrency issues if your data stream has late messages and you have multiple pipelines that need to operate on the same segments, such as a realtime and a nightly batch ingestion pipeline.|No||
|`lateMessageRejectionPeriod`|ISO 8601 period|Configures tasks to reject messages with timestamps earlier than this period before the task was created. For example, if this property is set to `PT1H` and the supervisor creates a task at `2016-01-01T12:00Z`, Druid drops messages with timestamps earlier than `2016-01-01T11:00Z`. This may help prevent concurrency issues if your data stream has late messages and you have multiple pipelines that need to operate on the same segments, such as a realtime and a nightly batch ingestion pipeline. Note that you can specify only one of the late message rejection properties.|No||
|`earlyMessageRejectionPeriod`|ISO 8601 period|Configures tasks to reject messages with timestamps later than this period after the task reached its task duration. For example, if this property is set to `PT1H`, the task duration is set to `PT1H` and the supervisor creates a task at `2016-01-01T12:00Z`, Druid drops messages with timestamps later than `2016-01-01T14:00Z`. Tasks sometimes run past their task duration, such as in cases of supervisor failover. Setting `earlyMessageRejectionPeriod` too low may cause Druid to drop messages unexpectedly whenever a task runs past its originally configured task duration.|No||
|`autoScalerConfig`|Object|Defines auto scaling behavior for ingestion tasks. See [Task autoscaler](../ingestion/supervisor.md#task-autoscaler) for more information.|No|null|
|`idleConfig`|Object|Defines how and when the Kafka supervisor can become idle. See [Idle supervisor configuration](#idle-supervisor-configuration) for more details.|No|null|

#### Consumer properties

Consumer properties must contain a property `bootstrap.servers` with a list of Kafka brokers in the form: `<BROKER_IP>:<PORT>,<BROKER_IP>:<PORT>,...`. By default, `isolation.level` is set to `read_committed`. If you use older versions of Kafka servers without transaction support or don't want Druid to consume only committed transactions, set `isolation.level` to `read_uncommitted`.

In some cases, you may need to fetch consumer properties at runtime, such as when `bootstrap.servers` is not known upfront or is not static. To enable SSL connections, you must provide the `keystore`, `truststore`, and `key` passwords securely. You can provide configurations at runtime with a dynamic config provider implementation, like the environment variable config provider that comes with Druid. For more information, see [Dynamic config provider](../operations/dynamic-config-provider.md).
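As a sketch of how the dynamic config provider slots into the consumer configuration, a `consumerProperties` object might defer its SSL passwords to environment variables like this (the broker address, keystore path, and variable names are illustrative assumptions):

```json
"consumerProperties": {
  "bootstrap.servers": "localhost:9092",
  "security.protocol": "SSL",
  "ssl.keystore.location": "/opt/kafka/config/kafka01.keystore.jks",
  "druid.dynamic.config.provider": {
    "type": "environment",
    "variables": {
      "ssl.key.password": "KEY_PASSWORD",
      "ssl.keystore.password": "KEYSTORE_PASSWORD",
      "ssl.truststore.password": "TRUSTSTORE_PASSWORD"
    }
  }
}
```

With this fragment, Druid resolves each listed consumer property from the named environment variable at runtime instead of storing the secret in the spec.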
For example, if you are using SASL and SSL with Kafka, set the following environment variables for the Druid user on the machines running the Overlord and the Peon services: @@ -322,7 +322,7 @@ The following table outlines the configuration options for `tuningConfig`: |`indexSpecForIntermediatePersists`|Object|Defines segment storage format options to use at indexing time for intermediate persisted temporary segments. You can use `indexSpecForIntermediatePersists` to disable dimension/metric compression on intermediate segments to reduce memory required for final merging. However, disabling compression on intermediate segments might increase page cache use while they are used before getting merged into final segment published. See [IndexSpec](#indexspec) for possible values.|No|Same as `indexSpec`| |`reportParseExceptions`|Boolean|DEPRECATED. If `true`, Druid throws exceptions encountered during parsing causing ingestion to halt. If `false`, Druid skips unparseable rows and fields. Setting `reportParseExceptions` to `true` overrides existing configurations for `maxParseExceptions` and `maxSavedParseExceptions`, setting `maxParseExceptions` to 0 and limiting `maxSavedParseExceptions` to not more than 1.|No|`false`| |`handoffConditionTimeout`|Long|Number of milliseconds to wait for segment handoff. Set to a value >= 0, where 0 means to wait indefinitely.|No|900000 (15 minutes)| -|`resetOffsetAutomatically`|Boolean| Determines how Druid reads Kafka messages when partitions in the topic have `offsetOutOfRangeException`. This feature behaves similarly to the Kafka `auto.offset.reset` consumer property. If `resetOffsetAutomatically` is set to `true`, Druid automatically resets to the earliest or latest offset available in Kafka, based on the value of the `useEarliestOffset` property (earliest if `true`, latest if `false`). Druid logs messages to the ingestion task log file indicating that a reset has occurred without interrupting ingestion. Setting `resetOffsetAutomatically` to `true` can lead to dropping data (if `useEarliestSequenceNumber` is `false`) or duplicating data (if `useEarliestSequenceNumber` is `true`) without your knowledge.
If only one partition in the topic has `offsetOutOfrangeException`, the offset is reset for that partition only.
If `resetOffsetAutomatically` is `false`, the exception bubbles up causing tasks to fail and ingestion to halt. If this occurs, manual intervention is required to correct the situation, potentially using the [Reset Supervisor API](../../api-reference/supervisor-api.md#reset-a-supervisor). |No|`false`| +|`resetOffsetAutomatically`|Boolean| Determines how Druid reads Kafka messages when partitions in the topic have `offsetOutOfRangeException`. This feature behaves similarly to the Kafka `auto.offset.reset` consumer property. If `resetOffsetAutomatically` is set to `true`, Druid automatically resets to the earliest or latest offset available in Kafka, based on the value of the `useEarliestOffset` property (earliest if `true`, latest if `false`). Druid logs messages to the ingestion task log file indicating that a reset has occurred without interrupting ingestion. Setting `resetOffsetAutomatically` to `true` can lead to dropping data (if `useEarliestSequenceNumber` is `false`) or duplicating data (if `useEarliestSequenceNumber` is `true`) without your knowledge.
If only one partition in the topic has `offsetOutOfRangeException`, the offset is reset for that partition only.
If `resetOffsetAutomatically` is `false`, the exception bubbles up, causing tasks to fail and ingestion to halt. If this occurs, manual intervention is required to correct the situation, potentially using the [Reset Supervisor API](../api-reference/supervisor-api.md#reset-a-supervisor). |No|`false`|
|`workerThreads`|Integer|The number of threads that the supervisor uses to handle requests/responses for worker tasks, along with any other internal asynchronous operation.|No|`min(10, taskCount)`|
|`chatAsync`|Boolean|If `true`, use asynchronous communication with indexing tasks, and ignore the `chatThreads` parameter. If `false`, use synchronous communication in a thread pool of size `chatThreads`.|No|`true`|
|`chatThreads`|Integer|The number of threads to use for communicating with indexing tasks. Ignored if `chatAsync` is `true`.|No|`min(10, taskCount * replicas)`|
|`httpTimeout`| ISO 8601 period|The period of time to wait for an HTTP response from an indexing task.|No|PT10S|
|`shutdownTimeout`|ISO 8601 period|The period of time to wait for the supervisor to attempt a graceful shutdown of tasks before exiting.|No|PT80S|
|`offsetFetchPeriod`|ISO 8601 period|Determines how often the supervisor queries Kafka and the indexing tasks to fetch current offsets and calculate lag. If the user-specified value is below the minimum value of `PT5S`, the supervisor ignores the value and uses the minimum value instead.|No|PT30S|
|`segmentWriteOutMediumFactory`|Object|The segment write-out medium to use when creating segments. See [Additional Peon configuration: SegmentWriteOutMediumFactory](../configuration/index.md#segmentwriteoutmediumfactory) for explanation and available options.|No|If not specified, Druid uses the value from `druid.peon.defaultSegmentWriteOutMediumFactory.type`.|
|`logParseExceptions`|Boolean|If `true`, Druid logs an error message when a parsing exception occurs, containing information about the row where the error occurred.|No|`false`|
|`maxParseExceptions`|Integer|The maximum number of parse exceptions that can occur before the task halts ingestion and fails. Overridden if `reportParseExceptions` is set.|No|unlimited|
|`maxSavedParseExceptions`|Integer|When a parse exception occurs, Druid keeps track of the most recent parse exceptions. `maxSavedParseExceptions` limits the number of saved exception instances. These saved exceptions are available after the task finishes in the [task completion report](../ingestion/tasks.md#task-reports). Overridden if `reportParseExceptions` is set.|No|0|

#### IndexSpec

@@ -357,13 +357,13 @@ The Kafka indexing service may still produce some small segments. For example, c
- Segment granularity is set to an HOUR.
- The supervisor was started at 9:10. After 4 hours at 13:10, Druid starts a new set of tasks. The events for the interval 13:00 - 14:00 may be split across existing tasks and the new set of tasks, which could result in small segments.

To merge them into new segments of an ideal size (in the range of ~500-700 MB per segment), you can schedule re-indexing tasks, optionally with a different segment granularity.

-For information on how to optimize the segment size, see [Segment size optimization](../../operations/segment-optimization.md).
+For information on how to optimize the segment size, see [Segment size optimization](../operations/segment-optimization.md).

## Learn more

See the following topics for more information:

-* [Supervisor API](../../api-reference/supervisor-api.md) for how to manage and monitor supervisors using the API.
-* [Supervisor](../../ingestion/supervisor.md) for supervisor status and capacity planning.
-* [Loading from Apache Kafka](../../tutorials/tutorial-kafka.md) for a tutorial on streaming data from Apache Kafka.
-* [Kafka input format](../../ingestion/data-formats.md) to learn about the `kafka` input format.
\ No newline at end of file
+* [Supervisor API](../api-reference/supervisor-api.md) for how to manage and monitor supervisors using the API.
+* [Supervisor](../ingestion/supervisor.md) for supervisor status and capacity planning.
+* [Loading from Apache Kafka](../tutorials/tutorial-kafka.md) for a tutorial on streaming data from Apache Kafka.
+* [Kafka input format](../ingestion/data-formats.md) to learn about the `kafka` input format.
\ No newline at end of file
diff --git a/docs/development/extensions-core/kinesis-ingestion.md b/docs/ingestion/kinesis-ingestion.md
similarity index 93%
rename from docs/development/extensions-core/kinesis-ingestion.md
rename to docs/ingestion/kinesis-ingestion.md
index 0111e31e6dd6..8a550c3d4018 100644
--- a/docs/development/extensions-core/kinesis-ingestion.md
+++ b/docs/ingestion/kinesis-ingestion.md
@@ -33,20 +33,20 @@ This topic contains configuration reference information for the Kinesis indexing

## Setup

-To use the Kinesis indexing service, you must first load the `druid-kinesis-indexing-service` core extension on both the Overlord and the MiddleManager. See [Loading extensions](../../configuration/extensions.md#loading-extensions) for more information.
+To use the Kinesis indexing service, you must first load the `druid-kinesis-indexing-service` core extension on both the Overlord and the MiddleManager. See [Loading extensions](../configuration/extensions.md#loading-extensions) for more information.

Review [Known issues](#known-issues) before deploying the `druid-kinesis-indexing-service` extension to production.

## Supervisor spec

-The following table outlines the high-level configuration options for the Kinesis [supervisor spec](../../ingestion/supervisor.md#supervisor-spec).
+The following table outlines the high-level configuration options for the Kinesis [supervisor spec](../ingestion/supervisor.md#supervisor-spec).

|Property|Type|Description|Required|
|--------|----|-----------|--------|
|`type`|String|The supervisor type; must be `kinesis`.|Yes|
|`spec`|Object|The container object for the supervisor configuration.|Yes|
|`ioConfig`|Object|The [I/O configuration](#supervisor-io-configuration) object for configuring Kinesis connection and I/O-related settings for the supervisor and indexing tasks.|Yes|
-|`dataSchema`|Object|The schema used by the Kinesis indexing task during ingestion.
See [`dataSchema`](../../ingestion/ingestion-spec.md#dataschema) for more information.|Yes|
+|`dataSchema`|Object|The schema used by the Kinesis indexing task during ingestion. See [`dataSchema`](../ingestion/ingestion-spec.md#dataschema) for more information.|Yes|
|`tuningConfig`|Object|The [tuning configuration](#supervisor-tuning-configuration) object for configuring performance-related settings for the supervisor and indexing tasks.|No|

The following example shows a supervisor spec for a stream with the name `KinesisStream`.
@@ -134,7 +134,7 @@ The following table outlines the configuration options for `ioConfig`:

|Property|Type|Description|Required|Default|
|--------|----|-----------|--------|-------|
|`stream`|String|The Kinesis stream to read.|Yes||
|`inputFormat`|Object|The [input format](../ingestion/data-formats.md#input-format) to specify how to parse input data.|Yes||
|`endpoint`|String|The AWS Kinesis stream endpoint for a region. You can find a list of endpoints in the [AWS service endpoints](http://docs.aws.amazon.com/general/latest/gr/rande.html#ak_region) document.|No|`kinesis.us-east-1.amazonaws.com`|
|`replicas`|Integer|The number of replica sets, where 1 is a single set of tasks (no replication). Druid always assigns replica tasks to different workers to provide resiliency against process failure.|No|1|
|`taskCount`|Integer|The maximum number of reading tasks in a replica set. Multiply `taskCount` and `replicas` to determine the maximum number of reading tasks.
The total number of tasks (reading and publishing) is higher than the maximum number of reading tasks. See [Capacity planning](#capacity-planning) for more details. When `taskCount > {numKinesisShards}`, the actual number of reading tasks is less than the `taskCount` value.|No|1| @@ -150,7 +150,7 @@ The following table outlines the configuration options for `ioConfig`: |`awsAssumedRoleArn`|String|The AWS assumed role to use for additional permissions.|No|| |`awsExternalId`|String|The AWS external ID to use for additional permissions.|No|| |`deaggregate`|Boolean|Whether to use the deaggregate function of the Kinesis Client Library (KCL).|No|| -|`autoScalerConfig`|Object|Defines autoscaling behavior for ingestion tasks. See [Task autoscaler](../../ingestion/supervisor.md#task-autoscaler) for more information.|No|null| +|`autoScalerConfig`|Object|Defines autoscaling behavior for ingestion tasks. See [Task autoscaler](../ingestion/supervisor.md#task-autoscaler) for more information.|No|null| #### Task autoscaler @@ -208,7 +208,7 @@ The following table outlines the configuration options for `tuningConfig`: |`indexSpecForIntermediatePersists`|Object|Defines segment storage format options to use at indexing time for intermediate persisted temporary segments. You can use `indexSpecForIntermediatePersists` to disable dimension/metric compression on intermediate segments to reduce memory required for final merging. However, disabling compression on intermediate segments might increase page cache use while they are used before getting merged into final segment published. See [IndexSpec](#indexspec) for possible values.|No|Same as `indexSpec`| |`reportParseExceptions`|Boolean|If `true`, Druid throws exceptions encountered during parsing causing ingestion to halt. If `false`, Druid skips unparseable rows and fields.|No|`false`| |`handoffConditionTimeout`|Long|Number of milliseconds to wait for segment handoff. Set to a value >= 0, where 0 means to wait indefinitely.|No|0| -|`resetOffsetAutomatically`|Boolean|Controls behavior when Druid needs to read Kinesis messages that are no longer available.
If `false`, the exception bubbles up causing tasks to fail and ingestion to halt. If this occurs, manual intervention is required to correct the situation, potentially using the [Reset Supervisor API](../../api-reference/supervisor-api.md). This mode is useful for production, since it highlights issues with ingestion.
If `true`, Druid automatically resets to the earliest or latest sequence number available in Kinesis, based on the value of the `useEarliestSequenceNumber` property (earliest if `true`, latest if `false`). Note that this can lead to dropping data (if `useEarliestSequenceNumber` is `false`) or duplicating data (if `useEarliestSequenceNumber` is `true`) without your knowledge. Druid logs messages indicating that a reset has occurred without interrupting ingestion. This mode is useful for non-production situations since it enables Druid to recover from problems automatically, even if they lead to quiet dropping or duplicating of data.|No|`false`| +|`resetOffsetAutomatically`|Boolean|Controls behavior when Druid needs to read Kinesis messages that are no longer available.
If `false`, the exception bubbles up, causing tasks to fail and ingestion to halt. If this occurs, manual intervention is required to correct the situation, potentially using the [Reset Supervisor API](../api-reference/supervisor-api.md). This mode is useful for production, since it highlights issues with ingestion.
If `true`, Druid automatically resets to the earliest or latest sequence number available in Kinesis, based on the value of the `useEarliestSequenceNumber` property (earliest if `true`, latest if `false`). Note that this can lead to dropping data (if `useEarliestSequenceNumber` is `false`) or duplicating data (if `useEarliestSequenceNumber` is `true`) without your knowledge. Druid logs messages indicating that a reset has occurred without interrupting ingestion. This mode is useful for non-production situations since it enables Druid to recover from problems automatically, even if they lead to quiet dropping or duplicating of data.|No|`false`| |`skipSequenceNumberAvailabilityCheck`|Boolean|Whether to enable checking if the current sequence number is still available in a particular Kinesis shard. If `false`, the indexing task attempts to reset the current sequence number, depending on the value of `resetOffsetAutomatically`.|No|`false`| |`workerThreads`|Integer|The number of threads that the supervisor uses to handle requests/responses for worker tasks, along with any other internal asynchronous operation.|No| `min(10, taskCount)`| |`chatRetries`|Integer|The number of times Druid retries HTTP requests to indexing tasks before considering tasks unresponsive.|No|8| @@ -218,10 +218,10 @@ The following table outlines the configuration options for `tuningConfig`: |`recordBufferOfferTimeout`|Integer|The number of milliseconds to wait for space to become available in the buffer before timing out.|No|5000| |`recordBufferFullWait`|Integer|The number of milliseconds to wait for the buffer to drain before Druid attempts to fetch records from Kinesis again.|No|5000| |`fetchThreads`|Integer|The size of the pool of threads fetching data from Kinesis. There is no benefit in having more threads than Kinesis shards.|No| `procs * 2`, where `procs` is the number of processors available to the task.| -|`segmentWriteOutMediumFactory`|Object|The segment write-out medium to use when creating segments. See [Additional Peon configuration: SegmentWriteOutMediumFactory](../../configuration/index.md#segmentwriteoutmediumfactory) for explanation and available options.|No|If not specified, Druid uses the value from `druid.peon.defaultSegmentWriteOutMediumFactory.type`.| +|`segmentWriteOutMediumFactory`|Object|The segment write-out medium to use when creating segments. See [Additional Peon configuration: SegmentWriteOutMediumFactory](../configuration/index.md#segmentwriteoutmediumfactory) for explanation and available options.|No|If not specified, Druid uses the value from `druid.peon.defaultSegmentWriteOutMediumFactory.type`.| |`logParseExceptions`|Boolean|If `true`, Druid logs an error message when a parsing exception occurs, containing information about the row where the error occurred.|No|`false`| |`maxParseExceptions`|Integer|The maximum number of parse exceptions that can occur before the task halts ingestion and fails. Overridden if `reportParseExceptions` is set.|No|unlimited| -|`maxSavedParseExceptions`|Integer|When a parse exception occurs, Druid keeps track of the most recent parse exceptions. `maxSavedParseExceptions` limits the number of saved exception instances. These saved exceptions are available after the task finishes in the [task completion report](../../ingestion/tasks.md#task-reports). Overridden if `reportParseExceptions` is set.|No|0| +|`maxSavedParseExceptions`|Integer|When a parse exception occurs, Druid keeps track of the most recent parse exceptions. 
`maxSavedParseExceptions` limits the number of saved exception instances. These saved exceptions are available after the task finishes in the [task completion report](../ingestion/tasks.md#task-reports). Overridden if `reportParseExceptions` is set.|No|0|
 |`maxRecordsPerPoll`|Integer|The maximum number of records to be fetched from the buffer per poll. The actual maximum will be `Max(maxRecordsPerPoll, Max(bufferSize, 1))`.|No| See [Determine fetch settings](#determine-fetch-settings) for defaults.|
 |`repartitionTransitionDuration`|ISO 8601 period|When shards are split or merged, the supervisor recomputes shard-to-task-group mappings. The supervisor also signals any running tasks created under the old mappings to stop early at current time + `repartitionTransitionDuration`. Stopping the tasks early allows Druid to begin reading from the new shards more quickly. The repartition transition wait time controlled by this property gives the stream additional time to write records to the new shards after the split or merge, which helps avoid issues with [empty shard handling](https://github.com/apache/druid/issues/7600).|No|PT2M|
 |`offsetFetchPeriod`|ISO 8601 period|Determines how often the supervisor queries Kinesis and the indexing tasks to fetch current offsets and calculate lag. If the user-specified value is below the minimum value of PT5S, the supervisor ignores the value and uses the minimum value instead.|No|PT30S|
@@ -337,7 +337,7 @@ For example, consider the following scenario:
 
 After 4 hours at 13:10, Druid starts a new set of tasks. The events for the interval 13:00 - 14:00 may be split across existing tasks and the new set of tasks, which could result in small segments. To merge them into new segments of an ideal size (in the range of ~500-700 MB per segment), you can schedule re-indexing tasks, optionally with a different segment granularity.
 
-For information on how to optimize the segment size, see [Segment size optimization](../../operations/segment-optimization.md).
+For information on how to optimize the segment size, see [Segment size optimization](../operations/segment-optimization.md).
 
 ## Determine fetch settings
 
@@ -415,5 +415,5 @@ Before you deploy the `druid-kinesis-indexing-service` extension to production,
 
 See the following topics for more information:
 
-* [Supervisor API](../../api-reference/supervisor-api.md) for how to manage and monitor supervisors using the API.
-* [Supervisor](../../ingestion/supervisor.md) for supervisor status and capacity planning.
+* [Supervisor API](../api-reference/supervisor-api.md) for how to manage and monitor supervisors using the API.
+* [Supervisor](../ingestion/supervisor.md) for supervisor status and capacity planning.
\ No newline at end of file diff --git a/docs/ingestion/partitioning.md b/docs/ingestion/partitioning.md index 422c07de8058..6cf5b0a74d28 100644 --- a/docs/ingestion/partitioning.md +++ b/docs/ingestion/partitioning.md @@ -69,8 +69,8 @@ The following table shows how each ingestion method handles partitioning: |[Native batch](native-batch.md)|Configured using [`partitionsSpec`](native-batch.md#partitionsspec) inside the `tuningConfig`.| |[SQL](../multi-stage-query/index.md)|Configured using [`PARTITIONED BY`](../multi-stage-query/concepts.md#partitioning) and [`CLUSTERED BY`](../multi-stage-query/concepts.md#clustering).| |[Hadoop](hadoop.md)|Configured using [`partitionsSpec`](hadoop.md#partitionsspec) inside the `tuningConfig`.| -|[Kafka indexing service](../development/extensions-core/kafka-ingestion.md)|Kafka topic partitioning defines how Druid partitions the datasource. You can also [reindex](../data-management/update.md#reindex) or [compact](../data-management/compaction.md) to repartition after initial ingestion.| -|[Kinesis indexing service](../development/extensions-core/kinesis-ingestion.md)|Kinesis stream sharding defines how Druid partitions the datasource. You can also [reindex](../data-management/update.md#reindex) or [compact](../data-management/compaction.md) to repartition after initial ingestion.| +|[Kafka indexing service](../ingestion/kafka-ingestion.md)|Kafka topic partitioning defines how Druid partitions the datasource. You can also [reindex](../data-management/update.md#reindex) or [compact](../data-management/compaction.md) to repartition after initial ingestion.| +|[Kinesis indexing service](../ingestion/kinesis-ingestion.md)|Kinesis stream sharding defines how Druid partitions the datasource. You can also [reindex](../data-management/update.md#reindex) or [compact](../data-management/compaction.md) to repartition after initial ingestion.| ## Learn more diff --git a/docs/ingestion/rollup.md b/docs/ingestion/rollup.md index 241ffba367ec..212708649a43 100644 --- a/docs/ingestion/rollup.md +++ b/docs/ingestion/rollup.md @@ -87,8 +87,8 @@ The following table shows how each method handles rollup: |[Native batch](native-batch.md)|`index_parallel` and `index` type may be either perfect or best-effort, based on configuration.| |[SQL-based batch](../multi-stage-query/index.md)|Always perfect.| |[Hadoop](hadoop.md)|Always perfect.| -|[Kafka indexing service](../development/extensions-core/kafka-ingestion.md)|Always best-effort.| -|[Kinesis indexing service](../development/extensions-core/kinesis-ingestion.md)|Always best-effort.| +|[Kafka indexing service](../ingestion/kafka-ingestion.md)|Always best-effort.| +|[Kinesis indexing service](../ingestion/kinesis-ingestion.md)|Always best-effort.| ## Learn more diff --git a/docs/ingestion/standalone-realtime.md b/docs/ingestion/standalone-realtime.md index 7a3a9e0e35a6..94a8565baad7 100644 --- a/docs/ingestion/standalone-realtime.md +++ b/docs/ingestion/standalone-realtime.md @@ -41,5 +41,5 @@ suffered from limitations which made it not possible to achieve exactly once ing The extensions `druid-kafka-eight`, `druid-kafka-eight-simpleConsumer`, `druid-rabbitmq`, and `druid-rocketmq` were also removed at this time, since they were built to operate on the realtime nodes. -Please consider using the [Kafka Indexing Service](../development/extensions-core/kafka-ingestion.md) or -[Kinesis Indexing Service](../development/extensions-core/kinesis-ingestion.md) for stream pull ingestion instead. 
+Please consider using the [Kafka Indexing Service](../ingestion/kafka-ingestion.md) or
+[Kinesis Indexing Service](../ingestion/kinesis-ingestion.md) for stream pull ingestion instead.
diff --git a/docs/ingestion/streaming.md b/docs/ingestion/streaming.md
index ba15e6de6367..de0fd5b3eb01 100644
--- a/docs/ingestion/streaming.md
+++ b/docs/ingestion/streaming.md
@@ -24,8 +24,8 @@ title: "Streaming ingestion"
 
 Apache Druid accepts data streams from the following external streaming sources:
 
-* Apache Kafka through the bundled [Kafka indexing service](../development/extensions-core/kafka-ingestion.md) extension.
-* Amazon Kinesis through the bundled [Kinesis indexing service](../development/extensions-core/kinesis-ingestion.md) extension.
+* Apache Kafka through the bundled [Kafka indexing service](kafka-ingestion.md) extension.
+* Amazon Kinesis through the bundled [Kinesis indexing service](kinesis-ingestion.md) extension.
 
 Each indexing service provides real-time data ingestion with an exactly-once stream processing guarantee. To use either of the streaming ingestion methods, you must first load the associated extension on both the Overlord and the MiddleManager. See [Loading extensions](../configuration/extensions.md#loading-extensions) for more information.
diff --git a/docs/ingestion/supervisor.md b/docs/ingestion/supervisor.md
index c7785f2757b0..31a332148239 100644
--- a/docs/ingestion/supervisor.md
+++ b/docs/ingestion/supervisor.md
@@ -35,7 +35,7 @@ Druid starts a new supervisor for a datasource when you create a supervisor spec
 
 You can create and manage supervisor specs using the data loader in the Druid web console or by calling the [Supervisor API](../api-reference/supervisor-api.md). Once started, the supervisor persists in the configured metadata database. There can only be one supervisor per datasource, and submitting a second supervisor spec for the same datasource overwrites the previous one.
 
-When an Overlord gains leadership, either by being started or as a result of another Overlord failing, it spawns a supervisor for each supervisor spec in the metadata database. The supervisor then discovers running indexing tasks and attempts to adopt them if they are compatible with the supervisor's configuration. If they are not compatible, the tasks are terminated and the supervisor creates a new set of tasks. This way, the supervisors persist across Overlord restarts and failovers.
+When an Overlord gains leadership, either by being started or as a result of another Overlord failing, it spawns a supervisor for each supervisor spec in the metadata database. The supervisor then discovers running indexing tasks and attempts to adopt them if they are compatible with the supervisor's configuration. If they are not compatible, the tasks are terminated and the supervisor creates a new set of tasks. This way, the supervised tasks persist across Overlord restarts and failovers.
 
 ### Schema and configuration changes
 
@@ -113,5 +113,5 @@ time-to-publish (generate segment, push to deep storage, load on Historical) is
 
 See the following topics for more information:
 
 * [Supervisor API](../api-reference/supervisor-api.md) for how to manage and monitor supervisors using the API.
-* [Apache Kafka ingestion](../development/extensions-core/kafka-ingestion.md) to learn about ingesting data from an Apache Kafka stream.
-* [Amazon Kinesis ingestion](../development/extensions-core/kinesis-ingestion.md) to learn about ingesting data from an Amazon Kinesis stream.
\ No newline at end of file
+* [Apache Kafka ingestion](../ingestion/kafka-ingestion.md) to learn about ingesting data from an Apache Kafka stream.
+* [Amazon Kinesis ingestion](../ingestion/kinesis-ingestion.md) to learn about ingesting data from an Amazon Kinesis stream.
\ No newline at end of file
diff --git a/docs/ingestion/tasks.md b/docs/ingestion/tasks.md
index 4b3081cef911..d9f0758ede6a 100644
--- a/docs/ingestion/tasks.md
+++ b/docs/ingestion/tasks.md
@@ -438,7 +438,7 @@ Task storage sizes are configured through a combination of three properties:
 
 While it seems like one task might use multiple directories, only one directory from the list of base directories will be used for any given task. As such, each task is given a single directory for scratch space.
 
-The actual amount of memory assigned to any given task is computed by determining the largest size that enables all task slots to be given an equivalent amount of disk storage. For example, with 5 slots, 2 directories (A and B) and a size of 300 GB, 3 slots would be given to directory A, 2 slots to directory B and each slot would be allowed 100 GB 
+The actual amount of disk storage assigned to any given task is computed by determining the largest size that enables all task slots to be given an equivalent amount of disk storage. For example, with 5 slots, 2 directories (A and B), and a size of 300 GB, 3 slots would be given to directory A, 2 slots to directory B, and each slot would be allowed 100 GB.
 
 ## All task types
 
@@ -453,12 +453,12 @@ See [Hadoop-based ingestion](hadoop.md).
 
 ### `index_kafka`
 
 Submitted automatically, on your behalf, by a
-[Kafka-based ingestion supervisor](../development/extensions-core/kafka-ingestion.md).
+[Kafka-based ingestion supervisor](../ingestion/kafka-ingestion.md).
 
 ### `index_kinesis`
 
 Submitted automatically, on your behalf, by a
-[Kinesis-based ingestion supervisor](../development/extensions-core/kinesis-ingestion.md).
+[Kinesis-based ingestion supervisor](../ingestion/kinesis-ingestion.md).
 
 ### `compact`
 
diff --git a/docs/ingestion/tranquility.md b/docs/ingestion/tranquility.md
index f66464456188..9cc0636fd6cc 100644
--- a/docs/ingestion/tranquility.md
+++ b/docs/ingestion/tranquility.md
@@ -30,7 +30,7 @@ release. It may still work with the latest Druid servers, but not all features a
 due to limitations of older Druid APIs on the Tranquility side.
 
 For new projects that require streaming ingestion, we recommend using Druid's native support for
-[Apache Kafka](../development/extensions-core/kafka-ingestion.md) or
-[Amazon Kinesis](../development/extensions-core/kinesis-ingestion.md).
+[Apache Kafka](../ingestion/kafka-ingestion.md) or
+[Amazon Kinesis](../ingestion/kinesis-ingestion.md).
 
 For more details, check out the [Tranquility GitHub page](https://github.com/druid-io/tranquility/).
diff --git a/docs/operations/basic-cluster-tuning.md b/docs/operations/basic-cluster-tuning.md
index 538ae33d75f2..8a904e239e2e 100644
--- a/docs/operations/basic-cluster-tuning.md
+++ b/docs/operations/basic-cluster-tuning.md
@@ -256,7 +256,7 @@ The total memory usage of the MiddleManager + Tasks:
 
 ###### Kafka/Kinesis ingestion
 
-If you use the [Kafka Indexing Service](../development/extensions-core/kafka-ingestion.md) or [Kinesis Indexing Service](../development/extensions-core/kinesis-ingestion.md), the number of tasks required will depend on the number of partitions and your taskCount/replica settings.
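+To make the `taskCount`/`replicas` arithmetic described in the next paragraph concrete, here is a hedged sketch of the relevant `ioConfig` fields (all values are hypothetical):
+
+```json
+{
+  "topic": "example_topic",
+  "taskCount": 4,
+  "replicas": 2,
+  "taskDuration": "PT1H"
+}
+```
+
+A spec like this requires up to `taskCount * replicas = 8` task slots for reading tasks, plus additional slots for tasks that are publishing segments at the end of each `taskDuration`.
+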
+If you use the [Kafka Indexing Service](../ingestion/kafka-ingestion.md) or [Kinesis Indexing Service](../ingestion/kinesis-ingestion.md), the number of tasks required will depend on the number of partitions and your taskCount/replica settings. On top of those requirements, allocating more task slots in your cluster is a good idea, so that you have free task slots available for other tasks, such as [compaction tasks](../data-management/compaction.md). diff --git a/docs/operations/dynamic-config-provider.md b/docs/operations/dynamic-config-provider.md index a0413d856229..b641efd7a0ea 100644 --- a/docs/operations/dynamic-config-provider.md +++ b/docs/operations/dynamic-config-provider.md @@ -24,7 +24,7 @@ title: "Dynamic Config Providers" Druid relies on dynamic config providers to supply multiple related sets of credentials, secrets, and configurations within a Druid extension. Dynamic config providers are intended to eventually replace [PasswordProvider](./password-provider.md). -By default, Druid includes an environment variable dynamic config provider that supports Kafka consumer configuration in [Kafka ingestion](../development/extensions-core/kafka-ingestion.md). +By default, Druid includes an environment variable dynamic config provider that supports Kafka consumer configuration in [Kafka ingestion](../ingestion/kafka-ingestion.md). To develop a custom extension of the `DynamicConfigProvider` interface that is registered at Druid process startup, see [Adding a new DynamicConfigProvider implementation](../development/modules.md#adding-a-new-dynamicconfigprovider-implementation). diff --git a/docs/operations/metrics.md b/docs/operations/metrics.md index 8fefc8b6e133..1510f35199d6 100644 --- a/docs/operations/metrics.md +++ b/docs/operations/metrics.md @@ -201,7 +201,7 @@ field in the `context` field of the ingestion spec. `tags` is expected to be a m ### Ingestion metrics for Kafka -These metrics apply to the [Kafka indexing service](../development/extensions-core/kafka-ingestion.md). +These metrics apply to the [Kafka indexing service](../ingestion/kafka-ingestion.md). |Metric|Description|Dimensions|Normal value| |------|-----------|----------|------------| @@ -212,7 +212,7 @@ These metrics apply to the [Kafka indexing service](../development/extensions-co ### Ingestion metrics for Kinesis -These metrics apply to the [Kinesis indexing service](../development/extensions-core/kinesis-ingestion.md). +These metrics apply to the [Kinesis indexing service](../ingestion/kinesis-ingestion.md). |Metric|Description|Dimensions|Normal value| |------|-----------|----------|------------| diff --git a/docs/querying/arrays.md b/docs/querying/arrays.md index 904802c2b1fc..dbeb3ec6e028 100644 --- a/docs/querying/arrays.md +++ b/docs/querying/arrays.md @@ -42,7 +42,7 @@ The following sections describe inserting, filtering, and grouping behavior base ## Ingesting arrays ### Native batch and streaming ingestion -When using native [batch](../ingestion/native-batch.md) or streaming ingestion such as with [Apache Kafka](../development/extensions-core/kafka-ingestion.md), arrays can be ingested using the [`"auto"`](../ingestion/ingestion-spec.md#dimension-objects) type dimension schema which is shared with [type-aware schema discovery](../ingestion/schema-design.md#type-aware-schema-discovery). 
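+As a minimal sketch of the TSV case described below (the `|` delimiter and the `findColumnsFromHeader` setting are assumptions, not taken from this page), an `inputFormat` that splits a delimited column into multiple values looks like:
+
+```json
+{
+  "inputFormat": {
+    "type": "tsv",
+    "findColumnsFromHeader": true,
+    "listDelimiter": "|"
+  }
+}
+```
+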
+When using native [batch](../ingestion/native-batch.md) or streaming ingestion such as with [Apache Kafka](../ingestion/kafka-ingestion.md), arrays can be ingested using the [`"auto"`](../ingestion/ingestion-spec.md#dimension-objects) type dimension schema which is shared with [type-aware schema discovery](../ingestion/schema-design.md#type-aware-schema-discovery). When ingesting from TSV or CSV data, you can specify the array delimiters using the `listDelimiter` field in the `inputFormat`. JSON data must be formatted as a JSON array to be ingested as an array type. JSON data does not require `inputFormat` configuration. @@ -238,7 +238,7 @@ Avoid confusing string arrays with [multi-value dimensions](multi-value-dimensio Use care during ingestion to ensure you get the type you want. -To get arrays when performing an ingestion using JSON ingestion specs, such as [native batch](../ingestion/native-batch.md) or streaming ingestion such as with [Apache Kafka](../development/extensions-core/kafka-ingestion.md), use dimension type `auto` or enable `useSchemaDiscovery`. When performing a [SQL-based ingestion](../multi-stage-query/index.md), write a query that generates arrays and set the context parameter `"arrayIngestMode": "array"`. Arrays may contain strings or numbers. +To get arrays when performing an ingestion using JSON ingestion specs, such as [native batch](../ingestion/native-batch.md) or streaming ingestion such as with [Apache Kafka](../ingestion/kafka-ingestion.md), use dimension type `auto` or enable `useSchemaDiscovery`. When performing a [SQL-based ingestion](../multi-stage-query/index.md), write a query that generates arrays and set the context parameter `"arrayIngestMode": "array"`. Arrays may contain strings or numbers. To get multi-value dimensions when performing an ingestion using JSON ingestion specs, use dimension type `string` and do not enable `useSchemaDiscovery`. When performing a [SQL-based ingestion](../multi-stage-query/index.md), wrap arrays in [`ARRAY_TO_MV`](multi-value-dimensions.md#sql-based-ingestion), which ensures you get multi-value dimensions in any `arrayIngestMode`. Multi-value dimensions can only contain strings. diff --git a/docs/querying/multi-value-dimensions.md b/docs/querying/multi-value-dimensions.md index 9680d5603974..2b33737a36fc 100644 --- a/docs/querying/multi-value-dimensions.md +++ b/docs/querying/multi-value-dimensions.md @@ -49,7 +49,7 @@ The following sections describe inserting, filtering, and grouping behavior base ## Ingestion ### Native batch and streaming ingestion -When using native [batch](../ingestion/native-batch.md) or streaming ingestion such as with [Apache Kafka](../development/extensions-core/kafka-ingestion.md), the Druid web console data loader can detect multi-value dimensions and configure the `dimensionsSpec` accordingly. +When using native [batch](../ingestion/native-batch.md) or streaming ingestion such as with [Apache Kafka](../ingestion/kafka-ingestion.md), the Druid web console data loader can detect multi-value dimensions and configure the `dimensionsSpec` accordingly. For TSV or CSV data, you can specify the multi-value delimiters using the `listDelimiter` field in the `inputFormat`. JSON data must be formatted as a JSON array to be ingested as a multi-value dimension. JSON data does not require `inputFormat` configuration. @@ -507,7 +507,7 @@ Avoid confusing string arrays with [multi-value dimensions](multi-value-dimensio Use care during ingestion to ensure you get the type you want. 
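+As a minimal sketch of the distinction drawn in the next two paragraphs (the column names `tags` and `labels` are hypothetical), the following `dimensionsSpec` ingests `tags` as an array and `labels` as a multi-value string dimension:
+
+```json
+{
+  "dimensionsSpec": {
+    "dimensions": [
+      { "type": "auto", "name": "tags" },
+      { "type": "string", "name": "labels" }
+    ]
+  }
+}
+```
+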
-To get arrays when performing an ingestion using JSON ingestion specs, such as [native batch](../ingestion/native-batch.md) or streaming ingestion such as with [Apache Kafka](../development/extensions-core/kafka-ingestion.md), use dimension type `auto` or enable `useSchemaDiscovery`. When performing a [SQL-based ingestion](../multi-stage-query/index.md), write a query that generates arrays and set the context parameter `"arrayIngestMode": "array"`. Arrays may contain strings or numbers. +To get arrays when performing an ingestion using JSON ingestion specs, such as [native batch](../ingestion/native-batch.md) or streaming ingestion such as with [Apache Kafka](../ingestion/kafka-ingestion.md), use dimension type `auto` or enable `useSchemaDiscovery`. When performing a [SQL-based ingestion](../multi-stage-query/index.md), write a query that generates arrays and set the context parameter `"arrayIngestMode": "array"`. Arrays may contain strings or numbers. To get multi-value dimensions when performing an ingestion using JSON ingestion specs, use dimension type `string` and do not enable `useSchemaDiscovery`. When performing a [SQL-based ingestion](../multi-stage-query/index.md), wrap arrays in [`ARRAY_TO_MV`](multi-value-dimensions.md#sql-based-ingestion), which ensures you get multi-value dimensions in any `arrayIngestMode`. Multi-value dimensions can only contain strings. diff --git a/docs/querying/nested-columns.md b/docs/querying/nested-columns.md index 01a86e49a78a..d50f907c7714 100644 --- a/docs/querying/nested-columns.md +++ b/docs/querying/nested-columns.md @@ -227,7 +227,7 @@ PARTITIONED BY ALL You can ingest nested data into Druid using the [streaming method](../ingestion/index.md#streaming)—for example, from a Kafka topic. -When you [define your supervisor spec](../development/extensions-core/kafka-ingestion.md#define-a-supervisor-spec), include a dimension with type `json` for each nested column. For example, the following supervisor spec from the [Kafka ingestion tutorial](../tutorials/tutorial-kafka.md) contains dimensions for the nested columns `event`, `agent`, and `geo_ip` in datasource `kttm-kafka`. +When you [define your supervisor spec](../ingestion/kafka-ingestion.md#define-a-supervisor-spec), include a dimension with type `json` for each nested column. For example, the following supervisor spec from the [Kafka ingestion tutorial](../tutorials/tutorial-kafka.md) contains dimensions for the nested columns `event`, `agent`, and `geo_ip` in datasource `kttm-kafka`. ```json { diff --git a/docs/querying/sql-metadata-tables.md b/docs/querying/sql-metadata-tables.md index 829ed9433063..56264a28c9ea 100644 --- a/docs/querying/sql-metadata-tables.md +++ b/docs/querying/sql-metadata-tables.md @@ -300,7 +300,7 @@ The supervisors table provides information about supervisors. |------|-----|-----| |supervisor_id|VARCHAR|Supervisor task identifier| |state|VARCHAR|Basic state of the supervisor. Available states: `UNHEALTHY_SUPERVISOR`, `UNHEALTHY_TASKS`, `PENDING`, `RUNNING`, `SUSPENDED`, `STOPPING`. See [Supervisor reference](../ingestion/supervisor.md) for more information.| -|detailed_state|VARCHAR|Supervisor specific state. (See documentation of the specific supervisor for details, e.g. [Kafka](../development/extensions-core/kafka-ingestion.md) or [Kinesis](../development/extensions-core/kinesis-ingestion.md))| +|detailed_state|VARCHAR|Supervisor specific state. 
See documentation of the specific supervisor for details: [Kafka](../ingestion/kafka-ingestion.md) or [Kinesis](../ingestion/kinesis-ingestion.md).|
 |healthy|BIGINT|Boolean represented as long type where 1 = true, 0 = false. 1 indicates a healthy supervisor|
 |type|VARCHAR|Type of supervisor, e.g. `kafka`, `kinesis` or `materialized_view`|
 |source|VARCHAR|Source of the supervisor, e.g. Kafka topic or Kinesis stream|
diff --git a/docs/tutorials/tutorial-kafka.md b/docs/tutorials/tutorial-kafka.md
index 7e03671a6b6a..4bad708d57d4 100644
--- a/docs/tutorials/tutorial-kafka.md
+++ b/docs/tutorials/tutorial-kafka.md
@@ -295,4 +295,4 @@ Check out the [Querying data tutorial](../tutorials/tutorial-query.md) to run so
 
 For more information, see the following topics:
 
-- [Apache Kafka ingestion](../development/extensions-core/kafka-ingestion.md) for information on loading data from Kafka streams and maintaining Kafka supervisors for Druid.
+- [Apache Kafka ingestion](../ingestion/kafka-ingestion.md) for information on loading data from Kafka streams and maintaining Kafka supervisors for Druid.
diff --git a/website/redirects.js b/website/redirects.js
index 93cd96390b7c..bddbbc95893f 100644
--- a/website/redirects.js
+++ b/website/redirects.js
@@ -122,7 +122,11 @@ const Redirects=[
     "/docs/latest/development/extensions-core/kafka-supervisor-operations.html",
     "/docs/latest/development/extensions-core/kafka-supervisor-reference.html"
    ],
-   "to": "/docs/latest/development/extensions-core/kafka-ingestion"
+   "to": "/docs/latest/ingestion/kafka-ingestion"
+  },
+  {
+   "from": "/docs/latest/development/extensions-core/kinesis-ingestion.html",
+   "to": "/docs/latest/ingestion/kinesis-ingestion"
   },
   {
    "from": "/docs/latest/development/extensions-contrib/orc.html",
diff --git a/website/sidebars.json b/website/sidebars.json
index 80dd3bafc9ab..6bd623906af3 100644
--- a/website/sidebars.json
+++ b/website/sidebars.json
@@ -82,8 +82,8 @@
       "items": [
         "ingestion/streaming",
         "ingestion/supervisor",
-        "development/extensions-core/kafka-ingestion",
-        "development/extensions-core/kinesis-ingestion"
+        "ingestion/kafka-ingestion",
+        "ingestion/kinesis-ingestion"
       ]
     },
     {

From cf08a5faea5bf213f209e59adb66dceb6e18aa54 Mon Sep 17 00:00:00 2001
From: Katya Macedo
Date: Fri, 2 Feb 2024 09:15:01 -0600
Subject: [PATCH 06/15] Saving changes

---
 docs/assets/supervisor-info-dialog.png | Bin 0 -> 107805 bytes
 docs/assets/supervisor-view.png        | Bin 0 -> 63532 bytes
 docs/ingestion/kafka-ingestion.md      | 290 +++++++++----------------
 docs/ingestion/kinesis-ingestion.md    | 126 +++--------
 docs/ingestion/supervisor.md           | 276 ++++++++++++++++++++++-
 5 files changed, 392 insertions(+), 300 deletions(-)
 create mode 100644 docs/assets/supervisor-info-dialog.png
 create mode 100644 docs/assets/supervisor-view.png

diff --git a/docs/assets/supervisor-info-dialog.png b/docs/assets/supervisor-info-dialog.png
new file mode 100644
index 0000000000000000000000000000000000000000..3be424a413ef0772af5ed015ca34e04618274bfe
GIT binary patch
literal 107805
[base85-encoded binary image data omitted]
zoq}uLQ$cj*p6&TLPv_-5%Sfe-vVDj3c&iKjC$?Yx+^8;&!RO~^$NgDhFbVMpY@Xd= zF_x8D8*6REEMlTHlhYTflHXOWA~MPS;&W_6Loz(ej zvno9}1uG8jkAH9khbu*6g8Z%Hbl*g3bNbTg66b?4q5{!C#g$Q3yb-4w^AxcYQpQ%o zlt}C5sMSspLg5O~&?ssCH%~Zw2xUB_!Uk@`b_&LX$UNv&*8%rt8b(ga9vPlw zA7)-V;R|?!#l$;LeUIpT_vPN_&{iBJry+W36TN9g=Yc*4E}_6Va@)AyzHoWMOfCFM zyXYhS{yDkcVs@0>N2ggB2ZlJQbMKtDakT4$iiTpq*5<`0+k^NmwlIFfq0)YV7}kxO z2px83J|$wI!Wn}`tp9nWAJ4aC$X@`@4`GCzdZ%P?J0e^wLuyd)m_zQ>arJC*AR3vb zKR>Xhrj=q4G4vezm5n=^L}NgvJqdJLDP%NgHR&mJ+P;y7&`9l2!T8UkyHf7V!W!}e z7BFFjHq-37d$m%-V=6Ja3}TXy=XS+(n5Pc;b2l-B5xnVGwS4P_65T{GT%K0Egghxx zqq6uFy18_{qN@H^lHHDcKhf1Wh67ZvjnzH`zk4&k-55xbf2{fz0rDaV`Gd?G?v1*& z1=r|wQaB28MI=1>=da@ZSvYdINifH(C2C}ZTHKZpen~uO@SzKX?sC6;mgA{}0fJ2Z z=m+kdYKOxcv4|MBT~2#CxRran@Pf^4f;dy~qp>Q&zOc*M00Qq|SryeS*$}!gm|6U$ zPmviQIUU$``dO3l&3+{8H@v&lAtr-OBa0jAzw&c(Uhxq7iQ^s}dqFyd70`MRCi?L5 zbZ@Ml(I4kX=x^`wA6Jq~1BVOI&GnAsU8Z0joe2kYwKj64P^=G<@Vr1{H+bqOEF<5X z5-N6U=^L12<&mE#Uo3GaCTD#fEOV|}W6*9+K|XzI`_YPiR-lN2Q>Z-AVU6*IX;FY;hr#1USaANZZeAoE9(>tYiCTE89BGO6yZm|2(J87{InM!Vuv6 z{Pbt%=U2xm6zu?(37x9bHV}gjGCxMb|LLO$?9<2MJluYRboB4;k%TTMM?N7&MLdUE zy4@{4_9I@mzYz2Sr{G_~b$FkOCTtGG{L8oh`|Skty!|Vxp{zDA9}J^CWA>8+=3@!7 zawfuhhy4h_EOaoFzdmVNN_XQ>bJa5L!)cFclSehYd43Fe3=seRCBLJ^<0rZZz>}(h ztZi*$e}4CW$?-nIyY`0#Uxi^?jMR(Wm;C&=%Gu9s{L_OGoOv9ffC}LyGcz0xy?4i4 zKS-?@j$MPkySF<+hg%v^;1g}MV)^3)z=P$Ju%tquqOTF2r$Aj`C>Ke5aUk*rtcXOU z&!F+^T+B?3oso$tH~AAV$3$8~gL83d$>b%R+9Wnqe$1Jr^!>fsXXs!+7_hQanv|`+ z|FUeV2w)G?XkjuHT#gfem7wC>K6&$wF-6?U2AZ16%V|F^3gumStK6MMfRDerZ}lQf zM_y4;{LZ%)0|O(v3Mw-@uR~#2Ea5K(Z#UyB4{dDr3W$%7hr^k&nfOG?J~6qrMTU?z zgpo~pphAn?5V+2~w^y#(_^eKioHsnd9`p&l7<`SaT`I&(YpRAZMOs^h0i62R4!n+078xk7HXe;kK^@f^U(gO8 zHgNuPqml}Re#kXJ7mkZWDhxS$Vg{}7Ihm0|T8|_Y5=~#pYpfmzo96@Nqo)T;tUh_p zR|CjH#zb$fu(BZg?{$OCH)cAzHNwjKIBhpa>!GaTsQWP8AIA$`QQ}3*knMZZE3r5> zzaEYTUs|?n26P8r3}}q}KE&O4Eo}J&FPk-hQ7>o_)B537-)sA*m7>9C-!pIP8_l2Y zwjDtZ{OE8?)$AgzPwKW<_#7~EbQ%cMF zF;3t-^SEp8n7P5D=d@547-_rmiMa_7ju!U8;Qr}~sN(;&Y1)Bu+4XA~b=sq`3nXkg zsv!l%P@7WKF2rdqtp>;mT0yv)##u1}mbp1OU^NM*&dyOsM>6EACEAtnZX6usuoBHu zRC@+{gR2YviAj!^Y=5P5SFn88^d}Pvi_asc*J6x|Moq$i%nQMmv4L)Dns3XUu=w~dB0{vEK4oWr-C!e*dU!w{o1t66dfxo}t0s%& zEislU^r?Dgo&qA!VS>*k1dlhGz78XBgWMjxz={uV^0&rdEaQajEp#|NM@ozcYSR>X zx^5+9GVzip^qIYI{(09mMy7~)?A?OH*at<-CNyYpdB8KV5Q|j4X~}{qJN;eH2yn^a zVDhqA>%0}Mdj3PI4W~mgsyYW(q8w5kV|`cu3R8~`#Yd@Jxiijh(Ni+fDQ5m6JRqFQ zuTToD!@zf;jVleS$4|^rGl(OaVm7L3w8$l$UUP}?uinxP-?c8$6Ks6NnQdIgRmTfA zy&{(Gma;BfN$e?Sdg*#}PTO?(`e$9^R3!4&0>~ZHmrE9zyk-$z z`c&)rXh^K}DzA#aMat-(W+gzMQ#4v1rjm!wh29?=`B_l*CddUdN#8*>G6CVKJ7&&z zyVN(uBHf-tcVq^PK^AP{+A{K625Naj}M#K*6((g!I&OC@pVEZi%d~ zIuI)y>vFPyz zANJniNDPn*r>E-qcMfhgDLp&AJW@JbNLXi2R@#3e((f#YHVlZwmP~rZ*pQx!A=|w) zrLuVYL2O7KE8^y!si+fEFqE?9bGiD(x}p{cZ}BvnnXtN~v6(3N^dO109T#uv2-)2@J1$ zKS(WU08t9XnS7I*1nDnJGj+&Yj35F#E*Z((HU(`#)8k|lPz$&1JAuVPhL;J&YZoeC zg3`6Rhw<0pAEbL_uo zpPCPN{~w0W2xCWbv9k+2bP4q9#yLGm*_ir1K@YRzmg(r`TVrL{qPR`Vqjh3+wJTum zne3+``JyElk`-cgC@Hsl>y*d<7dE1ot5z?RtNL@6Cw?w?5I zl;2AxgwL-7HY};Ja2+$XM=ot%qJbfac}h3Udi9od%CSIJXz6PwsiQzI3LTHvnQx{Q{jN!ucmqOT>IU=PtycJ`lvGwU-g*Dy^<3 zNxT#@CHlgz2vs6B4$l|Ft?EeQw{T&!p2$8CQnRY3%uPI=1>nOLzqm7(V4G-xg#t*=iR4`%Nm1nld zVF#-NVfIH0*&)$Ue?U7pY87fAT{eF_|?p07gI}BC>ZdXnERtL3(dpcGdZ376GKgD%Zbv4tVaLIrsu5q5FcU%e&Te194Y(T)yZSE_-8;mXS53|MI%q5q?n|@k<)WR8>k~w^MsE7e>=mh!F!K+!MF(sn4;8ei3h2~=DRuf3HqC|}9=>6jk zQF%Q7LMNJNiELlkhwQNSpaRRj#kZz!tYhi;9#IH#zjr_kf^ZbYh%Rs+zM|%kNEZwK zZtu0WHQW|i#8hcNcJKP^*)yHmTLI28V|f{y$VE5znywt0`?aa5nbRdjx?=l!mbcAC z0!~KFF`&ENuOAssEdTG8%$(4kk&vEl*sZGzT-p)g$a^uWiApAfq8UJ+duLq^IXa^A z_H8h^&pb}$HYcbsY7U`5oHjdn!!_G-;nlIX-QgjlsGcicl=qDk#B;iPfIn8rfkw<9 
z@#L*iZcxsM&2$ZOcuYE7MZydh8mVvu3%`FTUBWZ-(MoB^?P!q1snKwK$?9re0^lA_ z>Z^-V5bN&FIa*1CNVY~YgNC)Y2HE~_l|U9Is!RFEvM&Bcsi=7rKP8&2h5j{Wdws3Pdtj=AzZpL?6vd$ttV>-n~dF=p2iJ;^lo$| zC|s|+_ifJUcOgUVk%vfN27R8_L&TMl`Q){a3&c}2!}E^ghG2CC<;Lam8;})NUJ=cz zP~n^WIBPjFEK?}7^arN=7=yK zFo9`Ti(Cys0-7W`!sKVz+`Cb094Cq0tYajbV~x#K9+)q;L%hbT=I~%&fwguTHy=g>7o98M6+ZB#c)d4f)w^c4JmzTdp+Exd?FNk%kQqW0- zk?7SqM9QZ)nGJhp68-)kwR(k*=dDmhGni&gecnTtFfUynot! zPTEn`A=J&M{*k3)U!e8MCy`}6;TfRYHrhn6{yBF`XC znKY430b%*n^9nA9qxYg4N-N!ua6) zJf`*iV&oCjM%7YtftKs6SyyDOK7^)QT4X5ewH8xG#NDP%TZ8e%L-3rNc7*wrVHXg1 zuyM8wm&H$Glw~BOnvl2duTm}^7w1!`_Jp^&seQTw7cTDZU1GtU^gE*e7IcC?Jm&9` z!xIw<7vg4S$9Jtv5ZCgOxz}Y9^*s4@CBpKe>_eTMsH&>fEbjHy)tx)*`bU!k<+BYt zGw^RuPTeI{RJKv>el!8m6jGC#S_Q)o@B1G$!~Y;7 z6)+Hl2la4qA-UV-3NJB5!S}@=_w=BW9%YzyMEbqOeP#<;z$dAi>s~Dtg2SKh{l2JBe z``(tXyaIeA+xli?zE1BV z10c=U2MEA#afFoW^JgwjKrAXXW0iyu`&I{5th`vuca7qKCI z4|Kn3%U0||YgSbXgvYVD=?<(+?BIdhV8Q3xD=LKrxVGL;IIwhhoS%Qu(|iX?357Z$ zzWh=jYQ(+ASMq$YPF;r^8u7SCJb3F1i70%NX`kQc%UMavfw^K_5@pUiJ}v;!W4P?7`_LA2kQMntTFe9LwH$-}7UR ztc<;I-nKzpxFG8IHb@fr8bjiMk2;p1QJ|}P@~JKPnhc6BTkUee8$?uv=y=#JqK8x+ ztz85H^)e}I4X5<478(_XBo?5Z%yDlMTGOsV+nMi1JZdkjMr5C=ar&B{>jfS9U7prQ zyl0?a<|?0pPl1aPtdP~ooC^d`2alKXw{~3i7N7Ndk%40>K1*g$qNd?Tj4*wb z%$Z}z?)WYqF>dT*(iJTFSX|A&aYb(ZAc$A1qC$#6co^*^EJfSyL2MtS2al&1RP+{OD4?c75C^nYH_iKU&*rUrPqKGNI zAm^SgBP{Z^3kieeilv(){7IsEkv(mj_{5o8$Azybqq0xp<3GM08bKaSc986@3u!u_ zRQ`$=D}B2%lPf1D^>flB^x>;AF%7wvZpPuY;gPY#mg< z19Xpt8|hY{h5PBN6Sa(IiQ0rH=8@@2zswm*a9$E0eZ?&r_3}Z}t_UHoB25A+iG!+b zj|RS9<8E)@m?fiH4#K7YS*0$8$>e7w>@`(ZSGRC}Dx6lKHV`a6{mk#h8Zdr5bvNc& zmXT-@S;(7g6v73+HH1s$OAw@pd1Kq@heMnlq84~_j1D27P0DNNB;&s7ri^L|nTqlw zwYT7<0Htx8@mQfNanWJQZ2lHTos)9j$jRloed?9>nrWqM7=Il9*7=YXSwTbV3G#Iy z^J(`L!vAq9x!u1R4{YZe=t;2}NVeY<0nr*4_nUbfuK=RHMzID6kNIv8T+#it_hCr_ za1jwm}LcKj}Zr{^VBbI64Spu{!?xefa8&9%;DpEM&^ovHKkBKw^Lm4E-r)}#VM8#H{Ocb6nxr>^E(eLxz=le+C6bs< zT$(+CEV_I0f2AM3i!W)PjCYF+xQg;?pP_;o^U+p; z+@xwWOS2*??#xt+^#}{X?T}R_fEkZ}bJ(!2#VB1>{;?Eh`GrkUJby6QIdWUVB=PR+ zaB4ZvI?WA?5flumM`~@Cu$}UKBZB&J8sCd6saVAC7afw@Es!_F9imVGGlMFYW1!`T z>}>*+E(Z&sIZLuzw#zgpfy3djDIlW#?(R0yPX2+g}T3jk# zszMR!K|&IC+X)>UJ)=IgTauZUUG=Mc0kmS;`(k_XU*6%%t+7{qX`o3f=bG4{^R7Fx z2GxpVj$DUWxonRS2AXe0$roHdjr!KKB}rWU*?tB-a6Jgawhnt^oxKW$@&DuNtfQi8 z*T1iX(lB(#&>$fpjS@q5N|$t(NP~h5-6`GO-3`(W(jhI~{cb(aIp=rIdDgp_e_RX3 zJ+rTU-&cG;-+QG59Tpb$Lw4@E!O{1ITz-=@K176g3LOKDy2glCl3a)TKfer(5T>#R zbX5w;f`+kcj^I+6R&txXjpXqnR$rIsN)qQewTgdf9dv2gnTH>)DF=`APlIscWQ2bn zW*y@%ynL=4-&wlxvOHgrz=aP*Tud7#WUwi4hcTFUIwK^=_o*8avEoXlpm3YLkS(@pC?GV+Dug><^F z4JiFH`RLha-n^B>?u6J`@^Dxh&@iw7QN0JFu z{xm0+0^bZ?5N%EEyW!}!lvWQcP#!x>RTn0)>s~!j>rEZZ>_pBJ-ulB-Z8@hqhT~54 z_3O5)rAkGnPjW=Xd@Q@`OmHK*)Lt}?1Qm#2xA6v-mx>|ucBh6^kM6W9rfQ=0K^NX9 zd@P6E@G9B@B=X`T{(V1vS<4g&PP}!#K5+$i^XZJ;rOsYC`ufzv{wu*$SpSpQWlj)) z#NlPnZb>-Xj9B=SZgDJEq4k(ga&0q+mC#8t7oa7~&7EiTb4udAwu{i+t9Cj?#yn5D z_5~%gSWwG1E5!l}LQSuzd@Or-ot*!iQK&0i!GtUO1sBu>NNA}j#|-2*G&<5%_IirS zlT*SP5PAFhR#aS5pUVv{3uiZk?xOaMSc5JG$)J3mh&Wr}SoA(INm8AC)=su?BbxN0 zQM1Xf5mMVo$M2r}erUxi8c!0Bu}sH2iksVn-^oHIhF{){a5X`*1nL-GrG18yh#y;x z)7X;+GJXqR7W#SF1%2y{8XM=y*bSD94=T#-Xm&HEK*ljp%=DhyTY~a<&;SF9wJo+| z?_;8-XD6m*2(J~LufUOVc(rcv5o}bNhc!en5VH@hXF2(lv23s%XEX0Cj1ehaXem$% z6yZ=1kubSqnD=;#n$zHMv@(q@(A74)=n)^I|wfVCUZQ^vP7)k*R=0~qF=i>LsL0eHg zW*>MFAr|rh!(xTrPz+WowBaH(jB)+v%5i}zw)5`ATY;#;QMrO>?OJVRkkw+Kh}pMG!wVqrpY{` znkhWgeU+-}Hd|Q-j6u9mb{2sT7TmG{llK6)>rbz;Mbo08L;JoSjzU)``y z-CA0rfy6x-OCHf$mwJ)R%Mj2;Ba@^?Nj&8&qwLmR-1iQ@tUSt!@B2JK`k5-eAok8H z1|w;H>gs;kMH8=TKzE(LkMw|?^EIYrc0-@7!H@J86Vb{TjaH0SZ_SyPJKWRIaLs#M z1*)jlPj5$ppZ4`smZ}g6f2vfxG+fWoI==PFX0~8$yJ~4?F+cn>I{H(U#8~_ue8!cX 
zL%9)wV-20(YiL3xLe9^f;9h(s=cdB``t@u6jhZT8cp?Ihd{-dT_wW)OgcnbKI0ZN% za2DRRE11xEQDizz?a$g~>9!*Ly#Ketvh@VO-yv zD)9w$E!jn{q%M#uy1 zQyiX+drNtX)fBANRbE2ft|2#4SA$d!6M7rr&Y#9ytD!I4_os4+?`)0bWlOw1a4M0( zBix9U+~2>J2$!9L`0-ra6C<$hRQyv5U_yxLvlNxkOh|zQ-bsaeSbp$nhoV3Vp*s zv9570Z=|5lRCs8~>#2x8Sm|f2?(I-bJ}@UNkw{nSo;!%s%Q(IL0bR7mXYj||+z!Dk zh~JH@NP<>DR-k6A(MYLa&dIv{_u2eNL(XkXz=H90wmO3a(6_QPuHAE9dpD_^-2*3` zNl?25a(`2k?RJ)|lnnWBd~hKf1_Q1cv4Iy5X|P^CYB7URIpu(p8P`^yrNZQ{wxt%O z%W)^<;e*l%)_`^RU0RoD+KtwH;3j)C$x&NH-FZu}WH&`8Y@vFuS*3 zw@idt3nya`G%8^CVmLm9>{EVtBPn~L{gadI9l`>`0`5ByPH-P=UJH4x1D5^#-S<#m zH3mVgMp^8$2#K8h+d(CMt!BjUE~JPJH^)VihTcu{crN>*$rybv)sck66m6(l9HHES zaE9_tGsE4RIHR3CULSu1_w1ZOVju1xX;(UYG?4a0)$<|xb;8-s2ed9f8QNVfn}OMN zf+<1127#)e)pGu}vuZH&ZTyCzb-^@utB~j;wN7*5(Ov0iizC^;bMf;(Q<;xBV%@H; zuIbu5=!>nM=mq9T+;a;QG&Hylu1I^+C5Y^f%5X3I$TgV7djQdom6hKa-jhFrh`-4& zVNlz{1FXMaG=UTdizvJ|bdQ3bp1|esHYlcEDz*hggo6xeRDw3uUyJhBKPlLP)XSE& z`ui0q*l4oOTS@rOV1SK|5n~*g1o)~fy--CzK~W~f>VqC3fHPy7WQXZb>hRlXkwdJr#6)By?=aLNpLN{IkukjzE3Sh z5VN_n-lSQt?o=m=mDt#0T&K&an}1BhXtbaOyO8>i&bWdH4&)Y|H)2nM1M980#P8~- zd0Qm=r+v)d2Et$~-pO~iF5T5%?aoH<{0I~nr-Na<+0%*Cv|P%}fbK|p62hgGoSisb zU-c`{^>~?6mDA;XOWLmcqOUtB{)83Q)xFNe#kda%qvVlKj^x$ z+BJzhEt2d@VFqLE{)k+0&Bro=_vSNq1E%k{%}4Y0j#&2RZ>h{E5en?!1GU_G>DQUc zOlQXC*+!4}OQ20lh2f=O)*K1(2M2BJ$NnSF*{jC}vVSK;ynhm+Iwi{WWa0zG8Vh4l zF|m#B%HDL?QcF6`cQ#nSzt=jx^sTeAvnzAt>k)Z*`9|dBW{TA?)L*SqWTq~-(=s3=v;3y;|MeM8v`}(r)!AQ2;a3Fq0zXXzY5hf0}Uv>4Wm{a*0!bfIR<^QbcoY zP*}CY=Q7L=9OqJNVX;=G#BYkN);->;kB#WO-=Yhy?5Pgo^7!~_>;NQ=8Ioz_boX6= zOLiVFObN3f2z)SoS-ukEaTSm?354MUym=Z+6pXz1FlgKSYPs>pPw5J>Z>~#wK&AiU zbDLh4i>US(`o7}ho#Ln3Pr-B7sYzGsk`X_&b{UwJL`ib?72Q(7RN5@pR6n)9de9eY zN9jl66Ly;3!BnKFbiSSK#_wLqRaIVeE6ba+N&|zAnpcQ+1HU)uQr7*>ZCDuBhb&Za z70oq4FGcA2Y0paX}u?|PN>TtUe#Ib2x@L~itBY@O<1{Lk#ZTr$_z)u1Vwf7+LNVjS9GR(}BORAw| zi9#ig^xE+=N%u>|Z0&&uTGkRp4VR&U1zO=P#U$(d6K@5B6RIUa&nd|}v?Z(g8iiPg zs$g!lTd!`)BBOh!1_4Ph{y;kX4$jQWpB*Dc-zgp*8L0W z&Xj-Ey1q)%fCzMEB9;&&u6%iJC(__>7$KD-AtTe7(u7VPiJvaH;r8kJ7_qan>XP2= zDy0`894cyIK{(*^+PzPtcG@lO@ZZCTkd~&BAxr!sgn$Jqg5KS|afmb;=y$>ho`TFk~c11)v!d)+d z>`0bjF-2f?dWM>}EP=LS-r)g8!&L&ES-Pxvj(wD^z`M|{qKs^TQ5t5ypn0al2X^OiVDUvT7DC*zzC;#$S$bOj2hd#?GnBX1MP7$4_hP>ONSwE~Lc&6(1 zl^G{q)Mnzs9jqC0I6XH=^1Y69{iMC%Mv!vmXeQulJ+M*WWlblId$y9IX!24X)9-#P z;AqeLy-3y|myRJyo@!a5RCTEZgTOvectR1oz_%fhPXC>7#oh>6gvWw9Eq2jEqe^zC zAR&j&FC(>sS>wzdSjf5JAR{0YaxvHI{guflKo8Lt2S*21eMYcZPiGVNrNj-wM_|KHL_Gj7)$#ay2qg z8dt_MEec1J{@pKHvA01$(N+9<^bxJa^LXH&t?<2_%b@U>ff}tUp?k$S!uQ`RX zUO2BFk@-yleuk1RcO#zTZ`}u(*9o~wk&PQMXUkSVPv~w3^R|Hx>qtbPZ88iLEOuEx z^nZw|*Jgnmnb<(nWBCtt6+;laj1D%9V0dWk`jx=U&Kp)SThMIlnJK=2Iky}sUua_V z7Pgm?YqC(~7qDx-f5`*9fu&2=+FCA(- zR?R8o%_bd>AXo+$A9d`RA^i~?=@0V~i=t%Qfe}Ms9RYrMV|uQ$qj@vAu2Cg?zl%ACUo^x zYlm=P!nQ=19?z+-C&3=K4~@e?;dJG+Qa?pwQZb@J~g zK#4J!e_KkGq`+KRe|M!iV7}z44MtK#igTnSaw?F&wCrRWN>rUu#o#NKBEjRfbIQCN^NVCBOM2t_(vg)^@%b`b_EuWZ(_X>-=V~Tzy$X?%jD8fTWM_E zj9i4kcgKr0jb_3J2{%FYILaBHa5ETZ-o?5wmL?!=IFzpINXSbF9~Z|$pY9lT-s>Lm zMV|uTs;37Ek?`Mbn&$E*? 
z1?6V6l*X4p!N9{~$($kN$ftZ`-X;dY+nez#NTeqfaAUaipgb`FWEg;$J21c2M3#tL zHlrPOrB8t&hL!!bkdu{mn811xr=?GJHWkGs9Uw)8*k&_MYIMh@f>6%<;6S}DE_Z;Cc??PlRNR&iJ{ZT zI$2(Pfe~bHu({z5SmH?2=q8kds}$iz&aTo&wIwB@xTV~2BQ0Ykq+Sk5NKPLxUfd_| zHWHXifmHKs2dztQQl!BXjvga0xbu%!f<&jnZpOMsid1B;GtCBm$qvzibK9pjw)=8S z``%eZPzuCf`V=)vQ?jSoKA(&a2E$HCgeg`a+&s{_scyf5G9gL^L1X$+APg60^m$H# zlIo46i?b~INtlQs$-hKnjHLAQq3_*W$LDgVn;}w0Nl#Uub5AwHf;px*AgKl8kVLrG z5%@&AAIL*9M){~psUy=b{dExTZCufO3rFvCjW@0&kdfd8$Zl)zQ-6GJJv(g@6FPF& z-ImIE!F@k#J9lzsW2Mukv$C-2^)GSDJ4Et!Fta5{DirGI;sWtU(#_l}AjA+p*eY?C zkdy?OXRAD)9-2~FOq43=>$qcCWZA^PqwPCHR;#7|v%C)n@1 zTMXOz`TvpS?jR^Q5gz!*<6g_Cu-D|P;W#iz zZ*!4iT2fp=5m~sj9xIULqUzpf*_1v&J?s9bZZ?Jj2vVsGXcm36IFJ%!oyz76Oc=~B zyk?DfM-xrKBpbQEVe#REAByS=HYX66FjDMJDbVCu!o-sllac|)L>QU{n)Ewd=C>g6 zm(vmLOI>*=EzBN_qrYWnqj3MY;^yDB`UBo?F_ri%6@(_O<-9|^)t@Kq(7-8{%6dQ9w7h?ANUNa$zsYQQBP}bsSmbJ$ zhGqYAp5u~99b{@~;m2EJfncA9`&lmMrcTPgNZl4$u@8JWU92KWSW_HWAJrYV{kOWS z@ITtRL^vecb#8F!FHZvO&9H$aDErx^&tiDdD9)zFs^h;>z<;Y<()EFF?vE_)T40UT z6&P0ah5viz`R|Sr>+=VoCiyM>lY6@IcNYY-6WVg*wU-2+o;F0a1x;V5*Ew-FUhKidA)n31=&tyI!D^Z%i@Tb^u+DK zH{lM4J5w9T+&nz1i$9RLE1z}-(Y|l9|NEo;p?wsP{t{D^VmIegI|l(u%+e5dSL!$` zwB8_BG_wiqusvs`F+l#@5_eQ`OK)~xN=y_tw?D17B4^D<7Zjn>D+NpReNM``qp^h6i`lZbDYwB)j)CMTnfh<*62tn1r3xy_wxGlcN*x znM0T()=spf8~HVK7Z*dr8lv}hS$1Wnb7kg%@dSDC&b)vr|GL0@Rj3* zg5i%#ed1*!f9s-OW5ud$Vlsi<@>L`!vuU`7i<+p)tQUQ(9v2ny%t&(XLIv1!&I2ss zPhfePg+QvQk2d8!89Fomgf{;P?EHpw6t=ggfBdCr`iJ`N#E>BRrNMtI#GQVJ;BV>Y z?k9EWpRWMwu>KW$y#IOty&PZ~f+}d6W4O=2r;}$8O{rK{K^!;3kll}3z9s$=^m^vH zKOKIqzN+iigu3xfwP5n)F)Pc!AeI6=SzA1{h@7)}NrdCCiK?V;mY+B}G|miOs-DgZ6sA$aF}4V< zgQH|p{Tv9pOO)44CI-7Tk-mx} zPrj+U3sC<``NQRfy*KPqlEu~o7_gB0H!#?d67;3Wd3cc_GlnOa+gmg+QuXTe?9u}PxZCe5$8vsfaNyukm0cR32LeKsZ z?I?i7{Q8VpASr!<>p3CrSNrE zT3E5=Bf~^2=C66P;7gqiyF)+t(qyi#`a}>byL#YBwR<_qTm3*1%L9dRT2!==U3?x+ z%(jE_-qbYC%rMR0NdK6CTtA1DIVYPcvhU2%nL`9Ef&NV?`Fo8bJF{@k&kx4ny-@Js zyh||Fx}_bS{L=dk6aOr!VGa~4Jv4w>_bti;_jx{6{&*2^^*(rNv`&kNx$QRDFq(a6R8hCORTOkTYhq5?0K}u8%gNul=4?~hXD~Vbmhi=+yQ9n)XmtP z>0acNyWX{bus?{dc?uHU&A6W1ep>1>5JIaim!Apj;{*d7h=tQqJtrb+5wv@C=!@6p zM#ts>wfF*NFr_-JrB7YY^q&F+>OD7@b3?x={(tV45$I(ga}Ye{7}+yIp}dJA7zHNZ)dxOW>|c)U!$ zUs4jyh(t-|9z$JL#THJIC0+k2?h@)xwcx-3@UT((RJ~?4^J0oVR84d^v6=wEc>QaT z``qfx5Xq(i3F*~651a_EJD82XsQ=3)Ar-0mkTW(|c|p)3jXd+xB-ByJsUD!`q`Qv4 zM*`MPs9xxt52sSHISrgotcdP!=GIqrE$Nil+xE}()bPLIQ!;VN9)w9pP`;q|**gvD(_1XxLchxHQ}Sqp>FS}pvvL|L zYQzp7{8x7W4>{>O#6MD3ZJOtDvP`~5rh#dgzyOB@bN~^%OL0CR`=akn8FF;Q{NkqC zk-rf=L}+$~2N`Ba=L{uIjR1c}HTv)eVA&yf%MZ$GJ$(mlqv4!Z$nBlt7hOE>$zXl0 z91Lw()CBrBKOo6!wUN;MF-El5qMQs{zEJ#Mi?nK@4T_@L7fA&>i!8V;^KKNJkm%)x zwRg_|lM16tZoCsmMM(0;!_ei}tfx7;_gbduT1BU3SH>`ns3Ah}S1nawx3s45lD>NY z8(Egs*sc4~G^T5>bK4hmzVNV6Tw$a$)9U4~xAu87LJHOHXW5F~<+= zf|rSZmpU^^l)p2mG27O>Jia6jbE=UnZ{;StwaRNgZf-asAqs!eDvp$PU}2lS{Rb!W z8_CO7;>=}1_5T*2gtx2KBFA9z3KggG)o}r6?@jE+goa}^kixXQ=+g)2l8Q3zJDZdA z9bQbmE1Lvty7gO>a4`U+#6O1cRE#F8L;UB+WEUj5Q+N%1J+-XK<>SDVB30try5N!T z0y^1V%t?n-nP-jX!8jnV>NK5*G?14dwhy~FQ@LR(rDm;c_iCUb*jmMXI1^f*gsP;_ z8U@pLsS#+#NtmU4^d3|p?#~i;gn~J4RfNag{)35x>8)gU#e>>sE>7zXsN|Q$HCdx1SFSK9|s^auRp0FdqN>mry zrLG}KMPW_H*el>SrE|XVz`jHt`3k@3-uOX3!%rP*vwv570Gy(0)4P%Z)Yt2W$}dTA2%4LB zW}w?3Dd_6F;x(=ff^TpD=B||fjW(^Kuk(O{y5d?)HrWA%cwhc?_wEW)xkGCf+SHR< z?^>0O{m%4dkHC8V!0x&%#dTwJUVc)c^#$SJYX`UN1lpVklWY;}p!7_vsiazoUMM|i z-EE4_*|}Fj{e-LLH-g5^rHvqCu`oLq2!o@tR%=nBaza3o#ZaRBwSIFhjkyg+x7-7b zux&}9zdE{4ZvTn(8MosBx7sW1FD(n+OjE9liT|Y-_MHG+b+l)0VTem@mm7#T!kAT=DfVxcDbZqa{;F zip%AhnKoY@N@UbGEy_so5Ql{gS&Q_qrVwgFaE=6Ya!7e$agg5MFDt0p-&`fXBTv(K zvC%7!HLqi?AEA$hq@$2}zufXHRXAnvf5FKmINdxbGFo%qm)HX9%A0S+!rur1`o!N! 
z%f!^iK#cOkEvYQis1-y&&w;+U7DFrwvr@TyU4cg|sEM_#t@mf{mDd$FJJHsJ}+uB%AAK#GOY6rY9AuypX~kE&Oqb`{xN#fqR`G`<^;t5n4#5Ar5@+xqV+M zyH|q1GJIWWLgoP;;to4M;l^D9;RS*MeN3-)%;@?>`z`nlvK^U^2 zE?Z#`_#iF&%k!H>f?g{-@1kB%L^ZdERw+p%&l||(x2lMU)Kev+v+w`vUi}jS@Z|zH z+kaRsUB4I;?;dBp`3`shmFkb#I*2YVltd|xX!)k>TfYVpu((ls?979_gANHPw)Six z8#|V=?seIY2nioOt_A&BF)f7wi>C@j;5;(;r5C-c-(iikZwIwsE-ZEu-#spdHgO#p z4$ZD$HzPcYQ~Tw6eU2@Ie2p_ArpB$}$uX4EoAB!i8BJ7g(h}vc`@hnu-&f1>C(Ofu zJG+A7A8F?Ly}mOfqc28E6fX?welb{nbM;J{+cIn#$Rs`TXfCR4qOrBx%z~)dQ;?E< zVry``_$r8=d-?86x+?fS6m%_Hh}ctY;lA;I3dMkk!A_==>{dhYiRzgqQ<^G)l4E(kWT-W4>zWrDER$8 z9u0;>WiqY!Q9%S1u^S;NqkF}MRItqWvg{V@&r}_;km@jBJ=NlV75o|1I0Ha3FN*3n zzbt9R9*b-yann`Dgr7ziQ{SE7YNn zoh&O^m89#DdN1hmA+^BQ6p7(pGz)v}n8f-;}vP|nU0=Shah3cx3;5GT$ zHKlbd?NW9UMiKcyIv$(<2`>QjM{4VODV=MY|I+3H6d2i)A6gQQOr{^!5__kx6UQfL})aobuf7BAsGVApX^jKab*?9(^{;jK-LjL3h;TOKYVcpt; z|GA*{+kXA!5GpsFTQISL%_%nM`F^=JNHycr5XW>nNiuv;b^QmKcAyvy+U5 zZVP|7!Xac4B$Uxx^h+6}7N4P=~h{fDVm&qj=0b^$kLslxI9HZNqz@c;+HczFNc z9(uywk`WPE{ahB<^m!#;35kheZsj!!!jkatfdKRRaQ$yB|6gSS5S*XB{Aq>)By})n zbH@rSq}3VM9wOF9`!}Dgok-#@&dhc-SWml7dqz~--hR#gZ#G9C#Ghh+{qFse!Uq^* z+3rMF_jiYQ+8;vg4u*bX5dxfkQ{WtF*MfcBl7>lpskO>AH+lrjd)4p)Kt)jou zD$nG2ML8PzH-D##_2=&ZxY6_jRadobfZ^sI$tPUG40oiV;bB(u&UdeF%-J(D4aajP z`^@IjX?i``pEz5ha`xE()>E4$D1`5FRLFzRjdvu4N3Rh##`M2y-`|UMSsQAfasfJ3O$$8MIskNWpxlJy>sD^ax{`2Vdz`*#doqyN=VXC?#X9W}XL z{(SRm&K1FO=E!Re8x|H(g~2&h1jk8xh36jc0Vtc~JeR^^4a@4QydV8bh+M$jtUue{ zleW_QO<2UR{?*hU6z`S=j>!%soa!{yLwNxiEP4NiB$c!j*aU$yZ$gOKi2z#gnYubH z#ezy9%6+vV(Hq-^@We^6bn9E z;WDe_!D&zGC2(s>f44U#696tfHr{2g$7?+Qd{pai0 z2~nY2u{+ihtXeiuAq|NCyd$#%o?1{JiKqSM_UkHQ=@R}C2)oobIF~ya z&USX?J>Gbw2Hz`(>zxZ2L{kZffcrasf)6J8WS5fB{eB za0wk`T;KAzqF(kY5>zW)_T}977S+@|bbU&GRP;)XPP*Ois8J}I{&dupI$!bhxG%2V z;D)^99`27}FTBNRwy5jwu^9g_b~%arsYOwX=W7Vh*1Qh|GwCWcQ+L8K;j_vQi!%<58-JLcaBmZ z!K7H`?n0SJfy&x3OWyIcpKH|07qFZ}wfAF7NJm$mMOwK4U;H~2%ZGARJ^J;->lf)O z&t%WPnSm=0e~!xnr{W<4rhi|0kZCeS;@1GnJDY+BG2rSv%)?Rzpt41EhD)cRH@tM% zNcV1A&rncNUpKGdC-b?tPL(pziv4Fq00ENj{pSnCwhy19WG;gax~ia{sNnObGYTHf z6JpV_bjrVdqJb>vP5{e>HhG=d_?p#S zShjwN{z4)weAG2Kh%=nzGW(=e)(4_rN9yF@WbbL+XtJ{#rFfOlQ2*{F9+#=r+j7LE zh4ZgRB;Eq0F9N{NG0Fe)MRx#4xBmOb3EQ42iK}BO0W!U5d86iOFp42-?X5U~cbUmBK2uTd8`nC<*0i_J zLzg`uhx^pEmyON~=p;>H5&qb27iPiFA9#gxlnnMS(@vI>V=I#^CS3@H3sm|Yy%rte zCQa_i%cw6Su;}w#mDPcbDmyha6o;^Nnv|Zt;MRNpf*TCcW|pbAoVHXKSMq>?fjy+4 z6mUWBNY({wmO6%C&egjyDoNkW@149}Q~2*?;T_D6Ki}~{4%B_o_8Btr??r9>BL%&` zAv(_y(%~^+eg(4T8J?WP@%5$P)jRDTmj%2Pd$(I!r*4u>rz?PAtZ7f=EcJihIp9Xp zk^Ob^q`o&XgSWq;9)^{sXsuOUezBkB^42t>Pc&&Y{{ta!_)@K;Pe79#|0V)p|M?~z z!S|0%w(Cv1Ne`Pdwu_hRopbB*#}<%LpYcxJ~1c?*Vtf!VJWAsjF8w@ z7?)B4N_xdaTW#8Sl<%wa)L2xsn{O}k5uuz#@0oB43Xm3?KYXs{=)oFQ&w*i3llM;% zc>!~XPeIjx9N712_jz(?@v7m6FRN2@i}Lbgx+B8IAwjuv_3?x$M`y`sy;86$Z=A)X z)ObHDWkn>#D*jZFWpM4d@3DXip7Ry2xZw~0nd8%)L*#Y^?Em>P+s}l207o}q$W_e$ z)cZbjG-hs6=6U`m*-h4#UR~we(HoFOeib^y2h$Vznq0?2z4=79Pt&i)H5`vk0Y`S` zx84*UK#Q_7DiQ5di#?{rjI3B4?lLcvQb~NYVw)nS8cX?XL0f5xrU&863gfj%fd;mc zlRHHGdsWd4k^7Hhn1ij35q&g6sQxw9G-4M{vlkBGzpkh}NI8$IsjpV|BE#0w-1RS7 zpA~DEg#*si^H2@!BPHZIoc{4oi`vn!Y zYX=j$y_UG*maTyXwChb;>Eh*WbjEL;J>Cm-9<1hG2o^IqQA&u?`>(#6%jj{z-B|b; zM{gnS@SL|cqmS#w()B6suE;8)a1-hG9+;8$G9Um8!5w*ls@=KOE%^_E)XOcg~`91Vmo_hr-@RS#JiEj zc((Qp(>M1G=j*N}CgI5qh40;^vxFUvaM~FL8Ts z4%#DL!=7mL$s;XJ^glN`8Qt|ZSh>~tn1!iqlP;VGTd-@ z`cikTP~dU@4#7X#@?tUNF3m2A+Qse7YH}hj!+N&EV6Bryh3=2GZ5@QrWhzzEdq%($t+UTZixDCC_D_H<|t{m$@OJ z7zLgN)pIKH!lr_!S2Mj<49yKP{=Gx=J5lEc=GKu z47)m9*9bk9WLoH1?=?1>a-)Zsf=rF05si#$utPI12np%4@cEh#n6{fQ@>%YW2)|!k@B8_y zA?@GtUpiZV&agRJb5M#<>%PI~(-*n$ngh;4`l9^}N4mSDcmgZKyyBWhADwMlgH;Ki 
z981IQb!8_@O>g?y#pX53IN9`&c12UKen@>2ym%(d2DP-URq;HYbt!Xi%};lSTN)`5 z0)i`jttKO$MgFDny@SyYIaY(}mKp0;Saxd{m-)?x4499t(6LxPXGpu(|J~0LIT9y=1?tQ(kydgFU0NuQ4Lk2DiB-PAI`i%T`HHbwV1$ zaSmB06{`>=^(Xeox5&Peo)T0RGm^eah}psY*}8`a=$!37ER@-Ph^QQ*gJBg)307Lh z!C7y$*8Lmlfyp1z-YfXp;_6r<_*)_i%B1ciBU8VQJPU?f&#j!4ZFjQ1Eaf)3n>+ui z+<7R|$z9ZeuM#-0zvOLr;$TdVi&n4EGM5}TU_lOsIGk78O?kW~u&9A)dNZegB}Q~G z7rsu(TQb)ScbIBpHeF4^Hi0Y^+`0UmqPy8+qTD&QBxD_HTy;DW{%JojLRZ+!^HB(1 zr7)5qCmDK)_1WS+S)uvbf=$%q(_P(}xQoVnV}qXX=hyr74R0KzvsE9<*)eF*wN?+% zF`Y2&g?TwG2?D}~J@oa{dka0Evt3+`pi+h$?4wB+$#nM>{7h;+Rtg#cGqYd;PH^<7 zERRME{#n;N6heXa4~yhr%KVqy%zcU)1wR6NX;WiV|4~swj zDH7N_5KZM0MOZ=WN7GzUtR`;nGr*qAUK6?d}-^$u%DV&|iJ8n+N+mg3??SR%^D?r&Ba!xI?Fj8KcJ3{WcPHP2PIFBkgxfCt%O)lB`Q>)5+ z>mv`@|H*-U`>5W5;*rC^a-WpMx&^k1R0QR-pp%|UL~Y-EIyNY?J+PF7vv-zSO+uQu zLMtv!z43`mJK8ZMWX44G16!(S(ClLkEqBA-;S&~Cm{x^ie-a1~9WcvVeI^1pe11+c zv@Lh?$7W1(3h5vdXF9&pGhx$F6qOE)t1_KqxqDS%3@IgkCDR@F(CwGJKgLc7AY$}^ z4)d2Atr!GnTxALU{e{=>siQDsP6b7Sa-e&d<1*kzDbe(n6$V|#CZ~*=gLWEj?)kN> z-V$S{h}L;&)F@UP6MQwM@_6&cqq;DWBZEk@+3uHR95S6dwq51!Ur+___+O+fU&KZF zW(Mwb@K}D)H%tiPD0uri&Ho)g|ITmptv<`?-0OftKU1A1@;A8cN>ogiq2A=ZtAKb1DcrUg*mA z-aI85c3_8|Gf8zLHq@CZV=zCNekL%|K4tozC7N8{mrK@Tbc%qd{Wx3Qt`xNWUbJgd zbbs^Um=#zUt^HfX>$+CBo(arfOr{zZuywQo_PifgpC=T_jLn=(dlVG#kOWh%FV~g$ zgGiZJ1y*U#?QG!0DJ}1@UC|hY<tzKX8P7?ZMY-*DG!_ubr)cx8mok-AzESR^9 zYLKoaSuFvI1^Q&mAwA^Mq(1u1Y|_Sm4(GEhf=UJ1Vgq}wQj+z}G&-PU`tCG4qhmBy z6dNxif-^dBZdGte2cpe9$F)Hd_6R|u6m*H-&2n7;;$nyPa+a~Uy^A)*(74(}&Tr|< z{xvIz_LcysAUTM%`@r z@HwP(P*OgH%^il2%i1rDsfmJ$TP0pR(Nn?)?q9EgXTa9A2+96k{=k-nHv= zEsS>|3`nTZmQBC%LTx70mbWLJ+pv@;1`FH-)U=sj0^|_HA9^L(45dZMf@m&nupFeZ zwGdiN5!kCh`SvU zz8I~T4Fj+ocKEfIG6q|f4=BwN{EO5I5?J4NvB3)@@mv=_=ULOjTo`j{U@CZ1_g@6XA~mBWXr&W@??R;FSTX0z|GuDAyk6D1jN7;b6 zEV*Cbw!If5)NVcATT(JqT!zJ&mx0-2vk7b;d(dZr8#w%v(ld)S0Xx zL$C{;3@&PoKx)%UGL_Y01vga|a~}b>Zb)QNtL=?0dI^G-HptXEp8jfKp{eJER@D{L zs}Oa2(d6hzYC=u5X*A`PO3Xbi)$YtyJ$p_be6DDRtl=Qabw4w{rwMG7mMQ{qqtgBp z_l9PFu3`phK=F&Un-3(P2D)Pg=l0g0<$vk*2E=a<*t|f(VQ%KYS+Xf^ZxHqV#aoHm z+xyk*LQd;LkOjWg|IIZwzdi$+`bx^bTdm*l^z^u-KzV!sSAYYu@M(Ku{UVjqo9$0w zcP23_*0^c)mBlYHUqz!#DC!lU*!f41Nnn(++3En!kr$l}WXR7<5K#onsSH`ou#tKS zo<-daspiLg4fftXoK=m1Ryp=}|1@BhkW`fNrMiT=1M6I*L($cZbO%Rn86S#@g-siS z??1Hw99}HXuV7G(W0iE~^AWogD-7XLc3#WGABO+hSd0F8v;VomwSk2BKALe7+=YsP zm(_Vn2CHwCDOMKF2neKf#PUdfSo%*R|1u^lh)|mp+eWG^mhpe}E21v2T3zP~isg?m zb9gnI8ye)i%e5z%8|Dyc2}ZJpg3(&UMF(m~r@KfJE&OE|JynV>94fUJg1x_WOz|Uc zZ2R>G<>_dB8hkkVDv#7xcyoEW-0c?pa4ITgGveifj%5!%Jvw#8(txAPqwX!jAT=b< zx|WftR45awl-WM`p;XnUqm(J7rhy|XO5!!c`TtmZ>!_@@E^bt$LFw)gX%Gcz5TvBL zLjk3`TRJ4AySt@J5Ree*M!LH@zr{J8fcJd&-aqcxj%PeLp1t>4bFMY>H|G+XPG%D> zPmtf(gVZ<>9D}~V?VN!oml-zb0ov%*cU8anC+x#U3N)BeW~zZ!dc8Vpmzdt#4$oK= zG794Im^Yrv3OP-xd(*!8_FPb2o*#3|rb~2nb?AZI!{cSzpCzfPT2FMdv=^repULKx zq_>e6!}A6my>n|C?!cQ{+p~N`I`#GF`q7SNQjk+_zkPQe*@)s=kS43rg4>?m;#kYn z^y_f37rQDUFITxsU;&`XyRkQ6aQe0REccVAi&DG{n=suQ!vfy9o~X<^6!X_Vxo`Zn zaFFf@0eLv!+iGV4>Fz|ssG(8|Cz!CEKb|Xv!DEO`3GE}KOjT+eUL8?(UmF5W*apYm`m>TnxS)+sX5_Nto`op;sIS6;bLgzQoYPy- z9TyqDf}g)RJhHeaEFM{NVU~fm;A)y^T4CVXg6~@u8^w&z(}>R^408D4x@SM zCSC~7g$I{DCsbA(#yq%bd$TP@InhBqXMa^Fpqud;3(Dacg}i`z(Kzu(c663VGpjg} zcf*Pdz1Pn=W>8w{9Q4Uw5GQmgX)P*>U^+btSP{nXK0RRVuI7LA>2MQPEm(QlVW6t1 zG(n%oqN#rzwGPRQPhOWQ6!PwCf6<=2v=-msifyE8#kCpTflA-np) znms{#XF-vtl%#e@3gQW+HI8oM!kcHdPtKo%*h%PGmKcrQYxIqEII$2oUMS*pbkq*+Rh625Hy(1SFmK93|M!i)L$hRtD;K-H-wk#E>2(s}BJ$`U1i@z;FJZ&Y64 zHq|42mERcS5krOP;d~%m$GsgRm`F)BV>l$~zGI?eBIB3VL%m*Ir^F!u^;SGXRI#Kx z_dL!jDc#D4iy`f&Wm;DN-ocxYPyCND^D}>xYq?Ycyz!i?z>~CeOF6G(YvH@Y#J<=_#AqG_?q)n}e7Y!-1pPG~NeeC)XU75ZTz& 
z8ve7Uxja=G5C5Cp;U*=n&&ub2rH;5DL7Jncj~ZkWlD^TNI%|C<+dk9e?MZ?XnLQ(a zd(92aExMGSzkf(;ReUs9mwJl~bLI+N#6LWIurK=0 zQ2!DUO98hDiQANfg#M#>&F3jlRvo-xqSq*7m#iJK2d8z53It#n^}G=0^DL^zp%O&hTCe7X2QzePOV}pf0aR2 zQayCZ>J74sWOlA1!S_$C z(D>1U*S3&m)Px*yZ{EP-vQQ9gPaEJ&cc*uxX3^KcDuq}-x|xK`KY@PR&F@4oQAqr5 zO<&(1k}6x8=kgN}BbvmiWjZYn57u3*9y3rtYfF9)w*N+Fm*_a98WjH~Rd5=X z^Ij^GOXGLuP3k0(R!#{lg177nqC?zegsdDv?h5ISC9VtaCpXV*VpR_)Yhul!r#%y0 zp2Aq~MqtJ%(2&rIVd*kp7J~DRXRG1l^WuC^xhG2(ycn~BY`=YPlt@11iI9G)61i>P zl82z^mDCr&CKd@VuHZAVw-Kk*bQzG?w>n>ivpRxN57Mey#=~hg2r=1-kGX06?F^VVd1{DEtGl$`ux+|gX z4jZ0KRs}B{kwd)vki_07rl=vBIn=*7fU@31hX$D!FDN|iA0pTSKKO4tk6+zJI}^_Z z7J7A-MI;nvr=s>^PJ+;T5}18KLIoHP^TR&=0Ov=}t>OOo)7I`UJ~6s!JUGOBI`->P zt!h#>5~W2Dt_=oqX+RcJf0{!Qp;S;VarY$ogZ<8ez4baaMQ@$?2*v>=r-51d_iPLM z6;J!kXM#iZOt{D zp-%Kt8R$V>;58kb4{W4ht-@6*_kt1ua3ZYItDil zi;Y@ee_q|5gzGjoB%|X-g2)Y1MUA$5fDid%vxpt43=(-(`jCPkhXsxHwd7l~#l|mc zVs*7KYlBEzMfBezpI{O9z2FxBConeBe2&5iJkT8?!@9$)aJgH}&o5+i%2W z!ti6$P#XQS&eRXm{VGr-2;@KUI%v~dp8F%Sv=2)+O{K?8zp_jj5dH{uEsKR^chhGR z_YM)*BF$&tr__Y>he@%v;!aaOOks#?;&_A$)9sF0SUtieb|>L|9Z0)=$`>au|3Tk1jIFYVLP6C`ruPo~uRwY(CzcV9LzDV&+%6VcA6bBkFu z$gI8AKh-Uyo;O>jEtFo={u7RyxYMOz|%?G_}f<2;=6P%X623BD&VV>b>{>fde$S1359cOQMnS)6$E z&7dDzAYlyAR3?+buS>LT^lRN>^{w?Gpi1x+ehkq2a@gCMBENmz?*q z+gyZyNU7TtaZ(ShQe_c8YW9UAa14#G#h+RD(Sm?D-wo4A<{>UCEk?yg=38sVpk2c$ z3L(+7W3vkp)RT*1EH>Ld*&nqaC(@tthHR7YMTuh=_i`~}+c|~pse^9?m)!xQ>sZA% zp#Zji0k5fmDG2WXwjI&9t3CmYF<+B~j8?SKp9Wgp2;PwrA0SXq45D!gf1-I~M{UEk zj(s)#vgL z8~{-X5cqjEbmih~?!YkrLT+)aSVbY?;g0wvAsdb=DLXA@#ZRth6L$DWLK;JB=bHj| z!5O>Owi4M-2d8mNo$jcnh5cb4%?OrkTGr<+KzBh#eanaQC# zKalL=10Nj6%Y$!CrKslSJx$}hKl`mjGG5|IBq$PqNDsAMphcLHQ??-2xtoKQ(CSjT zuRmMZkMlEFOkGwP^LJN#ZsItr2%OpCkCHKhQ#H8E8KuKvE`UjiTlecDRk<-SZTgbH z(0Dpd{V0uO*0%torxY{?eAx4hO`|B`zBPut*!%mZd}5d9;h7i~4QfsXa_>erPol%@ z!FfV&7v8xUYnu8ts4=6xDE5=1BKf`uYciiA|FEhUUuRk%8!dl-qAQb0sG7B1D(t4y zWm>Z;&ug*3T3rOL=}VsuoQS_yq&s!+2%aa?*IyQ7$BsNmJWqq|r}CaR4{FaPOpgx`!N9xuq9gk1gNY9`A;9xo5g>j zg2IYYv0S);jSm%_d1Mv2mK(n8n>JWCt%_=Q7PuV7PqzUa)jE`@F_ORZ6vt!}!3hs`!dOK{hK*QUi3|t1AHQ^&EE0 zdg9&!c)&YpVJWTvcfT}rev{5e!l3px47vZ(hVNNbC;yC8{o;x?J@g0A&DKgmh$$MJ!yfj_TML_S#ocRz%LO?@sPN0Cbt)ZW3k6FrDCcEqicR_=h z8sdQfsW0NpkAtzljm1B|h$7#bjMc^u#U<#uiUM91d>&-T@MA zftGd%R`U0`0d{m;RDFX;|L?y%%r8^C6E@Ou{8pL6glLW((fi%Mx%%WBh$1|^eDpqU zdj&GRysb%PCV?Fmgxz(WZ3?|2A*yXRWnNP-QcpWp96 zQP=SVManWzE-_P*G3aHP%nW2C1vCt6lRaKcvP$EoBP-H$k%*KHTc&Q+YM<`6OiIJ? 
z(034=JnDM-McFflie|aYs3Ayv>vBs^%=q#f5I6d-@$jy%K zjVGbjgs(zalWnzGNB0IOiZ`XyaU%X2FS{VJrS$W%u!cX-Ck3CI)dx^7WGz&+w0~OH zoIMrav4wjR(y3IyCr~4(_jD26cwRte@tyFIJDlhzAKx%~5S48kb*&3ch|D6i8*lrb zBa@HkI$4YT=>t9Wv9tZde_>mIhxpO;4>5ZiCm-xpQwSDH^&8yL#>dH-n~x2C%bPxl z(8i(8k0I_CEZccFq&FT2&TbTw>#!bcwa<(&A^R)=$AwCfEejE8l+o12gFZ(J16QpR zC?2FVSeb_7C&B?R9a}`!?34(k`a22kC~mJ+F8M~E%IXA33{P+`5^+gQev-`;(zV|U zVDaOBJZvcc?fXHrP2vxo6ZA+~!bkM8hQm`m+ZKqg+yRL`&jSO zkxYtguc6@_FOig=X0T7Z-0`X&tIl?^&G+jmZ@gF-0^u}|sY|8m*?beAIRetStvS8B8H+l>YOY%hqBpPYS3TSVtVvG&{08rXt7mS!O%8-axZE(QzkaLjl}i0Mju;sh8RLb~ z=tQOHuRK*Ot1A^5glRmT8{$(C%0%lpv5GBC| zR9_;y*V%FWFIZ|qvO-f*=t}4fIBonomy27ld&Pos?u8SxIE}-+;)z9$jWOY6J0t($ z+)PDFY~$s_Jh~g06D(ta)2!rujkerDUvt7ST$RpTY1J6g>twOpo~t|uU+Jtl-Wl1F zul*oA{Yj`Xd-`2SLuTb92)by2+l39ZAT}>s8O^bB=8lN^aMsgsD2c4wM$yzUh0Ah6 z4^GWgDKR}npqt6@rKLaTVZcffzH7d`dMJqYo5lMq?|gl4<(QOEro7HC3tnJd1o-Sc zNz(+PcY8B+Yp7Q2(S-FUzLgWpCG|Hv8n^Q+-g;S0D{gc?+k-8)JAYHTpg03BQfn{A znXpprT5?l>`?>gHPAFR4{lm;Mw|(%IXWeT(aK4Zmg!e-q+YynF0PlfT<)j_*{L zoWOXE_8&9D1_si4A3Bpw;#j;+e$Jil4BJjnagbg{L}69;8$>Lh(@-ol+9FPD33g1R!_G?_&;dh2n>*LQ~wg>;PLbpLgJ%7 zx2UR7Sk*6TOu{56XzCNgfBE!=McK9JOorYqypeWjUg~(q;o@&!N!7nQ29nnaEO+S9 zw*a{FlW4~a_&sT=g&08YO4{b#D3bjHF`+zD0T2@aC5oxJBwBV^@B*=PL)e4{8k}sY zn3g8#Ake@AC_;)_WPdMrcf9!MU*OzBi68m?23SH;h{E02;i-zWB}Sf)ZOzL2gAYp1 z7Oj;_9pERqr~2{?ICV3!Hx8v$SPBry8CfT1Y*8??9t-v*RCEOT4<@Nytg1fIta+Tz zx;`&kra7U*V#~-s0D3SdA6~sKUSPRKjIDHRSm=1@0`$>w;f(jnsEftY6^}?rp!<@M zjkLiUl)Aw*_S#80G5!1ldpA}zzQL$(<+aG-`3iI>-?GtBZ2t!aN_Y$9@<_WC(t8!} zst5p;YPmu0r;>j`9;APuLvG`|n@3Rsu>rrpN6QVRa&}dwh zX!UoQHYC;vnjJUNg-RG@g?y)&k5E*F`?Q#pY<~WxCNL&FduhjDPR3G-^W|MR4uNKm z^@n`=0{;xuzSe*2y9f%3^frk>pflR(Tm$P%|Y7T1e$G=xaJgAo{=6BvN0(a9fN#6W95mBpfEF&taB6bh>)c)E)H}IDWWe3I(wyO#>2(h6$?H5c7B;e#sWQ|Mf zy`NoP(Ms*{e+>7l1p8gS+fV>o*hCx_8mXHUn;RzR!;Wc#&n0JZFMBl4y?x*);HnP z9NLy_AjH!#V4VK+*Ia|*SiN{S+C{lqce2~7&0IH<0J<@^A*doxjydgxq7G5e2*}Me z%&Ee=X!5Mj0F#wP;nCty4aKC7UHE4y44di^$7iyTua{VeYJ?{TeOfl|dA||2+p#9( zft*}yoa%NXN&q5&!rHfj9LuL3m-N2USSfe(cvqB=Ypv;at@fXL>+x?^CnnXhk`Pdp z$}|yRDo-L-07BGzL_@nj{XMj@^85LVZoUbT+WBswn}#56b!i>5Z&x#xd3lN&oP-EE z>66U}-r{p0e;HQBZAhf-Q*io3_X$N;`B;JwYW9uIvftmji9%YVlL|}O{&@3+FYYvqh-u zEFz!RveNur76E9{GdC8m@mkj(6C{6E_*;h^SX8!&M>jzrZ4)!-*J$9A)Sf)RHB>X)9^I3}T2dN55gC1_4$bN(9xT8KKIhPBr zSS&}ou}3kZ`R+8d*(ADqAG``@U^8=>0=d=0oClSDH3BfgakN})Y;#_x(QX{`gXrls z2%8U2zo3OztoZYzv}qQ>%~TVgV43#0G6<+Ix`+E+?)e>;4W*dp*wBfW8ZT($;NYf6 z`7)*`^1|BKCFw#qFx<8IUiqWj$B{8mk=VIy8k>Z-&n#k144ozzGe&3>zsR=FzN|_y zo=MMLip|bhK_B>1`K2!y1wT5}j5$;GXe^yqW>~(|ZE(;4l>T2BdizyRy1~=e8|I-{ zI@HMmpS*Y(tuNoAsK0s0>~ak<{Jf&WjvwPT~w3o^0(bMiKR^2#_tHR zI`#|KvNN1L7ep7wUjpdvZadF#BOki4w+5M=5pZgz zE}WdFh_TYf_>=N-UrK024@(dm-!x1}LPCEOOSUy%8^vO+b@g8~va8VW_LEzt_NPbk zyV{W5hQjg}cD}j3TT9YmHTLg((;pr-QS`^e#IA)T)m4WR4^#bzx#guwOz`^!3EvK%4GL#w@w_2{0;R!NzON%B!Bz{YVi>WFEve}Bx-m&72zf$2 zk7a_uFH!=L71DXNk4aXf*nkmu)%+N~lqR-FRoTk26Ld^^xM%VGq8KJv5~-ww*>{KV zroAqWg$Go;p9C3=4s!|S1byUf)RRS} zML19d`jV;b%i~X>xr*s_7n3e4*4)WgcdTAOrxdn71pEHdwQMWq$`LiO3-x@GbK(ZT znge&;Vy=K1X|r6&Ucbl-Jspv6Wi8x?(8whDQkx)E8A`lcyuU`+2^`UinM)mKt^Ge* zgNho^@b;oOvW*P0=!)~BCGTqf^|wb^$HRL07Z%{}|FlQ?BfSNt=*zf0m|zJ^@ee#i zBKwaWDs~>l#&dVW zH(81Qsd%>8UEP!0hN$QFl%Eem6thb5ch`2Vjx=8w(0NZ6zk46<|CM|P z^iC6g(z+!2JQCqIfIqN+?o}?7VvkJ|Bd?0#3JKd!HGHp&IT*NoRj84R&V@)-?&V{=K=+7JV(DWeAq}po3 zR7P&z;x4t_Eq{-?h``?F$d7(GSvTL;1;p-x9g4`Bc`=0U%E2 zTN}x59N*TZsxy`^ktaEmp2rIymIZgh!t~GixXxYIy;nsdH%FaXU6}1@;Tu#PFsqAy z<87P)+P6XQ8@F+hARm3_k>TW_gm~f$`0klgoX$6$f)d{C_kw96AHk{X4lqN8M>@Qy z2$ZAiAug5tsrH+Q*gFC&DT=Q+WuQ}PzMMX}prnVgkfy(0Jod#iq#_aSdqi<#dyu18 
zZ%}e#B?wg5tfhnshR66qqV&RddHMWEk09PL15O78@s`!xIdfYnVNkX4juZwYp2s?mPM`B% z6XEImMG@+}&ls;%s2pqP6iWJKCGsS$+|RevA8+MG&xaj7Av7$t2>=NE{AEe$72E;N zP}llbTCU{jh184K=c=WKG(5yO-Zr1MP%%?cje(iH0Cc(ECx!V92-E!+kdQ2u1V=5~ zk6(^e^urW?4fB$8Nn0$Xr)4?0o>7sD{YvvBA=I{*^-Zfsr{sRjj0w3Hl!3rZN->#q zu`3!~VVUmDk`=-3D3g`LPhh;LRa{_6ks^FeR1b~$B|GP8wZ}ji+$kG23fkA z(TUHv!In)@9N*JSPV_XW7(YRDcOF;aMd#AGXuWW*DGngwvVDtAK(MtdvoHeM#tt7bgEM62nszRp>C77wOy7L^Pm+V^Ox^Jx1*|9Wv#55G zAeW-kU*m?MYU3A49gB>`+9%lRPgQws!nIVR9Z~`I!<<{YNI*6Ade&QYUJ0+sKjF9- z!pF+DQF<>Z+}R6*xYaU?8E-ODO=|LHFTaDLh^OTH9*~c2p_bP8DE=S~)A1Eq>|}C3 zpZFGU8z*=?5wCTv^D&|8kcnZxY^`gXlwHO@mz#l)npeRctfX?23@X4>HhpHh2h z$Q&1fvi03G_rmne6Ql-TVeNG)khR54kqwf6bWQV6=e+vcN!H%$6`GYTA6IWe)58Kqb@xbA$a9cd99<5#iQ^Xj@_aZY?Rez|Bm^f8a zEw?Gt1TslhNm-9>1i?tW+|}_f{O;k_s204Y<=?i0;;aYjCH&Z4zh-Ge9?A$?d1jsX!vnDN*xQ-g%qJsQoSn&j4;VGZ}x;}7mz zV6N~-REt1A+7m4}Z-^AX!K>0$kc3RO_2=u2()0@%+^mTj6gXd;if@bI9f#)^=~g3# zqKpm%aPr$q4yog>W)8hzdLmWu$bM}7!Wrd60GRh!%<3e!i~WQT7|>z(p!=oT?R7@4X~US;?Dika;{HkqKbga#e%qHpWpQGxh0hIRZ+R!ji*ya3H|;L=G( zU);+9!K}9<4iqXvrAt%f+ZF0~FjU`<_eU2~~PG{L_x^wg8h#GNkmqHn|TILfW)qtnkTfn0d2?~N^X_|41;mefO z8E8hScF0Iq$_M%2?$EXBY}eAJFDgjpwp~q}#BB?^M6?b)Ez5C9MWGj+SJl&sLO1ZZ ztVRmYJ_KT7YIR$y*2|##q-%di!0FS8k-W0vz{SCvuPTFOdOvA>X?tp4+VfuTo-W^i zlTzrT; zWDuQq4JOe)vB~sXPYgKa$kR(`N_8{<2zDp;*avW)jCcmbhhVFycA`K(^kkbWvfHAu z$acXe6VWa#PG9^3@{1cIsa6;f80iw$sSpJgnNeAyKdYY-S(wze%`QkXt&Wi05LxiY z0O$e|8JQ%r%qpJD=IDcT zHLz+ajbSh6B@(t*6Cgqm)~angl|@-ySOBhsO2JUT2fK$)04v}^M5?P%E1~+jWtqk+ zBVLKa zH-ky8= z>-NtS;onBTqjTzydFk6$DQEU1ovb-@446CqcL#IpfJFQid0%2&D@!60oGjCX(7fjd z-TtoNui|N1Gb{KhSiywrrkBo?s8J)ar&rjB%s0$PE4?wzCt0P(4<#AWL=EtOy~}4Q zYZ^H~!y+ehv7~#8>5+(|BzfNaK+lbJaHf^$`RBw`cJ;?57L*i^s8!a^9AjyMK)oqE zpMi{Dr^pXG@l5Ho&$;NIpAVu$;3)to0lHHR)nf}vc7oXT0<=bCO)##z|3ndgBiIJ^ zUq%f$faI9jyZ^#$nLz(z!~a>s`y2F1&3Fy)1OSs1WJ*sn>zn9a?b3}iC{)AM-}<{2 z2f8iC%9Kk0er=9uCq3IrXHp#LwSt`9VPgeSWBXLZ)l7v81R{;?2$33rB?j*#=hF~) zh^it(;@0pP%#xH<4-mW zGP!A>U?9x4AH5u_Y{MdZUvbiL5yLY9op>%Z*JGoT!2Z4{F{Z|w-S(!IP>DhPo8Pt8 zI`bR>>dF4u8|VQ^Y{hktu|Qq%!y8o4z-&y*itVRD+vyK8?4R;&s|)*BwWJH~b$oNB!?@=QKStkCwzMs16SaQ*xccE1AsP{$+=t#6BN1Kbzj@ zWV_*&ql=lBUxX?sC@eWywLQInNfEAd6+Ff!+|SBFkg4Qd=Th`a2)$_7DqunZl|kAX zV>_}@8lsN`6h~`EFvWmm}ZY1owqNaPJb!#>3YMLqqFQ^@vc$FV5wJTW2RW< zS2JWj%}HyO`vucf-D5nBORUDxGdxfz{-ALPj-Y@A4gJba$0D=n^s-I#GJH0Fuzvi1 zWJ!|VZ^_epfPaaxEQzYLA6j*}?%U#25QAy71+k}Kf zB*X?CzJP87_PFCMZB37C=(zfOmrTI6iZn>`H4JI~5xOAqL$u{pmEGRju(z^CCUy_S zc?f{a@8P6sCx_=2rtPvZMi^=mnC<3LtywLZX z-OD+E@kuD;gco}q{7wD6*LhTAXl76IIFuG$!wLSf!se6w6jQhh$o%}{=Jk^B93xH8 zm7v0|AOv2Mcwy;aW_HGL=I#4c_?j8@gD$Jtx4Qn^EIu7nHDYhvr50W~JicpMfNVk5 zZ)K6RMyq_zc*_W*!!2Mfob%IMn&Q5*HnUSvvFTqXF;N))oD3m7{8Lc0R8a{#?UN2< zxsCH zKySov0r$cMTI}XKptNfc-q&!yx$x#v!>`;Rjodtj`_4X>@3#e))&ZX*PeV6Tj-Ci6bwWon{ zvLkI(bN)axqOcXz5hwtPg>pNbbYbrkp9=9WP2C)??s4)8RmsEr;1Rc_D4V7mEh_HX zM|aW^^E`3@H%T4*PMAnoy0METASje>Q)bJNIQVlJ`c^CW$L7i=_`_9QSB-pQrQlEE zZ7QUOseLJ^?3fU!z2T}uaoxj^)7JsUuCT_?yZ7wN_k`Kk%J}DXT)i!0lA>GyH*X8G ze9Cf;Aar5+x|TyOtC%5Y2$wNdDb$7hIa=SsGJ;q3_YG#f|4FU6uO$AZY_74QcP{}i zaLkkD2oqn-0^BM<`GtNipd0bdm(TaQKn?Q?4(LDQBKL-Mbkvao=!Hs2<5op#yvAKTac-?{AzzM3IpU|aPODtYgOSLoe^)bQDo;e`OdFQQ)hVL&{~==PzX8025^5PA`aZXq{Wt}Kwk$b_`R z8}0wmGAV)xkm6hgF@WvLe32?M6pV20ljUk}vbp{YDqUPf21XHH_B})-Fm|Q!K(KmE zPl0Owuq7-#f*(8t8e34Hjqv(_(wW;C?V4>*iCUdS@_%%dND4E6BMkQDf!lq_g{i&4 zkP#RH(#yzg38y{!A=~FRwQeGs6&51yuQUJ|BnM0AssKSS!IjN2LRJK@epgVX=6&uY z?e=1Ec8Nu|A`~=Kbo20k-i-do%`kUv{*Rt*7)3dMrMsPuFUP-#F`E_>7ppz9r@>aef`au^T|LF~pDfL- zi$<-k@E`1)RRQ~d5ZUHm>t{R7FJfrR_hk#{jvb4OyEGz|naju%DehMM(|*W7a7J zz=|B~MUeYv5sS{Hng7R=dnsG+q=y8h51o>r 
z5rb!qVASOqXj%^ii?jY-Fn>r~a4S08VpW)y*&yj-2OMO=)s#4+&L=)igWc!TCVNQ6 z#s#Gx#$hI_9p%h)tVsTM`KjO1(*f^a3-=6r>1JyRHUVSls(o)qb7}UKBRf*4@+v|kI9pdlWkq2I`mAA1B~giOy>a~5sgu#}b6 zmDxGjajNOr@?OY#O~vXZIou?vSk7?0keVUfU_P2dm~wb+w`C_CX?x>PAw*D`UenJ~ zRUkWR`gr{$U3H4LB3{$=DM}y3Ifv=#)5qL8=|Tiduk3jDJilJ#DTgy|CF++gUpI6) z8uTEB_xOMWzAYKUN)dL8qe?G{d;pRWTAOjx@BD4$`Bz~XhL|v1# zLKP)$==b3cdYkB+dK#`gonD4ueYe2IEGdT<>IxpceK?a|b)Ga)7>1iHMl5s9 zJm>ohwkEbIgGh;on?_dU)_c!`xYi0@R#llvt4?S#*1wRFD{4?-VcNxd|H+xWC zHN`&QOL)QP!FKlKjG|F$|7t`578ZwEn>NcMhGIFFMC$!U=3n^_;d-*It+ocDfN1DbOl%PDBTx-jxgq@F=b92V{Oar9O*+ILHoNnBEf?A89mPuM86%)ZI+flZoYEZE?%P zM+!BHgYaK|FLyCd_eA2B4mXWWc_*A)6!5-_aI35B-#h`_>4AcqPjEnOMQx?fYqih8 z^r#-fHkv<-q%j=Lk>b%W{`^_!>NtlUU#aiaUs!qeln`wNh9G&do zy!`w@Eym=ZW%IH{JrMs0JCs)>O1JV(&cRed;x~~Qjs(|!HA*&emCFKoGcrQ&aK1Zr z?6@>8xql(&X1oA*cIL1?oLyCGy_1$XL$CP$y@DgsPnmx|^np78I*Ff{v+2Vz5G*Y>Z9cFp0q8+l4XXZy^NV*0*|opLGB_F61k>EvbxRrxYimLQuq2I2RW zMj9dt;E*o(Xa`aV?A_@4H1V%i-scgc29pL$aZF~hL@^^02g%zRm(FX9tOWz1F8K-? zOLA|V8n^2z>NL;vgb!xhAJ1f@>KPCDk;ANRQ@ZxqX<-`eXEva(D>I_SVURlV?r z<2c_y^{Y#G(Y9NLop&%MZpjL%KYnm&I4H)RZ!2g2BF+o#rQ|*`*W;_HFH+zd&J$@tKW@{jOC_g7W z>6c3a$0GJd9)*dgM73C~^>Xrls-@Spx%Vqr0!pJau(=f$%GC8JU z_Y-N-kwdr9(B*IjO(JoutQ*OCA~ih8#Uwb2K{G3iGI4nMxD7^X(>6(SY&Vxk^(uIK z(!q=8*RdzkR(+l3%wguX%Gn2TlM>WB()Z)X7eJrtX+l}-zu6aaO^yq!usKZOxE3p!HxZYub5s8&J!HF>llX@36BgFSd6$k8mipfcEc&Ec{G`xX zP9dm*qjmlThk_@ZVG{G>PpCtOl)677x?9 z6)!)dX5g~Nl@^9&#$G7zz`EtXYQj3DM5)EyfqKD48-PeySbwu^LKOSBo))b4^NH@5 zDtTHO1eVqIUpDO#MCYSjnaeL0r(I*ED6Y zEi}IW=_e+3^xOsoh|?}DyHSL#^Mx(deUmT-{J~Lm6`S*dfWExE+|_Sd7sTo4PMi5| zMK{Vy46*q{-h@XZQpH0KGu&xaFWnLYUJpFI!^qHshn-0Cit==i#4`}O8@^cL8?8s; zPIN`px+mfD+RQ^&J$ z$y)6nhFdjmkv_y<(+!O#kk$_l4s3SP?exE05U&;exEo~)j&>gMv_%dENtn=c{k^Kd7@)D|_8X%XdZ|)rSf-!o`k}$OXen$>nkqHc9D_JJ0ng3)Sj0 z@mY;6%eA6Cj%jGO({}QdjGV|{%n{6$GTou?EHDCs!PPS-7;Y4g-nyjvon7nhd)?)M zn!eSTnd|F6KHI|2f5`|E^y+** zgM1^z(!>^}C-Yh!wWG;3z9s{s%RNIneW@fg_{i2_I%9EX>0R+8g*ciu}hu1PXR zMe^K})Vi~B zHJ&qqW|NgPQS^3JxNG=#>HQHA7!J3Wx;bfW4I-39f4)^4bUKUk5VdOEhs39v7KuDp zt5&ShVPEetwfgtudZ4k>;_F_4$#D>J+6*0-@fNB*ohn*Ma_#S4N;kWw85qz={K&y& z)epVaRsw`{cj*seNN{1D>q>S%)e z{TiZb=@;XI9@QmZEyUIn&dTJ~a z#iV$=NC%0VDKUcp@$RTdgnz%#@^C)nsJx8H4>9Rt?{!E&baZ0A=gFl;0fF&7&*|f< z$=?jH(Bi2n)p-ePfo@u@TDMagFL}MNJ4y9a&;PEv4V3}P9$eLdnjYJ{_TYx9NgL1e z!>vN#2y9>T_uZGIy}-J>C8fjlbIC1uT`QETlWV87D9-HF{NBcquHMEhy_M@16*1+( z_g*ykdt~fan0?2cdR8c-Wp+~BY$uwSRkS0VW)1m`8ay>7D3y2p9N-j4{vIJ4W0Lwf zeY|J2f0cM&(#55?Fp+cprK?*}M8v_pRnZap=c-WmVc;nycnR*I9(Hp7Ew(Tr0Z z;ps|tSG?LQ|6Y78*ybFyN!@&t9c9A$V>xoxV_)4wU@6+~4XVT9_m~K7(DXMB%h>%m z7cOP+cpP(FxfiQUn0DFkuAK)D;eV5j!Ws9y9VA#h8Z2v!9GknIky_FzteP(If_Pt? z;H+%#Z1#^ve~A_=_S!OXaM#h=FSW$F&6MH%QG=F=qiwHjr8{>#c&I`NoOKV9qS8`Q z3CJ#G%mK=?ZnvK^=Re8SfZasfcsb(BW2Lj8A&_uc5@>gi%tGVhx@hZ0IEVCevpr%vN)O4LJ zyl_0mH9r0|Rs0}>cWf|+F9Ekkg*tI2C;CXwiS*%|dPxk3kJMb0uW9gu1p&WT`r?4t&4$l39~q*3k$`xEW&Y`BPxXw-BAJzc!3ZX4SF!+ z_5D8fd%GSP=w6C!<-bB%)hvD+8Y5y0Mb#wyoB51b>#0t9wbEe>_k=PD$*q5j@t5~B zJ3>+s4SD&1roU_humlL>EQ2@n5q*X{{-ty6?L-g#%J{Qse*Q zD=^R@Xz+ z%Y-ykG<~?)3FM74JN)lzz3%OP+S?K4GXFI~O{95eo@7mz9cWA}EW=5aXd_@q)=0

O%v?rTqnVkw_x5J!FqH8k_y=M84ck_LEkSP7(%NzXV%TMC9Zaufx zn>yPq4^@_Vmm1~LC{8YkgYH;dtx>w|pj(S~=^Dg-W{kZ)aF6`%>Ok9yNml$7J1{pw z(epCqM4o2nH{Qsiztv@ZP83^W%*KnIaqK#km7~UXsSNi?-$4P4>P7kZzr5fo7z>7s zwEOV_xN@3X;zNK?&i~IGs^<3NT#x9^Q&!D6GD0XHg-~#OuSuz3+WATC3$TF+Pk)JW zALy@qFo`HP8$xY2gKxN*Odyn3+Ayk|O70y`L7p}!F6K4Emg9xluGQdtoHuOLYh!G2 z0l_)E-p6Y)3^`c)edt5h%L%I+nu9*X>*9t51_JYrmo}5Bd^`2Bc`YHo(n6+b`{&I9 z2&)FI|4(~g85U*ty{nWcf*=Btjv#^{APn80fQoc8LkQ9g-5nCr2GS+n4MU?+($Wk) zDBazh&HMko=b-0Y*ZF@tUk)FhOK{lFv)5j0?fbsh+7AIxw!8TW4&-+?_b~SbzbF`m z2Rg0)Cn6uZ2~zx-Y&9=dpnK$Bvfj8VHm+aD`qSt9^TO=@*m9oq@M7NVvB~#pA0++$ zp+O{TgGagSiTxWZeNN=lVH_ExePqxWPma;L%YVeLsTK9`PMt56bB+U;{rY%a{)xp3 z0rzSoGEZNN2A>LVed?m%cS)5!pS?0DdHp;-enYIGZ~$8^KeW1InayE9h{wUP58hNk zoLR_u?zTE295xp1Hh*LHQDHx_znx2ek^^cNe{RRny2?568i@Q%Y$9^^&e!{UrNTu7 zr846kKrzyFy)rYTS$HLP?V+3QU-OLxP=wy-`M+6VdJnRe$Pl$S!Sk|wy6(0N5}`71 zrR%zxnHao?S|4XjWZQQ_`UudF<7Y+>`GiEF9BHlJ0neEg)8 z{d-}3{o&&V_3)(UgpQMuX65|_<<(Aabh_NliXZ9>!IC`WZ!f37#=CLR!fxku6FZeK z*tj9OYqos0@hZsFX8nJJblJlTXWbP0H!{b3k^Mb7g>oW~e*(e3+W3Ux6am+LIP^A~ zdG~kg>OiqlDuIAaF1JP7@D#&-?{HFi|GP}f0XhEx<8OZ6h|nce|$hoza?MIBon90Gv^~sb!wbT)bOIet6DU|BDb^OnPETpF1=&V z;a9<-!dry{a(j708Tg0QbK@NOTGY`b$gv*N&KNs^^$IO~e8s^E!32A^b9qX&R~js{ zZY%5L_b8y?{Y^F6HRolL&dwLbj;QdpT&~<3Wyd{coLeWmOASi-6qIS{Q)G)l(BoJH z$`|P)-f+fmpRVD{+|2~-siMc!pVFbJmN~sP6gv++8&avBJ)$9k5`@BD$&p+8blhhK}Khf{2}=(J_h!u(*2()j;5*nnIV>POMaT97pLVO0l&+FVZM%*Zg&O$ zaf$uWAzO~!-gR%$cHEjX;aqY3fXG}o(FF3QL5uyE9>(0f&fyf#ksPz`J*mA>zz165@idv~qP}Lf_Nf=6Q-AXX_ zph3sbNlh8e6K*noRE>Zo`mey!-3jvg3O!)iA3cekJ67U($l_YnYjQ+;I#==($nu97 z&Y6G8sO@rmFUy89cUrBz3#c}p|2k&hcCT_N&G$Rv8DP4RoJ$2&B3VVfg+_+<+?Ztw z`oqg#M54?NOTOB}gR8yIJ2VmV=VZnL|jRM^q;Sg9FJ zTW(0$(c7?s3f|4s5AQ{mvJU&~iv$^v*;4*rZCgv#WWTCWxGlG6)J-5@@=G$o3iS~G zcp{9#eddVU7g;$MwbV`N5=|0wiqfmwRg{80gK8rtVvjb0K`Dzkfz}79LscddC9A35 z?d5vVT|m@R4D~_I`de}&br}k#c15QcVZBz1*D%DfZ&}`NwP62V1ga}@U5EVpO&&Rg z+cD4|{C4I~jIzk@k42p3*e$)r>>9PYc27IoworY{kkD~FiN|CiI+Xu>z$RZ?+Ze~C z-$>=H5AUqQ^d2?EZaA!FS)?#3_Nv|nP@9+;w{rDcCGD3r3Ol+Na(_$T=n98H1ML^c z01n@aT`jhdC7gdA^hdz~GI+6l6aA}qH+G1P&)=);Pf)^m9a~(TQyjZkG_{_kKvqWG zL|)qBbG1KD?ejvz`I2t;m0+I0?ZY&p`GVnlvLkwvONSG~&DE=@M8oGJJLJ}qZ3b_h zK`E11)49AH(ycx){o4q$@~bwtNjZ?Kl~k+lTUfEXqJ-24pxwF`vyEU|IWmYSPWR%?po=#=j$TD>sa2Pw5FkDlheM zUq+g|L;#e18QiM)7g;&+1s=Z-8@H}gU{SZkwbXW%!giB?^?aErRG8n&EN3&m zN+c0Uj~*JB&dF|BjaORJN~y72DBzj9_ASH7unAilrv7V~2Jc%|jBDNJ5P^1oN4i(!!<0f-`5jDq51(T8 zq(W;wJa<&J>@=N?9pV;?r1K@U%4B^#OSIGyRSf-uzZx~E3?G!uv*j3!e?i%O^)trF zraA&b@jyD_GK#dK1<>+YFhur4aqJhJt>XttIv$4um2UDjcTAciF#r~;KP*v)X4pe5 zOF2o+qwOu#wnq^ydiV^9^*?F{`_z9mItjuynnzw~yRZEV{yxLjAanlTInhay=_~If%nKM2cjukznhpuVMABng)mi!Y{KN<3DH$qheSeA|WEPi87jLSS1tOoeAiE`U%AJAlTAhA(LY{Oxk;WIV`kb5HyMYdofLb6O ztE!THt>aeEmcg1_fd(7*JbtZnqC=GIfAFap|K3t5UeUAo2sT7eWaUR^O_{7L1n++D z3yKc^;r|C(7v;Qwa3OdAvHydNBUV5WDf9W?_?d7ct#K>JZ)^17E5C>GF~r$otcGg!6Mty_IhPSiOzbH@duJ7Z6DCVu$5Gz2P*c zPxwkg5LzRRxNbaM*|M5ph+{hGvIJcd$EW6x4zw{d<;a@xE4*615h;6{Ze3mO#cc7AKL}Xy8jQ=LYCPA zC5h$c#9tJn%^A>QwObdLsMhoZpi!B;ULd_3br3JnZoq`ATmZF*PV!H-VE2wX zLql~=y+gmBU$;*Nm4;0_e|N!)sCqB{Q|9km5@&%@!9slgNXzG`yI;d2H=~`yBG0kD zCCk&iF?^;vNsJq0&ZlrgVtZ?A)?#9~v`4peYO~g@K0;KmMqq^DZjMIHv+L-SvuP8o zpx=1@4yxWEUaW-OF^BPoQzcts1V@b-+MuGjt=YW5_Lpr7aPzX%??xMw#JU>%=Dr}# z)z{DUO5WNsE_eUX-6M@IC9!zi95{2c{m>ie!*Rs{7gbL|MaJsC-NqK!)%sIBxwUn2 zu`~*Ns86!X`>$0Zw0Zzc5XfA_4(w;V@_`$KNST5yiVRL-tkl>(!MA#u^@X}u^; z;Be{W@!$Vib_gL*qv0X)dtK2)|BbOQD2bbzy<1pKE{e&NT-W|`ebGjssK?W>`XG6c zBlKP1_9i&=`0Lk<5MV-+;EfHnBPnm-uAO2F#y0u%_~v0wS9{`80| z80JaM*J6{b_>+ql9-S|Q!z-S70N zwwfZ6^(@@Q?Flp`sSr30BgKE5+S|z&xwMqe^k`A{tb@4*D<`+|$3bOUn!ZEAT_fMi 
zIjMl~iAN&5d|n2Np%xR>Bup%;vGu+-4=z0}VH^3?+4&#M3{$BsJ}Yt8hsy#%1;pF7C+zqigRtUV zG6<~FLcS-_y?;oK;FGpDm)YlBeJN1s%CY6VG@{K@-Q}UsBaZ{$D(jN1WvTG6kpCLW z|N7-Pc3l$1vQA_oOSn{BX;cD^T^t2m&LOw=^S`sXNDiynm_QTGf!mQ@3Pm{(_|4iN zCPwmsMT_*Jb}mA3Y8+4MLEyG{fjh zO>fnU^&*82bj6Moc)#GH?_ynckO(G_ct#AaqxC9*mA?mR^|4K7KrE8n_XZ|iuRTRZ z#`3H;@Az6h1C2ys-hU{+*+5(f)&lh}hYRP*8=*hNu(*!Ko$1p#9twL-IA7q>OL@t3 zB`+W&cQ?+R&m4qn?cwQ`p9J`y)@y)QBpvOamxA>0Jq#HxU;@OEUnu8**6R*_v zU%?mpEmv*PK4EHZ9+F$f%qAi|1@tHI5^Rxz)4FP6v3NCRsfJ561FHtnFJf@(db!q{ zOUbi1j`G&ta)s_X`9`D`{a1gla_%2!Z^>x?qQ&w-It7Ull7Ux6If*y2h>i2jhY&cE z--nltDTZp;$|jOKF3YGb8HrHm)vUX}k`vmtd9{0RJYZ(uRbFYf{W%}B2PBb!M^VZY zxzm3(T0$fSuTcK5eMx?4bjNsyWqobWEOR&@D*H)KCVkwp@~3qr4HNIggB(RUkMdB6 ztfG^6G<#+zgRNlRvi>X{)4}iFl&A0JgzUp*sjZ8L@-(JQWmb+Q#=u)!#!x)^ zzF!@5eRz44y@fBeibW(`JRl_-b=C**q`d@%6uIGnSW!tvqw&u!0|nHYH_;(Il(05G z&Ev?Sx8hEf<>iz8{c&V!W{I?{){<1z?;eyAjVB2o z*~pxY@w(R-XIs=B-kYnB(&CT|I5YILZ9t*;g|-%;;P4dQoDo^5kX8*o>V=<#-;&zo znE8Bv59!htvn7X!$o|Ufqlwqoa>~l;$O-HB&lqBzCaA+&-?{JU_9wt%WfpcISk|)5 zBQ%#H4NH$CkSB1gGEaRylxaA6e%Q376&6dq)->^w&If5he;6xL7QVNnWR7FMoI_!^ z)t|fD^1*bwrx2gH|Lpf_=X^UN$7JA_r{zHY`9vD)>Q<+QqsY37InC_yZu!yRcjmJC z>ia`VWeBN@eHV&Gi|OC9#6{i@EM*($jYUm3_5_mKudOVi%1na0`5kNLL>wE=w%0IZ zs?TI?&)?I94Hqee_l>RB7V_EZ^va4X74l4Z^wfs#R=*=#9xwJBM3qyPSwkK4=AAVjGH{3CJt>l5w8NiJxa$}%5}Xipu!@b^Z56U!P^gC$$ez@?Qg4AIKY8c zrt;eLQfw0jsz4?o7VY;Z><-ToB0aTsj2gM{(ZW2fS|x{4}BIVL47 z2GU{nq`02^%x>@4UQAPB3t{Y=78B{C?tl1fJl6CUdag6paI(dveIr=>xNhgCc38^%C^ql40N1gyNO3o19BE+=xV$unt_1mdfblr6g#p>(fz7_ULj*z1|Ct=_;8 zXpckPM%(^j5H(-2AMzMrLH>%&0|+Mhx9GStK;j!^mDoj&8T%Kwaz%u!>6=yE8b2tg>-Y;Uab8x>N2BKsbKtf$6P zww}2X>fsSebHH14H)%7cW=YEBSO3fH(*~+u4YkvfV1APMMPHJ!-PTsM*xaJCCk|ew zlqaZ3LXWKRJ(mWT@rnmwsM`SbNfMIX0tx#mL`e)TeINeCTTx7`p1kbzc;Xh<@#g$f zi6`HxSYS+COP$QSO)HCYP*li9=h={-h_WRy2ihjw*3Xi`YTW9?5mSyiT0kJ{!NC2q+U^Ws?ey9gKzZtAY$;x zFU|jeE}OVbluf6v=l5J@J_%Et58-TsCQKa}qb4qR-A6{=qmQ0~w@7dX!m8pM0xl(G!IwcBfRX;=YV$k&$psRzM`sUD zPkmWTH;%i;RHpK?erneR(~+7zNhIPy!rpg#Smz7FtF*$Y$D~1UD;}27qpHkOg0|h~ zSh6O@=9iakk4zTlTvqKp{hW_;h)`OaEjU6Oo21nIMbB38wmavW_&kZ4_;_Z?tv92K z!=7VWYAa?2C;VX)MdMutq8nT!vHgTg%gt9V@ z0*|%x#5}i3{d<0Z{l^}olUI!SlS9cT#HV%YCk;rSPn=W#;@XzdUH; zapM0jOhfeCkYFf4zI$z78 zGvgL(<2*Hq(yWq+<7wSKPiu0`=-vlXg+&X^Wl%{NCrT?8e~ODKKxPW^9`_Jug^Mm- z?eggTDLD1uS^NGAku`-hOYVW}4B-<&G6j#qwGiHx#10xt-o#M}r%|Cd{fe#U#i9Jr06 zq$f5GlWrIK4uiZFbkJ6R&l= z2mH9>y|``cC1YN6aj`nsUGM;oB6_iZKK|T?wwsNNUfSjmNO7bBghVsk(2{-%Q^p=!|i&UcqPNkosZLXsHBlH9^6@KEXN$>jWs+FGGF-9iFu7o3tq8972AO-L3; z3>(D;6c8}UoU}CGL$<<|+gSyZ4`-)7d1~IskYHG3CuMQ8h93bsleG^K(x_<*CbuP|<+-jHL zm(l{cS~Yu}A-0z9^w(n8i3>6^UPea7DJ%8?>W$65-tm{-nTPQ;iD7sRDjtb>OJ4u= zURMaz@Y^Md{8`P78cA2)1yjO2nhVWM5g5*kZ*%Io;fZmNtD%`|2$gOopwlDl8#k}A z$nVnGYQhp&+CH;?Z55M?^B^uEMul(|Gf5fjhPK$X$k(-E@zw<`QAP+vIR(Yimk>HO z(_aorp=`fSPQ*z_A_>3!-jknRxkQqlSBX}f3&!Ln6{;$`A$}Vg`t+ZWTMIMm^;t@j7mSXjpeho&P>RObr4N*r$RF=F1X1qV?UQ)P zz5ii|5$+f>knUF_w6ag09JA~}SWf~Sf%t2Z#*2@31pr3dE@c+IEx&WXJ}_LJe1T~Mq+8{8GW1fs z{PQetu~x+5btJ=xJ^YNCuzirdMuuq6yy5@)Q#^#VSHbKnPEry~sb9Y86^##s_2Us0 ze5bi$O zieAe6|44m$w~c9<{7RFuplDjvSNIR?zcy1KxwszMFi`T{LU)dHqRFjgz}Y92A!kIG?l{+y0HUT7;E!DubE|*_U>cOv5h)QGtU8xjflT3?tG#nsx&WqtIAW=u6g^qmp1J95g)^*3Xf*NJR6@}Eht14NTUQXxb@Vsy zxN=j3?w%ea@a;P{(0QYwUpw=Ta$RaLkC7IFL;a~7*8(;X6}r}st~Ow*C)?>Lj}#o&yX2(Z$J1F7?2Oplpln?So0 zP*CU5=w7ZrSIaw|Br7-nCyGlN;mB7}ERPAgxmsRwkeV0mj}`XySu71yNj;SH*Uu0G zxfs8V3ga)MUJby2ap$WT{65-%o+FL}ETp@>`B8KV{-o>)gjO$sUUAjQz`MGvC3*QYR;>=)*^-SMhQN+*Uu<6f6%UCdPsEX-l&V?v^+Wst z)3#F>*X5=)tcHTJhG?(O%zxMT5$E3QB~LgX>PJya-K5cMQlaasLFNf@Dy8`~Je0?q z+LWG9oul=F0dyRddiff$Ew+UbHyOOR&xik~C9n3}xQhoJ7(}Sq4;muIs()%~^>e4R 
zI~dAZO5;n2&}gs^>Wjg756$g=E4vaJ)d+DR9h2+dm!d;Rlrgu-^X0zv&*zD<5w2km zo8Z6k@(%S;y8pcciE`LmF%WMhYGd`5^i17ya~`!u=34O7+hToHLA z_nwlPLO-==7T%Af0F zH{+X+f-Dh(TuDpDN$<%}^gEQfj(bESdC##NtwUfAVvX;;rluJ00a3eD+2WOuVz%@c z2<2B->kB_*zkuN)mAsb66YMu#3TxR5G_Nzo`crm&37sBD(@oE#4r%q3=upgS%P;W;Dbc%lxh7k$0eKmXUgfsdxjwUA!Eg2s(>)i@}Uv9JH za>6!jIsPrx(I?o}up0r`GIYKEUJbRNx=VNg!yD*U(kCF22w_{1k;)@Ks?{s#2Wbf%pEPrcD%^dsApj?@G z`nt?#P;SU!eYX)(HexA`H$fWrRc<~v_wj1QMs^Eni$*z!xVA>JoZ~*!m9corRr)Cdx`1(O zGiquIqO&kxk`aC$Wbhs*ml~!^T+cE{Gk~&dUOXB80(^9qmeNACS|Tpt zAnG>UBc%uD;I=AW%YtgmNvLgyePqzAN8Lsll$h0~2t&nzE!5-_}(4+IG-oM>$^I#9=nS+=iR>Px<^hWMWQHR4705Jv6}rgHdcVk*&nvC zr)(Q1noLUdB;Viqs|uz0B8)L2cXqY@4hALEOTE;8B-LZYQk>6-LaZIOP9?^ucR-=xy{(0w%+)#~r`Ba%NRP7POc*1k+?0Bt;=@pFi7a z_l2T;e#Y8Eb5<+RhLxGIasF!J_+3GzvBoqlqw5M4?i96O`FLl$KiB#(ypnvkmsg2g z9Kck=(CRexUV#gb*U+cZOy*MyNkN9O4(i*>gKX=o7#r;qgH_~C`CQ%MqorkYn%-~e z^bY+q1fGr8P!RNETs1ft8ZfR*^QSHEb*3 zwx#2a?jHIl09+Bxyv*I>wZCJmWo9cH4@!E%@WPM}ODDylzcmlWh2?+zsy&r?xh}*q z4VEu4BOK+t^xmE8cL-sdDo80MHUM}v0_)eKjDGs zvC$*AIFf_xhHCrdt%$0I<_5+Xb#;)qoxy%F{Cvnw9=yPXWg&xj^fnk5O$(*omLn>i zp|!owhX3H!HR~TBWN)%54}8X)UhvI!K7@rv)Nu+gFk@)kk08kA@ zfuX?)lE-JV=>DhRox9)eK7TK|cX~?XL8#urb6g%Dg|*L)@lt02^vF;Q|4PwTNW>#+ zpDKCq7P`%_-W^!loHA585?lBMc`BsCMMUeeq``xq1pF?Xv5lag6JaV!;rc8@dEehAnWW&Mn*q+V|ByDb;npR#;g^p#=f zmW9%z#;PX;ogpkJ<{h!^q9$BM-`orw2Sjx!k2pBt5%{f@16p8Gj7V*Dz_eLDj6JPa z`+&?l!}xkPM!Tb!k`)k@BcVgohPaqe%~`y=1umG7WJ16D=PLGs5EVJ!5|(+| zlu^qMAu3AWEA1q^eS)8x=7oa|qm=2 z`0RZ`83+%!i|lH4fTO8|98sJ7#a-O7tj?++Ao1a;$e!~V_8@~S(eH)aO4-a&BkGoT zXdnAxp;w&_ygUI`hQ@9A*)nW-q=qdTuj5xm6f(pmJxnX8=zoV=@@AaldC}2&d3_S{ zBr=rZ%9e5%cwZ_CA-hPfeoVGuRjlA1%XAQ~`6TICrX_KLuDJ);w?93PdMLNOFOMrl zIE=O@CU24WWl|wV#3zi3rZ*B%EtMGyn?p^~a_L}{1&J2m9pnf2%vJlsx|3Xx>9=>l z;W*Uo>Ej%h`@&L<$H2`e3sWe28gplvDqN3Z%?Hu4UBnSB(@=D*?7NbwAk#Yo=Kb!K zsglpcI|I!Y`^@M=@JM)SEyq&eR%0XnnUoEL3&a>zj1ki3;h)Hsz^*P&e+16Q`h(mY zf{9h412*+zv-o$=&N}NsR9UK(M_;e8rzHFh1sSRx^L{)z@$IH^2NN#mE6Ka9hTqeg zHEi}Ca3@Mf_p$;FRtyq@QMFI@nq&9F6zT5s`bR9ZUZSSJJTF6;^J%9ag5=AiLg#w) zfqItdz3b5zIa>}bw?iVoFq%g4IHZ8|{lMTw7X(K>yT53`Y99l~6a6Jl%KKiTx3NlQ z*}gpxMW$U}n)x8!3*fY|5pl0IJ}_k3u`ciI082v9=YL(bv4=|}l-_aapL#>|z_P~*6h=|$6N({bl#uI!q^#arA07NT z--@!{aI!z#FwZC1Z5wL0(Z+p~1Xg}jO{p8#I|F!E$U9|RCwn;koluDtDxp(jbMrF8 zrgU+kIs~B7odTa19=D`WNGm-OxJ4lesl@Jl)GMPp{{!*9Gf`-_El)%WK<1GP$UG6m zLbQS>Iu2omT zz;~|nWIBMLF8uK((VzZE0&|IyVa2kJL6-1>Gel037r!RtJNZ&7TZ&uxy{#yO=OPkA z`E5T}v4BV%+m#yx0OL!C;YU0%cuoNg3hYjKgl!pYC+3W}yKVkVHUzl8F78yNVARb~ z33g=LnzL+v*A@~nxS4(j=$o2Gu9xl{>cx3P=X{l^xnP}-F-|1z1}h#_q|hvu&OHY? zG8zo3?ZJ8^gBl&jpUl65fBbC9i#;kJPF`T|)j60xrIt76WZ_w@5=-|I(8F i`Q872OAMMj2cEiQ3t!x*yi>1$uNM-s&kDqKef|e5u2~uY literal 0 HcmV?d00001 diff --git a/docs/assets/supervisor-view.png b/docs/assets/supervisor-view.png new file mode 100644 index 0000000000000000000000000000000000000000..e3100cdd3ba63a6c5db890839a9d2b9121374548 GIT binary patch literal 63532 zcmb^Ybx>T}(!dP^At8j|8r&hc1h?S9gF6Hd&OmSvI=H(8cL+W>fdPU;aEHMmxcj@$ zz0bY(oKxTV>iy%b+6B9&*39nRt506F2=Oi)966kfgs7UE!SBrHZfer_!5xTL**^^+2Yx@}+5f#A#_&rw_a6V`WPIbEFw{}4>eaP1_!}}+DCSGi zmttaK7{oY%-@duNeEnMD(`tICHJml&pPuU=2Rv)hQEQg||HK)Seh+PcfV@{p=vV}W z1TXZ57BR)hd8rBPR}jg`JC(p+7zX#&cKfS6+0^T&dUXpa_Wm$!tEQ{~=04y18%&ZEw)M3etQ>IKj5;K4fdyMz=Nm7|90P#boC_*bb31Z(M4!lm6 zWH$KyAY_L#={NHK>|>S^V>6I<9Sw|8)CBPuGJG+5BWHUtQ@LWHKvaIpWx2didGEm? 
zGv|J0Dm~(*{=1pUKC6n|dLH8y$YtD^7AsT-ksc~papG7sEh`VX7Diqwk+BJnNA!9y zU^mZ*f9=lqzh_B2M^M%Yoq$8K)h&7C(wHM;F=8>E1FPb9^-wC>KjWepF<2W}K79)} zE5~59g&jLTYo!)wB~yu2mPd|6)sQW!2_0u~lU{G0o8aAB&}v!++Jo8AGMYWyfi{XE|~jQfA>B zRpn2dLovG1L1mpy%p#WI$ED~9`MSk#b(`LXJvk4LI{&_TfMFyvkQZcTyL>L)!o2f) z^Nlis--&{`{X(5rC81K=h1A9UbV_`QXQZ9PQb>}A?dU^4-@1X5lnzxZ=fj*$H>uPP1b{Jf;V7+6;{b(lXv^ZtPgO@)c@J_BCqS zg}PL^*H6qh)WYeYM{K&$H;?AbF6k~6)2tIu+>b4g@~z=aW_i!tW(*5{e4EIs-6M}) zRr|Sd_70&dEeRiIhXfF@y**;PXUOX6r*ug-50B1;1tNHOdz&ri#@DgT&^b@vQ{gXF zsqwrrjyl~&ehRB84p770O?<>|u0s)T`_}0~gLNSo9zi2br zD{a5D***+c-|(B=%3yJY>TOV4HMg^cnX<;Zemdx85m3(-5}(v9LfRT{6v8N=$m-Fj46(CygJ)P-GV zilkS7PMZ6yvc&K9m{sP;-vJrx`kU0$kut=AMX$w$2njtCx|UajNlXmmh&sh|9B-{A z;5X(U$!giDan#26W1&OR{6KoXo`Vh26V;7Op3`EN8`X`%o@YVn7#8-=&bMk!AFEKm z3x&%oXmJWfh+;$Y6?%&jWPN0MVHwkQ)?yh=0eJFf*3M}HxatD4qKN#OWW<=t2=673 zL7Ri#;XfW0%4Yc+?Utll-9Rz43JmoQ6(=UT*SEL3nO4di9i+>bi;TS(anv$_LJQk@ z&UgO24#%dDQ)E_($}jz0^m_C>FVIqPDi0bPk3*Z$BNt4PweB8|iG;*5xJ5`s+=F@S zmnkjdpvNcR@FFF{Z<6UgkS{mE3QTdn({(w;x*d7Lt?XO@wR8nx*ISyLr>=08MDK)sAott3nHQc!uj$+z7b{D%F?4#wj?j=w3*xNju zP$C|*SKhjP1xa_5(1T@RvS&z2y(3E@R3<6fbbrj*tU_njLx@JovQTM-RLWs=y+tz~ ztKu0s+wit5`Mq_oSFp~iTFD$1t3j(u_s3i}(Fl?`65WPF z>1bgSo25J>pXBFJ^{mMaJ~Qg6;T-9A^WT^`HmxfQkiNB{zwr#>wV3@pGfxGzbidGo zDx{8)A!GMSw@K;iNh~!PQ;LVyYL+K@YK(e+w)4SZjze0Px`LkJSAA^O6HotcrFLha z&4b5;N7y>GY-&5g$4OS;<20koz$v(pQ~e=@4cWNwvIbuv#jB7qqk7f+fX9=8QHK}* z#f5#)k8bl%+lMB5Sk-Y~pu->0@?xSH%<-8ud}@m_lTIb1GLz;{5B13n+B^;&T+bI9 z(KJsVyU46HC9YV8iNj2+JvwHopnRSp0t<#hGXlmmbo;6C(i%orS$R9N&MBpex5NT2 zHu_g_28ttsEkpYIc1JoBpxP^;V*MfuQ~|FK7G8#h^kGTG*4HnY*0`5jQLQ(vtRrk5 z>}*;1PosWuHl`;3k*yL_e7gV5HEkn*RqT<~)G&YIU$Efms>W9?dUu>VbtURne-(_} zu;*}oRs0s(CSulIj$zSVG@Ha&LUS^dp=CY-8+Id?UvJUIf|X~65*i>$OC-QyYz^Lf zcG%aZGsc4^YGaGl^N2m1?fqADE_%jIEV^(+ya(QI^e3XaqfEW8XD}P9SH-?iSCOL% zcKKnET_~QJy-+4%Ddy+ZmVml(EpztW;e`KUA&dswVPoe_0*ktq_gbshO>j-~UFG@c zi}A`wR`v1Ho>6+`Lel$t1mZSUXG4qWqotwl?l|ApP*6i0Glku!l?HmlLjfSC)Ib@r zz0hB@E=`W_FXC5kSIzojzFB0fHbvs-D8PIwz{bQcw1 zuCE%7GY}cu-Apg{`VCce-7Wt@@{5_tr~|?r7X{ERIH0rBRjUKkSmPI8ozzopa~$yX zt6D_gR>hssWrWesSfTj&omP*{@apyLPZ#qC4GxfvE5MOlyqmsWo^T@5Nh6yoOHyYU z&TNV7jM5Kb8VM>*9Q>v*NzAF56AMw3zGM|2Y& zaNAw9u9w|U^Ut>%qCaD)I8CW&&#bq8o24{dPvc+WWs&=ke$LsA>c4Uk0F8YBztPMkD5r1gqi^rPv$rJlFSpjGFM(hyH%^ z%;9L+_nKCkHdHqm8mb3z!+y*lyS*Pi-z*{!hLFV3D&%JoA)#f7b%?4?*^?|xiQk@+ z{dh*Ycl-3{8Eo2*m8U)E4>>u~Pu!kRnlPT8^x5J3#F6%(^GvgHx$AqZ?>9B@{pS;1 z5^A}$m!hIS054j-7a&OIQ+&*<^Qa-Mt<~X)A4t_VlDpDb!r)S1nfYs*)SLDbH0JU{ zGIyd!64=vjv>h9C5@&1u(d5%eFSz#0%KTl{ck9j-iyCSupW_N43dphJr+}tPVjA;U zNp}_Ak+aZ&v!sDjlHHaBPhTJ}dIoPRq0@FnS785{@aIY`vXK;ym#Lg)A3K(=uh0m& z4venh_>64qH%616fnaWB776+BMXaA5yXpzJi?K(tIs(O>JC@wIAKsgl>vv#h3cQm7 zVpO8K$5J&#PtDbw2+p#%pK3;JiFC2yB1oGugj<_926pqEM08k116| z6IRjT3m2ALSu|}DfK(^lRMRubFD)EVfAmfwDC$Eti*sO9_*(vaEq($`?$&A9)loB2 z20jojBnaWxRb{!>1~b+YcGpmK0x~0Atp~A2AGS| zvBSh_aLCsHhJhMs(^?v0etvdJS-7#TMadu@|6M1~^r1XpJ>LG}m_FUyn&!qaS?@O?nNiOXa#wQrVP-z8jk5(8--#aZ<|QM6(g#`GKH^AP2bHHglVRe?%D|$)U3i> zMMbTgEG-393qJ%;4bqi)f)1K9t=ik%ZU|ehi2W;W=<$+Uu1F=1eAd;@dK~v>f8c(e zFp3GYwD71JG~d}KV=QT+ML5ZDXN={W*q&{5CKJBljqKT$s-<;SV`9GfK<-p-dh)gO zexJu~Ka(N&UM8;a47o(B^{TCqwwU#* z^b+dpjD{tM1<#YQI;g=ZO}6XrXbedB7+~83)JjPNWiy9eS#5pet=i09iJMMpgP4>0%#?NGSlf)7%@UjXCK&@CG!f5^ITxOEj>- z%e~g*ZX|#}z{f6S*5PbBqAU50NYA$%)yY=Eu|{TYyX64*06NiW0ihNyO}v$>qo*$!4o;Y4QoyEfrzVa!cAezYYT&G=57^Z5Tr=O2 z!EFr@7q;?q+s_pZU|w>tzz|qG5#M;;xY6$`rxKyyS80-w*ZL$XS5$A1P@B*p{UQ+n z;fW(TtB|$$6yFlzJ@WC3^+}$6RY5teBl0%Ulo9fOV~Cj(wMLz2LbKOJ~uyx_It#eq71xzK1aHLLAXl8_c*S4;>rx z%HjpN;d0w7Uif3sRXaA=W!PfKQ+H?y_ZNvQU8q_ao3_(vHAf9>aMGO> 
zd=nARqtB~Uvq0G#vLNyMaj~9iQb}u89!FS?mq;mUb$)L@pKgG-vclwzM4)}1!;N4V zXuoJ7D*=PS8Wsumbzg2cde2&3KZYtB*CYdf;4^T;Ky{;xkiN$RZz6#<6McrHY`mJ= z3B4G@O98uXX!`POFYAfz)vxBSv&;1zWkrBPixbL-2Y_LnY7r0d0GO4 zqlnd3r@^`1;jpqei9;rtk=waG=e~CX z78dD4t$SNb9ZgROAP<@ou&_I^>lEVpd*4jzUqz7HS00&}VdIbxxDZrJMagQ>K9~0P zX6zw`Xj5z&6eWADuP(?#GXgs&c{s@E?IatdM=nN*R&5nfW8;w9n|n7Dd2d9`?>Jfr zY6T0)=&yU}B%JX%i%r{@_iFB2Y``Rh{%s)NS)4drk9f|ERFUdR)#j@R2V`dl4#5@3 zCgiyvb3~r#s{a`*nPe*Z0?z5`N5aPwqusHLO3?UfBTZJxX>ad<0uE-^XY&sQaR}HE zU1xmEu)rXO?~6T8^T_RkD<+PMc(JhYaYTI$7~0=+kXAEDNYSsxg>9Cp}gbR2<3ZPbr_?hHt!o%(hZktz=J z+&4N30G};Hzu5`n-`DeEC%y7xfMVYM8g~-Qb*^co)h(|8I;dT*>ARmo%PGXdqfQR)-;SiLZ(?9xb>x6P&Q=o9MvXEvxSt9}aA<{5)o}R2*8c$J3Q>^VIVg=Xp&oBzD#Vw-Yr_kKTo$(MB}Oo$)-R zMp!ZvP0vmku&}jo@{XsC>^+~QZn>!U)H6v=m*S!fYs)I?Wv6#>C$@(LjrK!8X@q<) ztkukgIw!>BV-G5=;WJ-;)gaSH;QQXk_gMtrHC1A^Tqx~l=`)2`I7q@F7y#v#^}GEt z=xJT^|9dI8EF4@xr}!nF$B35pEs{>c^W<=;-pke1btm=E8B{hdH?c@(vQ^OeC$B_L zwC+G9WbJ)psApsv2OhrHTM@7G1=MCrRzdd~b zK6SBopG)cS&#Jh*!u5m|3;QOh#1q-;Z0?-i5>9Uk|C26O71~hb{G!AH>oB-!{`Ic$ z$GOshV_^dn$Tks%#z@y>|=ED{xf__+I2cGql`U5&3=^A#}cbvtN(`x zD$v_q{K5_gUBCYEEunacYIcO-fz+WR4na+wHyvtN5dF<4=2zX{fCL9c0tt36&vc3e ztku?D3S8QczD-<3Od&K6!RBA{$va?EqY}0q(LO5}grXyWutPW9H7@wtKnTuNEZ&}& zN&k5P7DSz$UIL%)zR0kei2cV&;bGLU5&NBf6NSBgux){&VB^PJyeIS~L}Afqd@h5n zL6ICTz<7GAJ{2NPwVPA9>r^2uWL%G**^>Ah$?4ZIz=6#}=jFDaxgp_oh}!)IJKoyb zT9Vk^0O!S~6}ANWJL!fvY8{!sheV~(5q)X$h0xo{G;9pUeproem<9Xq?DLa(ByQJW z{rzi1w47T->HxQ={#D^^OlR0@-vgjVxAsN>s5pevqML z3(7Q!L@pWqA0+-3j^!teaai83E2aC$ky7 zMsz2`7^Cxk19cN`lOJ!V>gFDR$?}JS{F&sW4i!-qu`PS&*^}_P-=StM5 z3>4Wx*_R45q+MLpbbZ}H#L2sV;wVf+EGR*d6y<$zX31#{Tv6Zrj^iZQecF1tIkcp= zWRN#JJBt3vOUVZfVx$NX9-E&R{u#P6vTTymlOjNaB0trjPVLB32lv$UKFtaIdpUuS zi}v-z@o4Cu_QPL=BW8-e&zCDIpMwB&Xxik2lLv z4`_uDSzH~n>iPl-#=1Z~dWIqa7~+%&oa2r*eyw_vp7~zx;0mp}S0`7nu=9DQ@zJ~8 zupP?zU+K9`PvAd%eA%!KWTW2HA9~4OL1@p5-#NIVydoCk@a12dX7JEhhoell_7xe? z_}Dn|I4HwdoQKpnFR<5LAEaEgi~(1o82N1t@n06 zK~dGW5dQQXkQ9Qw;Og=AZiBQoSxb4?AGe9qRNZM1@ie=68h8C4oDkN}2|O~ZKt=pr zf)!jB8iE@zsa5giU#Zm{qxzZ}(L?295pZ7ABsTQI4x6e;>yM7JkHuh1ZP;7a8E^>n zf$QtUa`lLLvjXhD6>ne%o>-2%D+0Z9Tl1aq0r}pH>kn_CUtuCly*n>qEhyQem^Qu- zOT;(=V2ihzTa6h9CdBiTtbUO9o`^74fW}w34zc{UasMkN`h#XC8BD+$0FMoQ*{ErE{Q*t02Oszp1!EaHE zUY5Bq30{yLiYalzvj!3z#F3b3v^t(j%pn2EKME3VrrOf^|(7o{i iP0()}!GDPtgd(L~R)rUDiF%y@f8=D8?&L`szxZE``ba?l literal 0 HcmV?d00001 diff --git a/docs/ingestion/kafka-ingestion.md b/docs/ingestion/kafka-ingestion.md index de2024129834..d79296a55b6a 100644 --- a/docs/ingestion/kafka-ingestion.md +++ b/docs/ingestion/kafka-ingestion.md @@ -24,41 +24,39 @@ description: "Overview of the Kafka indexing service for Druid. Includes example ~ under the License. --> +:::info +To use the Kafka indexing service, you must be on Apache Kafka version of 0.11.x or higher. +If you are using an older version, refer to the [Kafka upgrade guide](https://kafka.apache.org/documentation/#upgrade). +::: + When you enable the Kafka indexing service, you can configure supervisors on the Overlord to manage the creation and lifetime of Kafka indexing tasks. Kafka indexing tasks read events using Kafka's own partition and offset mechanism to guarantee exactly-once ingestion. The supervisor oversees the state of the indexing tasks to coordinate handoffs, manage failures, and ensure that scalability and replication requirements are maintained. -This topic contains configuration reference information for the Kafka indexing service supervisor for Apache Druid. +This topic contains configuration information for the Kafka indexing service supervisor for Apache Druid. 
## Setup

To use the Kafka indexing service, you must first load the `druid-kafka-indexing-service` extension on both the Overlord and the MiddleManager. See [Loading extensions](../configuration/extensions.md) for more information.

-### Kafka support
-
-The Kafka indexing service supports transactional topics introduced in Kafka 0.11.x by default. The consumer for Kafka indexing service is incompatible with older Kafka brokers. If you are using an older version, refer to the [Kafka upgrade guide](https://kafka.apache.org/documentation/#upgrade).
-
-Additionally, you can set `isolation.level` to `read_uncommitted` in `consumerProperties` if either:
-- You don't need Druid to consume transactional topics.
-- You need Druid to consume older versions of Kafka. Make sure offsets are sequential, since there is no offset gap check in Druid.
+## Deployment notes on Kafka partitions and Druid segments

-If your Kafka cluster enables consumer-group based ACLs, you can set `group.id` in `consumerProperties` to override the default auto generated group ID.
+Druid assigns Kafka partitions to each Kafka indexing task. A task writes the events it consumes from Kafka into a single segment for the segment granularity interval until it reaches one of the following limits: `maxRowsPerSegment`, `maxTotalRows`, or `intermediateHandoffPeriod`. At this point, the task creates a new partition for this segment granularity to contain subsequent events.

-## Supervisor spec
+The Kafka indexing task also performs incremental handoffs. Therefore, segments become available as they are ready, rather than only at the end of the task duration. When the task reaches one of `maxRowsPerSegment`, `maxTotalRows`, or `intermediateHandoffPeriod`, it hands off all the segments and creates a new set of segments for further events. This allows the task to run for longer durations without accumulating old segments locally on MiddleManager services.

-Similar to the ingestion spec for batch ingestion, the [supervisor spec](../ingestion/supervisor.md#supervisor-spec) configures the data ingestion for Kafka streaming ingestion.
+The Kafka indexing service may still produce some small segments. For example, consider the following scenario:
+- Task duration is 4 hours.
+- Segment granularity is set to `HOUR`.
+- The supervisor was started at 9:10.
+After 4 hours at 13:10, Druid starts a new set of tasks. The events for the interval 13:00 - 14:00 may be split across existing tasks and the new set of tasks, which could result in small segments. To merge them together into new segments of an ideal size (in the range of ~500-700 MB per segment), you can schedule re-indexing tasks, optionally with a different segment granularity.
+For information on how to optimize the segment size, see [Segment size optimization](../operations/segment-optimization.md).

-The following table outlines the high-level configuration options for the Kafka supervisor spec:
+## Supervisor spec configuration

-|Property|Type|Description|Required|
-|--------|----|-----------|--------|
-|`type`|String|The supervisor type; must be `kafka`.|Yes|
-|`spec`|Object|The container object for the supervisor configuration.|Yes|
-|`ioConfig`|Object|The I/O configuration object to define the connection and I/O-related settings for the supervisor and indexing tasks.|Yes|
-|`dataSchema`|Object|The schema for the indexing task to use during ingestion. 
See [`dataSchema`](../ingestion/ingestion-spec.md#dataschema) for more information.|Yes| -|`tuningConfig`|Object|The tuning configuration object to define performance-related settings for the supervisor and indexing tasks.|No| +This section outlines the configuration properties that are specific to the Apache Kafka streaming ingestion method. For configuration properties shared across all streaming ingestion methods supported by Druid, see [Supervisor spec](supervisor.md#supervisor-spec). -The following example shows a supervisor spec for the Kafka indexing service. +The following example shows a supervisor spec for the Kafka indexing service:

Click to view the example

```json
{
  "type": "kafka",
  "spec": {
    "dataSchema": {
      "dataSource": "metrics-kafka",
      "timestampSpec": {
        "column": "timestamp",
        "format": "auto"
      },
      "dimensionsSpec": {
        "dimensions": [],
        "dimensionExclusions": [
          "timestamp",
          "value"
-        ]
-      },
+        ]
+      },
      "metricsSpec": [
        {
          "name": "count",
          "type": "count"
        },
        {
          "name": "value_sum",
          "fieldName": "value",
          "type": "doubleSum"
        },
        {
          "name": "value_min",
          "fieldName": "value",
-          "type": "doubleMin"
+          "type": "doubleMin"
        },
        {
          "name": "value_max",
          "fieldName": "value",
          "type": "doubleMax"
-        }
+        }
      ],
      "granularitySpec": {
        "type": "uniform",
@@ -123,7 +121,7 @@ The following example shows a supervisor spec for the Kafka indexing service.
        "type": "kafka",
        "maxRowsPerSegment": 5000000
      }
-  }
+  }
}
```

### I/O configuration

-The following table outlines the configuration options for `ioConfig`:
+The following table outlines the Kafka-specific configuration properties for `ioConfig`:

|Property|Type|Description|Required|Default|
|--------|----|-----------|--------|-------|
-|`topic`|String|The Kafka topic to read from. Must be a specific topic. Druid does not support topic patterns. To ingest data from multiple topic, see [Ingest from multiple topics](#ingest-from-multiple-topics). |Yes||
-|`inputFormat`|Object|The [input format](../ingestion/data-formats.md#input-format) to define input data parsing.|Yes||
+|`topic`|String|Single Kafka topic to read from. To ingest data from multiple topics, use `topicPattern`.|Yes if `topicPattern` isn't set.||
+|`topicPattern`|String|Multiple Kafka topics to read from, passed as a regex pattern. See [Ingest from multiple topics](#ingest-from-multiple-topics) for more information.|Yes if `topic` isn't set.||
|`consumerProperties`|String, Object|A map of properties to pass to the Kafka consumer. See [Consumer properties](#consumer-properties) for details.|Yes||
|`pollTimeout`|Long|The length of time to wait for the Kafka consumer to poll records, in milliseconds.|No|100|
-|`replicas`|Integer|The number of replica sets, where 1 is a single set of tasks (no replication). Druid always assigns replicate tasks to different workers to provide resiliency against process failure.|No|1|
-|`taskCount`|Integer|The maximum number of reading tasks in a replica set. The maximum number of reading tasks equals `taskCount * replicas`. The total number of tasks, reading and publishing, is greater than this count. See [Capacity planning](../ingestion/supervisor.md#capacity-planning) for more details. When `taskCount > {numKafkaPartitions}`, the actual number of reading tasks is less than the `taskCount` value.|No|1|
-|`taskDuration`|ISO 8601 period|The length of time before tasks stop reading and begin publishing segments.|No|PT1H|
-|`startDelay`|ISO 8601 period|The period to wait before the supervisor starts managing tasks.|No|PT5S|
-|`period`|ISO 8601 period|Determines how often the supervisor executes its management logic. Note that the supervisor also runs in response to certain events, such as tasks succeeding, failing, and reaching their task duration. The `period` value specifies the maximum time between iterations.|No|PT30S|
|`useEarliestOffset`|Boolean|If a supervisor manages a datasource for the first time, it obtains a set of starting offsets from Kafka. This flag determines whether it retrieves the earliest or latest offsets in Kafka. Under normal circumstances, subsequent tasks start from where the previous segments ended. 
Druid only uses `useEarliestOffset` on the first run.|No|`false`| -|`completionTimeout`|ISO 8601 period|The length of time to wait before declaring a publishing task as failed and terminating it. If the value is too low, your tasks may never publish. The publishing clock for a task begins roughly after `taskDuration` elapses.|No|PT30M| -|`lateMessageRejectionStartDateTime`|ISO 8601 date time|Configures tasks to reject messages with timestamps earlier than this date time. For example, if this property is set to `2016-01-01T11:00Z` and the supervisor creates a task at `2016-01-01T12:00Z`, Druid drops messages with timestamps earlier than `2016-01-01T11:00Z`. This can prevent concurrency issues if your data stream has late messages and you have multiple pipelines that need to operate on the same segments, such as a realtime and a nightly batch ingestion pipeline.|No|| -|`lateMessageRejectionPeriod`|ISO 8601 period|Configures tasks to reject messages with timestamps earlier than this period before the task was created. For example, if this property is set to `PT1H` and the supervisor creates a task at `2016-01-01T12:00Z`, Druid drops messages with timestamps earlier than `2016-01-01T11:00Z`. This may help prevent concurrency issues if your data stream has late messages and you have multiple pipelines that need to operate on the same segments, such as a realtime and a nightly batch ingestion pipeline. Note that you can specify only one of the late message rejection properties.|No|| -|`earlyMessageRejectionPeriod`|ISO 8601 period|Configures tasks to reject messages with timestamps later than this period after the task reached its task duration. For example, if this property is set to `PT1H`, the task duration is set to `PT1H` and the supervisor creates a task at `2016-01-01T12:00Z`, Druid drops messages with timestamps later than `2016-01-01T14:00Z`. Tasks sometimes run past their task duration, such as in cases of supervisor failover. Setting `earlyMessageRejectionPeriod` too low may cause Druid to drop messages unexpectedly whenever a task runs past its originally configured task duration.|No|| -|`autoScalerConfig`|Object|Defines auto scaling behavior for ingestion tasks. See [Task autoscaler](../ingestion/supervisor.md#task-autoscaler) for more information.|No|null| -|`idleConfig`|Object|Defines how and when the Kafka supervisor can become idle. See [Idle supervisor configuration](#idle-supervisor-configuration) for more details.|No|null| +|`idleConfig`|Object|Defines how and when the Kafka supervisor can become idle. See [Idle configuration](#idle-configuration) for more details.|No|null| + +For configuration properties shared across all streaming ingestion methods supported by Druid, refer to [Supervisor I/O configuration](supervisor.md#io-configuration). + + +#### Ingest from multiple topics + +:::info +If you enable multi-topic ingestion for a datasource, downgrading to a version older than +28.0.0 will cause the ingestion for that datasource to fail. +::: + +You can ingest data from one or multiple topics. +When ingesting data from multiple topics, Druid assigns partitions based on the hashcode of the topic name and the ID of the partition within that topic. The partition assignment might not be uniform across all the tasks. Druid assumes that partitions across individual topics have similar load. 
If you want to ingest from both high and low load topics in the same supervisor, it is recommended that you have a higher number of partitions for a high load topic and a lower number of partitions for a low load topic.

To ingest data from multiple topics, use the `topicPattern` property instead of `topic`.
You pass multiple topics as a regex pattern. For example, to ingest data from clicks and impressions, set `topicPattern` to `clicks|impressions`.
Similarly, you can use `metrics-.*` as the value for `topicPattern` if you want to ingest from all the topics that start with `metrics-`. If you add a new topic that matches the regex to the cluster, Druid automatically starts ingesting from those new topics. Topic names that match partially, such as `my-metrics-12`, are not included for ingestion.

#### Consumer properties

Consumer properties must contain a property `bootstrap.servers` with a list of Kafka brokers in the form: `<BROKER_1>:<PORT_1>,<BROKER_2>:<PORT_2>,...`.

By default, `isolation.level` is set to `read_committed`.

Set `isolation.level` to `read_uncommitted` in `consumerProperties` if either:
- You don't need Druid to consume transactional topics.
- You need Druid to consume older versions of Kafka. Make sure offsets are sequential, since there is no offset gap check in Druid.

If your Kafka cluster enables consumer-group based ACLs, you can set `group.id` in `consumerProperties` to override the default auto-generated group ID.

In some cases, you may need to fetch consumer properties at runtime, for example, when `bootstrap.servers` is not known upfront or is not static. To enable SSL connections, you must provide the `keystore`, `truststore`, and `key` passwords as secrets. You can provide configurations at runtime with a dynamic config provider implementation, like the environment variable config provider that comes with Druid. For more information, see [Dynamic config provider](../operations/dynamic-config-provider.md).

For example, when using SSL with Kafka, you can store the credentials in environment variables:

```
export SSL_KEY_PASSWORD=mysecretkeypassword
export SSL_KEYSTORE_PASSWORD=mysecretkeystorepassword
export SSL_TRUSTSTORE_PASSWORD=mysecrettruststorepassword
```

```json
-  "druid.dynamic.config.provider": {
-    "type": "environment",
-    "variables": {
-      "sasl.jaas.config": "KAFKA_JAAS_CONFIG",
-      "ssl.key.password": "SSL_KEY_PASSWORD",
-      "ssl.keystore.password": "SSL_KEYSTORE_PASSWORD",
-      "ssl.truststore.password": "SSL_TRUSTSTORE_PASSWORD"
-    }
+"druid.dynamic.config.provider": {
+  "type": "environment",
+  "variables": {
+    "sasl.jaas.config": "KAFKA_JAAS_CONFIG",
+    "ssl.key.password": "SSL_KEY_PASSWORD",
+    "ssl.keystore.password": "SSL_KEYSTORE_PASSWORD",
+    "ssl.truststore.password": "SSL_TRUSTSTORE_PASSWORD"
  }
+}
```

Verify that you've changed the values for all configurations to match your own environment. In the Druid data loader interface, you can use the environment variable config provider syntax in the **Consumer properties** field on the **Connect tab**. When connecting to Kafka, Druid replaces the environment variables with their corresponding values.

-#### Task autoscaler
-
-You can optionally configure autoscaling behavior for ingestion tasks using the `autoScalerConfig` property of the `ioConfig` object. 
-
-The following table outlines the configuration options for `autoScalerConfig`:
-
-|Property|Description|Required|Default|
-|--------|-----------|--------|-------|
-|`enableTaskAutoScaler`|Enables the auto scaler. If not specified, Druid disables the auto scaler even when `autoScalerConfig` is not null.|No|`false`|
-|`taskCountMax`|Maximum number of ingestion tasks. Set `taskCountMax >= taskCountMin`. If `taskCountMax > {numKafkaPartitions}`, Druid only scales reading tasks up to `{numKafkaPartitions}`. In this case, `taskCountMax` is ignored.|Yes||
-|`taskCountMin`|Minimum number of ingestion tasks. When you enable the autoscaler, Druid ignores the value of `taskCount` in `ioConfig` and starts with the `taskCountMin` number of tasks to launch.|Yes||
-|`minTriggerScaleActionFrequencyMillis`|Minimum time interval between two scale actions.| No|600000|
-|`autoScalerStrategy`|The algorithm of `autoScaler`. Druid only supports the `lagBased` strategy. See [Autoscaler strategy](#autoscaler-strategy) for more information.|No|`lagBased`|
-
-##### Autoscaler strategy
-
-The following table outlines the configuration options for `autoScalerStrategy`:
-
-|Property|Description|Required|Default|
-|--------|-----------|--------|-------|
-|`lagCollectionIntervalMillis`|The time period during which Druid collects lag metric points.|No|30000|
-|`lagCollectionRangeMillis`|The total time window of lag collection. Use with `lagCollectionIntervalMillis` to specify the intervals at which to collect lag metric points.|No|600000|
-|`scaleOutThreshold`|The threshold of scale out action. |No|6000000|
-|`triggerScaleOutFractionThreshold`|Enables scale out action if `triggerScaleOutFractionThreshold` percent of lag points is higher than `scaleOutThreshold`.|No|0.3|
-|`scaleInThreshold`|The threshold of scale in action.|No|1000000|
-|`triggerScaleInFractionThreshold`|Enables scale in action if `triggerScaleInFractionThreshold` percent of lag points is lower than `scaleOutThreshold`.|No|0.9|
-|`scaleActionStartDelayMillis`|The number of milliseconds to delay after the supervisor starts before the first scale logic check.|No|300000|
-|`scaleActionPeriodMillis`|The frequency in milliseconds to check if a scale action is triggered.|No|60000|
-|`scaleInStep`|The number of tasks to reduce at once when scaling down.|No|1|
-|`scaleOutStep`|The number of tasks to add at once when scaling out.|No|2|

-#### Idle supervisor configuration
+#### Idle configuration

:::info
Idle state transitioning is currently designated as experimental.
:::

When the supervisor enters the idle state, no new tasks are launched after the currently executing tasks complete. This strategy may lead to reduced costs of cluster operations for inactive streams.

The following table outlines the configuration options for `idleConfig`:

|Property|Description|Required|Default|
|--------|-----------|--------|-------|
|`enabled`|If `true`, the supervisor becomes idle if there is no data on the input topic for some time.|No|`false`|
|`inactiveAfterMillis`|The supervisor is marked as idle if all existing data has been read from the input topic and no new data has been published for `inactiveAfterMillis` milliseconds.|No|`600000`|

The following example shows a supervisor spec with `lagBased` autoscaler and idle configuration enabled:

```json
{
  "type": "kafka",
  "spec": {
    "dataSchema": {
      ... 
+ "type": "kafka", + "spec": { + "dataSchema": {...}, + "ioConfig": { + "topic": "metrics", + "inputFormat": { + "type": "json" + }, + "consumerProperties": { + "bootstrap.servers": "localhost:9092" }, - "ioConfig": { - "topic": "metrics", - "inputFormat": { - "type": "json" - }, - "consumerProperties": { - "bootstrap.servers": "localhost:9092" - }, - "autoScalerConfig": { - "enableTaskAutoScaler": true, - "taskCountMax": 6, - "taskCountMin": 2, - "minTriggerScaleActionFrequencyMillis": 600000, - "autoScalerStrategy": "lagBased", - "lagCollectionIntervalMillis": 30000, - "lagCollectionRangeMillis": 600000, - "scaleOutThreshold": 6000000, - "triggerScaleOutFractionThreshold": 0.3, - "scaleInThreshold": 1000000, - "triggerScaleInFractionThreshold": 0.9, - "scaleActionStartDelayMillis": 300000, - "scaleActionPeriodMillis": 60000, - "scaleInStep": 1, - "scaleOutStep": 2 - }, - "taskCount":1, - "replicas":1, - "taskDuration":"PT1H", - "idleConfig": { - "enabled": true, - "inactiveAfterMillis": 600000 - } + "autoScalerConfig": { + "enableTaskAutoScaler": true, + "taskCountMax": 6, + "taskCountMin": 2, + "minTriggerScaleActionFrequencyMillis": 600000, + "autoScalerStrategy": "lagBased", + "lagCollectionIntervalMillis": 30000, + "lagCollectionRangeMillis": 600000, + "scaleOutThreshold": 6000000, + "triggerScaleOutFractionThreshold": 0.3, + "scaleInThreshold": 1000000, + "triggerScaleInFractionThreshold": 0.9, + "scaleActionStartDelayMillis": 300000, + "scaleActionPeriodMillis": 60000, + "scaleInStep": 1, + "scaleOutStep": 2 }, - "tuningConfig":{ - ... - } - } + "taskCount": 1, + "replicas": 1, + "taskDuration": "PT1H", + "idleConfig": { + "enabled": true, + "inactiveAfterMillis": 600000 + } + }, + "tuningConfig": {...} + } } ```
-#### Ingest from multiple topics - -:::info -If you enable multi-topic ingestion for a datasource, downgrading to a version older than -28.0.0 will cause the ingestion for that datasource to fail. -::: - -To ingest data from multiple topics, you set `topicPattern` instead of `topic` in the supervisor `ioConfig` object. -You can pass multiple topics as a regex pattern as the value for `topicPattern` in `ioConfig`. For example, to -ingest data from clicks and impressions, set `topicPattern` to `clicks|impressions` in `ioCofig`. -Similarly, you can use `metrics-.*` as the value for `topicPattern` if you want to ingest from all the topics that -start with `metrics-`. If you add a new topic that matches the regex to the cluster, Druid automatically starts -ingesting from those new topics. Topic names that match partially, such as `my-metrics-12`, are not included for ingestion. - -When ingesting data from multiple topics, Druid assigns partitions based on the hashcode of the topic name and the -ID of the partition within that topic. The partition assignment might not be uniform across all the tasks. It's also -assumed that partitions across individual topics have similar load. It is recommended that you have a higher number of -partitions for a high load topic and a lower number of partitions for a low load topic. Assuming that you want to -ingest from both high and low load topic in the same supervisor. ### Tuning configuration -The `tuningConfig` object is optional. If you don't specify the `tuningConfig` object, Druid uses the default configuration settings. - -The following table outlines the configuration options for `tuningConfig`: +The following table outlines the Kafka-specific configuration properties for `tuningConfig`: |Property|Type|Description|Required|Default| |--------|----|-----------|--------|-------| -|`type`|String|The indexing task type; must be `kafka`.|Yes|| -|`maxRowsInMemory`|Integer|The number of rows to aggregate before persisting. This number represents the post-aggregation rows. It is not equivalent to the number of input events, but the resulting number of aggregated rows. Druid uses `maxRowsInMemory` to manage the required JVM heap size. The maximum heap memory usage for indexing scales is `maxRowsInMemory * (2 + maxPendingPersists)`. Normally, you do not need to set this, but depending on the nature of data, if rows are short in terms of bytes, you may not want to store a million rows in memory and this value should be set.|No|150000| -|`maxBytesInMemory`|Long|The number of bytes to aggregate in heap memory before persisting. This is based on a rough estimate of memory usage and not actual usage. Normally, this is computed internally. The maximum heap memory usage for indexing is `maxBytesInMemory * (2 + maxPendingPersists)`.|No|One-sixth of max JVM memory| -|`skipBytesInMemoryOverheadCheck`|Boolean|The calculation of `maxBytesInMemory` takes into account overhead objects created during ingestion and each intermediate persist. To exclude the bytes of these overhead objects from the `maxBytesInMemory` check, set `skipBytesInMemoryOverheadCheck` to `true`.|No|`false`| -|`maxRowsPerSegment`|Integer|The number of rows to store in a segment. This number is post-aggregation rows. Handoff occurs when `maxRowsPerSegment` or `maxTotalRows` is reached or every `intermediateHandoffPeriod`, whichever happens first.|No|5000000| -|`maxTotalRows`|Long|The number of rows to aggregate across all segments; this number is post-aggregation rows. 
Handoff happens either if `maxRowsPerSegment` or `maxTotalRows` is reached or every `intermediateHandoffPeriod`, whichever happens earlier.|No|20000000| -|`intermediateHandoffPeriod`|ISO 8601 period|The period that determines how often tasks hand off segments. Handoff occurs if `maxRowsPerSegment` or `maxTotalRows` is reached or every `intermediateHandoffPeriod`, whichever happens first.|No|P2147483647D| -|`intermediatePersistPeriod`|ISO 8601 period|The period that determines the rate at which intermediate persists occur.|No|PT10M| -|`maxPendingPersists`|Integer|Maximum number of persists that can be pending but not started. If a new intermediate persist exceeds this limit, Druid blocks ingestion until the currently running persist finishes. One persist can be running concurrently with ingestion, and none can be queued up. The maximum heap memory usage for indexing scales is `maxRowsInMemory * (2 + maxPendingPersists)`.|No|0| -|`indexSpec`|Object|Defines how Druid indexes the data. See [IndexSpec](#indexspec) for more information.|No|| -|`indexSpecForIntermediatePersists`|Object|Defines segment storage format options to use at indexing time for intermediate persisted temporary segments. You can use `indexSpecForIntermediatePersists` to disable dimension/metric compression on intermediate segments to reduce memory required for final merging. However, disabling compression on intermediate segments might increase page cache use while they are used before getting merged into final segment published. See [IndexSpec](#indexspec) for possible values.|No|Same as `indexSpec`| -|`reportParseExceptions`|Boolean|DEPRECATED. If `true`, Druid throws exceptions encountered during parsing causing ingestion to halt. If `false`, Druid skips unparseable rows and fields. Setting `reportParseExceptions` to `true` overrides existing configurations for `maxParseExceptions` and `maxSavedParseExceptions`, setting `maxParseExceptions` to 0 and limiting `maxSavedParseExceptions` to not more than 1.|No|`false`| -|`handoffConditionTimeout`|Long|Number of milliseconds to wait for segment handoff. Set to a value >= 0, where 0 means to wait indefinitely.|No|900000 (15 minutes)| -|`resetOffsetAutomatically`|Boolean| Determines how Druid reads Kafka messages when partitions in the topic have `offsetOutOfRangeException`. This feature behaves similarly to the Kafka `auto.offset.reset` consumer property. If `resetOffsetAutomatically` is set to `true`, Druid automatically resets to the earliest or latest offset available in Kafka, based on the value of the `useEarliestOffset` property (earliest if `true`, latest if `false`). Druid logs messages to the ingestion task log file indicating that a reset has occurred without interrupting ingestion. Setting `resetOffsetAutomatically` to `true` can lead to dropping data (if `useEarliestSequenceNumber` is `false`) or duplicating data (if `useEarliestSequenceNumber` is `true`) without your knowledge.
If only one partition in the topic has `offsetOutOfrangeException`, the offset is reset for that partition only.
If `resetOffsetAutomatically` is `false`, the exception bubbles up causing tasks to fail and ingestion to halt. If this occurs, manual intervention is required to correct the situation, potentially using the [Reset Supervisor API](../api-reference/supervisor-api.md#reset-a-supervisor). |No|`false`| -|`workerThreads`|Integer|The number of threads that the supervisor uses to handle requests/responses for worker tasks, along with any other internal asynchronous operation.|No|`min(10, taskCount)`| |`chatAsync`|Boolean|If `true`, use asynchronous communication with indexing tasks, and ignore the `chatThreads` parameter. If `false`, use synchronous communication in a thread pool of size `chatThreads`.|No|`true`| |`chatThreads`|Integer|The number of threads to use for communicating with indexing tasks. Ignored if `chatAsync` is `true`.|No|`min(10, taskCount * replicas)`| -|`chatRetries`|Integer|The number of times HTTP requests to indexing tasks are retried before considering tasks unresponsive.|No|8| -|`httpTimeout`| ISO 8601 period|The period of time to wait for a HTTP response from an indexing task.|No|PT10S| -|`shutdownTimeout`|ISO 8601 period|The period of time to wait for the supervisor to attempt a graceful shutdown of tasks before exiting.|No|PT80S| -|`offsetFetchPeriod`|ISO 8601 period|Determines how often the supervisor queries Kafka and the indexing tasks to fetch current offsets and calculate lag. If the user-specified value is below the minimum value of `PT5S`, the supervisor ignores the value and uses the minimum value instead.|No|PT30S| -|`segmentWriteOutMediumFactory`|Object|The segment write-out medium to use when creating segments. See [Additional Peon configuration: SegmentWriteOutMediumFactory](../configuration/index.md#segmentwriteoutmediumfactory) for explanation and available options.|No|If not specified, Druid uses the value from `druid.peon.defaultSegmentWriteOutMediumFactory.type`.| -|`logParseExceptions`|Boolean|If `true`, Druid logs an error message when a parsing exception occurs, containing information about the row where the error occurred.|No|`false`| -|`maxParseExceptions`|Integer|The maximum number of parse exceptions that can occur before the task halts ingestion and fails. Overridden if `reportParseExceptions` is set.|No|unlimited| -|`maxSavedParseExceptions`|Integer|When a parse exception occurs, Druid keeps track of the most recent parse exceptions. `maxSavedParseExceptions` limits the number of saved exception instances. These saved exceptions are available after the task finishes in the [task completion report](../ingestion/tasks.md#task-reports). Overridden if `reportParseExceptions` is set.|No|0| - -#### IndexSpec - -The following table outlines the configuration options for `indexSpec`: - -|Property|Type|Description|Required|Default| -|--------|----|-----------|--------|-------| -|`bitmap`|Object|Compression format for bitmap indexes. Druid supports roaring and concise bitmap types.|No|Roaring| -|`dimensionCompression`|String|Compression format for dimension columns. One of `LZ4`, `LZF`, `ZSTD` or `uncompressed`.|No|`LZ4`| -|`metricCompression`|String|Compression format for primitive type metric columns. One of `LZ4`, `LZF`, `ZSTD`, `uncompressed` or `none`.|No|`LZ4`| -|`longEncoding`|String|Encoding format for metric and dimension columns with type long. One of `auto` or `longs`. `auto` encodes the values using offset or lookup table depending on column cardinality, and store them with variable size. 
`longs` stores the value as is with 8 bytes each.|No|`longs`|
-
-## Deployment notes on Kafka partitions and Druid segments
-
-Druid assigns Kafka partitions to each Kafka indexing task. A task writes the events it consumes from Kafka into a single segment for the segment granularity interval until it reaches one of the following limits: `maxRowsPerSegment`, `maxTotalRows`, or `intermediateHandoffPeriod`. At this point, the task creates a new partition for this segment granularity to contain subsequent events.
-The Kafka indexing task also does incremental hand-offs. Therefore, segments become available as they are ready and you don't have to wait for all segments until the end of the task duration. When the task reaches one of `maxRowsPerSegment`, `maxTotalRows`, or `intermediateHandoffPeriod`, it hands off all the segments and creates a new set of segments for further events. This allows the task to run for longer durations without accumulating old segments locally on MiddleManager services.
-
-The Kafka indexing service may still produce some small segments. For example, consider the following scenario:
-- Task duration is 4 hours.
-- Segment granularity is set to an HOUR.
-- The supervisor was started at 9:10.
-After 4 hours at 13:10, Druid starts a new set of tasks. The events for the interval 13:00 - 14:00 may be split across existing tasks and the new set of tasks which could result in small segments. To merge them together into new segments of an ideal size (in the range of ~500-700 MB per segment), you can schedule re-indexing tasks, optionally with a different segment granularity.
-For information on how to optimize the segment size, see [Segment size optimization](../operations/segment-optimization.md).
+For configuration properties shared across all streaming ingestion methods supported by Druid, refer to [Supervisor tuning configuration](supervisor.md#tuning-configuration).

## Learn more

See the following topics for more information:

* [Supervisor API](../api-reference/supervisor-api.md) for how to manage and monitor supervisors using the API.
* [Supervisor](../ingestion/supervisor.md) for supervisor status and capacity planning.
* [Loading from Apache Kafka](../tutorials/tutorial-kafka.md) for a tutorial on streaming data from Apache Kafka.
-* [Kafka input format](../ingestion/data-formats.md) to learn about the `kafka` input format.
\ No newline at end of file
+* [Kafka input format](../ingestion/data-formats.md#kafka) to learn about the `kafka` input format.
\ No newline at end of file
diff --git a/docs/ingestion/kinesis-ingestion.md b/docs/ingestion/kinesis-ingestion.md
index 8a550c3d4018..eb9a14e52f3b 100644
--- a/docs/ingestion/kinesis-ingestion.md
+++ b/docs/ingestion/kinesis-ingestion.md
@@ -29,7 +29,7 @@ import TabItem from '@theme/TabItem';

When you enable the Kinesis indexing service, you can configure supervisors on the Overlord to manage the creation and lifetime of Kinesis indexing tasks. These indexing tasks read events using the Kinesis shard and sequence number mechanism to guarantee exactly-once ingestion. The supervisor oversees the state of the indexing tasks to coordinate handoffs, manage failures, and ensure that scalability and replication requirements are maintained.

-This topic contains configuration reference information for the Kinesis indexing service supervisor for Apache Druid.
+This topic contains configuration information for the Kinesis indexing service supervisor for Apache Druid.
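As with Kafka supervisors, you can inspect a running Kinesis supervisor through the [Supervisor API](../api-reference/supervisor-api.md). A minimal sketch, assuming a supervisor ID of `KinesisStream` (supervisor IDs normally match the datasource name; both the ID and the Router address are placeholders):

```shell
# Retrieve the supervisor status report, including task states and recent exceptions.
curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/supervisor/KinesisStream/status"
```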
## Setup @@ -37,19 +37,11 @@ To use the Kinesis indexing service, you must first load the `druid-kinesis-inde Review [Kinesis known issues](#kinesis-known-issues) before deploying the `druid-kinesis-indexing-service` extension to production. -## Supervisor spec +## Supervisor spec configuration -The following table outlines the high-level configuration options for the Kinesis [supervisor spec](../ingestion/supervisor.md#supervisor-spec). +This section outlines the configuration properties that are specific to the Amazon Kinesis streaming ingestion method. For configuration properties shared across all streaming ingestion methods supported by Druid, see [Supervisor spec](supervisor.md#supervisor-spec). -|Property|Type|Description|Required| -|--------|----|-----------|--------| -|`type`|String|The supervisor type; must be `kinesis`.|Yes| -|`spec`|Object|The container object for the supervisor configuration.|Yes| -|`ioConfig`|Object|The [I/O configuration](#supervisor-io-configuration) object for configuring Kinesis connection and I/O-related settings for the supervisor and indexing tasks.|Yes| -|`dataSchema`|Object|The schema used by the Kinesis indexing task during ingestion. See [`dataSchema`](../ingestion/ingestion-spec.md#dataschema) for more information.|Yes| -|`tuningConfig`|Object|The [tuning configuration](#supervisor-tuning-configuration) object for configuring performance-related settings for the supervisor and indexing tasks.|No| - -The following example shows a supervisor spec for a stream with the name `KinesisStream`. +The following example shows a supervisor spec for a stream with the name `KinesisStream`:
Click to view the example @@ -129,114 +121,51 @@ The following example shows a supervisor spec for a stream with the name `Kinesi ### I/O configuration -The following table outlines the configuration options for `ioConfig`: +The following table outlines the `ioConfig` configuration properties specific to Kinesis: |Property|Type|Description|Required|Default| |--------|----|-----------|--------|-------| |`stream`|String|The Kinesis stream to read.|Yes|| -|`inputFormat`|Object|The [input format](../ingestion/data-formats.md#input-format) to specify how to parse input data.|Yes|| |`endpoint`|String|The AWS Kinesis stream endpoint for a region. You can find a list of endpoints in the [AWS service endpoints](http://docs.aws.amazon.com/general/latest/gr/rande.html#ak_region) document.|No|`kinesis.us-east-1.amazonaws.com`| -|`replicas`|Integer|The number of replica sets, where 1 is a single set of tasks (no replication). Druid always assigns replicate tasks to different workers to provide resiliency against process failure.|No|1| -|`taskCount`|Integer|The maximum number of reading tasks in a replica set. Multiply `taskCount` and `replicas` to measure the maximum number of reading tasks.
The total number of tasks (reading and publishing) is higher than the maximum number of reading tasks. See [Capacity planning](#capacity-planning) for more details. When `taskCount > {numKinesisShards}`, the actual number of reading tasks is less than the `taskCount` value.|No|1| -|`taskDuration`|ISO 8601 period|The length of time before tasks stop reading and begin publishing their segments.|No|PT1H| -|`startDelay`|ISO 8601 period|The period to wait before the supervisor starts managing tasks.|No|PT5S| -|`period`|ISO 8601 period|Determines how often the supervisor executes its management logic. Note that the supervisor also runs in response to certain events, such as tasks succeeding, failing, and reaching their task duration, so this value specifies the maximum time between iterations.|No|PT30S| |`useEarliestSequenceNumber`|Boolean|If a supervisor is managing a datasource for the first time, it obtains a set of starting sequence numbers from Kinesis. This flag determines whether a supervisor retrieves the earliest or latest sequence numbers in Kinesis. Under normal circumstances, subsequent tasks start from where the previous segments ended so this flag is only used on the first run.|No|`false`| -|`completionTimeout`|ISO 8601 period|The length of time to wait before Druid declares a publishing task has failed and terminates it. If this is set too low, your tasks may never publish. The publishing clock for a task begins roughly after `taskDuration` elapses.|No|PT6H| -|`lateMessageRejectionPeriod`|ISO 8601 period|Configures tasks to reject messages with timestamps earlier than this period before the task is created. For example, if `lateMessageRejectionPeriod` is set to `PT1H` and the supervisor creates a task at `2016-01-01T12:00Z`, messages with timestamps earlier than `2016-01-01T11:00Z` are dropped. This may help prevent concurrency issues if your data stream has late messages and you have multiple pipelines that need to operate on the same segments, such as a streaming and a nightly batch ingestion pipeline.|No|| -|`earlyMessageRejectionPeriod`|ISO 8601 period|Configures tasks to reject messages with timestamps later than this period after the task reached its `taskDuration`. For example, if `earlyMessageRejectionPeriod` is set to `PT1H`, the `taskDuration` is set to `PT1H` and the supervisor creates a task at `2016-01-01T12:00Z`. Messages with timestamps later than `2016-01-01T14:00Z` are dropped. **Note:** Tasks sometimes run past their task duration, for example, in cases of supervisor failover. Setting `earlyMessageRejectionPeriod` too low may cause messages to be dropped unexpectedly whenever a task runs past its originally configured task duration.|No|| |`recordsPerFetch`|Integer|The number of records to request per call to fetch records from Kinesis.|No| See [Determine fetch settings](#determine-fetch-settings) for defaults.| |`fetchDelayMillis`|Integer|Time in milliseconds to wait between subsequent calls to fetch records from Kinesis. See [Determine fetch settings](#determine-fetch-settings).|No|0| |`awsAssumedRoleArn`|String|The AWS assumed role to use for additional permissions.|No|| |`awsExternalId`|String|The AWS external ID to use for additional permissions.|No|| |`deaggregate`|Boolean|Whether to use the deaggregate function of the Kinesis Client Library (KCL).|No|| -|`autoScalerConfig`|Object|Defines autoscaling behavior for ingestion tasks. 
See [Task autoscaler](../ingestion/supervisor.md#task-autoscaler) for more information.|No|null| - -#### Task autoscaler - -You can optionally configure autoscaling behavior for ingestion tasks using the `autoScalerConfig` property of the `ioConfig` object. -The following table outlines the autoscaler configuration options: +For configuration properties shared across all streaming ingestion methods supported by Druid, refer to [Supervisor I/O configuration](supervisor.md#io-configuration). -|Property|Description|Required|Default| -|--------|-----------|--------|-------| -|`enableTaskAutoScaler`|Enables the auto scaler. If not specified, Druid disables the auto scaler even when `autoScalerConfig` is not null.|No|`false`| -|`taskCountMax`|Maximum number of ingestion tasks. Must be greater than or equal to `taskCountMin`. If greater than `{numKinesisShards}`, Druid sets the maximum number of reading tasks to `{numKinesisShards}` and ignores `taskCountMax`.|Yes|| -|`taskCountMin`|Minimum number of ingestion tasks. When you enable the autoscaler, Druid ignores the value of `taskCount` in `ioConfig` and starts with the `taskCountMin` number of tasks to launch.|Yes|| -|`minTriggerScaleActionFrequencyMillis`|Minimum time interval between two scale actions.| No|600000| -|`autoScalerStrategy`|The algorithm of `autoScaler`. Druid only supports the `lagBased` strategy. See [Autoscaler strategy](#autoscaler-strategy) for more information.|No|`lagBased`| +#### Data format -##### Autoscaler strategy +The Kinesis indexing service supports both [`inputFormat`](data-formats.md#input-format) and [`parser`](data-formats.md#parser) to specify the data format. Use the `inputFormat` to specify the data format for the Kinesis indexing service unless you need a format only supported by the legacy `parser`. For more information, see [Source input formats](data-formats.md). -:::info -Unlike the Kafka indexing service, Kinesis indexing service reports lag metrics measured in time difference in milliseconds between the current sequence number and latest sequence number, rather than message count. -::: +The Kinesis indexing service supports the following values for `inputFormat`: +* `csv` +* `delimited` +* `json` +* `avro_stream` +* `avro_ocf` +* `protobuf` -The following table outlines the configuration options for `autoScalerStrategy`: - -|Property|Description|Required|Default| -|--------|-----------|--------|-------| -|`lagCollectionIntervalMillis`|The time period during which Druid collects lag metric points.|No|30000| -|`lagCollectionRangeMillis`|The total time window of lag collection. Use with `lagCollectionIntervalMillis` to specify the intervals at which to collect lag metric points.|No|600000| -|`scaleOutThreshold`|The threshold of scale out action. 
|No|6000000| -|`triggerScaleOutFractionThreshold`|Enables scale out action if `triggerScaleOutFractionThreshold` percent of lag points is higher than `scaleOutThreshold`.|No|0.3| -|`scaleInThreshold`|The threshold of scale in action.|No|1000000| -|`triggerScaleInFractionThreshold`|Enables scale in action if `triggerScaleInFractionThreshold` percent of lag points is lower than `scaleOutThreshold`.|No|0.9| -|`scaleActionStartDelayMillis`|The number of milliseconds to delay after the supervisor starts before the first scale logic check.|No|300000| -|`scaleActionPeriodMillis`|The frequency in milliseconds to check if a scale action is triggered.|No|60000| -|`scaleInStep`|The number of tasks to reduce at once when scaling down.|No|1| -|`scaleOutStep`|The number of tasks to add at once when scaling out.|No|2| +You can use `parser` to read [`thrift`](../development/extensions-contrib/thrift.md) formats. ### Tuning configuration -The `tuningConfig` object is optional. If you don't specify the `tuningConfig` object, Druid uses the default configuration settings. - -The following table outlines the configuration options for `tuningConfig`: +The following table outlines the `tuningConfig` configuration properties specific to Kinesis: |Property|Type|Description|Required|Default| |--------|----|-----------|--------|-------| -|`type`|String|The indexing task type; must be `kinesis`.|Yes|| -|`maxRowsInMemory`|Integer|The number of rows to aggregate before persisting. This number represents the post-aggregation rows. It is not equivalent to the number of input events, but the resulting number of aggregated rows. Druid uses `maxRowsInMemory` to manage the required JVM heap size. The maximum heap memory usage for indexing scales is `maxRowsInMemory * (2 + maxPendingPersists)`.|No|100000| -|`maxBytesInMemory`|Long| The number of bytes to aggregate in heap memory before persisting. This is based on a rough estimate of memory usage and not actual usage. Normally, this is computed internally. The maximum heap memory usage for indexing is `maxBytesInMemory * (2 + maxPendingPersists)`.|No|One-sixth of max JVM memory| -|`skipBytesInMemoryOverheadCheck`|Boolean|The calculation of `maxBytesInMemory` takes into account overhead objects created during ingestion and each intermediate persist. To exclude the bytes of these overhead objects from the `maxBytesInMemory` check, set `skipBytesInMemoryOverheadCheck` to `true`.|No|`false`| -|`maxRowsPerSegment`|Integer|The number of rows to aggregate into a segment; this number represents the post-aggregation rows. Handoff occurs when `maxRowsPerSegment` or `maxTotalRows` is reached or every `intermediateHandoffPeriod`, whichever happens first.|No|5000000| -|`maxTotalRows`|Long|The number of rows to aggregate across all segments; this number represents the post-aggregation rows. Handoff occurs when `maxRowsPerSegment` or `maxTotalRows` is reached or every `intermediateHandoffPeriod`, whichever happens first.|No|unlimited| -|`intermediateHandoffPeriod`|ISO 8601 period|The period that determines how often tasks hand off segments. Handoff occurs if `maxRowsPerSegment` or `maxTotalRows` is reached or every `intermediateHandoffPeriod`, whichever happens first.|No|P2147483647D| -|`intermediatePersistPeriod`|ISO 8601 period|The period that determines the rate at which intermediate persists occur.|No|PT10M| -|`maxPendingPersists`|Integer|Maximum number of persists that can be pending but not started. 
If a new intermediate persist exceeds this limit, Druid blocks ingestion until the currently running persist finishes. One persist can be running concurrently with ingestion, and none can be queued up. The maximum heap memory usage for indexing scales is `maxRowsInMemory * (2 + maxPendingPersists)`.|No|0| -|`indexSpec`|Object|Defines how Druid indexes the data. See [IndexSpec](#indexspec) for more information.|No|| -|`indexSpecForIntermediatePersists`|Object|Defines segment storage format options to use at indexing time for intermediate persisted temporary segments. You can use `indexSpecForIntermediatePersists` to disable dimension/metric compression on intermediate segments to reduce memory required for final merging. However, disabling compression on intermediate segments might increase page cache use while they are used before getting merged into final segment published. See [IndexSpec](#indexspec) for possible values.|No|Same as `indexSpec`| -|`reportParseExceptions`|Boolean|If `true`, Druid throws exceptions encountered during parsing causing ingestion to halt. If `false`, Druid skips unparseable rows and fields.|No|`false`| -|`handoffConditionTimeout`|Long|Number of milliseconds to wait for segment handoff. Set to a value >= 0, where 0 means to wait indefinitely.|No|0| -|`resetOffsetAutomatically`|Boolean|Controls behavior when Druid needs to read Kinesis messages that are no longer available.
If `false`, the exception bubbles up causing tasks to fail and ingestion to halt. If this occurs, manual intervention is required to correct the situation, potentially using the [Reset Supervisor API](../api-reference/supervisor-api.md). This mode is useful for production, since it highlights issues with ingestion.
If `true`, Druid automatically resets to the earliest or latest sequence number available in Kinesis, based on the value of the `useEarliestSequenceNumber` property (earliest if `true`, latest if `false`). Note that this can lead to dropping data (if `useEarliestSequenceNumber` is `false`) or duplicating data (if `useEarliestSequenceNumber` is `true`) without your knowledge. Druid logs messages indicating that a reset has occurred without interrupting ingestion. This mode is useful for non-production situations since it enables Druid to recover from problems automatically, even if they lead to quiet dropping or duplicating of data.|No|`false`| |`skipSequenceNumberAvailabilityCheck`|Boolean|Whether to enable checking if the current sequence number is still available in a particular Kinesis shard. If `false`, the indexing task attempts to reset the current sequence number, depending on the value of `resetOffsetAutomatically`.|No|`false`| -|`workerThreads`|Integer|The number of threads that the supervisor uses to handle requests/responses for worker tasks, along with any other internal asynchronous operation.|No| `min(10, taskCount)`| -|`chatRetries`|Integer|The number of times Druid retries HTTP requests to indexing tasks before considering tasks unresponsive.|No|8| -|`httpTimeout`|ISO 8601 period|The period of time to wait for a HTTP response from an indexing task.|No|PT10S| -|`shutdownTimeout`|ISO 8601 period|The period of time to wait for the supervisor to attempt a graceful shutdown of tasks before exiting.|No|PT80S| |`recordBufferSize`|Integer|The size of the buffer (number of events) Druid uses between the Kinesis fetch threads and the main ingestion thread.|No|See [Determine fetch settings](#determine-fetch-settings) for defaults.| |`recordBufferOfferTimeout`|Integer|The number of milliseconds to wait for space to become available in the buffer before timing out.|No|5000| |`recordBufferFullWait`|Integer|The number of milliseconds to wait for the buffer to drain before Druid attempts to fetch records from Kinesis again.|No|5000| |`fetchThreads`|Integer|The size of the pool of threads fetching data from Kinesis. There is no benefit in having more threads than Kinesis shards.|No| `procs * 2`, where `procs` is the number of processors available to the task.| -|`segmentWriteOutMediumFactory`|Object|The segment write-out medium to use when creating segments. See [Additional Peon configuration: SegmentWriteOutMediumFactory](../configuration/index.md#segmentwriteoutmediumfactory) for explanation and available options.|No|If not specified, Druid uses the value from `druid.peon.defaultSegmentWriteOutMediumFactory.type`.| -|`logParseExceptions`|Boolean|If `true`, Druid logs an error message when a parsing exception occurs, containing information about the row where the error occurred.|No|`false`| -|`maxParseExceptions`|Integer|The maximum number of parse exceptions that can occur before the task halts ingestion and fails. Overridden if `reportParseExceptions` is set.|No|unlimited| -|`maxSavedParseExceptions`|Integer|When a parse exception occurs, Druid keeps track of the most recent parse exceptions. `maxSavedParseExceptions` limits the number of saved exception instances. These saved exceptions are available after the task finishes in the [task completion report](../ingestion/tasks.md#task-reports). Overridden if `reportParseExceptions` is set.|No|0| |`maxRecordsPerPoll`|Integer|The maximum number of records to be fetched from buffer per poll. 
The actual maximum will be `Max(maxRecordsPerPoll, Max(bufferSize, 1))`.|No| See [Determine fetch settings](#determine-fetch-settings) for defaults.| -|`repartitionTransitionDuration`|ISO 8601 period|When shards are split or merged, the supervisor recomputes shard to task group mappings. The supervisor also signals any running tasks created under the old mappings to stop early at current time + `repartitionTransitionDuration`. Stopping the tasks early allows Druid to begin reading from the new shards more quickly. The repartition transition wait time controlled by this property gives the stream additional time to write records to the new shards after the split or merge, which helps avoid issues with [empty shard handling](https://github.com/apache/druid/issues/7600).|No|PT2M| -|`offsetFetchPeriod`|ISO 8601 period|Determines how often the supervisor queries Kinesis and the indexing tasks to fetch current offsets and calculate lag. If the user-specified value is below the minimum value of PT5S, the supervisor ignores the value and uses the minimum value instead.|No|PT30S| +|`repartitionTransitionDuration`|ISO 8601 period|When shards are split or merged, the supervisor recomputes shard to task group mappings. The supervisor also signals any running tasks created under the old mappings to stop early at current time + `repartitionTransitionDuration`. Stopping the tasks early allows Druid to begin reading from the new shards more quickly. The repartition transition wait time controlled by this property gives the stream additional time to write records to the new shards after the split or merge, which helps avoid issues with [empty shard handling](https://github.com/apache/druid/issues/7600).|No|`PT2M`| |`useListShards`|Boolean|Indicates if `listShards` API of AWS Kinesis SDK can be used to prevent `LimitExceededException` during ingestion. You must set the necessary `IAM` permissions.|No|`false`| -#### IndexSpec - -The following table outlines the configuration options for `indexSpec`: - -|Property|Type|Description|Required|Default| -|--------|----|-----------|--------|-------| -|`bitmap`|Object|Compression format for bitmap indexes. Druid supports roaring and concise bitmap types.|No|Roaring| -|`dimensionCompression`|String|Compression format for dimension columns. One of `LZ4`, `LZF`, or `uncompressed`.|No|`LZ4`| -|`metricCompression`|String|Compression format for primitive type metric columns. One of `LZ4`, `LZF`, `uncompressed`, or `none`.|No|`LZ4`| -|`longEncoding`|String|Encoding format for metric and dimension columns with type long. One of `auto` or `longs`. `auto` encodes the values using sequence number or lookup table depending on column cardinality and stores them with variable sizes. `longs` stores the value as is with 8 bytes each.|No|`longs`| +For configuration properties shared across all streaming ingestion methods supported by Druid, refer to [Supervisor tuning configuration](supervisor.md#tuning-configuration). ## AWS authentication @@ -244,19 +173,16 @@ Druid uses AWS access and secret keys to authenticate Kinesis API requests. Ther 1. 
Using roles or short-term credentials: - Druid looks for credentials set in [environment variables](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-envvars.html), -via [Web Identity Token](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_providers_oidc.html), in the -default [profile configuration file](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-files.html), and from the -EC2 instance profile provider (in this order). + Druid looks for credentials set in [environment variables](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-envvars.html), via [Web Identity Token](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_providers_oidc.html), in the default [profile configuration file](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-files.html), and from the EC2 instance profile provider (in this order). 2. Using long-term security credentials: - You can directly provide your AWS access key and AWS secret key in the `common.runtime.properties` file as shown in the example below: + You can directly provide your AWS access key and AWS secret key in the `common.runtime.properties` file as shown in the example below: -```properties -druid.kinesis.accessKey=AKIAWxxxxxxxxxx4NCKS -druid.kinesis.secretKey=Jbytxxxxxxxxxxx2+555 -``` + ```properties + druid.kinesis.accessKey=AKIAWxxxxxxxxxx4NCKS + druid.kinesis.secretKey=Jbytxxxxxxxxxxx2+555 + ``` :::info AWS does not recommend providing long-term security credentials in configuration files since it might pose a security risk. @@ -402,7 +328,7 @@ This window with early task shutdowns and possible task failures concludes when: Note that when the supervisor is running and detects new partitions, tasks read new partitions from the earliest offsets, irrespective of the `useEarliestSequence` setting. This is because these new shards were immediately discovered and are therefore unlikely to experience a lag. -If resharding occurs when the supervisor is suspended and `useEarliestSequence` is set to `false`, resuming the supervisor causes tasks to read the new shards from the latest sequence. This is by design so that the consumer can catch up quickly with any lag accumulated while the supervisor was suspended. +If resharding occurs when the supervisor is suspended and `useEarliestSequence` is set to `false`, resuming the supervisor causes tasks to read the new shards from the latest sequence. This is by design so that the consumer can catch up quickly with any lag accumulated while the supervisor was suspended. ## Known issues diff --git a/docs/ingestion/supervisor.md b/docs/ingestion/supervisor.md index 31a332148239..bd5963f461f7 100644 --- a/docs/ingestion/supervisor.md +++ b/docs/ingestion/supervisor.md @@ -28,14 +28,206 @@ Supervisors oversee the state of indexing tasks to coordinate handoffs, manage f ## Supervisor spec -You use a JSON specification, often referred to as the supervisor spec, to define streaming ingestion tasks. +Druid uses a JSON specification, often referred to as the supervisor spec, to define streaming ingestion tasks. The supervisor spec specifies how Druid should consume, process, and index streaming data. +The following table outlines the high-level configuration options for a supervisor spec: + +|Property|Type|Description|Required| +|--------|----|-----------|--------| +|`type`|String|The supervisor type. 
One of `kafka` or `kinesis`.|Yes|
+|`spec`|Object|The container object for the supervisor configuration.|Yes|
+|`spec.dataSchema`|Object|The schema for the indexing task to use during ingestion. See [`dataSchema`](../ingestion/ingestion-spec.md#dataschema) for more information.|Yes|
+|`spec.ioConfig`|Object|The I/O configuration object to define the connection and I/O-related settings for the supervisor and indexing tasks.|Yes|
+|`spec.tuningConfig`|Object|The tuning configuration object to define performance-related settings for the supervisor and indexing tasks.|No|
+
+### I/O configuration
+
+The following table outlines the `ioConfig` configuration properties that apply to both Apache Kafka and Amazon Kinesis ingestion methods:
+
+|Property|Type|Description|Required|Default|
+|--------|----|-----------|--------|-------|
+|`inputFormat`|Object|The [input format](../ingestion/data-formats.md#input-format) to define input data parsing.|Yes||
+|`autoScalerConfig`|Object|Defines autoscaling behavior for ingestion tasks. See [Task autoscaler](#task-autoscaler) for more information.|No|null|
+|`taskCount`|Integer|The maximum number of reading tasks in a replica set. Multiply `taskCount` by `replicas` to determine the maximum number of reading tasks. The total number of tasks, reading and publishing, is higher than the maximum number of reading tasks. See [Capacity planning](../ingestion/supervisor.md#capacity-planning) for more details. When `taskCount` is greater than the number of Kafka partitions or Kinesis shards, the actual number of reading tasks is less than the `taskCount` value.|No|1|
+|`replicas`|Integer|The number of replica sets, where 1 is a single set of tasks (no replication). Druid always assigns replica tasks to different workers to provide resiliency against process failure.|No|1|
+|`taskDuration`|ISO 8601 period|The length of time before tasks stop reading and begin publishing segments.|No|`PT1H`|
+|`startDelay`|ISO 8601 period|The period to wait before the supervisor starts managing tasks.|No|`PT5S`|
+|`period`|ISO 8601 period|Determines how often the supervisor executes its management logic. Note that the supervisor also runs in response to certain events, such as tasks succeeding, failing, and reaching their task duration. The `period` value specifies the maximum time between iterations.|No|`PT30S`|
+|`completionTimeout`|ISO 8601 period|The length of time to wait before declaring a publishing task as failed and terminating it. If the value is too low, tasks may never publish. The publishing clock for a task begins roughly after `taskDuration` elapses.|No|`PT30M`|
+|`lateMessageRejectionStartDateTime`|ISO 8601 date time|Configures tasks to reject messages with timestamps earlier than this date time. For example, if this property is set to `2016-01-01T11:00Z` and the supervisor creates a task at `2016-01-01T12:00Z`, Druid drops messages with timestamps earlier than `2016-01-01T11:00Z`. This can prevent concurrency issues if your data stream has late messages and you have multiple pipelines that need to operate on the same segments, such as a realtime and a nightly batch ingestion pipeline.|No||
+|`lateMessageRejectionPeriod`|ISO 8601 period|Configures tasks to reject messages with timestamps earlier than this period before the task was created. For example, if this property is set to `PT1H` and the supervisor creates a task at `2016-01-01T12:00Z`, Druid drops messages with timestamps earlier than `2016-01-01T11:00Z`. This may help prevent concurrency issues if your data stream has late messages and you have multiple pipelines that need to operate on the same segments, such as a streaming and a nightly batch ingestion pipeline. You can specify only one of the late message rejection properties.|No||
+|`earlyMessageRejectionPeriod`|ISO 8601 period|Configures tasks to reject messages with timestamps later than this period after the task reached its task duration. For example, if this property is set to `PT1H`, the task duration is set to `PT1H` and the supervisor creates a task at `2016-01-01T12:00Z`, Druid drops messages with timestamps later than `2016-01-01T14:00Z`. Tasks sometimes run past their task duration, such as in cases of supervisor failover. Setting `earlyMessageRejectionPeriod` too low may cause Druid to drop messages unexpectedly whenever a task runs past its originally configured task duration.|No||
+
+For configuration properties specific to Apache Kafka, see [Kafka I/O configuration](kafka-ingestion.md#io-configuration).
+For configuration properties specific to Amazon Kinesis, see [Kinesis I/O configuration](kinesis-ingestion.md#io-configuration).
+
+#### Task autoscaler
+
+You can optionally configure autoscaling behavior for ingestion tasks using the `autoScalerConfig` property of the `ioConfig` object.
+
+The following table outlines the configuration properties for `autoScalerConfig`:
+
+|Property|Description|Required|Default|
+|--------|-----------|--------|-------|
+|`enableTaskAutoScaler`|Enables the autoscaler. If not specified, Druid disables the autoscaler even when `autoScalerConfig` is not null.|No|`false`|
+|`taskCountMax`|The maximum number of ingestion tasks. Must be greater than or equal to `taskCountMin`. If `taskCountMax` is greater than the number of Kafka partitions or Kinesis shards, Druid sets the maximum number of reading tasks to the number of Kafka partitions or Kinesis shards and ignores `taskCountMax`.|Yes||
+|`taskCountMin`|The minimum number of ingestion tasks. When you enable the autoscaler, Druid ignores the value of `taskCount` in `ioConfig` and starts with the `taskCountMin` number of tasks.|Yes||
+|`minTriggerScaleActionFrequencyMillis`|The minimum time interval between two scale actions.|No|600000|
+|`autoScalerStrategy`|The autoscaling algorithm to use. Druid only supports the `lagBased` strategy. See [Autoscaler strategy](#autoscaler-strategy) for more information.|No|`lagBased`|
+
+##### Autoscaler strategy
+
+:::info
+Unlike the Kafka indexing service, which reports lag as a message count, the Kinesis indexing service reports lag as the time difference in milliseconds between the current sequence number and the latest sequence number.
+:::
+
+The following table outlines the configuration properties for `autoScalerStrategy`:
+
+|Property|Description|Required|Default|
+|--------|-----------|--------|-------|
+|`lagCollectionIntervalMillis`|The time period during which Druid collects lag metric points.|No|30000|
+|`lagCollectionRangeMillis`|The total time window of lag collection. Use with `lagCollectionIntervalMillis` to specify the intervals at which to collect lag metric points.|No|600000|
+|`scaleOutThreshold`|The lag threshold above which a lag point counts toward triggering a scale out action.|No|6000000|
+|`triggerScaleOutFractionThreshold`|Triggers a scale out action if the fraction of lag points higher than `scaleOutThreshold` exceeds this value.|No|0.3|
+|`scaleInThreshold`|The lag threshold below which a lag point counts toward triggering a scale in action.|No|1000000|
+|`triggerScaleInFractionThreshold`|Triggers a scale in action if the fraction of lag points lower than `scaleInThreshold` exceeds this value.|No|0.9|
+|`scaleActionStartDelayMillis`|The number of milliseconds to delay after the supervisor starts before the first scale logic check.|No|300000|
+|`scaleActionPeriodMillis`|The frequency in milliseconds to check if a scale action is triggered.|No|60000|
+|`scaleInStep`|The number of tasks to reduce at once when scaling down.|No|1|
+|`scaleOutStep`|The number of tasks to add at once when scaling out.|No|2|
+
+The following example shows a supervisor spec with the `lagBased` autoscaler enabled:
+
+<details>
+ Click to view the example + +```json +{ + "type": "kinesis", + "dataSchema": { + "dataSource": "metrics-kinesis", + "timestampSpec": { + "column": "timestamp", + "format": "auto" + }, + "dimensionsSpec": { + "dimensions": [], + "dimensionExclusions": [ + "timestamp", + "value" + ] + }, + "metricsSpec": [ + { + "name": "count", + "type": "count" + }, + { + "name": "value_sum", + "fieldName": "value", + "type": "doubleSum" + }, + { + "name": "value_min", + "fieldName": "value", + "type": "doubleMin" + }, + { + "name": "value_max", + "fieldName": "value", + "type": "doubleMax" + } + ], + "granularitySpec": { + "type": "uniform", + "segmentGranularity": "HOUR", + "queryGranularity": "NONE" + } + }, + "ioConfig": { + "stream": "metrics", + "autoScalerConfig": { + "enableTaskAutoScaler": true, + "taskCountMax": 6, + "taskCountMin": 2, + "minTriggerScaleActionFrequencyMillis": 600000, + "autoScalerStrategy": "lagBased", + "lagCollectionIntervalMillis": 30000, + "lagCollectionRangeMillis": 600000, + "scaleOutThreshold": 600000, + "triggerScaleOutFractionThreshold": 0.3, + "scaleInThreshold": 100000, + "triggerScaleInFractionThreshold": 0.9, + "scaleActionStartDelayMillis": 300000, + "scaleActionPeriodMillis": 60000, + "scaleInStep": 1, + "scaleOutStep": 2 + }, + "inputFormat": { + "type": "json" + }, + "endpoint": "kinesis.us-east-1.amazonaws.com", + "taskCount": 1, + "replicas": 1, + "taskDuration": "PT1H" + }, + "tuningConfig": { + "type": "kinesis", + "maxRowsPerSegment": 5000000 + } +} +``` +
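+
+To make the numbers concrete, here is a rough reading of the example above: with `lagCollectionIntervalMillis` set to 30000 and `lagCollectionRangeMillis` set to 600000, the supervisor evaluates 600000 / 30000 = 20 lag points per window. A scale out triggers when more than 0.3 × 20 = 6 of those points exceed `scaleOutThreshold` (600000), adding `scaleOutStep` (2) tasks up to `taskCountMax` (6). A scale in triggers when more than 0.9 × 20 = 18 points fall below `scaleInThreshold` (100000), removing `scaleInStep` (1) task down to `taskCountMin` (2).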

+
+### Tuning configuration
+
+The `tuningConfig` object is optional. If you don't specify the `tuningConfig` object, Druid uses the default configuration settings.
+
+The following table outlines the `tuningConfig` configuration properties that apply to both Apache Kafka and Amazon Kinesis ingestion methods:
+
+|Property|Type|Description|Required|Default|
+|--------|----|-----------|--------|-------|
+|`type`|String|The tuning type code for the ingestion method. One of `kafka` or `kinesis`.|Yes||
+|`maxRowsInMemory`|Integer|The number of rows to aggregate before persisting. This number represents the post-aggregation rows. It is not equivalent to the number of input events, but the resulting number of aggregated rows. Druid uses `maxRowsInMemory` to manage the required JVM heap size. The maximum heap memory usage for indexing scales with `maxRowsInMemory * (2 + maxPendingPersists)`. You typically don't need to set this property. However, depending on the byte size of your rows, buffering the default number of rows in memory may not be appropriate, and you may want to set this value explicitly.|No|150000|
+|`maxBytesInMemory`|Long|The number of bytes to aggregate in heap memory before persisting. This is based on a rough estimate of memory usage and not actual usage. Normally, this is computed internally. The maximum heap memory usage for indexing is `maxBytesInMemory * (2 + maxPendingPersists)`.|No|One-sixth of max JVM memory|
+|`skipBytesInMemoryOverheadCheck`|Boolean|The calculation of `maxBytesInMemory` takes into account overhead objects created during ingestion and each intermediate persist. To exclude the bytes of these overhead objects from the `maxBytesInMemory` check, set `skipBytesInMemoryOverheadCheck` to `true`.|No|`false`|
+|`maxRowsPerSegment`|Integer|The number of rows to store in a segment. This number is post-aggregation rows. Handoff occurs when `maxRowsPerSegment` or `maxTotalRows` is reached or every `intermediateHandoffPeriod`, whichever happens first.|No|5000000|
+|`maxTotalRows`|Long|The number of rows to aggregate across all segments; this number is post-aggregation rows. Handoff occurs when `maxRowsPerSegment` or `maxTotalRows` is reached or every `intermediateHandoffPeriod`, whichever happens first.|No|20000000 for Kafka. Unlimited for Kinesis.|
+|`intermediateHandoffPeriod`|ISO 8601 period|The period that determines how often tasks hand off segments. Handoff occurs if `maxRowsPerSegment` or `maxTotalRows` is reached or every `intermediateHandoffPeriod`, whichever happens first.|No|`P2147483647D`|
+|`intermediatePersistPeriod`|ISO 8601 period|The period that determines the rate at which intermediate persists occur.|No|`PT10M`|
+|`maxPendingPersists`|Integer|Maximum number of persists that can be pending but not started. If a new intermediate persist exceeds this limit, Druid blocks ingestion until the currently running persist finishes. One persist can be running concurrently with ingestion, and none can be queued up. The maximum heap memory usage for indexing scales with `maxRowsInMemory * (2 + maxPendingPersists)`.|No|0|
+|`indexSpec`|Object|Defines segment storage format options to use at indexing time. See [IndexSpec](../ingestion/ingestion-spec.md#indexspec) for more information.|No||
+|`indexSpecForIntermediatePersists`|Object|Defines segment storage format options to use at indexing time for intermediate persisted temporary segments. You can use `indexSpecForIntermediatePersists` to disable dimension/metric compression on intermediate segments to reduce the memory required for final merging. However, disabling compression on intermediate segments might increase page cache usage while the segments are in use, before they are merged into the published segment.|No||
+|`reportParseExceptions`|Boolean|DEPRECATED. If `true`, Druid throws exceptions encountered during parsing causing ingestion to halt. If `false`, Druid skips unparseable rows and fields. Setting `reportParseExceptions` to `true` overrides existing configurations for `maxParseExceptions` and `maxSavedParseExceptions`, setting `maxParseExceptions` to 0 and limiting `maxSavedParseExceptions` to not more than 1.|No|`false`|
+|`handoffConditionTimeout`|Long|Number of milliseconds to wait for segment handoff. Set to a value >= 0, where 0 means to wait indefinitely.|No|900000 (15 minutes) for Kafka. 0 for Kinesis.|
+|`resetOffsetAutomatically`|Boolean|Resets partitions when the offset or sequence number is unavailable. If set to `true`, Druid resets partitions to the earliest or latest Kafka offset or Kinesis sequence number available, based on the value of `useEarliestOffset` (Kafka) or `useEarliestSequenceNumber` (Kinesis): earliest if `true`, latest if `false`. If set to `false`, the exception bubbles up causing tasks to fail and ingestion to halt. If this occurs, manual intervention is required to correct the situation, potentially through [resetting the supervisor](../api-reference/supervisor-api.md#reset-a-supervisor).|No|`false`|
+|`workerThreads`|Integer|The number of threads that the supervisor uses to handle requests/responses for worker tasks, along with any other internal asynchronous operation.|No|`min(10, taskCount)`|
+|`chatRetries`|Integer|The number of times Druid retries HTTP requests to indexing tasks before considering tasks unresponsive.|No|8|
+|`httpTimeout`|ISO 8601 period|The period of time to wait for an HTTP response from an indexing task.|No|`PT10S`|
+|`shutdownTimeout`|ISO 8601 period|The period of time to wait for the supervisor to attempt a graceful shutdown of tasks before exiting.|No|`PT80S`|
+|`segmentWriteOutMediumFactory`|Object|The segment write-out medium to use when creating segments. See [Additional Peon configuration: SegmentWriteOutMediumFactory](../configuration/index.md#segmentwriteoutmediumfactory) for explanation and available options.|No|If not specified, Druid uses the value from `druid.peon.defaultSegmentWriteOutMediumFactory.type`.|
+|`logParseExceptions`|Boolean|If `true`, Druid logs an error message when a parsing exception occurs, containing information about the row where the error occurred.|No|`false`|
+|`maxParseExceptions`|Integer|The maximum number of parse exceptions that can occur before the task halts ingestion and fails. Overridden if `reportParseExceptions` is set.|No|unlimited|
+|`maxSavedParseExceptions`|Integer|When a parse exception occurs, Druid keeps track of the most recent parse exceptions. `maxSavedParseExceptions` limits the number of saved exception instances. These saved exceptions are available after the task finishes in the [task completion report](../ingestion/tasks.md#task-reports). Overridden if `reportParseExceptions` is set.|No|0|
+|`offsetFetchPeriod`|ISO 8601 period|Determines how often the supervisor queries the streaming source and the indexing tasks to fetch current offsets and calculate lag. If the user-specified value is below the minimum value of `PT5S`, the supervisor ignores the value and uses the minimum value instead.|No|`PT30S`|
+
+For configuration properties specific to Apache Kafka, see [Kafka tuning configuration](kafka-ingestion.md#tuning-configuration).
+For configuration properties specific to Amazon Kinesis, see [Kinesis tuning configuration](kinesis-ingestion.md#tuning-configuration).
+
+## Start a supervisor
+
 Druid starts a new supervisor for a datasource when you create a supervisor spec.
-You can create and manage supervisor specs using the data loader in the Druid web console or by calling the [Supervisor API](../api-reference/supervisor-api.md).
-Once started, the supervisor persists in the configured metadata database. There can only be one supervisor per datasource, and submitting a second supervisor spec for the same datasource overwrites the previous one.
+You can create a supervisor spec using the [Load Data](../operations/web-console.md#data-loader) UI in the Druid web console or by calling the [Supervisor API](../api-reference/supervisor-api.md).
+
+The following screenshot shows the [Supervisors](../operations/web-console.md#supervisors) view of the Druid web console for a cluster with two supervisors:
+
+![Supervisors view](../assets/supervisor-view.png)
 
-When an Overlord gains leadership, either by being started or as a result of another Overlord failing, it spawns a supervisor for each supervisor spec in the metadata database. The supervisor then discovers running indexing tasks and attempts to adopt them if they are compatible with the supervisor's configuration. If they are not compatible, the tasks are terminated and the supervisor creates a new set of tasks. This way, the supervised tasks persist across Overlord restarts and failovers.
+Once started, the supervisor persists in the configured metadata database. There can only be one supervisor per datasource. Submitting a second supervisor spec for the same datasource overwrites the previous one.
+
+When an Overlord gains leadership, either by being started or as a result of another Overlord failing, it spawns a supervisor for each supervisor spec in the metadata database. The supervisor then discovers running indexing tasks and attempts to adopt them if they are compatible with the supervisor's configuration. If they are not compatible, the tasks are terminated and the supervisor creates a new set of tasks. This way, the supervisor ingestion tasks persist across Overlord restarts and failovers.
 
 ### Schema and configuration changes
 
@@ -44,21 +236,72 @@ This way, configuration changes can be applied without requiring any pause in in
 
 ## Status report
 
-The supervisor status report contains the state of the supervisor tasks and an array of recently thrown exceptions reported as `recentErrors`.
-To retrieve the current status report for a single supervisor, send a `GET` request to the `/druid/indexer/v1/supervisor/:supervisorId/status` endpoint.
+The supervisor status report contains the state of the supervisor tasks and an array of recently thrown exceptions reported as `recentErrors`. You can control the maximum number of exception events stored using the `druid.supervisor.maxStoredExceptionEvents` configuration property.
 
-The two properties related to the supervisor's state are `state` and `detailedState`. 
The `state` property contains a small number of generic states that apply to any type of supervisor, while the `detailedState` property contains a more descriptive, implementation-specific state that may provide more insight into the supervisor's activities. +To view the supervisor status in the web console, navigate to the **Supervisors** view and click the supervisor ID to open the **Supervisor** dialog. +Click **Status** in the left navigation pane to display the status: + +![Supervisors info dialog](../assets/supervisor-info-dialog.png) -Possible state values are `PENDING`, `RUNNING`, `SUSPENDED`, `STOPPING`, `UNHEALTHY_SUPERVISOR`, and `UNHEALTHY_TASKS`. +The following example shows the status of a supervisor with the name `social_media`: + +
+ Click to view the example + +```json +{ + "dataSource": "social_media", + "stream": "social_media", + "partitions": 1, + "replicas": 1, + "durationSeconds": 3600, + "activeTasks": [ + { + "id": "index_kafka_social_media_8ff3096f21fe448_jajnddno", + "startingOffsets": { + "0": 0 + }, + "startTime": "2024-01-30T21:21:41.696Z", + "remainingSeconds": 479, + "type": "ACTIVE", + "currentOffsets": { + "0": 50000 + }, + "lag": { + "0": 0 + } + } + ], + "publishingTasks": [], + "latestOffsets": { + "0": 50000 + }, + "minimumLag": { + "0": 0 + }, + "aggregateLag": 0, + "offsetsLastUpdated": "2024-01-30T22:13:19.335Z", + "suspended": false, + "healthy": true, + "state": "RUNNING", + "detailedState": "RUNNING", + "recentErrors": [] +} +``` +
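+
+As a quick sanity check on these numbers: per-partition lag is the difference between `latestOffsets` and `currentOffsets` (50000 - 50000 = 0 for partition `0`), and `aggregateLag` is the sum of that lag across partitions. A minimal sketch for pulling the aggregate value out of the report, assuming you saved the JSON above to a file named `status.json` and have the `jq` utility available:
+
+```shell
+jq '.aggregateLag' status.json
+```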
+ +The status report contains two properties that correspond to the state of the supervisor: `state` and `detailedState`. The `state` property contains a small number of generic states that apply to any type of supervisor. The `detailedState` property contains a more descriptive, implementation-specific state that may provide more insight into the supervisor's activities. + +Possible `state` values are `PENDING`, `RUNNING`, `SUSPENDED`, `STOPPING`, `UNHEALTHY_SUPERVISOR`, and `UNHEALTHY_TASKS`. The following table lists `detailedState` values and their corresponding `state` mapping: -|Detailed state|Corresponding state|Description| +|`detailedState`|`state`|Description| |--------------|-------------------|-----------| |`UNHEALTHY_SUPERVISOR`|`UNHEALTHY_SUPERVISOR`|The supervisor encountered errors on previous `druid.supervisor.unhealthinessThreshold` iterations.| |`UNHEALTHY_TASKS`|`UNHEALTHY_TASKS`|The last `druid.supervisor.taskUnhealthinessThreshold` tasks all failed.| -|`UNABLE_TO_CONNECT_TO_STREAM`|`UNHEALTHY_SUPERVISOR`|The supervisor is encountering connectivity issues with the stream and has not successfully connected in the past.| +|`UNABLE_TO_CONNECT_TO_STREAM`|`UNHEALTHY_SUPERVISOR`|The supervisor is encountering connectivity issues with the stream and hasn't successfully connected in the past.| |`LOST_CONTACT_WITH_STREAM`|`UNHEALTHY_SUPERVISOR`|The supervisor is encountering connectivity issues with the stream but has successfully connected in the past.| |`PENDING` (first iteration only)|`PENDING`|The supervisor has been initialized but hasn't started connecting to the stream.| |`CONNECTING_TO_STREAM` (first iteration only)|`RUNNING`|The supervisor is trying to connect to the stream and update partition data.| @@ -87,9 +330,20 @@ that is, once it has completed a full execution without encountering any issues, state until it is stopped, suspended, or hits a failure threshold and transitions to an unhealthy state. :::info -For Kafka indexing service, the consumer lag per partition may be reported as negative values if the supervisor hasn't received the latest offset response from Kafka. The aggregate lag value will always be >= 0. +For the Kafka indexing service, the consumer lag per partition may be reported as negative values if the supervisor hasn't received the latest offset response from Kafka. The aggregate lag value will always be >= 0. ::: +## SUPERVISORS system table + +Druid exposes system information through special system schemas. You can query the `sys.supervisors` table to retrieve information about the supervisor internals. +The following example shows how to retrieve supervisor tasks information filtered by health status: + +```sql +SELECT * FROM sys.supervisors WHERE healthy=0; +``` + +For more information on the supervisors system table, see [SUPERVISORS table](../querying/sql-metadata-tables.md#supervisors-table). + ## Capacity planning Indexing tasks run on MiddleManagers and are limited by the resources available in the MiddleManager cluster. 
In particular, you should make sure that you have sufficient worker capacity, configured using the From 76784c7b9b44ac5ad27d8cb355bcde0ae27c354e Mon Sep 17 00:00:00 2001 From: Katya Macedo Date: Mon, 5 Feb 2024 08:58:58 -0600 Subject: [PATCH 07/15] Saving --- docs/ingestion/supervisor.md | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/docs/ingestion/supervisor.md b/docs/ingestion/supervisor.md index bd5963f461f7..054ebcaa5036 100644 --- a/docs/ingestion/supervisor.md +++ b/docs/ingestion/supervisor.md @@ -43,7 +43,8 @@ The following table outlines the high-level configuration options for a supervis ### I/O configuration -The following table outlines the `ioConfig` configuration properties that apply to both Apache Kafka and Amazon Kinesis ingestion methods: +The following table outlines the `ioConfig` configuration properties that apply to both Apache Kafka and Amazon Kinesis ingestion methods. +For configuration properties specific to Apache Kafka or Amazon Kinesis, see [Kafka I/O configuration](kafka-ingestion.md#io-configuration) and [Kinesis I/O configuration](kinesis-ingestion.md#io-configuration) respectively. |Property|Type|Description|Required|Default| |--------|----|-----------|--------|-------| @@ -59,9 +60,6 @@ The following table outlines the `ioConfig` configuration properties that apply |`lateMessageRejectionPeriod`|ISO 8601 period|Configures tasks to reject messages with timestamps earlier than this period before the task was created. For example, if this property is set to `PT1H` and the supervisor creates a task at `2016-01-01T12:00Z`, Druid drops messages with timestamps earlier than `2016-01-01T11:00Z`. This may help prevent concurrency issues if your data stream has late messages and you have multiple pipelines that need to operate on the same segments, such as a streaming and a nightly batch ingestion pipeline. You can specify only one of the late message rejection properties.|No|| |`earlyMessageRejectionPeriod`|ISO 8601 period|Configures tasks to reject messages with timestamps later than this period after the task reached its task duration. For example, if this property is set to `PT1H`, the task duration is set to `PT1H` and the supervisor creates a task at `2016-01-01T12:00Z`, Druid drops messages with timestamps later than `2016-01-01T14:00Z`. Tasks sometimes run past their task duration, such as in cases of supervisor failover. Setting `earlyMessageRejectionPeriod` too low may cause Druid to drop messages unexpectedly whenever a task runs past its originally configured task duration.|No|| -For configuration properties specific to Apache Kafka, see [Kafka I/O configuration](kafka-ingestion.md#io-configuration). -For configuration properties specific to Amazon Kinesis, see [Kinesis I/O configuration](kinesis-ingestion.md#io-configuration). - #### Task autoscaler You can optionally configure autoscaling behavior for ingestion tasks using the `autoScalerConfig` property of the `ioConfig` object. 
From 95d982597fd2fd26a621b7ead844dc9e9ff94088 Mon Sep 17 00:00:00 2001 From: Katya Macedo Date: Mon, 5 Feb 2024 14:13:28 -0600 Subject: [PATCH 08/15] Add input format text --- docs/ingestion/kafka-ingestion.md | 235 ++++++++++++++++++++++++---- docs/ingestion/kinesis-ingestion.md | 15 +- docs/ingestion/supervisor.md | 19 +-- 3 files changed, 216 insertions(+), 53 deletions(-) diff --git a/docs/ingestion/kafka-ingestion.md b/docs/ingestion/kafka-ingestion.md index d79296a55b6a..baa2810f2c1f 100644 --- a/docs/ingestion/kafka-ingestion.md +++ b/docs/ingestion/kafka-ingestion.md @@ -25,13 +25,12 @@ description: "Overview of the Kafka indexing service for Druid. Includes example --> :::info -To use the Kafka indexing service, you must be on Apache Kafka version of 0.11.x or higher. -If you are using an older version, refer to the [Kafka upgrade guide](https://kafka.apache.org/documentation/#upgrade). +To use the Kafka indexing service, you must be on Apache Kafka version 0.11.x or higher. +If you are using an older version, refer to the [Apache Kafka upgrade guide](https://kafka.apache.org/documentation/#upgrade). ::: When you enable the Kafka indexing service, you can configure supervisors on the Overlord to manage the creation and lifetime of Kafka indexing tasks. - -Kafka indexing tasks read events using Kafka's own partition and offset mechanism to guarantee exactly-once ingestion. The supervisor oversees the state of the indexing tasks to coordinate handoffs, manage failures, and ensure that scalability and replication requirements are maintained. +Kafka indexing tasks read events using Kafka partition and offset mechanism to guarantee exactly-once ingestion. The supervisor oversees the state of the indexing tasks to coordinate handoffs, manage failures, and ensure that scalability and replication requirements are maintained. This topic contains configuration information for the Kafka indexing service supervisor for Apache Druid. @@ -39,19 +38,6 @@ This topic contains configuration information for the Kafka indexing service sup To use the Kafka indexing service, you must first load the `druid-kafka-indexing-service` extension on both the Overlord and the MiddleManager. See [Loading extensions](../configuration/extensions.md) for more information. -## Deployment notes on Kafka partitions and Druid segments - -Druid assigns Kafka partitions to each Kafka indexing task. A task writes the events it consumes from Kafka into a single segment for the segment granularity interval until it reaches one of the following limits: `maxRowsPerSegment`, `maxTotalRows`, or `intermediateHandoffPeriod`. At this point, the task creates a new partition for this segment granularity to contain subsequent events. - -The Kafka indexing task also does incremental hand-offs. Therefore, segments become available as they are ready and you don't have to wait for all segments until the end of the task duration. When the task reaches one of `maxRowsPerSegment`, `maxTotalRows`, or `intermediateHandoffPeriod`, it hands off all the segments and creates a new set of segments for further events. This allows the task to run for longer durations without accumulating old segments locally on MiddleManager services. - -The Kafka indexing service may still produce some small segments. For example, consider the following scenario: -- Task duration is 4 hours. -- Segment granularity is set to an HOUR. -- The supervisor was started at 9:10. -After 4 hours at 13:10, Druid starts a new set of tasks. 
The events for the interval 13:00 - 14:00 may be split across existing tasks and the new set of tasks which could result in small segments. To merge them together into new segments of an ideal size (in the range of ~500-700 MB per segment), you can schedule re-indexing tasks, optionally with a different segment granularity. -For information on how to optimize the segment size, see [Segment size optimization](../operations/segment-optimization.md). - ## Supervisor spec configuration This section outlines the configuration properties that are specific to the Apache Kafka streaming ingestion method. For configuration properties shared across all streaming ingestion methods supported by Druid, see [Supervisor spec](supervisor.md#supervisor-spec). @@ -129,20 +115,18 @@ The following example shows a supervisor spec for the Kafka indexing service: ### I/O configuration -The following table outlines the Kafka-specific configuration properties for `ioConfig`: +The following table outlines the `ioConfig` configuration properties specific to Kafka. +For configuration properties shared across all streaming ingestion methods, refer to [Supervisor I/O configuration](supervisor.md#io-configuration). |Property|Type|Description|Required|Default| |--------|----|-----------|--------|-------| -|`topic`|String|Single Kafka topic to read from. To ingest data from multiple topic, use `topicPattern`. |Yes if `topicPattern` isn't set.|| -|`topicPattern`|Multiple Kafka topics to read from, passed as a regex pattern. See [Ingest from multiple topics](#ingest-from-multiple-topics) for more information.|Yes if `topic` isn't set.|| +|`topic`|String|The Kafka topic to read from. To ingest data from multiple topic, use `topicPattern`. |Yes if `topicPattern` isn't set.|| +|`topicPattern`|String|Multiple Kafka topics to read from, passed as a regex pattern. See [Ingest from multiple topics](#ingest-from-multiple-topics) for more information.|Yes if `topic` isn't set.|| |`consumerProperties`|String, Object|A map of properties to pass to the Kafka consumer. See [Consumer properties](#consumer-properties) for details.|Yes|| |`pollTimeout`|Long|The length of time to wait for the Kafka consumer to poll records, in milliseconds.|No|100| |`useEarliestOffset`|Boolean|If a supervisor manages a datasource for the first time, it obtains a set of starting offsets from Kafka. This flag determines whether it retrieves the earliest or latest offsets in Kafka. Under normal circumstances, subsequent tasks start from where the previous segments ended. Druid only uses `useEarliestOffset` on the first run.|No|`false`| |`idleConfig`|Object|Defines how and when the Kafka supervisor can become idle. See [Idle configuration](#idle-configuration) for more details.|No|null| -For configuration properties shared across all streaming ingestion methods supported by Druid, refer to [Supervisor I/O configuration](supervisor.md#io-configuration). - - #### Ingest from multiple topics :::info @@ -155,22 +139,18 @@ When ingesting data from multiple topics, Druid assigns partitions based on the To ingest data from multiple topics, use the `topicPattern` property instead of `topic`. You pass multiple topics as a regex pattern. For example, to ingest data from clicks and impressions, set `topicPattern` to `clicks|impressions`. -Similarly, you can use `metrics-.*` as the value for `topicPattern` if you want to ingest from all the topics that start with `metrics-`. 
If you add a new topic that matches the regex to the cluster, Druid automatically starts ingesting from those new topics. Topic names that match partially, such as `my-metrics-12`, are not included for ingestion. +Similarly, you can use `metrics-.*` as the value for `topicPattern` if you want to ingest from all the topics that start with `metrics-`. If you add a new topic that matches the regex to the cluster, Druid automatically starts ingesting from the new topic. Topic names that match partially, such as `my-metrics-12`, are not included for ingestion. #### Consumer properties Consumer properties must contain a property `bootstrap.servers` with a list of Kafka brokers in the form: `:,:,...`. -By default, `isolation.level` is set to `read_committed`. +By default, `isolation.level` is set to `read_committed`. -If you use older versions of Kafka servers without transactions support or don't want Druid to consume only committed transactions, set `isolation.level` to `read_uncommitted`. - -Additionally, you can set `isolation.level` to `read_uncommitted` in `consumerProperties` if either: -- You don't need Druid to consume transactional topics. -- You need Druid to consume older versions of Kafka. Make sure offsets are sequential, since there is no offset gap check in Druid. +If you use older versions of Kafka servers without transactions support or don't want Druid to consume only committed transactions, set `isolation.level` to `read_uncommitted`. If need Druid to consume older versions of Kafka, make sure offsets are sequential, since there is no offset gap check in Druid. If your Kafka cluster enables consumer-group based ACLs, you can set `group.id` in `consumerProperties` to override the default auto generated group ID. -In some cases, you may need to fetch consumer properties at runtime. For example, when `bootstrap.servers` is not known upfront, or is not static. To enable SSL connections, you must provide passwords for `keystore`, `truststore`, and `key` secretly. You can provide configurations at runtime with a dynamic config provider implementation like the environment variable config provider that comes with Druid. For more information, see [Dynamic config provider](../operations/dynamic-config-provider.md). +In some cases, you may need to fetch consumer properties at runtime. For example, when `bootstrap.servers` is not known upfront or is not static. To enable SSL connections, you must provide passwords for `keystore`, `truststore`, and `key` secretly. You can provide configurations at runtime with a dynamic config provider implementation like the environment variable config provider that comes with Druid. For more information, see [Dynamic config provider](../operations/dynamic-config-provider.md). For example, if you are using SASL and SSL with Kafka, set the following environment variables for the Druid user on the machines running the Overlord and the Peon services: @@ -195,6 +175,8 @@ export SSL_TRUSTSTORE_PASSWORD=mysecrettruststorepassword Verify that you've changed the values for all configurations to match your own environment. In the Druid data loader interface, you can use the environment variable config provider syntax in the **Consumer properties** field on the **Connect tab**. When connecting to Kafka, Druid replaces the environment variables with their corresponding values. +You can provide SSL connections with [Password provider](../operations/password-provider.md) interface to define the `keystore`, `truststore`, and `key`, but this feature is deprecated. 
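To make the options above concrete, the following is a minimal `consumerProperties` sketch. The broker addresses and the `group.id` value are placeholders for illustration only; substitute the values for your own Kafka cluster:

```json
{
  "bootstrap.servers": "kafka-broker-1:9092,kafka-broker-2:9092",
  "isolation.level": "read_committed",
  "group.id": "my-druid-consumer-group"
}
```

Druid passes any other keys in the map through to the Kafka consumer, so you can supply standard Kafka consumer settings in the same way.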
+ #### Idle configuration :::info @@ -210,7 +192,7 @@ The following table outlines the configuration options for `idleConfig`: |`enabled`|If `true`, the supervisor becomes idle if there is no data on input stream or topic for some time.|No|`false`| |`inactiveAfterMillis`|The supervisor becomes idle if all existing data has been read from input topic and no new data has been published for `inactiveAfterMillis` milliseconds.|No|`600_000`| -The following example shows a supervisor spec with `lagBased` autoscaler and idle configuration enabled: +The following example shows a supervisor spec with idle configuration enabled:
Click to view the example @@ -259,17 +241,202 @@ The following example shows a supervisor spec with `lagBased` autoscaler and idl ```
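If you only want to add idle behavior to an existing supervisor, the relevant fragment of the `ioConfig` is small. The following sketch assumes a hypothetical `social_media` topic and spells out the default ten-minute inactivity window:

```json
"ioConfig": {
  "topic": "social_media",
  "inputFormat": {
    "type": "json"
  },
  "consumerProperties": {
    "bootstrap.servers": "localhost:9092"
  },
  "idleConfig": {
    "enabled": true,
    "inactiveAfterMillis": 600000
  }
}
```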
+#### Data format
+
+The Kafka indexing service supports both [`inputFormat`](data-formats.md#input-format) and [`parser`](data-formats.md#parser) to specify the data format. Use the `inputFormat` to specify the data format for the Kafka indexing service unless you need a format only supported by the legacy `parser`. For more information, see [Source input formats](data-formats.md).
+
+The Kafka indexing service supports the following values for `inputFormat`:
+
+* `csv`
+* `tsv`
+* `json`
+* `kafka`
+* `avro_stream`
+* `avro_ocf`
+* `protobuf`
+
+You can use `parser` to read [`thrift`](../development/extensions-contrib/thrift.md) formats.
+
+##### Kafka input format supervisor spec example
+
+The `kafka` input format lets you parse the Kafka metadata fields in addition to the Kafka payload value contents.
+
+The `kafka` input format wraps around the payload parsing input format and augments the data it outputs with the Kafka event timestamp, the Kafka topic name, the Kafka event headers, and the key field that itself can be parsed using any available input format.
+
+For example, consider the following structure for a Kafka message that represents a wiki edit in a development environment:
+
+- **Kafka timestamp**: `1680795276351`
+- **Kafka topic**: `wiki-edits`
+- **Kafka headers**:
+  - `env=development`
+  - `zone=z1`
+- **Kafka key**: `wiki-edit`
+- **Kafka payload value**: `{"channel":"#sv.wikipedia","timestamp":"2016-06-27T00:00:11.080Z","page":"Salo Toraut","delta":31,"namespace":"Main"}`
+
+Using `{ "type": "json" }` as the input format only parses the payload value.
+To parse the Kafka metadata in addition to the payload, use the `kafka` input format.
+
+You configure it as follows:
+
+- `valueFormat`: Define how to parse the payload value. Set this to the payload parsing input format (`{ "type": "json" }`).
+- `timestampColumnName`: Supply a custom name for the Kafka timestamp in the Druid schema to avoid conflicts with columns from the payload. The default is `kafka.timestamp`.
+- `topicColumnName`: Supply a custom name for the Kafka topic in the Druid schema to avoid conflicts with columns from the payload. The default is `kafka.topic`. This field is useful when ingesting data from multiple topics into the same datasource.
+- `headerFormat`: The default value `string` decodes strings in UTF-8 encoding from the Kafka header.
+  Other supported encoding formats include the following:
+  - `ISO-8859-1`: ISO Latin Alphabet No. 1, that is, ISO-LATIN-1.
+  - `US-ASCII`: Seven-bit ASCII. Also known as ISO646-US. The Basic Latin block of the Unicode character set.
+  - `UTF-16`: Sixteen-bit UCS Transformation Format, byte order identified by an optional byte-order mark.
+  - `UTF-16BE`: Sixteen-bit UCS Transformation Format, big-endian byte order.
+  - `UTF-16LE`: Sixteen-bit UCS Transformation Format, little-endian byte order.
+- `headerColumnPrefix`: Supply a prefix to the Kafka headers to avoid any conflicts with columns from the payload. The default is `kafka.header.`.
+  Considering the header from the example, Druid maps the headers to the following columns: `kafka.header.env`, `kafka.header.zone`.
+- `keyFormat`: Supply an input format to parse the key. Only the first value is used.
+  If, as in the example, your key values are simple strings, then you can use the `tsv` format to parse them.
+ ```json + { + "type": "tsv", + "findColumnsFromHeader": false, + "columns": ["x"] + } + ``` + Note that for `tsv`,`csv`, and `regex` formats, you need to provide a `columns` array to make a valid input format. Only the first one is used, and its name will be ignored in favor of `keyColumnName`. +- `keyColumnName`: Supply the name for the Kafka key column to avoid conflicts with columns from the payload. The default is `kafka.key`. + +The following input format uses default values for `timestampColumnName`, `topicColumnName`, `headerColumnPrefix`, and `keyColumnName`: + +```json +{ + "type": "kafka", + "valueFormat": { + "type": "json" + }, + "headerFormat": { + "type": "string" + }, + "keyFormat": { + "type": "tsv", + "findColumnsFromHeader": false, + "columns": ["x"] + } +} +``` + +It parses the example message as follows: + +```json +{ + "channel": "#sv.wikipedia", + "timestamp": "2016-06-27T00:00:11.080Z", + "page": "Salo Toraut", + "delta": 31, + "namespace": "Main", + "kafka.timestamp": 1680795276351, + "kafka.topic": "wiki-edits", + "kafka.header.env": "development", + "kafka.header.zone": "z1", + "kafka.key": "wiki-edit" +} +``` + +Finally, add these Kafka metadata columns to the `dimensionsSpec` or set your `dimensionsSpec` to auto-detect columns. + +The following supervisor spec demonstrates how to ingest the Kafka header, key, timestamp, and topic into Druid dimensions: + +
+ Click to view the example

```json
{
  "type": "kafka",
  "spec": {
    "ioConfig": {
      "type": "kafka",
      "consumerProperties": {
        "bootstrap.servers": "localhost:9092"
      },
      "topic": "wiki-edits",
      "inputFormat": {
        "type": "kafka",
        "valueFormat": {
          "type": "json"
        },
        "headerFormat": {
          "type": "string"
        },
        "keyFormat": {
          "type": "tsv",
          "findColumnsFromHeader": false,
          "columns": ["x"]
        }
      },
      "useEarliestOffset": true
    },
    "dataSchema": {
      "dataSource": "wikiticker",
      "timestampSpec": {
        "column": "timestamp",
        "format": "posix"
      },
      "dimensionsSpec": {
        "useSchemaDiscovery": true,
        "includeAllDimensions": true
      },
      "granularitySpec": {
        "queryGranularity": "none",
        "rollup": false,
        "segmentGranularity": "day"
      }
    },
    "tuningConfig": {
      "type": "kafka"
    }
  }
}
```
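The example above relies on schema discovery. If you prefer to declare columns explicitly, a `dimensionsSpec` along the following lines could replace the discovery settings. This is a sketch that assumes the default metadata column names described earlier:

```json
"dimensionsSpec": {
  "dimensions": [
    "channel",
    "page",
    "namespace",
    { "type": "long", "name": "delta" },
    "kafka.topic",
    "kafka.key",
    "kafka.header.env",
    "kafka.header.zone",
    { "type": "long", "name": "kafka.timestamp" }
  ]
}
```

Either approach yields the same queryable columns; explicit dimensions simply give you control over column types when the payload fields are stable.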
+ +After Druid ingests the data, you can query the Kafka metadata columns as follows: + +```sql +SELECT + "kafka.header.env", + "kafka.key", + "kafka.timestamp", + "kafka.topic" +FROM "wikiticker" +``` + +This query returns: + +|`kafka.header.env`|`kafka.key`|`kafka.timestamp`|`kafka.topic`| +|------------------|-----------|-----------------|-------------| +|`development`|`wiki-edit`|`1680795276351`|`wiki-edits`| ### Tuning configuration -The following table outlines the Kafka-specific configuration properties for `tuningConfig`: +The following table outlines the `tuningConfig` configuration properties specific to Kafka. +For configuration properties shared across all streaming ingestion methods, refer to [Supervisor tuning configuration](supervisor.md#tuning-configuration). |Property|Type|Description|Required|Default| |--------|----|-----------|--------|-------| |`chatAsync`|Boolean|If `true`, use asynchronous communication with indexing tasks, and ignore the `chatThreads` parameter. If `false`, use synchronous communication in a thread pool of size `chatThreads`.|No|`true`| |`chatThreads`|Integer|The number of threads to use for communicating with indexing tasks. Ignored if `chatAsync` is `true`.|No|`min(10, taskCount * replicas)`| -For configuration properties shared across all streaming ingestion methods supported by Druid, refer to [Supervisor tuning configuration](supervisor.md#tuning-configuration). +## Deployment notes on Kafka partitions and Druid segments + +Druid assigns Kafka partitions to each Kafka indexing task. A task writes the events it consumes from Kafka into a single segment for the segment granularity interval until it reaches one of the following limits: `maxRowsPerSegment`, `maxTotalRows`, or `intermediateHandoffPeriod`. At this point, the task creates a new partition for this segment granularity to contain subsequent events. + +The Kafka indexing task also does incremental hand-offs. Therefore, segments become available as they are ready and you don't have to wait for all segments until the end of the task duration. When the task reaches one of `maxRowsPerSegment`, `maxTotalRows`, or `intermediateHandoffPeriod`, it hands off all the segments and creates a new set of segments for further events. This allows the task to run for longer durations without accumulating old segments locally on MiddleManager services. + +The Kafka indexing service may still produce some small segments. For example, consider the following scenario: +- Task duration is 4 hours. +- Segment granularity is set to an HOUR. +- The supervisor was started at 9:10. +After 4 hours at 13:10, Druid starts a new set of tasks. The events for the interval 13:00 - 14:00 may be split across existing tasks and the new set of tasks which could result in small segments. To merge them together into new segments of an ideal size (in the range of ~500-700 MB per segment), you can schedule re-indexing tasks, optionally with a different segment granularity. + +For information on how to optimize the segment size, see [Segment size optimization](../operations/segment-optimization.md). ## Learn more diff --git a/docs/ingestion/kinesis-ingestion.md b/docs/ingestion/kinesis-ingestion.md index eb9a14e52f3b..5d2838f20277 100644 --- a/docs/ingestion/kinesis-ingestion.md +++ b/docs/ingestion/kinesis-ingestion.md @@ -27,7 +27,7 @@ import TabItem from '@theme/TabItem'; ~ under the License. 
-->

-When you enable the Kinesis indexing service, you can configure supervisors on the Overlord to manage the creation and lifetime of Kinesis indexing tasks. These indexing tasks read events using the Kinesis shard and sequence number mechanism to guarantee exactly-once ingestion. The supervisor oversees the state of the indexing tasks to coordinate handoffs, manage failures, and ensure that scalability and replication requirements are maintained.
+When you enable the Kinesis indexing service, you can configure supervisors on the Overlord to manage the creation and lifetime of Kinesis indexing tasks. Kinesis indexing tasks read events using the Kinesis shard and sequence number mechanism to guarantee exactly-once ingestion. The supervisor oversees the state of the indexing tasks to coordinate handoffs, manage failures, and ensure that scalability and replication requirements are maintained.

 This topic contains configuration information for the Kinesis indexing service supervisor for Apache Druid.

@@ -121,7 +121,8 @@ The following example shows a supervisor spec for a stream with the name `Kinesi

 ### I/O configuration

-The following table outlines the `ioConfig` configuration properties specific to Kinesis:
+The following table outlines the `ioConfig` configuration properties specific to Kinesis.
+For configuration properties shared across all streaming ingestion methods, refer to [Supervisor I/O configuration](supervisor.md#io-configuration).

 |Property|Type|Description|Required|Default|
 |--------|----|-----------|--------|-------|
@@ -134,15 +135,14 @@ The following table outlines the `ioConfig` configuration properties specific to
 |`awsExternalId`|String|The AWS external ID to use for additional permissions.|No||
 |`deaggregate`|Boolean|Whether to use the deaggregate function of the Kinesis Client Library (KCL).|No||

-For configuration properties shared across all streaming ingestion methods supported by Druid, refer to [Supervisor I/O configuration](supervisor.md#io-configuration).
-
 #### Data format

 The Kinesis indexing service supports both [`inputFormat`](data-formats.md#input-format) and [`parser`](data-formats.md#parser) to specify the data format. Use the `inputFormat` to specify the data format for the Kinesis indexing service unless you need a format only supported by the legacy `parser`. For more information, see [Source input formats](data-formats.md).

 The Kinesis indexing service supports the following values for `inputFormat`:
+
 * `csv`
-* `delimited`
+* `tsv`
 * `json`
 * `avro_stream`
 * `avro_ocf`
 * `protobuf`

 You can use `parser` to read [`thrift`](../development/extensions-contrib/thrift.md) formats.

 ### Tuning configuration

-The following table outlines the `tuningConfig` configuration properties specific to Kinesis:
+The following table outlines the `tuningConfig` configuration properties specific to Kinesis.
+For configuration properties shared across all streaming ingestion methods, refer to [Supervisor tuning configuration](supervisor.md#tuning-configuration).

 |Property|Type|Description|Required|Default|
 |--------|----|-----------|--------|-------|
@@ -165,8 +166,6 @@ The following table outlines the `tuningConfig` configuration properties specifi
 |`repartitionTransitionDuration`|ISO 8601 period|When shards are split or merged, the supervisor recomputes shard to task group mappings. The supervisor also signals any running tasks created under the old mappings to stop early at current time + `repartitionTransitionDuration`.
Stopping the tasks early allows Druid to begin reading from the new shards more quickly. The repartition transition wait time controlled by this property gives the stream additional time to write records to the new shards after the split or merge, which helps avoid issues with [empty shard handling](https://github.com/apache/druid/issues/7600).|No|`PT2M`| |`useListShards`|Boolean|Indicates if `listShards` API of AWS Kinesis SDK can be used to prevent `LimitExceededException` during ingestion. You must set the necessary `IAM` permissions.|No|`false`| -For configuration properties shared across all streaming ingestion methods supported by Druid, refer to [Supervisor tuning configuration](supervisor.md#tuning-configuration). - ## AWS authentication Druid uses AWS access and secret keys to authenticate Kinesis API requests. There are a few ways to provide this information to Druid: diff --git a/docs/ingestion/supervisor.md b/docs/ingestion/supervisor.md index 054ebcaa5036..83d03b347fe2 100644 --- a/docs/ingestion/supervisor.md +++ b/docs/ingestion/supervisor.md @@ -44,7 +44,7 @@ The following table outlines the high-level configuration options for a supervis ### I/O configuration The following table outlines the `ioConfig` configuration properties that apply to both Apache Kafka and Amazon Kinesis ingestion methods. -For configuration properties specific to Apache Kafka or Amazon Kinesis, see [Kafka I/O configuration](kafka-ingestion.md#io-configuration) and [Kinesis I/O configuration](kinesis-ingestion.md#io-configuration) respectively. +For configuration properties specific to Apache Kafka and Amazon Kinesis, see [Kafka I/O configuration](kafka-ingestion.md#io-configuration) and [Kinesis I/O configuration](kinesis-ingestion.md#io-configuration) respectively. |Property|Type|Description|Required|Default| |--------|----|-----------|--------|-------| @@ -72,7 +72,7 @@ The following table outlines the configuration properties for `autoScalerConfig` |`taskCountMax`|The maximum number of ingestion tasks. Must be greater than or equal to `taskCountMin`. If `taskCountMax` is greater than the number of Kafka partitions or Kinesis shards, Druid set the maximum number of reading tasks to the number of Kafka partitions or Kinesis shards and ignores `taskCountMax`.|Yes|| |`taskCountMin`|The minimum number of ingestion tasks. When you enable the autoscaler, Druid ignores the value of `taskCount` in `ioConfig` and starts with the `taskCountMin` number of tasks to launch.|Yes|| |`minTriggerScaleActionFrequencyMillis`|The minimum time interval between two scale actions.| No|600000| -|`autoScalerStrategy`|The algorithm of `autoScaler`. Druid only supports the `lagBased` strategy. See [Autoscaler strategy](#autoscaler-strategy) for more information.|No|`lagBased`| +|`autoScalerStrategy`|The algorithm of autoscaler. Druid only supports the `lagBased` strategy. See [Autoscaler strategy](#autoscaler-strategy) for more information.|No|`lagBased`| ##### Autoscaler strategy @@ -80,7 +80,7 @@ The following table outlines the configuration properties for `autoScalerConfig` Unlike the Kafka indexing service, Kinesis reports lag metrics measured in time difference in milliseconds between the current sequence number and latest sequence number, rather than message count. 
::: -The following table outlines the configuration properties for `autoScalerStrategy`: +The following table outlines the configuration properties related to the `lagBased` autoscaler strategy: |Property|Description|Required|Default| |--------|-----------|--------|-------| @@ -95,7 +95,7 @@ The following table outlines the configuration properties for `autoScalerStrateg |`scaleInStep`|The number of tasks to reduce at once when scaling down.|No|1| |`scaleOutStep`|The number of tasks to add at once when scaling out.|No|2| -The following example shows a supervisor spec with `lagBased` auto scaler enabled: +The following example shows a supervisor spec with `lagBased` autoscaler:
Click to view the example

@@ -182,7 +182,8 @@ The following example shows a supervisor spec with `lagBased` auto scaler enable

 ### Tuning configuration

 The `tuningConfig` object is optional. If you don't specify the `tuningConfig` object, Druid uses the default configuration settings.

-The following table outlines the `tuningConfig` configuration properties that apply to both Apache Kafka and Amazon Kinesis ingestion methods:
+The following table outlines the `tuningConfig` configuration properties that apply to both Apache Kafka and Amazon Kinesis ingestion methods.
+For configuration properties specific to Apache Kafka and Amazon Kinesis, see [Kafka tuning configuration](kafka-ingestion.md#tuning-configuration) and [Kinesis tuning configuration](kinesis-ingestion.md#tuning-configuration) respectively.

 |Property|Type|Description|Required|Default|
 |--------|----|-----------|--------|-------|
@@ -199,20 +200,16 @@ The following table outlines the `tuningConfig` configuration properties that ap
 |`indexSpecForIntermediatePersists`|Object|Defines segment storage format options to use at indexing time for intermediate persisted temporary segments. You can use `indexSpecForIntermediatePersists` to disable dimension/metric compression on intermediate segments to reduce memory required for final merging. However, disabling compression on intermediate segments might increase page cache use while they are used before getting merged into final segment published.|No||
 |`reportParseExceptions`|Boolean|DEPRECATED. If `true`, Druid throws exceptions encountered during parsing causing ingestion to halt. If `false`, Druid skips unparseable rows and fields. Setting `reportParseExceptions` to `true` overrides existing configurations for `maxParseExceptions` and `maxSavedParseExceptions`, setting `maxParseExceptions` to 0 and limiting `maxSavedParseExceptions` to not more than 1.|No|`false`|
 |`handoffConditionTimeout`|Long|Number of milliseconds to wait for segment handoff. Set to a value >= 0, where 0 means to wait indefinitely.|No|900000 (15 minutes) for Kafka. 0 for Kinesis.|
-|`resetOffsetAutomatically`|Boolean|Resets partitions when the sequence number is unavailable.
-If set to `true`, Druid resets partitions to the earliest or latest Kafka sequence number or Kinesis offset, based on the value of `useEarliestSequenceNumber` or `useEarliestOffset` (earliest if `true`, latest if `false`). If set to `false`, the exception bubbles up causing tasks to fail and ingestion to halt. If this occurs, manual intervention is required to correct the situation, potentially through [resetting the supervisor](../api-reference/supervisor-api.md#reset-a-supervisor).|No|`false`|
+|`resetOffsetAutomatically`|Boolean|Resets partitions when the offset or sequence number is unavailable. If set to `true`, Druid resets partitions to the earliest or latest Kafka offset or Kinesis sequence number, based on the value of `useEarliestOffset` or `useEarliestSequenceNumber` (earliest if `true`, latest if `false`). If set to `false`, the exception bubbles up causing tasks to fail and ingestion to halt.
If this occurs, manual intervention is required to correct the situation, potentially through [resetting the supervisor](../api-reference/supervisor-api.md#reset-a-supervisor).|No|`false`| |`workerThreads`|Integer|The number of threads that the supervisor uses to handle requests/responses for worker tasks, along with any other internal asynchronous operation.|No|`min(10, taskCount)`| |`chatRetries`|Integer|The number of times Druid retries HTTP requests to indexing tasks before considering tasks unresponsive.|No|8| |`httpTimeout`|ISO 8601 period|The period of time to wait for a HTTP response from an indexing task.|No|`PT10S`| |`shutdownTimeout`|ISO 8601 period|The period of time to wait for the supervisor to attempt a graceful shutdown of tasks before exiting.|No|`PT80S`| +|`offsetFetchPeriod`|ISO 8601 period|Determines how often the supervisor queries the streaming source and the indexing tasks to fetch current offsets and calculate lag. If the user-specified value is below the minimum value of `PT5S`, the supervisor ignores the value and uses the minimum value instead.|No|`PT30S`| |`segmentWriteOutMediumFactory`|Object|The segment write-out medium to use when creating segments. See [Additional Peon configuration: SegmentWriteOutMediumFactory](../configuration/index.md#segmentwriteoutmediumfactory) for explanation and available options.|No|If not specified, Druid uses the value from `druid.peon.defaultSegmentWriteOutMediumFactory.type`.| |`logParseExceptions`|Boolean|If `true`, Druid logs an error message when a parsing exception occurs, containing information about the row where the error occurred.|No|`false`| |`maxParseExceptions`|Integer|The maximum number of parse exceptions that can occur before the task halts ingestion and fails. Overridden if `reportParseExceptions` is set.|No|unlimited| |`maxSavedParseExceptions`|Integer|When a parse exception occurs, Druid keeps track of the most recent parse exceptions. `maxSavedParseExceptions` limits the number of saved exception instances. These saved exceptions are available after the task finishes in the [task completion report](../ingestion/tasks.md#task-reports). Overridden if `reportParseExceptions` is set.|No|0| -|`offsetFetchPeriod`|ISO 8601 period|Determines how often the supervisor queries the streaming source and the indexing tasks to fetch current offsets and calculate lag. If the user-specified value is below the minimum value of `PT5S`, the supervisor ignores the value and uses the minimum value instead.|No|`PT30S`| - -For configuration properties specific to Apache Kafka, see [Kafka tuning configuration](kafka-ingestion.md#tuning-configuration). -For configuration properties specific to Amazon Kinesis, see [Kinesis tuning configuration](kinesis-ingestion.md#tuning-configuration). 
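To illustrate how the shared properties combine in practice, the following is a minimal `tuningConfig` sketch for a Kafka supervisor. The values shown are examples rather than recommendations; tune them against your own throughput and memory budget:

```json
"tuningConfig": {
  "type": "kafka",
  "maxRowsInMemory": 150000,
  "intermediatePersistPeriod": "PT10M",
  "handoffConditionTimeout": 900000,
  "resetOffsetAutomatically": false,
  "logParseExceptions": true,
  "maxParseExceptions": 100,
  "maxSavedParseExceptions": 10
}
```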
## Start a supervisor

Druid starts a new supervisor for a datasource when you create a supervisor spec.
You can create a supervisor spec using the [Load Data](../operations/web-console.md#data-loader) UI in the Druid web console or by calling the [Supervisor API](../api-reference/supervisor-api.md).

The following screenshot shows the [Supervisors](../operations/web-console.md#supervisors) view of the Druid web console for a cluster with two supervisors:

From daf74b3bc3b4014bf429fa534fb49775dc554625 Mon Sep 17 00:00:00 2001
From: Katya Macedo
Date: Mon, 5 Feb 2024 22:30:29 -0600
Subject: [PATCH 09/15] Update after review

---
 docs/ingestion/kafka-ingestion.md   | 8 +++++---
 docs/ingestion/kinesis-ingestion.md | 3 +--
 docs/ingestion/streaming.md         | 2 +-
 docs/ingestion/supervisor.md        | 4 ++--
 4 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/docs/ingestion/kafka-ingestion.md b/docs/ingestion/kafka-ingestion.md
index baa2810f2c1f..9f52d5b9dc0a 100644
--- a/docs/ingestion/kafka-ingestion.md
+++ b/docs/ingestion/kafka-ingestion.md
@@ -122,7 +122,7 @@ For configuration properties shared across all streaming ingestion methods, refe
 |--------|----|-----------|--------|-------|
 |`topic`|String|The Kafka topic to read from. To ingest data from multiple topics, use `topicPattern`. |Yes if `topicPattern` isn't set.||
 |`topicPattern`|String|Multiple Kafka topics to read from, passed as a regex pattern. See [Ingest from multiple topics](#ingest-from-multiple-topics) for more information.|Yes if `topic` isn't set.||
-|`consumerProperties`|String, Object|A map of properties to pass to the Kafka consumer. See [Consumer properties](#consumer-properties) for details.|Yes||
+|`consumerProperties`|String, Object|A map of properties to pass to the Kafka consumer. See [Consumer properties](#consumer-properties) for details.|Yes. You must set the `bootstrap.servers` property to establish the initial connection to the Kafka cluster.||
 |`pollTimeout`|Long|The length of time to wait for the Kafka consumer to poll records, in milliseconds.|No|100|
 |`useEarliestOffset`|Boolean|If a supervisor manages a datasource for the first time, it obtains a set of starting offsets from Kafka. This flag determines whether it retrieves the earliest or latest offsets in Kafka. Under normal circumstances, subsequent tasks start from where the previous segments ended. Druid only uses `useEarliestOffset` on the first run.|No|`false`|
 |`idleConfig`|Object|Defines how and when the Kafka supervisor can become idle. See [Idle configuration](#idle-configuration) for more details.|No|null|
@@ -143,10 +143,12 @@
 #### Consumer properties

-Consumer properties must contain a property `bootstrap.servers` with a list of Kafka brokers in the form: `:,:,...`.
+Consumer properties control how a supervisor reads and processes event messages from a Kafka stream. For more information about consumers, refer to the [Apache Kafka documentation](https://kafka.apache.org/documentation/#consumerconfigs).
+
+The `consumerProperties` object must contain a `bootstrap.servers` property with a list of Kafka brokers in the form: `:,:,...`.
 By default, `isolation.level` is set to `read_committed`.

-If you use older versions of Kafka servers without transactions support or don't want Druid to consume only committed transactions, set `isolation.level` to `read_uncommitted`. If need Druid to consume older versions of Kafka, make sure offsets are sequential, since there is no offset gap check in Druid.
+If you use older versions of Kafka servers without transactions support or don't want Druid to consume only committed transactions, set `isolation.level` to `read_uncommitted`. If you need Druid to consume older versions of Kafka, make sure offsets are sequential, since there is no offset gap check in Druid.
If your Kafka cluster enables consumer-group based ACLs, you can set `group.id` in `consumerProperties` to override the default auto generated group ID.

diff --git a/docs/ingestion/kinesis-ingestion.md b/docs/ingestion/kinesis-ingestion.md
index 5d2838f20277..e0b887b517af 100644
--- a/docs/ingestion/kinesis-ingestion.md
+++ b/docs/ingestion/kinesis-ingestion.md
@@ -306,8 +306,7 @@ Kinesis stream.

 ## Deaggregation

-The Kinesis indexing service supports de-aggregation of multiple rows packed into a single record by the Kinesis
-Producer Library's aggregate method for more efficient data transfer.
+The Kinesis indexing service supports de-aggregation of multiple rows stored within a single [Kinesis Data Streams](https://docs.aws.amazon.com/streams/latest/dev/introduction.html) record for more efficient data transfer.

 To enable this feature, set `deaggregate` to true in your `ioConfig` when submitting a supervisor spec.

diff --git a/docs/ingestion/streaming.md b/docs/ingestion/streaming.md
index de0fd5b3eb01..f0f777b6c627 100644
--- a/docs/ingestion/streaming.md
+++ b/docs/ingestion/streaming.md
@@ -22,7 +22,7 @@ title: "Streaming ingestion"
 ~ under the License.
 -->

-Apache Druid accepts data streams from the following external streaming sources:
+Apache Druid can consume data streams from the following external streaming sources:

 * Apache Kafka through the bundled [Kafka indexing service](kafka-ingestion.md) extension.
 * Amazon Kinesis through the bundled [Kinesis indexing service](kinesis-ingestion.md) extension.

diff --git a/docs/ingestion/supervisor.md b/docs/ingestion/supervisor.md
index 83d03b347fe2..b85c2a84b4bf 100644
--- a/docs/ingestion/supervisor.md
+++ b/docs/ingestion/supervisor.md
@@ -213,8 +213,8 @@ For configuration properties specific to Apache Kafka and Amazon Kinesis, see [K

 ## Start a supervisor

-Druid starts a new supervisor for a datasource when you create a supervisor spec.
-You can create a supervisor spec using the [Load Data](../operations/web-console.md#data-loader) UI in the Druid web console or by calling the [Supervisor API](../api-reference/supervisor-api.md).
+Druid starts a new supervisor when you submit a supervisor spec.
+You can submit the supervisor spec using the Druid console [data loader](../operations/web-console.md#data-loader) or by calling the [Supervisor API](../api-reference/supervisor-api.md).

 The following screenshot shows the [Supervisors](../operations/web-console.md#supervisors) view of the Druid web console for a cluster with two supervisors:

From a39e67504ddb6536895e127a359e8779a4ccbb46 Mon Sep 17 00:00:00 2001
From: Katya Macedo
Date: Tue, 6 Feb 2024 09:08:37 -0600
Subject: [PATCH 10/15] Minor text edit

---
 docs/ingestion/kafka-ingestion.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/ingestion/kafka-ingestion.md b/docs/ingestion/kafka-ingestion.md
index 9f52d5b9dc0a..c55fc5e626c5 100644
--- a/docs/ingestion/kafka-ingestion.md
+++ b/docs/ingestion/kafka-ingestion.md
@@ -122,7 +122,7 @@ For configuration properties shared across all streaming ingestion methods, refe
 |--------|----|-----------|--------|-------|
 |`topic`|String|The Kafka topic to read from. To ingest data from multiple topics, use `topicPattern`. |Yes if `topicPattern` isn't set.||
 |`topicPattern`|String|Multiple Kafka topics to read from, passed as a regex pattern.
See [Ingest from multiple topics](#ingest-from-multiple-topics) for more information.|Yes if `topic` isn't set.||
-|`consumerProperties`|String, Object|A map of properties to pass to the Kafka consumer. See [Consumer properties](#consumer-properties) for details.|Yes. You must set the `bootstrap.servers` property to establish the initial connection to the Kafka cluster.||
+|`consumerProperties`|String, Object|A map of properties to pass to the Kafka consumer. See [Consumer properties](#consumer-properties) for details.|Yes. At a minimum, you must set the `bootstrap.servers` property to establish the initial connection to the Kafka cluster.||
 |`pollTimeout`|Long|The length of time to wait for the Kafka consumer to poll records, in milliseconds.|No|100|
 |`useEarliestOffset`|Boolean|If a supervisor manages a datasource for the first time, it obtains a set of starting offsets from Kafka. This flag determines whether it retrieves the earliest or latest offsets in Kafka. Under normal circumstances, subsequent tasks start from where the previous segments ended. Druid only uses `useEarliestOffset` on the first run.|No|`false`|
 |`idleConfig`|Object|Defines how and when the Kafka supervisor can become idle. See [Idle configuration](#idle-configuration) for more details.|No|null|

From ac45636b29cde3883a4f010b321773bc8139f269 Mon Sep 17 00:00:00 2001
From: Katya Macedo
Date: Thu, 8 Feb 2024 14:03:13 -0600
Subject: [PATCH 11/15] Update example syntax

---
 docs/api-reference/supervisor-api.md | 22 +++++++++++-----------
 docs/ingestion/supervisor.md         |  2 +-
 2 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/docs/api-reference/supervisor-api.md b/docs/api-reference/supervisor-api.md
index a88b1a984a72..38dc368a54f9 100644
--- a/docs/api-reference/supervisor-api.md
+++ b/docs/api-reference/supervisor-api.md
@@ -845,7 +845,7 @@ Retrieves the specification for a single supervisor. The returned specification

#### URL

-GET /druid/indexer/v1/supervisor/:supervisorId
+GET /druid/indexer/v1/supervisor/{supervisorId}

#### Responses

@@ -1209,7 +1209,7 @@ For additional information about the status report, see [Supervisor reference](.

#### URL

-GET /druid/indexer/v1/supervisor/:supervisorId/status
+GET /druid/indexer/v1/supervisor/{supervisorId}/status

#### Responses

@@ -1313,7 +1313,7 @@ Retrieves the current health report for a single supervisor. The health of a sup

#### URL

-GET /druid/indexer/v1/supervisor/:supervisorId/health
+GET /druid/indexer/v1/supervisor/{supervisorId}/health

#### Responses

@@ -1382,7 +1382,7 @@ Returns a snapshot of the current ingestion row counters for each task being man

#### URL

-GET /druid/indexer/v1/supervisor/:supervisorId/stats
+GET /druid/indexer/v1/supervisor/{supervisorId}/stats

#### Responses

@@ -1848,7 +1848,7 @@ Retrieves an audit history of specs for a single supervisor.

#### URL

-GET /druid/indexer/v1/supervisor/:supervisorId/history
+GET /druid/indexer/v1/supervisor/{supervisorId}/history

#### Responses

@@ -2400,7 +2400,7 @@ Suspends a single running supervisor. Returns the updated supervisor spec, where
 Indexing tasks remain suspended until you [resume the supervisor](#resume-a-supervisor).
 #### URL
-POST /druid/indexer/v1/supervisor/:supervisorId/suspend
+POST /druid/indexer/v1/supervisor/{supervisorId}/suspend

#### Responses

@@ -2823,7 +2823,7 @@ Resumes indexing tasks for a supervisor.
Returns an updated supervisor spec with

#### URL

-POST /druid/indexer/v1/supervisor/:supervisorId/resume
+POST /druid/indexer/v1/supervisor/{supervisorId}/resume

#### Responses

@@ -3255,7 +3255,7 @@ The indexing service keeps track of the latest persisted offsets in Kafka or seq

#### URL

-POST /druid/indexer/v1/supervisor/:supervisorId/reset
+POST /druid/indexer/v1/supervisor/{supervisorId}/reset

#### Responses

@@ -3330,7 +3330,7 @@ Use this endpoint with caution. It can cause skipped messages, leading to data l

#### URL

-POST /druid/indexer/v1/supervisor/:supervisorId/resetOffsets
+POST /druid/indexer/v1/supervisor/{supervisorId}/resetOffsets

#### Responses

@@ -3433,7 +3433,7 @@ The terminated supervisor still exists in the metadata store and its history can

#### URL

-POST /druid/indexer/v1/supervisor/:supervisorId/terminate
+POST /druid/indexer/v1/supervisor/{supervisorId}/terminate

#### Responses

@@ -3553,4 +3553,4 @@ Shuts down a supervisor. This endpoint is deprecated and will be removed in futu

#### URL

-POST /druid/indexer/v1/supervisor/:supervisorId/shutdown
+POST /druid/indexer/v1/supervisor/{supervisorId}/shutdown
diff --git a/docs/ingestion/supervisor.md b/docs/ingestion/supervisor.md
index b85c2a84b4bf..b5411f4c1685 100644
--- a/docs/ingestion/supervisor.md
+++ b/docs/ingestion/supervisor.md
@@ -192,7 +192,7 @@ For configuration properties specific to Apache Kafka and Amazon Kinesis, see [K
 |`maxBytesInMemory`|Long|The number of bytes to aggregate in heap memory before persisting. This is based on a rough estimate of memory usage and not actual usage. Normally, this is computed internally. The maximum heap memory usage for indexing is `maxBytesInMemory * (2 + maxPendingPersists)`.|No|One-sixth of max JVM memory|
 |`skipBytesInMemoryOverheadCheck`|Boolean|The calculation of `maxBytesInMemory` takes into account overhead objects created during ingestion and each intermediate persist. To exclude the bytes of these overhead objects from the `maxBytesInMemory` check, set `skipBytesInMemoryOverheadCheck` to `true`.|No|`false`|
 |`maxRowsPerSegment`|Integer|The number of rows to store in a segment. This number is post-aggregation rows. Handoff occurs when `maxRowsPerSegment` or `maxTotalRows` is reached or every `intermediateHandoffPeriod`, whichever happens first.|No|5000000|
-|`maxTotalRows`|Long|The number of rows to aggregate across all segments; this number is post-aggregation rows. Handoff happens either if `maxRowsPerSegment` or `maxTotalRows` is reached or every `intermediateHandoffPeriod`, whichever happens earlier.|No|20000000 for Kafka. Unlimited for Kinesis.|
+|`maxTotalRows`|Long|The number of rows to aggregate across all segments; this number is post-aggregation rows. Handoff happens either if `maxRowsPerSegment` or `maxTotalRows` is reached or every `intermediateHandoffPeriod`, whichever happens earlier.|No|20000000|
 |`intermediateHandoffPeriod`|ISO 8601 period|The period that determines how often tasks hand off segments. Handoff occurs if `maxRowsPerSegment` or `maxTotalRows` is reached or every `intermediateHandoffPeriod`, whichever happens first.|No|`P2147483647D`|
 |`intermediatePersistPeriod`|ISO 8601 period|The period that determines the rate at which intermediate persists occur.|No|`PT10M`|
 |`maxPendingPersists`|Integer|Maximum number of persists that can be pending but not started. If a new intermediate persist exceeds this limit, Druid blocks ingestion until the currently running persist finishes.
One persist can be running concurrently with ingestion, and none can be queued up. The maximum heap memory usage for indexing scales is `maxRowsInMemory * (2 + maxPendingPersists)`.|No|0|

From 9803b11b932a16de7a26d5e94b49125e803047d0 Mon Sep 17 00:00:00 2001
From: Katya Macedo
Date: Fri, 9 Feb 2024 12:42:38 -0600
Subject: [PATCH 12/15] Revert back to colon

---
 docs/api-reference/supervisor-api.md | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/docs/api-reference/supervisor-api.md b/docs/api-reference/supervisor-api.md
index 38dc368a54f9..a88b1a984a72 100644
--- a/docs/api-reference/supervisor-api.md
+++ b/docs/api-reference/supervisor-api.md
@@ -845,7 +845,7 @@ Retrieves the specification for a single supervisor. The returned specification

#### URL

-GET /druid/indexer/v1/supervisor/{supervisorId}
+GET /druid/indexer/v1/supervisor/:supervisorId

#### Responses

@@ -1209,7 +1209,7 @@ For additional information about the status report, see [Supervisor reference](.

#### URL

-GET /druid/indexer/v1/supervisor/{supervisorId}/status
+GET /druid/indexer/v1/supervisor/:supervisorId/status

#### Responses

@@ -1313,7 +1313,7 @@ Retrieves the current health report for a single supervisor. The health of a sup

#### URL

-GET /druid/indexer/v1/supervisor/{supervisorId}/health
+GET /druid/indexer/v1/supervisor/:supervisorId/health

#### Responses

@@ -1382,7 +1382,7 @@ Returns a snapshot of the current ingestion row counters for each task being man

#### URL

-GET /druid/indexer/v1/supervisor/{supervisorId}/stats
+GET /druid/indexer/v1/supervisor/:supervisorId/stats

#### Responses

@@ -1848,7 +1848,7 @@ Retrieves an audit history of specs for a single supervisor.

#### URL

-GET /druid/indexer/v1/supervisor/{supervisorId}/history
+GET /druid/indexer/v1/supervisor/:supervisorId/history

#### Responses

@@ -2400,7 +2400,7 @@ Suspends a single running supervisor. Returns the updated supervisor spec, where
 Indexing tasks remain suspended until you [resume the supervisor](#resume-a-supervisor).
 #### URL
-POST /druid/indexer/v1/supervisor/{supervisorId}/suspend
+POST /druid/indexer/v1/supervisor/:supervisorId/suspend

#### Responses

@@ -2823,7 +2823,7 @@ Resumes indexing tasks for a supervisor. Returns an updated supervisor spec with

#### URL

-POST /druid/indexer/v1/supervisor/{supervisorId}/resume
+POST /druid/indexer/v1/supervisor/:supervisorId/resume

#### Responses

@@ -3255,7 +3255,7 @@ The indexing service keeps track of the latest persisted offsets in Kafka or seq

#### URL

-POST /druid/indexer/v1/supervisor/{supervisorId}/reset
+POST /druid/indexer/v1/supervisor/:supervisorId/reset

#### Responses

@@ -3330,7 +3330,7 @@ Use this endpoint with caution. It can cause skipped messages, leading to data l

#### URL

-POST /druid/indexer/v1/supervisor/{supervisorId}/resetOffsets
+POST /druid/indexer/v1/supervisor/:supervisorId/resetOffsets

#### Responses

@@ -3433,7 +3433,7 @@ The terminated supervisor still exists in the metadata store and its history can

#### URL

-POST /druid/indexer/v1/supervisor/{supervisorId}/terminate
+POST /druid/indexer/v1/supervisor/:supervisorId/terminate

#### Responses

@@ -3553,4 +3553,4 @@ Shuts down a supervisor.
This endpoint is deprecated and will be removed in futu #### URL -POST /druid/indexer/v1/supervisor/{supervisorId}/shutdown +POST /druid/indexer/v1/supervisor/:supervisorId/shutdown From ae7e66b6f01e7bc8fdc080e4fd20031c571f97c7 Mon Sep 17 00:00:00 2001 From: Katya Macedo Date: Fri, 9 Feb 2024 15:02:28 -0600 Subject: [PATCH 13/15] Fix merge conflicts --- docs/design/storage.md | 2 +- .../kafka-supervisor-reference.md | 262 ------- .../extensions-core/kinesis-ingestion.md | 721 ------------------ docs/ingestion/kafka-ingestion.md | 3 +- docs/ingestion/kinesis-ingestion.md | 29 +- 5 files changed, 12 insertions(+), 1005 deletions(-) delete mode 100644 docs/development/extensions-core/kafka-supervisor-reference.md delete mode 100644 docs/development/extensions-core/kinesis-ingestion.md diff --git a/docs/design/storage.md b/docs/design/storage.md index 50fb27b8d411..73e0b85fa9a9 100644 --- a/docs/design/storage.md +++ b/docs/design/storage.md @@ -114,7 +114,7 @@ Druid has an architectural separation between ingestion and querying, as describ On the ingestion side, Druid's primary [ingestion methods](../ingestion/index.md#ingestion-methods) are all pull-based and offer transactional guarantees. This means that you are guaranteed that ingestion using these methods will publish in an all-or-nothing manner: -- Supervised "seekable-stream" ingestion methods like [Kafka](../ingestion/kafka-ingestion.md) and [Kinesis](../ingestion/kinesis-ingestion.md). With these methods, Druid commits stream offsets to its [metadata store](#metadata-storage.md) alongside segment metadata, in the same transaction. Note that ingestion of data that has not yet been published can be rolled back if ingestion tasks fail. In this case, partially-ingested data is +- Supervised "seekable-stream" ingestion methods like [Kafka](../ingestion/kafka-ingestion.md) and [Kinesis](../ingestion/kinesis-ingestion.md). With these methods, Druid commits stream offsets to its [metadata store](metadata-storage.md) alongside segment metadata, in the same transaction. Note that ingestion of data that has not yet been published can be rolled back if ingestion tasks fail. In this case, partially-ingested data is discarded, and Druid will resume ingestion from the last committed set of stream offsets. This ensures exactly-once publishing behavior. - [Hadoop-based batch ingestion](../ingestion/hadoop.md). Each task publishes all segment metadata in a single transaction. - [Native batch ingestion](../ingestion/native-batch.md). In parallel mode, the supervisor task publishes all segment metadata in a single transaction after the subtasks are finished. In simple (single-task) mode, the single task publishes all segment metadata in a single transaction after it is complete. diff --git a/docs/development/extensions-core/kafka-supervisor-reference.md b/docs/development/extensions-core/kafka-supervisor-reference.md deleted file mode 100644 index ad89b73231bb..000000000000 --- a/docs/development/extensions-core/kafka-supervisor-reference.md +++ /dev/null @@ -1,262 +0,0 @@ ---- -id: kafka-supervisor-reference -title: "Apache Kafka supervisor reference" -sidebar_label: "Apache Kafka supervisor" -description: "Reference topic for Apache Kafka supervisors" ---- - - - -This topic contains configuration reference information for the Apache Kafka supervisor for Apache Druid. - -The following table outlines the high-level configuration options: - -|Property|Type|Description|Required| -|--------|----|-----------|--------| -|`type`|String|The supervisor type. 
For Kafka streaming, set to `kafka`.|Yes| -|`spec`|Object|The container object for the supervisor configuration.|Yes| -|`ioConfig`|Object|The I/O configuration object to define the Kafka connection and I/O-related settings for the supervisor and indexing task. See [Supervisor I/O configuration](#supervisor-io-configuration).|Yes| -|`dataSchema`|Object|The schema for the Kafka indexing task to use during ingestion.|Yes| -|`tuningConfig`|Object|The tuning configuration object to define performance-related settings for the supervisor and indexing tasks. See [Supervisor tuning configuration](#supervisor-tuning-configuration).|No| - -## Supervisor I/O configuration - -The following table outlines the configuration options for `ioConfig`: - -|Property|Type|Description|Required|Default| -|--------|----|-----------|--------|-------| -|`topic`|String|The Kafka topic to read from. Must be a specific topic. Druid does not support topic patterns.|Yes|| -|`inputFormat`|Object|The input format to define input data parsing. See [Specifying data format](#specifying-data-format) for details about specifying the input format.|Yes|| -|`consumerProperties`|String, Object|A map of properties to pass to the Kafka consumer. See [Consumer properties](#consumer-properties).|Yes|| -|`pollTimeout`|Long|The length of time to wait for the Kafka consumer to poll records, in milliseconds.|No|100| -|`replicas`|Integer|The number of replica sets, where 1 is a single set of tasks (no replication). Druid always assigns replicate tasks to different workers to provide resiliency against process failure.|No|1| -|`taskCount`|Integer|The maximum number of reading tasks in a replica set. The maximum number of reading tasks equals `taskCount * replicas`. The total number of tasks, reading and publishing, is greater than this count. See [Capacity planning](./kafka-supervisor-operations.md#capacity-planning) for more details. When `taskCount > {numKafkaPartitions}`, the actual number of reading tasks is less than the `taskCount` value.|No|1| -|`taskDuration`|ISO 8601 period|The length of time before tasks stop reading and begin publishing segments.|No|PT1H| -|`startDelay`|ISO 8601 period|The period to wait before the supervisor starts managing tasks.|No|PT5S| -|`period`|ISO 8601 period|Determines how often the supervisor executes its management logic. Note that the supervisor also runs in response to certain events, such as tasks succeeding, failing, and reaching their task duration. The `period` value specifies the maximum time between iterations.|No|PT30S| -|`useEarliestOffset`|Boolean|If a supervisor manages a datasource for the first time, it obtains a set of starting offsets from Kafka. This flag determines whether it retrieves the earliest or latest offsets in Kafka. Under normal circumstances, subsequent tasks start from where the previous segments ended. Druid only uses `useEarliestOffset` on the first run.|No|`false`| -|`completionTimeout`|ISO 8601 period|The length of time to wait before declaring a publishing task as failed and terminating it. If the value is too low, your tasks may never publish. The publishing clock for a task begins roughly after `taskDuration` elapses.|No|PT30M| -|`lateMessageRejectionStartDateTime`|ISO 8601 date time|Configure tasks to reject messages with timestamps earlier than this date time. For example, if this property is set to `2016-01-01T11:00Z` and the supervisor creates a task at `2016-01-01T12:00Z`, Druid drops messages with timestamps earlier than `2016-01-01T11:00Z`. 
This can prevent concurrency issues if your data stream has late messages and you have multiple pipelines that need to operate on the same segments, such as a realtime and a nightly batch ingestion pipeline.|No||
-|`lateMessageRejectionPeriod`|ISO 8601 period|Configure tasks to reject messages with timestamps earlier than this period before the task was created. For example, if this property is set to `PT1H` and the supervisor creates a task at `2016-01-01T12:00Z`, Druid drops messages with timestamps earlier than `2016-01-01T11:00Z`. This may help prevent concurrency issues if your data stream has late messages and you have multiple pipelines that need to operate on the same segments, such as a realtime and a nightly batch ingestion pipeline. Note that you can specify only one of the late message rejection properties.|No||
-|`earlyMessageRejectionPeriod`|ISO 8601 period|Configure tasks to reject messages with timestamps later than this period after the task reached its task duration. For example, if this property is set to `PT1H`, the task duration is set to `PT1H`, and the supervisor creates a task at `2016-01-01T12:00Z`, Druid drops messages with timestamps later than `2016-01-01T14:00Z`. Tasks sometimes run past their task duration, such as in cases of supervisor failover. Setting `earlyMessageRejectionPeriod` too low may cause Druid to drop messages unexpectedly whenever a task runs past its originally configured task duration.|No||
-|`autoScalerConfig`|Object|Defines autoscaling behavior for Kafka ingest tasks. See [Task autoscaler properties](#task-autoscaler-properties).|No|null|
-|`idleConfig`|Object|Defines how and when the Kafka supervisor can become idle. See [Idle supervisor configuration](#idle-supervisor-configuration) for more details.|No|null|
-
-### Task autoscaler properties
-
-The following table outlines the configuration options for `autoScalerConfig`:
-
-|Property|Description|Required|Default|
-|--------|-----------|--------|-------|
-|`enableTaskAutoScaler`|Enables or disables autoscaling. When `false` or unset, Druid disables the autoscaler even if `autoScalerConfig` is not null.|No|`false`|
-|`taskCountMax`|Maximum number of ingestion tasks. Set `taskCountMax >= taskCountMin`. If `taskCountMax > {numKafkaPartitions}`, Druid only scales reading tasks up to `{numKafkaPartitions}`. In this case, `taskCountMax` is ignored.|Yes||
-|`taskCountMin`|Minimum number of ingestion tasks. When you enable the autoscaler, Druid ignores the value of `taskCount` in `ioConfig` and starts with the `taskCountMin` number of tasks.|Yes||
-|`minTriggerScaleActionFrequencyMillis`|Minimum time interval between two scale actions.|No|600000|
-|`autoScalerStrategy`|The autoscaling algorithm. Only `lagBased` is supported. See [Lag based autoscaler strategy related properties](#lag-based-autoscaler-strategy-related-properties) for details.|No|`lagBased`|
-
-### Lag based autoscaler strategy related properties
-
-The following table outlines the configuration options for `autoScalerStrategy`:
-
-|Property|Description|Required|Default|
-|--------|-----------|--------|-------|
-|`lagCollectionIntervalMillis`|The time period during which Druid collects lag metric points.|No|30000|
-|`lagCollectionRangeMillis`|The total time window of lag collection. Use with `lagCollectionIntervalMillis` to specify the intervals at which to collect lag metric points.|No|600000|
-|`scaleOutThreshold`|The lag threshold that triggers a scale out action.|No|6000000|
-|`triggerScaleOutFractionThreshold`|Triggers a scale out action if `triggerScaleOutFractionThreshold` percent of lag points is higher than `scaleOutThreshold`.|No|0.3|
-|`scaleInThreshold`|The lag threshold that triggers a scale in action.|No|1000000|
-|`triggerScaleInFractionThreshold`|Triggers a scale in action if `triggerScaleInFractionThreshold` percent of lag points is lower than `scaleInThreshold`.|No|0.9|
-|`scaleActionStartDelayMillis`|The number of milliseconds to delay after the supervisor starts before the first scale logic check.|No|300000|
-|`scaleActionPeriodMillis`|The frequency in milliseconds to check if a scale action is triggered.|No|60000|
-|`scaleInStep`|The number of tasks to reduce at once when scaling down.|No|1|
-|`scaleOutStep`|The number of tasks to add at once when scaling out.|No|2|
-
-### Ingesting from multiple topics
-
-To ingest data from multiple topics, set `topicPattern` in the supervisor I/O configuration instead of `topic`.
-You can pass multiple topics as a regex pattern as the value for `topicPattern`. For example, to
-ingest data from clicks and impressions, set `topicPattern` to `clicks|impressions`. Similarly, you can use
-`metrics-.*` as the value for `topicPattern` if you want to ingest from all the topics that
-start with `metrics-`. If new topics that match the regex are added to the cluster, Druid automatically starts
-ingesting from those new topics. A topic name that matches the pattern only partially, such as `my-metrics-12`, is not
-included for ingestion. If you enable multi-topic ingestion for a datasource, downgrading to a version older than
-28.0.0 causes ingestion for that datasource to fail.
-
-When ingesting data from multiple topics, partitions are assigned based on the hashcode of the topic name and the
-id of the partition within that topic. The partition assignment might not be uniform across all the tasks. Druid also
-assumes that partitions across individual topics have similar load. If you want to ingest from both high-load and
-low-load topics in the same supervisor, it is recommended that the high-load topic have a higher number of partitions
-and the low-load topic a lower number of partitions.
-
-## Idle supervisor configuration
-
-:::info
- Note that idle state transitioning is currently designated as experimental.
-:::
-
-|Property|Description|Required|Default|
-|--------|-----------|--------|-------|
-|`enabled`|If `true`, the supervisor becomes idle if there is no data on the input stream or topic for some time.|No|`false`|
-|`inactiveAfterMillis`|The supervisor becomes idle if all existing data has been read from the input topic and no new data has been published for `inactiveAfterMillis` milliseconds.|No|`600_000`|
-
-When the supervisor enters the idle state, it launches no new tasks after the currently executing tasks complete. This strategy can reduce costs for cluster operators whose topics receive data only sporadically.
-
-The following example demonstrates a supervisor spec with the `lagBased` autoscaler and idle configuration enabled:
-
-```json
-{
-  "type": "kafka",
-  "spec": {
-    "dataSchema": {
-      ...
-    },
-    "ioConfig": {
-      "topic": "metrics",
-      "inputFormat": {
-        "type": "json"
-      },
-      "consumerProperties": {
-        "bootstrap.servers": "localhost:9092"
-      },
-      "autoScalerConfig": {
-        "enableTaskAutoScaler": true,
-        "taskCountMax": 6,
-        "taskCountMin": 2,
-        "minTriggerScaleActionFrequencyMillis": 600000,
-        "autoScalerStrategy": "lagBased",
-        "lagCollectionIntervalMillis": 30000,
-        "lagCollectionRangeMillis": 600000,
-        "scaleOutThreshold": 6000000,
-        "triggerScaleOutFractionThreshold": 0.3,
-        "scaleInThreshold": 1000000,
-        "triggerScaleInFractionThreshold": 0.9,
-        "scaleActionStartDelayMillis": 300000,
-        "scaleActionPeriodMillis": 60000,
-        "scaleInStep": 1,
-        "scaleOutStep": 2
-      },
-      "taskCount": 1,
-      "replicas": 1,
-      "taskDuration": "PT1H",
-      "idleConfig": {
-        "enabled": true,
-        "inactiveAfterMillis": 600000
-      }
-    },
-    "tuningConfig": {
-      ...
-    }
-  }
-}
-```
-
-## Consumer properties
-
-Consumer properties must contain a property `bootstrap.servers` with a list of Kafka brokers in the form `host1:port1,host2:port2,...`.
-By default, `isolation.level` is set to `read_committed`. If you use older versions of Kafka servers without transactions support or don't want Druid to consume only committed transactions, set `isolation.level` to `read_uncommitted`.
-
-In some cases, you may need to fetch consumer properties at runtime, for example, when `bootstrap.servers` is not known upfront or is not static. To enable SSL connections, you must securely provide the passwords for the `keystore`, `truststore`, and `key`. You can provide configurations at runtime with a dynamic config provider implementation like the environment variable config provider that comes with Druid. For more information, see [Dynamic config provider](../../operations/dynamic-config-provider.md).
-
-For example, if you are using SASL and SSL with Kafka, set the following environment variables for the Druid user on the machines running the Overlord and the Peon services:
-
-```
-export KAFKA_JAAS_CONFIG="org.apache.kafka.common.security.plain.PlainLoginModule required username='admin_user' password='admin_password';"
-export SSL_KEY_PASSWORD=mysecretkeypassword
-export SSL_KEYSTORE_PASSWORD=mysecretkeystorepassword
-export SSL_TRUSTSTORE_PASSWORD=mysecrettruststorepassword
-```
-
-```
-"druid.dynamic.config.provider": {
-  "type": "environment",
-  "variables": {
-    "sasl.jaas.config": "KAFKA_JAAS_CONFIG",
-    "ssl.key.password": "SSL_KEY_PASSWORD",
-    "ssl.keystore.password": "SSL_KEYSTORE_PASSWORD",
-    "ssl.truststore.password": "SSL_TRUSTSTORE_PASSWORD"
-  }
-}
-```
-
-Verify that you've changed the values for all configurations to match your own environment. You can use the environment variable config provider syntax in the **Consumer properties** field on the **Connect tab** in the **Load Data** UI in the web console. When connecting to Kafka, Druid replaces the environment variables with their corresponding values.
-
-You can also use the [Password provider](../../operations/password-provider.md) interface to define the `keystore`, `truststore`, and `key` passwords for SSL connections, but this feature is deprecated.
-
-## Specifying data format
-
-The Kafka indexing service supports both [`inputFormat`](../../ingestion/data-formats.md#input-format) and [`parser`](../../ingestion/data-formats.md#parser) to specify the data format.
-Use the `inputFormat` to specify the data format for the Kafka indexing service unless you need a format only supported by the legacy `parser`.
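-
-For example, a minimal sketch of where `inputFormat` sits in the supervisor's I/O configuration, reusing the JSON-encoded `metrics` topic from the example above:
-
-```json
-"ioConfig": {
-  "topic": "metrics",
-  "inputFormat": {
-    "type": "json"
-  }
-}
-```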
-
-Druid supports the following input formats:
-
-- `csv`
-- `tsv`
-- `json`
-- `kafka`
-- `avro_stream`
-- `avro_ocf`
-- `protobuf`
-
-For more information, see [Data formats](../../ingestion/data-formats.md). You can also read [`thrift`](../extensions-contrib/thrift.md) formats using `parser`.
-
-## Supervisor tuning configuration
-
-The `tuningConfig` object is optional. If you don't specify the `tuningConfig` object, Druid uses the default configuration settings.
-
-|Property|Type|Description|Required|Default|
-|--------|----|-----------|--------|-------|
-|`type`|String|The indexing task type. This should always be `kafka`.|Yes||
-|`maxRowsInMemory`|Integer|The number of rows to aggregate before persisting. This number represents the post-aggregation rows. It is not equivalent to the number of input events, but the resulting number of aggregated rows. Druid uses `maxRowsInMemory` to manage the required JVM heap size. The maximum heap memory usage for indexing scales with `maxRowsInMemory * (2 + maxPendingPersists)`. Normally, you do not need to set this. However, if your rows are small in terms of bytes, you may not want to store a million rows in memory, and you should set this value.|No|150000|
-|`maxBytesInMemory`|Long|The number of bytes to aggregate in heap memory before persisting. This is based on a rough estimate of memory usage and not actual usage. Normally, this is computed internally. The maximum heap memory usage for indexing is `maxBytesInMemory * (2 + maxPendingPersists)`.|No|One-sixth of max JVM memory|
-|`skipBytesInMemoryOverheadCheck`|Boolean|The calculation of `maxBytesInMemory` takes into account overhead objects created during ingestion and each intermediate persist. To exclude the bytes of these overhead objects from the `maxBytesInMemory` check, set `skipBytesInMemoryOverheadCheck` to `true`.|No|`false`|
-|`maxRowsPerSegment`|Integer|The number of rows to store in a segment. This number is post-aggregation rows. Handoff occurs when `maxRowsPerSegment` or `maxTotalRows` is reached or every `intermediateHandoffPeriod`, whichever happens first.|No|5000000|
-|`maxTotalRows`|Long|The number of rows to aggregate across all segments; this number is post-aggregation rows. Handoff occurs when `maxRowsPerSegment` or `maxTotalRows` is reached or every `intermediateHandoffPeriod`, whichever happens first.|No|20000000|
-|`intermediateHandoffPeriod`|ISO 8601 period|The period that determines how often tasks hand off segments. Handoff occurs if `maxRowsPerSegment` or `maxTotalRows` is reached or every `intermediateHandoffPeriod`, whichever happens first.|No|P2147483647D|
-|`intermediatePersistPeriod`|ISO 8601 period|The period that determines the rate at which intermediate persists occur.|No|PT10M|
-|`maxPendingPersists`|Integer|Maximum number of persists that can be pending but not started. If a new intermediate persist exceeds this limit, Druid blocks ingestion until the currently running persist finishes. One persist can be running concurrently with ingestion, and none can be queued up. The maximum heap memory usage for indexing scales with `maxRowsInMemory * (2 + maxPendingPersists)`.|No|0|
-|`numPersistThreads`|Integer|The number of threads to use to create and persist incremental segments on the disk. Higher ingestion data throughput results in a larger number of incremental segments, causing significant CPU time to be spent on their creation. For datasources with hundreds or thousands of columns, creating the incremental segments can take multiple seconds. In both of these scenarios, ingestion can stall or pause frequently, causing it to fall behind. You can use additional threads to parallelize the segment creation without blocking ingestion as long as there are sufficient CPU resources available.|No|1|
-|`indexSpec`|Object|Defines how Druid indexes the data. See [IndexSpec](#indexspec) for more information.|No||
-|`indexSpecForIntermediatePersists`|Object|Defines segment storage format options to use at indexing time for intermediate persisted temporary segments. You can use `indexSpecForIntermediatePersists` to disable dimension/metric compression on intermediate segments to reduce memory required for final merging. However, disabling compression on intermediate segments might increase page cache use while they are used before they are merged into the final published segment. See [IndexSpec](#indexspec) for possible values.|No|Same as `indexSpec`|
-|`reportParseExceptions`|Boolean|DEPRECATED. If `true`, Druid throws exceptions encountered during parsing, causing ingestion to halt. If `false`, Druid skips unparseable rows and fields. Setting `reportParseExceptions` to `true` overrides existing configurations for `maxParseExceptions` and `maxSavedParseExceptions`, setting `maxParseExceptions` to 0 and limiting `maxSavedParseExceptions` to no more than 1.|No|`false`|
-|`handoffConditionTimeout`|Long|Number of milliseconds to wait for segment handoff. Set to a value >= 0, where 0 means to wait indefinitely.|No|900000 (15 minutes)|
-|`resetOffsetAutomatically`|Boolean|Controls behavior when Druid needs to read Kafka messages that are no longer available, that is, when `offsetOutOfRangeException` is encountered.<br/>If `false`, the exception bubbles up, causing tasks to fail and ingestion to halt. If this occurs, manual intervention is required to correct the situation, potentially using the [Reset Supervisor API](../../api-reference/supervisor-api.md). This mode is useful for production, since it makes you aware of issues with ingestion.<br/>If `true`, Druid automatically resets to the earliest or latest offset available in Kafka, based on the value of the `useEarliestOffset` property (earliest if `true`, latest if `false`). Note that this can lead to dropping data (if `useEarliestOffset` is `false`) or duplicating data (if `useEarliestOffset` is `true`) without your knowledge. Druid logs messages indicating that a reset has occurred without interrupting ingestion. This mode is useful for non-production situations, since it enables Druid to recover from problems automatically, even if they lead to quiet dropping or duplicating of data. This feature behaves similarly to the Kafka `auto.offset.reset` consumer property.|No|`false`|
-|`workerThreads`|Integer|The number of threads that the supervisor uses to handle requests/responses for worker tasks, along with any other internal asynchronous operation.|No|`min(10, taskCount)`|
-|`chatAsync`|Boolean|If `true`, use asynchronous communication with indexing tasks, and ignore the `chatThreads` parameter. If `false`, use synchronous communication in a thread pool of size `chatThreads`.|No|`true`|
-|`chatThreads`|Integer|The number of threads to use for communicating with indexing tasks. Ignored if `chatAsync` is `true`.|No|`min(10, taskCount * replicas)`|
-|`chatRetries`|Integer|The number of times HTTP requests to indexing tasks are retried before considering tasks unresponsive.|No|8|
-|`httpTimeout`|ISO 8601 period|The period of time to wait for an HTTP response from an indexing task.|No|PT10S|
-|`shutdownTimeout`|ISO 8601 period|The period of time to wait for the supervisor to attempt a graceful shutdown of tasks before exiting.|No|PT80S|
-|`offsetFetchPeriod`|ISO 8601 period|Determines how often the supervisor queries Kafka and the indexing tasks to fetch current offsets and calculate lag. If the user-specified value is below the minimum value of `PT5S`, the supervisor ignores the value and uses the minimum value instead.|No|PT30S|
-|`segmentWriteOutMediumFactory`|Object|The segment write-out medium to use when creating segments. See [Additional Peon configuration: SegmentWriteOutMediumFactory](../../configuration/index.md#segmentwriteoutmediumfactory) for explanation and available options.|No|If not specified, Druid uses the value from `druid.peon.defaultSegmentWriteOutMediumFactory.type`.|
-|`logParseExceptions`|Boolean|If `true`, Druid logs an error message when a parsing exception occurs, containing information about the row where the error occurred.|No|`false`|
-|`maxParseExceptions`|Integer|The maximum number of parse exceptions that can occur before the task halts ingestion and fails. Overridden if `reportParseExceptions` is set.|No|unlimited|
-|`maxSavedParseExceptions`|Integer|When a parse exception occurs, Druid keeps track of the most recent parse exceptions. `maxSavedParseExceptions` limits the number of saved exception instances. These saved exceptions are available after the task finishes in the [task completion report](../../ingestion/tasks.md#task-reports). Overridden if `reportParseExceptions` is set.|No|0|
-
-### IndexSpec
-
-The following table outlines the configuration options for `indexSpec`:
-
-|Property|Type|Description|Required|Default|
-|--------|----|-----------|--------|-------|
-|`bitmap`|Object|Compression format for bitmap indexes. Druid supports roaring and concise bitmap types.|No|Roaring|
-|`dimensionCompression`|String|Compression format for dimension columns.
Choose from `LZ4`, `LZF`, `ZSTD` or `uncompressed`.|No|`LZ4`| -|`metricCompression`|String|Compression format for primitive type metric columns. Choose from `LZ4`, `LZF`, `ZSTD`, `uncompressed` or `none`.|No|`LZ4`| -|`longEncoding`|String|Encoding format for metric and dimension columns with type long. Choose from `auto` or `longs`. `auto` encodes the values using offset or lookup table depending on column cardinality, and store them with variable size. `longs` stores the value as is with 8 bytes each.|No|`longs`| diff --git a/docs/development/extensions-core/kinesis-ingestion.md b/docs/development/extensions-core/kinesis-ingestion.md deleted file mode 100644 index ae49d63201d1..000000000000 --- a/docs/development/extensions-core/kinesis-ingestion.md +++ /dev/null @@ -1,721 +0,0 @@ ---- -id: kinesis-ingestion -title: "Amazon Kinesis ingestion" -sidebar_label: "Amazon Kinesis" ---- -import Tabs from '@theme/Tabs'; -import TabItem from '@theme/TabItem'; - - - - -When you enable the Kinesis indexing service, you can configure supervisors on the Overlord to manage the creation and lifetime of Kinesis indexing tasks. These indexing tasks read events using Kinesis' own shard and sequence number mechanism to guarantee exactly-once ingestion. The supervisor oversees the state of the indexing tasks to coordinate handoffs, manage failures, and ensure that scalability and replication requirements are maintained. - -This topic contains configuration reference information for the Kinesis indexing service supervisor for Apache Druid. - -## Setup - -To use the Kinesis indexing service, you must first load the `druid-kinesis-indexing-service` core extension on both the Overlord and the Middle Manager. See [Loading extensions](../../configuration/extensions.md#loading-extensions) for more information. -Review the [Kinesis known issues](#kinesis-known-issues) before deploying the `druid-kinesis-indexing-service` extension to production. - -## Supervisor spec - -The following table outlines the high-level configuration options for the Kinesis supervisor object. -See [Supervisor API](../../api-reference/supervisor-api.md) for more information. - -|Property|Type|Description|Required| -|--------|----|-----------|--------| -|`type`|String|The supervisor type; this should always be `kinesis`.|Yes| -|`spec`|Object|The container object for the supervisor configuration.|Yes| -|`ioConfig`|Object|The [I/O configuration](#supervisor-io-configuration) object for configuring Kinesis connection and I/O-related settings for the supervisor and indexing task.|Yes| -|`dataSchema`|Object|The schema used by the Kinesis indexing task during ingestion. See [`dataSchema`](../../ingestion/ingestion-spec.md#dataschema) for more information.|Yes| -|`tuningConfig`|Object|The [tuning configuration](#supervisor-tuning-configuration) object for configuring performance-related settings for the supervisor and indexing tasks.|No| - -Druid starts a new supervisor when you define a supervisor spec. -To create a supervisor, send a `POST` request to the `/druid/indexer/v1/supervisor` endpoint. -Once created, the supervisor persists in the configured metadata database. There can only be a single supervisor per datasource, and submitting a second spec for the same datasource overwrites the previous one. - -When an Overlord gains leadership, either by being started or as a result of another Overlord failing, it spawns -a supervisor for each supervisor spec in the metadata database. 
The supervisor then discovers running Kinesis indexing -tasks and attempts to adopt them if they are compatible with the supervisor's configuration. If they are not -compatible because they have a different ingestion spec or shard allocation, the tasks are killed and the -supervisor creates a new set of tasks. In this way, the supervisors persist across Overlord restarts and failovers. - -The following example shows how to submit a supervisor spec for a stream with the name `KinesisStream`. -In this example, `http://SERVICE_IP:SERVICE_PORT` is a placeholder for the server address of deployment and the service port. - - - - - -```shell -curl -X POST "http://SERVICE_IP:SERVICE_PORT/druid/indexer/v1/supervisor" \ --H "Content-Type: application/json" \ --d '{ - "type": "kinesis", - "spec": { - "ioConfig": { - "type": "kinesis", - "stream": "KinesisStream", - "inputFormat": { - "type": "json" - }, - "useEarliestSequenceNumber": true - }, - "tuningConfig": { - "type": "kinesis" - }, - "dataSchema": { - "dataSource": "KinesisStream", - "timestampSpec": { - "column": "timestamp", - "format": "iso" - }, - "dimensionsSpec": { - "dimensions": [ - "isRobot", - "channel", - "flags", - "isUnpatrolled", - "page", - "diffUrl", - { - "type": "long", - "name": "added" - }, - "comment", - { - "type": "long", - "name": "commentLength" - }, - "isNew", - "isMinor", - { - "type": "long", - "name": "delta" - }, - "isAnonymous", - "user", - { - "type": "long", - "name": "deltaBucket" - }, - { - "type": "long", - "name": "deleted" - }, - "namespace", - "cityName", - "countryName", - "regionIsoCode", - "metroCode", - "countryIsoCode", - "regionName" - ] - }, - "granularitySpec": { - "queryGranularity": "none", - "rollup": false, - "segmentGranularity": "hour" - } - } - } -}' -``` - - - -```HTTP -POST /druid/indexer/v1/supervisor -HTTP/1.1 -Host: http://SERVICE_IP:SERVICE_PORT -Content-Type: application/json - -{ - "type": "kinesis", - "spec": { - "ioConfig": { - "type": "kinesis", - "stream": "KinesisStream", - "inputFormat": { - "type": "json" - }, - "useEarliestSequenceNumber": true - }, - "tuningConfig": { - "type": "kinesis" - }, - "dataSchema": { - "dataSource": "KinesisStream", - "timestampSpec": { - "column": "timestamp", - "format": "iso" - }, - "dimensionsSpec": { - "dimensions": [ - "isRobot", - "channel", - "flags", - "isUnpatrolled", - "page", - "diffUrl", - { - "type": "long", - "name": "added" - }, - "comment", - { - "type": "long", - "name": "commentLength" - }, - "isNew", - "isMinor", - { - "type": "long", - "name": "delta" - }, - "isAnonymous", - "user", - { - "type": "long", - "name": "deltaBucket" - }, - { - "type": "long", - "name": "deleted" - }, - "namespace", - "cityName", - "countryName", - "regionIsoCode", - "metroCode", - "countryIsoCode", - "regionName" - ] - }, - "granularitySpec": { - "queryGranularity": "none", - "rollup": false, - "segmentGranularity": "hour" - } - } - } -} -``` - - - -## Supervisor I/O configuration - -The following table outlines the configuration options for `ioConfig`: - -|Property|Type|Description|Required|Default| -|--------|----|-----------|--------|-------| -|`stream`|String|The Kinesis stream to read.|Yes|| -|`inputFormat`|Object|The [input format](../../ingestion/data-formats.md#input-format) to specify how to parse input data. See [Specify data format](#specify-data-format) for more information.|Yes|| -|`endpoint`|String|The AWS Kinesis stream endpoint for a region. 
You can find a list of endpoints in the [AWS service endpoints](http://docs.aws.amazon.com/general/latest/gr/rande.html#ak_region) document.|No|`kinesis.us-east-1.amazonaws.com`|
-|`replicas`|Integer|The number of replica sets, where 1 is a single set of tasks (no replication). Druid always assigns replica tasks to different workers to provide resiliency against process failure.|No|1|
-|`taskCount`|Integer|The maximum number of reading tasks in a replica set. The maximum total number of reading tasks is `taskCount * replicas`.
The total number of tasks (reading and publishing) is higher than the maximum number of reading tasks. See [Capacity planning](#capacity-planning) for more details. When `taskCount > {numKinesisShards}`, the actual number of reading tasks is less than the `taskCount` value.|No|1|
-|`taskDuration`|ISO 8601 period|The length of time before tasks stop reading and begin publishing their segments.|No|PT1H|
-|`startDelay`|ISO 8601 period|The period to wait before the supervisor starts managing tasks.|No|PT5S|
-|`period`|ISO 8601 period|Determines how often the supervisor executes its management logic. Note that the supervisor also runs in response to certain events, such as tasks succeeding, failing, and reaching their task duration, so this value specifies the maximum time between iterations.|No|PT30S|
-|`useEarliestSequenceNumber`|Boolean|If a supervisor is managing a datasource for the first time, it obtains a set of starting sequence numbers from Kinesis. This flag determines whether a supervisor retrieves the earliest or latest sequence numbers in Kinesis. Under normal circumstances, subsequent tasks start from where the previous segments ended, so this flag is only used on the first run.|No|`false`|
-|`completionTimeout`|ISO 8601 period|The length of time to wait before Druid declares a publishing task has failed and terminates it. If this is set too low, your tasks may never publish. The publishing clock for a task begins roughly after `taskDuration` elapses.|No|PT6H|
-|`lateMessageRejectionPeriod`|ISO 8601 period|Configure tasks to reject messages with timestamps earlier than this period before the task is created. For example, if `lateMessageRejectionPeriod` is set to `PT1H` and the supervisor creates a task at `2016-01-01T12:00Z`, messages with timestamps earlier than `2016-01-01T11:00Z` are dropped. This may help prevent concurrency issues if your data stream has late messages and you have multiple pipelines that need to operate on the same segments, such as a streaming and a nightly batch ingestion pipeline.|No||
-|`earlyMessageRejectionPeriod`|ISO 8601 period|Configure tasks to reject messages with timestamps later than this period after the task reached its `taskDuration`. For example, if `earlyMessageRejectionPeriod` is set to `PT1H`, `taskDuration` is set to `PT1H`, and the supervisor creates a task at `2016-01-01T12:00Z`, messages with timestamps later than `2016-01-01T14:00Z` are dropped. **Note:** Tasks sometimes run past their task duration, for example, in cases of supervisor failover. Setting `earlyMessageRejectionPeriod` too low may cause messages to be dropped unexpectedly whenever a task runs past its originally configured task duration.|No||
-|`fetchDelayMillis`|Integer|Time in milliseconds to wait between subsequent calls to fetch records from Kinesis. See [Determine fetch settings](#determine-fetch-settings).|No|0|
-|`awsAssumedRoleArn`|String|The AWS assumed role to use for additional permissions.|No||
-|`awsExternalId`|String|The AWS external ID to use for additional permissions.|No||
-|`autoScalerConfig`|Object|Defines autoscaling behavior for Kinesis ingest tasks. See [Task autoscaler properties](#task-autoscaler-properties) for more information.|No|null|
-
-### Task autoscaler properties
-
-The following table outlines the configuration options for `autoScalerConfig`:
-
-|Property|Description|Required|Default|
-|--------|-----------|--------|-------|
-|`enableTaskAutoScaler`|Enables the autoscaler. If not specified, Druid disables the autoscaler even when `autoScalerConfig` is not null.|No|`false`|
-|`taskCountMax`|Maximum number of Kinesis ingestion tasks. Must be greater than or equal to `taskCountMin`. If greater than `{numKinesisShards}`, Druid sets the maximum number of reading tasks to `{numKinesisShards}` and ignores `taskCountMax`.|Yes||
-|`taskCountMin`|Minimum number of Kinesis ingestion tasks. When you enable the autoscaler, Druid ignores the value of `taskCount` in `IOConfig` and uses `taskCountMin` for the initial number of tasks to launch.|Yes||
-|`minTriggerScaleActionFrequencyMillis`|Minimum time interval between two scale actions.|No|600000|
-|`autoScalerStrategy`|The autoscaling algorithm. Druid only supports the `lagBased` strategy. See [Lag based autoscaler strategy related properties](#lag-based-autoscaler-strategy-related-properties) for more information.|No|Defaults to `lagBased`.|
-
-### Lag based autoscaler strategy related properties
-
-Unlike the Kafka indexing service, which reports lag as a message count, the Kinesis indexing service reports lag as the time difference in milliseconds between the current sequence number and the latest sequence number.
-
-The following table outlines the configuration options for `autoScalerStrategy`:
-
-|Property|Description|Required|Default|
-|--------|-----------|--------|-------|
-|`lagCollectionIntervalMillis`|The time period during which Druid collects lag metric points.|No|30000|
-|`lagCollectionRangeMillis`|The total time window of lag collection. Use with `lagCollectionIntervalMillis` to specify the intervals at which to collect lag metric points.|No|600000|
-|`scaleOutThreshold`|The lag threshold that triggers a scale out action.|No|6000000|
-|`triggerScaleOutFractionThreshold`|Triggers a scale out action if `triggerScaleOutFractionThreshold` percent of lag points is higher than `scaleOutThreshold`.|No|0.3|
-|`scaleInThreshold`|The lag threshold that triggers a scale in action.|No|1000000|
-|`triggerScaleInFractionThreshold`|Triggers a scale in action if `triggerScaleInFractionThreshold` percent of lag points is lower than `scaleInThreshold`.|No|0.9|
-|`scaleActionStartDelayMillis`|The number of milliseconds to delay after the supervisor starts before the first scale logic check.|No|300000|
-|`scaleActionPeriodMillis`|The frequency in milliseconds to check if a scale action is triggered.|No|60000|
-|`scaleInStep`|The number of tasks to reduce at once when scaling down.|No|1|
-|`scaleOutStep`|The number of tasks to add at once when scaling out.|No|2|
-
-The following example shows a supervisor spec with the `lagBased` autoscaler enabled.
-
- Click to view the example - -```json -{ - "type": "kinesis", - "dataSchema": { - "dataSource": "metrics-kinesis", - "timestampSpec": { - "column": "timestamp", - "format": "auto" - }, - "dimensionsSpec": { - "dimensions": [], - "dimensionExclusions": [ - "timestamp", - "value" - ] - }, - "metricsSpec": [ - { - "name": "count", - "type": "count" - }, - { - "name": "value_sum", - "fieldName": "value", - "type": "doubleSum" - }, - { - "name": "value_min", - "fieldName": "value", - "type": "doubleMin" - }, - { - "name": "value_max", - "fieldName": "value", - "type": "doubleMax" - } - ], - "granularitySpec": { - "type": "uniform", - "segmentGranularity": "HOUR", - "queryGranularity": "NONE" - } - }, - "ioConfig": { - "stream": "metrics", - "autoScalerConfig": { - "enableTaskAutoScaler": true, - "taskCountMax": 6, - "taskCountMin": 2, - "minTriggerScaleActionFrequencyMillis": 600000, - "autoScalerStrategy": "lagBased", - "lagCollectionIntervalMillis": 30000, - "lagCollectionRangeMillis": 600000, - "scaleOutThreshold": 600000, - "triggerScaleOutFractionThreshold": 0.3, - "scaleInThreshold": 100000, - "triggerScaleInFractionThreshold": 0.9, - "scaleActionStartDelayMillis": 300000, - "scaleActionPeriodMillis": 60000, - "scaleInStep": 1, - "scaleOutStep": 2 - }, - "inputFormat": { - "type": "json" - }, - "endpoint": "kinesis.us-east-1.amazonaws.com", - "taskCount": 1, - "replicas": 1, - "taskDuration": "PT1H" - }, - "tuningConfig": { - "type": "kinesis", - "maxRowsPerSegment": 5000000 - } -} -``` - -
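-
-Reading the example above as a worked case, under the assumption that the thresholds apply as described in the preceding table: with `lagCollectionIntervalMillis` of 30000 and `lagCollectionRangeMillis` of 600000, each evaluation window holds 20 lag points. A scale out action triggers when the fraction of points above `scaleOutThreshold` (600000 ms) exceeds `triggerScaleOutFractionThreshold` (0.3), adding `scaleOutStep` (2) tasks up to `taskCountMax` (6). A scale in action triggers when the fraction of points below `scaleInThreshold` (100000 ms) exceeds `triggerScaleInFractionThreshold` (0.9), removing `scaleInStep` (1) task down to `taskCountMin` (2). `minTriggerScaleActionFrequencyMillis` (600000) spaces scale actions at least 10 minutes apart.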
-
-### Specify data format
-
-The Kinesis indexing service supports both [`inputFormat`](../../ingestion/data-formats.md#input-format) and [`parser`](../../ingestion/data-formats.md#parser) to specify the data format.
-Use the `inputFormat` to specify the data format for the Kinesis indexing service unless you need a format only supported by the legacy `parser`.
-
-Supported values for `inputFormat` include:
-
-- `csv`
-- `delimited`
-- `json`
-- `avro_stream`
-- `avro_ocf`
-- `protobuf`
-
-For more information, see [Data formats](../../ingestion/data-formats.md). You can also read [`thrift`](../extensions-contrib/thrift.md) formats using `parser`.
-
-## Supervisor tuning configuration
-
-The `tuningConfig` object is optional. If you don't specify the `tuningConfig` object, Druid uses the default configuration settings.
-
-The following table outlines the configuration options for `tuningConfig`:
-
-|Property|Type|Description|Required|Default|
-|--------|----|-----------|--------|-------|
-|`type`|String|The indexing task type. This should always be `kinesis`.|Yes||
-|`maxRowsInMemory`|Integer|The number of rows to aggregate before persisting. This number represents the post-aggregation rows. It is not equivalent to the number of input events, but the resulting number of aggregated rows. Druid uses `maxRowsInMemory` to manage the required JVM heap size. The maximum heap memory usage for indexing scales with `maxRowsInMemory * (2 + maxPendingPersists)`.|No|100000|
-|`maxBytesInMemory`|Long|The number of bytes to aggregate in heap memory before persisting. This is based on a rough estimate of memory usage and not actual usage. Normally, this is computed internally. The maximum heap memory usage for indexing is `maxBytesInMemory * (2 + maxPendingPersists)`.|No|One-sixth of max JVM memory|
-|`skipBytesInMemoryOverheadCheck`|Boolean|The calculation of `maxBytesInMemory` takes into account overhead objects created during ingestion and each intermediate persist. To exclude the bytes of these overhead objects from the `maxBytesInMemory` check, set `skipBytesInMemoryOverheadCheck` to `true`.|No|`false`|
-|`maxRowsPerSegment`|Integer|The number of rows to aggregate into a segment; this number represents the post-aggregation rows. Handoff occurs when `maxRowsPerSegment` or `maxTotalRows` is reached or every `intermediateHandoffPeriod`, whichever happens first.|No|5000000|
-|`maxTotalRows`|Long|The number of rows to aggregate across all segments; this number represents the post-aggregation rows. Handoff occurs when `maxRowsPerSegment` or `maxTotalRows` is reached or every `intermediateHandoffPeriod`, whichever happens first.|No|unlimited|
-|`intermediateHandoffPeriod`|ISO 8601 period|The period that determines how often tasks hand off segments. Handoff occurs if `maxRowsPerSegment` or `maxTotalRows` is reached or every `intermediateHandoffPeriod`, whichever happens first.|No|P2147483647D|
-|`intermediatePersistPeriod`|ISO 8601 period|The period that determines the rate at which intermediate persists occur.|No|PT10M|
-|`maxPendingPersists`|Integer|Maximum number of persists that can be pending but not started. If a new intermediate persist exceeds this limit, Druid blocks ingestion until the currently running persist finishes. One persist can be running concurrently with ingestion, and none can be queued up. The maximum heap memory usage for indexing scales with `maxRowsInMemory * (2 + maxPendingPersists)`.|No|0|
-|`indexSpec`|Object|Defines how Druid indexes the data. See [IndexSpec](#indexspec) for more information.|No||
-|`indexSpecForIntermediatePersists`|Object|Defines segment storage format options to use at indexing time for intermediate persisted temporary segments. You can use `indexSpecForIntermediatePersists` to disable dimension/metric compression on intermediate segments to reduce memory required for final merging. However, disabling compression on intermediate segments might increase page cache use while they are used before they are merged into the final published segment. See [IndexSpec](#indexspec) for possible values.|No|Same as `indexSpec`|
-|`reportParseExceptions`|Boolean|If `true`, Druid throws exceptions encountered during parsing, causing ingestion to halt. If `false`, Druid skips unparseable rows and fields.|No|`false`|
-|`handoffConditionTimeout`|Long|Number of milliseconds to wait for segment handoff. Set to a value >= 0, where 0 means to wait indefinitely.|No|0|
-|`resetOffsetAutomatically`|Boolean|Controls behavior when Druid needs to read Kinesis messages that are no longer available.<br/>If `false`, the exception bubbles up, causing tasks to fail and ingestion to halt. If this occurs, manual intervention is required to correct the situation, potentially using the [Reset Supervisor API](../../api-reference/supervisor-api.md). This mode is useful for production, since it highlights issues with ingestion.<br/>If `true`, Druid automatically resets to the earliest or latest sequence number available in Kinesis, based on the value of the `useEarliestSequenceNumber` property (earliest if `true`, latest if `false`). Note that this can lead to dropping data (if `useEarliestSequenceNumber` is `false`) or duplicating data (if `useEarliestSequenceNumber` is `true`) without your knowledge. Druid logs messages indicating that a reset has occurred without interrupting ingestion. This mode is useful for non-production situations, since it enables Druid to recover from problems automatically, even if they lead to quiet dropping or duplicating of data.|No|`false`|
-|`skipSequenceNumberAvailabilityCheck`|Boolean|Whether to enable checking if the current sequence number is still available in a particular Kinesis shard. If `false`, the indexing task attempts to reset the current sequence number, depending on the value of `resetOffsetAutomatically`.|No|`false`|
-|`workerThreads`|Integer|The number of threads that the supervisor uses to handle requests/responses for worker tasks, along with any other internal asynchronous operation.|No|`min(10, taskCount)`|
-|`chatRetries`|Integer|The number of times Druid retries HTTP requests to indexing tasks before considering tasks unresponsive.|No|8|
-|`httpTimeout`|ISO 8601 period|The period of time to wait for an HTTP response from an indexing task.|No|PT10S|
-|`shutdownTimeout`|ISO 8601 period|The period of time to wait for the supervisor to attempt a graceful shutdown of tasks before exiting.|No|PT80S|
-|`recordBufferSizeBytes`|Integer|The size of the buffer (heap memory bytes) Druid uses between the Kinesis fetch threads and the main ingestion thread.|No|See [Determine fetch settings](#determine-fetch-settings) for defaults.|
-|`recordBufferOfferTimeout`|Integer|The number of milliseconds to wait for space to become available in the buffer before timing out.|No|5000|
-|`recordBufferFullWait`|Integer|The number of milliseconds to wait for the buffer to drain before Druid attempts to fetch records from Kinesis again.|No|5000|
-|`fetchThreads`|Integer|The size of the pool of threads fetching data from Kinesis. There is no benefit in having more threads than Kinesis shards.|No|`procs * 2`, where `procs` is the number of processors available to the task.|
-|`segmentWriteOutMediumFactory`|Object|The segment write-out medium to use when creating segments. See [Additional Peon configuration: SegmentWriteOutMediumFactory](../../configuration/index.md#segmentwriteoutmediumfactory) for explanation and available options.|No|If not specified, Druid uses the value from `druid.peon.defaultSegmentWriteOutMediumFactory.type`.|
-|`logParseExceptions`|Boolean|If `true`, Druid logs an error message when a parsing exception occurs, containing information about the row where the error occurred.|No|`false`|
-|`maxParseExceptions`|Integer|The maximum number of parse exceptions that can occur before the task halts ingestion and fails. Overridden if `reportParseExceptions` is set.|No|unlimited|
-|`maxSavedParseExceptions`|Integer|When a parse exception occurs, Druid keeps track of the most recent parse exceptions. `maxSavedParseExceptions` limits the number of saved exception instances. These saved exceptions are available after the task finishes in the [task completion report](../../ingestion/tasks.md#task-reports). Overridden if `reportParseExceptions` is set.|No|0|
-|`maxBytesPerPoll`|Integer|The maximum number of bytes to be fetched from the buffer per poll.
At least one record is polled from the buffer regardless of this config.|No| 1000000 bytes| -|`repartitionTransitionDuration`|ISO 8601 period|When shards are split or merged, the supervisor recomputes shard to task group mappings. The supervisor also signals any running tasks created under the old mappings to stop early at current time + `repartitionTransitionDuration`. Stopping the tasks early allows Druid to begin reading from the new shards more quickly. The repartition transition wait time controlled by this property gives the stream additional time to write records to the new shards after the split or merge, which helps avoid issues with [empty shard handling](https://github.com/apache/druid/issues/7600).|No|PT2M| -|`offsetFetchPeriod`|ISO 8601 period|Determines how often the supervisor queries Kinesis and the indexing tasks to fetch current offsets and calculate lag. If the user-specified value is below the minimum value of PT5S, the supervisor ignores the value and uses the minimum value instead.|No|PT30S| -|`useListShards`|Boolean|Indicates if `listShards` API of AWS Kinesis SDK can be used to prevent `LimitExceededException` during ingestion. You must set the necessary `IAM` permissions.|No|`false`| - -### IndexSpec - -The following table outlines the configuration options for `indexSpec`: - -|Property|Type|Description|Required|Default| -|--------|----|-----------|--------|-------| -|`bitmap`|Object|Compression format for bitmap indexes. Druid supports roaring and concise bitmap types.|No|Roaring| -|`dimensionCompression`|String|Compression format for dimension columns. Choose from `LZ4`, `LZF`, or `uncompressed`.|No|`LZ4`| -|`metricCompression`|String|Compression format for primitive type metric columns. Choose from `LZ4`, `LZF`, `uncompressed`, or `none`.|No|`LZ4`| -|`longEncoding`|String|Encoding format for metric and dimension columns with type long. Choose from `auto` or `longs`. `auto` encodes the values using sequence number or lookup table depending on column cardinality and stores them with variable sizes. `longs` stores the value as is with 8 bytes each.|No|`longs`| - -## Operations - -This section describes how to use the [Supervisor API](../../api-reference/supervisor-api.md) with the Kinesis indexing service. - -### AWS authentication - -Druid uses AWS access and secret keys to authenticate Kinesis API requests. There are a few ways to provide this information to Druid: - -1. Using roles or short-term credentials: - - Druid looks for credentials set in [environment variables](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-envvars.html), -via [Web Identity Token](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_providers_oidc.html), in the -default [profile configuration file](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-files.html), and from the -EC2 instance profile provider (in this order). - -2. Using long-term security credentials: - - You can directly provide your AWS access key and AWS secret key in the `common.runtime.properties` file as shown in the example below: - -```properties -druid.kinesis.accessKey=AKIAWxxxxxxxxxx4NCKS -druid.kinesis.secretKey=Jbytxxxxxxxxxxx2+555 -``` - -> Note: AWS does not recommend providing long-term security credentials in configuration files since it might pose a security risk. -If you use this approach, it takes precedence over all other methods of providing credentials. - -To ingest data from Kinesis, ensure that the policy attached to your IAM role contains the necessary permissions. 
The required permissions depend on the value of `useListShards`.
-
-If the `useListShards` flag is set to `true`, you need the following permissions:
-
-- `ListStreams` to list your data streams.
-- `Get*`, required for `GetShardIterator`.
-- `GetRecords` to get data records from a data stream's shard.
-- `ListShards` to get the shards for a stream of interest.
-
-The following is an example policy:
-
-```json
-[
-  {
-    "Effect": "Allow",
-    "Action": ["kinesis:List*"],
-    "Resource": ["*"]
-  },
-  {
-    "Effect": "Allow",
-    "Action": ["kinesis:Get*"],
-    "Resource": []
-  }
-]
-```
-
-If the `useListShards` flag is set to `false`, you need the following permissions:
-
-- `ListStreams` to list your data streams.
-- `Get*`, required for `GetShardIterator`.
-- `GetRecords` to get data records from a data stream's shard.
-- `DescribeStream` to describe the specified data stream.
-
-The following is an example policy:
-
-```json
-[
-  {
-    "Effect": "Allow",
-    "Action": ["kinesis:ListStreams"],
-    "Resource": ["*"]
-  },
-  {
-    "Effect": "Allow",
-    "Action": ["kinesis:DescribeStream"],
-    "Resource": ["*"]
-  },
-  {
-    "Effect": "Allow",
-    "Action": ["kinesis:Get*"],
-    "Resource": []
-  }
-]
-```
-
-### Get supervisor status report
-
-To retrieve the current status report for a single supervisor, send a `GET` request to the `/druid/indexer/v1/supervisor/:supervisorId/status` endpoint.
-
-The report contains the state of the supervisor tasks, the latest sequence numbers, and an array of recently thrown exceptions reported as `recentErrors`. You can control the maximum number of stored exception events using the `druid.supervisor.maxStoredExceptionEvents` configuration.
-
-The two properties related to the supervisor's state are `state` and `detailedState`. The `state` property contains a small number of generic states that apply to any type of supervisor, while the `detailedState` property contains a more descriptive, implementation-specific state that may provide more insight into the supervisor's activities.
-
-Possible `state` values are `PENDING`, `RUNNING`, `SUSPENDED`, `STOPPING`, `UNHEALTHY_SUPERVISOR`, and `UNHEALTHY_TASKS`.
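-
-For example, assuming a supervisor named `KinesisStream`, matching the datasource in the earlier sample spec, a request like the following returns the status report:
-
-```shell
-# Sketch only: replace KinesisStream with your supervisor ID
-curl "http://SERVICE_IP:SERVICE_PORT/druid/indexer/v1/supervisor/KinesisStream/status"
-```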
The following table lists `detailedState` values and their corresponding `state` mapping:
-
-|Detailed state|Corresponding state|Description|
-|--------------|-------------------|-----------|
-|`UNHEALTHY_SUPERVISOR`|`UNHEALTHY_SUPERVISOR`|The supervisor encountered errors on the previous `druid.supervisor.unhealthinessThreshold` iterations.|
-|`UNHEALTHY_TASKS`|`UNHEALTHY_TASKS`|The last `druid.supervisor.taskUnhealthinessThreshold` tasks all failed.|
-|`UNABLE_TO_CONNECT_TO_STREAM`|`UNHEALTHY_SUPERVISOR`|The supervisor is encountering connectivity issues with Kinesis and has not successfully connected in the past.|
-|`LOST_CONTACT_WITH_STREAM`|`UNHEALTHY_SUPERVISOR`|The supervisor is encountering connectivity issues with Kinesis but has successfully connected in the past.|
-|`PENDING` (first iteration only)|`PENDING`|The supervisor has been initialized but hasn't started connecting to the stream.|
-|`CONNECTING_TO_STREAM` (first iteration only)|`RUNNING`|The supervisor is trying to connect to the stream and update partition data.|
-|`DISCOVERING_INITIAL_TASKS` (first iteration only)|`RUNNING`|The supervisor is discovering already-running tasks.|
-|`CREATING_TASKS` (first iteration only)|`RUNNING`|The supervisor is creating tasks and discovering state.|
-|`RUNNING`|`RUNNING`|The supervisor has started tasks and is waiting for `taskDuration` to elapse.|
-|`SUSPENDED`|`SUSPENDED`|The supervisor is suspended.|
-|`STOPPING`|`STOPPING`|The supervisor is stopping.|
-
-On each iteration of the supervisor's run loop, the supervisor completes the following tasks in sequence:
-
-1. Fetch the list of shards from Kinesis and determine the starting sequence number for each shard (either based on the last processed sequence number if continuing, or starting from the beginning or end of the stream if this is a new stream).
-2. Discover any running indexing tasks that are writing to the supervisor's datasource and adopt them if they match the supervisor's configuration; otherwise, signal them to stop.
-3. Send a status request to each supervised task to update the view of the state of the tasks under supervision.
-4. Handle tasks that have exceeded `taskDuration` and should transition from the reading to the publishing state.
-5. Handle tasks that have finished publishing and signal redundant replica tasks to stop.
-6. Handle tasks that have failed and clean up the supervisor's internal state.
-7. Compare the list of healthy tasks to the requested `taskCount` and `replicas` configurations and create additional tasks if required.
-
-The `detailedState` property shows additional values (marked with "first iteration only" in the preceding table) the first time the
-supervisor executes this run loop after startup or after resuming from a suspension. This is intended to surface
-initialization-type issues, where the supervisor is unable to reach a stable state, for example, if the supervisor cannot connect to
-Kinesis, is unable to read from the stream, or cannot communicate with existing tasks. Once the supervisor is stable,
-that is, once it has completed a full execution without encountering any issues, `detailedState` shows a `RUNNING`
-state until it is stopped, suspended, or hits a failure threshold and transitions to an unhealthy state.
-
-### Update existing supervisors
-
-To update an existing supervisor spec, send a `POST` request to the `/druid/indexer/v1/supervisor` endpoint.
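-
-For example, a sketch of an update request, assuming the revised spec is saved locally as `updated-supervisor-spec.json` (a placeholder file name):
-
-```shell
-# Sketch only: submit the revised spec to the same endpoint
-curl -X POST "http://SERVICE_IP:SERVICE_PORT/druid/indexer/v1/supervisor" \
--H "Content-Type: application/json" \
--d @updated-supervisor-spec.json
-```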
When you call this endpoint on an existing supervisor for the same datasource, the running supervisor signals its tasks to stop reading and begin publishing their segments, and then exits. Druid then uses the provided configuration from the request body to create a new supervisor with a new set of tasks that start reading from the sequence numbers where the previous, now-publishing tasks left off, but with the updated schema.
-In this way, configuration changes can be applied without requiring any pause in ingestion.
-
-You can achieve seamless schema migrations by submitting the new schema using the `/druid/indexer/v1/supervisor` endpoint.
-
-### Suspend and resume a supervisor
-
-To suspend a supervisor, send a `POST` request to the `/druid/indexer/v1/supervisor/:supervisorId/suspend` endpoint.
-Suspending a supervisor does not prevent it from operating and emitting logs and metrics. It ensures that no indexing tasks are running until the supervisor resumes.
-
-To resume a supervisor, send a `POST` request to the `/druid/indexer/v1/supervisor/:supervisorId/resume` endpoint.
-
-### Reset a supervisor
-
-The supervisor must be running for this endpoint to be available.
-
-To reset a supervisor, send a `POST` request to the `/druid/indexer/v1/supervisor/:supervisorId/reset` endpoint. This endpoint clears stored
-sequence numbers, prompting the supervisor to start reading from either the earliest or the
-latest sequence numbers in Kinesis (depending on the value of `useEarliestSequenceNumber`).
-After clearing stored sequence numbers, the supervisor kills and recreates active tasks,
-so that tasks begin reading from valid sequence numbers.
-
-This endpoint is useful when you need to recover from a stopped state due to missing sequence numbers in Kinesis.
-Use this endpoint with caution as it may result in skipped messages, leading to data loss or duplicate data.
-
-The indexing service keeps track of the latest
-persisted sequence number to provide exactly-once ingestion guarantees across
-tasks.
-Subsequent tasks must start reading from where the previous task completed
-for the generated segments to be accepted. If the messages at the expected starting sequence numbers are
-no longer available in Kinesis (typically because the message retention period has elapsed or the stream was
-removed and re-created), the supervisor refuses to start and in-flight tasks fail. This endpoint enables you to recover from this condition.
-
-### Reset offsets for a supervisor
-
-To reset partition offsets for a supervisor, send a `POST` request to the `/druid/indexer/v1/supervisor/:supervisorId/resetOffsets` endpoint. This endpoint clears stored
-sequence numbers, prompting the supervisor to start reading from the specified offsets.
-After resetting stored offsets, the supervisor kills and recreates any active tasks pertaining to the specified partitions,
-so that tasks begin reading from the specified offsets. For partitions that are not specified in this operation, the supervisor resumes from the last
-stored offset.
-
-Use this endpoint with caution as it may result in skipped messages, leading to data loss or duplicate data.
-
-### Terminate a supervisor
-
-To terminate a supervisor and its associated indexing tasks, send a `POST` request to the `/druid/indexer/v1/supervisor/:supervisorId/terminate` endpoint.
-This places a tombstone marker in the database to prevent the supervisor from being reloaded on a restart and then gracefully
-shuts down the currently running supervisor.
-The tasks stop reading and begin publishing their segments immediately.
-The call returns after all tasks have been signaled to stop but before the tasks finish publishing their segments.
-
-The terminated supervisor continues to exist in the metadata store, and its history can be retrieved.
-The only way to restart a terminated supervisor is by submitting a functioning supervisor spec to `/druid/indexer/v1/supervisor`.
-
-## Capacity planning
-
-Kinesis indexing tasks run on Middle Managers and are limited by the resources available in the Middle Manager cluster. In particular, you should make sure that you have sufficient worker capacity, configured using the
-`druid.worker.capacity` property, to handle the configuration in the supervisor spec. Note that worker capacity is
-shared across all types of indexing tasks, so you should plan your worker capacity to handle your total indexing load, such as batch processing, streaming tasks, and merging tasks. If your workers run out of capacity, Kinesis indexing tasks queue and wait for the next available worker. This may cause queries to return partial results but will not result in data loss, assuming the tasks run before Kinesis purges those sequence numbers.
-
-A running task can be in one of two states: reading or publishing. A task remains in the reading state for the period defined in `taskDuration`, at which point it transitions to the publishing state. A task remains in the publishing state for as long as it takes to generate segments, push segments to deep storage, and have them loaded and served by a Historical process, or until `completionTimeout` elapses.
-
-The number of reading tasks is controlled by `replicas` and `taskCount`. In general, there are `replicas * taskCount` reading tasks. An exception occurs if `taskCount > {numKinesisShards}`, in which case Druid uses `{numKinesisShards}` tasks. When `taskDuration` elapses, these tasks transition to the publishing state and `replicas * taskCount` new reading tasks are created. To allow reading tasks and publishing tasks to run concurrently, there should be a minimum capacity of:
-
-```text
-workerCapacity = 2 * replicas * taskCount
-```
-
-This value is for the ideal situation in which there is at most one set of tasks publishing while another set is reading.
-In some circumstances, it is possible to have multiple sets of tasks publishing simultaneously. This would happen if the
-time-to-publish (generate segment, push to deep storage, load on Historical) is greater than `taskDuration`. This is a valid and correct scenario but requires additional worker capacity to support. In general, it is a good idea to have `taskDuration` be large enough that the previous set of tasks finishes publishing before the current set begins.
-
-## Shards and segment handoff
-
-Each Kinesis indexing task writes the events it consumes from Kinesis shards into a single segment for the segment granularity interval until it reaches one of the following limits: `maxRowsPerSegment`, `maxTotalRows`, or `intermediateHandoffPeriod`.
-At this point, the task creates a new segment for this segment granularity to contain subsequent events.
-
-The Kinesis indexing task also performs incremental hand-offs so that the segments created by the task are not held up until the task duration is over.
-When the task reaches one of the `maxRowsPerSegment`, `maxTotalRows`, or `intermediateHandoffPeriod` limits, it hands off all the segments and creates a new set of segments for further events.
This allows the task to run for longer durations
-without accumulating old segments locally on Middle Manager processes.
-
-The Kinesis indexing service may still produce some small segments.
-For example, consider the following scenario:
-
-- Task duration is 4 hours
-- Segment granularity is set to an HOUR
-- The supervisor was started at 9:10
-
-After 4 hours at 13:10, Druid starts a new set of tasks. The events for the interval 13:00 - 14:00 may be split across existing tasks and the new set of tasks, which could result in small segments. To merge them together into new segments of an ideal size (in the range of ~500-700 MB per segment), you can schedule re-indexing tasks, optionally with a different segment granularity.
-
-For more detail, see [Segment size optimization](../../operations/segment-optimization.md).
-
-## Determine fetch settings
-
-Kinesis indexing tasks fetch records using `fetchThreads` threads.
-If `fetchThreads` is higher than the number of Kinesis shards, the excess threads are unused.
-Each fetch thread fetches up to 10 MB of records at once from a Kinesis shard, with a delay between fetches
-of `fetchDelayMillis`.
-The records fetched by each thread are pushed into a shared queue of size `recordBufferSizeBytes`.
-The main runner thread for each task polls up to `maxBytesPerPoll` bytes from the queue at once.
-
-The default values for these parameters are:
-
-- `fetchThreads`: Twice the number of processors available to the task. The number of processors available to the task
-is the total number of processors on the server, divided by `druid.worker.capacity` (the number of task slots on that
-particular server). This value is further limited so that the total amount of record data fetched at a given time does not
-exceed 5% of the max heap configured, assuming that each thread fetches 10 MB of records at once. If the value specified
-for this configuration is higher than this limit, no failure occurs, but a warning is logged, and the value is
-implicitly lowered to the max allowed by this constraint.
-- `fetchDelayMillis`: 0 (no delay between fetches).
-- `recordBufferSizeBytes`: 100 MB or an estimated 10% of available heap, whichever is smaller.
-- `maxBytesPerPoll`: 1000000.
-
-Kinesis places the following restrictions on calls to fetch records:
-
-- Each data record can be up to 1 MB in size.
-- Each shard can support up to 5 transactions per second for reads.
-- Each shard can read up to 2 MB per second.
-- The maximum size of data that GetRecords can return is 10 MB.
-
-If the above limits are exceeded, Kinesis throws `ProvisionedThroughputExceededException` errors. If this happens, Druid
-Kinesis tasks pause for `fetchDelayMillis` or 3 seconds, whichever is larger, and then attempt the call again.
-
-In most cases, the default settings for fetch parameters are sufficient to achieve good performance without excessive
-memory usage. However, in some cases, you may need to adjust these parameters to control fetch rate
-and memory usage more finely. Optimal values depend on the average size of a record and the number of consumers you
-have reading from a given shard, which will be `replicas` unless you have other consumers also reading from this
-Kinesis stream.
-
-## Deaggregation
-
-The Kinesis indexing service supports de-aggregation of multiple rows packed into a single record by the Kinesis
-Producer Library's aggregate method for more efficient data transfer.
-
-## Resharding
-
-[Resharding](https://docs.aws.amazon.com/streams/latest/dev/kinesis-using-sdk-java-resharding.html) is an advanced operation that lets you adjust the number of shards in a stream to adapt to changes in the rate of data flowing through a stream.
-
-When changing the shard count for a Kinesis stream, there is a window of time around the resharding operation with early shutdown of Kinesis ingestion tasks and possible task failures.
-
-The early shutdowns and task failures are expected. They occur because the supervisor updates the shard to task group mappings as shards are closed and fully read. This ensures that tasks are not running
-with an assignment of closed shards that have been fully read and balances the distribution of active shards across tasks.
-
-This window with early task shutdowns and possible task failures concludes when:
-
-- All closed shards have been fully read and the Kinesis ingestion tasks have published the data from those shards, committing the "closed" state to metadata storage.
-- Any remaining tasks that had inactive shards in the assignment have been shut down. These tasks would have been created before the closed shards were completely drained.
-
-Note that when the supervisor is running and detects new partitions, tasks read new partitions from the earliest offsets, irrespective of the `useEarliestSequenceNumber` setting. This is because these new shards were discovered immediately and are therefore unlikely to experience any lag.
-
-If resharding occurs while the supervisor is suspended and `useEarliestSequenceNumber` is set to `false`, resuming the supervisor causes tasks to read the new shards from the latest sequence number. This is by design so that the consumer can catch up quickly with any lag accumulated while the supervisor was suspended.
-
-## Kinesis known issues
-
-Before you deploy the Kinesis extension to production, consider the following known issues:
-
-- Avoid implementing more than one Kinesis supervisor that reads from the same Kinesis stream for ingestion. Kinesis has a per-shard read throughput limit, and having multiple supervisors on the same stream can reduce the available read throughput for an individual supervisor's tasks. Multiple supervisors ingesting to the same Druid datasource can also cause increased contention for locks on the datasource.
-- The only way to change the stream reset policy is to submit a new ingestion spec and set up a new supervisor.
-- If ingestion tasks get stuck, the supervisor does not automatically recover. You should monitor ingestion tasks and investigate if your ingestion falls behind.
-- A Kinesis supervisor can sometimes compare the checkpoint offset to the retention window of the stream to see if it has fallen behind. These checks fetch the earliest sequence number from Kinesis, which can result in `IteratorAgeMilliseconds` becoming very high in AWS CloudWatch.
-
diff --git a/docs/ingestion/kafka-ingestion.md b/docs/ingestion/kafka-ingestion.md
index c55fc5e626c5..f064e7c42a18 100644
--- a/docs/ingestion/kafka-ingestion.md
+++ b/docs/ingestion/kafka-ingestion.md
@@ -245,8 +245,6 @@ The following example shows a supervisor spec with idle configuration enabled:
 
 #### Data format
 
-
-
 The Kafka indexing service supports both [`inputFormat`](data-formats.md#input-format) and [`parser`](data-formats.md#parser) to specify the data format.
 Use the `inputFormat` to specify the data format for the Kafka indexing service unless you need a format only supported by the legacy `parser`.
For more information, see [Source input formats](data-formats.md).

The Kafka indexing service supports the following values for `inputFormat`:
@@ -423,6 +421,7 @@ For configuration properties shared across all streaming ingestion methods, refe

|Property|Type|Description|Required|Default|
|--------|----|-----------|--------|-------|
+|`numPersistThreads`|Integer|The number of threads to use to create and persist incremental segments on disk. Higher ingestion throughput produces more incremental segments, so significant CPU time can be spent creating them on disk. For datasources with hundreds or thousands of columns, creating an incremental segment can take several seconds. In either scenario, ingestion can stall or pause frequently and fall behind. As long as sufficient CPU resources are available, you can use additional threads to parallelize segment creation without blocking ingestion.|No|1|
|`chatAsync`|Boolean|If `true`, use asynchronous communication with indexing tasks, and ignore the `chatThreads` parameter. If `false`, use synchronous communication in a thread pool of size `chatThreads`.|No|`true`|
|`chatThreads`|Integer|The number of threads to use for communicating with indexing tasks. Ignored if `chatAsync` is `true`.|No|`min(10, taskCount * replicas)`|
diff --git a/docs/ingestion/kinesis-ingestion.md b/docs/ingestion/kinesis-ingestion.md
index e0b887b517af..3eaab4112c63 100644
--- a/docs/ingestion/kinesis-ingestion.md
+++ b/docs/ingestion/kinesis-ingestion.md
@@ -129,11 +129,9 @@ For configuration properties shared across all streaming ingestion methods, refe
|`stream`|String|The Kinesis stream to read.|Yes||
|`endpoint`|String|The AWS Kinesis stream endpoint for a region. You can find a list of endpoints in the [AWS service endpoints](http://docs.aws.amazon.com/general/latest/gr/rande.html#ak_region) document.|No|`kinesis.us-east-1.amazonaws.com`|
|`useEarliestSequenceNumber`|Boolean|If a supervisor is managing a datasource for the first time, it obtains a set of starting sequence numbers from Kinesis. This flag determines whether a supervisor retrieves the earliest or latest sequence numbers in Kinesis. Under normal circumstances, subsequent tasks start from where the previous segments ended so this flag is only used on the first run.|No|`false`|
-|`recordsPerFetch`|Integer|The number of records to request per call to fetch records from Kinesis.|No| See [Determine fetch settings](#determine-fetch-settings) for defaults.|
|`fetchDelayMillis`|Integer|Time in milliseconds to wait between subsequent calls to fetch records from Kinesis. See [Determine fetch settings](#determine-fetch-settings).|No|0|
|`awsAssumedRoleArn`|String|The AWS assumed role to use for additional permissions.|No||
|`awsExternalId`|String|The AWS external ID to use for additional permissions.|No||
-|`deaggregate`|Boolean|Whether to use the deaggregate function of the Kinesis Client Library (KCL).|No||

#### Data format

@@ -158,11 +156,11 @@ For configuration properties shared across all streaming ingestion methods, refe
|Property|Type|Description|Required|Default|
|--------|----|-----------|--------|-------|
|`skipSequenceNumberAvailabilityCheck`|Boolean|Whether to enable checking if the current sequence number is still available in a particular Kinesis shard. If `false`, the indexing task attempts to reset the current sequence number, depending on the value of `resetOffsetAutomatically`.|No|`false`|
-|`recordBufferSize`|Integer|The size of the buffer (number of events) Druid uses between the Kinesis fetch threads and the main ingestion thread.|No|See [Determine fetch settings](#determine-fetch-settings) for defaults.|
+|`recordBufferSizeBytes`|Integer|The size of the buffer (heap memory bytes) Druid uses between the Kinesis fetch threads and the main ingestion thread.|No|See [Determine fetch settings](#determine-fetch-settings) for defaults.|
|`recordBufferOfferTimeout`|Integer|The number of milliseconds to wait for space to become available in the buffer before timing out.|No|5000|
|`recordBufferFullWait`|Integer|The number of milliseconds to wait for the buffer to drain before Druid attempts to fetch records from Kinesis again.|No|5000|
|`fetchThreads`|Integer|The size of the pool of threads fetching data from Kinesis. There is no benefit in having more threads than Kinesis shards.|No|`procs * 2`, where `procs` is the number of processors available to the task.|
-|`maxRecordsPerPoll`|Integer|The maximum number of records to be fetched from buffer per poll. The actual maximum will be `Max(maxRecordsPerPoll, Max(bufferSize, 1))`.|No| See [Determine fetch settings](#determine-fetch-settings) for defaults.|
+|`maxBytesPerPoll`|Integer|The maximum number of bytes to fetch from the buffer per poll. At least one record is polled from the buffer regardless of this configuration.|No|1000000 bytes|
|`repartitionTransitionDuration`|ISO 8601 period|When shards are split or merged, the supervisor recomputes shard to task group mappings. The supervisor also signals any running tasks created under the old mappings to stop early at current time + `repartitionTransitionDuration`. Stopping the tasks early allows Druid to begin reading from the new shards more quickly. The repartition transition wait time controlled by this property gives the stream additional time to write records to the new shards after the split or merge, which helps avoid issues with [empty shard handling](https://github.com/apache/druid/issues/7600).|No|`PT2M`|
|`useListShards`|Boolean|Indicates if the `listShards` API of the AWS Kinesis SDK can be used to prevent `LimitExceededException` during ingestion. You must set the necessary `IAM` permissions.|No|`false`|

@@ -268,25 +266,20 @@ For information on how to optimize the segment size, see [Segment size optimizat

Kinesis indexing tasks fetch records using `fetchThreads` threads.
If `fetchThreads` is higher than the number of Kinesis shards, the excess threads are unused.
-Each fetch thread fetches up to `recordsPerFetch` records at once from a Kinesis shard, with a delay between fetches
-of `fetchDelayMillis`.
-The records fetched by each thread are pushed into a shared queue of size `recordBufferSize`.
-The main runner thread for each task polls up to `maxRecordsPerPoll` records from the queue at once.
-
-When using Kinesis Producer Library's aggregation feature, that is when [`deaggregate`](#deaggregation) is set,
-each of these parameters refers to aggregated records rather than individual records.
+Each fetch thread fetches up to 10 MB of records at once from a Kinesis shard, with a delay between fetches of `fetchDelayMillis`.
+The records fetched by each thread are pushed into a shared queue of size `recordBufferSizeBytes`.
+The main runner thread for each task polls up to `maxBytesPerPoll` bytes from the queue at once.

The default values for these parameters are:

- `fetchThreads`: Twice the number of processors available to the task.
The number of processors available to the task
is the total number of processors on the server, divided by `druid.worker.capacity` (the number of task slots on that
-particular server).
+particular server). This value is further limited so that the total amount of record data fetched at a given time does not
+exceed 5% of the maximum configured heap, assuming that each thread fetches 10 MB of records at once. If the value specified
+for this configuration is higher than this limit, no failure occurs, but a warning is logged, and the value is
+implicitly lowered to the maximum allowed by this constraint.
- `fetchDelayMillis`: 0 (no delay between fetches).
-- `recordsPerFetch`: 100 MB or an estimated 5% of available heap, whichever is smaller, divided by `fetchThreads`.
-For estimation purposes, Druid uses a figure of 10 KB for regular records and 1 MB for [aggregated records](#deaggregation).
-- `recordBufferSize`: 100 MB or an estimated 10% of available heap, whichever is smaller.
-For estimation purposes, Druid uses a figure of 10 KB for regular records and 1 MB for [aggregated records](#deaggregation).
-- `maxRecordsPerPoll`: 100 for regular records, 1 for [aggregated records](#deaggregation).
+- `recordBufferSizeBytes`: 100 MB or an estimated 10% of available heap, whichever is smaller.
+- `maxBytesPerPoll`: 1000000 bytes.

@@ -308,8 +301,6 @@ Kinesis stream.

The Kinesis indexing service supports de-aggregation of multiple rows stored within a single [Kinesis Data Streams](https://docs.aws.amazon.com/streams/latest/dev/introduction.html) record for more efficient data transfer.

-To enable this feature, set `deaggregate` to true in your `ioConfig` when submitting a supervisor spec.
-
## Resharding

[Resharding](https://docs.aws.amazon.com/streams/latest/dev/kinesis-using-sdk-java-resharding.html) is an advanced operation that lets you adjust the number of shards in a stream to adapt to changes in the rate of data flowing through a stream.

From c0d898173a3a8e45a37885f2d53be673a716bd7f Mon Sep 17 00:00:00 2001
From: Katya Macedo
Date: Fri, 9 Feb 2024 15:34:33 -0600
Subject: [PATCH 14/15] Fix broken links

---
 docs/api-reference/supervisor-api.md | 2 +-
 docs/ingestion/kinesis-ingestion.md  | 2 +-
 docs/querying/nested-columns.md      | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/docs/api-reference/supervisor-api.md b/docs/api-reference/supervisor-api.md
index a88b1a984a72..2e63b69c07f6 100644
--- a/docs/api-reference/supervisor-api.md
+++ b/docs/api-reference/supervisor-api.md
@@ -2215,7 +2215,7 @@ Host: http://ROUTER_IP:ROUTER_PORT

Creates a new supervisor spec or updates an existing one with new configuration and schema information. When updating a supervisor spec, the datasource must remain the same as the previous supervisor.

-You can define a supervisor spec for [Apache Kafka](../ingestion/kafka-ingestion.md#supervisor-spec) or [Amazon Kinesis](../ingestion/kinesis-ingestion.md#supervisor-spec) streaming ingestion methods.
+You can define a supervisor spec for [Apache Kafka](../ingestion/kafka-ingestion.md) or [Amazon Kinesis](../ingestion/kinesis-ingestion.md) streaming ingestion methods.
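For example, assuming the spec is saved locally in a file named `kafka-supervisor.json` (a hypothetical name), you can submit it to this endpoint as follows; submitting an updated spec for the same datasource replaces the running supervisor's configuration:

```shell
# Create or update a supervisor; kafka-supervisor.json is a placeholder file name
curl -X POST -H "Content-Type: application/json" \
  -d @kafka-supervisor.json \
  "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/supervisor"
```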
The following table lists the properties of a supervisor spec:

diff --git a/docs/ingestion/kinesis-ingestion.md b/docs/ingestion/kinesis-ingestion.md
index 3eaab4112c63..fb4bfde235a0 100644
--- a/docs/ingestion/kinesis-ingestion.md
+++ b/docs/ingestion/kinesis-ingestion.md
@@ -35,7 +35,7 @@ This topic contains configuration information for the Kinesis indexing service s

To use the Kinesis indexing service, you must first load the `druid-kinesis-indexing-service` core extension on both the Overlord and the MiddleManager. See [Loading extensions](../configuration/extensions.md#loading-extensions) for more information.

-Review [Kinesis known issues](#kinesis-known-issues) before deploying the `druid-kinesis-indexing-service` extension to production.
+Review [Known issues](#known-issues) before deploying the `druid-kinesis-indexing-service` extension to production.

## Supervisor spec configuration
diff --git a/docs/querying/nested-columns.md b/docs/querying/nested-columns.md
index d50f907c7714..3641d4d46aa9 100644
--- a/docs/querying/nested-columns.md
+++ b/docs/querying/nested-columns.md
@@ -227,7 +227,7 @@ PARTITIONED BY ALL

You can ingest nested data into Druid using the [streaming method](../ingestion/index.md#streaming)—for example, from a Kafka topic.

-When you [define your supervisor spec](../ingestion/kafka-ingestion.md#define-a-supervisor-spec), include a dimension with type `json` for each nested column. For example, the following supervisor spec from the [Kafka ingestion tutorial](../tutorials/tutorial-kafka.md) contains dimensions for the nested columns `event`, `agent`, and `geo_ip` in datasource `kttm-kafka`.
+When you [define your supervisor spec](../ingestion/supervisor.md#start-a-supervisor), include a dimension with type `json` for each nested column. For example, the following supervisor spec from the [Kafka ingestion tutorial](../tutorials/tutorial-kafka.md) contains dimensions for the nested columns `event`, `agent`, and `geo_ip` in datasource `kttm-kafka`.

 ```json
 {
From 45acec71640930589895a6fb97d5ec1d55a2ab7b Mon Sep 17 00:00:00 2001
From: Katya Macedo
Date: Fri, 9 Feb 2024 15:55:01 -0600
Subject: [PATCH 15/15] Fix spelling error

---
 docs/ingestion/kafka-ingestion.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/ingestion/kafka-ingestion.md b/docs/ingestion/kafka-ingestion.md
index f064e7c42a18..5d14b1ad6baf 100644
--- a/docs/ingestion/kafka-ingestion.md
+++ b/docs/ingestion/kafka-ingestion.md
@@ -122,7 +122,7 @@ For configuration properties shared across all streaming ingestion methods, refe
|--------|----|-----------|--------|-------|
|`topic`|String|The Kafka topic to read from. To ingest data from multiple topics, use `topicPattern`.|Yes if `topicPattern` isn't set.||
|`topicPattern`|String|Multiple Kafka topics to read from, passed as a regex pattern. See [Ingest from multiple topics](#ingest-from-multiple-topics) for more information.|Yes if `topic` isn't set.||
-|`consumerProperties`|String, Object|A map of properties to pass to the Kafka consumer. See [Consumer properties](#consumer-properties) for details.|Yes. At the minumum, you must set the `bootstrap.servers` property to establish the initial connection to the Kafka cluster.||
+|`consumerProperties`|String, Object|A map of properties to pass to the Kafka consumer. See [Consumer properties](#consumer-properties) for details.|Yes. At a minimum, you must set the `bootstrap.servers` property to establish the initial connection to the Kafka cluster.||
|`pollTimeout`|Long|The length of time to wait for the Kafka consumer to poll records, in milliseconds.|No|100|
|`useEarliestOffset`|Boolean|If a supervisor manages a datasource for the first time, it obtains a set of starting offsets from Kafka. This flag determines whether it retrieves the earliest or latest offsets in Kafka. Under normal circumstances, subsequent tasks start from where the previous segments ended. Druid only uses `useEarliestOffset` on the first run.|No|`false`|
|`idleConfig`|Object|Defines how and when the Kafka supervisor can become idle. See [Idle configuration](#idle-configuration) for more details.|No|null|
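Putting these properties together, a minimal Kafka `ioConfig` might look like the following sketch; the topic name and broker addresses are placeholders:

```json
{
  "type": "kafka",
  "topic": "example-topic",
  "consumerProperties": {
    "bootstrap.servers": "kafka-broker-1:9092,kafka-broker-2:9092"
  },
  "useEarliestOffset": true
}
```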