Skip to content
Permalink
Browse files
Docs - S3 masking and nav update to S3 page (#11490)
* Docs: Masking S3 creds and some rewording

Knowledge transfer from https://groups.google.com/g/druid-user/c/FydcpFrA688

* Removed bold in one of the quote sections

* Update s3.md

* Update s3.md

Quick grammar change

* Update docs/development/extensions-core/s3.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/development/extensions-core/s3.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/development/extensions-core/s3.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/development/extensions-core/s3.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/development/extensions-core/s3.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update s3.md

Typo

* Update docs/development/extensions-core/s3.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/ingestion/native-batch.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/ingestion/native-batch.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/ingestion/native-batch.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/ingestion/native-batch.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/ingestion/native-batch.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/ingestion/native-batch.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/ingestion/native-batch.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/ingestion/native-batch.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/ingestion/native-batch.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/ingestion/native-batch.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/ingestion/native-batch.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/ingestion/native-batch.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/ingestion/native-batch.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/ingestion/native-batch.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/ingestion/native-batch.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/ingestion/native-batch.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/ingestion/native-batch.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/ingestion/native-batch.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/ingestion/native-batch.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/ingestion/native-batch.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/ingestion/native-batch.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/ingestion/native-batch.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update s3.md

Active lang

* Update s3.md

LAng nit

* Update native-batch.md

LAng nit

* Update docs/ingestion/native-batch.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Grammar tidy-up and link fix

Corrected 2 x links to old page H2s, resolved the question around precedence, and some other grammatical changes.

* Update docs/development/extensions-core/s3.md

* Update s3.md

Removed an Erroneous E

Co-authored-by: Charles Smith <techdocsmith@gmail.com>
  • Loading branch information
petermarshallio and techdocsmith committed Mar 29, 2022
1 parent b9a968e commit f1841c644408fe5d3bf801e2ab695a75dc100da9
Showing 1 changed file with 18 additions and 15 deletions.
@@ -32,20 +32,20 @@ To use this Apache Druid extension, [include](../../development/extensions.md#lo

### Reading data from S3

The [S3 input source](../../ingestion/native-batch-input-source.md#s3-input-source) is supported by the [Parallel task](../../ingestion/native-batch.md)
to read objects directly from S3. If you use the [Hadoop task](../../ingestion/hadoop.md),
you can read data from S3 by specifying the S3 paths in your [`inputSpec`](../../ingestion/hadoop.md#inputspec).
Use a native batch [Parallel task](../../ingestion/native-batch.md) with an [S3 input source](../../ingestion/native-batch-input-sources.html#s3-input-source) to read objects directly from S3.

To configure the extension to read objects from S3 you need to configure how to [connect to S3](#configuration).
Alternatively, use a [Hadoop task](../../ingestion/hadoop.md),
and specify S3 paths in your [`inputSpec`](../../ingestion/hadoop.md#inputspec).

To read objects from S3, you must supply [connection information](#configuration) in configuration.

### Deep Storage

S3-compatible deep storage means either AWS S3 or a compatible service like Google Storage which exposes the same API as S3.

S3 deep storage needs to be explicitly enabled by setting `druid.storage.type=s3`. **Only after setting the storage type to S3 will any of the settings below take effect.**

To correctly configure this extension for deep storage in S3, first configure how to [connect to S3](#configuration).
In addition to this you need to set additional configuration, specific for [deep storage](#deep-storage-specific-configuration)
To use S3 for Deep Storage, you must supply [connection information](#configuration) in configuration *and* set additional configuration, specific for [Deep Storage](#deep-storage-specific-configuration).

#### Deep storage specific configuration

@@ -63,8 +63,9 @@ In addition to this you need to set additional configuration, specific for [deep

### S3 authentication methods

Druid uses the following credentials provider chain to connect to your S3 bucket (whether a deep storage bucket or source bucket).
**Note :** *You can override the default credentials provider chain for connecting to source bucket by specifying an access key and secret key using [Properties Object](../../ingestion/native-batch-input-source.md#s3-input-source) parameters in the ingestionSpec.*
You can provide credentials to connect to S3 in a number of ways, whether for [deep storage](#deep-storage) or as an [ingestion source](#reading-data-from-s3).

The configuration options are listed in order of precedence. For example, if you would like to use profile information given in `~/.aws.credentials`, do not set `druid.s3.accessKey` and `druid.s3.secretKey` in your Druid config file because they would take precedence.

|order|type|details|
|--------|-----------|-------|
@@ -76,23 +77,25 @@ Druid uses the following credentials provider chain to connect to your S3 bucket
|6|ECS container credentials|Based on environment variables available on AWS ECS (AWS_CONTAINER_CREDENTIALS_RELATIVE_URI or AWS_CONTAINER_CREDENTIALS_FULL_URI) as described in the [EC2ContainerCredentialsProviderWrapper documentation](https://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/auth/EC2ContainerCredentialsProviderWrapper.html)|
|7|Instance profile information|Based on the instance profile you may have attached to your druid instance|

You can find more information about authentication method [here](https://docs.aws.amazon.com/fr_fr/sdk-for-java/v1/developer-guide/credentials)<br/>
**Note :** *Order is important here as it indicates the precedence of authentication methods.<br/>
So if you are trying to use Instance profile information, you **must not** set `druid.s3.accessKey` and `druid.s3.secretKey` in your Druid runtime.properties*
For more information, refer to the [Amazon Developer Guide](https://docs.aws.amazon.com/fr_fr/sdk-for-java/v1/developer-guide/credentials).

Alternatively, you can bypass this chain by specifying an access key and secret key using a [Properties Object](../../ingestion/native-batch-input-sources.html#s3-input-source) inside your ingestion specification.

Use the property [`druid.startup.logging.maskProperties`](../../configuration/index.html#startup-logging) to mask credentials information in Druid logs. For example, `["password", "secretKey", "awsSecretAccessKey"]`.

### S3 permissions settings

`s3:GetObject` and `s3:PutObject` are basically required for pushing/loading segments to/from S3.
`s3:GetObject` and `s3:PutObject` are required for pushing or pulling segments to or from S3.

If `druid.storage.disableAcl` is set to `false`, then `s3:GetBucketAcl` and `s3:PutObjectAcl` are additionally required to set ACL for objects.

### AWS region

The AWS SDK requires that the target region be specified. Two ways of doing this are by using the JVM system property `aws.region` or the environment variable `AWS_REGION`.
The AWS SDK requires that a target region be specified. You can set these by using the JVM system property `aws.region` or by setting an environment variable `AWS_REGION`.

As an example, to set the region to 'us-east-1' through system properties:
For example, to set the region to 'us-east-1' through system properties:

- Add `-Daws.region=us-east-1` to the jvm.config file for all Druid services.
- Add `-Daws.region=us-east-1` to the `jvm.config` file for all Druid services.
- Add `-Daws.region=us-east-1` to `druid.indexer.runner.javaOpts` in [Middle Manager configuration](../../configuration/index.md#middlemanager-configuration) so that the property will be passed to Peon (worker) processes.

### Connecting to S3 configuration

0 comments on commit f1841c6

Please sign in to comment.