Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DOCS] Update and Edit Manage Credentials #9642

Merged
merged 11 commits into from
Mar 22, 2024
251 changes: 160 additions & 91 deletions docs/docusaurus/docs/core/installation_and_setup/manage_credentials.md
Original file line number Diff line number Diff line change
@@ -1,145 +1,214 @@
---
title: Manage credentials
toc_min_heading_level: 2
toc_max_heading_level: 2
---

import TabItem from '@theme/TabItem';
import Tabs from '@theme/Tabs';

import InProgress from '../_core_components/_in_progress.md'

Many environments and data storage systems require the provision of credentials to access. It is important for these credentials to be stored in a secure way, outside of version control. Great Expectations (GX) offers two ways to store and access secrets out of the box: in a YAML file that exists outside of version control, or as environment variables. GX also supports third party secrets managers for Amazon Web Services, Google Cloud Platform, and Microsoft Azure.
To access a deployment environment or a data storage system you must provide your access credentials. These access credentials must be stored securely outside of version control. With GX 1.0 you can store you access credentials as environment variables, in a YAML file that exists outside of version control, or in a third-party secrets manager. GX 1.0 supports the AWS Secrets Manager, Google Cloud Secret Manager, and Azure Key Vault secrets managers.

<Tabs
queryString="credential-style"
groupId="credentials"
defaultValue='yaml'
values={
[
{label: 'Environment variables', value:'environment_variables'},
{label: 'YAML file', value:'yaml'},
{label: 'Secrets manager', value:'secrets_manager'},
]
}>
## Environment variables

<TabItem value="yaml" label="YAML file">
</TabItem>
1. To store your database password and connection string as environment variables, run ``export ENV_VAR_NAME=env_var_value`` in a terminal or add the commands to your ``~/.bashrc`` file. For example:

<TabItem value="environment_variables" label="Environment variables">
</TabItem>
```bash title="Terminal" name="docs/docusaurus/docs/oss/guides/setup/configuring_data_contexts/how_to_configure_credentials.py export_env_vars"
```

<TabItem value="secrets_manager" label="Secrets manager">
2. Run the following code to use the `connection_string` parameter values when you add a `datasource` to a Data Context:

Select one of the following secret manager applications:
```python title="Python" name="docs/docusaurus/docs/oss/guides/setup/configuring_data_contexts/how_to_configure_credentials.py add_credentials_as_connection_string"
kwcanuck marked this conversation as resolved.
Show resolved Hide resolved
```

<Tabs queryString="manager" groupId="manager" defaultValue='aws' values={[{label: 'AWS Secrets Manager', value:'aws'}, {label: 'GCP Secret Manager', value:'gcp'}, {label: 'Azure Key Vault', value:'azure'}]}>
## YAML file

<TabItem value="aws">
</TabItem>
YAML files make variables more visible, are easier to edit, and allow for modularization. For example, you can create a YAML file for development and testing and another for production.

<TabItem value="gcp">
</TabItem>
The default ``config_variables.yml`` file located at ``great_expectations/uncommitted/config_variables.yml`` applies to deployments using ``FileSystemDataContexts``.

<TabItem value="azure">
</TabItem>
To view all the code used in the examples, see [how_to_configure_credentials.py](https://github.com/great-expectations/great_expectations/tree/develop/docs/docusaurus/docs/oss/guides/setup/configuring_data_contexts/how_to_configure_credentials.py).
kwcanuck marked this conversation as resolved.
Show resolved Hide resolved

</Tabs>
1. Save your access credentials or the database connection string to ``great_expectations/uncommitted/config_variables.yml``. For example:

</TabItem>
```yaml title="YAML" name="docs/docusaurus/docs/oss/guides/setup/configuring_data_contexts/how_to_configure_credentials.py config_variables_yaml"
```
To store values that include the dollar sign character ``$``, escape them using a backslash ``\`` to avoid substitution issues. For example, in the previous example for Postgres credentials you'd set ``password: pa\$sword`` if your password is ``pa$sword``. You can also have multiple substitutions for the same item. For example ``database_string: ${USER}:${PASSWORD}@${HOST}:${PORT}/${DATABASE}``.

</Tabs>

## Prerequisites
2. Run the following code to use the `connection_string` parameter values when you add a `datasource` to a Data Context:

<Tabs
queryString="credential-style"
groupId="credentials"
defaultValue='yaml'
values={
[
{label: 'Environment variables', value:'environment_variables'},
{label: 'YAML file', value:'yaml'},
{label: 'Secrets manager', value:'secrets_manager'},
]
}>
```python title="Python" name="docs/docusaurus/docs/oss/guides/setup/configuring_data_contexts/how_to_configure_credentials.py add_credential_from_yml"
```

<TabItem value="yaml" label="YAML file">
## Secrets manager

- An existing File Data Context. To create a new File Data Context, see [Initialize a new Data Context](/core/installation_and_setup/manage_data_contexts.md?context-type=file#initialize-a-new-data-context).
GX 1.0 supports the AWS Secrets Manager, Google Cloud Secret Manager, and Azure Key Vault secrets managers.

<Tabs
queryString="secrets_manager"
groupId="manage-credentials"
defaultValue='aws'
values={[
{label: 'AWS Secrets Manager', value:'aws'},
{label: 'Google Cloud Secret Manager', value:'gcp'},
{label: 'Azure Key Vault', value:'azure'},
]}>
<TabItem value="aws">

</TabItem>
Configure your Great Expectations project to substitute variables from the AWS Secrets Manager. Secrets store substitution uses the configurations from your ``config_variables.yml`` file after all other types of substitution are applied with environment variables.

<TabItem value="environment_variables" label="Environment variables">
Secrets store substitution uses keywords and retrieves secrets from the secrets store for values starting with ``secret|arn:aws:secretsmanager``. If the values you provide don't match the keywords, the values aren't substituted.

- The ability to set environment variables.
1. Optional. Run the following code to install the ``great_expectations`` package with the ``aws_secrets`` requirement:

```bash
pip install 'great_expectations[aws_secrets]'
```

</TabItem>
2. Provide an arn of the secret to substitute your value by a secret in AWS Secrets Manager. For example:

<TabItem value="secrets_manager" label="Secrets manager">
``secret|arn:aws:secretsmanager:123456789012:secret:my_secret-1zAyu6``

<Tabs queryString="manager" groupId="manager" defaultValue='aws' values={[{label: 'AWS Secrets Manager', value:'aws'}, {label: 'GCP Secret Manager', value:'gcp'}, {label: 'Azure Key Vault', value:'azure'}]}>
The last seven characters of the arn are automatically generated by AWS and are not mandatory to retrieve the secret. For example, ``secret|arn:aws:secretsmanager:region-name-1:123456789012:secret:my_secret`` retrieves the same secret. The latest version of the secret is returned by default.

<TabItem value="aws">
<InProgress/>
</TabItem>
3. Optional. To get a specific version of the secret you want to retrieve, specify its version UUID. For example,``secret|arn:aws:secretsmanager:region-name-1:123456789012:secret:my_secret:00000000-0000-0000-0000-000000000000``.

<TabItem value="gcp">
<InProgress/>
</TabItem>
4. Optional. To retrieve a specific secret value for a JSON string, use ``secret|arn:aws:secretsmanager:region-name-1:123456789012:secret:my_secret|key`` or
``secret|arn:aws:secretsmanager:region-name-1:123456789012:secret:my_secret:00000000-0000-0000-0000-000000000000|key``.

<TabItem value="azure">
<InProgress/>
</TabItem>
5. Save your access credentials or the database connection string to ``great_expectations/uncommitted/config_variables.yml``. For example:

</Tabs>
```yaml
# We can configure a single connection string
my_aws_creds: secret|arn:aws:secretsmanager:${AWS_REGION}:${ACCOUNT_ID}:secret:dev_db_credentials|connection_string

</TabItem>
# Or each component of the connection string separately
drivername: secret|arn:aws:secretsmanager:${AWS_REGION}:${ACCOUNT_ID}:secret:dev_db_credentials|drivername
host: secret|arn:aws:secretsmanager:${AWS_REGION}:${ACCOUNT_ID}:secret:dev_db_credentials|host
port: secret|arn:aws:secretsmanager:${AWS_REGION}:${ACCOUNT_ID}:secret:dev_db_credentials|port
username: secret|arn:aws:secretsmanager:${AWS_REGION}:${ACCOUNT_ID}:secret:dev_db_credentials|username
password: secret|arn:aws:secretsmanager:${AWS_REGION}:${ACCOUNT_ID}:secret:dev_db_credentials|password
database: secret|arn:aws:secretsmanager:${AWS_REGION}:${ACCOUNT_ID}:secret:dev_db_credentials|database
```

</Tabs>
6. Run the following code to use the `connection_string` parameter values when you add a `datasource` to a Data Context:

```python
# We can use a single connection string
pg_datasource = context.sources.add_or_update_sql(
name="my_postgres_db", connection_string="${my_aws_creds}"
)

# Or each component of the connection string separately
pg_datasource = context.sources.add_or_update_sql(
name="my_postgres_db", connection_string="${drivername}://${username}:${password}@${host}:${port}/${database}"
)
```

</TabItem>
<TabItem value="gcp">

Configure your Great Expectations project to substitute variables from the Google Cloud Secret Manager. Secrets store substitution uses the configurations from your ``config_variables.yml`` file after all other types of substitution are applied with environment variables.

Secrets store substitution uses keywords and retrieves secrets from the secrets store for values matching the following regex ``^secret\|projects\/[a-z0-9\_\-]{6,30}\/secrets``. If the values you provide don't match the keywords, the values aren't substituted.

1. Optional. Run the following code to install the ``great_expectations`` package with the ``gcp`` requirement:

```bash
pip install 'great_expectations[gcp]'
```

2. Provide the name of the secret you want to substitute in GCP Secret Manager. For example, ``secret|projects/project_id/secrets/my_secret``.

The latest version of the secret is returned by default.

3. Optional. To get a specific version of the secret, specify its version id. For example, ``secret|projects/project_id/secrets/my_secret/versions/1``.

4. Optional. To retrieve a specific secret value for a JSON string, use ``secret|projects/project_id/secrets/my_secret|key`` or ``secret|projects/project_id/secrets/my_secret/versions/1|key``.

5. Save your access credentials or the database connection string to ``great_expectations/uncommitted/config_variables.yml``. For example:

```yaml
# We can configure a single connection string
my_gcp_creds: secret|projects/${PROJECT_ID}/secrets/dev_db_credentials|connection_string

# Or each component of the connection string separately
drivername: secret|projects/${PROJECT_ID}/secrets/PROD_DB_CREDENTIALS_DRIVERNAME
host: secret|projects/${PROJECT_ID}/secrets/PROD_DB_CREDENTIALS_HOST
port: secret|projects/${PROJECT_ID}/secrets/PROD_DB_CREDENTIALS_PORT
username: secret|projects/${PROJECT_ID}/secrets/PROD_DB_CREDENTIALS_USERNAME
password: secret|projects/${PROJECT_ID}/secrets/PROD_DB_CREDENTIALS_PASSWORD
database: secret|projects/${PROJECT_ID}/secrets/PROD_DB_CREDENTIALS_DATABASE
```

6. Run the following code to use the `connection_string` parameter values when you add a `datasource` to a Data Context:

```python
# We can use a single connection string
pg_datasource = context.sources.add_or_update_sql(
name="my_postgres_db", connection_string="${my_gcp_creds}"
)

# Or each component of the connection string separately
pg_datasource = context.sources.add_or_update_sql(
name="my_postgres_db", connection_string="${drivername}://${username}:${password}@${host}:${port}/${database}"
)
```

</TabItem>
<TabItem value="azure">

Configure your Great Expectations project to substitute variables from the Azure Key Vault. Secrets store substitution uses the configurations from your ``config_variables.yml`` file after all other types of substitution are applied with environment variables.

Secrets store substitution uses keywords and retrieves secrets from the secrets store for values matching the following regex ``^secret\|https:\/\/[a-zA-Z0-9\-]{3,24}\.vault\.azure\.net``. If the values you provide don't match the keywords, the values aren't substituted.

1. Optional. Run the following code to install the ``great_expectations`` package with the ``azure_secrets`` requirement:

## Define credentials
```bash
pip install 'great_expectations[azure_secrets]'
```

<Tabs
queryString="credential-style"
groupId="credentials"
defaultValue='yaml'
values={
[
{label: 'Environment variables', value:'environment_variables'},
{label: 'YAML file', value:'yaml'},
{label: 'Secrets manager', value:'secrets_manager'},
]
}>
2. Provide the name of the secret you want to substitute in Azure Key Vault. For example, ``secret|https://my-vault-name.vault.azure.net/secrets/my-secret``.

<TabItem value="yaml" label="YAML file">
<InProgress/>
</TabItem>
The latest version of the secret is returned by default.

<TabItem value="environment_variables" label="Environment variables">
<InProgress/>
</TabItem>
3. Optional. To get a specific version of the secret, specify its version id (32 lowercase alphanumeric characters). For example, ``secret|https://my-vault-name.vault.azure.net/secrets/my-secret/a0b00aba001aaab10b111001100a11ab``.

<TabItem value="secrets_manager" label="Secrets manager">
4. Optional. To retrieve a specific secret value for a JSON string, use ``secret|https://my-vault-name.vault.azure.net/secrets/my-secret|key`` or ``secret|https://my-vault-name.vault.azure.net/secrets/my-secret/a0b00aba001aaab10b111001100a11ab|key``.

<Tabs queryString="manager" groupId="manager" defaultValue='aws' values={[{label: 'AWS Secrets Manager', value:'aws'}, {label: 'GCP Secret Manager', value:'gcp'}, {label: 'Azure Key Vault', value:'azure'}]}>
5. Save your access credentials or the database connection string to ``great_expectations/uncommitted/config_variables.yml``. For example:

<TabItem value="aws">
<InProgress/>
</TabItem>
```yaml
# We can configure a single connection string
my_abs_creds: secret|https://${VAULT_NAME}.vault.azure.net/secrets/dev_db_credentials|connection_string

<TabItem value="gcp">
<InProgress/>
</TabItem>
# Or each component of the connection string separately
drivername: secret|https://${VAULT_NAME}.vault.azure.net/secrets/dev_db_credentials|host
host: secret|https://${VAULT_NAME}.vault.azure.net/secrets/dev_db_credentials|host
port: secret|https://${VAULT_NAME}.vault.azure.net/secrets/dev_db_credentials|port
username: secret|https://${VAULT_NAME}.vault.azure.net/secrets/dev_db_credentials|username
password: secret|https://${VAULT_NAME}.vault.azure.net/secrets/dev_db_credentials|password
database: secret|https://${VAULT_NAME}.vault.azure.net/secrets/dev_db_credentials|database
```

<TabItem value="azure">
<InProgress/>
</TabItem>
6. Run the following code to use the `connection_string` parameter values when you add a `datasource` to a Data Context:

</Tabs>
```python
# We can use a single connection string
pg_datasource = context.sources.add_or_update_sql(
name="my_postgres_db", connection_string="${my_azure_creds}"
)

</TabItem>
# Or each component of the connection string separately
pg_datasource = context.sources.add_or_update_sql(
name="my_postgres_db", connection_string="${drivername}://${username}:${password}@${host}:${port}/${database}"
)
```

</TabItem>
</Tabs>

## Next steps
Expand Down
23 changes: 20 additions & 3 deletions docs/docusaurus/sidebars.js
Original file line number Diff line number Diff line change
Expand Up @@ -80,9 +80,26 @@ module.exports = {
]
},
{
type: 'doc',
id: 'core/installation_and_setup/manage_credentials',
label: '🚧 Manage credentials'
type: 'category',
label: 'Manage credentials',
link: {type: 'doc', id: 'core/installation_and_setup/manage_credentials'},
items: [
{
type: 'link',
label: 'Environment variables',
href: '/docs/1.0-prerelease/core/installation_and_setup/manage_credentials#environment-variables',
},
{
type: 'link',
label: 'YAML file',
href: '/docs/1.0-prerelease/core/installation_and_setup/manage_credentials#yaml-file',
},
{
type: 'link',
label: 'Secrets manager',
href: '/docs/1.0-prerelease/core/installation_and_setup/manage_credentials#secrets-manager',
},
]
},
{
type: 'doc',
Expand Down