Merged
@@ -14,8 +14,13 @@ curl --request 'POST' --location \
"schema": "<schema>",
"volume": "<volume>",
"volume_path": "<volume-path>",

# For Databricks OAuth machine-to-machine (M2M) authentication:
"client_secret": "<client-secret>",
"client_id": "<client-id>"

# For Databricks personal access token authentication:
"token": "<token>"
}
}'
```
5 changes: 5 additions & 0 deletions snippets/destination_connectors/databricks_volumes_sdk.mdx
@@ -21,8 +21,13 @@ with UnstructuredClient(api_key_auth=os.getenv("UNSTRUCTURED_API_KEY")) as clien
schema="<schema>",
volume="<volume>",
volume_path="<volume-path>",

# For Databricks OAuth machine-to-machine (M2M) authentication:
client_secret="<client-secret>",
client_id="<client-id>"

# For Databricks personal access token authentication:
token="<token>"
)
)
)
@@ -1,7 +1,10 @@
- `<name>` (_required_) - A unique name for this connector.
- `<host>` (_required_) - The Databricks workspace host URL.
- `<client-id>` (_required_) - The **Client ID** (or **UUID** or **Application ID**) value for the Databricks managed service principal that has the appropriate privileges to the volume.
- `<client-secret>` (_required_) - The associated OAuth **Secret** value for the Databricks managed service principal that has the appropriate privileges to the volume.
- `<client-id>` (_required_) - For Databricks OAuth machine-to-machine (M2M) authentication,
the **Client ID** (or **UUID** or **Application ID**) value for the Databricks managed service principal that has the appropriate privileges to the volume.
- `<client-secret>` (_required_) - For Databricks OAuth M2M authentication,
the associated OAuth **Secret** value for the Databricks managed service principal that has the appropriate privileges to the volume.
- `<token>` (_required_) - For Databricks personal access token authentication, the personal access token's value.
- `<catalog>` (_required_) - The name of the catalog to use.
- `<schema>` - The name of the associated schema. If not specified, `default` is used.
- `<volume>` (_required_) - The name of the associated volume.
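
The credential placeholders above fall into two groups: `<client-id>` plus `<client-secret>` for OAuth M2M, or `<token>` for a personal access token. The following is a minimal sketch of selecting exactly one group before building a connector config. It assumes the two methods are alternatives, as the snippet comments suggest, and `databricks_auth_fields` is a hypothetical helper for illustration, not part of any Unstructured SDK.

```python
from typing import Optional


def databricks_auth_fields(
    client_id: Optional[str] = None,
    client_secret: Optional[str] = None,
    token: Optional[str] = None,
) -> dict:
    """Return the config fields for exactly one Databricks auth method.

    Hypothetical helper: the key names mirror the documented placeholders,
    and treating the two methods as mutually exclusive is an assumption.
    """
    has_m2m = bool(client_id or client_secret)
    has_pat = bool(token)
    if has_m2m and has_pat:
        raise ValueError("Specify OAuth M2M credentials or a token, not both.")
    if has_pat:
        return {"token": token}
    if client_id and client_secret:
        return {"client_id": client_id, "client_secret": client_secret}
    raise ValueError("Specify client_id and client_secret, or a token.")
```

The returned dictionary can then be merged into the rest of the connector's `config` fields.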
8 changes: 6 additions & 2 deletions snippets/general-shared-text/databricks-volumes-platform.mdx
@@ -6,6 +6,10 @@ Fill in the following fields:
- **Schema** : The name of the associated schema. If not specified, **default** is used.
- **Volume** (_required_): The name of the associated volume.
- **Volume Path** : Any optional path to access within the volume.
- **Client Secret** (_required_): The associated OAuth **Secret** value for the Databricks managed service principal that has the appropriate privileges to the volume.
- **Client ID** (_required_): The **Client ID** (or **UUID** or **Application ID**) value for the Databricks managed service principal that has appropriate privileges to the volume.

- For **Authentication Method**, if you select **Service Principal**, you must also specify the following:

- **Client Secret** (_required_): The associated OAuth **Secret** value for the Databricks managed service principal that has the appropriate privileges to the volume.
- **Client ID** (_required_): The **Client ID** (or **UUID** or **Application ID**) value for the Databricks managed service principal that has appropriate privileges to the volume.

- For **Authentication Method**, if you select **Token**, you must also specify the Databricks personal access token's value in the **Token** field.
54 changes: 38 additions & 16 deletions snippets/general-shared-text/databricks-volumes.mdx
@@ -20,25 +20,47 @@
[Azure](https://learn.microsoft.com/azure/databricks/dev-tools/auth/),
or [GCP](https://docs.gcp.databricks.com/dev-tools/auth/index.html).

For the [Unstructured UI](/ui/overview) or the [Unstructured API](/api-reference/overview), only Databricks OAuth machine-to-machine (M2M) authentication is supported for
[AWS](https://docs.databricks.com/dev-tools/auth/oauth-m2m.html),
[Azure](https://learn.microsoft.com/azure/databricks/dev-tools/auth/oauth-m2m), and
[GCP](https://docs.gcp.databricks.com/dev-tools/auth/oauth-m2m.html).
You will need the **Client ID** (or **UUID** or **Application ID**) and OAuth **Secret** (client secret) values for the corresponding service principal.
Note that for Azure, only Databricks managed service principals are supported. Microsoft Entra ID managed service principals are not supported.
For the [Unstructured UI](/ui/overview) or the [Unstructured API](/api-reference/overview), the following Databricks authentication types are supported:

- Databricks OAuth machine-to-machine (M2M) authentication for
[AWS](https://docs.databricks.com/dev-tools/auth/oauth-m2m.html),
[Azure](https://learn.microsoft.com/azure/databricks/dev-tools/auth/oauth-m2m), or
[GCP](https://docs.gcp.databricks.com/dev-tools/auth/oauth-m2m.html).

You will need the **Client ID** (or **UUID** or **Application ID**) and OAuth **Secret** (client secret) values for the corresponding service principal.
Note that for Azure, only Databricks managed service principals are supported. Microsoft Entra ID managed service principals are not supported.

The following video shows how to create a Databricks managed service principal:
The following video shows how to create a Databricks managed service principal:

<iframe
width="560"
height="315"
src="https://www.youtube.com/embed/wBmqv5DaA1E"
title="YouTube video player"
frameborder="0"
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture"
allowfullscreen
></iframe>
<iframe
width="560"
height="315"
src="https://www.youtube.com/embed/wBmqv5DaA1E"
title="YouTube video player"
frameborder="0"
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture"
allowfullscreen
></iframe>

- Databricks personal access token authentication for
[AWS](https://docs.databricks.com/dev-tools/auth/pat.html),
[Azure](https://learn.microsoft.com/azure/databricks/dev-tools/auth/pat), or
[GCP](https://docs.gcp.databricks.com/dev-tools/auth/pat.html).

You will need the personal access token's value.

The following video shows how to create a Databricks personal access token:

<iframe
width="560"
height="315"
src="https://www.youtube.com/embed/OzEU2miAS6I"
title="YouTube video player"
frameborder="0"
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture"
allowfullscreen
></iframe>
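
To make the difference between the two authentication types concrete, here is a rough sketch of how each credential is typically presented to a Databricks workspace over HTTP. It builds the requests without sending them. The `/oidc/v1/token` path and the `all-apis` scope follow the workspace-level flow in the Databricks OAuth M2M documentation linked above, but treat them as assumptions and verify them for your cloud. The helper names and the example host are hypothetical.

```python
import base64
import urllib.parse
import urllib.request


def m2m_token_request(
    host: str, client_id: str, client_secret: str
) -> urllib.request.Request:
    """Build (but do not send) an OAuth client-credentials token request.

    Assumes the workspace-level token endpoint at <host>/oidc/v1/token.
    """
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    body = urllib.parse.urlencode(
        {"grant_type": "client_credentials", "scope": "all-apis"}
    ).encode()
    return urllib.request.Request(
        f"{host.rstrip('/')}/oidc/v1/token",
        data=body,
        headers={"Authorization": f"Basic {basic}"},
        method="POST",
    )


def pat_headers(token: str) -> dict:
    """Headers for a call authenticated with a personal access token."""
    return {"Authorization": f"Bearer {token}"}


# Hypothetical workspace host, for illustration only.
request = m2m_token_request(
    "https://example.cloud.databricks.com", "<client-id>", "<client-secret>"
)
print(request.get_method(), request.full_url)
```

With OAuth M2M, the service principal's client ID and secret are exchanged for a short-lived access token. With personal access token authentication, the token's value is sent directly as a bearer token.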

For [Unstructured Ingest](/ingestion/overview), the following Databricks authentication types are supported:

- For Databricks personal access token authentication for
11 changes: 8 additions & 3 deletions snippets/source_connectors/databricks_volumes_rest_create.mdx
@@ -10,12 +10,17 @@ curl --request 'POST' --location \
"type": "databricks_volumes",
"config": {
"host": "<host>",
"client_id": "<client-id>"
"client_secret": "<client-secret>",
"catalog": "<catalog>",
"schema": "<schema>",
"volume": "<volume>",
"volume_path": "<volume-path>"
"volume_path": "<volume-path>",

# For Databricks OAuth machine-to-machine (M2M) authentication:
"client_id": "<client-id>",
"client_secret": "<client-secret>"

# For Databricks personal access token authentication:
"token": "<token>"
}
}'
```
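
The `#` comments in the request body above mark alternatives and are not valid JSON, so the body that is actually sent should contain only one set of authentication fields. The following sketch shows one way to assemble such a body in Python. `source_connector_payload` is a hypothetical helper, and the top-level `name` key is an assumption based on the `<name>` placeholder, since that part of the snippet is outside the diff shown here.

```python
import json


def source_connector_payload(name: str, config: dict, auth: dict) -> str:
    """Merge the common config fields with one set of auth fields.

    Hypothetical helper: "name" as a top-level key is an assumption;
    "type" and "config" mirror the snippet above.
    """
    body = {
        "name": name,
        "type": "databricks_volumes",
        "config": {**config, **auth},
    }
    return json.dumps(body, indent=2)


common = {
    "host": "<host>",
    "catalog": "<catalog>",
    "schema": "<schema>",
    "volume": "<volume>",
    "volume_path": "<volume-path>",
}

# Personal access token variant. For OAuth M2M, pass
# {"client_id": "<client-id>", "client_secret": "<client-secret>"} instead.
print(source_connector_payload("<name>", common, {"token": "<token>"}))
```

The resulting string could be passed to `curl` through `--data`, or sent with any HTTP client.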
11 changes: 8 additions & 3 deletions snippets/source_connectors/databricks_volumes_sdk.mdx
@@ -17,12 +17,17 @@ with UnstructuredClient(api_key_auth=os.getenv("UNSTRUCTURED_API_KEY")) as clien
type=SourceConnectorType.DATABRICKS_VOLUMES,
config=DatabricksVolumesConnectorConfigInput(
catalog="<catalog>",
client_id="<client_id>",
client_secret="<client_secret>",
host="<host>",
schema_="<schema>",
volume="<volume>",
volume_path="<volume_path>"
volume_path="<volume_path>",

# For Databricks OAuth machine-to-machine (M2M) authentication:
client_id="<client_id>",
client_secret="<client_secret>"

# For Databricks personal access token authentication:
token="<token>"
)
)
)