From f08783a1e75701e59607dc96d4225a7adce96677 Mon Sep 17 00:00:00 2001 From: genwhittTTD Date: Tue, 18 Nov 2025 17:53:54 -0500 Subject: [PATCH 01/16] databricks guide --- docs/guides/integration-databricks.md | 385 ++++++++++++++++++++++++++ sidebars.js | 3 +- 2 files changed, 387 insertions(+), 1 deletion(-) create mode 100644 docs/guides/integration-databricks.md diff --git a/docs/guides/integration-databricks.md b/docs/guides/integration-databricks.md new file mode 100644 index 000000000..7c02b1a48 --- /dev/null +++ b/docs/guides/integration-databricks.md @@ -0,0 +1,385 @@ +--- +title: Databricks Integration +sidebar_label: Databricks +pagination_label: Databricks Integration +description: Information about integrating with UID2 through Databricks. +hide_table_of_contents: false +sidebar_position: 04 +displayed_sidebar: docs +--- + +import Link from '@docusaurus/Link'; + +# UID2 Databricks Clean Room Integration Guide + +Overview general info plus define audience. + +## Databricks listing? + +xxx + +## Functionality + +xxx + +### Key Benefits + +xxx + +## Summary of Integration Steps + +------------------- MATT GUIDE, BEGIN ------------------------------ + + +## Summary of Integration Steps + +At a high level, the following are the steps to set up your Databricks integration and process your data: + +1. Create a clean room and invite UID2 as a collaborator. +1. Send your sharing identifier to your UID2 contact. +1. Add data to the clean room. +1. Run the clean room notebook to map directly identifying information (DII). + +## Step 1: Create a clean room and invite UID2 as a collaborator + +Follow the steps in Create clean rooms in the Databricks documentation. Use the correct sharing identifier from the table below, based on the UID2 Environment you wish to connect to. +UID2 sharing identifiers can change. Be sure to check this page for the latest sharing identifiers. + +| Environment | UID2 Sharing Identifier | +| :--- | :--- | +| Production | aws:us-east-2:21149de7-a9e9-4463-b4e0-066f4b033e5d:673872910525611:010d98a6-8cf2-4011-8bf7-ca45940bc329 | +Integration | aws:us-east-2:4651b4ea-b29c-42ec-aecb-2377de70bbd4:2366823546528067:c15e03bf-a348-4189-92e5-68b9a7fb4018 | + +:::note +Once you've created a clean room, you cannot change its collaborators. + +If you have the option to set clean room collaborator aliases—for example, if you’re using the Databricks Python SDK [**GWH__MC is this the UID2 Python SDK? Or a Databrics SDK?**]to create the clean room—your collaborator alias must be `creator` and the UID2 collaborator alias must be `collaborator`. If you’re creating the clean room using the Databricks web UI, the correct collaborator aliases are set for you. +::: + +## Step 2: Send your sharing identifier to your UID2 contact + +Find the sharing identifier for the Unity Catalog metastore that is attached to the Databricks workspace where you’ll work with the clean room. Send the sharing identifier to your UID2 contact. +The sharing identifier is a string in this format: `::`. + +For information on how to find the sharing identifier, see Get access in the Databricks-to-Databricks model in the Databricks documentation. + +## Step 3: Add data to the clean room + +Add one or more tables or views to the clean room. You can use any names for the schema, tables, and views. Tables and views must follow the schema detailed in [Input Table](#uptohere)Schema. + +## Step 4: Run the clean room notebook to map DII + +Run the `identity_map_v3` clean room notebook to map DII to UID2s. 
Details about this notebook are given in the next section.

## Map DII

The `identity_map_v3` clean room notebook maps DII to UID2s.

### Notebook Parameters

The `identity_map_v3` notebook can be used to map DII in any table or view that has been added to the `creator` catalog of the clean room.

The notebook has two parameters, `input_schema` and `input_table`. Together they identify the table or view in the clean room that contains the DII to be mapped.

For example, to map DII in the clean room table named `creator.default.emails`, set `input_schema` to `default` and `input_table` to `emails`.

| Parameter Name | Description |
| :--- | :--- |
| `input_schema` | The schema containing the table or view. |
| `input_table` | The name of the table or view containing the DII to be mapped. |

### Input Table

The input table or view must have two columns: `INPUT` and `INPUT_TYPE`. The table or view can have additional columns, but they won't be used by the notebook. (A minimal example input table is sketched below, just before the Examples section.)

| Column Name | Data Type | Description |
| :--- | :--- | :--- |
| `INPUT` | string | The DII to map. |
| `INPUT_TYPE` | string | The type of DII to map. Allowed values: `email`, `email_hash`, `phone`, and `phone_hash`. |

### DII Format

If the DII is an email address, the notebook normalizes the data using the UID2 Email Address Normalization rules.

If the DII is a phone number, you must normalize it before mapping it with the notebook, using the UID2 Phone Number Normalization rules.

### Output Table

If the clean room has an output catalog, the mapped DII will be written to a table in the output catalog. Output tables are stored for 30 days. For more information, see Overview of output tables in the Databricks documentation.

### Output Table Schema

| Column Name | Data Type | Description |
| :--- | :--- | :--- |
| `UID` | string | The value is one of the following:<br/>• **DII was successfully mapped**: The UID2 associated with the DII.<br/>• **Otherwise**: `NULL`. |
| `PREV_UID` | string | The value is one of the following:<br/>• **DII was successfully mapped and the current raw UID2 was rotated in the last 90 days**: the previous raw UID2.<br/>• **Otherwise**: `NULL`. |
| `REFRESH_FROM` | timestamp | The value is one of the following:<br/>• **DII was successfully mapped**: The timestamp (in epoch seconds) indicating when this UID2 should be refreshed.<br/>• **Otherwise**: `NULL`. |
| `UNMAPPED` | string | The value is one of the following:<br/>• **DII was successfully mapped**: `NULL`.<br/>• **Otherwise**: The reason why the identifier was not mapped: `OPTOUT`, `INVALID IDENTIFIER`, or `INVALID INPUT TYPE`. For details, see Values for the UNMAPPED Column. |

#### Values for the UNMAPPED Column

The following table shows possible values for the `UNMAPPED` column.

| Value | Meaning |
| :--- | :--- |
| `NULL` | The DII was successfully mapped. |
| `OPTOUT` | The user has opted out. |
| `INVALID IDENTIFIER` | The email address or phone number is invalid. |
| `INVALID INPUT TYPE` | The value of `INPUT_TYPE` is invalid. Valid values for `INPUT_TYPE` are: `email`, `email_hash`, `phone`, `phone_hash`. |

------------------- MATT GUIDE, END ------------------------------


------------------- BELOW IS A COPY OF SNOWFLAKE DOC HEADINGS ------------------------------

xxx

## Testing in the Integ Environment

xxx

## Shared Objects/Functions?

xxx

### Database and Schema Names

Query examples?

xxx

### Map DII

Define types of DII?

Does the service normalize? (for email / for phone)

A successful query returns ...?
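To make the input table schema above concrete, here is a minimal sketch of a table you might create in your own catalog and then add to the clean room. This is an illustrative draft, not part of the notebook itself: the catalog, schema, and table names are hypothetical (any names are allowed), the rows are fictitious, and the SQL assumes a Databricks SQL environment.

```sql
-- Illustrative only: an input table created in your own catalog, which you
-- would then add to the clean room. Any schema and table names are allowed.
CREATE TABLE IF NOT EXISTS main.default.uid2_mapping_input (
  INPUT      STRING,   -- the DII to map
  INPUT_TYPE STRING    -- one of: email, email_hash, phone, phone_hash
);

INSERT INTO main.default.uid2_mapping_input (INPUT, INPUT_TYPE) VALUES
  ('validate@example.com', 'email'),   -- unhashed email; the notebook normalizes it
  ('+12345678901',         'phone');   -- phone numbers must already be normalized

-- If this table is added to the clean room as, for example,
-- creator.default.uid2_mapping_input, you would run the identity_map_v3
-- notebook with input_schema = 'default' and input_table = 'uid2_mapping_input'.
```

If you pre-hash DII instead, supply the Base64-encoded SHA-256 hash and use the corresponding `email_hash` or `phone_hash` input type.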
+ + + +#### Examples + +Mapping request examples in this section: + +- [Single Unhashed Email](#mapping-request-example---single-unhashed-email) +- [Multiple Unhashed Emails](#mapping-request-example---multiple-unhashed-emails) +- [Single Unhashed Phone Number](#mapping-request-example---single-unhashed-phone-number) +- [Multiple Unhashed Phone Numbers](#mapping-request-example---multiple-unhashed-phone-numbers) +- [Single Hashed Email](#mapping-request-example---single-hashed-email) +- [Multiple Hashed Emails](#mapping-request-example---multiple-hashed-emails) +- [Single Hashed Phone Number](#mapping-request-example---single-hashed-phone-number) +- [Multiple Hashed Phone Numbers](#mapping-request-example---multiple-hashed-phone-numbers) + +:::note +The input and output data in these examples is fictitious, for illustrative purposes only. The values provided are not real values. +::: + +#### Mapping Request Example - Single Unhashed Email + +The following query illustrates how to map a single email address, using the [default database and schema names](#database-and-schema-names). + +```sql +select UID, PREV_UID, REFRESH_FROM, UNMAPPED from table(UID2_PROD_UID_SH.UID.FN_T_IDENTITY_MAP_V3('validate@example.com', 'email')); +``` + +Query results for a single email: + +``` ++----------------------------------------------+--------------------------------------------------+--------------+----------+ +| UID | PREV_UID | REFRESH_FROM | UNMAPPED | ++----------------------------------------------+--------------------------------------------------+--------------+----------+ +| 2ODl112/VS3x2vL+kG1439nPb7XNngLvOWiZGaMhdcU= | vP9zK2mL7fR4tY8qN3wE6xB0dH5jA1sC+nI/oGuMeVa= | 1735689600 | NULL | ++----------------------------------------------+--------------------------------------------------+--------------+----------+ +``` + +#### Mapping Request Example - Multiple Unhashed Emails + +The following query illustrates how to map multiple email addresses, using the [default database and schema names](#database-and-schema-names). + +```sql +select a.ID, a.EMAIL, m.UID, m.PREV_UID, m.REFRESH_FROM, m.UNMAPPED from AUDIENCE a LEFT JOIN( + select ID, t.* from AUDIENCE, lateral UID2_PROD_UID_SH.UID.FN_T_IDENTITY_MAP_V3(EMAIL, 'email') t) m + on a.ID=m.ID; +``` + +Query results for multiple emails: + +The following table identifies each item in the response, including `NULL` values for `NULL` or improperly formatted emails. 
+ +``` ++----+----------------------+----------------------------------------------+----------------------------------------------+--------------+--------------------+ +| ID | EMAIL | UID | PREV_UID | REFRESH_FROM | UNMAPPED | ++----+----------------------+----------------------------------------------+----------------------------------------------+--------------+--------------------+ +| 1 | validate@example.com | 2ODl112/VS3x2vL+kG1439nPb7XNngLvOWiZGaMhdcU= | vP9zK2mL7fR4tY8qN3wE6xB0dH5jA1sC+nI/oGuMeVa= | 1735689600 | NULL | +| 2 | test@uidapi.com | IbW4n6LIvtDj/8fCESlU0QG9K/fH63UdcTkJpAG8fIQ= | NULL | 1735689600 | NULL | +| 3 | optout@example.com | NULL | NULL | NULL | OPTOUT | +| 4 | invalid-email | NULL | NULL | NULL | INVALID IDENTIFIER | +| 5 | NULL | NULL | NULL | NULL | INVALID IDENTIFIER | ++----+----------------------+----------------------------------------------+----------------------------------------------+--------------+--------------------+ +``` + +#### Mapping Request Example - Single Unhashed Phone Number + +The following query illustrates how to map a phone number, using the [default database and schema names](#database-and-schema-names). + +You must normalize phone numbers using the UID2 [Phone Number Normalization](../getting-started/gs-normalization-encoding.md#phone-number-normalization) rules. + +```sql +select UID, PREV_UID, REFRESH_FROM, UNMAPPED from table(UID2_PROD_UID_SH.UID.FN_T_IDENTITY_MAP_V3('+12345678901', 'phone')); +``` + +Query results for a single phone number: + +``` ++----------------------------------------------+----------+--------------+----------+ +| UID | PREV_UID | REFRESH_FROM | UNMAPPED | ++----------------------------------------------+----------+--------------+----------+ +| 2ODl112/VS3x2vL+kG1439nPb7XNngLvOWiZGaMhdcU= | NULL | 1735689600 | NULL | ++----------------------------------------------+----------+--------------+----------+ +``` + +#### Mapping Request Example - Multiple Unhashed Phone Numbers + +The following query illustrates how to map multiple phone numbers, using the [default database and schema names](#database-and-schema-names). + +You must normalize phone numbers using the UID2 [Phone Number Normalization](../getting-started/gs-normalization-encoding.md#phone-number-normalization) rules. + +```sql +select a.ID, a.PHONE, m.UID, m.PREV_UID, m.REFRESH_FROM, m.UNMAPPED from AUDIENCE a LEFT JOIN( + select ID, t.* from AUDIENCE, lateral UID2_PROD_UID_SH.UID.FN_T_IDENTITY_MAP_V3(PHONE, 'phone') t) m + on a.ID=m.ID; +``` + +Query results for multiple phone numbers: + +The following table identifies each item in the response, including `NULL` values for `NULL` or invalid phone numbers. 
+ +``` ++----+--------------+----------------------------------------------+----------------------------------------------+--------------+--------------------+ +| ID | PHONE | UID | PREV_UID | REFRESH_FROM | UNMAPPED | ++----+--------------+----------------------------------------------+----------------------------------------------+--------------+--------------------+ +| 1 | +12345678901 | 2ODl112/VS3x2vL+kG1439nPb7XNngLvOWiZGaMhdcU= | vP9zK2mL7fR4tY8qN3wE6xB0dH5jA1sC+nI/oGuMeVa= | 1735689600 | NULL | +| 2 | +61491570006 | IbW4n6LIvtDj/8fCESlU0QG9K/fH63UdcTkJpAG8fIQ= | NULL | 1735689600 | NULL | +| 3 | +56789123001 | NULL | NULL | NULL | OPTOUT | +| 4 | 1234 | NULL | NULL | NULL | INVALID IDENTIFIER | +| 5 | NULL | NULL | NULL | NULL | INVALID IDENTIFIER | ++----+--------------+----------------------------------------------+----------------------------------------------+--------------+--------------------+ +``` + +#### Mapping Request Example - Single Hashed Email + +The following query illustrates how to map a single email address hash, using the [default database and schema names](#database-and-schema-names). + +```sql +select UID, PREV_UID, REFRESH_FROM, UNMAPPED from table(UID2_PROD_UID_SH.UID.FN_T_IDENTITY_MAP_V3(BASE64_ENCODE(SHA2_BINARY('validate@example.com', 256)), 'email_hash')); +``` + +Query results for a single hashed email: + +``` ++----------------------------------------------+----------------------------------------------+--------------+----------+ +| UID | PREV_UID | REFRESH_FROM | UNMAPPED | ++----------------------------------------------+----------------------------------------------+--------------+----------+ +| 2ODl112/VS3x2vL+kG1439nPb7XNngLvOWiZGaMhdcU= | vP9zK2mL7fR4tY8qN3wE6xB0dH5jA1sC+nI/oGuMeVa= | 1735689600 | NULL | ++----------------------------------------------+----------------------------------------------+--------------+----------+ +``` + +#### Mapping Request Example - Multiple Hashed Emails + +The following query illustrates how to map multiple email address hashes, using the [default database and schema names](#database-and-schema-names). + +```sql +select a.ID, a.EMAIL_HASH, m.UID, m.PREV_UID, m.REFRESH_FROM, m.UNMAPPED from AUDIENCE a LEFT JOIN( + select ID, t.* from AUDIENCE, lateral UID2_PROD_UID_SH.UID.FN_T_IDENTITY_MAP_V3(EMAIL_HASH, 'email_hash') t) m + on a.ID=m.ID; +``` + +Query results for multiple hashed emails: + +The following table identifies each item in the response, including `NULL` values for `NULL` hashes. 
+ +``` ++----+----------------------------------------------+----------------------------------------------+----------------------------------------------+--------------+--------------------+ +| ID | EMAIL_HASH | UID | PREV_UID | REFRESH_FROM | UNMAPPED | ++----+----------------------------------------------+----------------------------------------------+----------------------------------------------+--------------+--------------------+ +| 1 | LdhtUlMQ58ZZy5YUqGPRQw5xUMS5dXG5ocJHYJHbAKI= | 2ODl112/VS3x2vL+kG1439nPb7XNngLvOWiZGaMhdcU= | vP9zK2mL7fR4tY8qN3wE6xB0dH5jA1sC+nI/oGuMeVa= | 1735689600 | NULL | +| 2 | /XJSTajB68SCUyuc3ePyxSLNhxrMKvJcjndq8TuwW5g= | IbW4n6LIvtDj/8fCESlU0QG9K/fH63UdcTkJpAG8fIQ= | NULL | 1735689600 | NULL | +| 2 | UebesrNN0bQkm/QR7Jx7eav+UDXN5Gbq3zs1fLBMRy0= | NULL | NULL | 1735689600 | OPTOUT | +| 4 | NULL | NULL | NULL | NULL | INVALID IDENTIFIER | ++----+----------------------------------------------+----------------------------------------------+----------------------------------------------+--------------+--------------------+ +``` + +#### Mapping Request Example - Single Hashed Phone Number + +The following query illustrates how to map a single phone number hash, using the [default database and schema names](#database-and-schema-names). + +```sql +select UID, PREV_UID, REFRESH_FROM, UNMAPPED from table(UID2_PROD_UID_SH.UID.FN_T_IDENTITY_MAP_V3(BASE64_ENCODE(SHA2_BINARY('+12345678901', 256)), 'phone_hash')); +``` + +Query results for a single hashed phone number: + +``` ++----------------------------------------------+----------------------------------------------+--------------+----------+ +| UID | PREV_UID | REFRESH_FROM | UNMAPPED | ++----------------------------------------------+----------------------------------------------+--------------+----------+ +| 2ODl112/VS3x2vL+kG1439nPb7XNngLvOWiZGaMhdcU= | vP9zK2mL7fR4tY8qN3wE6xB0dH5jA1sC+nI/oGuMeVa= | 1735689600 | NULL | ++----------------------------------------------+----------------------------------------------+--------------+----------+ +``` + +#### Mapping Request Example - Multiple Hashed Phone Numbers + +The following query illustrates how to map multiple phone number hashes, using the [default database and schema names](#database-and-schema-names). + +```sql +select a.ID, a.PHONE_HASH, m.UID, m.PREV_UID, m.REFRESH_FROM, m.UNMAPPED from AUDIENCE a LEFT JOIN( + select ID, t.* from AUDIENCE, lateral UID2_PROD_UID_SH.UID.FN_T_IDENTITY_MAP_V3(PHONE_HASH, 'phone_hash') t) m + on a.ID=m.ID; +``` + +Query results for multiple hashed phone numbers: + +The following table identifies each item in the response, including `NULL` values for `NULL` hashes. 
+ +``` ++----+----------------------------------------------+----------------------------------------------+----------------------------------------------+--------------+--------------------+ +| ID | PHONE_HASH | UID | PREV_UID | REFRESH_FROM | UNMAPPED | ++----+----------------------------------------------+----------------------------------------------+----------------------------------------------+--------------+--------------------+ +| 1 | LdhtUlMQ58ZZy5YUqGPRQw5xUMS5dXG5ocJHYJHbAKI= | 2ODl112/VS3x2vL+kG1439nPb7XNngLvOWiZGaMhdcU= | vP9zK2mL7fR4tY8qN3wE6xB0dH5jA1sC+nI/oGuMeVa= | 1735689600 | NULL | +| 2 | /XJSTajB68SCUyuc3ePyxSLNhxrMKvJcjndq8TuwW5g= | IbW4n6LIvtDj/8fCESlU0QG9K/fH63UdcTkJpAG8fIQ= | NULL | 1735689600 | NULL | +| 2 | UebesrNN0bQkm/QR7Jx7eav+UDXN5Gbq3zs1fLBMRy0= | NULL | NULL | 1735689600 | OPTOUT | +| 4 | NULL | NULL | NULL | NULL | INVALID IDENTIFIER | ++----+----------------------------------------------+----------------------------------------------+----------------------------------------------+--------------+--------------------+ +``` + +### Monitor Raw UID2 Refresh and Regenerate Raw UID2s + +xxx + +#### Targeted Input Table + +xxx + + +Query results: + +xxx diff --git a/sidebars.js b/sidebars.js index 59535b29c..fba95d65f 100644 --- a/sidebars.js +++ b/sidebars.js @@ -234,7 +234,8 @@ const fullSidebar = [ ], }, - 'guides/integration-aws-entity-resolution', + 'guides/integration-databricks', + 'guides/integration-aws-entity-resolution', 'guides/integration-advertiser-dataprovider-endpoints', ], }, From 34b7a304661a5044ba0219d05473c861d315cf23 Mon Sep 17 00:00:00 2001 From: genwhittTTD Date: Wed, 19 Nov 2025 15:29:00 -0500 Subject: [PATCH 02/16] doc development --- docs/guides/integration-databricks.md | 255 ++++++------ .../current/guides/integration-databricks.md | 384 ++++++++++++++++++ sidebars.js | 4 +- 3 files changed, 514 insertions(+), 129 deletions(-) create mode 100644 i18n/ja/docusaurus-plugin-content-docs/current/guides/integration-databricks.md diff --git a/docs/guides/integration-databricks.md b/docs/guides/integration-databricks.md index 7c02b1a48..a3aab2463 100644 --- a/docs/guides/integration-databricks.md +++ b/docs/guides/integration-databricks.md @@ -12,144 +12,153 @@ import Link from '@docusaurus/Link'; # UID2 Databricks Clean Room Integration Guide -Overview general info plus define audience. +This guide is for advertisers and data providers who want to manage their raw UID2s in a Databricks environment. -## Databricks listing? +[**GWH__MC01 "Amazon Web Services, Google Cloud Platform, or Microsoft Azure." -- which do we use? Or, any and all?**] -xxx +[**GWH__MC02 Is it for EUID also? I think not?**] + +## Databricks Listing? + +[**GWH__MC03 where do Databricks users go to get more information about UID2 integration?**] ## Functionality -xxx +The following table summarizes the functionality available with the UID2 Databricks integration. -### Key Benefits +| Encrypt Raw UID2 to UID2 Token for Sharing | Decrypt UID2 Token to Raw UID2 | Generate UID2 Token from DII | Refresh UID2 Token | Map DII to Raw UID2s | +| :--- | :--- | :--- | :--- | :--- | +| ✅ | ✅ | —* | — | ✅ | -xxx +*You cannot use Databricks to generate a UID2 token directly from DII. However, you can convert DII to a raw UID2, and then encrypt the raw UID2 into a UID2 token. 
-## Summary of Integration Steps +### Key Benefits -------------------- MATT GUIDE, BEGIN ------------------------------ +Here are some key benefits of integrating with Databricks for your UID2 processing: +- Native support for managing UID2 workflows within a Databricks data clean room. +- Secure identity interoperability between partner datasets. +- Direct lineage and observability for all UID2-related transformations and joins, for auditing and traceability. +- Streamlined integration between UID2 identifiers and The Trade Desk activation ecosystem. +- Self-service support for marketers and advertisers through Databricks. -## Summary of Integration Steps +## Integration Steps At a high level, the following are the steps to set up your Databricks integration and process your data: -1. Create a clean room and invite UID2 as a collaborator. -1. Send your sharing identifier to your UID2 contact. -1. Add data to the clean room. -1. Run the clean room notebook to map directly identifying information (DII). +1. [Create a clean room for UID2 collaboration](#create-clean-room-for-uid2-collaboration). +1. [Send your Databricks sharing identifier to your UID2 contact](#send-sharing-identifier-to-uid2-contact). +1. [Add data to the clean room](#add-data-to-the-clean-room). +1. [Map DII](#map-dii) by running the clean room notebook. + +### Create Clean Room for UID2 Collaboration -## Step 1: Create a clean room and invite UID2 as a collaborator +As a starting point, create a Databricks clean room—a secure environment for you to collaborate with UID2 to process your data. + +Follow the steps in [Create clean rooms](https://docs.databricks.com/aws/en/clean-rooms/create-clean-room) in the Databricks documentation. Use the correct sharing identifier based on the [UID2 environment](../getting-started/gs-environments) you want to connect to: see [UID2 Sharing Identifiers](#uid2-sharing-identifiers). + +:::important +After you've created a clean room, you cannot change its collaborators. If you have the option to set clean room collaborator aliases—for example, if you’re using the Databricks Python SDK to create the clean room—your collaborator alias must be `creator` and the UID2 collaborator alias must be `collaborator`. If you’re creating the clean room using the Databricks web UI, the correct collaborator aliases are set for you. +::: + +#### UID2 Sharing Identifiers -Follow the steps in Create clean rooms in the Databricks documentation. Use the correct sharing identifier from the table below, based on the UID2 Environment you wish to connect to. UID2 sharing identifiers can change. Be sure to check this page for the latest sharing identifiers. | Environment | UID2 Sharing Identifier | | :--- | :--- | -| Production | aws:us-east-2:21149de7-a9e9-4463-b4e0-066f4b033e5d:673872910525611:010d98a6-8cf2-4011-8bf7-ca45940bc329 | -Integration | aws:us-east-2:4651b4ea-b29c-42ec-aecb-2377de70bbd4:2366823546528067:c15e03bf-a348-4189-92e5-68b9a7fb4018 | - -:::note -Once you've created a clean room, you cannot change its collaborators. - -If you have the option to set clean room collaborator aliases—for example, if you’re using the Databricks Python SDK [**GWH__MC is this the UID2 Python SDK? Or a Databrics SDK?**]to create the clean room—your collaborator alias must be `creator` and the UID2 collaborator alias must be `collaborator`. If you’re creating the clean room using the Databricks web UI, the correct collaborator aliases are set for you. 
-::: +| Production | `aws:us-east-2:21149de7-a9e9-4463-b4e0-066f4b033e5d:673872910525611:010d98a6-8cf2-4011-8bf7-ca45940bc329` | +Integration | `aws:us-east-2:4651b4ea-b29c-42ec-aecb-2377de70bbd4:2366823546528067:c15e03bf-a348-4189-92e5-68b9a7fb4018` | -## Step 2: Send your sharing identifier to your UID2 contact +### Send Sharing Identifier to UID2 Contact Find the sharing identifier for the Unity Catalog metastore that is attached to the Databricks workspace where you’ll work with the clean room. Send the sharing identifier to your UID2 contact. + The sharing identifier is a string in this format: `::`. -For information on how to find the sharing identifier, see Get access in the Databricks-to-Databricks model in the Databricks documentation. - -## Step 3: Add data to the clean room - -Add one or more tables or views to the clean room. You can use any names for the schema, tables, and views. Tables and views must follow the schema detailed in [Input Table](#uptohere)Schema. - -## Step 4: Run the clean room notebook to map DII - -Run the `identity_map_v3` clean room notebook to map DII to UID2s. Details about this notebook are given in the next section. -Map DII -The `identity_map_v3` clean room notebook maps DII to UID2s. -Notebook Parameters -The `identity_map_v3` notebook can be used to map DII in any table or view that has been added to the creator catalog of the clean room. -The notebook has two parameters, input_schema and input_table. Together they identify the table or view in the clean room that contains the DII to be mapped. -For example, to map DII in the clean room table named creator.default.emails, set input_schema to default and input_table to emails. -Parameter Name -Description -input_schema -The schema containing the table or view. -input_table -The name of the table or view containing the DII to be mapped. -Input Table -The input table or view must have two columns: INPUT and INPUT_TYPE. The table or view can have additional columns, but they won’t be used by the notebook. -Column Name -Data Type -Description -INPUT -string -The DII to map. -INPUT_TYPE -string -The type of DII to map. Allowed values: email, email_hash, phone, and phone_hash. -DII Format -If the DII is an email address, the notebook normalizes the data using the UID2 Email Address Normalization rules. -If the DII is a phone number, you must normalize it before mapping it with the notebook, using the UID2 Phone Number Normalization rules. -Output Table -If the clean room has an output catalog, the mapped DII will be written to a table in the output catalog. Output tables are stored for 30 days. For more information, see Overview of output tables in the Databricks documentation. -Output Table Schema -Column Name -Data Type -Description -UID -string -The value is one of the following: -DII was successfully mapped: The UID2 associated with the DII. -Otherwise: NULL. -PREV_UID -string -The value is one of the following: -DII was successfully mapped and the current raw UID2 was rotated in the last 90 days: the previous raw UID2. -Otherwise: NULL. -REFRESH_FROM -timestamp -The value is one of the following: -DII was successfully mapped: The timestamp (in epoch seconds) indicating when this UID2 should be refreshed. -Otherwise: NULL. -UNMAPPED -string -The value is one of the following: -DII was successfully mapped: NULL. -Otherwise: The reason why the identifier was not mapped: OPTOUT, INVALID IDENTIFIER, or INVALID INPUT TYPE. -For details, see Values for the UNMAPPED Column. 
-Values for the UNMAPPED Column -The following table shows possible values for the UNMAPPED column. -Value -Meaning -NULL -The DII was successfully mapped. -OPTOUT -The user has opted out. -INVALID IDENTIFIER -The email address or phone number is invalid. -INVALID INPUT TYPE -The value of INPUT_TYPE is invalid. Valid values for INPUT_TYPE are: email, email_hash, phone, phone_hash. - - - - - - - -------------------- MATT GUIDE, END ------------------------------ +For information on how to find the sharing identifier, see [Request the recipient's sharing identifier](https://docs.databricks.com/aws/en/delta-sharing/create-recipient#step-1-request-the-recipients-sharing-identifier) in the Databricks documentation. +[**GWH__MC04 just noting that I changed the above: just the link copy, not the link itself. You had "Get access in the Databricks-to-Databricks model" but the link in your file went to the above. LMK if I need to change anything.**] -------------------- BELOW IS A COPY OF SNOWFLAKE DOC HEADINGS ------------------------------ +### Add Data to the Clean Room +Add one or more tables or views to the clean room. You can use any names for the schema, tables, and views. Tables and views must follow the schema detailed in [Input Table](#input-table ). +### Map DII -xxx +Run the `identity_map_v3` clean room [notebook](#https://docs.databricks.com/aws/en/notebooks/) to map email addresses, phone numbers, or their respective hashes to raw UID2s. + +## Running the Clean Room Notebook + +This section provides details to help you use your Databricks clean room to process your DII into raw UID2s, including the following: + +- [Notebook Parameters](#notebook-parameters) +- [Input Table](#input-table) +- [DII Format and Normalization](#dii-format-and-normalization) +- [Output Table](#output-table) +- [Output Table Schema](#output-table-schema) + +### Notebook Parameters + +You can use the `identity_map_v3` notebook to map DII in any table or view that you've added to the `creator` catalog of the clean room. + +The notebook has two parameters, `input_schema` and `input_table`. Together, these two parameters identify the table or view in the clean room that contains the DII to be mapped. + +For example, to map DII in the clean room table named `creator.default.emails`, set `input_schema` to `default` and `input_table` to `emails`. + +| Parameter Name | Description | +| :--- | :--- | +| `input_schema` | The schema containing the table or view. | +| `input_table` | The name of the table or view containing the DII to be mapped. | + +### Input Table + +The input table or view must have the two columns shown in the following table. The table or view can have additional columns, but the notebook doesn't use any additional columns, only these two. + +| Column Name | Data Type | Description | +| :--- | :--- | :--- | +| `INPUT` | string | The DII to map. | +| `INPUT_TYPE` | string | The type of DII to map. Allowed values: `email`, `email_hash`, `phone`, and `phone_hash`. | + +### DII Format and Normalization + +The normalization requirements depend on the type of DII you're processing, as follows: + +- **Email address**: The notebook normalizes the data using the UID2 [Email Address Normalization](../getting-started/gs-normalization-encoding#email-address-normalization) rules. +- **Phone number**: You must normalize the phone number before mapping it with the notebook, using the UID2 [Phone Number Normalization](../getting-started/gs-normalization-encoding#phone-number-normalization) rules. 
+ +### Output Table + +If the clean room has an output catalog, the mapped DII is written to a table in the output catalog. Output tables are stored for 30 days. + +For details, see [Overview of output tables](https://docs.databricks.com/aws/en/clean-rooms/output-tables#overview-of-output-tables) in the Databricks documentation. + +### Output Table Schema + +The following table provides information about the structure of the output data, including field names and values. + +| Column Name | Data Type | Description | +| :--- | :--- | :--- | +| `UID` | string | The value is one of the following:
  • **DII was successfully mapped**: The UID2 associated with the DII.
  • **Otherwise**: `NULL`.
| +| `PREV_UID` | string | The value is one of the following:
  • **DII was successfully mapped and the current raw UID2 was rotated in the last 90 days**: the previous raw UID2.
  • **Otherwise**: `NULL`.
| +| `REFRESH_FROM` | timestamp | The value is one of the following:
  • **DII was successfully mapped**: The timestamp (in epoch seconds) indicating when this UID2 should be refreshed.
  • **Otherwise**: `NULL`.
| +| `UNMAPPED` | string | The value is one of the following:
  • **DII was successfully mapped**: `NULL`.
  • **Otherwise**: The reason why the identifier was not mapped: `OPTOUT`, `INVALID IDENTIFIER`, or `INVALID INPUT TYPE`.
    For details, see [Values for the UNMAPPED Column](#values-for-the-unmapped-column).
| + +#### Values for the UNMAPPED Column + +The following table shows possible values for the `UNMAPPED` column. + +| Value | Meaning | +| :--- | :--- | +| `NULL` | The DII was successfully mapped. | +| `OPTOUT` | The user has opted out. | +| `INVALID IDENTIFIER` | The email address or phone number is invalid. | +| `INVALID INPUT TYPE` | The value of `INPUT_TYPE` is invalid. Valid values for `INPUT_TYPE` are: `email`, `email_hash`, `phone`, `phone_hash`. | + + + \ No newline at end of file diff --git a/i18n/ja/docusaurus-plugin-content-docs/current/guides/integration-databricks.md b/i18n/ja/docusaurus-plugin-content-docs/current/guides/integration-databricks.md new file mode 100644 index 000000000..a3aab2463 --- /dev/null +++ b/i18n/ja/docusaurus-plugin-content-docs/current/guides/integration-databricks.md @@ -0,0 +1,384 @@ +--- +title: Databricks Integration +sidebar_label: Databricks +pagination_label: Databricks Integration +description: Information about integrating with UID2 through Databricks. +hide_table_of_contents: false +sidebar_position: 04 +displayed_sidebar: docs +--- + +import Link from '@docusaurus/Link'; + +# UID2 Databricks Clean Room Integration Guide + +This guide is for advertisers and data providers who want to manage their raw UID2s in a Databricks environment. + +[**GWH__MC01 "Amazon Web Services, Google Cloud Platform, or Microsoft Azure." -- which do we use? Or, any and all?**] + +[**GWH__MC02 Is it for EUID also? I think not?**] + +## Databricks Listing? + +[**GWH__MC03 where do Databricks users go to get more information about UID2 integration?**] + +## Functionality + +The following table summarizes the functionality available with the UID2 Databricks integration. + +| Encrypt Raw UID2 to UID2 Token for Sharing | Decrypt UID2 Token to Raw UID2 | Generate UID2 Token from DII | Refresh UID2 Token | Map DII to Raw UID2s | +| :--- | :--- | :--- | :--- | :--- | +| ✅ | ✅ | —* | — | ✅ | + +*You cannot use Databricks to generate a UID2 token directly from DII. However, you can convert DII to a raw UID2, and then encrypt the raw UID2 into a UID2 token. + +### Key Benefits + +Here are some key benefits of integrating with Databricks for your UID2 processing: + +- Native support for managing UID2 workflows within a Databricks data clean room. +- Secure identity interoperability between partner datasets. +- Direct lineage and observability for all UID2-related transformations and joins, for auditing and traceability. +- Streamlined integration between UID2 identifiers and The Trade Desk activation ecosystem. +- Self-service support for marketers and advertisers through Databricks. + +## Integration Steps + +At a high level, the following are the steps to set up your Databricks integration and process your data: + +1. [Create a clean room for UID2 collaboration](#create-clean-room-for-uid2-collaboration). +1. [Send your Databricks sharing identifier to your UID2 contact](#send-sharing-identifier-to-uid2-contact). +1. [Add data to the clean room](#add-data-to-the-clean-room). +1. [Map DII](#map-dii) by running the clean room notebook. + +### Create Clean Room for UID2 Collaboration + +As a starting point, create a Databricks clean room—a secure environment for you to collaborate with UID2 to process your data. + +Follow the steps in [Create clean rooms](https://docs.databricks.com/aws/en/clean-rooms/create-clean-room) in the Databricks documentation. 
Use the correct sharing identifier based on the [UID2 environment](../getting-started/gs-environments) you want to connect to: see [UID2 Sharing Identifiers](#uid2-sharing-identifiers). + +:::important +After you've created a clean room, you cannot change its collaborators. If you have the option to set clean room collaborator aliases—for example, if you’re using the Databricks Python SDK to create the clean room—your collaborator alias must be `creator` and the UID2 collaborator alias must be `collaborator`. If you’re creating the clean room using the Databricks web UI, the correct collaborator aliases are set for you. +::: + +#### UID2 Sharing Identifiers + +UID2 sharing identifiers can change. Be sure to check this page for the latest sharing identifiers. + +| Environment | UID2 Sharing Identifier | +| :--- | :--- | +| Production | `aws:us-east-2:21149de7-a9e9-4463-b4e0-066f4b033e5d:673872910525611:010d98a6-8cf2-4011-8bf7-ca45940bc329` | +Integration | `aws:us-east-2:4651b4ea-b29c-42ec-aecb-2377de70bbd4:2366823546528067:c15e03bf-a348-4189-92e5-68b9a7fb4018` | + +### Send Sharing Identifier to UID2 Contact + +Find the sharing identifier for the Unity Catalog metastore that is attached to the Databricks workspace where you’ll work with the clean room. Send the sharing identifier to your UID2 contact. + +The sharing identifier is a string in this format: `::`. + +For information on how to find the sharing identifier, see [Request the recipient's sharing identifier](https://docs.databricks.com/aws/en/delta-sharing/create-recipient#step-1-request-the-recipients-sharing-identifier) in the Databricks documentation. + +[**GWH__MC04 just noting that I changed the above: just the link copy, not the link itself. You had "Get access in the Databricks-to-Databricks model" but the link in your file went to the above. LMK if I need to change anything.**] + +### Add Data to the Clean Room + +Add one or more tables or views to the clean room. You can use any names for the schema, tables, and views. Tables and views must follow the schema detailed in [Input Table](#input-table ). + +### Map DII + +Run the `identity_map_v3` clean room [notebook](#https://docs.databricks.com/aws/en/notebooks/) to map email addresses, phone numbers, or their respective hashes to raw UID2s. + +## Running the Clean Room Notebook + +This section provides details to help you use your Databricks clean room to process your DII into raw UID2s, including the following: + +- [Notebook Parameters](#notebook-parameters) +- [Input Table](#input-table) +- [DII Format and Normalization](#dii-format-and-normalization) +- [Output Table](#output-table) +- [Output Table Schema](#output-table-schema) + +### Notebook Parameters + +You can use the `identity_map_v3` notebook to map DII in any table or view that you've added to the `creator` catalog of the clean room. + +The notebook has two parameters, `input_schema` and `input_table`. Together, these two parameters identify the table or view in the clean room that contains the DII to be mapped. + +For example, to map DII in the clean room table named `creator.default.emails`, set `input_schema` to `default` and `input_table` to `emails`. + +| Parameter Name | Description | +| :--- | :--- | +| `input_schema` | The schema containing the table or view. | +| `input_table` | The name of the table or view containing the DII to be mapped. | + +### Input Table + +The input table or view must have the two columns shown in the following table. 
The table or view can have additional columns, but the notebook doesn't use any additional columns, only these two. + +| Column Name | Data Type | Description | +| :--- | :--- | :--- | +| `INPUT` | string | The DII to map. | +| `INPUT_TYPE` | string | The type of DII to map. Allowed values: `email`, `email_hash`, `phone`, and `phone_hash`. | + +### DII Format and Normalization + +The normalization requirements depend on the type of DII you're processing, as follows: + +- **Email address**: The notebook normalizes the data using the UID2 [Email Address Normalization](../getting-started/gs-normalization-encoding#email-address-normalization) rules. +- **Phone number**: You must normalize the phone number before mapping it with the notebook, using the UID2 [Phone Number Normalization](../getting-started/gs-normalization-encoding#phone-number-normalization) rules. + +### Output Table + +If the clean room has an output catalog, the mapped DII is written to a table in the output catalog. Output tables are stored for 30 days. + +For details, see [Overview of output tables](https://docs.databricks.com/aws/en/clean-rooms/output-tables#overview-of-output-tables) in the Databricks documentation. + +### Output Table Schema + +The following table provides information about the structure of the output data, including field names and values. + +| Column Name | Data Type | Description | +| :--- | :--- | :--- | +| `UID` | string | The value is one of the following:
  • **DII was successfully mapped**: The UID2 associated with the DII.
  • **Otherwise**: `NULL`.
| +| `PREV_UID` | string | The value is one of the following:
  • **DII was successfully mapped and the current raw UID2 was rotated in the last 90 days**: the previous raw UID2.
  • **Otherwise**: `NULL`.
| +| `REFRESH_FROM` | timestamp | The value is one of the following:
  • **DII was successfully mapped**: The timestamp (in epoch seconds) indicating when this UID2 should be refreshed.
  • **Otherwise**: `NULL`.
| +| `UNMAPPED` | string | The value is one of the following:
  • **DII was successfully mapped**: `NULL`.
  • **Otherwise**: The reason why the identifier was not mapped: `OPTOUT`, `INVALID IDENTIFIER`, or `INVALID INPUT TYPE`.
    For details, see [Values for the UNMAPPED Column](#values-for-the-unmapped-column).
| + +#### Values for the UNMAPPED Column + +The following table shows possible values for the `UNMAPPED` column. + +| Value | Meaning | +| :--- | :--- | +| `NULL` | The DII was successfully mapped. | +| `OPTOUT` | The user has opted out. | +| `INVALID IDENTIFIER` | The email address or phone number is invalid. | +| `INVALID INPUT TYPE` | The value of `INPUT_TYPE` is invalid. Valid values for `INPUT_TYPE` are: `email`, `email_hash`, `phone`, `phone_hash`. | + + + \ No newline at end of file diff --git a/sidebars.js b/sidebars.js index fba95d65f..c8eb817c5 100644 --- a/sidebars.js +++ b/sidebars.js @@ -407,6 +407,7 @@ const sidebars = { 'guides/integration-advertiser-dataprovider-overview', 'guides/integration-snowflake', 'guides/integration-snowflake-previous', + 'guides/integration-databricks', 'guides/integration-aws-entity-resolution', 'guides/advertiser-dataprovider-endpoints', 'DSP Integrations', @@ -493,7 +494,8 @@ const sidebars = { 'Advertiser/Data Provider Integrations', 'guides/integration-advertiser-dataprovider-overview', 'guides/integration-snowflake', - 'guides/integration-snowflake-integration-snowflake-previous', + 'guides/integration-snowflake-previous', + 'guides/integration-databricks', 'guides/integration-aws-entity-resolution', 'guides/advertiser-dataprovider-endpoints', 'sharing/sharing-bid-stream' From 67886bb559df5a55db32206321776045f6b0e4ac Mon Sep 17 00:00:00 2001 From: genwhittTTD Date: Thu, 20 Nov 2025 17:58:28 -0500 Subject: [PATCH 03/16] updates --- docs/guides/integration-databricks.md | 12 +- docs/ref-info/updates-doc.md | 22 +- .../current/guides/integration-databricks.md | 371 +----------------- 3 files changed, 27 insertions(+), 378 deletions(-) diff --git a/docs/guides/integration-databricks.md b/docs/guides/integration-databricks.md index a3aab2463..d6f4dfdbb 100644 --- a/docs/guides/integration-databricks.md +++ b/docs/guides/integration-databricks.md @@ -12,7 +12,7 @@ import Link from '@docusaurus/Link'; # UID2 Databricks Clean Room Integration Guide -This guide is for advertisers and data providers who want to manage their raw UID2s in a Databricks environment. +This guide is for advertisers and data providers who want to convert their user data to raw UID2s in a Databricks environment. [**GWH__MC01 "Amazon Web Services, Google Cloud Platform, or Microsoft Azure." -- which do we use? Or, any and all?**] @@ -86,7 +86,7 @@ Add one or more tables or views to the clean room. You can use any names for the ### Map DII -Run the `identity_map_v3` clean room [notebook](#https://docs.databricks.com/aws/en/notebooks/) to map email addresses, phone numbers, or their respective hashes to raw UID2s. +Run the `identity_map_v3` clean room [notebook](https://docs.databricks.com/aws/en/notebooks/) to map email addresses, phone numbers, or their respective hashes to raw UID2s. ## Running the Clean Room Notebook @@ -156,9 +156,8 @@ The following table shows possible values for the `UNMAPPED` column. | `INVALID INPUT TYPE` | The value of `INPUT_TYPE` is invalid. Valid values for `INPUT_TYPE` are: `email`, `email_hash`, `phone`, `phone_hash`. | - \ No newline at end of file +xxx \ No newline at end of file diff --git a/docs/ref-info/updates-doc.md b/docs/ref-info/updates-doc.md index 539cb71c9..2f2e363b0 100644 --- a/docs/ref-info/updates-doc.md +++ b/docs/ref-info/updates-doc.md @@ -20,6 +20,24 @@ Check out the latest updates to our UID2 documentation resources. Use the Tags toolbar to view a subset of documentation updates. 
::: +## Q4 2025 + +The following documents were released in this quarter. + + + +### Databricks Integration Guide + +November 25, 2025 + +We've added an integration guide for the UID2 Databricks integration. + +For details, see [UID2 Databricks Clean Room Integration Guide](../guides/integration-databricks.md). + + + + + ## Q3 2025 The following documents were released in this quarter. @@ -37,13 +55,13 @@ We updated the following additional implementations and corresponding documentat - Python SDK: see [SDK for Python Reference Guide](../sdks/sdk-ref-python.md) - Snowflake: see [Snowflake Integration Guide](../guides/integration-snowflake.md) - + -### Identity Map v3 +### Identity Map v3 (Endpoint Doc) July 11, 2025 diff --git a/i18n/ja/docusaurus-plugin-content-docs/current/guides/integration-databricks.md b/i18n/ja/docusaurus-plugin-content-docs/current/guides/integration-databricks.md index a3aab2463..01daa665b 100644 --- a/i18n/ja/docusaurus-plugin-content-docs/current/guides/integration-databricks.md +++ b/i18n/ja/docusaurus-plugin-content-docs/current/guides/integration-databricks.md @@ -12,373 +12,4 @@ import Link from '@docusaurus/Link'; # UID2 Databricks Clean Room Integration Guide -This guide is for advertisers and data providers who want to manage their raw UID2s in a Databricks environment. - -[**GWH__MC01 "Amazon Web Services, Google Cloud Platform, or Microsoft Azure." -- which do we use? Or, any and all?**] - -[**GWH__MC02 Is it for EUID also? I think not?**] - -## Databricks Listing? - -[**GWH__MC03 where do Databricks users go to get more information about UID2 integration?**] - -## Functionality - -The following table summarizes the functionality available with the UID2 Databricks integration. - -| Encrypt Raw UID2 to UID2 Token for Sharing | Decrypt UID2 Token to Raw UID2 | Generate UID2 Token from DII | Refresh UID2 Token | Map DII to Raw UID2s | -| :--- | :--- | :--- | :--- | :--- | -| ✅ | ✅ | —* | — | ✅ | - -*You cannot use Databricks to generate a UID2 token directly from DII. However, you can convert DII to a raw UID2, and then encrypt the raw UID2 into a UID2 token. - -### Key Benefits - -Here are some key benefits of integrating with Databricks for your UID2 processing: - -- Native support for managing UID2 workflows within a Databricks data clean room. -- Secure identity interoperability between partner datasets. -- Direct lineage and observability for all UID2-related transformations and joins, for auditing and traceability. -- Streamlined integration between UID2 identifiers and The Trade Desk activation ecosystem. -- Self-service support for marketers and advertisers through Databricks. - -## Integration Steps - -At a high level, the following are the steps to set up your Databricks integration and process your data: - -1. [Create a clean room for UID2 collaboration](#create-clean-room-for-uid2-collaboration). -1. [Send your Databricks sharing identifier to your UID2 contact](#send-sharing-identifier-to-uid2-contact). -1. [Add data to the clean room](#add-data-to-the-clean-room). -1. [Map DII](#map-dii) by running the clean room notebook. - -### Create Clean Room for UID2 Collaboration - -As a starting point, create a Databricks clean room—a secure environment for you to collaborate with UID2 to process your data. - -Follow the steps in [Create clean rooms](https://docs.databricks.com/aws/en/clean-rooms/create-clean-room) in the Databricks documentation. 
Use the correct sharing identifier based on the [UID2 environment](../getting-started/gs-environments) you want to connect to: see [UID2 Sharing Identifiers](#uid2-sharing-identifiers). - -:::important -After you've created a clean room, you cannot change its collaborators. If you have the option to set clean room collaborator aliases—for example, if you’re using the Databricks Python SDK to create the clean room—your collaborator alias must be `creator` and the UID2 collaborator alias must be `collaborator`. If you’re creating the clean room using the Databricks web UI, the correct collaborator aliases are set for you. -::: - -#### UID2 Sharing Identifiers - -UID2 sharing identifiers can change. Be sure to check this page for the latest sharing identifiers. - -| Environment | UID2 Sharing Identifier | -| :--- | :--- | -| Production | `aws:us-east-2:21149de7-a9e9-4463-b4e0-066f4b033e5d:673872910525611:010d98a6-8cf2-4011-8bf7-ca45940bc329` | -Integration | `aws:us-east-2:4651b4ea-b29c-42ec-aecb-2377de70bbd4:2366823546528067:c15e03bf-a348-4189-92e5-68b9a7fb4018` | - -### Send Sharing Identifier to UID2 Contact - -Find the sharing identifier for the Unity Catalog metastore that is attached to the Databricks workspace where you’ll work with the clean room. Send the sharing identifier to your UID2 contact. - -The sharing identifier is a string in this format: `::`. - -For information on how to find the sharing identifier, see [Request the recipient's sharing identifier](https://docs.databricks.com/aws/en/delta-sharing/create-recipient#step-1-request-the-recipients-sharing-identifier) in the Databricks documentation. - -[**GWH__MC04 just noting that I changed the above: just the link copy, not the link itself. You had "Get access in the Databricks-to-Databricks model" but the link in your file went to the above. LMK if I need to change anything.**] - -### Add Data to the Clean Room - -Add one or more tables or views to the clean room. You can use any names for the schema, tables, and views. Tables and views must follow the schema detailed in [Input Table](#input-table ). - -### Map DII - -Run the `identity_map_v3` clean room [notebook](#https://docs.databricks.com/aws/en/notebooks/) to map email addresses, phone numbers, or their respective hashes to raw UID2s. - -## Running the Clean Room Notebook - -This section provides details to help you use your Databricks clean room to process your DII into raw UID2s, including the following: - -- [Notebook Parameters](#notebook-parameters) -- [Input Table](#input-table) -- [DII Format and Normalization](#dii-format-and-normalization) -- [Output Table](#output-table) -- [Output Table Schema](#output-table-schema) - -### Notebook Parameters - -You can use the `identity_map_v3` notebook to map DII in any table or view that you've added to the `creator` catalog of the clean room. - -The notebook has two parameters, `input_schema` and `input_table`. Together, these two parameters identify the table or view in the clean room that contains the DII to be mapped. - -For example, to map DII in the clean room table named `creator.default.emails`, set `input_schema` to `default` and `input_table` to `emails`. - -| Parameter Name | Description | -| :--- | :--- | -| `input_schema` | The schema containing the table or view. | -| `input_table` | The name of the table or view containing the DII to be mapped. | - -### Input Table - -The input table or view must have the two columns shown in the following table. 
The table or view can have additional columns, but the notebook doesn't use any additional columns, only these two. - -| Column Name | Data Type | Description | -| :--- | :--- | :--- | -| `INPUT` | string | The DII to map. | -| `INPUT_TYPE` | string | The type of DII to map. Allowed values: `email`, `email_hash`, `phone`, and `phone_hash`. | - -### DII Format and Normalization - -The normalization requirements depend on the type of DII you're processing, as follows: - -- **Email address**: The notebook normalizes the data using the UID2 [Email Address Normalization](../getting-started/gs-normalization-encoding#email-address-normalization) rules. -- **Phone number**: You must normalize the phone number before mapping it with the notebook, using the UID2 [Phone Number Normalization](../getting-started/gs-normalization-encoding#phone-number-normalization) rules. - -### Output Table - -If the clean room has an output catalog, the mapped DII is written to a table in the output catalog. Output tables are stored for 30 days. - -For details, see [Overview of output tables](https://docs.databricks.com/aws/en/clean-rooms/output-tables#overview-of-output-tables) in the Databricks documentation. - -### Output Table Schema - -The following table provides information about the structure of the output data, including field names and values. - -| Column Name | Data Type | Description | -| :--- | :--- | :--- | -| `UID` | string | The value is one of the following:
  • **DII was successfully mapped**: The UID2 associated with the DII.
  • Othe**rwise: `NULL`.
| -| `PREV_UID` | string | The value is one of the following:
  • **DII was successfully mapped and the current raw UID2 was rotated in the last 90 days**: the previous raw UID2.
  • **Otherwise**: `NULL`.
| -| `REFRESH_FROM` | timestamp | The value is one of the following:
  • **DII was successfully mapped**: The timestamp (in epoch seconds) indicating when this UID2 should be refreshed.
  • **Otherwise**: `NULL`.
| -| `UNMAPPED` | string | The value is one of the following:
  • **DII was successfully mapped**: `NULL`.
  • Othe**rwise: The reason why the identifier was not mapped: `OPTOUT`, `INVALID IDENTIFIER`, or `INVALID INPUT TYPE`.
    For details, see [Values for the UNMAPPED Column](#values-for-the-unmapped-column).
| - -#### Values for the UNMAPPED Column - -The following table shows possible values for the `UNMAPPED` column. - -| Value | Meaning | -| :--- | :--- | -| `NULL` | The DII was successfully mapped. | -| `OPTOUT` | The user has opted out. | -| `INVALID IDENTIFIER` | The email address or phone number is invalid. | -| `INVALID INPUT TYPE` | The value of `INPUT_TYPE` is invalid. Valid values for `INPUT_TYPE` are: `email`, `email_hash`, `phone`, `phone_hash`. | - - - \ No newline at end of file +**COPY OF DATABRICKS DOC WILL GO HERE WHEN IT'S FINALIZED.** From b6a98379f0d51cae42cf64a4babdc4898e44b27d Mon Sep 17 00:00:00 2001 From: genwhittTTD Date: Fri, 21 Nov 2025 11:04:53 -0500 Subject: [PATCH 04/16] edits from MC --- docs/guides/integration-databricks.md | 265 ++++---------------------- 1 file changed, 33 insertions(+), 232 deletions(-) diff --git a/docs/guides/integration-databricks.md b/docs/guides/integration-databricks.md index d6f4dfdbb..35d677423 100644 --- a/docs/guides/integration-databricks.md +++ b/docs/guides/integration-databricks.md @@ -14,13 +14,11 @@ import Link from '@docusaurus/Link'; This guide is for advertisers and data providers who want to convert their user data to raw UID2s in a Databricks environment. -[**GWH__MC01 "Amazon Web Services, Google Cloud Platform, or Microsoft Azure." -- which do we use? Or, any and all?**] +[**GWH__MC01 "Amazon Web Services, Google Cloud Platform, or Microsoft Azure." -- which do we use? Or, any and all? And Matt said: "Let's discuss next week."**] -[**GWH__MC02 Is it for EUID also? I think not?**] +## Databricks Partner Network Listing -## Databricks Listing? - -[**GWH__MC03 where do Databricks users go to get more information about UID2 integration?**] +[**GWH__EE or MC for listing update when available. https://www.databricks.com/company/partners/technology? Guessing it will be here.**] ## Functionality @@ -28,11 +26,11 @@ The following table summarizes the functionality available with the UID2 Databri | Encrypt Raw UID2 to UID2 Token for Sharing | Decrypt UID2 Token to Raw UID2 | Generate UID2 Token from DII | Refresh UID2 Token | Map DII to Raw UID2s | | :--- | :--- | :--- | :--- | :--- | -| ✅ | ✅ | —* | — | ✅ | +| — | — | —* | — | ✅ | *You cannot use Databricks to generate a UID2 token directly from DII. However, you can convert DII to a raw UID2, and then encrypt the raw UID2 into a UID2 token. -### Key Benefits +## Key Benefits Here are some key benefits of integrating with Databricks for your UID2 processing: @@ -61,24 +59,18 @@ Follow the steps in [Create clean rooms](https://docs.databricks.com/aws/en/clea After you've created a clean room, you cannot change its collaborators. If you have the option to set clean room collaborator aliases—for example, if you’re using the Databricks Python SDK to create the clean room—your collaborator alias must be `creator` and the UID2 collaborator alias must be `collaborator`. If you’re creating the clean room using the Databricks web UI, the correct collaborator aliases are set for you. ::: -#### UID2 Sharing Identifiers - -UID2 sharing identifiers can change. Be sure to check this page for the latest sharing identifiers. 
- -| Environment | UID2 Sharing Identifier | -| :--- | :--- | -| Production | `aws:us-east-2:21149de7-a9e9-4463-b4e0-066f4b033e5d:673872910525611:010d98a6-8cf2-4011-8bf7-ca45940bc329` | -Integration | `aws:us-east-2:4651b4ea-b29c-42ec-aecb-2377de70bbd4:2366823546528067:c15e03bf-a348-4189-92e5-68b9a7fb4018` | - ### Send Sharing Identifier to UID2 Contact -Find the sharing identifier for the Unity Catalog metastore that is attached to the Databricks workspace where you’ll work with the clean room. Send the sharing identifier to your UID2 contact. +To establish a relationship with your UID2 contact, you'll need to send the Databricks sharing identifier. The sharing identifier is a string in this format: `::`. -For information on how to find the sharing identifier, see [Request the recipient's sharing identifier](https://docs.databricks.com/aws/en/delta-sharing/create-recipient#step-1-request-the-recipients-sharing-identifier) in the Databricks documentation. +Follow these steps: -[**GWH__MC04 just noting that I changed the above: just the link copy, not the link itself. You had "Get access in the Databricks-to-Databricks model" but the link in your file went to the above. LMK if I need to change anything.**] +1. Find the sharing identifier for the Unity Catalog metastore that is attached to the Databricks workspace where you’ll work with the clean room. + + For information on how to find this value, see [Finding a Sharing Identifier](#finding-a-sharing-identifier). +1. Send the sharing identifier to your UID2 contact. ### Add Data to the Clean Room @@ -88,6 +80,8 @@ Add one or more tables or views to the clean room. You can use any names for the Run the `identity_map_v3` clean room [notebook](https://docs.databricks.com/aws/en/notebooks/) to map email addresses, phone numbers, or their respective hashes to raw UID2s. +A successful notebook run results in raw UID2s populated in the output table. For details, see [Output Table](#output-table). + ## Running the Clean Room Notebook This section provides details to help you use your Databricks clean room to process your DII into raw UID2s, including the following: @@ -139,14 +133,14 @@ The following table provides information about the structure of the output data, | Column Name | Data Type | Description | | :--- | :--- | :--- | -| `UID` | string | The value is one of the following:
  • **DII was successfully mapped**: The UID2 associated with the DII.
  • **Otherwise**: `NULL`.
| +| `UID` | string | The value is one of the following:
  • **DII was successfully mapped**: The UID2 associated with the DII.
  • **Otherwise**: `NULL`.
| | `PREV_UID` | string | The value is one of the following:
  • **DII was successfully mapped and the current raw UID2 was rotated in the last 90 days**: the previous raw UID2.
  • **Otherwise**: `NULL`.
| | `REFRESH_FROM` | timestamp | The value is one of the following:
  • **DII was successfully mapped**: The timestamp (in epoch seconds) indicating when this UID2 should be refreshed.
  • **Otherwise**: `NULL`.
| -| `UNMAPPED` | string | The value is one of the following:
  • **DII was successfully mapped**: `NULL`.
  • **Otherwise**: The reason why the identifier was not mapped: `OPTOUT`, `INVALID IDENTIFIER`, or `INVALID INPUT TYPE`.
    For details, see [Values for the UNMAPPED Column](#values-for-the-unmapped-column).
| +| `UNMAPPED` | string | The value is one of the following:
  • **DII was successfully mapped**: `NULL`.
  • **Otherwise**: The reason why the identifier was not mapped: `OPTOUT`, `INVALID IDENTIFIER`, or `INVALID INPUT TYPE`.
    For details, see [Values for the UNMAPPED Column](#values-for-the-unmapped-column).
| #### Values for the UNMAPPED Column -The following table shows possible values for the `UNMAPPED` column. +The following table shows possible values for the `UNMAPPED` column in the output table schema. | Value | Meaning | | :--- | :--- | @@ -155,230 +149,37 @@ The following table shows possible values for the `UNMAPPED` column. | `INVALID IDENTIFIER` | The email address or phone number is invalid. | | `INVALID INPUT TYPE` | The value of `INPUT_TYPE` is invalid. Valid values for `INPUT_TYPE` are: `email`, `email_hash`, `phone`, `phone_hash`. | - - -## BELOW IS A COPY OF SNOWFLAKE DOC HEADINGS - ## Testing in the Integ Environment -xxx - -## Shared Objects/Functions? - -xxx - -### Database and Schema Names - -Query examples? - -xxx | - -### Map DII - -(**we have this... but, leaving these notes in in case we want to add anything**) - -Define types of DII? - -Does the service normalize? (for email / for phone) - -A successful query returns ...? - -#### Examples - -Mapping request examples in this section: - -- [Single Unhashed Email](#mapping-request-example---single-unhashed-email) -- [Multiple Unhashed Emails](#mapping-request-example---multiple-unhashed-emails) -- [Single Unhashed Phone Number](#mapping-request-example---single-unhashed-phone-number) -- [Multiple Unhashed Phone Numbers](#mapping-request-example---multiple-unhashed-phone-numbers) -- [Single Hashed Email](#mapping-request-example---single-hashed-email) -- [Multiple Hashed Emails](#mapping-request-example---multiple-hashed-emails) -- [Single Hashed Phone Number](#mapping-request-example---single-hashed-phone-number) -- [Multiple Hashed Phone Numbers](#mapping-request-example---multiple-hashed-phone-numbers) - -:::note -The input and output data in these examples is fictitious, for illustrative purposes only. The values provided are not real values. -::: - -#### Mapping Request Example - Single Unhashed Email - -The following query illustrates how to map a single email address, using the [default database and schema names](#database-and-schema-names). - -```sql - -``` - -Query results for a single email: - -``` -+----------------------------------------------+--------------------------------------------------+--------------+----------+ -| UID | PREV_UID | REFRESH_FROM | UNMAPPED | -+----------------------------------------------+--------------------------------------------------+--------------+----------+ -| 2ODl112/VS3x2vL+kG1439nPb7XNngLvOWiZGaMhdcU= | vP9zK2mL7fR4tY8qN3wE6xB0dH5jA1sC+nI/oGuMeVa= | 1735689600 | NULL | -+----------------------------------------------+--------------------------------------------------+--------------+----------+ -``` - -#### Mapping Request Example - Multiple Unhashed Emails - -The following query illustrates how to map multiple email addresses, using the [default database and schema names](#database-and-schema-names). - -```sql - -``` - -Query results for multiple emails: +[**GWH__MC content for this section is to come.**] -The following table identifies each item in the response, including `NULL` values for `NULL` or improperly formatted emails. 
+## Reference -``` -+----+----------------------+----------------------------------------------+----------------------------------------------+--------------+--------------------+ -| ID | EMAIL | UID | PREV_UID | REFRESH_FROM | UNMAPPED | -+----+----------------------+----------------------------------------------+----------------------------------------------+--------------+--------------------+ -| 1 | validate@example.com | 2ODl112/VS3x2vL+kG1439nPb7XNngLvOWiZGaMhdcU= | vP9zK2mL7fR4tY8qN3wE6xB0dH5jA1sC+nI/oGuMeVa= | 1735689600 | NULL | -| 2 | test@uidapi.com | IbW4n6LIvtDj/8fCESlU0QG9K/fH63UdcTkJpAG8fIQ= | NULL | 1735689600 | NULL | -| 3 | optout@example.com | NULL | NULL | NULL | OPTOUT | -| 4 | invalid-email | NULL | NULL | NULL | INVALID IDENTIFIER | -| 5 | NULL | NULL | NULL | NULL | INVALID IDENTIFIER | -+----+----------------------+----------------------------------------------+----------------------------------------------+--------------+--------------------+ -``` +This section includes the following reference information: -#### Mapping Request Example - Single Unhashed Phone Number +- [UID2 Sharing Identifiers](#uid2-sharing-identifiers) +- [Finding the Sharing Identifier for Your UID2 Contact](#finding-the-sharing-identifier-for-your-uid2-contact) -The following query illustrates how to map a phone number, using the [default database and schema names](#database-and-schema-names). +### UID2 Sharing Identifiers -You must normalize phone numbers using the UID2 [Phone Number Normalization](../getting-started/gs-normalization-encoding.md#phone-number-normalization) rules. - -```sql - -``` - -Query results for a single phone number: - -``` -+----------------------------------------------+----------+--------------+----------+ -| UID | PREV_UID | REFRESH_FROM | UNMAPPED | -+----------------------------------------------+----------+--------------+----------+ -| 2ODl112/VS3x2vL+kG1439nPb7XNngLvOWiZGaMhdcU= | NULL | 1735689600 | NULL | -+----------------------------------------------+----------+--------------+----------+ -``` - -#### Mapping Request Example - Multiple Unhashed Phone Numbers - -The following query illustrates how to map multiple phone numbers, using the [default database and schema names](#database-and-schema-names). - -You must normalize phone numbers using the UID2 [Phone Number Normalization](../getting-started/gs-normalization-encoding.md#phone-number-normalization) rules. - -```sql - -``` - -Query results for multiple phone numbers: - -The following table identifies each item in the response, including `NULL` values for `NULL` or invalid phone numbers. 
- -``` -+----+--------------+----------------------------------------------+----------------------------------------------+--------------+--------------------+ -| ID | PHONE | UID | PREV_UID | REFRESH_FROM | UNMAPPED | -+----+--------------+----------------------------------------------+----------------------------------------------+--------------+--------------------+ -| 1 | +12345678901 | 2ODl112/VS3x2vL+kG1439nPb7XNngLvOWiZGaMhdcU= | vP9zK2mL7fR4tY8qN3wE6xB0dH5jA1sC+nI/oGuMeVa= | 1735689600 | NULL | -| 2 | +61491570006 | IbW4n6LIvtDj/8fCESlU0QG9K/fH63UdcTkJpAG8fIQ= | NULL | 1735689600 | NULL | -| 3 | +56789123001 | NULL | NULL | NULL | OPTOUT | -| 4 | 1234 | NULL | NULL | NULL | INVALID IDENTIFIER | -| 5 | NULL | NULL | NULL | NULL | INVALID IDENTIFIER | -+----+--------------+----------------------------------------------+----------------------------------------------+--------------+--------------------+ -``` - -#### Mapping Request Example - Single Hashed Email - -The following query illustrates how to map a single email address hash, using the [default database and schema names](#database-and-schema-names). - -```sql - -``` - -Query results for a single hashed email: - -``` -+----------------------------------------------+----------------------------------------------+--------------+----------+ -| UID | PREV_UID | REFRESH_FROM | UNMAPPED | -+----------------------------------------------+----------------------------------------------+--------------+----------+ -| 2ODl112/VS3x2vL+kG1439nPb7XNngLvOWiZGaMhdcU= | vP9zK2mL7fR4tY8qN3wE6xB0dH5jA1sC+nI/oGuMeVa= | 1735689600 | NULL | -+----------------------------------------------+----------------------------------------------+--------------+----------+ -``` - -#### Mapping Request Example - Multiple Hashed Emails - -The following query illustrates how to map multiple email address hashes, using the [default database and schema names](#database-and-schema-names). - -```sql - -```` - -Query results for multiple hashed emails: - -The following table identifies each item in the response, including `NULL` values for `NULL` hashes. - -``` -+----+----------------------------------------------+----------------------------------------------+----------------------------------------------+--------------+--------------------+ -| ID | EMAIL_HASH | UID | PREV_UID | REFRESH_FROM | UNMAPPED | -+----+----------------------------------------------+----------------------------------------------+----------------------------------------------+--------------+--------------------+ -| 1 | LdhtUlMQ58ZZy5YUqGPRQw5xUMS5dXG5ocJHYJHbAKI= | 2ODl112/VS3x2vL+kG1439nPb7XNngLvOWiZGaMhdcU= | vP9zK2mL7fR4tY8qN3wE6xB0dH5jA1sC+nI/oGuMeVa= | 1735689600 | NULL | -| 2 | /XJSTajB68SCUyuc3ePyxSLNhxrMKvJcjndq8TuwW5g= | IbW4n6LIvtDj/8fCESlU0QG9K/fH63UdcTkJpAG8fIQ= | NULL | 1735689600 | NULL | -| 2 | UebesrNN0bQkm/QR7Jx7eav+UDXN5Gbq3zs1fLBMRy0= | NULL | NULL | 1735689600 | OPTOUT | -| 4 | NULL | NULL | NULL | NULL | INVALID IDENTIFIER | -+----+----------------------------------------------+----------------------------------------------+----------------------------------------------+--------------+--------------------+ -``` - -#### Mapping Request Example - Single Hashed Phone Number - -The following query illustrates how to map a single phone number hash, using the [default database and schema names](#database-and-schema-names). 
- -```sql - -``` - -Query results for a single hashed phone number: - -``` -+----------------------------------------------+----------------------------------------------+--------------+----------+ -| UID | PREV_UID | REFRESH_FROM | UNMAPPED | -+----------------------------------------------+----------------------------------------------+--------------+----------+ -| 2ODl112/VS3x2vL+kG1439nPb7XNngLvOWiZGaMhdcU= | vP9zK2mL7fR4tY8qN3wE6xB0dH5jA1sC+nI/oGuMeVa= | 1735689600 | NULL | -+----------------------------------------------+----------------------------------------------+--------------+----------+ -``` - -#### Mapping Request Example - Multiple Hashed Phone Numbers - -The following query illustrates how to map multiple phone number hashes, using the [default database and schema names](#database-and-schema-names). - -```sql - -``` +UID2 sharing identifiers can change. Be sure to check this page for the latest sharing identifiers. -Query results for multiple hashed phone numbers: +| Environment | UID2 Sharing Identifier | +| :--- | :--- | +| Production | `aws:us-east-2:21149de7-a9e9-4463-b4e0-066f4b033e5d:673872910525611:010d98a6-8cf2-4011-8bf7-ca45940bc329` | +| Integration | `aws:us-east-2:4651b4ea-b29c-42ec-aecb-2377de70bbd4:2366823546528067:c15e03bf-a348-4189-92e5-68b9a7fb4018` | -The following table identifies each item in the response, including `NULL` values for `NULL` hashes. +### Finding a Sharing Identifier -``` -+----+----------------------------------------------+----------------------------------------------+----------------------------------------------+--------------+--------------------+ -| ID | PHONE_HASH | UID | PREV_UID | REFRESH_FROM | UNMAPPED | -+----+----------------------------------------------+----------------------------------------------+----------------------------------------------+--------------+--------------------+ -| 1 | LdhtUlMQ58ZZy5YUqGPRQw5xUMS5dXG5ocJHYJHbAKI= | 2ODl112/VS3x2vL+kG1439nPb7XNngLvOWiZGaMhdcU= | vP9zK2mL7fR4tY8qN3wE6xB0dH5jA1sC+nI/oGuMeVa= | 1735689600 | NULL | -| 2 | /XJSTajB68SCUyuc3ePyxSLNhxrMKvJcjndq8TuwW5g= | IbW4n6LIvtDj/8fCESlU0QG9K/fH63UdcTkJpAG8fIQ= | NULL | 1735689600 | NULL | -| 2 | UebesrNN0bQkm/QR7Jx7eav+UDXN5Gbq3zs1fLBMRy0= | NULL | NULL | 1735689600 | OPTOUT | -| 4 | NULL | NULL | NULL | NULL | INVALID IDENTIFIER | -+----+----------------------------------------------+----------------------------------------------+----------------------------------------------+--------------+--------------------+ -``` +To find the sharing identifier for your UID2 contact, follow these steps: -### Monitor Raw UID2 Refresh and Regenerate Raw UID2s +In your Databricks workspace, in the Catalog Explorer, click **Catalog**. -xxx +At the top, click the gear icon and select **Delta Sharing**. -#### Targeted Input Table +On the **Shared with me** tab, in the upper right, click your Databricks sharing organization and then select **Copy sharing identifier**. -xxx +For details, see [Request the recipient's sharing identifier](https://docs.databricks.com/aws/en/delta-sharing/create-recipient#step-1-request-the-recipients-sharing-identifier) in the Databricks documentation. 
-Query results: -xxx \ No newline at end of file From 21b66e80e7126c7c0cdcd95de87cc882eadb6355 Mon Sep 17 00:00:00 2001 From: genwhittTTD Date: Fri, 21 Nov 2025 12:48:35 -0500 Subject: [PATCH 05/16] add intro info --- docs/guides/integration-databricks.md | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/docs/guides/integration-databricks.md b/docs/guides/integration-databricks.md index 35d677423..1acedd5b1 100644 --- a/docs/guides/integration-databricks.md +++ b/docs/guides/integration-databricks.md @@ -16,6 +16,18 @@ This guide is for advertisers and data providers who want to convert their user [**GWH__MC01 "Amazon Web Services, Google Cloud Platform, or Microsoft Azure." -- which do we use? Or, any and all? And Matt said: "Let's discuss next week."**] +## Integration Overview + +A Databricks clean room is a secure and privacy-protected environment where multiple parties can work together on sensitive enterprise data without direct access to each other's data. + +In the context of UID2, you set up the clean room and place your data there. You set up a trust relationship with the UID2 Operator and allow the Operator to convert your data to raw UID2s. + +The Secure Clean Room environment ensures that all UID2 operations occur within your Databricks workspace. You never send raw DII, and all calculations are performed via programmable, policy-enforced queries. + +With UID2 supported in the Databricks environment, advertisers and data partners can securely connect their first-party data to drive more relevant and measurable campaigns across the open internet. + +[**GWH__MC11 is it only first-party data for the moment? If they're just sending phone numbers and emails, I don't see what the difference is... it's just data?**] + ## Databricks Partner Network Listing [**GWH__EE or MC for listing update when available. https://www.databricks.com/company/partners/technology? Guessing it will be here.**] From 9879353f60b004ea4a8640d68b6c375346cb62e3 Mon Sep 17 00:00:00 2001 From: genwhittTTD Date: Tue, 25 Nov 2025 17:12:46 -0500 Subject: [PATCH 06/16] edits from MC plus additional mods --- docs/getting-started/gs-permissions.md | 2 +- ...ation-advertiser-dataprovider-endpoints.md | 2 +- ...ration-advertiser-dataprovider-overview.md | 4 +- docs/guides/integration-databricks.md | 43 ++++++++++--------- docs/guides/summary-guides.md | 1 + docs/overviews/overview-advertisers.md | 1 + docs/overviews/overview-data-providers.md | 1 + docs/ref-info/updates-doc.md | 2 +- docs/summary-doc-v2.md | 2 +- .../current/guides/integration-databricks.md | 2 +- 10 files changed, 33 insertions(+), 27 deletions(-) diff --git a/docs/getting-started/gs-permissions.md b/docs/getting-started/gs-permissions.md index 67a284233..c1ecdc464 100644 --- a/docs/getting-started/gs-permissions.md +++ b/docs/getting-started/gs-permissions.md @@ -26,5 +26,5 @@ The following table lists the key permissions, the types of participants that co | :--- | :--- | :--- | | Generator | Publishers | Permission to call the [POST /token/generate](../endpoints/post-token-generate.md), [POST /token/validate](../endpoints/post-token-validate.md), and [POST /token/refresh](../endpoints/post-token-refresh.md) endpoints, to generate UID2 tokens from DII and to refresh them, using one of these integration methods:
  • A Prebid integration
  • The SDK for JavaScript
  • An integration that directly calls the applicable API endpoints for retrieving and managing UID2 tokens
| | Bidder | DSPs | Permission to decrypt UID2 tokens coming in from the bidstream from publishers into raw UID2s for bidding purposes. | -| Sharer | Any participant type that takes part in UID2 sharing.
For details, see [UID2 Sharing: Overview](../sharing/sharing-overview.md). | Permission to do both of the following:
  • Encrypt raw UID2s into UID2 tokens for sharing with another authorized sharing participant, using a UID2 SDK or Snowflake
  • Decrypt UID2 tokens received from another authorized sharing participant into raw UID2s
| +| Sharer | Any participant type that takes part in UID2 sharing.
For details, see [UID2 Sharing: Overview](../sharing/sharing-overview.md). | Permission to do both of the following:
  • Encrypt raw UID2s into UID2 tokens for sharing with another authorized sharing participant, using a UID2 SDK, Snowflake, or Databricks
  • Decrypt UID2 tokens received from another authorized sharing participant into raw UID2s
| | Mapper | Advertisers
Data Providers | Permission to call the following endpoints to map multiple email addresses, phone numbers, or their respective hashes to their raw UID2s, previous raw UID2s, and refresh timestamps:
  • [POST /identity/map](../endpoints/post-identity-map.md) (latest version)
  • The earlier v2 identity mapping endpoints: [POST /identity/map (v2)](../endpoints/post-identity-map.md) and [POST /identity/buckets](../endpoints/post-identity-buckets.md)
| diff --git a/docs/guides/integration-advertiser-dataprovider-endpoints.md b/docs/guides/integration-advertiser-dataprovider-endpoints.md index c834ea151..c4d7948c4 100644 --- a/docs/guides/integration-advertiser-dataprovider-endpoints.md +++ b/docs/guides/integration-advertiser-dataprovider-endpoints.md @@ -11,7 +11,7 @@ import Link from '@docusaurus/Link'; # Advertiser/Data Provider Integration to HTTP Endpoints -This guide covers integration steps for advertisers and data providers to integrate with UID2 by writing code to call UID2 HTTP endpoints, rather than using another implementation option such as an SDK, Snowflake, or AWS Entity Resolution. +This guide covers integration steps for advertisers and data providers to integrate with UID2 by writing code to call UID2 HTTP endpoints, rather than using another implementation option such as an SDK, Snowflake, Databricks, or AWS Entity Resolution. :::tip For a summary of all integration options and steps for advertisers and data providers, see [Advertiser/Data Provider Integration Overview](integration-advertiser-dataprovider-overview.md). diff --git a/docs/guides/integration-advertiser-dataprovider-overview.md b/docs/guides/integration-advertiser-dataprovider-overview.md index dff199a95..f30bb6524 100644 --- a/docs/guides/integration-advertiser-dataprovider-overview.md +++ b/docs/guides/integration-advertiser-dataprovider-overview.md @@ -53,7 +53,7 @@ The following table shows the implementation options that are available for adve | High-Level Step | Implementation Options | | --- | --- | -| [1: Generate Raw UID2s from DII](#1-generate-raw-uid2s-from-dii) | Use any of the following options to map DII to raw UID2s:
  • One of these UID2 SDKs:
    • Python SDK: [Map DII to Raw UID2s](../sdks/sdk-ref-python.md#map-dii-to-raw-uid2s)
    • Java SDK: [Usage for Advertisers/Data Providers](../sdks/sdk-ref-java.md#usage-for-advertisersdata-providers)
  • Snowflake: [Map DII](integration-snowflake.md#map-dii)
  • AWS Entity Resolution: [AWS Entity Resolution Integration Guide](integration-aws-entity-resolution.md)
  • HTTP endpoints: [POST /identity/map](../endpoints/post-identity-map.md)
| +| [1: Generate Raw UID2s from DII](#1-generate-raw-uid2s-from-dii) | Use any of the following options to map DII to raw UID2s:
  • One of these UID2 SDKs:
    • Python SDK: [Map DII to Raw UID2s](../sdks/sdk-ref-python.md#map-dii-to-raw-uid2s)
    • Java SDK: [Usage for Advertisers/Data Providers](../sdks/sdk-ref-java.md#usage-for-advertisersdata-providers)
  • Snowflake: [Map DII](integration-snowflake.md#map-dii)
  • Databricks: [Map DII](integration-databricks.md#map-dii)
  • AWS Entity Resolution: [AWS Entity Resolution Integration Guide](integration-aws-entity-resolution.md)
  • HTTP endpoints: [POST /identity/map](../endpoints/post-identity-map.md)
| | [2: Store Raw UID2s and Refresh Timestamps](#2-store-raw-uid2s-and-refresh-timestamps) | Custom (your choice). | | [3: Manipulate or Combine Raw UID2s](#3-manipulate-or-combine-raw-uid2s) | Custom (your choice). | | [4: Send Stored Raw UID2s to DSPs to Create Audiences or Conversions](#4-send-stored-raw-uid2s-to-dsps-to-create-audiences-or-conversions) | Custom (your choice). | @@ -87,6 +87,8 @@ To generate raw UID2s, use one of the following options: - Snowflake: See [Map DII](integration-snowflake.md#map-dii). +- Databricks: See [Map DII](integration-databricks.md#map-dii). + - AWS Entity Resolution: See [AWS Entity Resolution Integration Guide](integration-aws-entity-resolution.md). - HTTP endpoints: [POST /identity/map](../endpoints/post-identity-map.md). For details, see [Generate Raw UID2s from DII](integration-advertiser-dataprovider-endpoints.md#1-generate-raw-uid2s-from-dii). diff --git a/docs/guides/integration-databricks.md b/docs/guides/integration-databricks.md index 1acedd5b1..fae47ecab 100644 --- a/docs/guides/integration-databricks.md +++ b/docs/guides/integration-databricks.md @@ -10,27 +10,27 @@ displayed_sidebar: docs import Link from '@docusaurus/Link'; -# UID2 Databricks Clean Room Integration Guide +# Databricks Clean Rooms Integration Guide This guide is for advertisers and data providers who want to convert their user data to raw UID2s in a Databricks environment. -[**GWH__MC01 "Amazon Web Services, Google Cloud Platform, or Microsoft Azure." -- which do we use? Or, any and all? And Matt said: "Let's discuss next week."**] - ## Integration Overview -A Databricks clean room is a secure and privacy-protected environment where multiple parties can work together on sensitive enterprise data without direct access to each other's data. +[Databricks Clean Rooms](https://docs.databricks.com/aws/en/clean-rooms/) is a Databricks data warehousing solution, where you as a partner can store your data and integrate with the UID2 framework. Using Databricks Clean Rooms, UID2 enables you to securely share consumer identifier data without exposing sensitive directly identifying information (DII). -In the context of UID2, you set up the clean room and place your data there. You set up a trust relationship with the UID2 Operator and allow the Operator to convert your data to raw UID2s. +In the context of UID2, you set up the Databricks Clean Rooms environment and place your data there. You set up a trust relationship with the UID2 Operator and allow the Operator to convert your data to raw UID2s. -The Secure Clean Room environment ensures that all UID2 operations occur within your Databricks workspace. You never send raw DII, and all calculations are performed via programmable, policy-enforced queries. +With UID2 supported in the clean room, advertisers and data partners can securely process their first-party data within Databricks. -With UID2 supported in the Databricks environment, advertisers and data partners can securely connect their first-party data to drive more relevant and measurable campaigns across the open internet. +[**GWH__EE01 is it only first-party data, or just data? If they're just sending phone numbers and emails, I don't see what the difference is... it's just data?**] -[**GWH__MC11 is it only first-party data for the moment? If they're just sending phone numbers and emails, I don't see what the difference is... it's just data?**] +[**GWH__EE02 Please provide any additional content you want in the overview. 
Thx.**] + ## Functionality @@ -38,9 +38,7 @@ The following table summarizes the functionality available with the UID2 Databri | Encrypt Raw UID2 to UID2 Token for Sharing | Decrypt UID2 Token to Raw UID2 | Generate UID2 Token from DII | Refresh UID2 Token | Map DII to Raw UID2s | | :--- | :--- | :--- | :--- | :--- | -| — | — | —* | — | ✅ | - -*You cannot use Databricks to generate a UID2 token directly from DII. However, you can convert DII to a raw UID2, and then encrypt the raw UID2 into a UID2 token. +| — | — | — | — | ✅ | ## Key Benefits @@ -63,7 +61,7 @@ At a high level, the following are the steps to set up your Databricks integrati ### Create Clean Room for UID2 Collaboration -As a starting point, create a Databricks clean room—a secure environment for you to collaborate with UID2 to process your data. +As a starting point, create a Databricks Clean Rooms environment—a secure environment for you to collaborate with UID2 to process your data. Follow the steps in [Create clean rooms](https://docs.databricks.com/aws/en/clean-rooms/create-clean-room) in the Databricks documentation. Use the correct sharing identifier based on the [UID2 environment](../getting-started/gs-environments) you want to connect to: see [UID2 Sharing Identifiers](#uid2-sharing-identifiers). @@ -90,13 +88,13 @@ Add one or more tables or views to the clean room. You can use any names for the ### Map DII -Run the `identity_map_v3` clean room [notebook](https://docs.databricks.com/aws/en/notebooks/) to map email addresses, phone numbers, or their respective hashes to raw UID2s. +Run the `identity_map_v3` Databricks Clean Rooms [notebook](https://docs.databricks.com/aws/en/notebooks/) to map email addresses, phone numbers, or their respective hashes to raw UID2s. A successful notebook run results in raw UID2s populated in the output table. For details, see [Output Table](#output-table). -## Running the Clean Room Notebook +## Running the Clean Rooms Notebook -This section provides details to help you use your Databricks clean room to process your DII into raw UID2s, including the following: +This section provides details to help you use your Databricks Clean Rooms environment to process your DII into raw UID2s, including the following: - [Notebook Parameters](#notebook-parameters) - [Input Table](#input-table) @@ -115,7 +113,7 @@ For example, to map DII in the clean room table named `creator.default.emails`, | Parameter Name | Description | | :--- | :--- | | `input_schema` | The schema containing the table or view. | -| `input_table` | The name of the table or view containing the DII to be mapped. | +| `input_table` | The name you specify for the table or view containing the DII to be mapped. | ### Input Table @@ -163,7 +161,13 @@ The following table shows possible values for the `UNMAPPED` column in the outpu ## Testing in the Integ Environment -[**GWH__MC content for this section is to come.**] +If you'd like to test the Databricks Clean Rooms implementation before signing a UID2 POC, you can ask your UID2 contact for access in the integ (integration) environment. This environment is for testing only, and has no production data. + +In the request, be sure to include your sharing identifier, and use the sharing identifier for the UID2 integration environment. For details, see [UID2 Sharing Identifiers](#uid2-sharing-identifiers). + +While you're waiting to hear back, you could create the clean room, invite UID2, and put your assets into the clean room. For details, see [Integration Steps](#integration-steps). 
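
For example, one asset you could stage while you wait is a small test table that matches the notebook's required input schema. The following Databricks SQL sketch is illustrative only: the catalog, schema, and table names are placeholders, and only the `INPUT` and `INPUT_TYPE` columns are read by the `identity_map_v3` notebook.

```sql
-- Hypothetical test input for the integ environment.
-- Only INPUT and INPUT_TYPE are required; any other columns are ignored by the notebook.
CREATE TABLE IF NOT EXISTS main.default.uid2_mapping_input (
  INPUT STRING,
  INPUT_TYPE STRING
);

INSERT INTO main.default.uid2_mapping_input (INPUT, INPUT_TYPE) VALUES
  ('user@example.com', 'email'),   -- email addresses are normalized by the notebook
  ('+12345678901', 'phone');       -- phone numbers must already be normalized per the UID2 rules
```

After you add the table to the clean room, you would point the notebook at it by setting `input_schema` to `default` and `input_table` to `uid2_mapping_input`.
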
+ +When your access is ready, your UID2 contact notifies you. ## Reference @@ -192,6 +196,3 @@ At the top, click the gear icon and select **Delta Sharing**. On the **Shared with me** tab, in the upper right, click your Databricks sharing organization and then select **Copy sharing identifier**. For details, see [Request the recipient's sharing identifier](https://docs.databricks.com/aws/en/delta-sharing/create-recipient#step-1-request-the-recipients-sharing-identifier) in the Databricks documentation. - - - diff --git a/docs/guides/summary-guides.md b/docs/guides/summary-guides.md index 4c2ca4727..b2090ea6b 100644 --- a/docs/guides/summary-guides.md +++ b/docs/guides/summary-guides.md @@ -102,6 +102,7 @@ The following documentation resources are available for advertisers and data pro | :--- | :--- | | [Advertiser/Data Provider Overview](integration-advertiser-dataprovider-overview.md) | This guide provides an overview of integration options for organizations that collect user data and push it to other UID2 participants. | | [Snowflake Integration Guide](integration-snowflake.md) | Instructions for generating UID2s from emails within Snowflake. | +| [Databricks Clean Rooms Integration Guide](integration-databricks.md) | Instructions for generating UID2s from emails or phone numbers in a Databricks Clean Rooms environment. | | [AWS Entity Resolution Integration Guide](integration-aws-entity-resolution.md) | Instructions for integrating with the UID2 framework using AWS Entity Resolution. | | [Advertiser/Data Provider Integration to HTTP Endpoints](integration-advertiser-dataprovider-endpoints.md) | This guide covers integration steps for advertisers and data providers to integrate with UID2 by writing code to call UID2 HTTP endpoints, rather than using another implementation option such as an SDK, Snowflake, or AWS Entity Resolution. | | [Client-Side Integration Guide for JavaScript](integration-javascript-client-side.md) | A guide for advertisers and data providers who want to use this SDK for adding a UID2 token to their tracking pixels. | diff --git a/docs/overviews/overview-advertisers.md b/docs/overviews/overview-advertisers.md index bd4b5151d..de5ff9645 100644 --- a/docs/overviews/overview-advertisers.md +++ b/docs/overviews/overview-advertisers.md @@ -76,6 +76,7 @@ The following documentation resources are available for advertisers and data pro | :--- | :--- | :--- | | Overview of integration options for organizations that collect user data and push it to other UID2 participants | [Advertiser/Data Provider Integration Overview](../guides/integration-advertiser-dataprovider-overview.md) | This guide covers integration workflows for mapping identity for audience-building and targeting. | | Snowflake | [Snowflake Integration Guide](../guides/integration-snowflake.md) | This guide provides instructions for generating UID2s from emails within Snowflake. | +| Databricks Clean Rooms | [Databricks Clean Rooms Integration Guide](../guides/integration-databricks.md) | This guide provides instructions for generating UID2s from emails or phone numbers in a Databricks Clean Rooms environment. | | AWS Entity Resolution | [AWS Entity Resolution Integration Guide](../guides/integration-aws-entity-resolution.md) | This guide provides instructions for integrating with the UID2 framework using AWS Entity Resolution. 
| | Integration steps for organizations that collect user data and push it to other UID2 participants, using UID2 HTTP endpoints only | [Advertiser/Data Provider Integration to HTTP Endpoints](../guides/integration-advertiser-dataprovider-endpoints.md) | This guide covers integration steps for advertisers and data providers to integrate with UID2 by writing code to call UID2 HTTP endpoints, rather than using another implementation option such as an SDK, Snowflake, or AWS Entity Resolution. | | Integration steps for advertisers and data providers who want to use the client-side JavaScript SDK for adding a UID2 token to their tracking pixels. | [Client-Side Integration Guide for JavaScript](../guides/integration-javascript-client-side.md) | This guide provides instructions for generating UID2 tokens (advertising tokens) using only JavaScript client-side changes. | diff --git a/docs/overviews/overview-data-providers.md b/docs/overviews/overview-data-providers.md index 9da97cbfa..f73be112d 100644 --- a/docs/overviews/overview-data-providers.md +++ b/docs/overviews/overview-data-providers.md @@ -81,6 +81,7 @@ The following documentation resources are available for advertisers and data pro | :--- | :--- | :--- | | Overview of integration options for organizations that collect user data and push it to other UID2 participants | [Advertiser/Data Provider Integration Overview](../guides/integration-advertiser-dataprovider-overview.md) | This guide covers integration workflows for mapping identity for audience-building and targeting. | | Snowflake | [Snowflake Integration Guide](../guides/integration-snowflake.md) | This guide provides instructions for generating UID2s from emails within Snowflake. | +| Databricks Clean Rooms | [Databricks Clean Rooms Integration Guide](../guides/integration-databricks.md) | This guide provides instructions for generating UID2s from emails or phone numbers in a Databricks Clean Rooms environment. | | AWS Entity Resolution | [AWS Entity Resolution Integration Guide](../guides/integration-aws-entity-resolution.md) | This guide provides instructions for integrating with the UID2 framework using AWS Entity Resolution. | | Integration steps for organizations that collect user data and push it to other UID2 participants, using UID2 HTTP endpoints only | [Advertiser/Data Provider Integration to HTTP Endpoints](../guides/integration-advertiser-dataprovider-endpoints.md) | This guide covers integration steps for advertisers and data providers to integrate with UID2 by writing code to call UID2 HTTP endpoints, rather than using another implementation option such as an SDK, Snowflake, or AWS Entity Resolution. | | Integration steps for advertisers and data providers who want to use the client-side JavaScript SDK for adding a UID2 token to their tracking pixels. | [Client-Side Integration Guide for JavaScript](../guides/integration-javascript-client-side.md) | This guide provides instructions for generating UID2 tokens (advertising tokens) using only JavaScript client-side changes. | diff --git a/docs/ref-info/updates-doc.md b/docs/ref-info/updates-doc.md index 2f2e363b0..a61fe8735 100644 --- a/docs/ref-info/updates-doc.md +++ b/docs/ref-info/updates-doc.md @@ -32,7 +32,7 @@ November 25, 2025 We've added an integration guide for the UID2 Databricks integration. -For details, see [UID2 Databricks Clean Room Integration Guide](../guides/integration-databricks.md). +For details, see [UID2 Databricks Clean Rooms Integration Guide](../guides/integration-databricks.md). 
diff --git a/docs/summary-doc-v2.md b/docs/summary-doc-v2.md index e844ace25..3124b91ff 100644 --- a/docs/summary-doc-v2.md +++ b/docs/summary-doc-v2.md @@ -18,5 +18,5 @@ For details on using the API, see the following pages. | :--- | :--- | | [Encrypting Requests and Decrypting Responses](getting-started/gs-encryption-decryption.md) | The high-level request-response workflow for the UID2 APIs, requirements for encrypting requests and decrypting responses, and respective script examples in different programming languages. | | [Endpoints](endpoints/summary-endpoints.md) | The API reference for managing identity tokens and mapping email addresses, phone numbers, or hashes to their UID2s and salt bucket IDs used to generate the UID2s.
NOTE: The integration environment and the production environment require different [API keys](ref-info/glossary-uid.md#gl-api-key). | -| [Integration Guides](guides/summary-guides.md) | The UID2 integration workflows for UID2 participants, such as publishers, DSPs, advertisers, and data providers, as well as Operator Enterprise Partners, such as Microsoft Azure, AWS, and Snowflake. | +| [Integration Guides](guides/summary-guides.md) | The UID2 integration workflows for UID2 participants, such as publishers, DSPs, advertisers, and data providers, as well as Operator Enterprise Partners, such as Microsoft Azure, AWS, Snowflake, and Databricks. | | [SDKs](sdks/summary-sdks.md) | Links to documentation for using UID2 SDKs. | diff --git a/i18n/ja/docusaurus-plugin-content-docs/current/guides/integration-databricks.md b/i18n/ja/docusaurus-plugin-content-docs/current/guides/integration-databricks.md index 01daa665b..776254ca1 100644 --- a/i18n/ja/docusaurus-plugin-content-docs/current/guides/integration-databricks.md +++ b/i18n/ja/docusaurus-plugin-content-docs/current/guides/integration-databricks.md @@ -10,6 +10,6 @@ displayed_sidebar: docs import Link from '@docusaurus/Link'; -# UID2 Databricks Clean Room Integration Guide +# UID2 Databricks Clean Rooms Integration Guide **COPY OF DATABRICKS DOC WILL GO HERE WHEN IT'S FINALIZED.** From 162dd054718608375a423bcd9abfc603571c7a46 Mon Sep 17 00:00:00 2001 From: genwhittTTD Date: Tue, 25 Nov 2025 17:17:10 -0500 Subject: [PATCH 07/16] edits --- docs/guides/integration-databricks.md | 2 +- .../current/guides/integration-databricks.md | 187 +++++++++++++++++- 2 files changed, 186 insertions(+), 3 deletions(-) diff --git a/docs/guides/integration-databricks.md b/docs/guides/integration-databricks.md index fae47ecab..5654a58c6 100644 --- a/docs/guides/integration-databricks.md +++ b/docs/guides/integration-databricks.md @@ -145,7 +145,7 @@ The following table provides information about the structure of the output data, | :--- | :--- | :--- | | `UID` | string | The value is one of the following:
  • **DII was successfully mapped**: The UID2 associated with the DII.
  • **Otherwise**: `NULL`.
| | `PREV_UID` | string | The value is one of the following:
  • **DII was successfully mapped and the current raw UID2 was rotated in the last 90 days**: the previous raw UID2.
  • **Otherwise**: `NULL`.
| -| `REFRESH_FROM` | timestamp | The value is one of the following:
  • **DII was successfully mapped**: The timestamp (in epoch seconds) indicating when this UID2 should be refreshed.
  • **Otherwise**: `NULL`.
| +| `REFRESH_FROM` | timestamp | The value is one of the following:
  • **DII was successfully mapped**: The timestamp indicating when this UID2 should be refreshed.
  • **Otherwise**: `NULL`.
| | `UNMAPPED` | string | The value is one of the following:
  • **DII was successfully mapped**: `NULL`.
  • **Otherwise**: The reason why the identifier was not mapped: `OPTOUT`, `INVALID IDENTIFIER`, or `INVALID INPUT TYPE`.
    For details, see [Values for the UNMAPPED Column](#values-for-the-unmapped-column).
| #### Values for the UNMAPPED Column diff --git a/i18n/ja/docusaurus-plugin-content-docs/current/guides/integration-databricks.md b/i18n/ja/docusaurus-plugin-content-docs/current/guides/integration-databricks.md index 776254ca1..5654a58c6 100644 --- a/i18n/ja/docusaurus-plugin-content-docs/current/guides/integration-databricks.md +++ b/i18n/ja/docusaurus-plugin-content-docs/current/guides/integration-databricks.md @@ -10,6 +10,189 @@ displayed_sidebar: docs import Link from '@docusaurus/Link'; -# UID2 Databricks Clean Rooms Integration Guide +# Databricks Clean Rooms Integration Guide -**COPY OF DATABRICKS DOC WILL GO HERE WHEN IT'S FINALIZED.** +This guide is for advertisers and data providers who want to convert their user data to raw UID2s in a Databricks environment. + +## Integration Overview + +[Databricks Clean Rooms](https://docs.databricks.com/aws/en/clean-rooms/) is a Databricks data warehousing solution, where you as a partner can store your data and integrate with the UID2 framework. Using Databricks Clean Rooms, UID2 enables you to securely share consumer identifier data without exposing sensitive directly identifying information (DII). + +In the context of UID2, you set up the Databricks Clean Rooms environment and place your data there. You set up a trust relationship with the UID2 Operator and allow the Operator to convert your data to raw UID2s. + +With UID2 supported in the clean room, advertisers and data partners can securely process their first-party data within Databricks. + +[**GWH__EE01 is it only first-party data, or just data? If they're just sending phone numbers and emails, I don't see what the difference is... it's just data?**] + +[**GWH__EE02 Please provide any additional content you want in the overview. Thx.**] + + + +## Functionality + +The following table summarizes the functionality available with the UID2 Databricks integration. + +| Encrypt Raw UID2 to UID2 Token for Sharing | Decrypt UID2 Token to Raw UID2 | Generate UID2 Token from DII | Refresh UID2 Token | Map DII to Raw UID2s | +| :--- | :--- | :--- | :--- | :--- | +| — | — | — | — | ✅ | + +## Key Benefits + +Here are some key benefits of integrating with Databricks for your UID2 processing: + +- Native support for managing UID2 workflows within a Databricks data clean room. +- Secure identity interoperability between partner datasets. +- Direct lineage and observability for all UID2-related transformations and joins, for auditing and traceability. +- Streamlined integration between UID2 identifiers and The Trade Desk activation ecosystem. +- Self-service support for marketers and advertisers through Databricks. + +## Integration Steps + +At a high level, the following are the steps to set up your Databricks integration and process your data: + +1. [Create a clean room for UID2 collaboration](#create-clean-room-for-uid2-collaboration). +1. [Send your Databricks sharing identifier to your UID2 contact](#send-sharing-identifier-to-uid2-contact). +1. [Add data to the clean room](#add-data-to-the-clean-room). +1. [Map DII](#map-dii) by running the clean room notebook. + +### Create Clean Room for UID2 Collaboration + +As a starting point, create a Databricks Clean Rooms environment—a secure environment for you to collaborate with UID2 to process your data. + +Follow the steps in [Create clean rooms](https://docs.databricks.com/aws/en/clean-rooms/create-clean-room) in the Databricks documentation. 
Use the correct sharing identifier based on the [UID2 environment](../getting-started/gs-environments) you want to connect to: see [UID2 Sharing Identifiers](#uid2-sharing-identifiers). + +:::important +After you've created a clean room, you cannot change its collaborators. If you have the option to set clean room collaborator aliases—for example, if you’re using the Databricks Python SDK to create the clean room—your collaborator alias must be `creator` and the UID2 collaborator alias must be `collaborator`. If you’re creating the clean room using the Databricks web UI, the correct collaborator aliases are set for you. +::: + +### Send Sharing Identifier to UID2 Contact + +To establish a relationship with your UID2 contact, you'll need to send the Databricks sharing identifier. + +The sharing identifier is a string in this format: `::`. + +Follow these steps: + +1. Find the sharing identifier for the Unity Catalog metastore that is attached to the Databricks workspace where you’ll work with the clean room. + + For information on how to find this value, see [Finding a Sharing Identifier](#finding-a-sharing-identifier). +1. Send the sharing identifier to your UID2 contact. + +### Add Data to the Clean Room + +Add one or more tables or views to the clean room. You can use any names for the schema, tables, and views. Tables and views must follow the schema detailed in [Input Table](#input-table ). + +### Map DII + +Run the `identity_map_v3` Databricks Clean Rooms [notebook](https://docs.databricks.com/aws/en/notebooks/) to map email addresses, phone numbers, or their respective hashes to raw UID2s. + +A successful notebook run results in raw UID2s populated in the output table. For details, see [Output Table](#output-table). + +## Running the Clean Rooms Notebook + +This section provides details to help you use your Databricks Clean Rooms environment to process your DII into raw UID2s, including the following: + +- [Notebook Parameters](#notebook-parameters) +- [Input Table](#input-table) +- [DII Format and Normalization](#dii-format-and-normalization) +- [Output Table](#output-table) +- [Output Table Schema](#output-table-schema) + +### Notebook Parameters + +You can use the `identity_map_v3` notebook to map DII in any table or view that you've added to the `creator` catalog of the clean room. + +The notebook has two parameters, `input_schema` and `input_table`. Together, these two parameters identify the table or view in the clean room that contains the DII to be mapped. + +For example, to map DII in the clean room table named `creator.default.emails`, set `input_schema` to `default` and `input_table` to `emails`. + +| Parameter Name | Description | +| :--- | :--- | +| `input_schema` | The schema containing the table or view. | +| `input_table` | The name you specify for the table or view containing the DII to be mapped. | + +### Input Table + +The input table or view must have the two columns shown in the following table. The table or view can have additional columns, but the notebook doesn't use any additional columns, only these two. + +| Column Name | Data Type | Description | +| :--- | :--- | :--- | +| `INPUT` | string | The DII to map. | +| `INPUT_TYPE` | string | The type of DII to map. Allowed values: `email`, `email_hash`, `phone`, and `phone_hash`. 
| + +### DII Format and Normalization + +The normalization requirements depend on the type of DII you're processing, as follows: + +- **Email address**: The notebook normalizes the data using the UID2 [Email Address Normalization](../getting-started/gs-normalization-encoding#email-address-normalization) rules. +- **Phone number**: You must normalize the phone number before mapping it with the notebook, using the UID2 [Phone Number Normalization](../getting-started/gs-normalization-encoding#phone-number-normalization) rules. + +### Output Table + +If the clean room has an output catalog, the mapped DII is written to a table in the output catalog. Output tables are stored for 30 days. + +For details, see [Overview of output tables](https://docs.databricks.com/aws/en/clean-rooms/output-tables#overview-of-output-tables) in the Databricks documentation. + +### Output Table Schema + +The following table provides information about the structure of the output data, including field names and values. + +| Column Name | Data Type | Description | +| :--- | :--- | :--- | +| `UID` | string | The value is one of the following:
  • **DII was successfully mapped**: The UID2 associated with the DII.
  • **Otherwise**: `NULL`.
| +| `PREV_UID` | string | The value is one of the following:
  • **DII was successfully mapped and the current raw UID2 was rotated in the last 90 days**: the previous raw UID2.
  • **Otherwise**: `NULL`.
| +| `REFRESH_FROM` | timestamp | The value is one of the following:
  • **DII was successfully mapped**: The timestamp indicating when this UID2 should be refreshed.
  • **Otherwise**: `NULL`.
| +| `UNMAPPED` | string | The value is one of the following:
  • **DII was successfully mapped**: `NULL`.
  • **Otherwise**: The reason why the identifier was not mapped: `OPTOUT`, `INVALID IDENTIFIER`, or `INVALID INPUT TYPE`.
    For details, see [Values for the UNMAPPED Column](#values-for-the-unmapped-column).
| + +#### Values for the UNMAPPED Column + +The following table shows possible values for the `UNMAPPED` column in the output table schema. + +| Value | Meaning | +| :--- | :--- | +| `NULL` | The DII was successfully mapped. | +| `OPTOUT` | The user has opted out. | +| `INVALID IDENTIFIER` | The email address or phone number is invalid. | +| `INVALID INPUT TYPE` | The value of `INPUT_TYPE` is invalid. Valid values for `INPUT_TYPE` are: `email`, `email_hash`, `phone`, `phone_hash`. | + +## Testing in the Integ Environment + +If you'd like to test the Databricks Clean Rooms implementation before signing a UID2 POC, you can ask your UID2 contact for access in the integ (integration) environment. This environment is for testing only, and has no production data. + +In the request, be sure to include your sharing identifier, and use the sharing identifier for the UID2 integration environment. For details, see [UID2 Sharing Identifiers](#uid2-sharing-identifiers). + +While you're waiting to hear back, you could create the clean room, invite UID2, and put your assets into the clean room. For details, see [Integration Steps](#integration-steps). + +When your access is ready, your UID2 contact notifies you. + +## Reference + +This section includes the following reference information: + +- [UID2 Sharing Identifiers](#uid2-sharing-identifiers) +- [Finding the Sharing Identifier for Your UID2 Contact](#finding-the-sharing-identifier-for-your-uid2-contact) + +### UID2 Sharing Identifiers + +UID2 sharing identifiers can change. Be sure to check this page for the latest sharing identifiers. + +| Environment | UID2 Sharing Identifier | +| :--- | :--- | +| Production | `aws:us-east-2:21149de7-a9e9-4463-b4e0-066f4b033e5d:673872910525611:010d98a6-8cf2-4011-8bf7-ca45940bc329` | +| Integration | `aws:us-east-2:4651b4ea-b29c-42ec-aecb-2377de70bbd4:2366823546528067:c15e03bf-a348-4189-92e5-68b9a7fb4018` | + +### Finding a Sharing Identifier + +To find the sharing identifier for your UID2 contact, follow these steps: + +In your Databricks workspace, in the Catalog Explorer, click **Catalog**. + +At the top, click the gear icon and select **Delta Sharing**. + +On the **Shared with me** tab, in the upper right, click your Databricks sharing organization and then select **Copy sharing identifier**. + +For details, see [Request the recipient's sharing identifier](https://docs.databricks.com/aws/en/delta-sharing/create-recipient#step-1-request-the-recipients-sharing-identifier) in the Databricks documentation. From 97f83484b26dc865638a43e6652f7c9e125b473e Mon Sep 17 00:00:00 2001 From: genwhittTTD Date: Wed, 26 Nov 2025 11:15:08 -0500 Subject: [PATCH 08/16] edits from MC --- docs/guides/integration-databricks.md | 10 ++++------ .../current/guides/integration-databricks.md | 10 ++++------ 2 files changed, 8 insertions(+), 12 deletions(-) diff --git a/docs/guides/integration-databricks.md b/docs/guides/integration-databricks.md index 5654a58c6..afc9c6764 100644 --- a/docs/guides/integration-databricks.md +++ b/docs/guides/integration-databricks.md @@ -16,15 +16,13 @@ This guide is for advertisers and data providers who want to convert their user ## Integration Overview -[Databricks Clean Rooms](https://docs.databricks.com/aws/en/clean-rooms/) is a Databricks data warehousing solution, where you as a partner can store your data and integrate with the UID2 framework. Using Databricks Clean Rooms, UID2 enables you to securely share consumer identifier data without exposing sensitive directly identifying information (DII). 
+[Databricks Clean Rooms](https://docs.databricks.com/aws/en/clean-rooms/) is Databricks feature that provides a secure and privacy-protecting environment for working on sensitive data. Using Databricks Clean Rooms, UID2 enables you to securely share consumer identifier data without exposing sensitive directly identifying information (DII). In the context of UID2, you set up the Databricks Clean Rooms environment and place your data there. You set up a trust relationship with the UID2 Operator and allow the Operator to convert your data to raw UID2s. -With UID2 supported in the clean room, advertisers and data partners can securely process their first-party data within Databricks. +With UID2 supported in the clean room, advertisers and data partners can securely process their data within Databricks. -[**GWH__EE01 is it only first-party data, or just data? If they're just sending phone numbers and emails, I don't see what the difference is... it's just data?**] - -[**GWH__EE02 Please provide any additional content you want in the overview. Thx.**] +[**GWH__EE Please provide any additional content you want in the overview. Thx.**]