threadkeeper/purview-api-tools

Purview API Tools

All code was generated by Claude Opus 4.6

Automate Microsoft Purview Unified Catalog operations that aren't possible through the portal UI alone.


TL;DR — Bulk-add an entire Purview Data Map collection of assets to a Unified Catalog Data Product in one script run. No portal clicking required.

🗺️ What's in this repo

| File | Purpose |
| --- | --- |
| add-collection-to-product.ps1 | Interactive script that connects to your Purview account, discovers collections and their physical data assets, then programmatically registers and links them to a Data Product via the Data Governance API. |

🤔 Why does this exist?

The Microsoft Purview portal lets you add data assets to a Data Product one page at a time — there's no "Select All" button and no built-in bulk operation. For collections with tens or hundreds of assets, this is painfully slow.

This tool uses the Purview Data Governance API (2025-09-15-preview) to automate the entire process:

  1. Registers each Data Map asset as a governance data asset (POST /datagovernance/catalog/dataAssets)
  2. Links it to your chosen Data Product (POST /datagovernance/catalog/dataProducts/{id}/relationships?entityType=DATAASSET)

Assets already registered or linked are automatically skipped — safe to re-run.
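
As a rough illustration of the two calls above, the request URLs could be assembled as follows. This is a minimal Python sketch, not the script's own PowerShell. The base URL is a hypothetical placeholder, the `api-version` query parameter name is an assumption, and the request bodies are left out because this README does not document them.

```python
from urllib.parse import quote, urlencode

# API version named in this README.
API_VERSION = "2025-09-15-preview"


def register_asset_url(base_url: str) -> str:
    """Build the URL for POST /datagovernance/catalog/dataAssets."""
    # Assumption: the version is passed as an `api-version` query parameter.
    query = urlencode({"api-version": API_VERSION})
    return f"{base_url.rstrip('/')}/datagovernance/catalog/dataAssets?{query}"


def link_asset_url(base_url: str, data_product_id: str) -> str:
    """Build the URL for POST .../dataProducts/{id}/relationships."""
    query = urlencode({"api-version": API_VERSION, "entityType": "DATAASSET"})
    product = quote(data_product_id, safe="")  # escape the ID for use in a path
    return (
        f"{base_url.rstrip('/')}/datagovernance/catalog/dataProducts/"
        f"{product}/relationships?{query}"
    )


if __name__ == "__main__":
    # Hypothetical endpoint; substitute the endpoint of your own Purview account.
    base = "https://example.invalid"
    print(register_asset_url(base))
    print(link_asset_url(base, "abc-123"))
```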


⚓ Prerequisites

  • 🪟 Windows (PowerShell 5.1+)
  • ☁️ Azure CLI installed and logged in
  • 🔐 Purview roles — Data Curator or Data Product Owner on the collections and data products you're working with
  • 📦 curl.exe — included with Windows 10/11 by default
  • 🏢 Purview account with at least one scanned collection containing assets
  • 📋 Data Product already created in the Purview Unified Catalog portal

🏴‍☠️ Step-by-step guide

Step 1 — ☁️ Run the script

.\add-collection-to-product.ps1

That's it. The script walks you through the remaining steps interactively.

Step 2 — 🔐 Authenticate

The script checks if you're already logged in via Azure CLI. If not, it opens a browser for az login. You then select your subscription.
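
For readers who want to reproduce the token step outside the script, one possible approach is to ask the Azure CLI for a token scoped to the resource named in the Notes section (https://purview.azure.net). The Python sketch below assumes the token comes from `az account get-access-token`; whether the script uses that exact command is not stated in this README.

```python
import json
import shutil
import subprocess

# Token resource named in the Notes section of this README.
PURVIEW_RESOURCE = "https://purview.azure.net"


def token_command(az_path: str = "az", resource: str = PURVIEW_RESOURCE) -> list[str]:
    """Return the Azure CLI argument list that requests an access token."""
    return [az_path, "account", "get-access-token",
            "--resource", resource, "--output", "json"]


def parse_token(cli_output: str) -> str:
    """Extract the bearer token from the CLI's JSON output."""
    return json.loads(cli_output)["accessToken"]


def get_token() -> str:
    """Run the Azure CLI and return a token; requires a prior `az login`."""
    # shutil.which resolves az.cmd on Windows, where plain "az" is not an .exe.
    az_path = shutil.which("az")
    if az_path is None:
        raise RuntimeError("Azure CLI not found on PATH")
    result = subprocess.run(token_command(az_path),
                            capture_output=True, text=True, check=True)
    return parse_token(result.stdout)
```

The token would then be sent as an `Authorization: Bearer <token>` header on each API call.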

Step 3 — 🏢 Select Purview account

All Purview accounts in the subscription are listed. If there's only one, it's auto-selected.

Step 4 — 📂 Select Collection

Only collections that contain assets are shown, with asset counts:

Collections with assets:
  [1] Application Dev (id: ekiiom) - 58 assets
  [2] Data Analytics Dev (id: d8y2pf) - 31 assets

Press Enter to select the first one, or type a number.
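
The selection rule described above (empty input picks the first entry, otherwise a 1-based number) can be sketched in a few lines of Python. The function name and the error handling are illustrative assumptions, not the script's actual behavior on invalid input.

```python
def parse_choice(raw: str, count: int) -> int:
    """Map a menu answer to a zero-based index.

    An empty answer (just pressing Enter) selects the first entry;
    otherwise the answer must be a number between 1 and `count`.
    """
    if count < 1:
        raise ValueError("nothing to choose from")
    text = raw.strip()
    if text == "":
        return 0
    if not text.isdigit():
        raise ValueError(f"not a number: {text!r}")
    number = int(text)
    if not 1 <= number <= count:
        raise ValueError(f"choose a number between 1 and {count}")
    return number - 1


if __name__ == "__main__":
    collections = ["Application Dev", "Data Analytics Dev"]
    answer = input("Select a collection [1]: ")
    print(collections[parse_choice(answer, len(collections))])
```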

Step 5 — 🔍 Asset discovery

The script searches the collection and filters to physical data assets only (tables, files, datasets). Non-data objects like schemas, server instances, and databases are excluded.

Supported asset types include:

  • SQL: azure_sql_mi_table, azure_sql_table, azure_sql_view
  • Lake/Blob: azure_datalake_gen2_path, azure_blob_path
  • Synapse: azure_synapse_dedicated_sql_table
  • Fabric: fabric_lakehouse_table, fabric_warehouse_table
  • Files: parquet_file, csv_file, json_file, delta_table
  • And more (see $physicalDataTypes in the script)
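
The filtering step amounts to an allow-list check on each search hit's type. The Python sketch below copies the type names from the list above (the authoritative list is `$physicalDataTypes` in the script) and assumes each search hit is a dictionary with an `entityType` field; treat that field name as an assumption about the search response.

```python
# Type names copied from the list in this README; the script's own
# $physicalDataTypes array may contain more entries.
PHYSICAL_DATA_TYPES = frozenset({
    "azure_sql_mi_table", "azure_sql_table", "azure_sql_view",
    "azure_datalake_gen2_path", "azure_blob_path",
    "azure_synapse_dedicated_sql_table",
    "fabric_lakehouse_table", "fabric_warehouse_table",
    "parquet_file", "csv_file", "json_file", "delta_table",
})


def physical_assets(search_hits: list[dict]) -> list[dict]:
    """Keep only hits whose entityType is on the allow-list."""
    return [hit for hit in search_hits
            if hit.get("entityType") in PHYSICAL_DATA_TYPES]


if __name__ == "__main__":
    # Made-up sample hits for illustration only.
    hits = [
        {"name": "Customers", "entityType": "azure_sql_table"},
        {"name": "dbo", "entityType": "azure_sql_schema"},
        {"name": "orders.parquet", "entityType": "parquet_file"},
    ]
    for asset in physical_assets(hits):
        print(asset["name"])
```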

Step 6 — 📋 Select Data Product

Existing Data Products are listed with their current asset count:

Available data products:
  [1] Zava Mobile Application (Draft) - 3 assets currently assigned

Step 7 — 🚀 Bulk add assets

The script processes each asset:

  • Registers it in the governance layer if not already registered
  • Skips assets already linked to the data product
  • Links new assets via the Create Relationship API

  SKIP: Customers (already linked)
  ADDED: Countries_Archive
  ADDED: StateProvinces
  ADDED: Suppliers
  ...

────────────────────────────────────
  Results for 'Zava Mobile Application'
────────────────────────────────────
  Added:   45
  Skipped: 3 (already linked)
  Failed:  0
  Total:   48
────────────────────────────────────
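
The skip, register, link, and tally logic behind that summary could look roughly like the Python sketch below. It is a simplification under stated assumptions: a single ID identifies an asset in both the Data Map and the governance layer, and the `register` and `link` callables are hypothetical stand-ins for the two POST requests.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Results:
    added: int = 0
    skipped: int = 0
    failed: int = 0

    @property
    def total(self) -> int:
        return self.added + self.skipped + self.failed


def bulk_add(asset_ids: Iterable[str],
             registered: set[str],
             linked: set[str],
             register: Callable[[str], None],
             link: Callable[[str], None]) -> Results:
    """Link each asset to the data product, skipping work already done.

    `registered` and `linked` hold the IDs found before the run and are
    updated in place, which is what makes a second run skip them.
    """
    results = Results()
    for asset_id in asset_ids:
        if asset_id in linked:
            results.skipped += 1  # already linked: nothing to do
            continue
        try:
            if asset_id not in registered:
                register(asset_id)  # hypothetical wrapper around POST dataAssets
                registered.add(asset_id)
            link(asset_id)  # hypothetical wrapper around POST relationships
            linked.add(asset_id)
            results.added += 1
        except Exception:
            results.failed += 1  # count the failure and continue
    return results
```

Catching the exception per asset keeps one failed call from aborting the whole batch, which matches the separate "Failed" counter in the summary.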

🔧 API Reference

This tool uses the following Purview APIs:

| API | Version | Purpose |
| --- | --- | --- |
| GET /account/collections | 2019-11-01-preview | List Data Map collections |
| POST /catalog/api/search/query | 2022-08-01-preview | Search assets in a collection |
| GET /datagovernance/catalog/dataProducts | 2025-09-15-preview | List Unified Catalog data products |
| GET /datagovernance/catalog/dataAssets | 2025-09-15-preview | List governance data assets |
| POST /datagovernance/catalog/dataAssets | 2025-09-15-preview | Register a Data Map asset in governance layer |
| GET /datagovernance/catalog/dataProducts/{id}/relationships | 2025-09-15-preview | List data product relationships |
| POST /datagovernance/catalog/dataProducts/{id}/relationships | 2025-09-15-preview | Add asset to data product |

Note: The Data Governance API (2025-09-15-preview) is in preview. The script uses curl.exe for these calls to avoid PowerShell Invoke-RestMethod body encoding issues with the API.
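
As an example of how one row of the table might translate into a request, the Python sketch below builds the URL and JSON body for the collection search. The body fields (`keywords`, `limit`, `filter.collectionId`) are my understanding of the Data Map search API rather than something this README specifies, and the base URL is a hypothetical placeholder. The body is serialized to UTF-8 bytes explicitly so that the encoding is never left to a client's defaults.

```python
import json
from urllib.parse import urlencode

# Version listed for the search API in the table above.
SEARCH_API_VERSION = "2022-08-01-preview"


def search_url(base_url: str) -> str:
    """Build the URL for POST /catalog/api/search/query."""
    query = urlencode({"api-version": SEARCH_API_VERSION})
    return f"{base_url.rstrip('/')}/catalog/api/search/query?{query}"


def search_body(collection_id: str, limit: int = 50) -> bytes:
    """Serialize a search request restricted to one collection."""
    body = {
        "keywords": "*",  # assumption: matches every asset
        "limit": limit,
        "filter": {"collectionId": collection_id},  # assumed filter shape
    }
    return json.dumps(body).encode("utf-8")


if __name__ == "__main__":
    # Hypothetical endpoint; "ekiiom" is the sample collection ID from Step 4.
    print(search_url("https://example.invalid"))
    print(search_body("ekiiom", limit=10).decode("utf-8"))
```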


📝 Notes

  • 🔁 Idempotent — safe to run multiple times. Already-registered and already-linked assets are skipped.
  • 🏷️ Physical assets only — schemas, databases, server instances, and views are filtered out by default. Edit the $physicalDataTypes array to customize.
  • ⚠️ Preview API — the 2025-09-15-preview Data Governance API may change. Check Microsoft docs for updates.
  • 🔐 Auth scope — uses https://purview.azure.net as the token resource for all API calls.
