# Troubleshooting

- Connector Issues
- Common Issues
- Demo Deployment Issues
- Other Issues


When filing a new issue, please include associated log message(s) from Azure Functions. This will allow the core team to debug within our test environment to validate the issue and develop a solution. Before submitting a new issue, please review the known issues below, as well as the limitations which affect what sort of lineage can be collected.

To collect these log messages:

1. Open Azure Portal > Resource Group > Function App > Functions.
2. Select either the OpenLineageIn or PurviewOut function.
3. Click Monitor in the left-hand menu.
   1. Click the linked timestamps within Invocation Traces to view details of past events.
   2. Click the Logs tab to view live events. (Ensure both connected and timestamped welcome messages appear.)
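If you prefer a terminal, the same live log stream can usually be tailed with the Azure CLI (a sketch; the resource group and app name are placeholders):

```shell
# Stream the Function App's live log output
# (log streaming must be enabled on the app)
az webapp log tail \
  --resource-group <RESOURCE_GROUP_NAME> \
  --name <FUNCTION_APP_NAME>
```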
### Init Script Fails

After installing the OpenLineage init script and jar, the Databricks cluster will not start. You may receive a Databricks Event Log event indicating the init script failed.

Solution: Ensure that the init script uploaded to Databricks uses Line Feed (LF) line endings and not Carriage Return + Line Feed (CRLF). If you are using Windows, your development environment may default to CRLF, which is not accepted on Databricks. To fix this, download or edit your init script in an IDE that supports changing line endings. For example, VS Code and Notepad++ show the current line ending in the bottom-right corner of the window; select CRLF, change it to LF, save the file, and re-upload it to DBFS.

If the file already uses LF, confirm that the OpenLineage jar was uploaded properly. Ensure its location matches your init script (the expected location is `/dbfs/databricks/openlineage`) and that the file pattern is correct (the default is `openlineage-spark-*.jar`). If you uploaded the jar via the Databricks UI / DBFS UI, it may have replaced hyphens (-) with underscores (_), causing the wildcard pattern to fail.

In this case, use the Databricks CLI to upload the jar to the expected location, which preserves the file name.
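The jar check and re-upload can be sketched with the Databricks CLI (a hedged example: the jar version in the file name is an assumption; use the version that ships with your release):

```shell
# Verify what is actually at the expected location; an underscored
# file name here explains a failing openlineage-spark-*.jar wildcard
databricks fs ls dbfs:/databricks/openlineage/

# Re-upload via the CLI, which preserves the hyphenated name
databricks fs cp --overwrite openlineage-spark-0.13.0.jar \
  dbfs:/databricks/openlineage/openlineage-spark-0.13.0.jar
```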

### Internal Error Resolving Secrets

For the demo deployment, if your cluster fails with the errors "Internal Error resolving secrets" and "Failed to fetch secrets referred to in Spark Conf", the deployment script may have failed to add an Access Policy to the Azure Key Vault, or the secret scope was not created.

Solution: Update the values in the script below and execute it in the cloud shell. The script deletes the demo deployment's secret scope and then recreates it. After executing it, you should see an access policy for "AzureDatabricks" in your Azure Key Vault.

```bash
adb_ws_url=adb-DATABRICKS_WORKSPACE.ID.azuredatabricks.net
global_adb_token=$(az account get-access-token --resource 2ff814a6-3304-4ab8-85cb-cd0e6f879c1d -o tsv --query '[accessToken]')
adb_ws_id=/subscriptions/SUBSCRIPTION_ID/resourceGroups/RESOURCE_GROUP_NAME/providers/Microsoft.Databricks/workspaces/DATABRICKS_WORKSPACE_NAME
subscription_id=123acb-456-def
akv_name=AKV_NAME
akv_resource_id=/subscriptions/SUBSCRIPTION_ID/resourceGroups/RESOURCE_GROUP_NAME/providers/Microsoft.KeyVault/vaults/AKV_NAME

# Remove the secret scope if it exists
cat << EOF > delete-scope.json
{
  "scope": "purview-to-adb-kv"
}
EOF

curl \
  -X POST https://$adb_ws_url/api/2.0/secrets/scopes/delete \
  -H "Authorization: Bearer $global_adb_token" \
  -H "X-Databricks-Azure-Workspace-Resource-Id: $adb_ws_id" \
  --data @delete-scope.json

# If the delete call fails, that's okay;
# we just need a clean slate before recreating the scope.

cat << EOF > create-scope.json
{
  "scope": "purview-to-adb-kv",
  "scope_backend_type": "AZURE_KEYVAULT",
  "backend_azure_keyvault": {
    "resource_id": "$akv_resource_id",
    "dns_name": "https://$akv_name.vault.azure.net/"
  },
  "initial_manage_principal": "users"
}
EOF

curl \
  -X POST https://$adb_ws_url/api/2.0/secrets/scopes/create \
  -H "Authorization: Bearer $global_adb_token" \
  -H "X-Databricks-Azure-Workspace-Resource-Id: $adb_ws_id" \
  --data @create-scope.json
```
### Try Refreshing the Page

Microsoft Purview caches results to improve query response time, so you may not see new lineage in the Purview UI immediately.

Solution: Refresh the page using the Refresh button in the Purview UI. If the results still do not appear, return to the page after a minute or two and refresh again.

### Confirm the Azure Function Code Was Deployed

Navigate to the Azure Function and check whether you are receiving the alert `Microsoft.Azure.WebJobs.Extensions.FunctionMetadataLoader: The file 'C:\home\site\wwwroot\worker.config.json' was not found.`.

Solution: Restart (or stop and start) the function to resolve the issue. If the issue persists, consider deploying the function code via VS Code or the Azure CLI.
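As a sketch, the restart can also be done from the Azure CLI (resource group and app name are placeholders):

```shell
# Restart the Function App; if the alert persists afterwards,
# redeploy the function code instead
az functionapp restart \
  --resource-group <RESOURCE_GROUP_NAME> \
  --name <FUNCTION_SERVICE_NAME>
```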

### Check Azure Key Vault References Are Active

The connector uses Key Vault references inside the Azure Functions that translate OpenLineage to the Apache Atlas standard. When the services first launch, the Key Vault references may not yet have activated or synced; you will see red "x" marks in the Function App's Configuration menu.

Solution: Wait two to five minutes after deploying the connector and check back to confirm that the Function can communicate with the Key Vault. If the connection still shows red "x" marks, confirm that the Key Vault has an access policy for your Azure Function's managed identity.
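If the access policy is missing, one way to grant it is sketched below (a hedged example: the permission list is an assumption; grant only what your deployment actually needs):

```shell
# Look up the Function App's managed identity, then grant it
# permission to read secrets from the Key Vault
principal_id=$(az functionapp identity show \
  --resource-group <RESOURCE_GROUP_NAME> \
  --name <FUNCTION_SERVICE_NAME> \
  --query principalId -o tsv)

az keyvault set-policy \
  --name <AKV_NAME> \
  --object-id "$principal_id" \
  --secret-permissions get list
```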

### Confirm That Custom Types Are Available

If you have deployed the demo or connector solution, the custom types used by the connector may not have been uploaded.

Solution: Follow the post-installation step and confirm that the types are created. If the types have not been created already, you will receive a JSON payload of the created types (e.g. `{"enumDefs":[], "classificationDefs":[], "entityDefs":[...], ... }`). If the types already exist, you will receive a message indicating that they already exist.
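One way to spot-check whether a custom type exists is to query Purview's Atlas types endpoint directly (a sketch; the account name and type name are placeholders, and the token acquisition shown is an assumption based on the standard Purview data-plane resource):

```shell
# Acquire a token for the Purview data plane
purview_token=$(az account get-access-token \
  --resource https://purview.azure.net -o tsv --query accessToken)

# Ask Purview for a type definition by name; a 404 status
# means the custom type has not been uploaded yet
curl -s -o /dev/null -w "%{http_code}\n" \
  -H "Authorization: Bearer $purview_token" \
  "https://<PURVIEW_ACCOUNT>.purview.azure.com/catalog/api/atlas/v2/types/typedef/name/<CUSTOM_TYPE_NAME>"
```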

### Confirm That You Are NOT Using Spark Streaming

Spark Structured Streaming and Spark DStreams are not supported in this release of the solution accelerator.

### EventEmitter Exception in the Driver Logs

When reviewing the driver logs, you see an error in the Log4j output indicating that the EventEmitter class hit an exception and could not emit lineage. You also do not see any events in the OpenLineageIn or PurviewOut functions.

Solution: This indicates a problem connecting to the Azure Function from Databricks. Check the following:

- Confirm the `spark.openlineage.url.param.code` and `spark.openlineage.host` values are set and correct.
- Confirm that the Azure Function is currently on and has the correct API routes for OpenLineageIn.
- Confirm that `spark.openlineage.version` is set correctly:

  | SA Release | OpenLineage Jar | spark.openlineage.version |
  | --- | --- | --- |
  | 1.0.x | 0.8.2 | 1 |
  | 1.1.x | 0.8.2 | 1 |
  | 2.x.x or newer | 0.11.0 or newer | v1 |
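Putting these settings together, the cluster's Spark config might look like the fragment below (a sketch: the host and function key values are placeholders, and the `v1` version assumes a 2.x.x release of the solution accelerator):

```
spark.openlineage.host https://<FUNCTION_SERVICE_NAME>.azurewebsites.net
spark.openlineage.url.param.code <OPENLINEAGEIN_FUNCTION_KEY>
spark.openlineage.version v1
```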

### 403 Forbidden When Loading to Purview

When reviewing the PurviewOut function logs, you see an error indicating that assets could not be loaded into Microsoft Purview. The error looks similar to `Error Loading to Purview: Return Code: 403 - Reason:Forbidden`.

Solution: This indicates that authorization is not correct for the Service Principal. Confirm that:

- Your ClientId, ClientSecret, or Certificate values are correct.
- The Certificate is of the form: `{"SourceType": "KeyVault","KeyVaultUrl": "https://akv-service-name.vault.azure.net/","KeyVaultCertificateName": "myCertName"}`
- You have given the Service Principal permission to access Microsoft Purview (auth using a service principal).

### Unsupported Data Sources

You may be working with data sources that are supported by OpenLineage but not supported by the solution accelerator for ingestion into Purview.

Solution:

- Confirm that the `OlToPurviewMappings` app setting is populated and matches your release's version of `OlToPurviewMappings.json`.
- Review the Databricks driver logs and identify the namespace authority (e.g. `jdbc`, `abfss`, `dbfs`) for the OpenLineage inputs/outputs being emitted.
- Look for the JSON after the EventEmitter logging statement, then inspect the `inputs` / `outputs` fields in the emitted JSON.
- If the authority is not found in the `OlToPurviewMappings` JSON, it is an unsupported type. Consider modifying `OlToPurviewMappings` to your needs.
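For orientation, the emitted inputs/outputs look roughly like the fragment below (an illustrative, hand-written example rather than output captured from a real run); the authority is the scheme at the start of the `namespace` value:

```json
{
  "inputs": [
    { "namespace": "abfss://container@account.dfs.core.windows.net", "name": "/raw/customers" }
  ],
  "outputs": [
    { "namespace": "dbfs", "name": "/mnt/curated/customers" }
  ]
}
```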

### Service Principal Cannot Retrieve a Databricks Access Token

The Service Principal is unable to retrieve an access token for Databricks.

Solution: Ensure the Service Principal is a user in the Databricks workspace.

### OpenLineageIn or PurviewOut Functions Missing After Deployment

When running the `openlineage-deployment.sh` script or `newdeploymenttemp.json`, your deployment succeeds but the deployed Azure Function does not include the OpenLineageIn or PurviewOut functions.

Solution: First, attempt the deployment a second time; it may be an intermittent issue. If the Function service itself deployed successfully (including the app settings), consider manually deploying the Azure Function application code:

- Download the latest release zip file (see the list of all releases on GitHub).
- Use the Azure CLI to deploy the zip file to the Function service:

  ```bash
  az functionapp deployment source config-zip \
    -g <RESOURCE_GROUP_NAME> \
    -n <FUNCTION_SERVICE_NAME> \
    --src <PATH_TO_RELEASE_ZIP_FILE>
  ```

### `$'\r': command not found` When Running the Deployment Script

When running the `openlineage-deployment.sh` script, you received either `$'\r': command not found` or `syntax error near unexpected token $'\r'`. This can occur when you used git to clone the shell script (.sh) onto a Windows file system with git's `core.autocrlf` enabled. It will only occur if you cloned the repo locally rather than using the Bash cloud shell.

Solution: Use the Azure Cloud Shell to do your deployment. Alternatively, if you must use a Windows machine, use your preferred file editor to replace `\r\n` line endings with `\n`. Lastly, you may consider using the zip file download rather than the git clone option provided by GitHub.
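If you have a POSIX shell available, the line endings can also be stripped in place (a sketch, assuming GNU sed):

```shell
# Replace CRLF line endings with LF so bash can run the script
sed -i 's/\r$//' openlineage-deployment.sh
```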

### Jar Upload to the Cloud Shell Fails

When uploading the OpenLineage jar to the cloud shell, the upload may not complete due to a transient issue. You may receive a message such as `cannot access '*.jar': No such file or directory` when running `openlineage-deployment.sh`.

Solution: Attempt the jar upload to the cloud shell again and confirm the jar has successfully uploaded. Then re-run the `openlineage-deployment.sh` script.

### Could Not Be Added as a Purview Collection Admin

This means the person running the script could not be added as a Purview Collection Admin, either because of insufficient permissions or because Purview is still deploying additional backend resources.

Solution: If you don't have the correct permissions, consider asking an existing Purview Collection Admin to add you as a collection admin.

### Sample Notebook Missing from the Databricks Workspace

The demo deployment appears to be successful, but the sample notebooks in the Databricks workspace are either missing or return a `404: Not Found` in the Databricks UI.

Solution: Manually import the notebook found at `deployment/deployment-assets/openlineage_sample.scala`.
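The import can also be sketched with the Databricks CLI (assuming the legacy databricks-cli; the newer unified CLI uses a slightly different argument order, and the target workspace path here is an assumption):

```shell
# Import the sample notebook as a Scala source notebook
databricks workspace import \
  --language SCALA \
  --format SOURCE \
  deployment/deployment-assets/openlineage_sample.scala \
  /Users/<YOUR_USER>/openlineage_sample
```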

### VS Code Reports "Some projects have trouble loading"

When opening the cloned repo in VS Code, you may see a pop-up saying "Some projects have trouble loading". This is due to auth libraries not yet being specifically available for .NET 6.0.

Solution: You can safely ignore these warnings.

### SaveAsTable with Overwrite Mode Crashes the Driver

When using OpenLineage 0.11.0 with Databricks Runtime 10.4 and executing a command like `df.write.mode("overwrite").saveAsTable("default.mytable")`, the driver crashes. This is due to a bug in OpenLineage that did not separate out certain commands for Spark 3.2 vs. Spark 3.1.

Solution: Upgrade to OpenLineage 0.13.0, which includes a fix for the issue.