# Overview of Managed Connectors in Databricks Lakeflow Connect

Databricks Lakeflow Connect provides managed connectors for ingesting data from SaaS applications and databases. The resulting ingestion pipeline is:

- **Governed by Unity Catalog**
- **Powered by serverless compute and Delta Live Tables (DLT)**

Managed connectors use efficient incremental reads and writes to make data ingestion **faster, more scalable, and more cost-efficient**, while keeping your data **fresh for downstream consumption**.

---

## SaaS Connector Components

A SaaS connector is modeled by the following components:

- **Connection**: A Unity Catalog securable object that stores authentication details for the application.
- **Ingestion pipeline**: Ingests the staged data into Delta tables. This component is modeled as a **serverless DLT pipeline**.

### SaaS Connector Components Diagram

![SaaS Connector Components](https://docs.databricks.com/aws/en/assets/images/saas-connector-components-799859c34970b8f86c6758f7ea510233.png)

---

## Database Connector Components

A database connector is modeled by the following components:

- **Connection**: A Unity Catalog securable object that stores authentication details for the database.
- **Gateway**: Extracts data from the source database and maintains the integrity of transactions during the transfer. For cloud-based databases, the gateway is configured as a **DLT pipeline with classic compute**.
- **Staging storage**: A Unity Catalog volume where data from the gateway is staged before being applied to a Delta table. The staging volume is created when you deploy the gateway and exists within the catalog and schema that you specify.
- **Ingestion pipeline**: Ingests the staged data into Delta tables. This component is modeled as a **serverless DLT pipeline**.

![Database Connector Components](https://docs.databricks.com/aws/en/assets/images/database-connector-components-cceed910a6b9ab41d2348ee68fd361df.png)
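
As a reading aid, the four components above can be modeled as plain data classes to make the flow of data explicit. This is only an illustrative sketch: the class names, fields, and the `data_path` helper are hypothetical and are not part of any Databricks SDK.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Connection:
    """Unity Catalog securable that stores authentication details."""
    name: str


@dataclass(frozen=True)
class Gateway:
    """Extracts data from the source database; runs on classic compute."""
    connection: Connection
    compute: str = "classic"


@dataclass(frozen=True)
class StagingVolume:
    """Unity Catalog volume where extracted data is staged."""
    catalog: str
    schema: str
    name: str

    @property
    def full_name(self) -> str:
        return f"{self.catalog}.{self.schema}.{self.name}"


@dataclass(frozen=True)
class IngestionPipeline:
    """Applies staged data to Delta tables; runs on serverless compute."""
    staging: StagingVolume
    compute: str = "serverless"


@dataclass(frozen=True)
class DatabaseConnector:
    gateway: Gateway
    staging: StagingVolume
    pipeline: IngestionPipeline

    def data_path(self) -> list[str]:
        """Return the hops that data takes from source to destination."""
        return [
            "source database",
            f"gateway ({self.gateway.compute} compute)",
            f"staging volume {self.staging.full_name}",
            f"ingestion pipeline ({self.pipeline.compute} compute)",
            "Delta tables",
        ]


# Hypothetical names, used only for illustration.
connection = Connection(name="sqlserver_conn")
staging = StagingVolume(catalog="main", schema="ingest", name="staging")
connector = DatabaseConnector(
    gateway=Gateway(connection=connection),
    staging=staging,
    pipeline=IngestionPipeline(staging=staging),
)
print(" -> ".join(connector.data_path()))
```

The gateway and the ingestion pipeline are kept as separate objects because they run on different compute and are linked only through the staging volume.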

## Managed SaaS connectors support column-level selection (API-only)

Managed connectors for enterprise applications (such as Salesforce, Workday, and ServiceNow) now support column-level selection and deselection. You can also specify whether to automatically ingest future columns as they're added to the source.
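
To make the interaction between column selection and future columns concrete, here is a minimal sketch of the behavior described above. It is not the connector's implementation: the function `columns_to_ingest` and its parameters are hypothetical, and it assumes that a column counts as "new" when it was not present in the source at configuration time.

```python
def columns_to_ingest(current_source_columns, columns_at_setup, selected,
                      include_new_columns):
    """Return the columns that would be ingested, in source order.

    - Columns in `selected` are always ingested.
    - Columns known at setup but not selected stay excluded.
    - Columns added to the source after setup are ingested only when
      `include_new_columns` is True.
    """
    selected_set = set(selected)
    known_at_setup = set(columns_at_setup)
    result = []
    for column in current_source_columns:
        is_new = column not in known_at_setup
        if column in selected_set or (is_new and include_new_columns):
            result.append(column)
    return result


# Columns that existed when the ingestion was configured.
at_setup = ["Id", "Name", "Industry", "Phone"]
# The source later gains a "Rating" column.
current = ["Id", "Name", "Industry", "Phone", "Rating"]
selected = ["Id", "Name", "Industry"]

without_new = columns_to_ingest(current, at_setup, selected, include_new_columns=False)
with_new = columns_to_ingest(current, at_setup, selected, include_new_columns=True)
print(without_new)
print(with_new)
```

In this sketch, `Phone` stays excluded in both cases because it was explicitly deselected, while `Rating` is picked up only when new columns are included.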

### New Feature: Configure Ingestion with Column Selection and Inclusion of New Columns

You can set this up in the Lakeflow pipelines UI by selecting the data source.

![Lakeflow UI](https://www.databricks.com/sites/default/files/inline-images/lakeflow-connect-video.gif?v=1718218999)


To configure the same ingestion through the API:

```python
import requests

# Replace these placeholders with your workspace URL, personal access token,
# and the ID of the Unity Catalog connection to the source application.
INSTANCE = "https://<databricks-instance>"
TOKEN = "<your-databricks-pat>"
CONNECTION_ID = "<your-connection-id>"

headers = {
    "Authorization": f"Bearer {TOKEN}",
    "Content-Type": "application/json"
}

# Example: ingest only specific columns from the "Account" table in Salesforce
payload = {
    "connection_id": CONNECTION_ID,
    "source": {
        "object": "Account",
        "columns": [
            {"name": "Id"},
            {"name": "Name"},
            {"name": "Industry"}
        ],
        "include_new_columns": False  # set to True to auto-ingest new columns in future
    },
    "target": {
        "schema_name": "salesforce",
        "table_name": "account_selected_columns"
    },
    "ingestion_type": "full"  # or "incremental"
}

# Send the request; the timeout keeps the call from hanging indefinitely.
response = requests.post(
    f"{INSTANCE}/api/2.0/lakeflow/ingestions",
    headers=headers,
    json=payload,
    timeout=30
)

# Print the HTTP status and the response body to confirm the result.
print(response.status_code)
print(response.json())
```


### 📌 Notes

- Setting `include_new_columns` to `true` (`True` in the Python payload) automatically ingests columns that are added to the source later.

### ✅ This works for connectors like:
- Salesforce  
- Workday  
- ServiceNow  

You can call this API repeatedly to:
- Create ingestions for multiple tables  
- Tweak column selections for each table  

If successful, the ingestion appears in the **Lakeflow UI** under:  
**`Connect > Ingestions`**
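
Calling the API once per table can be sketched with a small helper that builds one payload per table. The helper name `build_ingestion_payload` and the table specifications below are hypothetical. The payload shape simply mirrors the request shown earlier, so treat it as an assumption rather than a confirmed API contract.

```python
def build_ingestion_payload(connection_id, source_object, columns,
                            target_schema, target_table,
                            include_new_columns=False, ingestion_type="full"):
    """Build a request body with the same shape as the earlier example."""
    return {
        "connection_id": connection_id,
        "source": {
            "object": source_object,
            "columns": [{"name": name} for name in columns],
            "include_new_columns": include_new_columns,
        },
        "target": {
            "schema_name": target_schema,
            "table_name": target_table,
        },
        "ingestion_type": ingestion_type,
    }


# One entry per source table, each with its own column selection.
table_specs = [
    {"source_object": "Account", "columns": ["Id", "Name", "Industry"],
     "target_table": "account_selected_columns"},
    {"source_object": "Contact", "columns": ["Id", "Email"],
     "target_table": "contact_selected_columns",
     "include_new_columns": True},
]

payloads = [
    build_ingestion_payload(
        connection_id="<your-connection-id>",
        target_schema="salesforce",
        **spec,
    )
    for spec in table_specs
]

for body in payloads:
    print(body["source"]["object"], "->", body["target"]["table_name"])
```

Each payload can then be sent with `requests.post` in the same way as the single-table request. To adjust the column selection for one table, change only that table's entry in `table_specs`.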


### Recommended for downstream processing
- For Databricks Jobs (without DLT): You can handle schema drift by setting `.option("mergeSchema", "true")` when writing to a Delta table, which allows new columns to be added automatically.

- For DLT pipelines: DLT supports schema drift out of the box, and features such as expectations, `table()` definitions, and Unity Catalog governance make it easier to manage evolving schemas with built-in logging and monitoring.
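
For the Databricks Jobs case, a minimal sketch of an append with schema merging enabled might look like the following. The helper name `append_with_schema_merge` and the table name in the usage comment are hypothetical, and `df` is assumed to be a PySpark DataFrame in an environment where Delta Lake is available.

```python
def append_with_schema_merge(df, table_name):
    """Append `df` to a Delta table, adding any new columns to its schema.

    `df` is expected to be a PySpark DataFrame. With `mergeSchema` enabled,
    columns present in `df` but missing from the target table are added to
    the table schema as part of the write.
    """
    (
        df.write.format("delta")
        .mode("append")
        .option("mergeSchema", "true")
        .saveAsTable(table_name)
    )


# Usage inside a Databricks job, with a hypothetical table name:
# append_with_schema_merge(ingested_df, "main.salesforce.account_clean")
```

Wrapping the write in one function keeps the schema-merge option in a single place, so every downstream job applies it consistently.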