# Connectors - Databricks Unity Catalog

[YData SDK](https://ydata.ai) integrates with Databricks Unity Catalog, allowing you to connect to, query, and keep data resources
aligned between [ydata-sdk](https://pypi.org/project/ydata-sdk/) and Unity Catalog. This section walks through the benefits,
setup, and usage of the Databricks connector within ydata-sdk.

### Benefits of Integration
Integrating ydata-sdk with Databricks offers several key benefits:

- **Enhanced Data Accessibility:** Seamlessly access and integrate previously siloed data.
- **Improved Data Quality:** Use YData's tools to enhance the quality of your data through data preparation and augmentation.
- **Scalability:** Leverage Databricks' robust infrastructure to scale data processing and AI workloads.
- **Streamlined Workflows:** Simplify data workflows with connectors and SDKs, reducing manual effort and potential errors.
- **Comprehensive Support:** Benefit from extensive documentation and support for both platforms, ensuring smooth integration and operation.

### Authenticate with your YData account

In [None]:
# Authenticate with your ydata-sdk token - https://dashboard.ydata.ai/
import os

os.environ['YDATA_LICENSE_KEY'] = '{add-your-key}'
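
Hard-coding the key in a notebook makes it easy to leak through version control or shared outputs. As a minimal sketch, assuming the key was exported in the shell before the notebook started, the cell can instead check that the variable is present and fail early if it is not. `require_env` is a hypothetical helper, not part of ydata-sdk, and the `setdefault` line exists only so the sketch runs on its own.

```python
import os


def require_env(name):
    """Return the value of an environment variable, failing early if it is unset or empty."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before starting the notebook")
    return value


# Demonstration only: a dummy value stands in for a real key when none is exported.
os.environ.setdefault("YDATA_LICENSE_KEY", "demo-key-for-illustration")

license_key = require_env("YDATA_LICENSE_KEY")
print("License key found:", bool(license_key))
```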

## Create a Unity Catalog connector

Databricks Unity Catalog builds on [Delta Sharing](https://www.databricks.com/product/delta-sharing), which helps keep catalogs aligned and also limits access to data. By using the Unity Catalog connector, users can access only the data assets that were authorized for a given Share.


In [None]:
from ydata.connectors import DatabricksUnityCatalog

SHARE_NAME='insert-share-name'
SCHEMA_NAME='insert-schema-name'
TABLE_NAME='insert-table-name'

connector = DatabricksUnityCatalog(profile='insert-file-path')
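
Delta Sharing clients usually authenticate with a profile file, a small JSON document issued by the data provider. The sketch below assumes that the `profile` argument above takes the path to such a file (an assumption, since the expected file is not described here) and checks for the fields a bearer-token profile carries before the path is handed to the connector. `check_profile` is a hypothetical helper, and the endpoint and token are placeholders.

```python
import json
import tempfile
from pathlib import Path

# Fields carried by a bearer-token Delta Sharing profile file.
REQUIRED_FIELDS = ("shareCredentialsVersion", "endpoint", "bearerToken")


def check_profile(path):
    """Parse a profile file and raise ValueError if a required field is missing."""
    profile = json.loads(Path(path).read_text(encoding="utf-8"))
    missing = [field for field in REQUIRED_FIELDS if field not in profile]
    if missing:
        raise ValueError(f"profile {path} is missing fields: {missing}")
    return profile


# Demonstration with a throwaway file and placeholder values.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "config.share"
    demo_path.write_text(
        json.dumps(
            {
                "shareCredentialsVersion": 1,
                "endpoint": "https://example.invalid/delta-sharing/",
                "bearerToken": "placeholder-token",
            }
        ),
        encoding="utf-8",
    )
    profile = check_profile(demo_path)
    print("Profile endpoint:", profile["endpoint"])
```

Running a check like this before creating the connector turns a malformed credentials file into a clear error message instead of a failure at the first request.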

## List the shares, schemas, and tables in your Delta Share

In [23]:
# List the available shares for the provided authentication
connector.list_shares()

['teste']

In [24]:
# List the available schemas for a given share
connector.list_schemas(share_name='teste')

['berka']

In [28]:
# List the available tables for a given schema in a share
connector.list_tables(schema_name='berka',
                      share_name='teste')

['transactions', 'trans_drift_metrics', 'trans_profile_metrics']

In [29]:
# List all the tables regardless of share and schema
connector.list_all_tables()

{'transactions': {'share': 'teste', 'schema': 'berka'},
 'trans_drift_metrics': {'share': 'teste', 'schema': 'berka'},
 'trans_profile_metrics': {'share': 'teste', 'schema': 'berka'}}
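
The mapping printed above is keyed by table name, which is awkward when you want to see what each share and schema contains. As a small sketch working on a plain dictionary of the same shape, the tables can be regrouped by their `(share, schema)` location. `group_by_location` is a hypothetical helper, not part of the connector.

```python
from collections import defaultdict


def group_by_location(catalog):
    """Group table names by (share, schema), given a list_all_tables()-style mapping."""
    grouped = defaultdict(list)
    for table_name, location in catalog.items():
        grouped[(location["share"], location["schema"])].append(table_name)
    # Sort the names so the result is stable regardless of input order.
    return {key: sorted(names) for key, names in grouped.items()}


# Same shape as the output shown above.
catalog = {
    "transactions": {"share": "teste", "schema": "berka"},
    "trans_drift_metrics": {"share": "teste", "schema": "berka"},
    "trans_profile_metrics": {"share": "teste", "schema": "berka"},
}

grouped = group_by_location(catalog)
for (share, schema), tables in grouped.items():
    print(f"{share}.{schema}: {', '.join(tables)}")
```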

## Read from your Delta Share

With the Unity Catalog connector, it is possible to:
- Get a full table from a Delta Share
- Get a sample of a table from a Delta Share
- Get the result of a query against a Delta Share

In [32]:
# Read all the data records in the table
table = connector.read_table(table_name='transactions',
                             schema_name='berka',
                             share_name='teste')
print(table)


Dataset

Shape: (1056320, 10)
Schema:
       Column Variable type
0    trans_id           int
1  account_id           int
2        date           int
3        type        string
4   operation        string
5      amount         float
6     balance         float
7    k_symbol        string
8        bank        string
9     account         float




In [33]:
# Read a sample whose number of records equals the provided sample_size
table_sample = connector.read_table_sample(table_name='transactions',
                                           schema_name='berka',
                                           share_name='teste',
                                           sample_size=1000)
print(table_sample)


Dataset

Shape: (1000, 10)
Schema:
       Column Variable type
0    trans_id           int
1  account_id           int
2        date           int
3        type        string
4   operation        string
5      amount         float
6     balance         float
7    k_symbol        string
8        bank        string
9     account         float
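
Because sampling is much cheaper than a full read, one possible workflow is to preview every shared table before deciding which ones to load in full. The sketch below combines `list_all_tables` and `read_table_sample` with the keyword arguments shown above. `sample_all_tables` is a hypothetical helper, and `StubConnector` is a stand-in so the example runs without credentials; with a real connection you would pass the `connector` object created earlier.

```python
class StubConnector:
    """Stand-in that mimics the two connector methods used below, for offline runs."""

    def list_all_tables(self):
        return {
            "transactions": {"share": "teste", "schema": "berka"},
            "trans_drift_metrics": {"share": "teste", "schema": "berka"},
        }

    def read_table_sample(self, table_name, schema_name, share_name, sample_size):
        # The real connector returns a Dataset; the stub returns a short description.
        return f"{share_name}.{schema_name}.{table_name} ({sample_size} rows)"


def sample_all_tables(connector, sample_size=1000):
    """Read a fixed-size sample of every table the connector can see."""
    samples = {}
    for table_name, location in connector.list_all_tables().items():
        samples[table_name] = connector.read_table_sample(
            table_name=table_name,
            schema_name=location["schema"],
            share_name=location["share"],
            sample_size=sample_size,
        )
    return samples


samples = sample_all_tables(StubConnector(), sample_size=100)
for name, sample in samples.items():
    print(name, "->", sample)
```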


