# Connectors - Snowflake

YData Fabric integrates with Snowflake, so you can connect to, query, and manage your Snowflake data
from within Fabric. This section covers the benefits of the integration and shows how to use
the Snowflake connector inside a Fabric Lab.

### Benefits of Integration
Integrating YData Fabric with Snowflake offers several key benefits:

- **Scalability:** Snowflake's architecture scales effortlessly with your data needs, while YData Fabric's tools ensure efficient data integration and management.
- **Performance:** Leveraging Snowflake's high performance for data querying and YData Fabric's optimization techniques enhances overall data processing speed.
- **Security:** Snowflake's robust security features, combined with YData Fabric's data governance capabilities, ensure your data remains secure and compliant.
- **Interoperability:** YData Fabric simplifies connecting to Snowflake, so you can start working with your data without extensive configuration. You also gain access to Fabric functionality such as data preparation with Python, synthetic data generation, and data profiling.


### Use a connector already created in the UI
On the Fabric home page, navigate to the *"Connectors"* menu. Locate the connector you want to use, click the three vertical dots on the right side, and select *"Use in Lab"* as shown in the image below.

<img src="img/snowflake_use_in_lab.png" alt="Use snowflake connector inside the lab" width="800"/>

In [1]:
# Import the Connectors class from YData's Labs package
from ydata.labs import Connectors

# Get a previously created connector by its UID and namespace
connector = Connectors.get(uid='9debcd27-a667-4df1-83ec-9ad7d1576d43',
                           namespace='5a79dccb-d50f-41ba-90d9-09adf2b26b9c')
print(connector)


SnowflakeConnector(
  
  uid='9debcd27-a667-4df1-83ec-9ad7d1576d43',
  name='Snowflake',
  type=ConnectorType.SNOWFLAKE,
  connection=Connection(host='kptngbj-wp65779', port=443),
  database=CARDIO,
  warehouse=VALIDATION)
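
The call above hardcodes the connector and namespace UIDs. As a minimal sketch, the identifiers could instead be read from environment variables and validated before use. The variable names `FABRIC_CONNECTOR_UID` and `FABRIC_NAMESPACE_UID` and the helper `connector_ids_from_env` are hypothetical and are not part of YData Fabric.

```python
import os
import uuid


def connector_ids_from_env(env=None):
    """Read connector and namespace UIDs from environment variables.

    FABRIC_CONNECTOR_UID and FABRIC_NAMESPACE_UID are hypothetical names
    chosen for this sketch; they are not defined by YData Fabric.
    """
    env = os.environ if env is None else env
    ids = {}
    for key, var in (("uid", "FABRIC_CONNECTOR_UID"),
                     ("namespace", "FABRIC_NAMESPACE_UID")):
        value = env.get(var)
        if not value:
            raise KeyError(f"environment variable {var} is not set")
        # Fail early on malformed identifiers instead of at request time.
        ids[key] = str(uuid.UUID(value))
    return ids
```

Assuming both variables are set, the result can be unpacked into the call shown above: `Connectors.get(**connector_ids_from_env())`.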


### Navigate your database

In [2]:
# List the available schemas (returns a list of schemas)
schemas = connector.list_schemas()

# Get the metadata of a database schema
schema = connector.get_database_schema('PATIENTS')

INFO: 2024-05-23 02:56:11,547 Snowflake Connector for Python Version: 3.10.0, Python Version: 3.10.12, Platform: Linux-5.10.186-179.751.amzn2.x86_64-x86_64-with-glibc2.35
INFO: 2024-05-23 02:56:11,549 This connection is in OCSP Fail Open Mode. TLS Certificates would be checked for validity and revocation status. Any other Certificate Revocation related exceptions or OCSP Responder failures would be disregarded in favor of connectivity.
INFO: 2024-05-23 02:56:12,841 Number of results in first chunk: 1
INFO: 2024-05-23 02:56:13,021 Number of results in first chunk: 1
INFO: 2024-05-23 02:56:13,223 Number of results in first chunk: 1
INFO: 2024-05-23 02:56:13,399 Number of results in first chunk: 1
INFO: 2024-05-23 02:56:13,588 Number of results in first chunk: 1
INFO: 2024-05-23 02:56:13,765 Number of results in first chunk: 1
INFO: 2024-05-23 02:56:13,980 Number of results in first chunk: 3
INFO: 2024-05-23 02:56:14,243 Number of results in first chunk: 1
INFO: 2024-05-23 02:56:14,425 Nu
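
Snowflake stores unquoted identifiers in upper case, so a schema created as `patients` is listed as `PATIENTS`. The sketch below looks up a schema name case-insensitively before requesting its metadata. It assumes that `list_schemas()` returns an iterable of schema names as strings, which the source does not confirm; `find_schema` is a hypothetical helper.

```python
def find_schema(connector, name):
    """Return the schema name as listed by the connector, or None.

    Assumes connector.list_schemas() returns an iterable of schema names
    (strings); adapt the comparison if it returns richer objects.
    """
    wanted = name.casefold()
    for schema in connector.list_schemas():
        if str(schema).casefold() == wanted:
            return str(schema)
    return None
```

With the connector from the first cell, `find_schema(connector, 'patients')` would give the name to pass to `connector.get_database_schema(...)`, or `None` if no such schema is listed.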

## Read from your Snowflake instance
With the Snowflake connector, you can:

- Get the data from a Snowflake table
- Get a sample from a Snowflake table
- Get the data returned by a query to a Snowflake instance
- Get the full data from a selected database

In [4]:
table = connector.get_table('cardio_test')
print(table)

INFO: 2024-05-23 02:56:29,927 Snowflake Connector for Python Version: 3.10.0, Python Version: 3.10.12, Platform: Linux-5.10.186-179.751.amzn2.x86_64-x86_64-with-glibc2.35
INFO: 2024-05-23 02:56:29,929 This connection is in OCSP Fail Open Mode. TLS Certificates would be checked for validity and revocation status. Any other Certificate Revocation related exceptions or OCSP Responder failures would be disregarded in favor of connectivity.
INFO: 2024-05-23 02:56:30,581 Number of results in first chunk: 1
INFO: 2024-05-23 02:56:30,763 Number of results in first chunk: 1
INFO: 2024-05-23 02:56:30,937 Number of results in first chunk: 1
INFO: 2024-05-23 02:56:31,108 Number of results in first chunk: 1
INFO: 2024-05-23 02:56:31,299 Number of results in first chunk: 0
INFO: 2024-05-23 02:56:32,688 Number of results in first chunk: 76
INFO: 2024-05-23 02:56:32,888 Number of results in first chunk: 0
INFO: 2024-05-23 02:56:33,080 Number of results in first chunk: 0
INFO: 2024-05-23 02:56:33,281 N

In [6]:
table_sample = connector.get_table_sample(table='cardio_test', sample_size=50)
print(table_sample)

INFO: 2024-05-23 02:57:44,693 Number of results in first chunk: 50
INFO: 2024-05-23 02:57:44,694 Snowflake Connector for Python Version: 3.10.0, Python Version: 3.10.12, Platform: Linux-5.10.186-179.751.amzn2.x86_64-x86_64-with-glibc2.35
INFO: 2024-05-23 02:57:44,695 This connection is in OCSP Fail Open Mode. TLS Certificates would be checked for validity and revocation status. Any other Certificate Revocation related exceptions or OCSP Responder failures would be disregarded in favor of connectivity.
INFO: 2024-05-23 02:57:45,480 Number of results in first chunk: 50
INFO: 2024-05-23 02:57:45,666 Number of results in first chunk: 1
INFO: 2024-05-23 02:57:45,851 Number of results in first chunk: 1
Dataset

Shape: (50, 12)
Schema:
         Column Variable type
0            id         float
1           age         float
2        height         float
3        weight         float
4         ap_hi         float
5         ap_lo         float
6   cholesterol         

In [8]:
query_output = connector.query("SELECT * FROM patients.cardio_test;")
print(query_output)

INFO: 2024-05-23 02:58:06,244 Number of results in first chunk: 1000
INFO: 2024-05-23 02:58:06,466 Number of results in first chunk: 1000
INFO: 2024-05-23 02:58:06,649 Number of results in first chunk: 1
INFO: 2024-05-23 02:58:06,839 Number of results in first chunk: 1
Dataset

Shape: (1000, 12)
Schema:
         Column Variable type
0            id           int
1           age           int
2        height           int
3        weight           int
4         ap_hi           int
5         ap_lo           int
6   cholesterol          bool
7          gluc          bool
8         smoke          bool
9          alco          bool
10       active          bool
11       cardio          bool
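
The query above selects every column of the table. When only a few columns or rows are needed, the SQL string can be assembled from validated identifiers. The following is a minimal sketch, not part of the connector API: `build_select` is a hypothetical helper, and it accepts only unquoted Snowflake identifiers (letters, digits, underscores, and dollar signs, not starting with a digit or dollar sign).

```python
import re

# Pattern for unquoted Snowflake identifiers.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_$]*")


def build_select(table, columns=None, schema=None, limit=None):
    """Build a SELECT statement from validated, unquoted identifiers."""
    names = [table] + list(columns or []) + ([schema] if schema else [])
    for name in names:
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"invalid identifier: {name!r}")
    column_sql = ", ".join(columns) if columns else "*"
    source = f"{schema}.{table}" if schema else table
    sql = f"SELECT {column_sql} FROM {source}"
    if limit is not None:
        limit = int(limit)
        if limit < 0:
            raise ValueError("limit must be non-negative")
        sql += f" LIMIT {limit}"
    return sql + ";"


print(build_select("cardio_test", schema="patients"))
# SELECT * FROM patients.cardio_test;
print(build_select("cardio_test", ["age", "cholesterol"], "patients", limit=100))
# SELECT age, cholesterol FROM patients.cardio_test LIMIT 100;
```

The resulting string can be passed to `connector.query(...)` in the same way as the literal query shown earlier.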




## Read a full database or a set of tables
If you need to replicate an entire database, or to join or merge full tables, the Snowflake connector can read either all tables within a schema or a specified set of tables. The following actions are possible:

- Read an entire database in either lazy or non-lazy mode.
- Read a specific set of tables.

#### Lazy mode
Lazy mode in YData Fabric's RDBMS connectors allows you to create an iterator that defers reading data from the database tables until an action is required. This approach optimizes performance and resource usage by loading data only when necessary.

When using lazy mode, the data is not immediately fetched from the database. Instead, the connector sets up an iterator that references the tables. Data is read only when you perform actions that require accessing the actual data, such as counting the number of rows, joining tables, or filtering data.
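
The deferred-read idea can be illustrated independently of Fabric. The sketch below is not YData Fabric's implementation: it uses Python's built-in `sqlite3` module as a stand-in database, and the `LazyTables` class is hypothetical. It keeps only table names at construction time and runs a query the first time a table is accessed.

```python
import sqlite3


class LazyTables:
    """Hold table names only; run a query when a table is first accessed."""

    def __init__(self, conn, table_names):
        self._conn = conn
        self._names = list(table_names)
        self.loaded = {}  # cache of tables that have actually been read

    def __iter__(self):
        # Iterating yields names, which does not touch the table data.
        return iter(self._names)

    def __getitem__(self, name):
        if name not in self._names:
            raise KeyError(name)
        if name not in self.loaded:
            # The read is deferred until this point.
            self.loaded[name] = self._conn.execute(
                f'SELECT * FROM "{name}"'
            ).fetchall()
        return self.loaded[name]


# Stand-in database with two small tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cardio_test (id INTEGER, age INTEGER)")
conn.execute("CREATE TABLE cardio_test2 (id INTEGER, smoke INTEGER)")
conn.executemany("INSERT INTO cardio_test VALUES (?, ?)", [(1, 50), (2, 61)])

tables = LazyTables(conn, ["cardio_test", "cardio_test2"])
print(list(tables))                # ['cardio_test', 'cardio_test2']
print(len(tables.loaded))          # 0, nothing has been read yet
print(len(tables["cardio_test"]))  # 2, this access triggers the read
print(sorted(tables.loaded))       # ['cardio_test']
```

Only the table that was accessed is fetched and cached; `cardio_test2` is never queried in this example.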

In [12]:
database = connector.read_database(lazy=True)
print(database)

INFO: 2024-05-23 03:03:48,611 Number of results in first chunk: 1
INFO: 2024-05-23 03:03:48,809 Number of results in first chunk: 7
INFO: 2024-05-23 03:03:48,991 Number of results in first chunk: 1
INFO: 2024-05-23 03:03:49,191 Number of results in first chunk: 0
INFO: 2024-05-23 03:03:50,297 Number of results in first chunk: 76
INFO: 2024-05-23 03:03:50,479 Number of results in first chunk: 0
INFO: 2024-05-23 03:03:50,671 Number of results in first chunk: 0
INFO: 2024-05-23 03:03:50,885 Number of results in first chunk: 1
INFO: 2024-05-23 03:03:51,093 Number of results in first chunk: 1
INFO: 2024-05-23 03:03:51,382 Number of results in first chunk: 1
INFO: 2024-05-23 03:03:51,595 Number of results in first chunk: 1
INFO: 2024-05-23 03:03:51,786 Number of results in first chunk: 1
INFO: 2024-05-23 03:03:51,973 Number of results in first chunk: 1
INFO: 2024-05-23 03:03:52,172 Number of results in first chunk: 1
INFO: 2024-05-23 03:03:52,351 Number of results in first chunk: 1
Multi

In [13]:
tables = connector.get_tables(tables=['cardio_test', 'cardio_test2'])
print(tables)


INFO: 2024-05-23 03:04:35,966 Snowflake Connector for Python Version: 3.10.0, Python Version: 3.10.12, Platform: Linux-5.10.186-179.751.amzn2.x86_64-x86_64-with-glibc2.35
INFO: 2024-05-23 03:04:35,968 This connection is in OCSP Fail Open Mode. TLS Certificates would be checked for validity and revocation status. Any other Certificate Revocation related exceptions or OCSP Responder failures would be disregarded in favor of connectivity.
INFO: 2024-05-23 03:04:36,679 Number of results in first chunk: 1
INFO: 2024-05-23 03:04:36,882 Number of results in first chunk: 1
INFO: 2024-05-23 03:04:37,064 Number of results in first chunk: 1
INFO: 2024-05-23 03:04:37,246 Number of results in first chunk: 1
INFO: 2024-05-23 03:04:37,429 Number of results in first chunk: 0
INFO: 2024-05-23 03:04:38,785 Number of results in first chunk: 76
INFO: 2024-05-23 03:04:38,980 Number of results in first chunk: 0
INFO: 2024-05-23 03:04:39,164 Number of results in first chunk: 0
INFO: 2024-05-23 03:04:39,351 N