# ![Snowflake connector](images/snowflake_connector_logo.png)
This example walks through the basics of reading and writing data with the Ray Snowflake connector.

## Connection properties
The Snowflake connection properties must be provided to the data source when it is created. The minimum required properties are `user`, `password`, `account` and `warehouse`. To use API keys instead of a password, the connector can also load Snowflake API keys: a key can be loaded from the file named by the `private_key_file` property, or passed directly via the `private_key` property. If the key is password protected, supply the password via the `pk_password` property. Optional properties such as `database` and `schema` can be provided at construction, or included in a fully qualified table name of the form `db.schema.table` when calling read or write operations with a table or subquery.

Below is an example of loading properties from the environment, and filtering them by the 'SNOWFLAKE_' prefix.

In [1]:
import os

# get properties from env, keeping only keys that start with the prefix
env_connect_props = {
    key.replace('SNOWFLAKE_', '', 1).lower(): value
    for key, value in os.environ.items() if key.startswith('SNOWFLAKE_')
}

# add db and schema in connect props
connect_props = dict(
    database = 'SNOWFLAKE_SAMPLE_DATA',
    schema = 'TPCH_SF1',
    warehouse='COMPUTE_WH',
    **env_connect_props
)

print('Connection properties:')
print('\n'.join(connect_props.keys()))

Connection properties:
database
schema
warehouse
account
private_key_file
pk_password
user
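
A prefix filter like the one above can be written as a small helper built on `str.startswith`, which avoids matching keys that merely contain `SNOWFLAKE_` somewhere in the name. The sketch below is illustrative only: `props_from_env` and the sample values are hypothetical, and a plain dict stands in for `os.environ` so the result is reproducible.

```python
def props_from_env(environ, prefix="SNOWFLAKE_"):
    """Return {lowercased key without prefix: value} for keys starting with prefix."""
    return {
        key[len(prefix):].lower(): value
        for key, value in environ.items()
        if key.startswith(prefix)
    }


# A stand-in environment (made-up values) used instead of os.environ.
fake_env = {
    "SNOWFLAKE_USER": "alice",
    "SNOWFLAKE_ACCOUNT": "xy12345",
    "MY_SNOWFLAKE_NOTES": "not a connection property",
    "PATH": "/usr/bin",
}

props = props_from_env(fake_env)
print(sorted(props))  # ['account', 'user']
```

In a real session, passing `os.environ` instead of `fake_env` would yield the same kind of dictionary as `env_connect_props`.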


## Reading from Snowflake
Ray uses Snowflake optimizations that allow query results to be read in parallel into a Ray cluster. The resulting Ray dataset is composed of Pandas dataframes that are spread across the Ray cluster to allow for the distributed operations required in machine learning.

![Snowflake read table](images/snowflake_read.png)

### Read from tables
To read an entire table into a Ray cluster, use the Ray data `read_snowflake` method. The code below reads a sample table from the Snowflake sample database.

In [2]:
from ray.data import read_snowflake

# read the entire table
ds = read_snowflake(connect_props, table='CUSTOMER') 

# display the first 3 results
print('count:',ds.count())
ds.limit(3).to_pandas()

2023-02-22 21:12:04,885	INFO worker.py:1242 -- Using address localhost:9031 set in the environment variable RAY_ADDRESS
2023-02-22 21:12:05,139	INFO worker.py:1360 -- Connecting to existing Ray cluster at address: 10.0.60.86:9031...
2023-02-22 21:12:05,147	INFO worker.py:1548 -- Connected to Ray cluster. View the dashboard at https://console.anyscale.com/api/v2/sessions/ses_vnmb5jgl4z6q98h61dx25rccju/services?redirect_to=dashboard
2023-02-22 21:12:05,152	INFO packaging.py:330 -- Pushing file package 'gcs://_ray_pkg_48402d355c084b8cd6a5fcc3d258b432.zip' (0.86MiB) to Ray cluster...
2023-02-22 21:12:05,165	INFO packaging.py:343 -- Successfully pushed file package 'gcs://_ray_pkg_48402d355c084b8cd6a5fcc3d258b432.zip'.


count: 150000


Read progress: 100%|██████████| 1/1 [00:00<00:00,  2.87it/s]
Read progress: 100%|██████████| 1/1 [00:00<00:00, 1048.58it/s]


|   | C_CUSTKEY | C_NAME | C_ADDRESS | C_NATIONKEY | C_PHONE | C_ACCTBAL | C_MKTSEGMENT | C_COMMENT |
|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Customer#000000001 | IVhzIApeRb ot,c,E | 15 | 25-989-741-2988 | 711.56 | BUILDING | to the even, regular platelets. regular, ironi... |
| 1 | 2 | Customer#000000002 | XSTf4,NCwDVaWNe6tEgvwfmRchLXak | 13 | 23-768-687-3665 | 121.65 | AUTOMOBILE | l accounts. blithely ironic theodolites integr... |
| 2 | 3 | Customer#000000003 | MG9kdTD2WBHm | 1 | 11-719-748-3364 | 7498.12 | AUTOMOBILE | deposits eat slyly ironic, even instructions.... |


### Read with a query
For more control over columns and rows read, as well as joining data from multiple tables, a query can be specified instead of a table name. 

In [3]:
QUERY = 'SELECT C_ACCTBAL, C_MKTSEGMENT FROM CUSTOMER WHERE C_ACCTBAL < 0'

# read the result of the query
ds2 = read_snowflake(connect_props, query=QUERY)

# display the first 3 results
print('count:',ds2.count())
ds2.limit(3).to_pandas()



count: 13692


Read progress: 100%|██████████| 1/1 [00:00<00:00,  2.63it/s]
Read progress: 100%|██████████| 1/1 [00:00<00:00, 931.86it/s]


|   | C_ACCTBAL | C_MKTSEGMENT |
|---|---|---|
| 0 | -485.69 | MACHINERY |
| 1 | -759.74 | FURNITURE |
| 2 | -12.97 | MACHINERY |


### Additional read parameters
For reading from Snowflake, the underlying Python API arguments are also available. The `timeout` and `params` arguments are passed to the [cursor execute method](https://docs.snowflake.com/en/user-guide/python-connector-api.html#object-cursor).

The code below uses `params` to supply the value that Snowflake binds to the `?` placeholder when executing the query.

In [4]:
QUERY = 'SELECT C_ACCTBAL, C_MKTSEGMENT FROM CUSTOMER WHERE C_ACCTBAL > ?'

ds3 = read_snowflake(connect_props, query=QUERY, params=[1000], timeout=1000)
print('count:',ds3.count())
ds3.limit(3).to_pandas()

count: 122881


Read progress: 100%|██████████| 1/1 [00:00<00:00,  2.62it/s]
Read progress: 100%|██████████| 1/1 [00:00<00:00, 1021.75it/s]


|   | C_ACCTBAL | C_MKTSEGMENT |
|---|---|---|
| 0 | 9957.56 | HOUSEHOLD |
| 1 | 2526.92 | BUILDING |
| 2 | 7975.22 | AUTOMOBILE |
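
To see what binding a value to a `?` placeholder means without a Snowflake account, the sketch below runs the same query shape against Python's built-in `sqlite3` module, whose cursor also accepts `?` placeholders. This is only a local stand-in: the table contents are made up, and nothing here goes through `read_snowflake` or Snowflake itself.

```python
import sqlite3

# Local stand-in table; the rows are made up for illustration.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE CUSTOMER (C_ACCTBAL REAL, C_MKTSEGMENT TEXT)")
con.executemany(
    "INSERT INTO CUSTOMER VALUES (?, ?)",
    [(9957.56, "HOUSEHOLD"), (-485.69, "MACHINERY"), (2526.92, "BUILDING")],
)

QUERY = "SELECT C_ACCTBAL, C_MKTSEGMENT FROM CUSTOMER WHERE C_ACCTBAL > ?"

# The driver binds 1000 to the '?' placeholder; no string formatting is involved.
rows = con.execute(QUERY, [1000]).fetchall()
con.close()

print(len(rows))  # 2
```

Passing values this way, rather than formatting them into the SQL string, keeps the query text constant and leaves quoting to the driver.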


## Writing
The Ray Snowflake connector uses the Snowflake API to write a dataset in parallel: each partition of the Ray dataset gets its own write task, and the tasks write to Snowflake concurrently.
![Snowflake write table](images/snowflake_write.png)

### Write to tables
To write a dataset into a Snowflake table, use the `write_snowflake` method of the dataset object. Repartition the dataset to set the number of write tasks.
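
The relationship between partitions and write tasks can be pictured with a standard-library sketch. It is not how Ray or the connector is implemented: `split_into_partitions` and `write_partition` are hypothetical helpers, threads stand in for Ray tasks, and the "write" only counts rows instead of contacting Snowflake.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_partitions(rows, num_partitions):
    """Split rows into num_partitions contiguous chunks of near-equal size."""
    size, extra = divmod(len(rows), num_partitions)
    partitions, start = [], 0
    for i in range(num_partitions):
        end = start + size + (1 if i < extra else 0)
        partitions.append(rows[start:end])
        start = end
    return partitions


def write_partition(partition):
    """Stand-in for one write task; a real task would send the rows to Snowflake."""
    return len(partition)


rows = list(range(10))
partitions = split_into_partitions(rows, 4)

# One write task per partition, run concurrently.
with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
    written = list(pool.map(write_partition, partitions))

print(len(partitions), sum(written))  # 4 10
```

Changing the partition count changes how many tasks run, while the total number of rows written stays the same.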

First, a new database and table need to be created using the Ray Snowflake connector or the native Snowflake API.

> Note: When using the Ray data connector, we can use the API key loading functionality built into the connector class.

In [5]:
from ray.data.datasource import SnowflakeConnector

write_connect_props = {
    **connect_props, 
    'database':'RAY_SAMPLE', 
    'schema':'PUBLIC'
}
with SnowflakeConnector(**write_connect_props) as con:
    # create destination database
    con.query('CREATE DATABASE IF NOT EXISTS RAY_SAMPLE')
    con.commit()
    
    # create destination table
    con.query('''
        CREATE OR REPLACE TABLE CUSTOMER_COPY 
        LIKE SNOWFLAKE_SAMPLE_DATA.TPCH_SF1.CUSTOMER
    ''')

The example below writes the previously read data into the new table created above.

In [6]:
# write the dataset to the table 
ds.write_snowflake(
    write_connect_props, 
    table='CUSTOMER_COPY'
)

# read the new table
ds4 = read_snowflake(write_connect_props, table='CUSTOMER_COPY')
print('count:',ds4.count())
ds4.limit(3).to_pandas()

2023-02-22 21:12:14,301	INFO bulk_executor.py:41 -- Executing DAG InputDataBuffer[Input] -> TaskPoolMapOperator[read->write]
read->write: 100%|██████████| 19/19 [00:11<00:00,  1.60it/s]
2023-02-22 21:12:26,201	INFO connection.py:280 -- Snowflake Connector for Python Version: 3.0.0, Python Version: 3.10.9, Platform: Linux-5.13.0-1025-aws-x86_64-with-glibc2.31
2023-02-22 21:12:26,202	INFO connection.py:974 -- This connection is in OCSP Fail Open Mode. TLS Certificates would be checked for validity and revocation status. Any other Certificate Revocation related exceptions or OCSP Responder failures would be disregarded in favor of connectivity.
2023-02-22 21:12:26,327	INFO cursor.py:727 -- query: [COMMIT]
2023-02-22 21:12:26,370	INFO cursor.py:740 -- query execution done
2023-02-22 21:12:26,371	INFO cursor.py:878 -- Number of results in first chunk: 1
2023-02-22 21:12:26,372	INFO connection.py:581 -- closed
2023-02-22 21:12:26,385	INFO connection.py:584 -- No async queries seem to be runn

count: 150000


Read progress: 100%|██████████| 1/1 [00:00<00:00,  4.01it/s]
Read progress: 100%|██████████| 1/1 [00:00<00:00, 966.65it/s]


|   | C_CUSTKEY | C_NAME | C_ADDRESS | C_NATIONKEY | C_PHONE | C_ACCTBAL | C_MKTSEGMENT | C_COMMENT |
|---|---|---|---|---|---|---|---|---|
| 0 | 87868 | Customer#000087868 | uPZ8cB41EapBpopIsIFmDo | 1 | 11-552-716-1697 | 1049.32 | FURNITURE | platelets affix quickly after the boldly iron... |
| 1 | 87869 | Customer#000087869 | nmpIurFWrzzqZgXGlIRdlDkXpJkEd | 17 | 27-765-230-3208 | 825.6 | AUTOMOBILE | ke regular accounts. express ac |
| 2 | 87870 | Customer#000087870 | kzLemw38LYB0djEi9kYMbUgQsgHiCCJG | 8 | 18-436-876-3101 | 2827.08 | BUILDING | regular asymptotes according to the slyly fin... |


### Additional write parameters
For writing to Snowflake, the native Snowflake API arguments are also available from the [write_pandas](https://docs.snowflake.com/en/user-guide/python-connector-api.html#module-snowflake-connector-pandas-tools) method. The following is a list of the parameters that may be useful:

- `auto_create_table`: When true, automatically creates a table with a column for each column in the passed-in DataFrame. The table is not created if it already exists.
- `overwrite`: When true and `auto_create_table` is also true, drops the table; otherwise, truncates the table. In both cases the existing contents of the table are replaced with those of the passed-in Pandas DataFrame.
- `table_type`: The type of the table to be created. The supported table types include `temp`/`temporary` and `transient`. Empty means a permanent table, as per SQL convention.
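
The interaction between `auto_create_table` and `overwrite` described above can be summarized as a small decision function. This is a sketch of the documented behavior only, not the library's code: `plan_write` is a hypothetical helper, the step names are illustrative, and the case of a missing table without `auto_create_table` is not modeled.

```python
def plan_write(table_exists, auto_create_table=False, overwrite=False):
    """Return the ordered steps implied by the parameter descriptions above."""
    steps = []
    if overwrite and table_exists:
        if auto_create_table:
            # overwrite together with auto_create_table drops the table
            steps.append("drop table")
            table_exists = False
        else:
            # overwrite alone keeps the table and removes its rows
            steps.append("truncate table")
    if auto_create_table and not table_exists:
        steps.append("create table")
    steps.append("insert rows")
    return steps


print(plan_write(table_exists=False, auto_create_table=True))
# ['create table', 'insert rows']
print(plan_write(table_exists=True, auto_create_table=True))
# ['insert rows']
print(plan_write(table_exists=True, overwrite=True))
# ['truncate table', 'insert rows']
```

Under this reading, writing to an existing table without `overwrite` appends rows to it.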

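In the example below, we use the `auto_create_table` parameter to create the output table before writing. Because the table is not recreated if it already exists, rerunning the cell appends rows to it, which is likely why the count shown below (1,800,000) is a multiple of the 150,000 rows in the source table.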

In [7]:
# write the dataset to the table, using an autocreated table
ds.write_snowflake(
    write_connect_props, 
    table='CUSTOMER_COPY_2',
    auto_create_table=True
)

# read the new table
ds5 = read_snowflake(write_connect_props, table='CUSTOMER_COPY_2')
print('count:',ds5.count())
ds5.limit(3).to_pandas()

2023-02-22 21:12:28,484	INFO bulk_executor.py:41 -- Executing DAG InputDataBuffer[Input] -> TaskPoolMapOperator[read->write]
read->write: 100%|██████████| 19/19 [00:12<00:00,  1.57it/s]
2023-02-22 21:12:40,591	INFO connection.py:280 -- Snowflake Connector for Python Version: 3.0.0, Python Version: 3.10.9, Platform: Linux-5.13.0-1025-aws-x86_64-with-glibc2.31
2023-02-22 21:12:40,592	INFO connection.py:974 -- This connection is in OCSP Fail Open Mode. TLS Certificates would be checked for validity and revocation status. Any other Certificate Revocation related exceptions or OCSP Responder failures would be disregarded in favor of connectivity.
2023-02-22 21:12:40,808	INFO cursor.py:727 -- query: [COMMIT]
2023-02-22 21:12:40,854	INFO cursor.py:740 -- query execution done
2023-02-22 21:12:40,855	INFO cursor.py:878 -- Number of results in first chunk: 1
2023-02-22 21:12:40,855	INFO connection.py:581 -- closed
2023-02-22 21:12:40,866	INFO connection.py:584 -- No async queries seem to be runn

count: 1800000


Read progress: 100%|██████████| 1/1 [00:00<00:00,  4.13it/s]
Read progress: 100%|██████████| 1/1 [00:00<00:00, 864.98it/s]


|   | C_CUSTKEY | C_NAME | C_ADDRESS | C_NATIONKEY | C_PHONE | C_ACCTBAL | C_MKTSEGMENT | C_COMMENT |
|---|---|---|---|---|---|---|---|---|
| 0 | 87868 | Customer#000087868 | uPZ8cB41EapBpopIsIFmDo | 1 | 11-552-716-1697 | 1049.32 | FURNITURE | platelets affix quickly after the boldly iron... |
| 1 | 87869 | Customer#000087869 | nmpIurFWrzzqZgXGlIRdlDkXpJkEd | 17 | 27-765-230-3208 | 825.6 | AUTOMOBILE | ke regular accounts. express ac |
| 2 | 87870 | Customer#000087870 | kzLemw38LYB0djEi9kYMbUgQsgHiCCJG | 8 | 18-436-876-3101 | 2827.08 | BUILDING | regular asymptotes according to the slyly fin... |


## Advanced Usage
If lower-level access to the Ray Snowflake connector is needed, the underlying `SnowflakeConnector` and `SnowflakeDatasource` can be used.

### Snowflake Connector
The `SnowflakeConnector` class holds the connection properties and logic required to establish a connection with Snowflake. Internally it calls the native Python Snowflake API in order to read from and write to Snowflake tables in parallel across the cluster. The datasource uses the Snowflake Python API's optimized `read_batch` and `write_pandas` methods to enable parallel reads and writes of data.

The connector is also a Python context manager, and uses `with` semantics to define when a connection is established, when database operations are committed, and when the connection is closed.

The code below will read from a sample table using the connector to manage the connection.

In [8]:
from ray.data.datasource import SnowflakeConnector

# query the number of rows, using the connection context to
# manage transactions
with SnowflakeConnector(**connect_props) as con:
    count = con.query_int('SELECT COUNT(*) FROM CUSTOMER')

print(count)

2023-02-22 21:12:44,437	INFO connection.py:280 -- Snowflake Connector for Python Version: 3.0.0, Python Version: 3.10.9, Platform: Linux-5.13.0-1025-aws-x86_64-with-glibc2.31
2023-02-22 21:12:44,438	INFO connection.py:974 -- This connection is in OCSP Fail Open Mode. TLS Certificates would be checked for validity and revocation status. Any other Certificate Revocation related exceptions or OCSP Responder failures would be disregarded in favor of connectivity.
2023-02-22 21:12:44,554	INFO cursor.py:727 -- query: [SELECT COUNT(*) FROM CUSTOMER]
2023-02-22 21:12:44,631	INFO cursor.py:740 -- query execution done
2023-02-22 21:12:44,632	INFO cursor.py:878 -- Number of results in first chunk: 1
2023-02-22 21:12:44,633	INFO cursor.py:727 -- query: [COMMIT]
2023-02-22 21:12:44,681	INFO cursor.py:740 -- query execution done
2023-02-22 21:12:44,682	INFO cursor.py:878 -- Number of results in first chunk: 1
2023-02-22 21:12:44,683	INFO connection.py:581 -- closed
2023-02-22 21:12:44,695	INFO conne

150000


Alternatively, you can use `try` blocks with the connector's `open`, `commit` and `close` methods. 

In [9]:
connector = SnowflakeConnector(**connect_props)
try:
    connector.open()
    count = connector.query_int('SELECT COUNT(*) FROM CUSTOMER')
finally:
    connector.close()
    
print(count)

2023-02-22 21:12:44,788	INFO connection.py:280 -- Snowflake Connector for Python Version: 3.0.0, Python Version: 3.10.9, Platform: Linux-5.13.0-1025-aws-x86_64-with-glibc2.31
2023-02-22 21:12:44,789	INFO connection.py:974 -- This connection is in OCSP Fail Open Mode. TLS Certificates would be checked for validity and revocation status. Any other Certificate Revocation related exceptions or OCSP Responder failures would be disregarded in favor of connectivity.
2023-02-22 21:12:44,910	INFO cursor.py:727 -- query: [SELECT COUNT(*) FROM CUSTOMER]
2023-02-22 21:12:44,976	INFO cursor.py:740 -- query execution done
2023-02-22 21:12:44,976	INFO cursor.py:878 -- Number of results in first chunk: 1
2023-02-22 21:12:44,977	INFO cursor.py:727 -- query: [COMMIT]
2023-02-22 21:12:45,023	INFO cursor.py:740 -- query execution done
2023-02-22 21:12:45,024	INFO cursor.py:878 -- Number of results in first chunk: 1
2023-02-22 21:12:45,025	INFO connection.py:581 -- closed
2023-02-22 21:12:45,039	INFO conne

150000
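
The open, commit and close lifecycle shared by the two styles above can be sketched with a small context manager. The class below is hypothetical and is not the connector's source: `SketchConnector` wraps the standard-library `sqlite3` module as a local stand-in, and committing only when the block exits without an exception is an assumption about reasonable behavior rather than something the connector is confirmed to do.

```python
import sqlite3


class SketchConnector:
    """Hypothetical connector: open on enter, commit on success, always close."""

    def __init__(self, database):
        self.database = database
        self.connection = None

    def open(self):
        self.connection = sqlite3.connect(self.database)

    def commit(self):
        self.connection.commit()

    def close(self):
        if self.connection is not None:
            self.connection.close()
            self.connection = None

    def query_int(self, sql):
        # Return the first column of the first row as an int.
        return int(self.connection.execute(sql).fetchone()[0])

    def __enter__(self):
        self.open()
        return self

    def __exit__(self, exc_type, exc, tb):
        try:
            if exc_type is None:
                self.commit()  # commit only when the block finished cleanly
        finally:
            self.close()  # close whether or not the block raised
        return False  # do not swallow exceptions


sketch = SketchConnector(":memory:")
with sketch as con:
    con.connection.execute("CREATE TABLE CUSTOMER (C_CUSTKEY INTEGER)")
    con.connection.executemany("INSERT INTO CUSTOMER VALUES (?)", [(1,), (2,), (3,)])
    count = con.query_int("SELECT COUNT(*) FROM CUSTOMER")

print(count)  # 3
```

The `with` form and the `try`/`finally` form express the same guarantee: the connection is closed even if a query raises.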


### Snowflake Datasource
The Snowflake datasource can be used with the Ray data `read_datasource` and `write_datasource` methods to read from and write to Snowflake databases using the distributed processing capabilities of Ray data. The datasource uses a `SnowflakeConnector` class that is derived from the `DBAPI2Connector` class.

Below is an example of creating the datasource using the previously defined connection properties, and then using it to read and write.

In [10]:
from ray.data import read_datasource
from ray.data.datasource import SnowflakeDatasource

# use read_datasource to read
ds = read_datasource(
    SnowflakeDatasource(connector), 
    table='CUSTOMER'
)
 
ds.limit(3).to_pandas()

Read progress: 100%|██████████| 1/1 [00:00<00:00,  3.27it/s]
Read progress: 100%|██████████| 1/1 [00:00<00:00, 1132.98it/s]


|   | C_CUSTKEY | C_NAME | C_ADDRESS | C_NATIONKEY | C_PHONE | C_ACCTBAL | C_MKTSEGMENT | C_COMMENT |
|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Customer#000000001 | IVhzIApeRb ot,c,E | 15 | 25-989-741-2988 | 711.56 | BUILDING | to the even, regular platelets. regular, ironi... |
| 1 | 2 | Customer#000000002 | XSTf4,NCwDVaWNe6tEgvwfmRchLXak | 13 | 23-768-687-3665 | 121.65 | AUTOMOBILE | l accounts. blithely ironic theodolites integr... |
| 2 | 3 | Customer#000000003 | MG9kdTD2WBHm | 1 | 11-719-748-3364 | 7498.12 | AUTOMOBILE | deposits eat slyly ironic, even instructions.... |


In [11]:
# use write_datasource to write
connector = SnowflakeConnector(**write_connect_props)
datasource = SnowflakeDatasource(connector)
ds.write_datasource(
    datasource, 
    table='CUSTOMER_3',
    auto_create_table=True
)

2023-02-22 21:12:46,588	INFO bulk_executor.py:41 -- Executing DAG InputDataBuffer[Input] -> TaskPoolMapOperator[read->write]
read->write: 100%|██████████| 19/19 [00:08<00:00,  2.18it/s]
2023-02-22 21:12:55,326	INFO connection.py:280 -- Snowflake Connector for Python Version: 3.0.0, Python Version: 3.10.9, Platform: Linux-5.13.0-1025-aws-x86_64-with-glibc2.31
2023-02-22 21:12:55,327	INFO connection.py:974 -- This connection is in OCSP Fail Open Mode. TLS Certificates would be checked for validity and revocation status. Any other Certificate Revocation related exceptions or OCSP Responder failures would be disregarded in favor of connectivity.
2023-02-22 21:12:55,457	INFO cursor.py:727 -- query: [COMMIT]
2023-02-22 21:12:55,501	INFO cursor.py:740 -- query execution done
2023-02-22 21:12:55,502	INFO cursor.py:878 -- Number of results in first chunk: 1
2023-02-22 21:12:55,502	INFO connection.py:581 -- closed
2023-02-22 21:12:55,520	INFO connection.py:584 -- No async queries seem to be runn

### DML and DDL
The connector can also be used for any DDL or DML operations you would normally execute through the Snowflake Python API. These operations just pass through to the underlying Snowflake API. 

The code below creates the objects needed for writing to tables. Note that a commit is issued between the two queries so that the first DDL operation takes effect before the dependent one runs. An alternative is to use two `with` blocks to define the transaction boundaries.

In [12]:
with connector as con:
    con.query('CREATE DATABASE IF NOT EXISTS RAY')
    con.commit()
    con.query('''
        CREATE OR REPLACE TABLE RAY.PUBLIC.CUSTOMER_COPY
            LIKE SNOWFLAKE_SAMPLE_DATA.TPCH_SF1.CUSTOMER
    ''')

2023-02-22 21:12:55,605	INFO connection.py:280 -- Snowflake Connector for Python Version: 3.0.0, Python Version: 3.10.9, Platform: Linux-5.13.0-1025-aws-x86_64-with-glibc2.31
2023-02-22 21:12:55,607	INFO connection.py:974 -- This connection is in OCSP Fail Open Mode. TLS Certificates would be checked for validity and revocation status. Any other Certificate Revocation related exceptions or OCSP Responder failures would be disregarded in favor of connectivity.
2023-02-22 21:12:55,734	INFO cursor.py:727 -- query: [CREATE DATABASE IF NOT EXISTS RAY]
2023-02-22 21:12:55,775	INFO cursor.py:740 -- query execution done
2023-02-22 21:12:55,776	INFO cursor.py:878 -- Number of results in first chunk: 1
2023-02-22 21:12:55,777	INFO cursor.py:727 -- query: [COMMIT]
2023-02-22 21:12:55,851	INFO cursor.py:740 -- query execution done
2023-02-22 21:12:55,852	INFO cursor.py:878 -- Number of results in first chunk: 1
2023-02-22 21:12:55,853	INFO cursor.py:727 -- query: [CREATE OR REPLACE TABLE RAY.PUBLI

### Pandas data mapping
The Snowflake Datasource converts Pandas data types using the Snowflake Python Connector API. Data mappings are available from the Snowflake [documentation](https://docs.snowflake.com/en/user-guide/python-connector-pandas.html#snowflake-to-pandas-data-mapping). 

The code below is an example of reading and writing all the available data formats.

In [13]:
with connector as con:
    con.query("""
        CREATE OR REPLACE TABLE SAMPLE_TABLE (
            ID INT,
            SAMPLE_NUMBER NUMBER(6,2),
            SAMPLE_DECIMAL DECIMAL(8,3),
            SAMPLE_FLOAT FLOAT,
            SAMPLE_VARCHAR VARCHAR,
            SAMPLE_BINARY BINARY,
            SAMPLE_INT INT,
            SAMPLE_DATE DATE,
            SAMPLE_TIME TIME,
            SAMPLE_TIMESTAMP_TZ TIMESTAMP_TZ,
            SAMPLE_TIMESTAMP_NTZ TIMESTAMP_NTZ,
            SAMPLE_TIMESTAMP_LTZ TIMESTAMP_LTZ,
            SAMPLE_GEOGRAPHY GEOGRAPHY,
            SAMPLE_VARIANT VARIANT,
            SAMPLE_ARRAY ARRAY,
            SAMPLE_OBJECT OBJECT
        )
    """)
    con.commit()
    con.query("""
        INSERT INTO SAMPLE_TABLE 
        VALUES (
            0,
            1111.11,
            22222.222,
            3.333333333,
            '4444444444',
            '01ffeeddaa',
            6666,
            TO_DATE('2007-07-07'),
            TO_TIME('08:00:00.000'),
            TO_TIMESTAMP_TZ('2009-07-08 08:00:00'),
            TO_TIMESTAMP_NTZ('2010-07-08 08:00:00.000'),
            TO_TIMESTAMP_LTZ('2011-07-08 08:00:00.000'),
            'POINT(-122.35 37.55)',
            NULL,
            NULL,
            NULL
        )
    """)
    con.query("""UPDATE SAMPLE_TABLE SET SAMPLE_VARIANT = to_variant(parse_json('{"key3": "value3", "key4": "value4"}'))""")
    con.query("UPDATE SAMPLE_TABLE SET SAMPLE_ARRAY = [1,'two',3,4]")
    con.query("UPDATE SAMPLE_TABLE SET SAMPLE_OBJECT = {'thirteen':13, 'zero':0}")

sample = read_snowflake(write_connect_props, table='SAMPLE_TABLE')
sample.to_pandas()

2023-02-22 21:12:56,241	INFO connection.py:280 -- Snowflake Connector for Python Version: 3.0.0, Python Version: 3.10.9, Platform: Linux-5.13.0-1025-aws-x86_64-with-glibc2.31
2023-02-22 21:12:56,242	INFO connection.py:974 -- This connection is in OCSP Fail Open Mode. TLS Certificates would be checked for validity and revocation status. Any other Certificate Revocation related exceptions or OCSP Responder failures would be disregarded in favor of connectivity.
2023-02-22 21:12:56,377	INFO cursor.py:727 -- query: [CREATE OR REPLACE TABLE SAMPLE_TABLE ( ID INT, SAMPLE_NUMBER NUMBER(6,2), SAMPLE...]
2023-02-22 21:12:56,515	INFO cursor.py:740 -- query execution done
2023-02-22 21:12:56,516	INFO cursor.py:878 -- Number of results in first chunk: 1
2023-02-22 21:12:56,517	INFO cursor.py:727 -- query: [COMMIT]
2023-02-22 21:12:56,567	INFO cursor.py:740 -- query execution done
2023-02-22 21:12:56,568	INFO cursor.py:878 -- Number of results in first chunk: 1
2023-02-22 21:12:56,568	INFO cursor.p

Unnamed: 0,ID,SAMPLE_NUMBER,SAMPLE_DECIMAL,SAMPLE_FLOAT,SAMPLE_VARCHAR,SAMPLE_BINARY,SAMPLE_INT,SAMPLE_DATE,SAMPLE_TIME,SAMPLE_TIMESTAMP_TZ,SAMPLE_TIMESTAMP_NTZ,SAMPLE_TIMESTAMP_LTZ,SAMPLE_GEOGRAPHY,SAMPLE_VARIANT,SAMPLE_ARRAY,SAMPLE_OBJECT
0,0,1111.11,22222.222,3.333333,4444444444,b'\x01\xff\xee\xdd\xaa',6666,2007-07-07,08:00:00,2009-07-08 08:00:00-07:00,2010-07-08 08:00:00,2011-07-08 08:00:00-07:00,"{\n ""coordinates"": [\n -122.35,\n 37.55...","{\n ""key3"": ""value3"",\n ""key4"": ""value4""\n}","[\n 1,\n ""two"",\n 3,\n 4\n]","{\n ""thirteen"": 13,\n ""zero"": 0\n}"


The code below writes the sample data back to Snowflake:

In [14]:
new_sample = sample.drop_columns(['SAMPLE_BINARY']) # binary column write does not work in Snowflake API
new_sample.write_snowflake(
    write_connect_props, 
    table='SAMPLE_TABLE_DEST', 
    auto_create_table=True
)
read_snowflake(
    write_connect_props, 
    table='SAMPLE_TABLE_DEST'
).to_pandas()

2023-02-22 21:12:59,780	INFO bulk_executor.py:41 -- Executing DAG InputDataBuffer[Input] -> TaskPoolMapOperator[read->MapBatches(<lambda>)->write]
read->MapBatches(<lambda>)->write: 100%|██████████| 1/1 [00:02<00:00,  2.34s/it]
2023-02-22 21:13:02,129	INFO connection.py:280 -- Snowflake Connector for Python Version: 3.0.0, Python Version: 3.10.9, Platform: Linux-5.13.0-1025-aws-x86_64-with-glibc2.31
2023-02-22 21:13:02,130	INFO connection.py:974 -- This connection is in OCSP Fail Open Mode. TLS Certificates would be checked for validity and revocation status. Any other Certificate Revocation related exceptions or OCSP Responder failures would be disregarded in favor of connectivity.
2023-02-22 21:13:02,261	INFO cursor.py:727 -- query: [COMMIT]
2023-02-22 21:13:02,305	INFO cursor.py:740 -- query execution done
2023-02-22 21:13:02,306	INFO cursor.py:878 -- Number of results in first chunk: 1
2023-02-22 21:13:02,307	INFO connection.py:581 -- closed
2023-02-22 21:13:02,319	INFO connection.

Unnamed: 0,ID,SAMPLE_NUMBER,SAMPLE_DECIMAL,SAMPLE_FLOAT,SAMPLE_VARCHAR,SAMPLE_INT,SAMPLE_DATE,SAMPLE_TIME,SAMPLE_TIMESTAMP_TZ,SAMPLE_TIMESTAMP_NTZ,SAMPLE_TIMESTAMP_LTZ,SAMPLE_GEOGRAPHY,SAMPLE_VARIANT,SAMPLE_ARRAY,SAMPLE_OBJECT
0,0,1111.11,22222.222,3.333333,4444444444,6666,2007-07-07,08:00:00,2009-07-08 15:00:00,1278576000000000,2011-07-08 15:00:00,"{\n ""coordinates"": [\n -122.35,\n 37.55...","{\n ""key3"": ""value3"",\n ""key4"": ""value4""\n}","[\n 1,\n ""two"",\n 3,\n 4\n]","{\n ""thirteen"": 13,\n ""zero"": 0\n}"
1,0,1111.11,22222.222,3.333333,4444444444,6666,2007-07-07,08:00:00,2009-07-08 15:00:00,1278576000000000,2011-07-08 15:00:00,"{\n ""coordinates"": [\n -122.35,\n 37.55...","{\n ""key3"": ""value3"",\n ""key4"": ""value4""\n}","[\n 1,\n ""two"",\n 3,\n 4\n]","{\n ""thirteen"": 13,\n ""zero"": 0\n}"
2,0,1111.11,22222.222,3.333333,4444444444,6666,2007-07-07,08:00:00,2009-07-08 15:00:00,1278576000000000,2011-07-08 15:00:00,"{\n ""coordinates"": [\n -122.35,\n 37.55...","{\n ""key3"": ""value3"",\n ""key4"": ""value4""\n}","[\n 1,\n ""two"",\n 3,\n 4\n]","{\n ""thirteen"": 13,\n ""zero"": 0\n}"
3,0,1111.11,22222.222,3.333333,4444444444,6666,2007-07-07,08:00:00,2009-07-08 15:00:00,1278576000000000,2011-07-08 15:00:00,"{\n ""coordinates"": [\n -122.35,\n 37.55...","{\n ""key3"": ""value3"",\n ""key4"": ""value4""\n}","[\n 1,\n ""two"",\n 3,\n 4\n]","{\n ""thirteen"": 13,\n ""zero"": 0\n}"
4,0,1111.11,22222.222,3.333333,4444444444,6666,2007-07-07,08:00:00,2009-07-08 15:00:00,1278576000000000,2011-07-08 15:00:00,"{\n ""coordinates"": [\n -122.35,\n 37.55...","{\n ""key3"": ""value3"",\n ""key4"": ""value4""\n}","[\n 1,\n ""two"",\n 3,\n 4\n]","{\n ""thirteen"": 13,\n ""zero"": 0\n}"
5,0,1111.11,22222.222,3.333333,4444444444,6666,2007-07-07,08:00:00,2009-07-08 15:00:00,1278576000000000,2011-07-08 15:00:00,"{\n ""coordinates"": [\n -122.35,\n 37.55...","{\n ""key3"": ""value3"",\n ""key4"": ""value4""\n}","[\n 1,\n ""two"",\n 3,\n 4\n]","{\n ""thirteen"": 13,\n ""zero"": 0\n}"
6,0,1111.11,22222.222,3.333333,4444444444,6666,2007-07-07,08:00:00,2009-07-08 15:00:00,1278576000000000,2011-07-08 15:00:00,"{\n ""coordinates"": [\n -122.35,\n 37.55...","{\n ""key3"": ""value3"",\n ""key4"": ""value4""\n}","[\n 1,\n ""two"",\n 3,\n 4\n]","{\n ""thirteen"": 13,\n ""zero"": 0\n}"
7,0,1111.11,22222.222,3.333333,4444444444,6666,2007-07-07,08:00:00,2009-07-08 15:00:00,1278576000000000,2011-07-08 15:00:00,"{\n ""coordinates"": [\n -122.35,\n 37.55...","{\n ""key3"": ""value3"",\n ""key4"": ""value4""\n}","[\n 1,\n ""two"",\n 3,\n 4\n]","{\n ""thirteen"": 13,\n ""zero"": 0\n}"
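
In the round-trip output above, `SAMPLE_TIMESTAMP_NTZ` comes back as the integer `1278576000000000`, and the two time-zone-aware columns show `15:00:00` instead of `08:00:00-07:00`. The sketch below checks one plausible reading using only the standard library: it assumes the integer counts microseconds since the Unix epoch and that the zone-aware values were converted to UTC. Both are inferences from the output shown here, not documented behavior of the connector.

```python
from datetime import datetime, timedelta, timezone

# SAMPLE_TIMESTAMP_NTZ value as it appears in the round-trip output above.
raw_ntz = 1278576000000000

# Assumption: the integer counts microseconds since 1970-01-01 00:00:00.
decoded_ntz = datetime(1970, 1, 1) + timedelta(microseconds=raw_ntz)
print(decoded_ntz)  # 2010-07-08 08:00:00

# SAMPLE_TIMESTAMP_TZ was read as 08:00:00 at UTC offset -07:00.
offset = timezone(timedelta(hours=-7))
original_tz = datetime(2009, 7, 8, 8, 0, 0, tzinfo=offset)
as_utc = original_tz.astimezone(timezone.utc)
print(as_utc.replace(tzinfo=None))  # 2009-07-08 15:00:00
```

Under these assumptions the values are unchanged by the round trip; only their representation differs.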
