# Executing `PUT`, `CREATE TABLE`, and `COPY INTO` Commands Using Python

**USE CASE:**

Upload a large local CSV file, or a large number of files, to Snowflake so that users can then query the data using SQL.

**Q:** But why use Python? Why not just execute the PUT, CREATE, and COPY INTO commands using the SnowSQL CLI or the Snowflake web UI?<br>
**A:** There are situations where this process needs to be automated and/or scheduled. By issuing those commands *programmatically*, we no longer have to rely on a person to type and execute them manually.

**Q:** Why not use SQLAlchemy and load a pandas dataframe directly as a Snowflake table? It would be a quicker workflow and require far less code, since no prior DDL statements need to be executed.<br>
**A:** Pandas is an in-memory solution: it has to load the whole file into local memory first, so it would "choke" on a large CSV. With this PUT -> CREATE -> COPY INTO pattern, you are using Snowflake's compute and storage resources, which were designed for large data. The same pattern can also be used to upload a large batch of files.

#### Generally, the process takes 3 steps:

`PUT` -> `CREATE TABLE` -> `COPY INTO`
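
As a rough sketch of how the three steps fit together, the statements can be assembled in order before any of them is executed. The helper name `build_load_statements`, its parameters, and the `/tmp/data/cars.csv` path are hypothetical and only serve as illustration. The sketch assumes that `PUT` keeps its default gzip compression, so the staged file name ends in `.gz`.

```python
from pathlib import Path


def build_load_statements(local_file: Path, stage_path: str, table: str,
                          columns_ddl: str, file_format: str) -> list[str]:
    """Return the PUT, CREATE TABLE, and COPY INTO statements, in order."""
    # Assumption: PUT gzip-compresses the file (its default behavior),
    # so the staged file name gains a ".gz" suffix.
    staged_file = f"{stage_path}/{local_file.name}.gz"
    return [
        f"put file://{local_file} {stage_path}",
        f"create or replace table {table} ({columns_ddl})",
        f"copy into {table} from {staged_file} "
        f"file_format = (format_name = '{file_format}')",
    ]


statements = build_load_statements(
    local_file=Path("/tmp/data/cars.csv"),  # hypothetical local path
    stage_path="@~/test",
    table="cars",
    columns_ddl="car varchar(100), mpg number(6,1)",  # shortened for brevity
    file_format="csv_semicolon_format",
)
for statement in statements:
    print(statement)
```

Because the statements are built with string formatting, a helper like this should only ever receive trusted, hard-coded values and never user input.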

#### Imports of necessary Python libraries

In [1]:
from pathlib import Path
import configparser
import pandas as pd
import snowflake.connector as sfc

#### There are a few different approaches for providing secret credentials without publicly exposing them.  One approach is to obtain them from a config file saved locally on your machine.

In [2]:
config = configparser.ConfigParser()
config.read(Path.home() / '.config' / 'config.ini')
SF_USERNAME = config['snowflake']['username']
SF_PASSWORD = config['snowflake']['password']
SF_ACCOUNT = config['snowflake']['account']
SF_AUTHENTICATOR = config['snowflake']['authenticator']
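
For reference, here is a sketch of what such a config file might contain and how its keys could be validated up front. All values below are made up, and `externalbrowser` is only an assumed choice of authenticator. The check is useful because `config.read()` silently skips a missing file, so a problem would otherwise only surface later as a `KeyError`.

```python
import configparser

# Hypothetical file contents; real values would live in ~/.config/config.ini.
SAMPLE_CONFIG = """
[snowflake]
username = jane.doe@example.com
password = not-a-real-password
account = xy12345
authenticator = externalbrowser
"""

config = configparser.ConfigParser()
config.read_string(SAMPLE_CONFIG)  # read_string parses text instead of a file

# Fail early, with a clear message, if an expected key is absent.
required = ("username", "password", "account", "authenticator")
missing = [key for key in required if key not in config["snowflake"]]
if missing:
    raise KeyError(f"config.ini is missing keys: {missing}")

print(config["snowflake"]["authenticator"])
```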

#### Executing a `PUT` command using Python

**NOTE:** In the examples in other notebooks, we did not create a cursor object to execute queries. Since the following examples do use a cursor, we should ensure that it is closed if an error occurs. The connection object `con` is closed automatically by the `with` context manager, but the cursor is not. Therefore, I've added `try/finally` clauses to ensure that the `cur` object is always closed.

In [3]:
with sfc.connect(
    user=SF_USERNAME,
    password=SF_PASSWORD,
    account=SF_ACCOUNT,
    authenticator=SF_AUTHENTICATOR,
    database='your_db',
    schema='your_schema',
    warehouse='your_warehouse',
    role='your_role',
) as con:
    cur = con.cursor()
    try:
        cur.execute(r'put file://C:\Users\i33859\gitprojects\jupyter-sql\Snowflake\data\cars.csv @~/test;')
    finally:
        cur.close()

Initiating login request with your identity provider. A browser window should have opened for you to complete the login. If you can't see it, check existing browser windows, or your OS settings. Press CTRL+C to abort and try again...
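
Each cell in this notebook opens its own connection, so each one goes through the login step again. One possible alternative, sketched below, is to run several statements on a single connection and let `contextlib.closing` take the place of the `try/finally` block. The helper `run_statements` is a hypothetical name. The standard-library `sqlite3` module is used purely as a stand-in so that the sketch runs without a Snowflake account.

```python
import sqlite3
from contextlib import closing


def run_statements(con, statements):
    """Execute each statement on one connection, always closing the cursor."""
    # closing() calls cur.close() on exit, even if execute() raises,
    # which is the same guarantee the try/finally blocks provide.
    with closing(con.cursor()) as cur:
        for statement in statements:
            cur.execute(statement)


# sqlite3 is only a stand-in here; a connection returned by
# sfc.connect(...) would be passed to run_statements in the same way.
con = sqlite3.connect(":memory:")
run_statements(con, [
    "create table cars (car text, mpg real)",
    "insert into cars values ('Ford Torino', 17.0)",
])
with closing(con.cursor()) as cur:
    rows = cur.execute("select car, mpg from cars").fetchall()
print(rows)
con.close()
```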


#### Let's confirm that the CSV file was uploaded:

In [5]:
with sfc.connect(
    user=SF_USERNAME,
    password=SF_PASSWORD,
    account=SF_ACCOUNT,
    authenticator=SF_AUTHENTICATOR,
    database='your_db',
    schema='your_schema',
    warehouse='your_warehouse',
    role='your_role',
) as con:
    cur = con.cursor()
    try:
        cur.execute(
            """
            select distinct
                metadata$filename as file_path
            from
                @~/test;
            """
        )
        for (col1,) in cur:
            print(f"File path: {col1}")
    finally:
        cur.close()

File path: @~/test/cars.csv.gz


From the output above, we see that `cars.csv.gz` has been uploaded.
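
The `.gz` suffix appears because `PUT` gzip-compresses uncompressed files by default. In an automated job, the confirmation step could be turned into a check along the lines of the sketch below. The helper names are hypothetical, and the sketch assumes the default compression setting was left unchanged.

```python
from pathlib import PurePath


def expected_staged_name(local_file: str) -> str:
    """Name the file is expected to have in the stage after a default PUT."""
    name = PurePath(local_file).name
    # Assumption: the file is uncompressed and PUT's default gzip
    # compression is left on, so ".gz" is appended to the name.
    return f"{name}.gz"


def is_staged(local_file: str, staged_paths: list[str]) -> bool:
    """True if any staged path ends with the expected file name."""
    expected = expected_staged_name(local_file)
    return any(path.rsplit("/", 1)[-1] == expected for path in staged_paths)


staged_paths = ["@~/test/cars.csv.gz"]  # the value printed by the query above
print(is_staged("data/cars.csv", staged_paths))
```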

#### Create File Format and Create an Empty `cars` Table

In [8]:
with sfc.connect(
    user=SF_USERNAME,
    password=SF_PASSWORD,
    account=SF_ACCOUNT,
    authenticator=SF_AUTHENTICATOR,
    database='your_db',
    schema='your_schema',
    warehouse='your_warehouse',
    role='your_role',
) as con:
    cur = con.cursor()
    try:
        cur.execute(
            """
            -- Create CSV File Format with semicolon as delimiter
            create or replace file format csv_semicolon_format
                type = csv
                field_delimiter = ';'
                skip_header = 1
                null_if = ('NULL', 'null')
                empty_field_as_null = true
                compression = gzip;
            """
        )
        cur.execute(
            """
            -- Create "empty" cars table
            create or replace table cars (
                car varchar(100),
                mpg number(6,1),
                cylinders integer,
                displacement decimal(10,1),
                horsepower decimal(10,1),
                weight decimal(10,1),
                acceleration decimal(6,1),
                model varchar(50),
                origin varchar(50)
            );
            """
        )
    finally:
        cur.close()
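
To make the file format options more concrete, the sketch below applies roughly the same rules to a small, made-up sample using Python's `csv` module. It is only a local approximation of what the options mean, not Snowflake's parser, and the sample rows are invented for illustration.

```python
import csv
import io

# Made-up sample in the layout the file format expects: semicolon-delimited,
# one header row, and the literal text NULL for a missing value.
SAMPLE = """Car;MPG;Cylinders
Ford Torino;17.0;8
Unknown Car;NULL;
"""

NULL_MARKERS = {"NULL", "null"}  # mirrors null_if = ('NULL', 'null')

reader = csv.reader(io.StringIO(SAMPLE), delimiter=";")  # field_delimiter = ';'
next(reader)  # skip_header = 1
rows = [
    # empty_field_as_null = true: empty strings also become None
    [None if field in NULL_MARKERS or field == "" else field for field in record]
    for record in reader
]
print(rows)
```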


#### Finally, we will execute the `COPY INTO` command to load the uploaded `cars.csv.gz` file into the empty `cars` table

In [9]:
with sfc.connect(
    user=SF_USERNAME,
    password=SF_PASSWORD,
    account=SF_ACCOUNT,
    authenticator=SF_AUTHENTICATOR,
    database='your_db',
    schema='your_schema',
    warehouse='your_warehouse',
    role='your_role',
) as con:
    cur = con.cursor()
    try:
        cur.execute(r"copy into cars from @~/test/cars.csv.gz file_format = (format_name = 'csv_semicolon_format');")
    finally:
        cur.close()


#### Now we can query the `cars` table

In [11]:
with sfc.connect(
    user=SF_USERNAME,
    password=SF_PASSWORD,
    account=SF_ACCOUNT,
    authenticator=SF_AUTHENTICATOR,
    database='your_db',
    schema='your_schema',
    warehouse='your_warehouse',
    role='your_role',
) as con:
    sql = "SELECT * from cars"
    df = pd.read_sql(sql, con)
df.head()


| | CAR | MPG | CYLINDERS | DISPLACEMENT | HORSEPOWER | WEIGHT | ACCELERATION | MODEL | ORIGIN |
|---|---|---|---|---|---|---|---|---|---|
| 0 | Chevrolet Chevelle Malibu | 18.0 | 8 | 307.0 | 130.0 | 3504.0 | 12.0 | 70 | US |
| 1 | Buick Skylark 320 | 15.0 | 8 | 350.0 | 165.0 | 3693.0 | 11.5 | 70 | US |
| 2 | Plymouth Satellite | 18.0 | 8 | 318.0 | 150.0 | 3436.0 | 11.0 | 70 | US |
| 3 | AMC Rebel SST | 16.0 | 8 | 304.0 | 150.0 | 3433.0 | 12.0 | 70 | US |
| 4 | Ford Torino | 17.0 | 8 | 302.0 | 140.0 | 3449.0 | 10.5 | 70 | US |
