## Overview

There are many different ways to get data into Snowflake. Different use cases, requirements, team skillsets, and technology choices all contribute to making the right decision on how to ingest data. This quickstart will guide you through an example of the same data loaded with different methods:

*   SQL Inserts from the Python Connector
*   File Upload & Copy (Warehouse) from the Python Connector
*   File Upload & Copy (Snowpipe) using Python
*   File Upload & Copy (Serverless) from the Python Connector
*   Inserting Data from a Dataframe with Snowpark
*   From Kafka - in Snowpipe (Batch) mode
*   From Kafka - in Snowpipe Streaming mode
*   From Java SDK - Using the Snowflake Ingest Service

Prerequisites

*   Snowflake Account with the ability to create a User, Role, Database, Snowpipe, and Serverless Task, and to execute tasks
*   Familiarity with Python, Kafka, and/or Java
*   Basic knowledge of Docker
*   Ability to run Docker locally

Mac Requirements

*   [Docker](https://docs.docker.com/desktop/install/mac-install/) Installed
*   [Conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/macos.html) Installed

Linux Requirements

*   [Docker](https://docs.docker.com/engine/install/ubuntu/) Installed
*   [Conda](https://docs.conda.io/projects/conda/en/stable/user-guide/install/linux.html) Installed

Windows Requirements

*   [WSL with Ubuntu](https://learn.microsoft.com/en-us/windows/wsl/install) for Windows
*   [Docker](https://docs.docker.com/engine/install/ubuntu/) Installed in Ubuntu
*   [Conda](https://docs.conda.io/projects/conda/en/stable/user-guide/install/linux.html) Installed in Ubuntu


To create the environment needed, run the following in your shell, using the environment.yml found in the Snowflake Data Ingestion\python folder.

``` 
conda env create -f environment.yml
conda activate sf-ingest-examples 
```


Ok, with that done, we're going to start generating fake data.

Most of the ingest patterns in this guide can load data faster than the faker library can generate it, so it is best to run the data generation once and reuse the generated data across the different ingest patterns.

In your "Snowflake Data Ingestion\python\" folder, you will see data_generator.py. 

This code takes the number of tickets to create as an argument and outputs JSON data with one lift ticket (record) per line. The rest of the files in this guide will be put in this same directory.

To test this generator, run the following in your shell:

``` python ./data_generator.py 1 ```
You should see 1 record written to output.
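
To make the "one record per line" output format concrete, here is a minimal sketch of a generator with the same shape. It is not the actual data_generator.py: it uses only the standard library instead of faker, and the field names (`txid`, `resort`, `days`, `purchase_time`) and resort names are hypothetical placeholders.

```python
import json
import random
import uuid
from datetime import datetime, timezone

# Hypothetical values; the real generator's fields and values may differ.
RESORTS = ["Resort A", "Resort B", "Resort C"]


def make_ticket(rng):
    """Build one illustrative lift-ticket record as a dict."""
    return {
        "txid": str(uuid.UUID(int=rng.getrandbits(128))),
        "resort": rng.choice(RESORTS),
        "days": rng.randint(1, 7),
        "purchase_time": datetime.now(timezone.utc).isoformat(),
    }


def generate(count, seed=None):
    """Yield `count` records, each serialized as a single JSON line."""
    rng = random.Random(seed)
    for _ in range(count):
        # json.dumps never emits raw newlines, so each record stays on one line.
        yield json.dumps(make_ticket(rng))


# Print three sample records, one JSON object per line.
for line in generate(3, seed=42):
    print(line)
```

Keeping each record on its own line means the file can be split, streamed, or counted without parsing the whole document first.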

In order to quickly have data available for the rest of the guide, dump a lot of data to a file for re-use.

Run the following in your shell:

``` python ./data_generator.py 100000 | gzip > data.json.gz ```

You can increase or decrease the number of records to any value you like. The command writes the sample data to your current directory, but you can pick any folder. This file is used in subsequent steps, so note where you stored it and adjust the path later if needed.
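
Before moving on, it can help to confirm the file reads back cleanly. The following is a small sketch, assuming the file is gzipped JSON with one record per line as described above; `read_records` is a hypothetical helper name, and the demo writes a tiny temporary file so it runs without your real data.

```python
import gzip
import json
import os
import tempfile


def read_records(path):
    """Yield one dict per non-empty line of a gzipped JSON-lines file."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


# Demo: write two sample records to a temporary .gz file and read them back.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.json.gz")
with gzip.open(demo_path, "wt", encoding="utf-8") as out:
    out.write(json.dumps({"id": 1}) + "\n")
    out.write(json.dumps({"id": 2}) + "\n")

print(sum(1 for _ in read_records(demo_path)))
```

To check your own file, point `read_records` at the path where you stored `data.json.gz`.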

Next, we're going to create a new role, warehouse, database, schema, and user.

SET VAR_UNAME = (SELECT CURRENT_USER);
select $VAR_UNAME;

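-- Create the warehouse and role used throughout this guide.
CREATE OR REPLACE WAREHOUSE INGEST;
CREATE OR REPLACE ROLE INGEST;
GRANT USAGE ON WAREHOUSE INGEST TO ROLE INGEST;
GRANT OPERATE ON WAREHOUSE INGEST TO ROLE INGEST;
-- SNOW_INGESTION_WH is not created in this guide; these two grants assume it already exists in your account.
GRANT USAGE ON WAREHOUSE SNOW_INGESTION_WH TO ROLE INGEST;
GRANT OPERATE ON WAREHOUSE SNOW_INGESTION_WH TO ROLE INGEST;
-- Creating the database makes it the current database, so the schema below is created as INGEST.INGEST.
CREATE OR REPLACE DATABASE INGEST;
CREATE OR REPLACE SCHEMA INGEST;
GRANT OWNERSHIP ON DATABASE INGEST TO ROLE INGEST;
GRANT OWNERSHIP ON SCHEMA INGEST.INGEST TO ROLE INGEST;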

-- Change this password.
CREATE or REPLACE USER INGEST PASSWORD='<CHANGE_THIS>' LOGIN_NAME='INGEST' MUST_CHANGE_PASSWORD=FALSE, DISABLED=FALSE, DEFAULT_WAREHOUSE='INGEST', DEFAULT_NAMESPACE='INGEST.INGEST', DEFAULT_ROLE='INGEST';
GRANT ROLE INGEST TO USER INGEST;
GRANT ROLE INGEST TO USER IDENTIFIER($VAR_UNAME);

Now, we need to go back to our shell and run the following:

```
openssl genrsa 4096 | openssl pkcs8 -topk8 -inform PEM -out rsa_key.p8 -nocrypt
openssl rsa -in rsa_key.p8 -pubout -out rsa_key.pub
PUBK=`cat ./rsa_key.pub | grep -v KEY- | tr -d '\012'`
echo "ALTER USER INGEST SET RSA_PUBLIC_KEY='$PUBK';"
```


Copy the output of the last command above, paste it over the statement below (replacing what is there), and run it.

ALTER USER INGEST SET RSA_PUBLIC_KEY='<REPLACE>';

Now, you need to get the private key for the user by running the following in your shell.

```
PRVK=`cat ./rsa_key.p8 | grep -v KEY- | tr -d '\012'`
echo "PRIVATE_KEY=$PRVK"
```
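
The `grep -v KEY- | tr -d '\012'` pipeline used for both keys drops the PEM header and footer lines and joins the remaining base64 lines into one string. A rough Python equivalent, shown here only to illustrate what the pipeline does, might look like this (`strip_pem_armor` is a hypothetical helper name, and the sample key body is a dummy value, not a real key):

```python
def strip_pem_armor(pem_text):
    """Drop every line containing 'KEY-' and join the rest without newlines."""
    body_lines = [line for line in pem_text.splitlines() if "KEY-" not in line]
    return "".join(body_lines)


# Dummy PEM-shaped text; the body is a placeholder, not a usable key.
sample = (
    "-----BEGIN PRIVATE KEY-----\n"
    "QUJDREVGR0hJSktMTU5PUA\n"
    "UVJTVFVWV1hZWg==\n"
    "-----END PRIVATE KEY-----\n"
)

print(strip_pem_armor(sample))
```

The single-line form is what makes the key easy to paste into a SQL statement or a `KEY=value` line.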

Now, on to our last step to get things set up.

You should now have an rsa_key.p8 file and an rsa_key.pub file.

The final file we need to create is a .env file.

It uses the following format:

```
SNOWFLAKE_ACCOUNT=<ACCOUNT_HERE>
SNOWFLAKE_USER=INGEST
PRIVATE_KEY=<PRIVATE_KEY_HERE>
```

Your PRIVATE_KEY value is the output of the command above. To find your account, click your initials in the bottom left, hover over your account, and copy the "locator".
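
Because the PRIVATE_KEY value is stored without its PEM header, footer, or line breaks, code that reads the .env file has to put them back before a crypto library can parse the key. The sketch below shows one way this could be done with the standard library only; `parse_env` and `rebuild_pem` are hypothetical helper names, the sample values are dummies, and a real project might prefer a dedicated .env loader.

```python
def parse_env(text):
    """Parse KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first '=' only: base64 values may end in '=' padding.
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def rebuild_pem(body):
    """Wrap a single-line base64 body in PKCS#8 PEM armor, 64 chars per line."""
    chunks = [body[i:i + 64] for i in range(0, len(body), 64)]
    wrapped = "\n".join(chunks)
    return f"-----BEGIN PRIVATE KEY-----\n{wrapped}\n-----END PRIVATE KEY-----\n"


# Dummy .env content; the key body is a placeholder, not a real key.
sample_env = (
    "SNOWFLAKE_ACCOUNT=abc123\n"
    "SNOWFLAKE_USER=INGEST\n"
    "PRIVATE_KEY=" + "QUJD" * 20 + "QQ==\n"
)

env = parse_env(sample_env)
print(rebuild_pem(env["PRIVATE_KEY"]))
```

With a real key, the rebuilt PEM text could then be handed to a loader such as `serialization.load_pem_private_key` from the `cryptography` package; that step is left out here so the sketch has no third-party dependencies.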

### Make sure you protect your .env and .p8 file as those are credentials directly to the INGEST user.

