# Using Astronomer Airflow with Snowflake 

### Prerequisites
1) Valid Snowflake and Amazon S3 accounts.

2) The Astronomer CLI or a running instance of Airflow. (This guide was written for Airflow on Astronomer, but the same code should work on vanilla Airflow as well.)

To get set up with the CLI, see:
https://github.com/astronomerio/astro-cli

 

### Getting Started

Navigate to a project directory and run `astro airflow init` in a terminal.

This generates a skeleton project directory:
```
.
├── dags
│   └── example-dag.py
├── Dockerfile
├── include
├── packages.txt
├── plugins
└── requirements.txt
```
Clone this repository into your plugins folder:
https://github.com/airflow-plugins/snowflake_plugin

This gives you the Airflow plugins needed to interact with Snowflake. For a full list of community-contributed plugins, check out:
- https://github.com/apache/incubator-airflow/tree/master/airflow

- https://github.com/airflow-plugins/


### Start a Local Airflow Instance

Before you can spin up Airflow, you need to make sure your image builds with all of the dependencies needed to connect to Snowflake and Amazon S3.

Add the following to your `packages.txt` and `requirements.txt`:

packages.txt
    
    musl
    gcc
    make
    g++
    lz4-dev
    cyrus-sasl-dev
    openssl-dev
    python3-dev
    
requirements.txt
    
    azure-common==1.1.14
    azure-nspkg==2.0.0
    azure-storage==0.36.0
    ijson==2.3
    pycryptodome==3.6.4
    snowflake-connector-python==1.6.5
    

Now run

    astro airflow start

This should spin up a few Docker containers on your machine. Run `docker ps` and you should see something like:

```
CONTAINER ID        IMAGE                     COMMAND                  CREATED             STATUS              PORTS                                        NAMES
1fc88586da10        notebook/airflow:latest   "tini -- /entrypoint…"   5 seconds ago       Up 5 seconds        5555/tcp, 8793/tcp, 0.0.0.0:8080->8080/tcp   notebook_webserver_1
a1d02ea75c2b        notebook/airflow:latest   "tini -- /entrypoint…"   6 seconds ago       Up 1 second         5555/tcp, 8080/tcp, 8793/tcp                 notebook_scheduler_1
d0edb1f6c497        postgres:10.1-alpine      "docker-entrypoint.s…"   6 seconds ago       Up 6 seconds        0.0.0.0:5432->5432/tcp                       notebook_postgres_1
```

### Enter your Connection Credentials

This will be your local development environment. Navigate to `localhost:8080` to see your Airflow dashboard.

In the dashboard, go to Admin-->Connections-->Create and create a new connection for Snowflake. The `conn_id` you enter is the name your DAGs will use to refer to this connection.
<br> ![connection](img/snowflake_conn.png)
<br>

Do the same thing for your S3 connection.




### Write your DAG

Because the `snowflake_plugin` was added to the `plugins` directory, it can be imported as an Airflow plugin.

See the attached example for what this could look like. DAG files should go in the `dags` folder.
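As a rough illustration of what a DAG file in `dags` might contain, here is a minimal sketch that loads CSV files from S3 into Snowflake. To avoid guessing at the plugin's operator names, it does not use `snowflake_plugin`. Instead it calls `snowflake-connector-python` (already pinned in `requirements.txt`) from a plain `PythonOperator`. The connection IDs, table name, bucket URL, and the layout of the connection fields (account, warehouse, and database stored in the Snowflake connection's extras; the AWS key pair stored as the S3 connection's login and password) are all assumptions made for this example. The import paths are the Airflow 1.x ones and may differ on newer releases.

```python
from datetime import datetime, timedelta

# Hypothetical names: replace with the conn_id values you created in the
# Airflow UI and with your own table and bucket.
SNOWFLAKE_CONN_ID = "snowflake_conn"
S3_CONN_ID = "s3_conn"
TARGET_TABLE = "events"
S3_URL = "s3://my-bucket/events/"


def build_copy_sql(table, s3_url, aws_key_id, aws_secret_key):
    """Build a Snowflake COPY INTO statement that loads CSV files from S3.

    Values are interpolated directly, so only pass trusted input.
    """
    return (
        "COPY INTO {table} FROM '{url}' "
        "CREDENTIALS = (AWS_KEY_ID = '{key}' AWS_SECRET_KEY = '{secret}') "
        "FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)"
    ).format(table=table, url=s3_url, key=aws_key_id, secret=aws_secret_key)


def load_s3_to_snowflake():
    """Task callable: look up both connections by conn_id and run the COPY."""
    # Imported inside the task so they are only needed when the task runs.
    import snowflake.connector
    from airflow.hooks.base_hook import BaseHook

    sf = BaseHook.get_connection(SNOWFLAKE_CONN_ID)
    s3 = BaseHook.get_connection(S3_CONN_ID)
    extra = sf.extra_dejson  # assumed to hold "account", "warehouse", "database"

    conn = snowflake.connector.connect(
        user=sf.login,
        password=sf.password,
        account=extra.get("account"),
        warehouse=extra.get("warehouse"),
        database=extra.get("database"),
        schema=sf.schema,
    )
    try:
        # Assumes the S3 connection stores the AWS key pair as login/password.
        sql = build_copy_sql(TARGET_TABLE, S3_URL, s3.login, s3.password)
        conn.cursor().execute(sql)
    finally:
        conn.close()


try:
    from airflow import DAG
    from airflow.operators.python_operator import PythonOperator
except ImportError:
    # Airflow is not installed here (for example, when trying build_copy_sql
    # on its own), so skip the DAG definition.
    DAG = None

if DAG is not None:
    dag = DAG(
        dag_id="s3_to_snowflake_example",
        start_date=datetime(2018, 1, 1),
        schedule_interval="@daily",
        catchup=False,
        default_args={"retries": 1, "retry_delay": timedelta(minutes=5)},
    )

    load_task = PythonOperator(
        task_id="load_s3_to_snowflake",
        python_callable=load_s3_to_snowflake,
        dag=dag,
    )
```

Keeping the SQL construction in its own function makes it easy to check the generated statement without a running Airflow instance or a Snowflake account. The task itself only needs the two `conn_id` values to find its credentials.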


### Deploy your DAG

Once your DAG works locally and your Astronomer cluster is deployed, you can authenticate and start deploying.

Run

    astro auth login

You should be prompted to log in to your instance. Once you've authenticated, run

    astro airflow deploy
    
and choose which Airflow instance you want to deploy to.

The deploy packages the entire project directory (DAGs, plugins, and all the requirements and packages needed for the code to run) into a Docker image and pushes it to your Kubernetes cluster.

Once you have entered your connection credentials in the production instance, everything is good to go.

![dag_success](img/snowflake_dag.png)