## Snowflake Data Load Notebook
This notebook creates the required Snowflake objects, stages a CSV from GitHub, and loads it into a table – all using **Snowpark for Python**.
**Prerequisites**
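1. The `snowflake-snowpark-python`, `requests`, and `python-dotenv` packages are installed (they are imported in the code cells below).
2. Environment variables with your connection info are set in the kernel/session, or in a `.env` file that `load_dotenv()` picks up:
   * `SNOWFLAKE_ACCOUNT` – your account identifier, found under Account Details in Snowflake. It should look like the `account` value in the example below.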
   * `SNOWFLAKE_USER` 
   * `SNOWFLAKE_PASSWORD`

[connections.my_example_connection]
- account = "XXXX-XXXXX"
- user = "HUSEYN"
- role = "ACCOUNTADMIN"
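
Before opening a session, it can help to fail fast when one of the variables listed above is missing. The following is a minimal sketch, not part of Snowpark: `REQUIRED_VARS` and `missing_env_vars` are hypothetical names introduced here for illustration.

```python
import os

# Variable names taken from the prerequisites list above.
REQUIRED_VARS = ("SNOWFLAKE_ACCOUNT", "SNOWFLAKE_USER", "SNOWFLAKE_PASSWORD")


def missing_env_vars(required=REQUIRED_VARS, environ=None):
    """Return the names in `required` that are unset or empty in `environ`.

    `environ` defaults to os.environ; passing a plain dict makes the
    helper easy to try out without touching the real environment.
    """
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]
```

Calling `missing_env_vars()` right after `load_dotenv()` returns an empty list when everything is set, so a non-empty result can be turned into an early, readable error instead of a `KeyError` later on.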


In [13]:
import os
import pathlib

import requests
from snowflake.snowpark import Session


In [2]:

print("Working dir:", os.getcwd())
print("Files here:", os.listdir())


Working dir: c:\Users\ping\Documents\Bootcamps\Data-Analytics-Engineer-Bootcamp\dataflow\notebooks
Files here: ['1_load_data.ipynb', 'snowpark_bootstrap.ipynb']


In [3]:

from dotenv import load_dotenv
load_dotenv()

connection_parameters = {
    "account":   os.environ["SNOWFLAKE_ACCOUNT"],
    "user":      os.environ["SNOWFLAKE_USER"],
    "password":  os.environ["SNOWFLAKE_PASSWORD"],
    "role":      "ACCOUNTADMIN",  
    "warehouse": "COMPUTE_WH",        
}

session = Session.builder.configs(connection_parameters).create()
session.sql("SELECT CURRENT_VERSION() AS VERSION").show()

-------------
|"VERSION"  |
-------------
|9.12.1     |
-------------



In [4]:
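# ---- Helper to run multiple SQL statements in order ----
# Note: this splits on every ";", so it only works for scripts whose
# string literals and comments contain no semicolons.
def run_many(sql: str):
    for stmt in [s.strip() for s in sql.split(";") if s.strip()]:
        session.sql(stmt).collect()  # collect() forces execution of each statement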
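
Splitting on every `;` breaks as soon as a string literal contains a semicolon (for example, a password). One possible refinement is sketched below; `split_sql` is a hypothetical helper, not a Snowpark function, and it only tracks single-quoted strings. SQL comments, double-quoted identifiers, and `$$` blocks are deliberately not handled.

```python
def split_sql(script: str) -> list:
    """Split a SQL script on semicolons that sit outside single-quoted strings."""
    statements = []
    current = []
    in_quote = False
    for ch in script:
        if ch == "'":
            # An escaped quote ('') toggles twice, so the state ends up unchanged.
            in_quote = not in_quote
        if ch == ";" and not in_quote:
            stmt = "".join(current).strip()
            if stmt:
                statements.append(stmt)
            current = []
        else:
            current.append(ch)
    # Keep a trailing statement that has no terminating semicolon.
    tail = "".join(current).strip()
    if tail:
        statements.append(tail)
    return statements
```

With this in place, `run_many` could iterate over `split_sql(sql)` instead of `sql.split(";")`; the rest of the helper would stay the same.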

In [None]:
# ---- Warehouse / Database / Schema / Role / User ----
run_many('''
CREATE OR REPLACE WAREHOUSE DBT_WH WAREHOUSE_SIZE = 'XSMALL';

CREATE DATABASE IF NOT EXISTS DBT_DB;
CREATE SCHEMA   IF NOT EXISTS DBT_DB.DBT_SCHEMA;

CREATE ROLE     IF NOT EXISTS DBT_ROLE;

GRANT USAGE          ON WAREHOUSE DBT_WH            TO ROLE DBT_ROLE;
GRANT ALL PRIVILEGES ON DATABASE  DBT_DB            TO ROLE DBT_ROLE;
GRANT ALL PRIVILEGES ON SCHEMA    DBT_DB.DBT_SCHEMA TO ROLE DBT_ROLE;

CREATE USER IF NOT EXISTS DBT_USER
  PASSWORD            = 'StrongPassword12345' 
  DEFAULT_ROLE        = DBT_ROLE
  DEFAULT_WAREHOUSE   = DBT_WH
  MUST_CHANGE_PASSWORD = FALSE;

GRANT ROLE DBT_ROLE TO USER DBT_USER;
''')
print("✅ Bootstrap complete.")

✅ Bootstrap complete.


In [6]:

run_many("CREATE OR REPLACE STAGE DBT_DB.DBT_SCHEMA.NETFLIX_RAW_STAGE;")

In [None]:

csv_url   = "https://raw.githubusercontent.com/HuseynA28/DataFlow-Snowflake-Airflow-dbt-Docker-CICD-/refs/heads/main/data/netflix_titles.csv"
local_csv = pathlib.Path("netflix_titles.csv")
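resp = requests.get(csv_url, timeout=30)
resp.raise_for_status()  # fail on a 4xx/5xx response instead of saving an error page
local_csv.write_bytes(resp.content)  # returns the number of bytes written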


3399671
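
The byte count alone does not confirm that the download is a well-formed CSV. A quick local check before staging could look like the sketch below; `summarize_csv` is a hypothetical helper, and it assumes the file is UTF-8 encoded with a header row.

```python
import csv
import pathlib


def summarize_csv(path):
    """Return (header, data_row_count) for the CSV file at `path`."""
    # newline="" lets the csv module handle line breaks inside quoted fields.
    with pathlib.Path(path).open(newline="", encoding="utf-8") as fh:
        reader = csv.reader(fh)
        header = next(reader, [])          # first row, or [] for an empty file
        row_count = sum(1 for _ in reader)  # remaining rows are data rows
    return header, row_count
```

For example, `header, n_rows = summarize_csv(local_csv)` gives the column names and row count to compare against the table once the load has run.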

In [None]:

session.file.put(
    str(local_csv),                               
    "@DBT_DB.DBT_SCHEMA.NETFLIX_RAW_STAGE",       
    overwrite=True,
)

print("✅ File uploaded to stage.")

✅ File uploaded to stage.
