# snowflakeR Quickstart -- Workspace Notebook

This notebook is for **Snowflake Workspace Notebooks** (Python kernel + `%%R` magic).
For local environments (RStudio, Posit, JupyterLab), use `local_quickstart.ipynb`.

**Sections:**
1. Setup (install R + snowflakeR)
2. Connect to Snowflake
3. Queries & Table Operations
4. DBI & dbplyr Integration
5. Visualization with ggplot2
6. Cleanup

## 1. Setup

### Step 1: Install R environment (~3 minutes, first time only)

In [None]:
# Install R + rpy2 via setup script
!bash ../setup_r_environment.sh --basic
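
Before moving on, it can help to confirm from the Python kernel that the install left something usable behind. The sketch below is not part of the setup script: `check_r_toolchain` is a hypothetical helper, and it assumes the script places `Rscript` on the kernel's `PATH`. If the script installs R into a separate environment, `rscript_on_path` may be `False` even after a successful install.

```python
import importlib.util
import shutil


def check_r_toolchain():
    """Report whether Rscript and rpy2 are visible to this Python kernel.

    Hypothetical helper for illustration; not provided by the setup script.
    """
    return {
        # Assumes the setup script puts Rscript on PATH (may not hold).
        "rscript_on_path": shutil.which("Rscript") is not None,
        # find_spec returns None when the top-level package is not installed.
        "rpy2_importable": importlib.util.find_spec("rpy2") is not None,
    }


status = check_r_toolchain()
print(status)
```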

### Step 2: Configure rpy2 and register `%%R` magic

In [None]:
import sys
sys.path.insert(0, '..')

from r_helpers import setup_r_environment
result = setup_r_environment()

if result['success']:
    print(f"R {result['r_version']} ready. %%R magic registered.")
else:
    print("Setup failed:", result['errors'])
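
The check above reads three keys from the returned dict: `success`, `r_version`, and `errors`. A slightly more defensive way to report the outcome might look like the sketch below. `describe_setup` is a hypothetical helper, and the sample dicts (including the version string) are made up for illustration; only the key names come from the cell above.

```python
def describe_setup(result):
    """Turn a setup result dict into a one-line status message.

    Hypothetical helper; it tolerates missing keys instead of raising KeyError.
    """
    if result.get("success"):
        return f"R {result.get('r_version', 'unknown')} ready."
    errors = result.get("errors") or ["unknown error"]
    return "Setup failed: " + "; ".join(str(e) for e in errors)


# Illustrative inputs only; real values come from setup_r_environment().
ok_result = {"success": True, "r_version": "4.3.2", "errors": []}
bad_result = {"success": False, "errors": ["rpy2 not found"]}

print(describe_setup(ok_result))
print(describe_setup(bad_result))
```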

### Step 3: Install and load snowflakeR

In [None]:
%%R
if (!requireNamespace("snowflakeR", quietly = TRUE)) {
  install.packages("../../snowflakeR", repos = NULL, type = "source")
}
library(snowflakeR)

---
## 2. Connect to Snowflake

In Workspace Notebooks, `sfr_connect()` auto-detects the active Snowpark session.

In [None]:
%%R
conn <- sfr_connect()
conn
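
If `sfr_connect()` fails, one way to narrow the problem down is to check from the Python side whether an active Snowpark session exists at all. The sketch below uses Snowpark's `get_active_session()`; `active_session_or_none` is a hypothetical wrapper, and whether `sfr_connect()` relies on this same call internally is an assumption.

```python
def active_session_or_none():
    """Return the active Snowpark session, or None if there is none.

    Hypothetical wrapper: any failure (Snowpark not installed, no active
    session) is reported as None rather than raised.
    """
    try:
        from snowflake.snowpark.context import get_active_session
        return get_active_session()
    except Exception:
        return None


session = active_session_or_none()
if session is None:
    print("No active Snowpark session found from Python.")
else:
    print("Active Snowpark session found.")
```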

In [None]:
%%R
# Set warehouse, database, and schema if not already set
# (replace the placeholder names with your own)
conn <- sfr_use(conn, warehouse = "COMPUTE_WH", database = "MY_DB", schema = "PUBLIC")
sfr_status(conn)
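
For readers used to plain SQL, the session context set above corresponds to Snowflake's `USE WAREHOUSE`, `USE DATABASE`, and `USE SCHEMA` statements. The sketch below only builds those statements as strings. `use_statements` is a hypothetical helper, and it is an assumption that `sfr_use()` issues anything like them.

```python
def use_statements(warehouse=None, database=None, schema=None):
    """Build USE statements for whichever context parts are given.

    Hypothetical helper; names are inserted as-is, with no identifier quoting.
    """
    parts = [("WAREHOUSE", warehouse), ("DATABASE", database), ("SCHEMA", schema)]
    return [f"USE {kind} {name}" for kind, name in parts if name]


for stmt in use_statements("COMPUTE_WH", "MY_DB", "PUBLIC"):
    print(stmt)
```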

---
## 3. Queries & Table Operations

In [None]:
%%R
# Run a SQL query
result <- sfr_query(conn, "SELECT CURRENT_TIMESTAMP() AS now, CURRENT_USER() AS user_name")
rprint(result)

In [None]:
%%R
# Write a data.frame to Snowflake
sfr_write_table(conn, "SFR_MTCARS", mtcars, overwrite = TRUE)

In [None]:
%%R
# List tables
tables <- sfr_list_tables(conn)
rcat("Tables:", paste(head(tables, 10), collapse = ", "))

In [None]:
%%R
# Read it back
df <- sfr_read_table(conn, "SFR_MTCARS")
rview(df, n = 5)

In [None]:
%%R
# Describe columns
rprint(sfr_list_fields(conn, "SFR_MTCARS"))

---
## 4. DBI & dbplyr Integration

In [None]:
%%R
library(DBI)

DBI::dbGetQuery(conn, "SELECT 42 AS answer") |> rprint()
DBI::dbExistsTable(conn, "SFR_MTCARS")

In [None]:
%%R
library(dplyr)
library(dbplyr)

# Lazy reference to Snowflake table
cars_tbl <- tbl(conn, "SFR_MTCARS")

# dplyr pipeline -- dbplyr translates it to SQL, which runs in Snowflake on collect()
mpg_by_cyl <- cars_tbl |>
  group_by(cyl) |>
  summarise(
    n       = n(),
    avg_mpg = mean(mpg, na.rm = TRUE),
    avg_hp  = mean(hp, na.rm = TRUE)
  ) |>
  arrange(cyl) |>
  collect()

rprint(mpg_by_cyl)
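
The pipeline above is pushed down to the database as a single grouped aggregate. The sketch below shows the general shape of that query, using Python's built-in `sqlite3` as a stand-in for Snowflake and a handful of illustrative rows shaped like `mtcars`. The SQL that dbplyr actually generates for Snowflake may differ in quoting and detail.

```python
import sqlite3

# Illustrative (cyl, mpg, hp) rows; sqlite3 stands in for Snowflake here.
rows = [
    (6, 21.0, 110),
    (6, 21.0, 110),
    (4, 22.8, 93),
    (6, 21.4, 110),
    (8, 18.7, 175),
    (6, 18.1, 105),
]

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE cars (cyl INTEGER, mpg REAL, hp REAL)")
db.executemany("INSERT INTO cars VALUES (?, ?, ?)", rows)

# Roughly what group_by() + summarise() + arrange() amount to in SQL.
query = """
SELECT cyl, COUNT(*) AS n, AVG(mpg) AS avg_mpg, AVG(hp) AS avg_hp
FROM cars
GROUP BY cyl
ORDER BY cyl
"""

summary_rows = db.execute(query).fetchall()
for row in summary_rows:
    print(row)
```

Only the aggregated rows (one per `cyl` value) travel back to the client, which is the main benefit of keeping the table reference lazy until `collect()`.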

---
## 5. Visualization with ggplot2

Pass `-w WIDTH -h HEIGHT` to `%%R` to set the plot size in pixels, and call `print(p)` explicitly so the plot renders in Workspace Notebooks.

In [None]:
%%R -w 700 -h 450
library(ggplot2)

df <- sfr_read_table(conn, "SFR_MTCARS")

p <- ggplot(df, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point(size = 3) +
  labs(title = "MPG vs Weight by Cylinder Count",
       x = "Weight (1000 lbs)", y = "Miles per Gallon",
       color = "Cylinders") +
  theme_minimal()

print(p)  # print() required in Workspace Notebooks

---
## 6. Cleanup

In [None]:
%%R
sfr_execute(conn, "DROP TABLE IF EXISTS SFR_MTCARS")
# Not created in this notebook; IF EXISTS makes the drop a no-op when the table is absent
sfr_execute(conn, "DROP TABLE IF EXISTS SFR_QUICKSTART_TEST")
rcat("Cleanup complete.")
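
The cleanup cell is safe to re-run because `DROP TABLE IF EXISTS` does not fail when the table is already gone. The sketch below demonstrates that behaviour with Python's built-in `sqlite3` as a stand-in for Snowflake. `drop_tables` is a hypothetical helper, not part of snowflakeR.

```python
import sqlite3


def drop_tables(db, names):
    """Drop each named table if present; absent tables are skipped silently.

    Hypothetical helper; names are trusted and inserted as-is.
    """
    for name in names:
        db.execute(f"DROP TABLE IF EXISTS {name}")


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE SFR_MTCARS (mpg REAL)")

# The second table was never created; IF EXISTS keeps this from raising.
drop_tables(db, ["SFR_MTCARS", "SFR_QUICKSTART_TEST"])
drop_tables(db, ["SFR_MTCARS"])  # running cleanup twice is also fine

remaining = db.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()
print(remaining)  # → []
```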