# snowflakeR Quickstart -- Workspace Notebook

This notebook is for **Snowflake Workspace Notebooks** (Python kernel + `%%R` magic).
For local environments (RStudio, Posit, JupyterLab), use `local_quickstart.ipynb`.

**Before you start:** Copy `notebook_config.yaml.template` to `notebook_config.yaml`
and edit it with your warehouse, database, and schema.
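
The copy step can also be done from a Python cell. The sketch below is one way to do it, assuming the template sits in the notebook's working directory; `ensure_notebook_config` is a hypothetical helper written for this example, not part of snowflakeR or `r_helpers`. It never overwrites an existing `notebook_config.yaml`, so edited settings are preserved.

```python
import shutil
from pathlib import Path


def ensure_notebook_config(directory="."):
    """Create notebook_config.yaml from its template unless it already exists.

    Hypothetical helper for illustration only. Returns a short status string.
    """
    directory = Path(directory)
    template = directory / "notebook_config.yaml.template"
    config = directory / "notebook_config.yaml"

    if config.exists():
        return "exists"            # keep whatever the user already edited
    if not template.exists():
        return "missing-template"  # nothing to copy from
    shutil.copyfile(template, config)
    return "created"


status = ensure_notebook_config()
print(f"notebook_config.yaml: {status}")
```

After a `created` status, the file still needs to be edited with your own warehouse, database, and schema.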

**Sections:**
1. Setup (install R + snowflakeR)
2. Connect & Set Execution Context
3. Queries & Table Operations
4. DBI & dbplyr Integration
5. Visualization with ggplot2
6. Cleanup

## 1. Setup

### Step 1: Install R environment (~3 minutes, first time only)

In [None]:
# Install R + rpy2 via setup script (included in this directory)
!bash setup_r_environment.sh --basic

### Step 2: Configure rpy2 and register `%%R` magic

In [None]:
from r_helpers import setup_r_environment
result = setup_r_environment()

if result['success']:
    print(f"R {result['r_version']} ready. %%R magic registered.")
else:
    print("Setup failed:", result['errors'])

### Step 3: Install and load snowflakeR

In [None]:
# Resolve the absolute path to the snowflakeR package root.
# This notebook lives at snowflakeR/inst/notebooks/, so the package root
# (the directory containing DESCRIPTION) is two levels up.
import os
snowflaker_path = os.path.normpath(os.path.join(os.getcwd(), "..", ".."))
print(f"snowflakeR path: {snowflaker_path}")
assert os.path.isfile(os.path.join(snowflaker_path, "DESCRIPTION")), \
    f"DESCRIPTION not found in {snowflaker_path} -- check your working directory"

# Export as env var so R can read it via Sys.getenv()
os.environ["SNOWFLAKER_PATH"] = snowflaker_path
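
The cell above assumes the working directory is the notebook's own directory. If that is not guaranteed, one alternative is to walk up the directory tree until a `DESCRIPTION` file is found. This is a minimal sketch; `find_package_root` is a hypothetical helper, not something snowflakeR or `r_helpers` provides.

```python
from pathlib import Path


def find_package_root(start=None, marker="DESCRIPTION"):
    """Return the nearest directory at or above `start` that contains `marker`.

    Hypothetical helper for illustration only. Returns None if no ancestor
    directory contains the marker file.
    """
    start = Path(start or Path.cwd()).resolve()
    for candidate in (start, *start.parents):
        if (candidate / marker).is_file():
            return candidate
    return None


root = find_package_root()
print(f"Package root: {root}")
```

If `root` is not `None`, `str(root)` could be exported as `SNOWFLAKER_PATH` in place of the fixed two-levels-up path.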

In [None]:
%%R
# Set the CRAN mirror up front so install.packages() never prompts for one
# (Workspace Notebooks have no stdin)
options(repos = c(CRAN = "https://cloud.r-project.org"))

if (!requireNamespace("snowflakeR", quietly = TRUE)) {
  # Install required dependencies from CRAN first, because repos = NULL below
  # skips CRAN ("methods" ships with base R and needs no install)
  deps <- c("DBI", "reticulate", "cli", "rlang")
  for (pkg in deps) {
    if (!requireNamespace(pkg, quietly = TRUE))
      install.packages(pkg, type = "source", quiet = TRUE)
  }

  # Option 1: Install from local repo cloned into the Workspace
  # (absolute path resolved in the previous Python cell via env var)
  install.packages(Sys.getenv("SNOWFLAKER_PATH"), repos = NULL, type = "source")

  # Option 2: Install from GitHub via pak (once published to public repo)
  # install.packages("pak", type = "source", quiet = TRUE)
  # pak::pak("Snowflake-Labs/snowflakeR", ask = FALSE, upgrade = FALSE)
}
library(snowflakeR)

---
## 2. Connect & Set Execution Context

Workspace Notebooks do **not** auto-set database or schema.
`sfr_load_notebook_config()` reads `notebook_config.yaml` and runs
`USE WAREHOUSE`, `USE DATABASE`, and `USE SCHEMA` to set the execution context.
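
To make the effect concrete, the sketch below builds the kind of `USE` statements described above from a plain dictionary. The key names (`warehouse`, `database`, `schema`) and the example values are assumptions made for illustration; the actual layout of `notebook_config.yaml` and the internals of `sfr_load_notebook_config()` may differ.

```python
def use_statements(context):
    """Build USE statements for whichever context entries are set.

    `context` is a hypothetical dict standing in for the parsed config;
    the real config layout may differ.
    """
    statements = []
    for key in ("warehouse", "database", "schema"):
        value = context.get(key)
        if value:
            statements.append(f"USE {key.upper()} {value}")
    return statements


# Placeholder values, not real object names
example_context = {"warehouse": "MY_WH", "database": "MY_DB", "schema": "MY_SCHEMA"}
for sql in use_statements(example_context):
    print(sql)
```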

All table references in this notebook use fully qualified names via `sfr_fqn()`.
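
A fully qualified name has the form `DATABASE.SCHEMA.TABLE`. The sketch below shows one way such a name could be assembled, following Snowflake's identifier rules: simple identifiers stay unquoted, anything else is double-quoted with embedded quotes doubled. It illustrates the idea only; how `sfr_fqn()` actually quotes or resolves names is not shown here, and both function names are hypothetical.

```python
import re

# Unquoted Snowflake identifiers: start with a letter or underscore,
# then letters, digits, underscores, or dollar signs.
_SIMPLE_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_$]*")


def quote_identifier(name):
    """Leave simple identifiers bare; double-quote everything else."""
    if _SIMPLE_IDENTIFIER.fullmatch(name):
        return name
    return '"' + name.replace('"', '""') + '"'


def fully_qualified_name(database, schema, table):
    """Join the three parts into DATABASE.SCHEMA.TABLE (illustrative only)."""
    return ".".join(quote_identifier(part) for part in (database, schema, table))


# Placeholder database and schema names
print(fully_qualified_name("MY_DB", "MY_SCHEMA", "SFR_MTCARS"))
```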

In [None]:
%%R
# Connect (auto-detects Workspace session)
conn <- sfr_connect()

# Load config and set execution context
conn <- sfr_load_notebook_config(conn)
conn

---
## 3. Queries & Table Operations

In [None]:
%%R
# Run a SQL query
result <- sfr_query(conn, "SELECT CURRENT_TIMESTAMP() AS now, CURRENT_USER() AS user_name")
rprint(result)

In [None]:
%%R
# Write a data.frame to Snowflake (fully qualified name)
sfr_write_table(conn, sfr_fqn(conn, "SFR_MTCARS"), mtcars, overwrite = TRUE)

In [None]:
%%R
# List tables
tables <- sfr_list_tables(conn)
rcat("Tables:", paste(head(tables, 10), collapse = ",\n  "))

In [None]:
%%R
# Read it back (fully qualified name)
df <- sfr_read_table(conn, sfr_fqn(conn, "SFR_MTCARS"))
rview(df, n = 5)

In [None]:
%%R
# Describe columns
rprint(sfr_list_fields(conn, sfr_fqn(conn, "SFR_MTCARS")))

---
## 4. DBI & dbplyr Integration

In [None]:
%%R
library(DBI)

DBI::dbGetQuery(conn, "SELECT 42 AS answer") |> rprint()
rprint(DBI::dbExistsTable(conn, sfr_fqn(conn, "SFR_MTCARS")))

In [None]:
%%R
library(dplyr)
library(dbplyr)

# Lazy reference to Snowflake table (fully qualified)
cars_tbl <- tbl(conn, sfr_fqn(conn, "SFR_MTCARS"))

# dplyr pipeline: dbplyr translates it to SQL, which runs in Snowflake on collect()
# (named mpg_summary to avoid shadowing base::summary)
mpg_summary <- cars_tbl |>
  group_by(cyl) |>
  summarise(
    n       = n(),
    avg_mpg = mean(mpg, na.rm = TRUE),
    avg_hp  = mean(hp, na.rm = TRUE)
  ) |>
  arrange(cyl) |>
  collect()

rprint(mpg_summary)

---
## 5. Visualization with ggplot2

In Workspace Notebooks, set the plot size with `%%R -w WIDTH -h HEIGHT` (pixels by default) and render the plot explicitly with `print(p)`.

In [None]:
%%R -w 700 -h 450
library(ggplot2)

df <- sfr_read_table(conn, sfr_fqn(conn, "SFR_MTCARS"))

p <- ggplot(df, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point(size = 3) +
  labs(title = "MPG vs Weight by Cylinder Count",
       x = "Weight (1000 lbs)", y = "Miles per Gallon",
       color = "Cylinders") +
  theme_minimal()

print(p)  # print() required in Workspace Notebooks

---
## 6. Cleanup

In [None]:
%%R
sfr_execute(conn, paste("DROP TABLE IF EXISTS", sfr_fqn(conn, "SFR_MTCARS")))
rcat("Cleanup complete.")