# snowflakeR Quickstart

This notebook demonstrates the `snowflakeR` R package for connecting to Snowflake,
running queries, and working with data -- all from R.

**Works in both:**
- Snowflake Workspace Notebooks (Python kernel + `%%R` magic)
- Local environments (RStudio, Posit Workbench, JupyterLab with IR kernel)

**Sections:**
1. [Setup](#section-1-setup)
2. [Connect to Snowflake](#section-2-connect)
3. [Queries & Table Operations](#section-3-queries)
4. [DBI & dbplyr Integration](#section-4-dbi-dbplyr)
5. [Visualization with ggplot2](#section-5-visualization)

---

# Section 1: Setup

Choose the path that matches your environment.

## Path A: Snowflake Workspace Notebook

Workspace Notebooks use a Python kernel. We install R via `micromamba`, then
run R code in `%%R` magic cells, which `rpy2` provides.

### Step 1: Install R environment

Run the setup script located in the parent `r_notebook/` directory (first run only; takes about 3 minutes):

In [None]:
# Workspace Notebook only -- install R + rpy2
# Skip this cell if running locally in RStudio/JupyterLab

!bash ../setup_r_environment.sh --basic

### Step 2: Configure rpy2 and register `%%R` magic

In [None]:
# Workspace Notebook only -- configure rpy2
import sys
sys.path.insert(0, '..')

from r_helpers import setup_r_environment
result = setup_r_environment()

if result['success']:
    print(f"R {result['r_version']} ready. %%R magic registered.")
else:
    print("Setup failed:", result['errors'])

### Step 3: Install snowflakeR (in the R environment)

In [None]:
%%R
# Install snowflakeR from the local repo (Workspace Notebook)
# In production, this would be: install.packages("snowflakeR")
if (!requireNamespace("snowflakeR", quietly = TRUE)) {
  install.packages(
    "../../snowflakeR",
    repos = NULL,
    type = "source"
  )
}
library(snowflakeR)
cat("snowflakeR loaded successfully\n")

## Path B: Local Environment (RStudio / Posit / JupyterLab)

If you're running this locally with an R kernel, skip the cells above and run:

```r
# Install (one time)
# install.packages("pak")
# pak::pak("Snowflake-Labs/snowflakeR")

# Or from local source:
# install.packages("path/to/snowflakeR", repos = NULL, type = "source")

library(snowflakeR)

# One-time Python environment setup
sfr_install_python_deps()
```
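Before continuing, it can help to confirm that the local R setup matches what the later cells use. The sketch below relies only on base R: it checks for the packages loaded in Sections 4 and 5 and for R 4.1 or newer, which the native pipe `|>` requires. The variable names are illustrative.

```r
# Packages loaded later in this notebook (in addition to snowflakeR)
needed <- c("DBI", "dplyr", "dbplyr", "ggplot2")

# requireNamespace() returns TRUE/FALSE without attaching the package
installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- needed[!installed]

# The native pipe |> is only available from R 4.1.0 onwards
pipe_ok <- getRversion() >= "4.1.0"

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
}
if (!pipe_ok) {
  message("R ", getRversion(), " is older than 4.1.0; the |> pipe will not parse.")
}
```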

---

# Section 2: Connect to Snowflake

`sfr_connect()` auto-detects your environment:
- **Workspace Notebook:** Wraps the active Snowpark session (no credentials needed)
- **Local:** Reads `~/.snowflake/connections.toml` or accepts explicit parameters
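For the local case, a quick base R check shows whether the connections file named above is present before you call `sfr_connect()`. This is a sketch that only tests for the file at its default path; it does not validate the file's contents.

```r
# Default location named above; path.expand() resolves "~" to the home directory
toml_path <- path.expand(file.path("~", ".snowflake", "connections.toml"))
has_toml <- file.exists(toml_path)

if (has_toml) {
  message("Found ", toml_path)
} else {
  message("No connections.toml found; explicit parameters will be needed.")
}
```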

In [None]:
%%R
# Auto-detect: works in both Workspace Notebooks and locally
conn <- sfr_connect()
conn

### Alternative: Explicit parameters (local only)

```r
conn <- sfr_connect(
  account   = "xy12345.us-east-1",
  user      = "MYUSER",
  warehouse = "COMPUTE_WH",
  database  = "MY_DB",
  schema    = "MY_SCHEMA",
  authenticator = "externalbrowser"
)
```
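To keep account details out of the notebook, the explicit parameters can be read from environment variables instead of being typed inline. In this minimal sketch the `SNOWFLAKE_*` variable names are an assumption (use whatever your shell or `.Renviron` defines), and the connection is attempted only when every value is set.

```r
# Hypothetical environment variable names; adjust to your own setup
conn_args <- list(
  account   = Sys.getenv("SNOWFLAKE_ACCOUNT"),
  user      = Sys.getenv("SNOWFLAKE_USER"),
  warehouse = Sys.getenv("SNOWFLAKE_WAREHOUSE"),
  database  = Sys.getenv("SNOWFLAKE_DATABASE"),
  schema    = Sys.getenv("SNOWFLAKE_SCHEMA")
)

# Sys.getenv() returns "" for unset variables, so nzchar() flags the gaps
unset <- names(conn_args)[!nzchar(unlist(conn_args))]

if (length(unset) > 0) {
  message("Not connecting; unset values: ", paste(unset, collapse = ", "))
} else if (requireNamespace("snowflakeR", quietly = TRUE)) {
  # Same parameters as the explicit call above, supplied as a list
  conn <- do.call(
    snowflakeR::sfr_connect,
    c(conn_args, authenticator = "externalbrowser")
  )
}
```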

In [None]:
%%R
# Check connection status
sfr_status(conn)

In [None]:
%%R
# Switch warehouse or schema if needed
# sfr_use(conn, warehouse = "ML_WH", schema = "PUBLIC")

---

# Section 3: Queries & Table Operations

## Run SQL queries

In [None]:
%%R
# Return results as a data.frame
result <- sfr_query(conn, "SELECT CURRENT_TIMESTAMP() AS now, CURRENT_USER() AS user_name")
result

In [None]:
%%R
# DDL/DML -- no result set
sfr_execute(conn, "
  CREATE TABLE IF NOT EXISTS SFR_QUICKSTART_TEST (
    id INT,
    name STRING,
    value DOUBLE
  )
")

## Table operations

In [None]:
%%R
# List tables in the current schema
tables <- sfr_list_tables(conn)
head(tables, 20)

In [None]:
%%R
# Check if a table exists
sfr_table_exists(conn, "SFR_QUICKSTART_TEST")

In [None]:
%%R
# Write a data.frame to Snowflake
sfr_write_table(conn, "SFR_MTCARS", mtcars, overwrite = TRUE)
cat("Wrote", nrow(mtcars), "rows to SFR_MTCARS\n")

In [None]:
%%R
# Read it back
df <- sfr_read_table(conn, "SFR_MTCARS")
str(df)
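Note that `mtcars` stores the car model names as row names, not as a column. Whether `sfr_write_table()` keeps row names is not stated here, so a cautious approach is to move them into a regular column before writing. The `model` column name below is illustrative.

```r
# Copy the row names into an ordinary column so they are stored as table data
cars_df <- data.frame(model = rownames(mtcars), mtcars, stringsAsFactors = FALSE)
rownames(cars_df) <- NULL

# Inspect the first few columns: model is now a character column
str(cars_df[, 1:3])

# Then, with an open connection:
# sfr_write_table(conn, "SFR_MTCARS", cars_df, overwrite = TRUE)
```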

In [None]:
%%R
# Describe columns
sfr_list_fields(conn, "SFR_MTCARS")

---

# Section 4: DBI & dbplyr Integration

When the `DBI` package is installed, `sfr_connection` objects work with the
standard R database ecosystem -- `DBI::dbGetQuery()`, `dplyr::tbl()`, etc.

## DBI

In [None]:
%%R
library(DBI)

# Standard DBI calls work with sfr_connection objects
DBI::dbGetQuery(conn, "SELECT 42 AS answer")

DBI::dbListTables(conn) |> head(10)

DBI::dbExistsTable(conn, "SFR_MTCARS")

## dbplyr -- dplyr verbs on Snowflake tables

With `dbplyr`, you can use familiar `dplyr` verbs. Operations are translated
to SQL and pushed down to Snowflake; no data is pulled into R until you call
`collect()` (printing a lazy table fetches only a short preview).

In [None]:
%%R
library(dplyr)
library(dbplyr)

# Create a lazy reference to the Snowflake table
cars_tbl <- tbl(conn, "SFR_MTCARS")

# dplyr pipeline -- generates SQL, doesn't fetch yet
# (named cyl_summary to avoid masking base::summary)
cyl_summary <- cars_tbl |>
  group_by(cyl) |>
  summarise(
    n       = n(),
    avg_mpg = mean(mpg, na.rm = TRUE),
    avg_hp  = mean(hp, na.rm = TRUE)
  ) |>
  arrange(cyl)

# See the generated SQL
show_query(cyl_summary)

In [None]:
%%R
# collect() fetches results into R
result <- collect(cyl_summary)
result
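As a sanity check on the pushed-down aggregation, the same per-cylinder summary can be computed locally on the built-in `mtcars` data with base R. This sketch assumes the table holds an unmodified copy of `mtcars`. Column-name casing in the collected result may differ, depending on how the table was written.

```r
# Local reference computation: mean mpg and hp per cylinder count
local_summary <- aggregate(cbind(mpg, hp) ~ cyl, data = mtcars, FUN = mean)
names(local_summary) <- c("cyl", "avg_mpg", "avg_hp")

# Row counts per group, aligned to the cyl values in local_summary
local_summary$n <- as.integer(table(mtcars$cyl)[as.character(local_summary$cyl)])

# Same column order and sort order as the dplyr pipeline
local_summary <- local_summary[order(local_summary$cyl),
                               c("cyl", "n", "avg_mpg", "avg_hp")]
local_summary
```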

---

# Section 5: Visualization with ggplot2

In Workspace Notebooks, use `%%R -w WIDTH -h HEIGHT` to set the plot size (in pixels by default).
Locally, plots render in your IDE's or kernel's default graphics device.
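In JupyterLab with the IR kernel, plot size is usually set through the `repr` options, not through cell magic. The sketch below assumes that kernel; the options have no effect in RStudio or under `%%R`. Sizes are in inches, chosen here to give roughly 700 x 450 px at 100 dpi.

```r
# Width and height in inches, resolution in dots per inch (IR kernel only)
options(
  repr.plot.width  = 7,
  repr.plot.height = 4.5,
  repr.plot.res    = 100
)
```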

In [None]:
%%R -w 700 -h 450
library(ggplot2)

# Read data from Snowflake and plot
cars <- sfr_read_table(conn, "SFR_MTCARS")

p <- ggplot(cars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point(size = 3, alpha = 0.8) +
  geom_smooth(method = "lm", se = FALSE, linetype = "dashed") +
  labs(
    title = "Fuel Efficiency by Weight",
    subtitle = "Data from Snowflake via snowflakeR",
    x = "Weight (1000 lbs)",
    y = "Miles per Gallon",
    color = "Cylinders"
  ) +
  theme_minimal(base_size = 14)

print(p)

In [None]:
%%R -w 700 -h 400
# Bar chart of average MPG by cylinder count
avg_data <- cars |>
  dplyr::group_by(cyl) |>
  dplyr::summarise(avg_mpg = mean(mpg), .groups = "drop")

ggplot(avg_data, aes(x = factor(cyl), y = avg_mpg, fill = factor(cyl))) +
  geom_col(width = 0.6) +
  geom_text(aes(label = round(avg_mpg, 1)), vjust = -0.5, size = 4) +
  labs(
    title = "Average MPG by Cylinder Count",
    x = "Cylinders", y = "Average MPG"
  ) +
  theme_minimal(base_size = 14) +
  theme(legend.position = "none")

---

## Cleanup

In [None]:
%%R
# Drop test tables
sfr_execute(conn, "DROP TABLE IF EXISTS SFR_QUICKSTART_TEST")
sfr_execute(conn, "DROP TABLE IF EXISTS SFR_MTCARS")

# Disconnect
sfr_disconnect(conn)
cat("Done.\n")

---

## Next Steps

- **Model Registry:** See `model_registry_demo.ipynb` for training and deploying R models
- **Feature Store:** See `feature_store_demo.ipynb` for managing features and generating training data
- **Vignettes:** Run `vignette("getting-started", package = "snowflakeR")` for full documentation