# snowflakeR Quickstart -- Workspace Notebook

This notebook is for **Snowflake Workspace Notebooks** (Python kernel + `%%R` magic).
For local environments (RStudio, Posit, JupyterLab), use `local_quickstart.ipynb`.

**Before you start:** Copy `notebook_config.yaml.template` to `notebook_config.yaml`
and edit it with your warehouse, database, and schema.
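
The copy step can also be scripted. The following is a minimal sketch, not part of snowflakeR: `ensure_config` is a hypothetical helper name, and it assumes the template sits in the same directory as the notebook. It never overwrites an existing `notebook_config.yaml`, so your edits are preserved.

```python
import shutil
from pathlib import Path


def ensure_config(directory: str = ".") -> Path:
    """Copy notebook_config.yaml.template to notebook_config.yaml if missing."""
    base = Path(directory)
    template = base / "notebook_config.yaml.template"
    target = base / "notebook_config.yaml"
    if target.exists():
        # Never overwrite a config that has already been edited.
        return target
    if not template.is_file():
        raise FileNotFoundError(f"Template not found: {template}")
    shutil.copyfile(template, target)
    return target
```

Calling `ensure_config()` from the notebook directory creates the file on first use; you still need to open it and fill in your warehouse, database, and schema.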

**Sections:**
1. Setup (install R + snowflakeR)
2. Connect & Set Execution Context
3. Queries & Table Operations
4. DBI & dbplyr via RSnowflake (optional)
5. Visualization with ggplot2
6. Cleanup

## 1. Setup

Steps 1-3 contain the only Python cells you need to run. They bootstrap the R
environment, register the `%%R` magic, and locate the snowflakeR source -- after
that, everything is pure R.

### Step 1: Install R environment (~3 minutes, first time only)

Everything is installed in user space (no `sudo` or root required). The script
uses [micromamba](https://github.com/mamba-org/mamba) (BSD-3-Clause) and packages
from [conda-forge](https://conda-forge.org/). Both are community open-source
projects, not affiliated with Anaconda, Inc., and safe for commercial use.

In [None]:
# Install R + rpy2 via setup script (included in this directory)
!bash setup_r_environment.sh --basic

### Step 2: Configure rpy2 and register `%%R` magic

Workspace Notebooks have only Python cells. The `%%R` cell magic tells the Python
kernel to hand the cell to rpy2, which executes it as R code and returns the output.
After this step, any cell starting with `%%R` runs as R, so the notebook feels like an R notebook.
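
Before running Step 2, it can help to confirm that rpy2 is importable from the kernel. The sketch below is illustrative only: `rpy2_available` is a hypothetical helper, and whether `setup_r_environment()` registers the magic through rpy2's `rpy2.ipython` extension is an assumption, not something this notebook states.

```python
import importlib.util


def rpy2_available() -> bool:
    """Return True if the rpy2 package can be imported by this kernel."""
    return importlib.util.find_spec("rpy2") is not None


if rpy2_available():
    # rpy2 ships an IPython extension named "rpy2.ipython"; loading it with
    # `%load_ext rpy2.ipython` is the usual way to register %%R by hand.
    # Assumption: setup_r_environment() does something equivalent.
    print("rpy2 found: the %%R magic can be registered.")
else:
    print("rpy2 not found: run Step 1 first.")
```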

In [None]:
from r_helpers import setup_r_environment
result = setup_r_environment()

if result['success']:
    print(f"R {result['r_version']} ready. %%R magic registered.")
else:
    print("Setup failed:", result['errors'])

### Step 3: Install and load snowflakeR

Last Python cell -- resolves the path to the snowflakeR package source.
After this, everything is `%%R`.

In [None]:
# Resolve the absolute path to the snowflakeR package root.
# This notebook lives at snowflakeR/inst/notebooks/, so the package root
# (the directory containing DESCRIPTION) is two levels up.
import os
snowflaker_path = os.path.normpath(os.path.join(os.getcwd(), "..", ".."))
print(f"snowflakeR path: {snowflaker_path}")
assert os.path.isfile(os.path.join(snowflaker_path, "DESCRIPTION")), \
    f"DESCRIPTION not found in {snowflaker_path} -- check your working directory"

# Export as env var so R can read it via Sys.getenv()
os.environ["SNOWFLAKER_PATH"] = snowflaker_path
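
The cell above assumes the working directory is `snowflakeR/inst/notebooks/`. If the notebook may be opened from elsewhere in the repository, a more defensive variant is to walk upward until a `DESCRIPTION` file is found. This is a sketch under that assumption; `find_package_root` is a hypothetical helper, not part of `r_helpers` or snowflakeR.

```python
from pathlib import Path
from typing import Optional


def find_package_root(start: Optional[str] = None, marker: str = "DESCRIPTION") -> Path:
    """Walk upward from `start` until a directory containing `marker` is found."""
    current = (Path(start) if start else Path.cwd()).resolve()
    # Check the starting directory first, then each parent up to the root.
    for candidate in (current, *current.parents):
        if (candidate / marker).is_file():
            return candidate
    raise FileNotFoundError(f"No {marker} file found at or above {current}")
```

The result of `find_package_root()` could then be assigned to `os.environ["SNOWFLAKER_PATH"]` in place of the fixed `"..", ".."` path. Note that it returns the nearest ancestor containing `DESCRIPTION`, which is the wrong directory if the notebook sits inside a different R package.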

In [None]:
%%R
# Set the CRAN mirror up front so install.packages() never prompts for one
# (Workspace Notebooks have no stdin)
options(repos = c(CRAN = "https://cloud.r-project.org"))

# Remove stale install (if any) so we always get the latest source
try(remove.packages("snowflakeR"), silent = TRUE)

# Install required dependencies from CRAN first: the local install below uses
# repos = NULL, which does not fetch missing dependencies from CRAN
deps <- c("methods", "reticulate", "cli", "rlang")
for (pkg in deps) {
  if (!requireNamespace(pkg, quietly = TRUE))
    install.packages(pkg, type = "source", quiet = TRUE)
}

# Option 1: Install from local repo cloned into the Workspace
# (absolute path resolved in the previous Python cell via env var)
install.packages(Sys.getenv("SNOWFLAKER_PATH"), repos = NULL, type = "source")

# Option 2: Install from GitHub via pak (once published to public repo)
# install.packages("pak", type = "source", quiet = TRUE)
# pak::pak("Snowflake-Labs/snowflakeR", ask = FALSE, upgrade = FALSE)

library(snowflakeR)

---
## 2. Connect & Set Execution Context

From here on, it's all R. No more Python.

Workspace Notebooks do **not** auto-set database or schema.
`sfr_load_notebook_config()` reads `notebook_config.yaml` and runs
`USE WAREHOUSE / DATABASE / SCHEMA` to set the execution context.

All table references in this notebook use fully qualified names via `sfr_fqn()`.
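
To make the two ideas above concrete, here is a conceptual sketch of what setting the execution context and building a fully qualified name amount to. It is not snowflakeR's implementation: the functions `use_statements` and `fqn`, the config keys, and the sample values are all hypothetical, and identifier quoting is ignored.

```python
def use_statements(config: dict) -> list:
    """Build one USE statement per context key present in `config`."""
    statements = []
    for key in ("warehouse", "database", "schema"):
        value = config.get(key)
        if value:
            statements.append(f"USE {key.upper()} {value}")
    return statements


def fqn(config: dict, table: str) -> str:
    """Join database, schema, and table into DATABASE.SCHEMA.TABLE."""
    return ".".join([config["database"], config["schema"], table])


# Hypothetical values standing in for the contents of notebook_config.yaml.
config = {"warehouse": "MY_WH", "database": "MY_DB", "schema": "MY_SCHEMA"}
print(use_statements(config))
print(fqn(config, "SFR_MTCARS"))
```

Fully qualified names keep each statement unambiguous even if the session's current database or schema changes later in the notebook.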

**Tip:** Need to install additional R packages? Use this pattern to suppress
prompts and noise (Workspace Notebooks have no stdin):
```r
options(repos = c(CRAN = "https://cloud.r-project.org"))
install.packages("mypackage", type = "source", quiet = TRUE)
suppressPackageStartupMessages(library(mypackage))
```
For packages with system library dependencies (e.g., `sf`, `curl`), use the
conda-forge equivalent in `r_packages.yaml` instead.

In [None]:
%%R
# Connect (auto-detects Workspace session)
conn <- sfr_connect()

# Load config and set execution context
conn <- sfr_load_notebook_config(conn)
conn

---
## 3. Queries & Table Operations

In [None]:
%%R
# Run a SQL query
result <- sfr_query(conn, "SELECT CURRENT_TIMESTAMP() AS now, CURRENT_USER() AS user_name")
rprint(result)

In [None]:
%%R
# Write a data.frame to Snowflake (fully qualified name)
sfr_write_table(conn, sfr_fqn(conn, "SFR_MTCARS"), mtcars, overwrite = TRUE)

In [None]:
%%R
# List tables
tables <- sfr_list_tables(conn)
rcat("Tables:", paste(head(tables, 10), collapse = ",\n  "))

In [None]:
%%R
# Read it back (fully qualified name)
df <- sfr_read_table(conn, sfr_fqn(conn, "SFR_MTCARS"))
rview(df, n = 5)

In [None]:
%%R
# Describe columns
rprint(sfr_list_fields(conn, sfr_fqn(conn, "SFR_MTCARS")))

---
## 4. DBI & dbplyr via RSnowflake (optional)

For standard DBI-compliant database access and dbplyr integration, use the
`RSnowflake` package. You can obtain an `RSnowflake` connection from your
`sfr_connection` via `sfr_dbi_connection()`, or create one directly with
`DBI::dbConnect(RSnowflake::Snowflake(), ...)`.

In [None]:
%%R
library(RSnowflake)
library(DBI)

# Get an RSnowflake DBI connection from the sfr_connection
dbi_con <- sfr_dbi_connection(conn)

DBI::dbGetQuery(dbi_con, "SELECT 42 AS answer") |> rprint()
DBI::dbExistsTable(dbi_con, sfr_fqn(conn, "SFR_MTCARS"))

In [None]:
%%R
library(dplyr)
library(dbplyr)

# Lazy reference to Snowflake table via the RSnowflake DBI connection
cars_tbl <- tbl(dbi_con, I(sfr_fqn(conn, "SFR_MTCARS")))

# dplyr pipeline -- generates SQL, runs on collect()
# (named cyl_summary to avoid masking base::summary)
cyl_summary <- cars_tbl |>
  group_by(CYL) |>
  summarise(
    n       = n(),
    avg_mpg = mean(MPG, na.rm = TRUE),
    avg_hp  = mean(HP, na.rm = TRUE)
  ) |>
  arrange(CYL) |>
  collect()

rprint(cyl_summary)
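
The pipeline above is translated into a single grouped SQL query that runs in the database. As a rough illustration of that query's shape, the sketch below runs a hand-written equivalent against an in-memory SQLite table using Python's standard library. The rows are toy values, not the real `mtcars` data, and the SQL that dbplyr actually generates for Snowflake will differ in quoting and aliasing; `dplyr::show_query()` on the lazy table shows the real statement.

```python
import sqlite3

# Toy rows (CYL, MPG, HP); illustrative values only, not the mtcars data.
rows = [(4, 22.0, 90), (4, 24.0, 60), (6, 21.0, 110), (6, 19.0, 100), (8, 15.0, 200)]

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE SFR_MTCARS (CYL INTEGER, MPG REAL, HP INTEGER)")
con.executemany("INSERT INTO SFR_MTCARS VALUES (?, ?, ?)", rows)

# group_by(CYL) + summarise(...) + arrange(CYL) correspond to
# GROUP BY, the aggregate expressions, and ORDER BY respectively.
query = """
SELECT CYL, COUNT(*) AS n, AVG(MPG) AS avg_mpg, AVG(HP) AS avg_hp
FROM SFR_MTCARS
GROUP BY CYL
ORDER BY CYL
"""
result = con.execute(query).fetchall()
for row in result:
    print(row)
con.close()
```

Only the aggregated rows, one per cylinder count, leave the database, which is the main benefit of building the pipeline lazily and calling `collect()` last.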

---
## 5. Visualization with ggplot2

Set the plot size with `%%R -w WIDTH -h HEIGHT` (pixels by default) and call
`print(p)` explicitly to render plots in Workspace Notebooks.

In [None]:
%%R -w 700 -h 450
library(ggplot2)

df <- sfr_read_table(conn, sfr_fqn(conn, "SFR_MTCARS"))

p <- ggplot(df, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point(size = 3) +
  labs(title = "MPG vs Weight by Cylinder Count",
       x = "Weight (1000 lbs)", y = "Miles per Gallon",
       color = "Cylinders") +
  theme_minimal()

print(p)  # print() required in Workspace Notebooks

---
## 6. Cleanup

In [None]:
%%R
# Uncomment to clean up demo objects
# (commented out to avoid accidental deletion on Run All)
#
# sfr_execute(conn, paste("DROP TABLE IF EXISTS", sfr_fqn(conn, "SFR_MTCARS")))
# rcat("Cleanup complete.")