Snowflake-native SQL features: semi-structured data, materialized views, CDC streams, and zero-copy cloning
Note: A Snowflake account is needed for full execution (free 30-day trial at snowflake.com). All Python logic is unit-testable locally via MockSnowflakeClient — tests pass without credentials.
| Feature | Description | File |
|---|---|---|
| Virtual Warehouse | Auto-suspend/resume compute — no idle cost | 01_setup.sql |
| VARIANT column | Schema-flexible JSON storage (no schema migrations needed) | 01_setup.sql |
| QUALIFY clause | Window function filter without subquery | 02_analytics_queries.sql |
| PARSE_JSON + FLATTEN | Query nested JSON arrays with LATERAL FLATTEN |
02_analytics_queries.sql |
| Time Travel | Query historical data: AT(OFFSET => -3600) |
02_analytics_queries.sql |
| Zero-copy clone | CLONE TABLE — instant dev copy, no storage cost |
01_setup.sql |
| Materialized View | Pre-computed daily revenue — auto-refreshed | 02_analytics_queries.sql |
| Task | Scheduled SQL — cron-like pipeline without Airflow | 02_analytics_queries.sql |
| Stream (CDC) | Capture every INSERT/UPDATE/DELETE incrementally | 01_setup.sql |
| Feature | Traditional DB | Snowflake |
|---|---|---|
| Scaling compute | Manual (DBA, downtime) | Instant (ALTER WAREHOUSE SET SIZE = 'LARGE') |
| Semi-structured data | Separate JSON column with text | Native VARIANT type with indexing |
| Historical queries | Manual backups | Built-in Time Travel (up to 90 days) |
| Dev environments | Full data copy | Zero-copy clone (instant, no storage cost) |
| Scheduled pipelines | Cron + scripts | Native Tasks (no external scheduler needed) |
| Data sharing | ETL to partner | Native Data Sharing (no copying) |
# 1. Run tests (no Snowflake account needed)
pip install pandas pytest
pytest tests/ -v
# 2. Connect to real Snowflake
cp .env.example .env
# Fill in your credentials
pip install snowflake-connector-python python-dotenv
python -c "
from dotenv import load_dotenv; load_dotenv()
from src.snowflake_client import SnowflakeClient
client = SnowflakeClient()
df = client.get_revenue_by_region()
print(df)
"
# 3. Run SQL files in Snowflake UI (Snowsight)
# Upload sql/01_setup.sql → run to create warehouse + tables
# Upload sql/02_analytics_queries.sql → run analytical queries- Virtual Warehouses: Snowflake separates storage from compute — you can have 10 warehouses of different sizes all querying the same data simultaneously
- QUALIFY clause: cleaner than wrapping a window function in a subquery — unique to Snowflake/BigQuery/Databricks
- VARIANT type: store JSON without a schema — use
:operator to navigate nested fields (METADATA:channel::STRING) - Time Travel: query data from any point in the past — invaluable for debugging pipeline failures ("what did the table look like before that ETL ran?")
- Zero-copy cloning: how Snowflake enables instant dev environments without duplicating storage
MIT
Part of a 10-project Data Analyst portfolio