--- 
## Part 1: QAS scale factor (the “speed budget” knob)

### What is it?

Think of QAS as a pool of **serverless scan workers** Snowflake can borrow to turbo-scan and pre-filter data for your query. The **scale factor** is your *speed budget*: an **upper bound** on how much of that serverless horsepower a warehouse is allowed to lease, expressed as a multiplier of the warehouse’s own size/cost. Default is **8**; set **0** to remove the cap. You’re billed by the second, separately from the warehouse. ([Snowflake Documentation][1])

> Example: a **MEDIUM** warehouse costs **4 credits/hour**. With a scale factor of **5**, QAS can spend up to an additional **20 credits/hour** *while it’s actively accelerating queries*. When no accelerated work is happening, that QAS spend is **0**. ([Snowflake Documentation][1])
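The same arithmetic works for other sizes. The sketch below assumes the standard-warehouse credit rates (X-Small = 1 credit/hour, doubling with each size) and a few example scale factors; the column aliases are made up for illustration.

```sql
-- Upper bound on QAS spend = warehouse credits/hour x scale factor.
-- Rates are the standard-warehouse rates; the scale factors are examples.
SELECT w.wh_size,
       w.wh_credits_per_hour,
       s.scale_factor,
       w.wh_credits_per_hour * s.scale_factor AS max_qas_credits_per_hour
FROM (VALUES ('XSMALL', 1), ('SMALL', 2), ('MEDIUM', 4), ('LARGE', 8))
       AS w (wh_size, wh_credits_per_hour)
CROSS JOIN (VALUES (2), (5), (8)) AS s (scale_factor)
ORDER BY w.wh_credits_per_hour, s.scale_factor;
```

The MEDIUM row with scale factor 5 gives 4 × 5 = 20, matching the example above. This is only a ceiling: actual QAS spend depends on how much acceleration really happens.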

### The real purpose (what problem it solves)

Some queries do huge scans but return a sliver of rows. Upsizing the warehouse speeds them up, but you then pay the larger size for *everything*, even the lightweight queries. QAS instead **offloads the scan+filter heavy lifting** to serverless compute for only the queries that need it, reducing wall-clock time for the hogs and relieving pressure on your warehouse for everyone else. It often helps “outlier” queries so the rest of the workload runs smoother. ([Snowflake Documentation][2])

### How it works (in plain English)

When Snowflake detects an eligible scan with selective filters, it **splits the scan across many serverless workers**. Those workers read micro-partitions, apply the filters, and stream back only the rows your warehouse actually needs. The warehouse still does the remaining plan (joins/agg/sort), but the slowest part—**scanning and filtering**—is dramatically parallelized. QAS **uses only as many workers as needed and available**, bounded by your scale factor and service capacity. Results vary with availability; estimates assume the service can allocate the full amount. ([Snowflake Documentation][1])

### What advantage does it bring?

* **Faster long scans** without keeping a permanently bigger warehouse.
* **Cost control**: you can cap spend with the scale factor; you pay per second only when QAS is actually used.
* **Better “mixed workloads”**: outliers are offloaded so ordinary queries aren’t starved.
* **Simple to try**: flip it on per warehouse; no code changes. ([Snowflake Documentation][2])
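To keep an eye on the cost-control claim, you can total the QAS credits per warehouse. This is a minimal sketch that assumes your role can read the `SNOWFLAKE.ACCOUNT_USAGE` schema; the month-to-date window is just an example.

```sql
-- QAS credits per warehouse, month to date.
-- QAS usage is recorded separately from warehouse compute.
SELECT warehouse_name,
       SUM(credits_used) AS qas_credits_used
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_ACCELERATION_HISTORY
WHERE start_time >= DATE_TRUNC('month', CURRENT_DATE())
GROUP BY warehouse_name
ORDER BY qas_credits_used DESC;
```

If a warehouse shows more QAS credits than you expected, lowering its scale factor is the first lever to try.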

### Quick enable (copy/paste)

```sql
-- Turn QAS on for a warehouse
ALTER WAREHOUSE my_wh SET ENABLE_QUERY_ACCELERATION = TRUE;

-- Give it a generous budget (remove cap)
ALTER WAREHOUSE my_wh SET QUERY_ACCELERATION_MAX_SCALE_FACTOR = 0;

-- Or set a sensible cap (e.g., 5x of the warehouse)
ALTER WAREHOUSE my_wh SET QUERY_ACCELERATION_MAX_SCALE_FACTOR = 5;
```

(Enterprise Edition or higher required.) ([Snowflake Documentation][2])
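To double-check that the settings took effect, one option is to read them back from `SHOW WAREHOUSES`. A small sketch, assuming the warehouse is called `my_wh` as in the snippet above; the column names are the lowercase ones that `SHOW WAREHOUSES` returns, so they need double quotes.

```sql
-- List the warehouse, then read the QAS columns from that result set.
SHOW WAREHOUSES LIKE 'my_wh';

SELECT "name",
       "size",
       "enable_query_acceleration",
       "query_acceleration_max_scale_factor"
FROM TABLE(RESULT_SCAN(LAST_QUERY_ID()));
```

Run the two statements back to back in the same session, because `LAST_QUERY_ID()` has to point at the `SHOW` command.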

### A realistic scenario (feel the pain → see the fix)

**Story:** Marketing fires a SQL that scans **12 TB** of events to answer “users who did A then B in a week.” On a MEDIUM warehouse, it takes \~**30 minutes**. You enable QAS with **scale factor = 8**. The exact same query drops to \~**7–10 minutes** because QAS parallelizes the **TableScan** nodes and ships back only the rows matching the selective filters. The warehouse finishes the joins/aggregations faster because it’s fed fewer rows. You didn’t touch the SQL, didn’t resize the warehouse, and teammates’ small queries stop feeling sluggish because the hog’s scan moved off the warehouse. ([Snowflake Documentation][1])

---

# Part 2 — Queries that are eligible for QAS (and how to check)

### The big idea

QAS focuses on **scan and filter** stages. Eligible commands include **SELECT, INSERT, CTAS, and COPY INTO <table>**. Within a supported command, QAS might accelerate the whole statement or just a subquery/clause if that part is eligible. In practice, you’ll see QAS attached to **TableScan** operators in the Query Profile. ([Snowflake Documentation][1])

> Note: Snowflake **expanded** INSERT support—previously only the scan part of `INSERT … SELECT` could be accelerated; now **all portions of eligible INSERTs** can be accelerated. ([Snowflake Documentation][3])

### What makes a query eligible?

A query (or a part of it) is typically eligible when **there’s enough parallelizable scan work** and **filters are selective**. Common reasons **not** eligible:

* **Not enough partitions to scan** (too little scan work to offset QAS overhead).
* **Low selectivity filters** or **high-cardinality GROUP BY** that kills the benefit.
* **LIMIT without ORDER BY** (nondeterministic result handling).
* Use of **nondeterministic functions** (e.g., `RANDOM()`, `SEQ`) in ways that block acceleration.
  These rules evolve, but that’s the gist today. ([Snowflake Documentation][1])
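As an illustration of the `LIMIT` rule from the list above, here is a sketch using a hypothetical `events` table (the table and column names are assumptions, not from any real schema). Adding `ORDER BY` changes the meaning of the query and adds a sort, so treat it as something to test, not a free fix.

```sql
-- Shape the list above flags as ineligible: LIMIT with no ORDER BY.
SELECT user_id, event_ts
FROM events                       -- hypothetical table
WHERE event_name = 'purchase'
LIMIT 100;

-- Same filter with an explicit ordering. This removes the LIMIT blocker,
-- but the other conditions (enough partitions, selective filters) still apply.
SELECT user_id, event_ts
FROM events                       -- hypothetical table
WHERE event_name = 'purchase'
ORDER BY event_ts DESC
LIMIT 100;
```

After running both, the estimate function described in the next section can tell you whether the second form actually qualifies.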

### How to check eligibility (two reliable tools)

**1) Ask Snowflake about a specific past query (last 14 days):**

```sql
-- Returns JSON with status and estimated times at several scale factors
SELECT PARSE_JSON(SYSTEM$ESTIMATE_QUERY_ACCELERATION('<query_id>'));
```

Look for `"status": "eligible"` and `"estimatedQueryTimes"`, plus `"upperLimitScaleFactor"`. If it says `"ineligible"`, QAS won’t help that query as written. (Works only for queries executed in the last **14 days**.) ([Snowflake Documentation][4])
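If you would rather read the estimate as rows than as raw JSON, a sketch like the one below flattens it. It assumes the field names mentioned above plus `originalQueryTime`, and that the times are reported in seconds; the column aliases are made up for this example.

```sql
-- One row per scale factor in estimatedQueryTimes.
WITH est AS (
  SELECT PARSE_JSON(SYSTEM$ESTIMATE_QUERY_ACCELERATION('<query_id>')) AS j
)
SELECT est.j:status::STRING             AS status,
       est.j:originalQueryTime::FLOAT   AS original_seconds,
       est.j:upperLimitScaleFactor::INT AS upper_limit_scale_factor,
       f.key::INT                       AS scale_factor,
       f.value::FLOAT                   AS estimated_seconds
FROM est,
     LATERAL FLATTEN(input => est.j:estimatedQueryTimes) f
ORDER BY scale_factor;
```

An ineligible query has nothing to flatten, so this returns zero rows in that case; fall back to the plain `PARSE_JSON` call to read the status.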

**2) Hunt across history for best candidates:**

```sql
-- Which queries would benefit most (by time eligible for acceleration)?
SELECT query_id, eligible_query_acceleration_time
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_ACCELERATION_ELIGIBLE
WHERE start_time > DATEADD('day', -7, CURRENT_TIMESTAMP())
ORDER BY eligible_query_acceleration_time DESC;

-- Which warehouses have the most eligible time?
SELECT warehouse_name, SUM(eligible_query_acceleration_time) AS total_eligible_time
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_ACCELERATION_ELIGIBLE
WHERE start_time > DATEADD('day', -7, CURRENT_TIMESTAMP())
GROUP BY warehouse_name
ORDER BY total_eligible_time DESC;

-- What “upper limit” scale factor Snowflake would consider for this warehouse?
SELECT MAX(upper_limit_scale_factor)
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_ACCELERATION_ELIGIBLE
WHERE warehouse_name = 'MY_WH'
  AND start_time > DATEADD('day', -7, CURRENT_TIMESTAMP());
```

These views show *how much* of a query’s time is eligible and what **upper limit scale factor** the service would consider. ([Snowflake Documentation][1])

### “Only fetch and filtering operations are executed on QAS” — let’s be precise

* **Mostly true for SELECT/CTAS/COPY**: QAS accelerates **scan/filter** work (you’ll see **TableScan** nodes with “Query Acceleration” stats in the profile). Joins, aggregations, sorts are still executed by your warehouse. ([Snowflake Documentation][1])
* **INSERT got broader**: as of a 2024 change, **all portions of eligible INSERT** statements can be accelerated, not just the scan part of `INSERT … SELECT`. That’s a nuance worth remembering. ([Snowflake Documentation][3])

### How to confirm QAS actually helped a run (after you enable it)

* **Query Profile → Overview**: look for “**Query Acceleration**” stats (e.g., *Partitions scanned by service*, *Scans selected for acceleration*).
* **Account Usage → QUERY_HISTORY**: the columns
  `QUERY_ACCELERATION_BYTES_SCANNED`,
  `QUERY_ACCELERATION_PARTITIONS_SCANNED`, and
  `QUERY_ACCELERATION_UPPER_LIMIT_SCALE_FACTOR`
  will be > 0 for accelerated queries. (Note: bytes/partitions can appear *higher* because of intermediary results QAS creates—that’s expected.) ([Snowflake Documentation][1])
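A quick way to list recent accelerated queries is to filter on those columns. This sketch uses a 7-day window and a 20-row limit as arbitrary examples.

```sql
-- Recent queries where QAS actually scanned something.
SELECT query_id,
       warehouse_name,
       total_elapsed_time / 1000 AS elapsed_seconds,
       query_acceleration_bytes_scanned,
       query_acceleration_partitions_scanned,
       query_acceleration_upper_limit_scale_factor
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
WHERE start_time > DATEADD('day', -7, CURRENT_TIMESTAMP())
  AND query_acceleration_bytes_scanned > 0
ORDER BY query_acceleration_bytes_scanned DESC
LIMIT 20;
```

Account Usage views lag behind real time, so a query you just ran may not show up immediately.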

---

## Putting it together: a 10-minute, safe experiment

1. **Find a candidate:**

```sql
SELECT query_id
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_ACCELERATION_ELIGIBLE
WHERE start_time > DATEADD('day', -7, CURRENT_TIMESTAMP())
ORDER BY eligible_query_acceleration_time DESC
LIMIT 1;
```

2. **Estimate the benefit:**

```sql
SELECT PARSE_JSON(SYSTEM$ESTIMATE_QUERY_ACCELERATION('<that_query_id>'));
```

3. **Enable QAS on your dev/test warehouse and run once more:**

```sql
ALTER WAREHOUSE dev_wh SET ENABLE_QUERY_ACCELERATION = TRUE, QUERY_ACCELERATION_MAX_SCALE_FACTOR = 5;
```

4. **Verify effect (either place):**

* Query Profile → see “Query Acceleration” numbers.
* `SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY` → the QAS columns > 0. ([Snowflake Documentation][1])

---

## Gotchas & pro tips 

* **Not a concurrency feature:** Scale factor ≠ “more slots.” QAS speeds *parts of eligible queries*; **queues** still depend on warehouse size and multi-cluster settings. Use both if you need throughput *and* faster scans. ([Snowflake Documentation][1])
* **Set a cap first:** Start with **2–5**, measure, then raise. **0** is for “go fast, we’ll pay what it takes”. ([Snowflake Documentation][1])
* **Eligibility evolves:** Snowflake keeps expanding patterns it can accelerate; re-check periodically. ([Snowflake Documentation][1])
* **Enterprise edition required.** ([Snowflake Documentation][2])
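For the "set a cap first" tip, one way to pick a starting number is to look at how the upper-limit scale factor is distributed across a warehouse's eligible queries. This is a sketch, not an official sizing method; the percentiles and the 7-day window are arbitrary choices.

```sql
-- Spread of upper-limit scale factors per warehouse over the last week.
SELECT warehouse_name,
       COUNT(*)                                         AS eligible_queries,
       MEDIAN(upper_limit_scale_factor)                 AS p50_upper_limit,
       APPROX_PERCENTILE(upper_limit_scale_factor, 0.9) AS p90_upper_limit,
       MAX(upper_limit_scale_factor)                    AS max_upper_limit
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_ACCELERATION_ELIGIBLE
WHERE start_time > DATEADD('day', -7, CURRENT_TIMESTAMP())
GROUP BY warehouse_name
ORDER BY eligible_queries DESC;
```

If the median is 3 and the maximum is 40, a cap near the median covers most queries while keeping the rare outlier from driving the bill.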

---

## Practice questions (to make sure it sticks)

1. Explain—in one minute—**what the QAS scale factor** controls and how **cost** is bounded when set to 5 on a MEDIUM warehouse. (Give the math.) ([Snowflake Documentation][1])
2. Name **four reasons** a query may be **ineligible** for QAS and how you’d rewrite the query or recluster the table to improve eligibility. ([Snowflake Documentation][1])
3. Walk through how you would use **`SYSTEM$ESTIMATE_QUERY_ACCELERATION`** and **`QUERY_ACCELERATION_ELIGIBLE`** to pick a scale factor for a new warehouse. What does `"upperLimitScaleFactor"` mean? ([Snowflake Documentation][4])
4. In the **Query Profile**, where do you see that QAS was used, and what metrics prove it? What does it mean if total bytes scanned are *higher* than without QAS? ([Snowflake Documentation][1])
5. Clarify this statement: “QAS accelerates **only** scan/filter stages.” When is that accurate, and what changed for **INSERT** statements? ([Snowflake Documentation][1])

---

## **Practice Questions for QAS**

### **1. When would you enable Query Acceleration Service (QAS) for a warehouse?**

👉 **Answer:**
I would enable QAS when a warehouse runs **long-running, scan-heavy queries** whose slow response times come from the scan itself rather than from queueing.

Example:
Suppose my analytics team runs **ad-hoc queries** on a 50 TB sales table. A typical query filters the last 2 years of data and adds a few joins. Even on a `LARGE` warehouse, the response time is 2–3 minutes. Scaling up (to XL or 2XL) would speed up the scan, but it raises the hourly rate for **every** query on that warehouse, including the lightweight ones, for as long as the warehouse runs.

With QAS enabled, Snowflake offloads eligible scan and filter work to **shared serverless compute** that is leased only while the query needs it. The warehouse isn’t overloaded, and the heavy queries complete faster.

So the **trigger condition** is:

* Heavy scans (billions of rows, wide fact tables)
* When multi-cluster scaling doesn’t help because it only solves queueing, not execution time
* When SLA-sensitive queries need faster performance without constantly running an oversized warehouse
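To spot queries that match these trigger conditions, you could scan recent history for long-running, scan-heavy statements. The thresholds below (1 minute, 1,000 partitions) are arbitrary starting points, not Snowflake recommendations.

```sql
-- Long-running SELECTs that scanned many partitions in the last week.
SELECT query_id,
       warehouse_name,
       execution_time / 1000          AS execution_seconds,
       bytes_scanned / POWER(1024, 3) AS gb_scanned,
       partitions_scanned,
       partitions_total,
       rows_produced
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
WHERE start_time > DATEADD('day', -7, CURRENT_TIMESTAMP())
  AND query_type = 'SELECT'
  AND execution_time > 60 * 1000      -- longer than 1 minute (arbitrary)
  AND partitions_scanned > 1000       -- "large scan" threshold (arbitrary)
ORDER BY bytes_scanned DESC
LIMIT 20;
```

A large `gb_scanned` next to a small `rows_produced` is the "huge scan, small result" shape that QAS is aimed at.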

---

### **2. How does QAS differ from Multi-cluster scaling?**

👉 **Answer:**

* **Multi-cluster scaling (horizontal scaling):**
  Solves **queueing issues**. If 50 users submit queries at once, Snowflake spawns new clusters so queries run in parallel. But each query still takes the same amount of time.

* **Query Acceleration Service (QAS):**
  Solves **long-running query execution issues**. Instead of queueing, here a single query gets divided into smaller tasks. QAS spins up **short-lived compute resources** that execute scan and filtering in parallel with the warehouse, reducing latency.

Think of it this way:

* Multi-cluster = **more checkout counters at a supermarket** (reduces queue)
* QAS = **more staff helping you inside the store to find products quickly** (reduces time per transaction).
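To tell which of the two problems a warehouse has, compare time spent queued with time spent executing. A rough sketch over the last 7 days (the window is arbitrary):

```sql
-- High queue time suggests a concurrency problem (multi-cluster);
-- high execution time with little queueing suggests slow queries (QAS candidates).
SELECT warehouse_name,
       COUNT(*)                         AS queries,
       AVG(queued_overload_time) / 1000 AS avg_queued_seconds,
       AVG(execution_time) / 1000       AS avg_execution_seconds
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
WHERE start_time > DATEADD('day', -7, CURRENT_TIMESTAMP())
  AND warehouse_name IS NOT NULL
GROUP BY warehouse_name
ORDER BY avg_queued_seconds DESC;
```

Averages hide outliers, so treat this as a first pass before drilling into individual queries.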

---

### **3. What is the Query Acceleration Scale Factor?**

👉 **Answer:**
The **scale factor** is a **configuration parameter** that controls **how much additional compute QAS can use** relative to the warehouse size.

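* Example: Scale Factor = `8` on a `MEDIUM` warehouse (4 credits/hour)
  → QAS can lease serverless compute costing up to 8× the warehouse rate, i.e. up to 32 additional credits/hour while it is actively accelerating queries.
* A non-zero value caps QAS spend; setting it to `0` removes the cap.

**Purpose:**
It prevents runaway costs by capping the QAS resources a warehouse can lease. With the cap removed (scale factor `0`), a large scan can lease as much QAS compute as the service has available, and the cost grows with it.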

**Real-life scenario:**
Imagine a finance department running **complex historical queries** at month-end. Normally, these queries would take 30–40 minutes. With QAS scale factor = 8, the same queries finish in \~8–10 minutes. Finance team delivers reports before the business meeting.

---

### **4. Which queries are eligible for QAS?**

👉 **Answer:**
Snowflake has strict eligibility rules. QAS is designed only for **scan and filter-heavy queries**. Specifically:

✅ Eligible:

* Queries where a **large amount of raw data must be scanned and filtered** before returning a small result set
* Queries where warehouse compute becomes bottlenecked on scanning, not CPU for joins/aggregations

❌ Not Eligible:

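* Queries whose time is dominated by **joins, sorts, or window functions** rather than scanning: the scan portion may still qualify, but there is little for QAS to speed up
* Queries with no selective filter, or with a **high-cardinality `GROUP BY`**
* Queries that are already short-running (<1 sec) or touch too few partitions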

**Checking eligibility in Snowflake:**

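* Use `SYSTEM$ESTIMATE_QUERY_ACCELERATION('<query_id>')` → reports whether a past query is eligible and its estimated run time at several scale factors
* Query the `SNOWFLAKE.ACCOUNT_USAGE.QUERY_ACCELERATION_ELIGIBLE` view → lists past queries that qualified and how much of their time was eligible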

**Example:**
Suppose I run:

```sql
SELECT *
FROM sales
WHERE region = 'APAC' AND order_date > '2022-01-01';
```

* The query scans 20 TB of sales data but returns only 200 MB.
* Eligible for QAS → filtering/scanning can be offloaded.

But if I run:

```sql
SELECT region, SUM(revenue)
FROM sales
GROUP BY region;
```

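* No selective filter: every row has to be read and grouped, so there is far less for QAS to offload. Whether it helps at all depends on the `GROUP BY` cardinality; check with `SYSTEM$ESTIMATE_QUERY_ACCELERATION` rather than assuming.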

---

### **5. What parts of query execution are accelerated by QAS?**

👉 **Answer:**
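For SELECT, CTAS, and COPY statements, QAS **only accelerates scan + filter operations** (eligible INSERT statements are the exception covered in Part 2). That means: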

* Reading rows from storage
* Applying WHERE clause filters
* Projecting columns

But **NOT accelerated**:

* Joins
* Aggregations (`SUM`, `COUNT`, `GROUP BY`)
* Window functions
* Sorting

**Why?** Because QAS is built to parallelize I/O-heavy workloads. Once data is filtered down, the warehouse nodes still do the heavy lifting for joins and aggregations.

Example:

* Query scans 50 TB → filters down to 500 GB → QAS parallelizes this scan
* After filtering, warehouse does join/aggregate on the reduced dataset

So QAS acts as a **turbocharger for data scanning**, not a replacement for the warehouse compute engine.

---


## How QAS works under the hood
---

# scene: “month-end crunch at Streamly”

* table: `FACT_EVENTS` (near-petabyte-scale clickstream)
* layout: natural Snowflake micro-partitions (created as data lands), plus light clustering on `(event_date, country)`
* shape: \~9 months of data, \~35 billion rows, \~600 TB compressed
* typical query (slow one): product asks for “active users who did A then B within 15 minutes last 30 days, country = 'IN'”, feeding a dashboard tile

```sql
-- the slow query
WITH a AS (
  SELECT user_id, event_ts
  FROM FACT_EVENTS
  WHERE event_name = 'A'
    AND event_date >= DATEADD(day, -30, CURRENT_DATE())
    AND country = 'IN'
),
b AS (
  SELECT user_id, event_ts
  FROM FACT_EVENTS
  WHERE event_name = 'B'
    AND event_date >= DATEADD(day, -30, CURRENT_DATE())
    AND country = 'IN'
)
SELECT COUNT(DISTINCT a.user_id)
FROM a
JOIN b
  ON a.user_id = b.user_id
 AND b.event_ts BETWEEN a.event_ts AND a.event_ts + INTERVAL '15 minutes';
```

On a **MEDIUM** warehouse this takes \~20–25 minutes when traffic spikes. You enable QAS with a modest cap:

```sql
ALTER WAREHOUSE ANALYTICS_WH
  SET ENABLE_QUERY_ACCELERATION = TRUE,
      QUERY_ACCELERATION_MAX_SCALE_FACTOR = 4;  -- “speed budget” cap
```

---

# how QAS works — step by step (what actually happens)

## 0) quick mental model

* **warehouse** = the conductor and main band (does planning, joins, aggs, final stages).
* **QAS** = a temporary brass section Snowflake spins up *only* to blast through **TableScan + WHERE** work, then disappears.

---

## 1) compile & prune (warehouse)

* the planner builds a graph like:

```
TableScan(FACT_EVENTS) → Filter (date, country, event_name) → Project (user_id, event_ts)
            ↘ (same again for subquery b)
                    → Join on user_id/time-window
                    → Aggregate COUNT DISTINCT
```

* **metadata pruning** removes micro-partitions that are obviously irrelevant (older than 30 days, wrong country, wrong event). this already saves a ton.
* what’s left is still **a very large set of micro-partitions** (MPs) because India traffic is huge—classic **data skew**.
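If you want to see how much pruning is really happening, two checks can help. This is a sketch: `FACT_EVENTS` and its clustering columns come from the scene above, and `<query_id>` stands for one past run of the slow query.

```sql
-- How well is the table clustered on the columns the filters use?
SELECT PARSE_JSON(
         SYSTEM$CLUSTERING_INFORMATION('FACT_EVENTS', '(event_date, country)')
       ) AS clustering_info;

-- How many micro-partitions survived pruning for one past run?
SELECT query_id,
       partitions_scanned,
       partitions_total,
       ROUND(100 * partitions_scanned / NULLIF(partitions_total, 0), 1)
         AS pct_partitions_scanned
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
WHERE query_id = '<query_id>';
```

A low percentage with a still-large `partitions_scanned` is the situation described here: pruning worked, but plenty of scan work remains for QAS.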

---

## 2) eligibility check (warehouse)

* the planner marks each **TableScan** with “acceleration potential”:

  * big remaining scan? ✅
  * selective filters? (`country='IN'`, `event_name='A'`/`'B'`, `event_date` in the last 30 days) ✅
  * enough MPs to parallelize? ✅
    → **eligible**.

---

## 3) choose a “speed budget” (warehouse + QAS control plane)

* the service considers:

  * your **scale factor cap** (=4)
  * an internal **upper-limit** for this query (based on partitioning & selectivity)
  * current service availability
* result is a target parallelism for helper workers (say **\~3.5×** the warehouse, bounded at **4×**).

> this doesn’t make the warehouse “stronger”; it just adds **more parallel scan+filter workers** elsewhere so your band isn’t stuck reading the whole library alone.
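The first two inputs combine as a simple minimum. The sketch below shows that arithmetic over past eligible queries; the actual allocation is decided by the service and also depends on availability, so read the `bounded_by_cap` column (a made-up alias) as a ceiling, not a prediction.

```sql
-- Per-query upper limit, bounded by the warehouse cap of 4 set above.
SELECT query_id,
       upper_limit_scale_factor,
       LEAST(upper_limit_scale_factor, 4) AS bounded_by_cap
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_ACCELERATION_ELIGIBLE
WHERE warehouse_name = 'ANALYTICS_WH'
  AND start_time > DATEADD('day', -7, CURRENT_TIMESTAMP())
ORDER BY upper_limit_scale_factor DESC
LIMIT 20;
```

Queries whose upper limit is well above 4 are the ones that would gain the most from a higher cap.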

---

## 4) slice the remaining scan into **subtasks** (QAS)

* QAS groups the eligible micro-partitions into **scan chunks**.
* think of micro-partitions like pages in many boxes:

```
Remaining MPs after pruning (illustrative):

[MP_101 ... MP_150]  (event_date D-30..D-25, country IN)
[MP_201 ... MP_480]  (D-24..D-10, country IN)  <-- heavy zone (campaign spike)
[MP_481 ... MP_620]  (D-9..D,    country IN)
```

* QAS carves these into **balanced bundles** (subtasks). heavy zones get split **more finely** to avoid stragglers:

```
Subtasks:
T1: MP_101..115
T2: MP_116..130
T3: MP_131..150
T4: MP_201..230
T5: MP_231..260
T6: MP_261..290
...
T18: MP_451..480
T19: MP_481..545
T20: MP_546..620
```

* subtasks are sized so they’re **roughly equal wall time**, not equal count. if QAS notices a task lagging, it **splits it again** and hands the remainder to another helper (anti-straggler).

---

## 5) offload & stream back (QAS workers)

for each subtask `Ti`, a serverless worker:

1. **reads** the assigned micro-partitions directly from storage.
2. **applies your WHERE filters** (`event_name`, `country`, `date`).
3. **projects** only needed columns (`user_id`, `event_ts`).
4. **streams** the filtered rows back toward the warehouse.

ASCII view:

```
       QAS worker A  ← T4, T5, T6 (partial)
       QAS worker B  ← T7, T8
Warehouse  ←←← filtered rows (user_id, event_ts) ← QAS worker C  ← T9
       QAS worker D  ← T10 (heavy → split into T10a/T10b on the fly)
```

* **no joins/aggregations** happen in QAS; it’s just turbo-scanning and filtering.

---

## 6) warehouse keeps the “brain” work

* while QAS streams rows, the warehouse:

  * buffers them,
  * executes the **JOIN** (time-window) between `a` and `b`,
  * runs the **COUNT DISTINCT**,
  * returns results.

The **join/agg stages run earlier and faster** because rows arrive **pre-filtered** and **much sooner** than if the warehouse had to do all scans itself.

---

## 7) dynamic balancing & straggler control (during the run)

* if worker **D** is stuck on a fat subtask (e.g., a day with a cricket final!), QAS splits it:

  * `T10` → `T10a` + `T10b`
  * assigns `T10b` to another idle worker
* this keeps the **longest tail** short and avoids the “one last partition” syndrome.

---

## 8) wrap-up & teardown

* when all scan subtasks finish, QAS workers vanish.
* you pay **only for seconds used**.
* the Query Profile now shows a **“Query Acceleration”** section on the `TableScan` nodes (e.g., *Partitions scanned by service*, *Bytes scanned by service*).

---

# the same story as a numbered flow (quick recap)

1. query arrives → plan built
2. metadata pruning removes obvious MPs
3. planner flags TableScan as **eligible**
4. QAS decides **parallelism** bounded by your **scale factor**
5. QAS splits remaining MPs into **balanced subtasks**
6. serverless workers **read + filter + project** and **stream back**
7. warehouse **joins/aggregates** on a much smaller, earlier stream
8. QAS **tears down**, you see acceleration stats & separate serverless cost

---

# tiny lab you can run (safe & concrete)

> goal: feel QAS decisions and see their effect in the profile (you’ll see acceleration stats on the `TableScan`).

```sql
-- 1) run your slow query once with QAS still off; note its query_id
-- (from history or UI)

-- 2) ask Snowflake how much QAS *would* help that exact run
SELECT PARSE_JSON(SYSTEM$ESTIMATE_QUERY_ACCELERATION('<your_query_id>')) AS est;

-- look for:
-- est:status = 'eligible'  ('accelerated' means QAS was already on for that run)
-- est:originalQueryTime vs est:estimatedQueryTimes: { "1": ..., "2": ..., "3": ... }
-- est:upperLimitScaleFactor

-- 3) turn on QAS for the warehouse with a safe cap
ALTER WAREHOUSE ANALYTICS_WH
  SET ENABLE_QUERY_ACCELERATION = TRUE,
      QUERY_ACCELERATION_MAX_SCALE_FACTOR = 3;

-- 4) re-run the query (result cache off so it really executes), then open Query Profile
ALTER SESSION SET USE_CACHED_RESULT = FALSE;
-- In the TableScan node(s), check:
-- - Scans selected for acceleration
-- - Partitions scanned by service
-- - Bytes scanned by service

-- 5) optional: see numbers in history for the re-run
-- (ACCOUNT_USAGE views lag behind real time, so the row may take a while to appear)
SELECT query_id,
       query_acceleration_partitions_scanned,
       query_acceleration_bytes_scanned,
       query_acceleration_upper_limit_scale_factor
FROM   SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
WHERE  query_id = '<rerun_query_id>';
```

**What you’ll observe:**

* with QAS **off**: TableScan duration dominates.
* with QAS **on**: TableScan duration shrinks; you’ll see **partitions/bytes “by service”** > 0, and overall elapsed time drops. on busy days, you’ll notice **more subtasks** (more partitions “by service”).

---

# important clarifications (common traps)

* “QAS makes my warehouse faster.”
  → not exactly. the warehouse doesn’t get stronger; **scan work moves off** to serverless helpers, so **end-to-end time** improves and the warehouse has more headroom for joins/agg.

* “QAS is for queues.”
  → queues are about **concurrency**; fix with **multi-cluster**. QAS is about **long scan/filters**.

* “Only fetch/filter run on QAS.”
  → correct for SELECT/CTAS/COPY: it’s **TableScan + WHERE + projection**. the **join/agg** stages are warehouse work.

---

# a second mini-scenario (to cement it)

You also run:

```sql
SELECT
  DATE_TRUNC('day', event_ts) AS d,
  COUNT(*) AS clicks
FROM FACT_EVENTS
WHERE event_date >= DATEADD(day, -7, CURRENT_DATE())
  AND country IN ('IN','BD','PK')
GROUP BY 1
ORDER BY 1;
```

* **why QAS helps**: huge 7-day scan; filters are selective; aggregation is cheap once rows are filtered.
* **how subtasks form**: QAS splits the remaining MPs by day & country hot spots (e.g., Sunday traffic), balances them, streams filtered rows; warehouse groups by day quickly.
* **what you see**: TableScan shows many partitions “by service”; total time shrinks from, say, 6m → 2m on the same warehouse.

---
