# Getting Started with Interactive Tables

## Load libraries and define custom functions

First, make sure the prerequisite libraries are installed in your Snowflake Notebook environment.

To do this, click on **Packages** and add the following packages to the **Anaconda Packages** tab:
- `matplotlib`
- `tabulate`

In [None]:
import snowflake.connector as snow
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import time
import datetime
import random
import statistics
import tabulate
from concurrent.futures import ThreadPoolExecutor, as_completed

conn_kwargs={}

def execute_and_print(query):
    cursor.execute(query)
    print(tabulate.tabulate(cursor.fetchall()))

def run_and_measure(count, mode):

    if mode =="std":
        query = """
                SELECT SearchEngineID, ClientIP, COUNT(*) AS c, SUM(IsRefresh), AVG(ResolutionWidth) FROM 
                BENCHMARK_FDN.HITS2_CSV
                WHERE SearchPhrase <> '' GROUP BY SearchEngineID, ClientIP ORDER BY c DESC LIMIT 10;
                """
        warehouse_query ="USE WAREHOUSE wh"
    else:
        query = """
                SELECT SearchEngineID, ClientIP, COUNT(*) AS c, SUM(IsRefresh), AVG(ResolutionWidth) FROM 
                BENCHMARK_INTERACTIVE.CUSTOMERS
                WHERE SearchPhrase <> '' GROUP BY SearchEngineID, ClientIP ORDER BY c DESC LIMIT 10;
                """
        warehouse_query ="USE WAREHOUSE interactive_demo_b"
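    timings = []
    # Open a dedicated connection so the benchmark does not disturb the notebook's session
    with snow.connect(**conn_kwargs) as conn:
        with conn.cursor() as cur:
            cur.execute(warehouse_query)
            # Disable the result cache on this connection so every run actually executes the query
            cur.execute('ALTER SESSION SET USE_CACHED_RESULT = FALSE;')
            for i in range(count+1):
                t0 = time.time()
                cur.execute(query).fetchall()
                time_taken = time.time() - t0
                timings.append(time_taken)

    # Drop the first (warm-up) run and return the remaining timings in seconds
    return timings[1:]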
    
def plot_data(data, title, time_taken, color='#29B5E8'):
    # Separate titles and counts
    titles = [item[0] for item in data]
    counts = [item[1] for item in data]

    # Plot bar chart
    
    plt.figure(figsize=(12, 4))
    plt.bar(titles, counts, color=color)
    plt.xticks(rotation=45, ha='right')
    plt.ylabel("Counts")
    plt.xlabel("Title")
    plt.title(title)
    plt.text(0.5, 1.5, f'Time taken: {time_taken:.4f} seconds',
         ha='center', va='top',
         transform=plt.gca().transAxes,
         fontdict={'size': 16})
    #plt.tight_layout()
    plt.show()

# Placeholder latencies in seconds; both lists are overwritten by run_and_measure() in the final section
#titles = ['Run 1', 'Run 2', 'Run 3', 'Run 4', 'Run 5', 'Run 6', 'Run 7', 'Run 8']
counts_std = [0.1,0.15, 0.09, 0.12, 0.11, 0.13, 0.10, 0.14]
counts_iw = [0.05, 0.08, 0.07, 0.06, 0.09, 0.08, 0.07, 0.06]
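
The warm-up-then-measure pattern used by `run_and_measure` can be tried out without a Snowflake connection. The sketch below is a hypothetical variant (`measure_latencies` is not part of the original notebook) that accepts any zero-argument callable, discards the warm-up run, and uses `time.perf_counter`, which is generally better suited to measuring intervals than `time.time`:

```python
import time

def measure_latencies(run_query, count, warmup=1):
    """Call run_query() warmup + count times; return the last `count` timings in seconds."""
    timings = []
    for _ in range(warmup + count):
        t0 = time.perf_counter()
        run_query()
        timings.append(time.perf_counter() - t0)
    # Discard the warm-up runs, mirroring `timings[1:]` in run_and_measure
    return timings[warmup:]

# Stand-in workload; in the notebook this would wrap cur.execute(query).fetchall()
latencies = measure_latencies(lambda: sum(range(10_000)), count=5)
print([f"{t:.6f}" for t in latencies])
```

Passing the workload as a callable keeps the timing logic separate from the connection handling, so the same helper could time either warehouse.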

## Setting up connection to a Snowflake deployment and verifying versions

Here, we'll connect to Snowflake, verify its version, and confirm that the key interactive features are enabled before setting the active database and role for the session.
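
The `config` dictionary in the next cell is left empty for you to fill in. One possible way to populate it, sketched here as an assumption rather than the notebook's own method, is to read credentials from environment variables. The variable names (`SNOWFLAKE_ACCOUNT` and so on) and the `build_config` helper are hypothetical; the resulting keys (`account`, `user`, `password`, `role`, `warehouse`, `database`) are keyword arguments accepted by `snowflake.connector.connect`:

```python
import os

def build_config(env=None):
    """Build keyword arguments for snowflake.connector.connect() from environment variables."""
    env = os.environ if env is None else env
    # Hypothetical variable names; adapt them to your own setup
    required = {
        "account": "SNOWFLAKE_ACCOUNT",
        "user": "SNOWFLAKE_USER",
        "password": "SNOWFLAKE_PASSWORD",
    }
    optional = {
        "role": "SNOWFLAKE_ROLE",
        "warehouse": "SNOWFLAKE_WAREHOUSE",
        "database": "SNOWFLAKE_DATABASE",
    }
    missing = [var for var in required.values() if not env.get(var)]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    config = {key: env[var] for key, var in required.items()}
    config.update({key: env[var] for key, var in optional.items() if env.get(var)})
    return config

# Demo with a stand-in environment (placeholder values, not real credentials)
demo_env = {
    "SNOWFLAKE_ACCOUNT": "my_account",
    "SNOWFLAKE_USER": "my_user",
    "SNOWFLAKE_PASSWORD": "not-a-real-password",
    "SNOWFLAKE_ROLE": "SYSADMIN",
}
print(sorted(build_config(demo_env)))
```

With real variables set, `config = build_config()` could then be passed on as `snow.connect(**config)`.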

In [None]:
config = { }
cursor = snow.connect(**config).cursor()
execute_and_print('select current_version();')
execute_and_print("show parameters like 'ENABLE_INTERACTIVE_WAREHOUSES' for account;")
execute_and_print("show parameters like 'ENABLE_INTERACTIVE_TABLE_DDL' for account;")
execute_and_print("show parameters like 'SHOW_INCLUDE_INTERACTIVE_TABLES' for account;")
query = """ USE DATABASE MY_DEMO_DB; """
execute_and_print(query)

query = """ USE ROLE SYSADMIN;  """
execute_and_print(query)
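
Printing the `SHOW PARAMETERS` output leaves the check to the reader. A small helper could turn it into an explicit test. The sketch below assumes that each result row starts with the parameter name followed by its current value (the `key` and `value` columns of `SHOW PARAMETERS`); `parameter_enabled` is a hypothetical helper, and the sample rows are made up for illustration, not real account output:

```python
def parameter_enabled(rows, name):
    """Return True if `name` appears in SHOW PARAMETERS rows with the value 'true'."""
    for row in rows:
        key, value = row[0], row[1]  # assumed column order: key, value, ...
        if str(key).upper() == name.upper():
            return str(value).lower() == "true"
    # Parameter not listed at all
    return False

# Illustrative rows shaped like cursor.fetchall() output (values are invented)
sample_rows = [
    ("ENABLE_INTERACTIVE_WAREHOUSES", "true", "false", "ACCOUNT", "", "BOOLEAN"),
    ("ENABLE_INTERACTIVE_TABLE_DDL", "false", "false", "", "", "BOOLEAN"),
]
print(parameter_enabled(sample_rows, "ENABLE_INTERACTIVE_WAREHOUSES"))  # True
print(parameter_enabled(sample_rows, "ENABLE_INTERACTIVE_TABLE_DDL"))   # False
```

In the notebook, `rows` would come from `cursor.execute("show parameters like '...' for account;").fetchall()`.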

## Create an interactive warehouse and turn it on
![alt text](https://github.com/sfc-gh-cnantasenamat/sfquickstarts/blob/patch-4/site/sfguides/src/getting_started_with_interactive_tables/assets/create-turn-on-interactive-warehouse.png?raw=true)

Next, let's create our `interactive_demo_b` warehouse and immediately turn it on:

In [None]:
query = """
CREATE OR REPLACE INTERACTIVE WAREHOUSE interactive_demo_b
                WAREHOUSE_SIZE = 'XSMALL'
                MIN_CLUSTER_COUNT = 1
                MAX_CLUSTER_COUNT = 1
                COMMENT = 'Interactive warehouse demo';
"""
execute_and_print(query)
query = """
ALTER WAREHOUSE INTERACTIVE_DEMO_B RESUME;
"""
execute_and_print(query)
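
Resuming a warehouse may not be instantaneous, so it can be useful to wait until it reports that it is running before continuing. The polling helper below is a generic sketch: `wait_for_state` is hypothetical, and it assumes that an interactive warehouse reports the same `STARTED` and `SUSPENDED` states that `SHOW WAREHOUSES` shows for standard warehouses. The state lookup is passed in as a callable so the logic can be tried without a connection:

```python
import time

def wait_for_state(get_state, target="STARTED", timeout=60.0, poll_interval=2.0):
    """Poll get_state() until it returns `target`; return False if `timeout` seconds elapse first."""
    deadline = time.monotonic() + timeout
    while True:
        if get_state() == target:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)

# Stand-in state source; in the notebook, get_state might read the state column
# from "SHOW WAREHOUSES LIKE 'INTERACTIVE_DEMO_B'" (column position is an assumption to verify)
fake_states = iter(["SUSPENDED", "SUSPENDED", "STARTED"])
print(wait_for_state(lambda: next(fake_states), poll_interval=0))  # True
```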

## The Data

Run `setup.sql` to set up the database that we'll need for the rest of this tutorial. After running the setup script, you'll have created the `MY_DEMO_DB` database and the `BENCHMARK_FDN` schema that houses the `HITS2_CSV` table.

In [None]:
USE WAREHOUSE WH;
SELECT * FROM MY_DEMO_DB.BENCHMARK_FDN.HITS2_CSV

## Create an interactive table
![alt text](https://github.com/sfc-gh-cnantasenamat/sfquickstarts/blob/patch-4/site/sfguides/src/getting_started_with_interactive_tables/assets/create-interactive-table.png?raw=true)

Now, we'll use the standard `WH` warehouse to create our new interactive `CUSTOMERS` table, clustered by `ClientIP`, by copying all the data from the original standard table:

In [None]:
print("Switch to demo database")
print(cursor.execute("USE DATABASE MY_DEMO_DB").fetchall())

print("Use a standard warehouse for creating the interactive table's data")
print(cursor.execute("USE WAREHOUSE WH").fetchall())

query = """
CREATE OR REPLACE INTERACTIVE TABLE
MY_DEMO_DB.BENCHMARK_INTERACTIVE.CUSTOMERS CLUSTER BY (ClientIP)
AS
 SELECT * FROM MY_DEMO_DB.BENCHMARK_FDN.HITS2_CSV;
"""
execute_and_print(query)
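
After a `CREATE TABLE ... AS SELECT`, a quick sanity check is to confirm that the source and the copy hold the same number of rows. The sketch below shows the idea with hypothetical helpers (`row_count`, `tables_match`) that work with any DB-API cursor. It uses an in-memory SQLite database purely as a stand-in so the example runs anywhere; the table names in the demo are placeholders:

```python
import sqlite3

def row_count(cur, table):
    """Return COUNT(*) for `table`. Pass only trusted identifiers, since the name is interpolated."""
    cur.execute(f"SELECT COUNT(*) FROM {table}")
    return cur.fetchone()[0]

def tables_match(cur, source, target):
    """Return True when both tables contain the same number of rows."""
    return row_count(cur, source) == row_count(cur, target)

# Stand-in demo: SQLite replaces Snowflake here, and the tables are toy placeholders
demo_conn = sqlite3.connect(":memory:")
demo_cur = demo_conn.cursor()
demo_cur.execute("CREATE TABLE hits (ClientIP INTEGER)")
demo_cur.executemany("INSERT INTO hits VALUES (?)", [(1,), (2,), (3,)])
demo_cur.execute("CREATE TABLE customers AS SELECT * FROM hits")
print(tables_match(demo_cur, "hits", "customers"))  # True
```

In the notebook, the same call could be made with the Snowflake `cursor` and the fully qualified names `MY_DEMO_DB.BENCHMARK_FDN.HITS2_CSV` and `MY_DEMO_DB.BENCHMARK_INTERACTIVE.CUSTOMERS`.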

## Attach interactive table to a warehouse

![alt text](https://github.com/sfc-gh-cnantasenamat/sfquickstarts/blob/patch-4/site/sfguides/src/getting_started_with_interactive_tables/assets/attach-interactive-table-to-warehouse.png?raw=true)

Next, we'll attach our interactive table to the warehouse, which pre-warms the data cache for optimal query performance:

In [None]:
query = """
USE DATABASE MY_DEMO_DB;
"""
execute_and_print(query)

query = """
ALTER WAREHOUSE interactive_demo_b ADD TABLES(BENCHMARK_INTERACTIVE.CUSTOMERS);
"""
execute_and_print(query)

## Run queries with interactive warehouse

![alt text](https://github.com/sfc-gh-cnantasenamat/sfquickstarts/blob/patch-4/site/sfguides/src/getting_started_with_interactive_tables/assets/run-queries-with-interactive-warehouse.png?raw=true)

Now, we'll run our first performance test on the interactive setup by executing a page-view query, timing its execution, and then plotting the results.

We'll start by activating the interactive warehouse and disabling the result cache:

In [None]:
print("Use the interactive warehouse for querying the interactive table")
cursor.execute("USE WAREHOUSE interactive_demo_b")
cursor.execute('USE DATABASE MY_DEMO_DB;')
cursor.execute('ALTER SESSION SET USE_CACHED_RESULT = FALSE;')

Next, we'll run a query to find the top 10 most viewed pages for July 2013, measure how long it takes, and then plot the results and execution time:

In [None]:
query = """
SELECT Title, COUNT(*) AS PageViews
FROM BENCHMARK_INTERACTIVE.CUSTOMERS
WHERE CounterID = 62
  AND EventDate >= '2013-07-01'
  AND EventDate <= '2013-07-31'
  AND DontCountHits = 0
  AND IsRefresh = 0
  AND Title <> ''
  AND REGEXP_LIKE(Title, '^[\\x00-\\x7F]+$')
  AND LENGTH(Title) < 20
GROUP BY Title
ORDER BY PageViews DESC
LIMIT 10;
"""

start_time = time.time()
result = cursor.execute(query).fetchall()
end_time = time.time()
time_taken = end_time - start_time

plot_data(result, "Page visit analysis (Interactive)", time_taken)
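
The `Title` filters in this query are easy to misread because of the escaping. Inside a regular (non-raw) Python string, each `\\` collapses to a single backslash, so the SQL text contains `\x00-\x7F`, a character class covering the ASCII range. The sketch below prints the fragment as it is sent and mirrors the three `Title` conditions in plain Python; `keeps_title` is a hypothetical helper for illustration only and is not used by the notebook:

```python
import re

# Same escaping as in the query string above
sql_fragment = """REGEXP_LIKE(Title, '^[\\x00-\\x7F]+$')"""
print(sql_fragment)  # REGEXP_LIKE(Title, '^[\x00-\x7F]+$')

ASCII_ONLY = re.compile(r"[\x00-\x7F]+")

def keeps_title(title):
    """Mirror the Title filters: non-empty, ASCII-only, and shorter than 20 characters."""
    return (
        title != ""
        and len(title) < 20
        and ASCII_ONLY.fullmatch(title) is not None
    )

print(keeps_title("Home page"))  # True
print(keeps_title("Главная"))    # False: contains non-ASCII characters
print(keeps_title("x" * 20))     # False: 20 characters is not < 20
```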

## Compare to a standard warehouse

![alt text](https://github.com/sfc-gh-cnantasenamat/sfquickstarts/blob/patch-4/site/sfguides/src/getting_started_with_interactive_tables/assets/compare-to-standard-warehouse.png?raw=true)

To establish a performance baseline, we'll run an identical page-view query on a standard warehouse to measure and plot its results for comparison.

We'll start by preparing the session for a performance benchmark by selecting a standard `WH` warehouse, disabling the result cache, and setting the active database:

In [None]:
print("Use a standard warehouse for the baseline query")
cursor.execute("USE WAREHOUSE WH")
cursor.execute('ALTER SESSION SET USE_CACHED_RESULT = FALSE;')
cursor.execute('USE DATABASE MY_DEMO_DB;')

Here, we'll run the same top-10 page-view query against the standard table, measure how long it takes, and plot the results and execution time:

In [None]:
query = """
SELECT Title, COUNT(*) AS PageViews
FROM BENCHMARK_FDN.HITS2_CSV
WHERE CounterID = 62
  AND EventDate >= '2013-07-01'
  AND EventDate <= '2013-07-31'
  AND DontCountHits = 0
  AND IsRefresh = 0
  AND Title <> ''
  AND REGEXP_LIKE(Title, '^[\\x00-\\x7F]+$')
  AND LENGTH(Title) < 20
GROUP BY Title
ORDER BY PageViews DESC
LIMIT 10;
"""

start_time = time.time()
result = cursor.execute(query).fetchall()
end_time = time.time()
time_taken = end_time - start_time

plot_data(result, "Page visit analysis (Standard)", time_taken, '#5B5B5B')


## Run some queries concurrently

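To compare performance directly, we'll benchmark both the interactive and standard warehouses over several runs, then plot their latencies side by side in a grouped bar chart. Note that `run_and_measure` executes its runs one after another on a dedicated connection and discards the first run as a warm-up: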

In [None]:
runs = 5

counts_iw = run_and_measure(runs,"iw")
print(counts_iw)

counts_std = run_and_measure(runs,"std")
print(counts_std)

titles = [f"R{i}" for i in range(1, len(counts_iw)+1)]

x = np.arange(len(titles))  # the label locations
width = 0.35  # bar width

fig, ax = plt.subplots(figsize=(8, 5))
ax.bar(x - width/2, counts_std, width, label="Standard", color="#5B5B5B")
ax.bar(x + width/2, counts_iw, width, label="Interactive", color="#29B5E8")

ax.set_ylabel("Latency (seconds)")
ax.set_xlabel("Query run")
ax.set_title("Standard vs Interactive warehouse")
ax.set_xticks(x)
ax.set_xticklabels(titles)
ax.legend(
    loc='upper center',
    bbox_to_anchor=(0.5, -0.15),
    ncol=2
)
plt.show()
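
The benchmark above issues its queries sequentially, and the `ThreadPoolExecutor` and `statistics` imports from the first cell go unused. A concurrent variant might look like the sketch below. It is an assumption about how such a test could be structured, not the notebook's method: `timed_call`, `run_concurrently`, and `summarize` are hypothetical helpers, and the workload is a `time.sleep` stand-in. With Snowflake, each call would likely need to open its own connection or cursor, as `run_and_measure` does, rather than share the notebook's global `cursor` across threads:

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

def timed_call(run_query):
    """Run one query callable and return its latency in seconds."""
    t0 = time.perf_counter()
    run_query()
    return time.perf_counter() - t0

def run_concurrently(run_query, total_queries, max_workers=4):
    """Submit `total_queries` calls to a thread pool and collect their latencies."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(timed_call, run_query) for _ in range(total_queries)]
        # Results arrive in completion order, not submission order
        return [future.result() for future in as_completed(futures)]

def summarize(latencies):
    """Reduce a list of latencies to a few summary statistics."""
    return {
        "mean": statistics.mean(latencies),
        "median": statistics.median(latencies),
        "max": max(latencies),
    }

# Stand-in workload: each "query" just sleeps for 10 ms
demo_latencies = run_concurrently(lambda: time.sleep(0.01), total_queries=8, max_workers=4)
print(summarize(demo_latencies))
```

Comparing the summaries for the two warehouses at increasing `max_workers` values would show how each one behaves under concurrent load.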

