### 3.5 Comparing performance 

We have now run two workloads: one against the `TRAFFIC` table before enabling Auto Clustering (AC), and one against the `TRAFFIC_CLUSTERED` table after enabling it. Let's compare the results and see whether performance improved.

Run the following query. It scans the query history of each warehouse, extracts the latest query ID for each workload query tag, and compares the execution times before and after the AC change.

In [None]:
-- for operations & analysis
use warehouse WH_SUMMIT25_PERF_OPS; 

with base_queries as (
    select 
        query_id,
        warehouse_name,
        REGEXP_SUBSTR(REGEXP_SUBSTR(query_text, 'BASE WORKLOAD QUERY - [0-9]{2}'), '[0-9]{2}') as my_tag,
        query_tag, 
        start_time,
        round(execution_time/1000, 2) as execution_time_sec,
        -- estimated credits, assuming a rate of 8 credits per hour
        round(execution_time/1000/3600 * 8, 2) as wh_credits
    FROM TABLE(
        INFORMATION_SCHEMA.QUERY_HISTORY_BY_WAREHOUSE(
            WAREHOUSE_NAME =>'WH_SUMMIT25_PERF_BASE'
        )
    )  
    where 
        execution_time > 0
        and query_text like '%BASE WORKLOAD QUERY%'
        and error_code is null 
        and query_type = 'SELECT'
    qualify row_number() over(partition by my_tag order by start_time desc) = 1
),
ac_queries as (
    select 
        query_id,
        warehouse_name,
        REGEXP_SUBSTR(REGEXP_SUBSTR(query_text, 'AC WORKLOAD QUERY - [0-9]{2}'), '[0-9]{2}') as my_tag,
        query_tag, 
        start_time,
        round(execution_time/1000, 2) as execution_time_sec,
        -- estimated credits, assuming a rate of 8 credits per hour
        round(execution_time/1000/3600 * 8, 2) as wh_credits
    FROM TABLE(
        INFORMATION_SCHEMA.QUERY_HISTORY_BY_WAREHOUSE(
            WAREHOUSE_NAME =>'WH_SUMMIT25_PERF_AC'
        )
    )
    where 
        execution_time > 0
        and query_text like '%AC WORKLOAD QUERY%'
        and error_code is null 
        and query_type = 'SELECT'
    qualify row_number() over (partition by my_tag order by start_time desc) = 1
)
select
    b.query_id as base_query_id,
    ac.query_id as ac_query_id,
    b.my_tag as query_tag,
    b.execution_time_sec as base_execution_time_sec,
    ac.execution_time_sec as ac_execution_time_sec,
    round((base_execution_time_sec-ac_execution_time_sec)/base_execution_time_sec  * 100, 2) || '%' as improvement_rate
from base_queries b
left join ac_queries ac on (
    b.my_tag = ac.my_tag
)
order by query_tag
;

The results show that enabling Auto Clustering for the `TRAFFIC` data alone (we have not touched `USER_PROFILE` yet) already yields substantial improvements for every query in our workload test.
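
The `improvement_rate` column is the relative reduction in execution time, `(base - ac) / base * 100`. As a minimal sketch with made-up timings (the values below are hypothetical and not taken from this lab), a positive rate means the AC run was faster and a negative rate means it was slower:

```sql
-- Hypothetical timings in seconds, for illustration only
select
    column1 as base_execution_time_sec,
    column2 as ac_execution_time_sec,
    round((column1 - column2) / column1 * 100, 2) || '%' as improvement_rate
from values
    (12.50, 3.10),  -- AC run faster: positive rate
    (8.00, 8.40)    -- AC run slower: negative rate
;
```

Because the comparison query uses a left join, a base query with no matching AC run shows NULL in the AC columns and in `improvement_rate`.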

Let's collect the query stats again using the stored procedure (SP) that we created earlier:

In [None]:
CALL insert_multiple_query_stats(
    'WH_SUMMIT25_PERF_AC', 
    'QUERY_STATS_AC', 
    'MODULE3_PART2_AC_WORKLOAD'
);
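
Before analyzing the stats, it may be worth confirming that the procedure captured rows for every workload query. The following is only a quick sanity check, and it assumes that `QUERY_STATS_AC` exposes the `query_id`, `query_tag` and `start_time` columns that the next cell relies on:

```sql
-- One row per query tag: how many distinct queries were captured,
-- and when the most recent one started
select
    query_tag,
    count(distinct query_id) as captured_queries,
    max(start_time) as latest_start_time
from query_stats_ac
group by query_tag
order by query_tag
;
```

If a tag you expect is missing, re-run the workload or the procedure call before continuing.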

Now, let's check the table scan coverage for the `TRAFFIC_CLUSTERED` table and compare it with the earlier result:

In [None]:
with latest_query_each_tag as (
    select query_id
    from query_stats_ac
    qualify row_number() over (partition by query_tag order by start_time desc) = 1
)
select 
    distinct
    s.query_id,
    query_tag,
    operator_attributes:table_name::string as table_name,
    operator_statistics:pruning:partitions_scanned as mp_scanned,
    operator_statistics:pruning:partitions_total as mp_total,
    round(mp_scanned/mp_total, 4) * 100 as scan_rate
from query_stats_ac s
join latest_query_each_tag q on (
    s.query_id = q.query_id
)
where 
    mp_total is not null
    and table_name = 'SQL_PERF_OPTIMIZATION.PUBLIC.TRAFFIC_CLUSTERED'
order by query_tag
;

As you can see, the queries went from a full table scan to scanning only about 2% of all partitions. Queries 04, 05 and 09 are the exceptions, because their filter conditions cover more than 12 months. Scanning fewer partitions reduces query execution time and therefore improves overall query performance.
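
To see the same pruning figures at table level rather than per query, the rows can be aggregated by table. This is a sketch under the same assumptions as the query above (the `QUERY_STATS_AC` columns and the JSON paths are reused as-is; the output column names are chosen here for illustration). It also lists every other table the workload scanned, so tables that have not been tuned yet are easy to spot:

```sql
with latest_query_each_tag as (
    -- latest captured query per tag, as in the previous cell
    select query_id
    from query_stats_ac
    qualify row_number() over (partition by query_tag order by start_time desc) = 1
),
scans as (
    -- one row per distinct query/table/pruning combination
    select distinct
        s.query_id,
        s.operator_attributes:table_name::string as table_name,
        s.operator_statistics:pruning:partitions_scanned::number as mp_scanned,
        s.operator_statistics:pruning:partitions_total::number as mp_total
    from query_stats_ac s
    join latest_query_each_tag q on (
        s.query_id = q.query_id
    )
    where
        s.operator_statistics:pruning:partitions_total is not null
)
select
    table_name,
    count(distinct query_id) as queries,
    sum(mp_scanned) as total_mp_scanned,
    sum(mp_total) as total_mp_total,
    -- nullif guards against a division by zero on empty tables
    round(sum(mp_scanned) / nullif(sum(mp_total), 0) * 100, 2) as overall_scan_rate
from scans
group by table_name
order by table_name
;
```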

### 3.6 Cost Analysis

You can now compare the cost of the queries without Auto Clustering against the cost of the queries with Auto Clustering. Because you have not clustered the table `TRAFFIC_CLUSTERED` here, the third branch of the union below is not relevant, but you could use it in your own workload performance tuning exercises.

In [None]:
-- comparing costs
select 
    'Baseline' as workload, 
    sum(credits_used) as credits_used
from table(information_schema.warehouse_metering_history(dateadd('days',-10,current_date())))
where 
    WAREHOUSE_NAME = 'WH_SUMMIT25_PERF_BASE'
group by 1

union all

select 
    'AC_workload', 
    sum(credits_used)
from table(information_schema.warehouse_metering_history(dateadd('days',-10,current_date())))
where 
    WAREHOUSE_NAME = 'WH_SUMMIT25_PERF_AC'
group by 1

union all

select 
    'AC_Cost', 
    sum(credits_used)
from table(information_schema.automatic_clustering_history (
    date_range_start => dateadd(D, -10, current_date),
    date_range_end => current_date,
    table_name => 'SQL_PERF_OPTIMIZATION.PUBLIC.TRAFFIC_CLUSTERED')
)
group by 1
;
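
One way to read the three figures together is to net the clustering cost against the workload savings. The sketch below uses placeholder credit values (hypothetical, not results from this lab) in place of the rows returned by the query above. A branch that returns no rows simply contributes zero.

```sql
-- Placeholder credit values, for illustration only; replace them with
-- the figures returned by the cost comparison query.
with costs as (
    select 'Baseline' as workload, 1.20 as credits
    union all
    select 'AC_workload', 0.35
    union all
    select 'AC_Cost', 0.10
),
totals as (
    select
        sum(iff(workload = 'Baseline', credits, 0)) as baseline_credits,
        sum(iff(workload in ('AC_workload', 'AC_Cost'), credits, 0)) as ac_total_credits
    from costs
)
select
    baseline_credits,
    ac_total_credits,
    baseline_credits - ac_total_credits as net_credits_saved
from totals
;
```

Keep in mind that warehouse metering reports all activity on a warehouse within the time window, not only the workload queries, so the comparison is most meaningful when each warehouse was used for a single workload.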