# 04: Run Benchmark Queries (Ground Truth)

This notebook runs all 24 benchmark SQL queries against the reporting tables built in notebook 03.

For each query, we:
1. Load the original SQL from `docs/queries/*.json`
2. Adapt it to run against our local parquet files (replace `reporting.table` → `read_parquet('...')`)
3. Execute and display the results
4. Save the output as a CSV in `data/results/real/` (ground truth for DP comparison)
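
To make step 2 concrete, here is a minimal sketch of the rewrite applied to a single string. The query text and the table name `example_table` are made up for illustration, and the `\b` word boundary and `as_posix()` call are small hardening choices assumed here, not requirements of the benchmark.

```python
import re
from pathlib import Path

# Assumed layout, mirroring the paths this notebook uses.
REPORTING = Path("../data/reporting")

# Made-up query for illustration; `example_table` is not a real benchmark table.
original = "SELECT COUNT(*) FROM reporting.example_table AS t"


def to_parquet_call(match: re.Match) -> str:
    """Turn one `reporting.<table>` match into a read_parquet(...) call."""
    path = REPORTING / f"{match.group(1)}.parquet"
    return f"read_parquet('{path.as_posix()}')"


# Replace every `reporting.<table>` reference in the SQL text.
adapted = re.sub(r"\breporting\.(\w+)", to_parquet_call, original)

print(original)
print(adapted)
```

The rest of the SQL, including the alias after the table reference, passes through unchanged.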

If a query fails, the most likely cause is a schema mismatch in the reporting tables built in notebook 03 (a missing table or column), which we need to fix there.
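
One cheap way to narrow down a failure before executing anything is to check that every table a query references has a parquet file on disk. The sketch below is an assumption-laden helper, not part of the notebook: `missing_tables` is a hypothetical name, and it only catches the coarsest case of a missing file. Column-level mismatches would still surface only at execution time.

```python
import re
import tempfile
from pathlib import Path


def missing_tables(sql: str, reporting_dir: Path) -> list[str]:
    """Return reporting tables named in `sql` that have no parquet file on disk."""
    tables = sorted(set(re.findall(r"\breporting\.(\w+)", sql)))
    return [t for t in tables if not (reporting_dir / f"{t}.parquet").exists()]


# Demo with a throwaway directory and made-up table names.
with tempfile.TemporaryDirectory() as tmp:
    reporting_dir = Path(tmp)
    (reporting_dir / "table_a.parquet").touch()  # pretend notebook 03 built this one
    sql = "SELECT * FROM reporting.table_a a JOIN reporting.table_b b ON a.guid = b.guid"
    missing = missing_tables(sql, reporting_dir)

print(missing)  # ['table_b']
```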

In [1]:
import json
import re
from pathlib import Path

import duckdb
import pandas as pd
from IPython.display import display, Markdown

QUERIES = Path("../docs/queries")
REPORTING = Path("../data/reporting")
RESULTS = Path("../data/results/real")
RESULTS.mkdir(parents=True, exist_ok=True)

con = duckdb.connect()


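def adapt_sql(sql: str) -> str:
    """Rewrite `reporting.<table>` references to read_parquet() calls on local files."""
    def replacer(match: re.Match) -> str:
        table = match.group(1)
        path = REPORTING / f"{table}.parquet"
        # as_posix() keeps forward slashes in the SQL literal on every platform
        return f"read_parquet('{path.as_posix()}')"
    # \b stops the pattern from matching inside longer identifiers (e.g. `my_reporting.x`)
    return re.sub(r'\breporting\.(\w+)', replacer, sql)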


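def run_query(name: str) -> pd.DataFrame | None:
    """Run one benchmark query, show a preview, and save the full result as CSV.

    Returns the result DataFrame, or None if the query fails.
    """
    qfile = QUERIES / f"{name}.json"
    with open(qfile) as f:
        data = json.load(f)
    # A query file may hold a list of records; use the first one
    if isinstance(data, list):
        data = data[0]

    original_sql = data["sql"]
    question = data.get("question", "")[:120]  # truncated for display
    adapted_sql = adapt_sql(original_sql)

    display(Markdown(f"## `{name}`\n\n{question}..."))

    try:
        df = con.execute(adapted_sql).df()
        display(Markdown(f"✓ {len(df)} rows, {len(df.columns)} columns: `{', '.join(df.columns.tolist())}`"))
        display(df.head(10))

        # Save the complete result (not just the preview) as ground truth
        out = RESULTS / f"{name}.csv"
        df.to_csv(out, index=False)
        display(Markdown(f"Saved to `{out}`"))
        return df

    except Exception as e:
        # Report the failure inline and return None so later cells still run
        display(Markdown(f"✗ FAILED: {e}\n\nAdapted SQL: `{adapted_sql[:300]}...`"))
        return None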
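
The CSVs written by `run_query` are meant as ground truth for a later DP comparison. This notebook does not define how that comparison is scored, so the following is only a sketch of one possible metric, per-group relative error on a single numeric column. The function name `relative_errors`, the column names, and the numbers are all made up for illustration.

```python
import csv
import io


def relative_errors(truth_csv: str, noisy_csv: str, key: str, value: str) -> dict[str, float]:
    """Per-group relative error |noisy - truth| / |truth| for one numeric column."""
    truth = {row[key]: float(row[value]) for row in csv.DictReader(io.StringIO(truth_csv))}
    noisy = {row[key]: float(row[value]) for row in csv.DictReader(io.StringIO(noisy_csv))}
    # Skip groups missing from the noisy output and groups whose true value is zero.
    return {
        k: abs(noisy[k] - t) / abs(t)
        for k, t in truth.items()
        if k in noisy and t != 0
    }


# Made-up numbers, not taken from any benchmark output.
truth_csv = "group,avg_value\nA,10.0\nB,20.0\n"
noisy_csv = "group,avg_value\nA,11.0\nB,19.0\n"
errors = relative_errors(truth_csv, noisy_csv, key="group", value="avg_value")
print(errors)
```

Groups that a DP mechanism suppresses entirely are skipped here; a real evaluation would probably want to count them separately.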


---
## Run all 24 queries

We run each query one at a time so we can inspect results and catch any issues. Queries are grouped by type.
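
Because the cells below name each query by hand, it can help to enumerate the query directory and confirm nothing was skipped. The sketch below assumes one JSON file per query with an `sql` field, either as a single record or as a list of records (the two shapes the loader above handles). `load_queries` is a hypothetical helper and the demo files are made up.

```python
import json
import tempfile
from pathlib import Path


def load_queries(queries_dir: Path) -> dict[str, str]:
    """Map each query name (file stem) to its SQL text."""
    queries = {}
    for qfile in sorted(queries_dir.glob("*.json")):
        data = json.loads(qfile.read_text())
        if isinstance(data, list):  # a file may hold a list of records; use the first
            data = data[0]
        queries[qfile.stem] = data["sql"]
    return queries


# Demo on a throwaway directory with two made-up query files.
with tempfile.TemporaryDirectory() as tmp:
    qdir = Path(tmp)
    (qdir / "query_a.json").write_text(json.dumps({"sql": "SELECT 1"}))
    (qdir / "query_b.json").write_text(json.dumps([{"sql": "SELECT 2"}]))
    queries = load_queries(qdir)

print(len(queries), sorted(queries))
```

Pointed at `../docs/queries`, the resulting key set could be compared against the names passed to `run_query` in the cells that follow.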

### Aggregate stats + joins

In [2]:
run_query("avg_platform_power_c0_freq_temp_by_chassis")

## `avg_platform_power_c0_freq_temp_by_chassis`

Provide me with a summary of the avg power consumed, the avg package c0, the average package frequency, and the average ...

✓ 4 rows, 6 columns: `chassistype, number_of_systems, avg_psys_rap_watts, avg_pkg_c0, avg_freq_mhz, avg_temp_centigrade`

Unnamed: 0,chassistype,number_of_systems,avg_psys_rap_watts,avg_pkg_c0,avg_freq_mhz,avg_temp_centigrade
0,Notebook,59,4.423014,37.446711,1582.049938,44.529059
1,Desktop,29,6.316126,45.068928,5442.45882,41.838204
2,Intel NUC/STK,4,3.873279,42.443253,1755.288567,41.304743
3,2 in 1,12,2.553132,45.564168,1991.687716,51.176779


Saved to `../data/results/real/avg_platform_power_c0_freq_temp_by_chassis.csv`


In [3]:
run_query("server_exploration_1")

## `server_exploration_1`

I am trying to identify servers based on their network utilization. I believe that typically, servers will send more dat...

✓ 1497 rows, 10 columns: `guid, nrs, received_bytes, sent_bytes, chassistype, vendor, model, ram, os, number_of_cores`

Unnamed: 0,guid,nrs,received_bytes,sent_bytes,chassistype,vendor,model,ram,os,number_of_cores
0,fbd46745310e434cbeaddfb9f2f4d864,1333306.0,148963400000.0,198692700000.0,Notebook,HP,HP ZBook Power G7 Mobile Workstation,16.0,Win10,6.0
1,fc00420d9cf9483e8f5adc9cf84d76b5,474328.0,549954000000.0,2132567000000.0,Notebook,Acer,Aspire E5-576,8.0,Win10,4.0
2,fc2e624ac2824e5ea3cca747d6b5b9ff,9121354.0,13554040000000.0,28910470000000.0,Server/WS,Dell,Other,256.0,Win Server,
3,fc39f46cc6f74d85acb36fcaa5eea7d0,4836038.0,201487200000.0,229559700000.0,Notebook,Dell,XPS 13 9360,8.0,Win10,2.0
4,fc3ad80dc9a14c7f93ab97e1c9519d6f,8861228.0,716391400000.0,749389500000.0,Desktop,AZW,Default string,8.0,Win10,4.0
5,fca25d1efcf5410bbf0a7ae7e28106aa,5572223.0,41885350000.0,48898660000.0,2 in 1,HP,HP Spectre x360 Convertible 15-ch0xx,16.0,Win11,4.0
6,fcef8a63951f42ddbd9857338c8a149b,3042765.0,502089500000.0,980034300000.0,Notebook,Lenovo,Lenovo Legion Y7000P2020H,16.0,Win11,8.0
7,fd53969a917a49eabca1b0c5f3466835,5274229.0,13264440000.0,20102780000.0,Desktop,Gigabyte,Default string,16.0,Win10,6.0
8,fd82aaf7576e4b83836f746a8d9299ce,745994.0,368345100000.0,547788300000.0,Desktop,Unknown,Other,32.0,Win10,8.0
9,fda5c66dd1824ac8a39e05765fd06d4f,144002.0,22698820000.0,36107160000.0,Notebook,Dell,Vostro 15-3568,8.0,Win10,2.0


Saved to `../data/results/real/server_exploration_1.csv`

Unnamed: 0,guid,nrs,received_bytes,sent_bytes,chassistype,vendor,model,ram,os,number_of_cores
0,fbd46745310e434cbeaddfb9f2f4d864,1333306.0,1.489634e+11,1.986927e+11,Notebook,HP,HP ZBook Power G7 Mobile Workstation,16.0,Win10,6
1,fc00420d9cf9483e8f5adc9cf84d76b5,474328.0,5.499540e+11,2.132567e+12,Notebook,Acer,Aspire E5-576,8.0,Win10,4
2,fc2e624ac2824e5ea3cca747d6b5b9ff,9121354.0,1.355404e+13,2.891047e+13,Server/WS,Dell,Other,256.0,Win Server,
3,fc39f46cc6f74d85acb36fcaa5eea7d0,4836038.0,2.014872e+11,2.295597e+11,Notebook,Dell,XPS 13 9360,8.0,Win10,2
4,fc3ad80dc9a14c7f93ab97e1c9519d6f,8861228.0,7.163914e+11,7.493895e+11,Desktop,AZW,Default string,8.0,Win10,4
...,...,...,...,...,...,...,...,...,...,...
1492,fa8c70a5e9e64ba0a21f008bc7bed8b3,1640303.0,5.255070e+11,2.275789e+12,Desktop,Unknown,Other,8.0,Win10,4
1493,fa910200789840588b059e58d1199626,23492648.0,4.981598e+12,1.377434e+13,Desktop,Gigabyte,Z390 DESIGNARE,16.0,Win11,4
1494,fae1c5788032408d860697e840d34ed4,3171582.0,5.175706e+10,1.532461e+11,Desktop,Other,SH370,16.0,Win10,6
1495,faed35c41085420290bad4ddfb699cb2,9536926.0,1.751784e+11,1.203804e+12,Intel NUC/STK,Intel,NUC10i5FNK,16.0,Win10,4


In [4]:
run_query("mods_blockers_by_osname_and_codename")

## `mods_blockers_by_osname_and_codename`

Provide me a summary count distribution of the number of top modern sleepstudy blockers, broke out by Windows os name an...

✓ 11 rows, 5 columns: `os_name, os_codename, num_entries, number_of_systems, entries_per_system`

Unnamed: 0,os_name,os_codename,num_entries,number_of_systems,entries_per_system
0,Win10,21H1,745957,937,796.11206
1,Win11,22H2,47476969,38777,1224.359001
2,Win10,20H1,213673,141,1515.411348
3,Win10,19H2,261054,254,1027.771654
4,Win10,19H1,76060,61,1246.885246
5,Win10,20H2,1503478,1521,988.479947
6,Win10,21H2,11046208,11381,970.583253
7,Win10,22H2,19387138,12158,1594.599276
8,Win10,RS5,186074,149,1248.818792
9,Win10,RS4,7287,17,428.647059


Saved to `../data/results/real/mods_blockers_by_osname_and_codename.csv`


In [5]:
run_query("top_mods_blocker_types_durations_by_osname_and_codename")

## `top_mods_blocker_types_durations_by_osname_and_codename`

Provide me a summary count and duration distribution of the number of top modern sleepstudy blockers, broke out by Windo...

✓ 10568 rows, 8 columns: `os_name, os_codename, blocker_name, blocker_type, activity_level, number_of_clients, average_active_time_in_seconds, number_of_occurences`

Unnamed: 0,os_name,os_codename,blocker_name,blocker_type,activity_level,number_of_clients,average_active_time_in_seconds,number_of_occurences
0,Win10,21H2,No CS Phase,PDC Phase,high,9783,6388.151977,943566
1,Win11,21H2,USB xHCI Compliant Host Controller (\_SB.PC00....,Fx Device,moderate,83,936.84536,552
2,Win11,21H2,Intel(R) PCI Express Root Port #14 - 43B5 (\_S...,Fx Device,high,5,1383.384104,239
3,Win10,22H2,Universal Telemetry Client,Activator,low,2592,85.922664,25370
4,Win10,22H2,WNS,Activator,moderate,2202,269.834387,30493
5,Win11,22H2,Lock Screen,Activator,low,10546,11.776727,99272
6,Win11,22H2,Windows Push Notifications,Activator,moderate,1643,387.551297,6382
7,Win11,22H2,WU,Activator,moderate,7409,579.731766,27525
8,Win10,21H2,WU,Activator,low,3221,105.255326,19489
9,Win11,22H2,Intel(R) Wi-Fi 6 AX201 160MHz (\_SB.PC00.CNVW),Fx Device,low,4219,68.228542,717330


Saved to `../data/results/real/top_mods_blocker_types_durations_by_osname_and_codename.csv`

Unnamed: 0,os_name,os_codename,blocker_name,blocker_type,activity_level,number_of_clients,average_active_time_in_seconds,number_of_occurences
0,Win10,21H2,No CS Phase,PDC Phase,high,9783,6388.151977,943566
1,Win11,21H2,USB xHCI Compliant Host Controller (\_SB.PC00....,Fx Device,moderate,83,936.845360,552
2,Win11,21H2,Intel(R) PCI Express Root Port #14 - 43B5 (\_S...,Fx Device,high,5,1383.384104,239
3,Win10,22H2,Universal Telemetry Client,Activator,low,2592,85.922664,25370
4,Win10,22H2,WNS,Activator,moderate,2202,269.834387,30493
...,...,...,...,...,...,...,...,...
10563,Win10,20H2,Intel(R) Serial IO I2C Host Controller - 9DC5 ...,Fx Device,high,1,8186.625500,6
10564,Win10,21H2,Contrôleur audio de la technologie Intel(R) Sm...,Fx Device,low,1,17.630613,46
10565,Win10,21H2,Intel(R) PCI Express Root Port #11 - 9DB2 (\_S...,Fx Device,moderate,1,328.140200,1
10566,Win10,19H1,USB xHCI Compliant Host Controller (\_SB.PC00....,Fx Device,moderate,1,413.433600,2


### Geographic / demographic breakdowns

In [6]:
run_query("Xeon_network_consumption")

## `Xeon_network_consumption`

I want to characterize client systems with Xeon processors network consumption, vs non-Xeon systems, further bifurcated ...

✓ 8 rows, 5 columns: `processor_class, os, number_of_systems, avg_bytes_received, avg_bytes_sent`

Unnamed: 0,processor_class,os,number_of_systems,avg_bytes_received,avg_bytes_sent
0,Non-Server Class,Win Server,58,8306623000000.0,6575987000000.0
1,Server Class,Win11,27,798237000000.0,503003600000.0
2,Server Class,Win Server,52,7639491000000.0,2285146000000.0
3,Non-Server Class,,3,196592000000.0,30642120000.0
4,Server Class,Win10,297,5.703838e+17,5.703804e+17
5,Non-Server Class,Win11,8602,1.307379e+16,1.307286e+16
6,Non-Server Class,Win8.1,34,701831800000.0,103084200000.0
7,Non-Server Class,Win10,28151,5296385000000000.0,5295321000000000.0


Saved to `../data/results/real/Xeon_network_consumption.csv`


In [7]:
run_query("pkg_power_by_country")

## `pkg_power_by_country`

Provide me a enumeration of the average cpu package power in watts consumed by a client, broke out by country....

✓ 49 rows, 3 columns: `countryname_normalized, number_of_systems, avg_pkg_power_consumed`

Unnamed: 0,countryname_normalized,number_of_systems,avg_pkg_power_consumed
0,Belgium,5,672.575966
1,Russian Federation,43,83.043399
2,United States of America,128,41.189114
3,Austria,4,26.393648
4,Egypt,2,25.834264
5,South Africa,2,14.522945
6,China,27,13.791807
7,Australia,9,11.717887
8,Norway,5,11.322477
9,Sweden,13,10.769468


Saved to `../data/results/real/pkg_power_by_country.csv`


In [8]:
run_query("battery_power_on_geographic_summary")

## `battery_power_on_geographic_summary`

Provide me a summary by country of battery usage in the world. For every country with at least 100 client systems, I wou...

✓ 37 rows, 4 columns: `country, number_of_systems, avg_number_of_dc_powerons, avg_duration`

Unnamed: 0,country,number_of_systems,avg_number_of_dc_powerons,avg_duration
0,India,1009,3.069037,169.198309
1,Indonesia,371,3.004482,136.913269
2,Philippines,196,2.988304,132.36163
3,Peru,152,2.951244,177.42001
4,Other,1405,2.811502,154.68973
5,Colombia,170,2.799172,162.710808
6,Chile,164,2.789094,150.158884
7,Viet Nam,202,2.743784,137.852936
8,Argentina,158,2.71466,198.929319
9,Mexico,323,2.68322,204.005118


Saved to `../data/results/real/battery_power_on_geographic_summary.csv`


In [9]:
run_query("battery_on_duration_cpu_family_gen")

## `battery_on_duration_cpu_family_gen`

Provide me a summary of battery usage durations in minutes, broke out by the different cpu family and generations. Do NO...

✓ 14 rows, 4 columns: `marketcodename, cpugen, number_of_systems, avg_duration_mins_on_battery`

Unnamed: 0,marketcodename,cpugen,number_of_systems,avg_duration_mins_on_battery
0,Coffee Lake,9th Gen i5,638,64.091878
1,Ice Lake,10th Gen i3,805,93.966219
2,Tiger Lake,11th Gen i7,2838,186.877908
3,Whiskey Lake,Pentium/Celeron-Whiskey Lake,102,96.852338
4,Comet Lake,10th Gen i5,2291,120.214483
5,Alder Lake,12th Gen i7,103,137.328631
6,Tiger Lake,11th Gen i3,808,189.579736
7,Ice Lake,10th Gen i5,1785,129.547794
8,Tiger Lake,11th Gen i5,3685,194.69009
9,Comet Lake,10th Gen i7,2228,114.443329


Saved to `../data/results/real/battery_on_duration_cpu_family_gen.csv`


In [10]:
run_query("on_off_mods_sleep_summary_by_cpu_marketcodename_gen")

## `on_off_mods_sleep_summary_by_cpu_marketcodename_gen`

Provide me a statistical summary of on time, off time, mods time, and sleep time broke out by different cpu generations ...

✓ 27 rows, 12 columns: `marketcodename, cpugen, number_of_systems, avg_on_time, avg_off_time, avg_modern_sleep_time, avg_sleep_time, avg_total_time, avg_pcnt_on_time, avg_pcnt_off_time, avg_pcnt_mods_time, avg_pcnt_sleep_time`

Unnamed: 0,marketcodename,cpugen,number_of_systems,avg_on_time,avg_off_time,avg_modern_sleep_time,avg_sleep_time,avg_total_time,avg_pcnt_on_time,avg_pcnt_off_time,avg_pcnt_mods_time,avg_pcnt_sleep_time
0,Coffee Lake,9th Gen i5,2621,33944.22,19165.14,360.98,30203.97,83674.31,40.57,22.9,0.43,36.1
1,Coffee Lake,9th Gen i9,633,37934.87,18591.35,176.49,27122.69,83825.41,45.25,22.18,0.21,32.36
2,Comet Lake,10th Gen i5,4987,31850.55,17272.15,4185.52,30252.51,83560.73,38.12,20.67,5.01,36.2
3,Coffee Lake,9th Gen i3,417,39226.92,14917.44,0.0,29493.41,83637.77,46.9,17.84,0.0,35.26
4,Alder Lake,12th Gen i9,211,35335.63,21215.26,2748.38,25882.67,85181.94,41.48,24.91,3.23,30.39
5,Ice Lake,10th Gen i3,705,24247.18,15464.69,1443.04,41653.7,82808.62,29.28,18.68,1.74,50.3
6,Rocket Lake,11th Gen i7,575,37157.87,17215.72,43.23,30784.23,85201.05,43.61,20.21,0.05,36.13
7,Comet Lake,10th Gen i9,461,35971.25,19390.95,1403.62,27218.83,83984.65,42.83,23.09,1.67,32.41
8,Comet Lake,10th Gen i3,1402,33658.25,17590.05,3487.25,28887.29,83622.83,40.25,21.03,4.17,34.54
9,Rocket Lake,11th Gen i5,883,33246.21,18395.51,129.88,33387.02,85158.62,39.04,21.6,0.15,39.21


Saved to `../data/results/real/on_off_mods_sleep_summary_by_cpu_marketcodename_gen.csv`


### Ranked top-k

In [11]:
run_query("most_popular_browser_in_each_country_by_system_count")

## `most_popular_browser_in_each_country_by_system_count`

Utilizing the web categorization usage data in conjunction with sysinfo geographic data, produce a list of the countries...

✓ 51 rows, 2 columns: `country, browser`

Unnamed: 0,country,browser
0,Argentina,chrome
1,Australia,chrome
2,Austria,chrome
3,Bangladesh,chrome
4,Belgium,chrome
5,Brazil,chrome
6,Canada,chrome
7,Chile,chrome
8,China,edge
9,Colombia,chrome


Saved to `../data/results/real/most_popular_browser_in_each_country_by_system_count.csv`


### Histograms / distributions

In [12]:
run_query("ram_utilization_histogram")

## `ram_utilization_histogram`

Provide an ordered histogram of client memory capacity, showing the distribution of average utilized ram for each memory...

✓ 44 rows, 3 columns: `ram_gb, count(DISTINCT guid), avg_percentage_used`

Unnamed: 0,ram_gb,count(DISTINCT guid),avg_percentage_used
0,1.0,5,81.0
1,2.0,693,67.0
2,3.0,230,72.0
3,4.0,12251,71.0
4,5.0,45,66.0
5,6.0,1648,61.0
6,7.0,15,44.0
7,8.0,26089,60.0
8,9.0,6,55.0
9,10.0,153,51.0


Saved to `../data/results/real/ram_utilization_histogram.csv`


In [13]:
run_query("popular_browsers_by_count_usage_percentage")

## `popular_browsers_by_count_usage_percentage`

Utilizing the web categorization usage data, produce a statistical analysis of the popularity of the browsers in that da...

✓ 3 rows, 4 columns: `browser, percent_systems, percent_instances, percent_duration`

Unnamed: 0,browser,percent_systems,percent_instances,percent_duration
0,edge,55.16,8.91,8.88
1,chrome,82.05,80.3,82.89
2,firefox,18.36,10.79,8.23


Saved to `../data/results/real/popular_browsers_by_count_usage_percentage.csv`


### Complex multi-way pivot

In [14]:
run_query("persona_web_cat_usage_analysis")

## `persona_web_cat_usage_analysis`

Provide an analysis of web category duration usage, broke out by client persona classification. Provide the web category...

✓ 11 rows, 31 columns: `persona, number_of_systems, days, content_creation_photo_edit_creation, content_creation_video_audio_edit_creation, content_creation_web_design_development, education, entertainment_music_audio_streaming, entertainment_other, entertainment_video_streaming, finance, games_other, games_video_games, mail, news, unclassified, private, productivity_crm, productivity_other, productivity_presentations, productivity_programming, productivity_project_management, productivity_spreadsheets, productivity_word_processing, recreation_travel, reference, search, shopping, social_social_network, social_communication, social_communication_live`

Unnamed: 0,persona,number_of_systems,days,content_creation_photo_edit_creation,content_creation_video_audio_edit_creation,content_creation_web_design_development,education,entertainment_music_audio_streaming,entertainment_other,entertainment_video_streaming,...,productivity_project_management,productivity_spreadsheets,productivity_word_processing,recreation_travel,reference,search,shopping,social_social_network,social_communication,social_communication_live
0,Casual User,7859,465408.0,,,,,,,,...,,,,,,,,,,
1,Win Store App User,1448,99741.0,0.046,0.011,0.165,1.951,0.134,2.297,22.366,...,0.05,0.337,0.647,0.341,0.823,5.983,2.828,3.203,0.853,0.834
2,Communication,4611,434040.0,0.037,0.023,0.263,1.701,0.143,1.088,8.267,...,0.192,0.69,0.527,0.604,1.344,7.113,2.769,2.285,0.747,0.983
3,Casual Gamer,4774,403219.0,0.143,0.01,0.172,1.453,0.057,2.518,29.719,...,0.028,0.266,0.605,0.173,0.69,3.547,1.704,3.017,1.61,0.648
4,Gamer,6843,475161.0,,,,,,,,...,,,,,,,,,,
5,Unknown,7140,47113.0,,,,,,,,...,,,,,,,,,,
6,Entertainment,1972,119091.0,0.055,0.021,0.212,1.011,0.154,3.544,25.633,...,0.008,0.148,0.347,0.24,0.878,5.966,2.828,3.268,0.889,0.527
7,Web User,19575,1731513.0,,,,,,,,...,,,,,,,,,,
8,File & Network Sharer,1328,90803.0,0.061,0.021,0.138,1.224,0.55,2.874,12.047,...,0.045,0.805,0.258,0.366,1.131,7.009,3.348,1.848,1.005,0.512
9,Content Creator/IT,3595,274780.0,0.159,0.016,0.251,1.598,0.259,2.089,14.331,...,0.199,0.511,0.525,0.34,1.21,6.548,2.543,2.986,1.225,0.734


Saved to `../data/results/real/persona_web_cat_usage_analysis.csv`


### Display device queries

In [15]:
run_query("display_devices_connection_type_resolution_durations_ac_dc")

## `display_devices_connection_type_resolution_durations_ac_dc`

Provide a statistical analysis of the different display connection types and the resolutions ran with those connections....

✓ 431 rows, 5 columns: `connection_type, resolution, number_of_systems, average_duration_on_ac_in_seconds, average_duration_on_dc_in_seconds`

Unnamed: 0,connection_type,resolution,number_of_systems,average_duration_on_ac_in_seconds,average_duration_on_dc_in_seconds
0,COMPOSITE_VIDEO,1080x1920,53,14317.74,4444.61
1,DISPLAYPORT_EMBEDDED,1080x1920,3447,12667.27,1993.5
2,DISPLAYPORT_EMBEDDED,720x1280,639,1212.84,27.15
3,DISPLAYPORT_EMBEDDED,1024x1280,480,682.38,13.21
4,DISPLAYPORT_EMBEDDED,1440x2560,409,11046.56,965.81
5,DISPLAYPORT_EMBEDDED,768x1024,406,802.69,31.12
6,DISPLAYPORT_EMBEDDED,480x640,375,464.62,19.19
7,DISPLAYPORT_EMBEDDED,2160x3840,354,18407.39,1665.35
8,DISPLAYPORT_EMBEDDED,1050x1680,329,858.44,412.42
9,DISPLAYPORT_EMBEDDED,768x1366,321,4352.74,278.34


Saved to `../data/results/real/display_devices_connection_type_resolution_durations_ac_dc.csv`

Unnamed: 0,connection_type,resolution,number_of_systems,average_duration_on_ac_in_seconds,average_duration_on_dc_in_seconds
0,COMPOSITE_VIDEO,1080x1920,53,14317.74,4444.61
1,DISPLAYPORT_EMBEDDED,1080x1920,3447,12667.27,1993.50
2,DISPLAYPORT_EMBEDDED,720x1280,639,1212.84,27.15
3,DISPLAYPORT_EMBEDDED,1024x1280,480,682.38,13.21
4,DISPLAYPORT_EMBEDDED,1440x2560,409,11046.56,965.81
...,...,...,...,...,...
426,OTHER,1026x1552,59,44366.71,256.29
427,OTHER,1080x1728,53,32141.51,272.66
428,OTHER,960x1280,51,12765.31,115.66
429,OTHER,1400x2240,51,28779.22,1802.33


In [16]:
run_query("display_devices_vendors_percentage")

## `display_devices_vendors_percentage`

Provide a statistical summary of the various detected display devices. We want to calculate, as a percentage of the tota...

✓ 21 rows, 4 columns: `vendor_name, number_of_systems, total_number_of_systems, percentage_of_systems`

Unnamed: 0,vendor_name,number_of_systems,total_number_of_systems,percentage_of_systems
0,Samsung,20437,209239,9.77
1,Hisense,1091,209239,0.52
2,Other,200089,209239,95.63
3,HP,16576,209239,7.92
4,Dell,26463,209239,12.65
5,MI,2305,209239,1.1
6,Sansui,4,209239,0.0
7,LG,27528,209239,13.16
8,Acer,4774,209239,2.28
9,VIZIO,5,209239,0.0


Saved to `../data/results/real/display_devices_vendors_percentage.csv`


### User wait queries

In [17]:
run_query("userwait_top_10_wait_processes")

## `userwait_top_10_wait_processes`

Provide me a list of the top 10 worst applications for wait time. Ignore the following list of processes in compiling th...

✓ 10 rows, 3 columns: `proc_name, total_duration_sec_per_instance, rank`

Unnamed: 0,proc_name,total_duration_sec_per_instance,rank
0,2 Reminder(s),35171.773,1
1,Dell.SecurityManager.SystrayApp.exe,12143.812636,2
2,Miracle Thunder_Cracked _loder_2.93.exe,7437.500667,3
3,PTUpdater.exe,6161.44,4
4,LimeChat2.exe,5934.90775,5
5,setup_ruined_king_a_league_of_legends_storytm_...,5560.299,6
6,AVDm.exe,4966.293,7
7,pop5B87.tmp,3903.591,8
8,BrawlBox.exe,3798.116333,9
9,WinStart.exe,3779.731923,10


Saved to `../data/results/real/userwait_top_10_wait_processes.csv`


In [18]:
run_query("userwait_top_10_wait_processes_wait_type_ac_dc")

## `userwait_top_10_wait_processes_wait_type_ac_dc`

Provide me a list of the top 10 worst applications for wait time, segregated by type of wait (APPSTARTING) or in applica...

✓ 61 rows, 5 columns: `event_name, acdc, proc_name, total_duration_sec_per_instance, rank`

Unnamed: 0,event_name,acdc,proc_name,total_duration_sec_per_instance,rank
0,WAIT,,chrome.exe,5.81,1
1,APPSTARTING,AC,2 Reminder(s),35171.77,1
2,APPSTARTING,AC,Stark Meter Reader.exe,12277.24,2
3,APPSTARTING,AC,setup_ruined_king_a_league_of_legends_storytm_...,5560.3,3
4,APPSTARTING,AC,AVDm.exe,4966.29,4
5,APPSTARTING,AC,connex.exe,4317.6,5
6,APPSTARTING,AC,pop5B87.tmp,3903.59,6
7,APPSTARTING,AC,Roam Research.exe,3760.87,7
8,APPSTARTING,AC,OutlookConverter.exe,3380.06,8
9,APPSTARTING,AC,AnyMP4 Screen Recorder.exe,2740.13,9


Saved to `../data/results/real/userwait_top_10_wait_processes_wait_type_ac_dc.csv`

Unnamed: 0,event_name,acdc,proc_name,total_duration_sec_per_instance,rank
0,WAIT,,chrome.exe,5.81,1
1,APPSTARTING,AC,2 Reminder(s),35171.77,1
2,APPSTARTING,AC,Stark Meter Reader.exe,12277.24,2
3,APPSTARTING,AC,setup_ruined_king_a_league_of_legends_storytm_...,5560.30,3
4,APPSTARTING,AC,AVDm.exe,4966.29,4
...,...,...,...,...,...
56,WAIT,UN,SavUI.exe,8430.74,6
57,WAIT,UN,MySQLInstallerConsole.exe,8065.61,7
58,WAIT,UN,Dishonored2_x64ShippingRetail.exe,4646.70,8
59,WAIT,UN,DownKyi.exe,4615.61,9


In [19]:
run_query("userwait_top_20_wait_processes_compare_ac_dc_unknown_durations")

## `userwait_top_20_wait_processes_compare_ac_dc_unknown_durations`

Provide me a list of the top 20 worst applications for wait time, providing the average duration/instance PIVOTED by the...

✓ 20 rows, 4 columns: `proc_name, ac_duration, dc_duration, unknown_duration`

Unnamed: 0,proc_name,ac_duration,dc_duration,unknown_duration
0,DriverUpdUI.exe,7.22,7.41,124.47
1,ZapyaAdaptor.exe,1.19,1.26,743.16
2,A Dance of Fire and Ice.exe,19.78,2.2,89.15
3,MicrosoftEdge.exe,14.76,12.19,98.54
4,PfuSshMain.exe,3.54,2971.32,2.99
5,ddm.exe,1699.17,2.72,5.72
6,pixillion.exe,0.99,0.95,254.53
7,,201.76,2.51,15.39
8,farcry3.exe,34.85,3.15,109.36
9,Serato DJ Pro.exe,442.81,28.99,8.41


Saved to `../data/results/real/userwait_top_20_wait_processes_compare_ac_dc_unknown_durations.csv`


### Foreground app queries

In [20]:
run_query("top_10_applications_by_app_type_ranked_by_focal_time")

## `top_10_applications_by_app_type_ranked_by_focal_time`

Provide me ranked lists of the top 10 applications by application type. Rank the applications by average amount of focal...

✓ 150 rows, 4 columns: `app_type, exe_name, average_focal_sec_per_day, rank`

Unnamed: 0,app_type,exe_name,average_focal_sec_per_day,rank
0,Communication,zoomrooms.exe,39877.0,1
1,Communication,adiirc.exe,36982.0,2
2,Communication,flex-communicator.exe,32278.0,3
3,Communication,bittitandmaoutlookconfiguratorapplication.exe,23622.0,4
4,Communication,tcfemailminer22.exe,18892.0,5
5,Communication,controller.exe,17648.0,6
6,Communication,zello.exe,15874.0,7
7,Communication,fs.exe,15514.0,8
8,Communication,front.exe,14373.0,9
9,Communication,scriptcommunicator.exe,14000.0,10


Saved to `../data/results/real/top_10_applications_by_app_type_ranked_by_focal_time.csv`

Unnamed: 0,app_type,exe_name,average_focal_sec_per_day,rank
0,Communication,zoomrooms.exe,39877.0,1
1,Communication,adiirc.exe,36982.0,2
2,Communication,flex-communicator.exe,32278.0,3
3,Communication,bittitandmaoutlookconfiguratorapplication.exe,23622.0,4
4,Communication,tcfemailminer22.exe,18892.0,5
...,...,...,...,...
145,Web Browsing,firefoxcobb.exe,23951.0,6
146,Web Browsing,dissenter.exe,22981.0,7
147,Web Browsing,chrome (radeon).exe,20708.0,8
148,Web Browsing,chrome......exe,20084.0,9


In [21]:
run_query("top_10_applications_by_app_type_ranked_by_system_count")

## `top_10_applications_by_app_type_ranked_by_system_count`

Provide me ranked lists of the top 10 applications by application type. Rank the applications by the number of distinct ...

✓ 150 rows, 4 columns: `app_type, exe_name, number_of_systems, rank`

Unnamed: 0,app_type,exe_name,number_of_systems,rank
0,Communication,zoom.exe,13968,1
1,Communication,outlook.exe,12880,2
2,Communication,teams.exe,11311,3
3,Communication,skypebridge.exe,3511,4
4,Communication,atmgr.exe,2555,5
5,Communication,lync.exe,1749,6
6,Communication,linelauncher.exe,1414,7
7,Communication,thunderbird.exe,1399,8
8,Communication,slack.exe,1272,9
9,Communication,ptoneclk.exe,1200,10


Saved to `../data/results/real/top_10_applications_by_app_type_ranked_by_system_count.csv`

Unnamed: 0,app_type,exe_name,number_of_systems,rank
0,Communication,zoom.exe,13968,1
1,Communication,outlook.exe,12880,2
2,Communication,teams.exe,11311,3
3,Communication,skypebridge.exe,3511,4
4,Communication,atmgr.exe,2555,5
...,...,...,...,...
145,Web Browsing,microsoftedge.exe,6165,6
146,Web Browsing,browser_broker.exe,2992,7
147,Web Browsing,brave.exe,2168,8
148,Web Browsing,browser.exe,2105,9


In [22]:
run_query("top_10_applications_by_app_type_ranked_by_total_detections")

## `top_10_applications_by_app_type_ranked_by_total_detections`

Provide me ranked lists of the top 10 applications by application type. Rank the applications by the total number of det...

✓ 150 rows, 4 columns: `app_type, exe_name, total_number_of_detections, rank`

Unnamed: 0,app_type,exe_name,total_number_of_detections,rank
0,Communication,outlook.exe,84245310.0,1
1,Communication,teams.exe,17208150.0,2
2,Communication,zoom.exe,16331121.0,3
3,Communication,thunderbird.exe,3847752.0,4
4,Communication,slack.exe,3124031.0,5
5,Communication,lync.exe,2912100.0,6
6,Communication,atmgr.exe,1005669.0,7
7,Communication,dingtalk.exe,540073.0,8
8,Communication,ciscojabber.exe,445097.0,9
9,Communication,notes2.exe,354952.0,10


Saved to `../data/results/real/top_10_applications_by_app_type_ranked_by_total_detections.csv`

Unnamed: 0,app_type,exe_name,total_number_of_detections,rank
0,Communication,outlook.exe,84245310.0,1
1,Communication,teams.exe,17208150.0,2
2,Communication,zoom.exe,16331121.0,3
3,Communication,thunderbird.exe,3847752.0,4
4,Communication,slack.exe,3124031.0,5
...,...,...,...,...
145,Web Browsing,brave.exe,6999184.0,6
146,Web Browsing,browser.exe,2956688.0,7
147,Web Browsing,whale.exe,1704769.0,8
148,Web Browsing,vivaldi.exe,779026.0,9
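
The three foreground queries each return a top-10-per-`app_type` ranking, so every application type should contribute at most ten rows with ranks counting up from 1. The sketch below checks that shape. The helper name `check_top_n_per_group` and the toy values are hypothetical, and it assumes `ROW_NUMBER`-style ranks without ties, which the benchmark SQL may or may not use.

```python
import pandas as pd


def check_top_n_per_group(df, group_col, rank_col, n=10):
    """Return a list of problems found in a top-N-per-group result."""
    problems = []
    for key, grp in df.groupby(group_col, sort=False):
        ranks = sorted(grp[rank_col].tolist())
        if len(ranks) > n:
            problems.append(f"{key}: {len(ranks)} rows, expected at most {n}")
        # Assumes ROW_NUMBER-style ranks (1, 2, 3, ...); tied ranks would be flagged.
        if ranks != list(range(1, len(ranks) + 1)):
            problems.append(f"{key}: ranks are not contiguous from 1")
    return problems


# Toy frame (hypothetical values) with the same column layout as the results above.
toy = pd.DataFrame({
    "app_type": ["Communication", "Communication", "Web Browsing", "Web Browsing"],
    "exe_name": ["a.exe", "b.exe", "c.exe", "d.exe"],
    "rank": [1, 2, 1, 3],  # the gap in "Web Browsing" is deliberate
})
print(check_top_n_per_group(toy, "app_type", "rank"))
```

In the notebook, the same helper could be applied to the DataFrame returned by `run_query(...)`. A 150-row result would be consistent with 15 application types of ten rows each, though the row count alone does not confirm that.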


### Power consumption queries (single-guid stub data)

In [23]:
run_query("ranked_process_classifications")

## `ranked_process_classifications`

Provide me with a summary ranking of the different user_id process ownership classification in the system_mods_power_con...

✓ 5 rows, 3 columns: `user_id, total_power_consumption, rnk`

Unnamed: 0,user_id,total_power_consumption,rnk
0,UserIdMask,186984.0,1
1,SYSTEM,56672.0,2
2,NONE,28426.0,3
3,LOCAL SERVICE,3493.0,4
4,NETWORK SERVICE,2139.0,5


Saved to `../data/results/real/ranked_process_classifications.csv`


In [24]:
run_query("top_10_processes_per_user_id_ranked_by_total_power_consumption")

## `top_10_processes_per_user_id_ranked_by_total_power_consumption`

Provide me with a summary ranking of the top 10 processes for each user_id process classification in the system_mods_pow...

✓ 44 rows, 4 columns: `user_id, app_id, total_power_consumption, rnk`

Unnamed: 0,user_id,app_id,total_power_consumption,rnk
0,NONE,Unknown,28426.0,1
1,LOCAL SERVICE,\Device\HarddiskVolume3\Windows\System32\audio...,1905.0,1
2,LOCAL SERVICE,\Device\HarddiskVolume3\Program Files\Avast So...,572.0,2
3,LOCAL SERVICE,\Device\HarddiskVolume3\Windows\System32\svcho...,454.0,3
4,LOCAL SERVICE,\Device\HarddiskVolume3\Program Files (x86)\Co...,230.0,4
5,LOCAL SERVICE,\Device\HarddiskVolume3\Windows\System32\svcho...,86.0,5
6,LOCAL SERVICE,\Device\HarddiskVolume3\Windows\System32\svcho...,68.0,6
7,LOCAL SERVICE,\Device\HarddiskVolume3\Windows\System32\svcho...,56.0,7
8,LOCAL SERVICE,\Device\HarddiskVolume3\Windows\System32\svcho...,36.0,8
9,LOCAL SERVICE,\Device\HarddiskVolume3\Windows\System32\svcho...,21.0,9


Saved to `../data/results/real/top_10_processes_per_user_id_ranked_by_total_power_consumption.csv`


In [25]:
run_query("top_20_most_power_consuming_processes_by_avg_power_consumed")

## `top_20_most_power_consuming_processes_by_avg_power_consumed`

Provide a ranked list of the top 20 most power consuming processes, regardless of user id classification, as ranked by a...

✓ 20 rows, 3 columns: `app_id, total_power_consumption, rnk`

Unnamed: 0,app_id,total_power_consumption,rnk
0,\Device\HarddiskVolume3\Program Files\Google\C...,2526.740741,1
1,\Device\HarddiskVolume3\Program Files (x86)\Mi...,2287.928571,2
2,Unknown,1496.105263,3
3,System,366.452381,4
4,4DF9E0F8.Netflix_6.99.5.0_x64__mcm4njqhnhss8,1343.9,5
5,\Device\HarddiskVolume3\Windows\explorer.exe,290.138889,6
6,System Interrupts,210.904762,7
7,\Device\HarddiskVolume3\Program Files\HP\HP On...,220.666667,8
8,\Device\HarddiskVolume3\Program Files\Microsof...,299.307692,9
9,\Device\HarddiskVolume3\Program Files\Avast So...,89.547619,10


Saved to `../data/results/real/top_20_most_power_consuming_processes_by_avg_power_consumed.csv`
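
In this result, `rnk` does not follow descending order of the displayed `total_power_consumption` column: rank 4 is 366.452381 while rank 5 is 1343.9. One possible explanation is that the query ranks by a different quantity than the one it displays; the SQL in `docs/queries` would need to be checked to confirm. The sketch below flags this kind of mismatch. The helper name `rank_follows_metric` is hypothetical, and the values are the first six rows shown above.

```python
import pandas as pd


def rank_follows_metric(df, metric_col, rank_col):
    """True if the metric never increases as the rank number goes up."""
    ordered = df.sort_values(rank_col)
    return bool(ordered[metric_col].is_monotonic_decreasing)


# First six rows of the output above (app_id omitted for brevity).
top = pd.DataFrame({
    "total_power_consumption": [
        2526.740741, 2287.928571, 1496.105263, 366.452381, 1343.9, 290.138889,
    ],
    "rnk": [1, 2, 3, 4, 5, 6],
})
print(rank_follows_metric(top, "total_power_consumption", "rnk"))  # False
```

A `False` here is not necessarily a build error, since the ground truth is whatever the original SQL produces, but it is worth noting before comparing against DP outputs that re-rank by the displayed column.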


---
## Summary

In [26]:
# A query counts as executed if its CSV exists in RESULTS. CSVs left over from
# an earlier run also count, so clear RESULTS first for a strict check.
results = list(RESULTS.glob("*.csv"))
query_files = list(QUERIES.glob("*.json"))

executed = {r.stem for r in results}
all_queries = {q.stem for q in query_files}
missing = sorted(all_queries - executed)

lines = [
    f"Benchmark queries: {len(query_files)} total",
    # Count only CSVs that match a query file, so stray CSVs cannot inflate the total.
    f"Successfully executed: {len(all_queries & executed)}",
    f"Failed: {len(missing)}",
    "",
]

if not missing:
    lines.append("✓ All queries passed. Ground truth results saved to `data/results/real/`.")
else:
    lines.append(f"✗ Missing results for: {missing}")

lines.append("")
lines.append("| File | Rows |")
lines.append("|---|---|")
for f in sorted(results):
    # Line count minus the header; a field containing a newline would inflate this.
    with open(f) as fh:
        rows = sum(1 for _ in fh) - 1
    lines.append(f"| `{f.name}` | {rows} |")

display(Markdown("\n".join(lines)))


Benchmark queries: 24 total
Successfully executed: 24
Failed: 0

✓ All queries passed. Ground truth results saved to `data/results/real/`.

| File | Rows |
|---|---|
| `Xeon_network_consumption.csv` | 8 |
| `avg_platform_power_c0_freq_temp_by_chassis.csv` | 4 |
| `battery_on_duration_cpu_family_gen.csv` | 14 |
| `battery_power_on_geographic_summary.csv` | 37 |
| `display_devices_connection_type_resolution_durations_ac_dc.csv` | 431 |
| `display_devices_vendors_percentage.csv` | 21 |
| `mods_blockers_by_osname_and_codename.csv` | 11 |
| `most_popular_browser_in_each_country_by_system_count.csv` | 51 |
| `on_off_mods_sleep_summary_by_cpu_marketcodename_gen.csv` | 27 |
| `persona_web_cat_usage_analysis.csv` | 11 |
| `pkg_power_by_country.csv` | 49 |
| `popular_browsers_by_count_usage_percentage.csv` | 3 |
| `ram_utilization_histogram.csv` | 44 |
| `ranked_process_classifications.csv` | 5 |
| `server_exploration_1.csv` | 1497 |
| `top_10_applications_by_app_type_ranked_by_focal_time.csv` | 150 |
| `top_10_applications_by_app_type_ranked_by_system_count.csv` | 150 |
| `top_10_applications_by_app_type_ranked_by_total_detections.csv` | 150 |
| `top_10_processes_per_user_id_ranked_by_total_power_consumption.csv` | 44 |
| `top_20_most_power_consuming_processes_by_avg_power_consumed.csv` | 20 |
| `top_mods_blocker_types_durations_by_osname_and_codename.csv` | 10568 |
| `userwait_top_10_wait_processes.csv` | 10 |
| `userwait_top_10_wait_processes_wait_type_ac_dc.csv` | 61 |
| `userwait_top_20_wait_processes_compare_ac_dc_unknown_durations.csv` | 20 |
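
The summary above infers success from the CSV files on disk. Because `run_query` returns `None` on failure and a DataFrame otherwise, its return values can also be tallied directly, which avoids counting files left over from an earlier run. The sketch below shows that idea. The helper name `tally` and the query names are hypothetical.

```python
import pandas as pd


def tally(outcomes):
    """Split {query_name: DataFrame or None} into row counts and failures."""
    passed = {name: len(df) for name, df in outcomes.items() if df is not None}
    failed = sorted(name for name, df in outcomes.items() if df is None)
    return passed, failed


# Hypothetical outcomes; in the notebook each value would be run_query(name).
outcomes = {
    "query_a": pd.DataFrame({"x": [1, 2, 3]}),
    "query_b": None,
}
passed, failed = tally(outcomes)
print(passed, failed)
```

In the notebook this would mean collecting results as `outcomes[name] = run_query(name)` instead of calling each query in its own cell, which trades per-query inspection for a tally tied to the current run.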