# CPUs plot with Altair/Vega

Plots
- by name
- by launch date

Work in progress:
- show the CPU performance by name (sorted by performance) side by side with the plot by launch date
  - with a highlight selection shared between the charts
- change the sorting order interactively (doesn't work yet). See https://stackoverflow.com/questions/67379937/change-mark-order-via-parameter and https://altair-viz.github.io/user_guide/transform/window.html

PH, Feb-Mar 2025

In [34]:
import numpy as np
import pandas as pd
import altair as alt

Load the CSV export of the Baserow table (exported in [Retrieve_baserow.ipynb](Retrieve_baserow.ipynb)).

In [35]:
cpus = pd.read_csv('CPUs.csv', parse_dates=['Launch date'])
# add log2 transforms (0 corresponds to a score of 1000, +1 to a doubling)
cpus['GB6 Single log2'] = np.log2(cpus['GB6 Single']/1000)
cpus['GB6 Multi log2'] = np.log2(cpus['GB6 Multi']/1000)

cpus

Unnamed: 0,Name,Designer,Launch date,Cores,Age,Architecture,GB6 Single,GB6 Multi,Win11,Product URL,GB6 Single log2,GB6 Multi log2
0,Intel Core i5-5200U,Intel,2015-01-01,2,10.2,Broadwell,854.0,1611.0,No,https://www.intel.com/content/www/us/en/produc...,-0.227692,0.687956
1,Intel Core i5-5300U,Intel,2015-01-01,2,10.2,Broadwell,908.0,1692.0,No,https://www.intel.com/content/www/us/en/produc...,-0.139236,0.75873
2,Intel Core i5-6200U,Intel,2015-07-01,2,9.7,Skylake,916.0,1832.0,No,https://www.intel.com/content/www/us/en/produc...,-0.12658,0.87342
3,Intel Core i5-6300U,Intel,2015-07-01,2,9.7,Skylake,965.0,1910.0,No,https://www.intel.com/content/www/us/en/produc...,-0.051399,0.933573
4,Intel Core i5-7200U,Intel,2016-10-01,2,8.5,Kaby Lake,1009.0,1987.0,No,https://www.intel.com/content/www/us/en/produc...,0.012926,0.990592
5,Intel Core i5-7300U,Intel,2017-01-01,2,8.2,Kaby Lake,1088.0,2030.0,No,https://www.intel.com/content/www/us/en/produc...,0.121679,1.02148
6,Intel Core i5-8250U,Intel,2017-07-01,4,7.7,Kaby Lake R,1145.0,3126.0,Yes,https://www.intel.com/content/www/us/en/produc...,0.195348,1.644318
7,Intel Core i5-8350U,Intel,2017-07-01,4,7.7,Kaby Lake R,1191.0,3237.0,Yes,https://www.intel.com/content/www/us/en/produc...,0.252173,1.694657
8,Intel Core i5-8365U,Intel,2019-04-01,4,6.0,Whiskey Lake,1270.0,3167.0,Yes,https://www.intel.com/content/www/us/en/produc...,0.344828,1.663117
9,Intel Core i5-9400H,Intel,2019-04-01,4,6.0,Coffee Lake,1429.0,4347.0,Yes,https://www.intel.com/content/www/us/en/produc...,0.515006,2.12002
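
The two log2 columns express each score as a number of doublings relative to a Geekbench 6 score of 1000. A minimal sketch of that reading, using only the standard library (the helper name `doublings_vs_1000` is made up for this illustration):

```python
import math

def doublings_vs_1000(score):
    """Number of doublings relative to a Geekbench 6 score of 1000."""
    return math.log2(score / 1000)

print(doublings_vs_1000(1000))  # 0.0: the reference score
print(doublings_vs_1000(2000))  # 1.0: twice the reference score
print(doublings_vs_1000(854))   # about -0.2277, the i5-5200U row of the table
```

On this scale, a straight line against launch date corresponds to a constant doubling time.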


In [36]:
cpu_list_plot = [
    'Intel Core i5-5200U',
    'Intel Core i5-5300U',
    'Intel Core i5-6200U',
    'Intel Core i5-6300U',
    'Intel Core i5-7200U',
    'Intel Core i5-7300U',
    'Intel Core i5-8250U',
    'Intel Core i5-8350U',
    'Intel Core i5-8365U',
    #'Intel Core i5-9400H',
    'Intel Core i5-10210U',
    'Intel Core i5-10310U',
    'Intel Core i5-1035G4',
    'Intel Core i5-1035G7',
    'Intel Core i5-1135G7',
    'Intel Core i5-1145G7',
    'Intel Core i5-1235U',
    'Intel Core i5-1245U',
    'Intel Core i5-1335U',
    'Intel Core i5-1345U',
    'Intel Core Ultra 5 125U',
    'Intel Core Ultra 5 135U',
    #'Intel Core Ultra 5 125H',
    #'Intel Core Ultra 5 135H',
    # AMD
    'AMD Ryzen 7 PRO 2700U',
    'AMD Ryzen 5 PRO 3500U',
    'AMD Ryzen 5 PRO 4650U',
    'AMD Ryzen 5 PRO 5650U',
    'AMD Ryzen 5 PRO 6650U',
    'AMD Ryzen 5 PRO 7540U',
    'AMD Ryzen 5 PRO 8540U'
]

In [37]:
# index labels of the rows whose Name is not in the list of CPUs to plot
i_drop = cpus.index[~cpus['Name'].isin(cpu_list_plot)].tolist()
print("dropped rows: ", i_drop)
cpus.drop(i_drop, inplace=True)

dropped rows:  [9, 22, 23]


## Plot

### Plot by launch date

To "check" Moore's law.

Highlight point on mouse over

In [None]:
highlight = alt.selection_point(name="highlight", on="pointerover", empty=False)

In [None]:
color_designer = alt.when(highlight, empty=True).then(
        alt.Color('Designer:N').scale(scheme='set1') # nice scheme: Intel=blue, AMD=red
    ).otherwise(
        alt.value('lightgray')
    )
#color_w11 = alt.when(highlight, empty=True).then('Win11').otherwise(alt.value('lightgray'))
order = alt.when(highlight).then(alt.value(1)).otherwise(alt.value(0)) # draw highlighted items on top

In [65]:
chart_date = alt.Chart(cpus).mark_point(size=150, filled=True, strokeWidth=5).encode(
    y='Launch date',
    color=color_designer,
    shape='Designer', # reinforce color
    #shape='Win11',
    order=order,
    tooltip=['Name', 'Architecture', 'Cores', 'Launch date', 'Win11'],
).add_params(
    highlight
)

chart_dsingle = chart_date.encode(
    x='GB6 Single',
).properties(
    title='Single-Core Performance'
)
chart_dmulti = chart_date.encode(
    x='GB6 Multi',
    y=alt.Y('Launch date', axis=alt.Axis(labels=False, title='')) # hide y tick labels
).properties(
    title='Multi-Core Performance'
)

(chart_dsingle | chart_dmulti)

Variant with log scale

In [None]:
chart_dsingle_log = chart_dsingle.encode(x='GB6 Single log2')
chart_dmulti_log = chart_dmulti.encode(x='GB6 Multi log2')
(chart_dsingle_log | chart_dmulti_log)

Superimposed charts: works but needs better scaling

In [63]:
(chart_dsingle_log + chart_dmulti_log)

Variant with regression line

In [61]:
chart_dsingle_logreg = chart_dsingle_log.transform_regression('Launch date', 'GB6 Single log2').mark_line()
chart_dmulti_logreg = chart_dmulti_log.transform_regression('Launch date', 'GB6 Multi log2').mark_line()

(chart_dsingle_log+chart_dsingle_logreg) | (chart_dmulti_log + chart_dmulti_logreg)

In [60]:
chart_dsingle_log + chart_dsingle_logreg + chart_dmulti_log + chart_dmulti_logreg

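Attempt to retrieve the regression parameters (the coefficient for Moore's law): *doesn't work*. `transform_regression` returns another `Chart`, and the fit itself is computed by Vega when the chart is rendered, so the coefficients are not available as a Python attribute.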

In [52]:
params_dsingle_logreg = chart_dsingle_log.transform_regression('Launch date', 'GB6 Single log2', params=True)

print(params_dsingle_logreg.coef)

AttributeError: 'Chart' object has no attribute 'coef'
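
Since the fit is not accessible from the chart object, one workaround is to compute the same kind of linear regression on the Python side. The sketch below is an illustration, not the notebook's method: it uses only the standard library, a hand-written least-squares fit, and the first nine rows of the table shown earlier (so the resulting numbers are not those of the full filtered dataset). The helper `fractional_year` is a made-up name.

```python
import math
from datetime import date

# (launch date, GB6 Single) for the first nine rows of the table shown earlier;
# illustrative subset only, not the full filtered dataset
rows = [
    (date(2015, 1, 1), 854.0),
    (date(2015, 1, 1), 908.0),
    (date(2015, 7, 1), 916.0),
    (date(2015, 7, 1), 965.0),
    (date(2016, 10, 1), 1009.0),
    (date(2017, 1, 1), 1088.0),
    (date(2017, 7, 1), 1145.0),
    (date(2017, 7, 1), 1191.0),
    (date(2019, 4, 1), 1270.0),
]

def fractional_year(d):
    """Convert a date to a decimal year, e.g. 2015-07-01 -> about 2015.5."""
    start = date(d.year, 1, 1)
    end = date(d.year + 1, 1, 1)
    return d.year + (d - start).days / (end - start).days

x = [fractional_year(d) for d, _ in rows]
y = [math.log2(score / 1000) for _, score in rows]  # same transform as 'GB6 Single log2'

# ordinary least squares for y = intercept + slope * x
mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)
slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) \
    / sum((xi - mean_x) ** 2 for xi in x)
intercept = mean_y - slope * mean_x

doubling_years = 1 / slope  # years needed for the score to double
print(f"slope: {slope:.3f} doublings/year, doubling time: {doubling_years:.1f} years")
```

In the notebook itself, the same fit could be done on the `cpus` DataFrame with `np.polyfit(x, y, 1)` after converting `Launch date` to decimal years.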

### CPUs sorted by perf

In [20]:
color_age = alt.when(highlight).then(
        alt.value('red')
    ).otherwise(
        alt.Color('Age:Q').scale(scheme='plasma', reverse=True, zero=True)
    )
size_highlight = alt.when(highlight).then(alt.value(200))

In [66]:
chart_name = alt.Chart(cpus).mark_point(size=100, filled=True).encode(
    alt.Y('Name').sort(field='GB6 Single', order='descending'),
    tooltip=['Name', 'Architecture', 'Cores', 'Launch date', 'Win11'],
    color=color_age,#'Age',
    shape='Designer',
    size=size_highlight
).add_params(
    highlight
)

chart_nsingle = chart_name.encode(
    x='GB6 Single',
).properties(
    title='Single-Core Performance'
)

chart_nmulti = chart_name.encode(
    x='GB6 Multi',
    y=alt.Y('Name', axis=alt.Axis(labels=False, title='')).sort(field='GB6 Single', order='descending') # hide y tick labels
).properties(
    title='Multi-Core Performance'
)
chart_nsingle | chart_nmulti

### Both charts, with synced highlight

Works, but the result is too big. It should be restricted to the short list only.

In [29]:
(chart_nsingle | chart_nmulti) & \
(chart_dsingle | chart_dmulti)

---
Attempt to make the ordering changeable (failed)

In [11]:
index_select = alt.binding_select(options=['GB6 Single', 'GB6 Multi'], name='sort field:')
index_param = alt.param(bind=index_select)
index_param

Parameter('param_1', VariableParameter({
  bind: BindRadioSelect({
    input: 'select',
    name: 'sort field:',
    options: ['GB6 Single', 'GB6 Multi']
  }),
  name: 'param_1'
}))

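Note: using the param as the sort field doesn't work. The `field` of a window `sort` is a literal column name, so the parameter name is looked up as a (nonexistent) column rather than resolved to the selected value. Read this example more carefully? https://altair-viz.github.io/gallery/multiple_interactions.html#gallery-multiple-interactions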

In [51]:
chart_name = alt.Chart(cpus).transform_window(
    sort=[{'field': 'param_2'}],
    frame=[None, 0],
    perf_sorted='rank(*)'
).mark_point(size=100, filled=True).encode(
    alt.Y('Name').sort(field='perf_sorted', order='descending'),#.sort(field='GB6 Single', order='descending'),
    tooltip='Name',
    #color='Win11',
    color='Age',
    shape='Designer'
).add_params(index_param)

chart_nsingle = chart_name.encode(
    x='GB6 Single',
).properties(
    title='Single-Core Performance'
)
chart_nsingle