## Visualization Builder for the Bachelor Research project "Gamepad Controls for Visualizations"

### Introduction
This file generates the Vega-Lite code used for the three connected visualizations. It is intended for live interaction while testing and tweaking. The code for the final chart versions is in a compact Python file, [`visualization_builder.py`](./visualization_builder.py), which can be run on its own to quickly generate the code or be used inside a pipeline.


In [2]:
# imports
import altair as alt
data_url = "https://raw.githubusercontent.com/nickprbs/Forschungsprojekt/refs/heads/main/yearly_avg_downsampled.csv"

### Single-Chart Pipeline ###
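Each chart created with Altair is constructed through a chain of method calls (a pipeline). Some steps are **optional**, but this notebook always keeps them in the same *order*. Altair accepts most of these calls in a different order, but transformations are applied in the sequence in which they are written, so a fixed order keeps the charts predictable and easy to compare.
Here is the order that should be followed: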
1. **Data**: Set the data source (e.g. a DataFrame or a URL).
2. **Transformation**: Transform the data (e.g. calculate new columns, filter rows, aggregate data points).
3. **Visualization**: Choose the kind of visualization, i.e. the mark type (e.g. a bar chart).
4. **Encodings**: Map data fields to visual channels (e.g. axes, colors, tooltips).
5. **Properties**: Set additional properties of the chart (e.g. width, height, title).

More information can be found in the [Vega-Altair User Guide](https://altair-viz.github.io/user_guide/data.html).
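
Since the purpose of this file is to generate Vega-Lite code, it can help to see roughly where each of the five steps ends up in a Vega-Lite specification. The following is a hand-written sketch, not output produced by Altair. The URL `data.csv` is a placeholder, and the derived field name `year` is chosen only for illustration; `time` and `tas` are the fields used by the charts in this file.

```python
import json

# Hand-written Vega-Lite specification that mirrors the five pipeline steps.
# "data.csv" is a placeholder URL and "year" is an illustrative field name.
spec = {
    # 1. Data: where the records come from
    "data": {"url": "data.csv"},
    # 2. Transformation: derive a new field from an existing one
    "transform": [
        {"calculate": "year(datum.time)", "as": "year"}
    ],
    # 3. Visualization: the mark type
    "mark": "bar",
    # 4. Encodings: map fields to visual channels
    "encoding": {
        "x": {"field": "year", "type": "ordinal", "title": "Year"},
        "y": {"field": "tas", "type": "quantitative", "aggregate": "mean"},
    },
    # 5. Properties: chart-level settings
    "width": 600,
    "height": 400,
}

# Serialize to the JSON text that a Vega-Lite renderer would consume
print(json.dumps(spec, indent=2))
```

For an Altair chart object, `chart.to_json()` returns the corresponding specification, which can be compared against this layout.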

---
### 1. Bar Chart

In [None]:
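# Basic bar chart: mean temperature per time step
# The unit label assumes tas is given in Kelvin, which matches the 250-300 y-domain used further down
bar = alt.Chart(data_url).mark_bar().encode(
    x=alt.X('time:T', title='Year'),
    y=alt.Y('tas:Q', aggregate='mean', title='Average Surface Temperature (K)')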
).properties(
    width=600,
    height=400
)

bar

This is very basic but works well. I want to improve the readability by **clustering** the years into groups of five and by **scaling** the y-axis so that the actual trends become visible. In this configuration, the data is loaded from a URL that points to a CSV file. Altair therefore never holds the data as a DataFrame, so none of the calculations can be done with *pandas*. This could become an issue, but for now the calculations are written manually as chart transformations.

**scaled**:

In [None]:
bar_scaled = alt.Chart(data_url).mark_bar().encode(
    x=alt.X('time:T', title='Year'),
    y=alt.Y('tas:Q', aggregate='mean', title='Average Surface Temperature (K)', scale=alt.Scale(domain=[250, 300]))
).properties(
    width=600,
    height=400
)
bar_scaled
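
To see why a narrower y-domain makes trends easier to read, consider how a linear scale maps data values to pixels. The sketch below is plain Python, not Altair code. The two temperature values are hypothetical and chosen only for illustration, and the pixel range ignores that the y-axis is flipped on screen.

```python
def linear_scale(value, domain, pixel_range):
    """Map a value from the data domain to a pixel position by linear interpolation."""
    d0, d1 = domain
    r0, r1 = pixel_range
    return r0 + (value - d0) / (d1 - d0) * (r1 - r0)


height = 400  # chart height in pixels, as in the cells above

# Two hypothetical yearly means that are 2 units apart (not taken from the data)
low, high = 287.0, 289.0

for domain in ([0, 300], [250, 300]):
    # Pixel distance between the two bar tops under this domain
    diff = linear_scale(high, domain, (0, height)) - linear_scale(low, domain, (0, height))
    print(domain, "->", round(diff, 2), "px")
```

With the domain `[0, 300]` the two bars differ by less than 3 pixels, while with `[250, 300]` the same difference spans 16 pixels. The trade-off is that bar lengths are no longer proportional to the values.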

**scaled + clustered**: 

In [7]:
bar_scaled_clustered = alt.Chart(data_url).transform_calculate(
    year_group = "floor(year(datum.time)/5)*5"
).mark_bar().encode(
    x=alt.X('year_group:O', title='Year (Interval of 5)', axis=alt.Axis(labelAngle=0)),
    y=alt.Y('tas:Q', aggregate='mean', title='Average Surface Temperature (K)', scale=alt.Scale(domain=[250, 300]))
).properties(
    width=600,
    height=400
)
bar_scaled_clustered
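
The expression `floor(year(datum.time)/5)*5` rounds each year down to the nearest multiple of five, which becomes the label of its group. The same arithmetic can be checked in plain Python. The helper `year_group` below is a hypothetical name used only for this illustration and is not part of Altair.

```python
import math


def year_group(year: int, size: int = 5) -> int:
    """Python equivalent of the expression floor(year / size) * size."""
    return math.floor(year / size) * size


# A few sample years and the five-year group each one falls into
for y in (2015, 2017, 2019, 2020, 2024, 2099):
    print(y, "->", year_group(y))
```

The years 2015 to 2019 all land in group 2015, the years 2020 to 2024 in group 2020, and 2099 in group 2095. The bar for each group then shows the mean of all records in those five years.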

Now I want to add a line that shows the mean for each individual year. Because this reintroduces more data points (one average per year instead of one per five years), the line needs its own x-axis: the x-axis of the bar chart is **binned** into five-year groups, whereas the line uses a continuous scale over the years.

In [22]:
# Yearly mean as a red line; the year is extracted from the timestamp
line = alt.Chart(data_url).transform_calculate(
    years = "year(datum.time)"
).mark_line(color='red', strokeWidth=3).encode(
    # Hide this axis: the bar chart already provides the year labels
    x=alt.X('years:Q', axis=None, scale=alt.Scale(domain=[2015, 2099])),
    y=alt.Y('tas:Q', aggregate='mean', scale=alt.Scale(domain=[250, 300]))
).properties(
    width=600,
    height=400
)

# Layer the line on top of the clustered bars
combined = (bar_scaled_clustered + line).resolve_axis(x='shared', y='shared')
combined