## Reminder on the steps of Altair 

1. **Import Libraries**: Import necessary libraries such as Altair and Pandas to handle data and create visualizations.
   
2. **Prepare Your Data**: Load and prepare your data using Pandas, ensuring it's in a suitable format for visualization.
   
3. **Create a Base Chart**: Use `alt.Chart()` to define a base chart with your DataFrame as the starting point.
   
4. **Choose a Mark Type**: Select the type of visualization (e.g., bar, line, scatter) using the appropriate `mark_*` method.
   
5. **Encode Data**: Map data fields to visual properties like axes, colors, and sizes using the `encode()` method.
   
6. **Display the Chart**: Render the chart in a Jupyter Notebook or save it to a file for sharing or further analysis.

### Data Types

The basic data types supported by Altair are as follows:

| Data Type     | Shorthand Code | Description                          |
|---------------|--------------|--------------------------------------|
| **quantitative** | 'Q'          | A continuous real-valued quantity   |
| **ordinal**      | 'O'          | A discrete ordered quantity         |
| **nominal**      | 'N'          | A discrete unordered category       |
| **temporal**     | 'T'          | A time or date value                |
| **geojson**      | 'G'          | A geographic shape                  |


### About Marks 

Remember that the essential elements of an Altair chart are the data, the mark, and the encoding. There are a number of available marks that you can use:

| Mark      | Method          | Description                                      |
|-----------|----------------|--------------------------------------------------|
| Arc       | `mark_arc()`    | A pie chart.                                    |
| Area      | `mark_area()`   | A filled area plot.                             |
| Bar       | `mark_bar()`    | A bar plot.                                     |
| Circle    | `mark_circle()` | A scatter plot with filled circles.            |
| Geoshape  | `mark_geoshape()` | Visualization containing spatial data.      |
| Image     | `mark_image()`  | A scatter plot with image markers.             |
| Line      | `mark_line()`   | A line plot.                                    |
| Point     | `mark_point()`  | A scatter plot with configurable point shapes. |
| Rect      | `mark_rect()`   | A filled rectangle, used for heatmaps.         |
| Rule      | `mark_rule()`   | A vertical or horizontal line spanning the axis. |
| Square    | `mark_square()` | A scatter plot with filled squares.            |
| Text      | `mark_text()`   | A scatter plot with points represented by text. |
| Tick      | `mark_tick()`   | A vertical or horizontal tick mark.            |
| Trail     | `mark_trail()`  | A line with variable widths.                    |



### About Encodings

The next step is to add visual encoding channels to the selected chart. An encoding channel specifies how a given data column should be mapped onto the visual properties of the visualization. Some of the more frequently used visual encodings are listed here:

| Channel   | Altair Class | Description                         |
|-----------|-------------|-------------------------------------|
| x         | `X`         | The x-axis value                   | 
| y         | `Y`         | The y-axis value                   | 
| x2        | `X2`        | Second x value for ranges          | 
| y2        | `Y2`        | Second y value for ranges          | 
| longitude | `Longitude` | Longitude for geo charts           |
| latitude  | `Latitude`  | Latitude for geo charts            | 
| color    | `Color`     | Color of the mark                    |
| opacity  | `Opacity`   | Transparency/opacity of the mark     |
| shape    | `Shape`     | Shape of the mark                    |
| size     | `Size`      | Size of the mark                     |
| row      | `Row`       | Row within a grid of facet plots     |
| column   | `Column`    | Column within a grid of facet plots  |


## Interactive elements in Altair

This notebook will focus on adding elements which make visualisations in Altair interactive. 

We will start by revisiting one of the previous visualisations, to explore the dataset again, and to add some interactivity to our charts.

### Basic scroll and zoom interactivity

First, let's go back to the very first dataset we looked at for visualisation purposes.
- It concerns US fish imports over recent years and comes from the US government [here](https://www.ers.usda.gov/data-products/aquaculture-data.aspx). 
- We can read it in and print out the first few lines to remind ourselves what it contains: 

In [1]:
import pandas as pd
import altair as alt

fish = pd.read_excel('fishimportdata.xlsx')
fish.head()

Unnamed: 0,Product,2014,2015,2016,2017,2018
0,"Trout, fresh and frozen",95011.936,104791.429,120977.955,135236.725,164695.746
1,"Atlantic salmon, fresh",568901.674,658831.066,842603.917,948877.175,1035924.073
2,"Pacific salmon, fresh",111540.545,69978.873,80010.18,70733.165,90324.372
3,"Atlantic salmon, frozen",21457.836,16356.453,24230.787,22492.341,20197.169
4,"Pacific salmon, frozen",224334.704,238962.677,246204.469,302340.719,360673.107


The simplest interactivity we can add is to allow scrolling and zooming on the axes we create. We add `.interactive()` to our chart to permit it.

- This can be practically useful: if we haven't quite decided on an appropriate scale, we can drag to scroll (only up and down here) and use the scroll wheel to zoom until the chart looks how we would like. 

- There is a limit when the axis in question is nominal or ordinal: the scroll won't let us move that axis. 

Here's how that looks for a bar chart visualisation of one of the years of data:


In [2]:
# Create a bar chart from the fish data
alt.Chart(fish).mark_bar(color='lightblue').encode(
    x='Product:N', # Using Product as a nominal type
    y='2014:Q', # Using the 2014 column as quantitative
).interactive() # Calling .interactive() allows scrolling and zooming

**Try the following and reflect**

- What happens if you drop `.interactive()` from your chart? 

### Combining two types on an interactive setting

Let's look at another example of two axes which are both quantitative so that we are allowed scrolling in both directions. 

- Here is a visualisation of the numbers for 2015 imports plotted against the numbers for 2014 imports. This sort of visualisation can help us to see quickly which imports have gone up or down year-on-year. 
- To help with that we can add a straight line 'y=x' on top. If a point from our data lies exactly on that line, imports were the same in both years; if it lies below the line, imports have decreased; if it lies above, they have increased. 

In [3]:
# Create a scatter plot using the 'fish' DataFrame.
a = alt.Chart(fish).mark_circle(color='blue').encode(
        x=alt.X('2014:Q', title='2014 imports'),  # Map the '2014' column (quantitative data) to the x-axis with a custom title.
        y=alt.Y('2015:Q', title='2015 imports')   # Map the '2015' column (quantitative data) to the y-axis with a custom title.
    ).interactive()  # Enable interactive features such as zooming and panning.

# Create a DataFrame for the reference red line.
# This DataFrame contains two points: (0, 0) and (5,000,000, 5,000,000) to form a diagonal line.
linedata = pd.DataFrame({
    'x': [0, 5000000],  # x-coordinates for the line.
    'y': [0, 5000000]   # y-coordinates for the line.
})

# Create a line chart using the 'linedata' DataFrame.
b = alt.Chart(linedata).mark_line(color='red').encode(
        x=alt.X('x:Q', title=''),  # Map the 'x' field (quantitative data) to the x-axis without a label.
        y=alt.Y('y:Q', title='')   # Map the 'y' field (quantitative data) to the y-axis without a label.
    )

# Overlay the red line chart 'b' on top of the scatter plot 'a'.
# The '+' operator is used in Altair to combine charts.
b + a

- Now the interaction is genuinely useful on this visualisation: if we view the chart as a whole there is one outlier on the right, which makes it difficult to see what is happening in the bottom left of the chart. 
- We could make a second zoomed version that was also static, but by allowing scrolling and zooming it is just simpler to explore the data as a whole and get a better sense of it.

**REMARK:** Note that this last example also contains an ad hoc definition of a dataframe, `linedata = pd.DataFrame({'x': [0,5000000], 'y': [0,5000000]})`, which is the first time we have used this. The structure is quite straightforward to read: it defines a dataframe with two columns, 'x' and 'y', each with two data values in them. 

- The main difficulty is just that care is needed to make sure the different brackets pair up correctly. 
- Here we are using this to provide data to be able to draw the straight line, and so only need two data points to join together. 
- Indeed this ad hoc dataframe definition is good where you quickly want to define a few points for a line or curve, any more than that and it becomes easier to enter the data into a spreadsheet and read it in more properly.

### Hover interactions with `tooltip`

We have already seen another reasonably basic interaction that Altair can overlay on our chart, namely the popup of data values when we hover the mouse over a data mark on the chart. 

- This hover interaction is called a `tooltip` in Altair, and we used it to show data values on the pie chart created in the last notebook, Set 2. 
- As a second example, let's use it to label the points in the last chart, since currently nothing indicates which product each point represents. We can add a tooltip to encode that:

In [4]:
# Create a scatter plot from the fish data, with a tooltip on each point
a = alt.Chart(fish).mark_circle(color='blue').encode(
    x=alt.X('2014:Q',title='2014 imports'), # Map the '2014' column (quantitative data) to the x-axis with a custom title.
    y=alt.Y('2015:Q',title='2015 imports'), # Map the '2015' column (quantitative data) to the y-axis with a custom title.
    tooltip=['Product','2014','2015'] # tooltip property allows hovering over a point to see details
).interactive()

a

Adding the tooltip is then as simple as listing the fields we want to include in the popup caption shown when the mouse hovers over a data point in the chart. 
In the previous notebook we also saw that the tooltip fields can include fields calculated as part of a `transform` function in Altair that aggregates data.

**Try the following and reflect**

- What happens if you drop `.interactive()` from your chart? 
- What details are shown for the observation in the upper-right corner when you move your cursor there?

### Dropdown and radio buttons for selection

Next we can reuse the same data again to look at interacting with a chart by selecting items from a dropdown list or using radio buttons. 

Recall that we previously used a fold transform to fold the years in our dataset together into one field, which also lets us refer to the individual years as values within that field. Here is the code we used to stack up the bars for each year in one chart:

In [5]:
alt.Chart(fish).transform_fold(['2014','2015','2016','2017','2018'], as_=['Years','Year']
).mark_bar().encode(
    x=alt.X('Product:N'), 
    y=alt.Y('Year:Q',stack=True),
    color='Years:N'
)
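
To see what the fold produces, it can help to build the roughly equivalent long-form table in pandas with `melt`. This is only an illustration of the reshaping, not what Altair runs internally: the fold transform also keeps the original columns alongside the new ones. The `fish_sample` DataFrame is a hand-typed stand-in with three rows and two of the year columns.

```python
import pandas as pd

# Three rows copied from the fish import data (2014 and 2015 columns only)
fish_sample = pd.DataFrame({
    'Product': ['Trout, fresh and frozen', 'Atlantic salmon, fresh', 'Pacific salmon, fresh'],
    '2014': [95011.936, 568901.674, 111540.545],
    '2015': [104791.429, 658831.066, 69978.873],
})

# Wide to long: one row per (Product, year) pair.
# 'Years' holds the former column names, 'Year' holds the import values,
# matching as_=['Years', 'Year'] in the fold transform.
long_form = fish_sample.melt(
    id_vars='Product',
    value_vars=['2014', '2015'],
    var_name='Years',
    value_name='Year',
)
```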

Transforms can be chained, so we can follow the fold with a filter transform that keeps only the year '2014' from the 'Years' field we created. The filter has to refer to a field that already exists at that point, which is why it comes after the fold. Here's how that would look:

In [6]:
alt.Chart(fish).transform_fold(['2014','2015','2016','2017','2018'], as_=['Years','Year']
).mark_bar().encode(
    x=alt.X('Product:N'),
    y=alt.Y('Year:Q',stack=True),
    color='Years:N'
).transform_filter(
    alt.FieldEqualPredicate(field='Years', equal='2014')
)

- This gives us a template for then choosing which year gets shown by using a dropdown list of the years and allowing interaction to choose one from the list which is then shown in the visualisation. 

- To do that we need to define a dropdown set of options, and a 'binding' to bind those options to a selection from a particular data field. 
- To make it all work together we then need to add the function `.add_params()` to add our binding to the chart. 

Here we also now let the transform filter use the selection we pick, so that the dropdown has the effect we want of picking out the year that gets drawn in the visualisation:

In [7]:
# Define a dropdown menu binding with year options.
year_dropdown = alt.binding_select(options=['2014', '2015', '2016', '2017', '2018'])

# Create a selection object for a single item, bound to the dropdown menu.
year_select = alt.selection_single(
    fields=['Years'],    # Selection based on the 'Years' field.
    bind=year_dropdown,  # Bind the selection to the dropdown options.
    name='Select'        # Name the selection 'Select'.
)

# Create a bar chart using the 'fish' DataFrame.
chart = alt.Chart(fish).transform_fold(
    ['2014', '2015', '2016', '2017', '2018'],  # Transform data from wide to long format.
    as_=['Years', 'Value']  # Name the folded columns: 'Years' holds the year labels, 'Value' the import values.
).mark_bar(color='lightblue').encode(
    x=alt.X('Product:N'),  # Encode the x-axis with categorical 'Product' data.
    y=alt.Y('Value:Q')      # Encode the y-axis with the quantitative 'Value' data.
).add_params(
    year_select  # Add the dropdown parameter to the chart.
).transform_filter(
    year_select  # Filter the data based on the selected year from the dropdown.
)

# Display the chart.
chart



- Hopefully you can see how this pieces together. The `alt.binding_select()` part lists the options we want in the dropdown list, and we pass them in as a list like this: `options=['2014','2015','2016','2017','2018']`. 
- Note here our years are listed as strings (text) with '' around each year, not as numbers - for a different dataset they might be listed as numbers instead, but this works for this data.

Then the binding part links the options to a field within the data we want to visualise. Here `alt.selection_single()` makes a single selection from the dropdown link to something in the chart. Then `fields=['Years']` details which field is being connected to the options, `bind=year_dropdown` connects the dropdown option list here, and `name='Select'` is the label that gets used next to the dropdown box. If you don't use a name then a default piece of text appears there and looks even uglier (the text 'Years' part appears in the label next to the dropdown even when we specify the name). This 'name' part of the interactivity just looks a little messy in the way that Altair constructs it (although this element actually depends a bit on your browser, as different ones render the dropdown to look slightly different!).

### Radio buttons

Another way of selecting from a list is to use radio buttons (round selection buttons). The possible benefit of radio buttons is that all the options are plainly visible rather than hidden in a dropdown list. The code is very similar to before; we now use the `alt.binding_radio()` function to make our radio button option list. The example also initialises the radio buttons (so one is selected by default) by specifying `value=[{'Years':'2014'}]`, which pairs the field name with the initial option.

In [8]:
# Add the radio button binding with the different years as options
year_radio = alt.binding_radio(options=['2014','2015','2016','2017','2018'])
# Create a single selection on the 'Years' field, bound to the radio buttons,
# with 2014 selected by default
year_select = alt.selection_single(fields=['Years'], 
                                   bind=year_radio, 
                                   name='Select', # Name the selection 'Select'.
                                   value=[{'Years':'2014'}])

# Transform data from wide to long format with transform_fold()
alt.Chart(fish).transform_fold(['2014','2015','2016','2017','2018'], as_=['Years','Year']
).mark_bar(color='lightblue').encode(
    x=alt.X('Product:N'), # X axis for the product as nominal
    y=alt.Y('Year:Q'), # Y axis for the folded import values (quantitative)
).add_params(
    year_select # Add the radio buttons to the chart.
).transform_filter(
    year_select # Filter the data based on the year selected with the radio buttons.
)



With an understanding of the parameter types and conditions, you can now bind parameters to chart elements (e.g. legends) and widgets (e.g. drop-downs and sliders). This is done using the bind option inside param and selection.

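Altair offers the following selection types:

- Interval Selection: `alt.selection_interval()`
- Single Selection: `alt.selection_single()`
- Multi Selection: `alt.selection_multi()`

In Altair 5, `alt.selection_single()` and `alt.selection_multi()` are deprecated in favour of `alt.selection_point()`, so you may see a deprecation warning when running these examples.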

### Selection Example for Interval with `alt.selection_interval`

As an example of a selection, let’s add an interval selection to a chart.

- To add selection behavior to a chart, we create the selection object and attach it with the `add_params` method, as in the previous examples (`add_selection` is the older, deprecated name for the same step).
- This adds an interaction to the plot that lets us drag out a region on the plot; perhaps the most common use of a selection is to highlight points by conditioning their color on the result of the selection.


In [16]:
interval = alt.selection_interval()

# Create a bar chart with transformed data for multiple years
alt.Chart(fish).transform_fold(
    ['2014', '2015', '2016', '2017', '2018'],  # Convert wide data into long format
    as_=['Years', 'Value']  # Rename columns: 'Years' for labels, 'Value' for values
).mark_bar(color='lightblue'  # Set bar color
).encode(
    x=alt.X('Product:N'),  # Categorical x-axis for products
    y=alt.Y('Value:Q'),  # Quantitative y-axis for yearly values
).add_params(
    interval  # Add the interval selection to the chart
)


### Legend interaction with `alt.selection_multi()`

Another nice interaction that Altair offers is to be able to select one of the items in the legend and then do something with it. 

- Here's how that looks, we don't have to specify an initial options list as we don't need to create a new element, the options are just the legend. 
- We do need the binding part though now specified as `alt.selection_multi()`, and with the option `bind='legend'` specified to connect to the legend as the means for selection.

In [11]:
# Create a multi-selection for years, linked to the legend
year_select = alt.selection_multi(fields=['Years'], bind='legend')

# Create a line chart with transformed data for multiple years
alt.Chart(fish).transform_fold(
    ['2014', '2015', '2016', '2017', '2018'],  # Convert wide data into long format
    as_=['Years', 'Value']  # Rename columns: 'Years' for labels, 'Value' for values
).encode(
    x=alt.X('Product:N'),  # Categorical x-axis for products
    y=alt.Y('Value:Q', title='Imports $'),  # Quantitative y-axis with custom title
    color='Years:N'  # Different colors for each year
).mark_line(point=True  # Add points to the line plot
).add_params(year_select)  # Add the selection to the chart



- Well that's worked, but when we make a selection it doesn't do anything to the chart as we haven't used the selection we've made anywhere to change anything about the plot. 
- To do something let's make the year we select highlighted by making it coloured and making the other lines a bit more transparent. 
- We can control that by another option `opacity` which we haven't yet used and an `alt.condition()` function which we also haven't seen yet. 
- Opacity just makes a line more or less opaque by supplying a number between 0 (invisible) and 1 (normally coloured). 

Here, the `alt.condition()` function allows us to make something dependent on a condition. We need to supply three things to it: a condition, a value if the condition is true, and a value if the condition is false. Our condition is whether the year is selected in the legend, which comes from our binding `year_select`. The values are specified as numbers wrapped in `alt.value()`, so that Altair treats them as constant values rather than field names. The value if true is `alt.value(1)`, so opacity 1, and the value if false is `alt.value(0.2)`, so opacity 0.2.

Here's how that looks:

In [17]:
# Create a multi-selection for years, linked to the legend
year_select = alt.selection_multi(fields=['Years'], bind='legend')

# Create a line chart with transformed data for multiple years
alt.Chart(fish).transform_fold(['2014','2015','2016','2017','2018'], as_=['Years','Year']).encode(
    x=alt.X('Product:N'), # Categorical x-axis for products
    y=alt.Y('Year:Q',title='Imports $'), # Quantitative y-axis with custom title
    color='Years:N', # Different colors for each year
    opacity=alt.condition(year_select, alt.value(1), alt.value(0.2))
).mark_line(point=True).add_params(year_select)


- The condition is evaluated for every line. While nothing is selected, every line counts as selected and is drawn in full colour. Once a selection is made, the condition is true for the selected year, which stays in full colour, and false for all the other lines, which are drawn with opacity 0.2.

- This sort of legend interaction is useful for pulling out and highlighting individual lines, and is particularly appropriate where lines overlap, as in the above visualisation.

### Sliders for selection

One other method for selection is a slider. This is a similar kind of selector to the radio buttons above, but it only works for selecting a number, that is, a quantitative field.

- We can introduce this using our UK wood import data, in fact we will use a long form of the previous UK wood import data set which was presented in wide form in the exercises previously. 
- For the difference between long and wide forms of data, have a look at this page in the Altair documentation: https://altair-viz.github.io/user_guide/data.html#long-form-vs-wide-form-data.
- In short, both ways of formatting the data present the same things but are easier to index in one form or the other to produce a particular visualisation we might want. 

In [18]:
# Importing the data set and looking at the first few observations
wood = pd.read_excel('UK_wood_imports2.xlsx')
wood.head()

Unnamed: 0,Year,Type,Amount
0,2013,Sawnwood,5488
1,2014,Sawnwood,6425
2,2015,Sawnwood,6323
3,2016,Sawnwood,6646
4,2017,Sawnwood,7580


- The related code snippet for adding a slider looks really quite similar to the previous selection methods so we can work through it here in one example. 
- We'll add a slider to select the year, and make a bar chart from the data for that year.

The main change here is in the selection tool, to obtain a slider we use `alt.binding_range()` and then need to specify a minimum and maximum value for our slider to index, and also the step each time we move the slider. Here we choose to go from 2013 to 2017 in steps of 1 to cover all the possible years.

- The binding looks much as before, and here we use our selection as the input to a filter transform bound to the field 'Year' in order to filter out only the year we select in our data.

In [19]:
year_slider = alt.binding_range(min=2013, max=2017, step=1)
slider_selection = alt.selection_single(bind=year_slider, fields=['Year'], name="Select",value=[{'Year':2015}])

alt.Chart(wood).mark_bar(color='lightblue').encode(
    x=alt.X('Type:N'), 
    y=alt.Y('Amount:Q')
).add_params(
    slider_selection
).transform_filter(
    slider_selection
)



- The main drawback of a slider is that Altair only permits one for numeric fields, and in practice we have seen that years are often given as strings, e.g. '2013' rather than 2013, which wouldn't allow us to use one.



## Optional: Selection brushes and linked charts




Another important interactive element available in Altair is linked selections across multiple charts. 

- That is a selection tool (usually called a brush or interval tool) which allows the selection of some data points in a chart using the mouse, which then changes a second linked chart.
- The 'interactive' section of the gallery in the Altair documentation https://altair-viz.github.io/gallery/index.html#interactive-charts has a good few examples of this type of interaction. 
- We will present one more example of this sort of linked interaction below using an example dataset measuring marine conservation as part of the UN sustainable development goals from the UN site [here](https://unstats.un.org/sdgs/UNSDG/IndDatabasePage).

In [20]:
marine = pd.read_excel('UNSD_marine_conservation.xlsx')
marine.head()

Unnamed: 0,Goal,Target,Indicator,SeriesCode,SeriesDescription,GeoAreaCode,GeoAreaName,Nature,Reporting Type,Units,...,Year 2012,Year 2013,Year 2014,Year 2015,Year 2016,Year 2017,Year 2018,Year 2019,Year 2020,Average
0,14,14.5,14.5.1,ER_MRN_MPA,Average proportion of Marine Key Biodiversity ...,8,Albania,C,G,PERCENT,...,63.3863,70.65466,70.65466,70.65466,70.69782,70.69782,70.69782,70.69782,70.69782,47.46875
1,14,14.5,14.5.1,ER_MRN_MPA,Average proportion of Marine Key Biodiversity ...,12,Algeria,C,G,PERCENT,...,76.58298,76.58298,76.58298,76.58298,76.58298,76.58298,76.58298,76.58298,76.58298,71.324868
2,14,14.5,14.5.1,ER_MRN_MPA,Average proportion of Marine Key Biodiversity ...,16,American Samoa,C,G,PERCENT,...,14.3148,14.3148,14.3148,14.3148,14.3148,14.3148,14.3148,14.3148,14.3148,14.311829
3,14,14.5,14.5.1,ER_MRN_MPA,Average proportion of Marine Key Biodiversity ...,24,Angola,C,G,PERCENT,...,66.58023,66.58023,66.58023,66.58023,66.58023,66.58023,66.58023,66.58023,66.58023,66.58023
4,14,14.5,14.5.1,ER_MRN_MPA,Average proportion of Marine Key Biodiversity ...,660,Anguilla,C,G,PERCENT,...,9.34433,9.34433,9.34433,9.34433,9.34433,9.34433,13.43639,13.43639,13.43639,9.92891


- This is essentially the same Excel data file that is available from the UN site, except that I have added an 'Average' column at the end to average the yearly columns. 
- Each of these records the percentage of key biodiverse marine areas which are protected, listed by geographic area.

To start to look at this we can plot that new 'Average' column for each of the geographic areas contained in the file.

In [21]:
# Charting a bar graph of the Average value for each GeoAreaName category
alt.Chart(marine).mark_bar(color='lightblue').encode(
    x=alt.X('GeoAreaName:N'), 
    y=alt.Y('Average:Q')
)

Given the number of bars, it might be helpful to reduce the amount of data a little. To do that we could sort the dataframe by the 'Average' column with the `.sort_values()` function in pandas and then take the 30 areas with the largest averages by slicing the first 30 rows of the sorted dataframe. Note that `marine[0:30]` on its own simply selects the first 30 rows in file order, so the two options below do not show the same areas:

In [22]:
# Option 1: take the first 30 rows in file order (not the 30 largest) and let Altair sort the bars
alt.Chart(marine[0:30]).mark_bar(color='lightblue').encode(
    x=alt.X('GeoAreaName:N', sort='-y'),  # Sort x-axis based on descending y values
    y=alt.Y('Average:Q')
)

In [23]:
# Option 2 using pandas function before plotting to sort values and then plot
# Sort the DataFrame in descending order based on 'Average'
marine_sorted = marine.sort_values('Average', ascending=False)

# Create the sorted bar chart
alt.Chart(marine_sorted[0:30]).mark_bar(color='lightblue').encode(
    x=alt.X('GeoAreaName:N', sort=None),  # 'sort=None' ensures it respects DataFrame order
    y=alt.Y('Average:Q')
)
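
The difference between slicing the first rows and taking the largest values can be checked on a small example. The sketch below uses the five rows of the marine data shown in the `head()` output above (names and 'Average' values only); pandas' `nlargest` is offered as an alternative to sorting and slicing.

```python
import pandas as pd

# The first five rows of the marine data (GeoAreaName and Average only)
marine_sample = pd.DataFrame({
    'GeoAreaName': ['Albania', 'Algeria', 'American Samoa', 'Angola', 'Anguilla'],
    'Average': [47.46875, 71.324868, 14.311829, 66.58023, 9.92891],
})

# Slicing keeps the first rows in file order, whatever their values
first_three = marine_sample[0:3]

# Sorting first, then slicing, keeps the rows with the largest averages
top_three_sorted = marine_sample.sort_values('Average', ascending=False)[0:3]

# nlargest does the same in one step
top_three = marine_sample.nlargest(3, 'Average')

first_names = first_three['GeoAreaName'].tolist()
top_names = top_three['GeoAreaName'].tolist()
```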

**Try the following and reflect**

- What happens if you use `ascending=True` instead? 
- Can you drop the input parameter `sort=None`, or is it necessary?
- How would you create your visual in ascending order with Option 1?

### Focusing on some areas randomly

- We can also apply our usual fold transformation to make the year columns long rather than wide and allow us to make a visualisation of the time series for different regions. 
- Here we also use a filter transformation to pick out just a few geographic areas chosen at random.

In [24]:
# Create a line chart to visualize marine biodiversity protection percentages over the years
alt.Chart(marine).mark_line().transform_fold(
    # Convert wide-format data into long format for easier visualization
    ['Year 2012', 'Year 2013', 'Year 2014', 'Year 2015',
     'Year 2016', 'Year 2017', 'Year 2018', 'Year 2019', 'Year 2020'],
    as_=['Years', 'Year']  # Renaming the columns: 'Years' (X-axis), 'Year' (Y-axis)
).encode(
    x=alt.X('Years:N', title='Year'),  # Set the x-axis as categorical (N = Nominal)
    y=alt.Y('Year:Q', title='Percentage of marine biodiversity protected'),  # y-axis is quantitative
    color=alt.Color('GeoAreaName:N')  # Different colors for each geographic area
).transform_filter(
    # Filter data to show only selected countries/regions
    alt.FieldOneOfPredicate(field='GeoAreaName', oneOf=['Albania', 'Bermuda', 'China', 'Congo', 'Ireland'])
)


**What does this do?**

- Transforms wide data into a long format (so all years appear under a single column).
- Uses a line plot to show marine biodiversity protection over time.
- Filters data to show only selected countries (Albania, Bermuda, China, Congo, Ireland).
- Uses color encoding to differentiate between countries.

### Combining both charts for the linked one!

- Now we can put together the two parts of the visualisation we worked on, and with an interactive selection link them, so that we can select a country on the 'Average' bar chart visualisation and have the time series for that country appear once we select it.
- We will use a brush binding so that the mouse selection allows us to select a single item in the bar chart, and from that selection grab the field 'GeoAreaName' that connects to the other chart, allowing us to filter for just that area name when we plot the time series. Put together, here is how that would work:

In [26]:
# Create a selection tool (clicking a bar will filter the line chart)
brush = alt.selection_single(empty='all', fields=['GeoAreaName'])

# Create a bar chart showing the average marine protection for the top 30 regions
a = alt.Chart(marine_sorted[0:30]).mark_bar(color='lightblue').encode(
    x=alt.X('GeoAreaName:N'),  # Set x-axis as categorical (GeoAreaName)
    y=alt.Y('Average:Q')  # Set y-axis as quantitative (Average marine protection)
).add_params(
    brush  # Add the interactive selection tool
).properties(width=300)  # Set chart width

# Create a line chart that updates when a region is selected from the bar chart
# (it uses the same top-30 rows as the bar chart, so every bar has a matching time series)
b = alt.Chart(marine_sorted[0:30]).mark_line(point=True).transform_fold(
    # Convert wide-format data into long format for time-series visualization
    ['Year 2012', 'Year 2013', 'Year 2014', 'Year 2015',
     'Year 2016', 'Year 2017', 'Year 2018', 'Year 2019', 'Year 2020'],
    as_=['Years', 'Year']  # Renaming the columns: 'Years' (X-axis), 'Year' (Y-axis)
).encode(
    x=alt.X('Years:N'),  # Set x-axis as categorical (Years)
    y=alt.Y('Year:Q'),  # Set y-axis as quantitative (Yearly protection %)
    color=alt.Color('GeoAreaName:N')  # Different colors for different regions
).transform_filter(
    brush  # Filter data dynamically based on bar chart selection
)

# Display the bar chart and line chart side by side
alt.hconcat(a, b)

**What does the above code snippet do?**

- Creates an interactive bar chart for the top 30 regions with the highest average marine protection.
- Creates a line chart that updates dynamically when a user selects a region from the bar chart.
- Uses transform_fold to restructure the data into a long format for the time-series chart.
- Uses add_params(brush) to link the selection tool in the bar chart.
- Uses transform_filter(brush) to make the line chart respond dynamically to selection.
- Horizontally concatenates (hconcat) the bar chart and line chart.

If you select a particular bar on the left the full time series of only that area is shown on the right hand chart.

## Open-Ended Exercises

Here are some more exercises around interactive charts for you to try out and reflect on after covering the above examples. Again any of these are suitable visualisations to add into your portfolio.

1) The penguins dataset contained in `penguins.csv` is a nice set of data for different visualisations that contains measurements of various characteristics of some small penguin species (see [here](https://allisonhorst.github.io/palmerpenguins/) for some details about it and some visualisation ideas). 

- Try making some interactive visualisations of this data.   
- You can mimic the above examples to start with and customize further step by step.

2) The files `county-income.xlsx` and `county-medianhouseprice.xlsx` both come from the [ONS](https://www.ons.gov.uk/peoplepopulationandcommunity/housing/datasets/ratioofhousepricetoworkplacebasedearningslowerquartileandmedian) (Office for National Statistics). 

- They contain data on average incomes and average house prices over time for each English county. 
- They have a very similar format to the marine conservation dataset discussed above. 
- Try making some interactive visualisations of them, maybe try linking two charts (you might want to make a new version of the spreadsheet with an 'Average' column to average across all the years). 

3) Another dataset from the UN concerning their sustainable development goals [UNSD](https://unstats.un.org/sdgs/dataportal) is contained in `AG_PRD_FIESSN.xlsx`. 

- This one is directly from their website with no pre-formatting done to curate the data. It is a dataset containing information about the number of people facing food insecurity in different geographic areas, split by sex, age and year. 
- Again, try making some visualisations of this - you may want to do some pre-processing to clean up the data in Excel and using pandas, it is certainly a good idea to do some filtering to get a reduced version of the data that is more focussed.

## Further Exercises

- https://altair-viz.github.io/altair-tutorial/README.html is a good supplementary website to browse for particular types of visualisation and to take inspiration from them. 

- You can try to create new versions of the Fringe data set example from the Python script `From_Scratch`, improving it step by step. 

- Try the open-ended questions and let us know about your journey! The mini exercises can also be reused later: once you have created an answer for one specific dataset, you can adapt your code for another dataset. 