![Callysto.ca Banner](https://github.com/callysto/curriculum-notebooks/blob/master/callysto-notebook-banner-top.jpg?raw=true)

<a href="https://hub.callysto.ca/jupyter/hub/user-redirect/git-pull?repo=https%3A%2F%2Fgithub.com%2Fcallysto%2Fdata-viz-of-the-week&branch=main&subPath=anthromes/anthromes.ipynb&depth=1" target="_parent"><img src="https://raw.githubusercontent.com/callysto/curriculum-notebooks/master/open-in-callysto-button.svg?sanitize=true" width="123" height="24" alt="Open in Callysto"/></a>


# Callysto’s Weekly Data Visualization

## Anthromes

### Recommended grade levels: 7-12

### Instructions: “Run” the cells to see the graphs
Click “Cell” and select “Run All”. <br>This will import the data and run all the code, so you can see this week's data visualizations (scroll to the top after you’ve run the cells). <br>**You don’t need to do any coding**.

![instructions](https://github.com/callysto/data-viz-of-the-week/blob/main/images/instructions.png?raw=true)

### About the notebook

Callysto's Weekly Data Visualization is a learning resource that aims to develop data literacy skills. We provide grades 5-12 teachers and students with a data visualization, like a graph, to interpret. This companion resource walks learners through how the data visualization is created and interpreted by a data scientist. 

The steps of the data analysis process are listed below and applied to each weekly topic.

1. Question - What are we trying to answer? 
2. Gather - Find the data source(s) you will need. 
3. Organize - Arrange the data so that you can easily explore it. 
4. Explore - Examine the data to look for evidence to answer our question. This includes creating visualizations. 
5. Interpret - Explain how the evidence answers our question. 
6. Communicate - Reflect on the interpretation. 

## 1. Question

Have you ever wondered how human civilizations have used the Earth's land?

For instance, how have farming and industrialization changed land-use patterns? Can we see this in a dataset of land use?


### Goal
Our goal is to show the gradual then sudden change in land-use patterns from prehistory to modern day.

We will use a "stacked line graph" showing the land area in each land-use category, ranging from urbanized to wild lands.


## 2. Gather
The code below will import the Python programming libraries we need to gather and organize the data to answer our question.

In [None]:
%pip install -r requirements.txt
import pyodide_http
pyodide_http.patch_all()

import pandas as pd #the pandas library is used to organize our data into tables known as "pandas dataframes"
import os #used to create OS agnostic file paths
from plotly.subplots import make_subplots #used to create our interactive plots
import plotly.graph_objects as go #used to create our interactive plots

### About our data

Models of historic land use can provide evidence for future scenarios of land-use change. Our dataset is from the Netherlands Environmental Assessment Agency's [History Database of the Global Environment](https://themasites.pbl.nl/tridion/en/themasites/hyde/). The [particular dataset](https://dataportaal.pbl.nl/downloads/HYDE/HYDE3.2/) we are using covers anthromes since 10000 BCE. Anthromes, or [Anthropogenic Biomes](https://en.wikipedia.org/wiki/Anthropogenic_biome), are a classification system for the Earth's surface based on how much humans have altered the land.  

The dataset breaks down into 6 main categories. These are described below.

| Name              | Description                                                                  |
|-------------------|------------------------------------------------------------------------------|
| Dense Settlements | Urban and other nonagricultural dense settlements                            |
| Villages          | Densely populated agricultural settlements                                   |
| Croplands         | Lands used mainly for annual crops                                           |
| Rangelands        | Lands used for pasture and livestock grazing                                 |
| Seminatural       | Inhabited lands with minor use for permanent agriculture and settlements     |
| Wild              | Lands without human populations or substantial land use                      |

The dataset also uses four different time intervals for its measurements. These are described in the table below.

| Interval  | Date Range      |
|-----------|-----------------|
| Millennia | 10000-1000 BCE  |
| Century   | 0-1600 CE       |
| Decade    | 1700-2000 CE    |
| Year      | 2001-2017 CE    |

Much of the modern data comes from recorded measurements, but the earlier data points were estimated using historical and archeological methods. [This paper](https://www.mdpi.com/2073-445X/9/5/129) describes the dataset in greater detail.

### Importing our data
This next block of code will read the data file and save it in a dataframe named `df`.

In [None]:
path = os.path.join("data", "anthromes_summary.txt") #create an OS agnostic file path
df = pd.read_csv(path, sep=r"\s+") # create df; columns are separated by whitespace (replaces the deprecated delim_whitespace=True)
df = df.set_index("Class_id") # set index of df
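
To see what reading a whitespace-separated file looks like on a small scale, here is a minimal sketch. The sample text, its column labels, and its numbers are invented for illustration and are not taken from the real data file.

```python
import io
import pandas as pd

# Hypothetical miniature of a whitespace-separated summary file.
# The column labels and numbers are invented for illustration only.
sample_text = """Class_id -10000 0 2017
11 0.0 1.5 20.0
12 0.0 2.5 30.0
61 900.0 800.0 500.0
"""

# sep=r"\s+" splits each line on any run of spaces or tabs
sample = pd.read_csv(io.StringIO(sample_text), sep=r"\s+")
sample = sample.set_index("Class_id")  # use the class id as the row label
print(sample)
```

Each row is now labelled by its class id, and each column by a date.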

In [None]:
df.head() # display df

### Comment on the data

The dataframe above encodes the categories we want in its index, but they are broken down into granular subcategories. It also makes no distinction between data points separated by a millennium and those separated by a year.

## 3. Organize

The code below will reorganize our data and break it into meaningful subparts so we can create our plots. 

In [None]:
# The lines below sum subcategories into the 6 main categories we described earlier

df.loc["Dense Settlements"] = df.loc[[11, 12]].sum()
df.loc["Villages"] = df.loc[[21, 22, 23, 24]].sum()
df.loc["Croplands"] = df.loc[[31, 32, 33, 34]].sum()
df.loc["Rangeland"] = df.loc[[41, 42, 43]].sum()
df.loc["Seminatural"] = df.loc[[51, 52, 53, 54]].sum()
df.loc["Wild"] = df.loc[[61, 62, 63]].sum()

In [None]:
df = df[-6:] # remove all the original granular data
df #display the new df
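
The subcategory ids used above share a first digit within each main category (11 and 12, 21 to 24, and so on). As an alternative sketch, and not the method this notebook uses, the same kind of sum can be computed by grouping rows on that first digit. The toy values and the `names` mapping below are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical subcategory areas; the numbers are invented for illustration.
toy = pd.DataFrame(
    {"-10000": [0.0, 0.0, 5.0, 7.0], "2017": [2.0, 3.0, 1.0, 4.0]},
    index=[11, 12, 61, 62],
)

# Assumed mapping from the tens digit of a class id to a main category name
names = {1: "Dense Settlements", 6: "Wild"}

# Integer-divide each id by 10 to get its group, then sum the rows in each group
grouped = toy.groupby(toy.index // 10).sum().rename(index=names)
print(grouped)
```

This gives one row per main category without listing every subcategory id by hand.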

Next, we'll break the df down into 4 sub dataframes based on the time interval between data points.

In [None]:
#these commands create our 4 new dataframes based on time interval
df_millennia = df.iloc[:, :11]
df_centuries = df.iloc[:, 10:28]
df_decades = df.iloc[:, 27:58]
df_years = df.iloc[:, 57:]
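
Notice that neighbouring slices above share a boundary position (for example, `:11` stops where `10:` starts one column earlier), so the last column of one dataframe is also the first column of the next. A small sketch with invented values shows how `iloc` slicing behaves:

```python
import pandas as pd

# Hypothetical table with one row and six year columns (values invented)
toy = pd.DataFrame(
    [[1, 2, 3, 4, 5, 6]],
    index=["Wild"],
    columns=[-2000, -1000, 0, 100, 200, 300],
)

# iloc slices by position, and the stop position is excluded
early = toy.iloc[:, :3]  # columns at positions 0, 1, 2
later = toy.iloc[:, 2:]  # starts at position 2, so year 0 appears in both

print(list(early.columns))
print(list(later.columns))
```

Sharing the boundary column means each plot starts exactly where the previous one ends.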

## 4. Explore

To explore our dataset, we'll create 4 plots, one for each of the dataframes we created above. If we tried to create one plot with a linear time scale, the more frequent data points would be squished together and it would be hard to see many details.
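
As a rough check of this "squishing" effect, the sketch below works out how much of a single linear time axis each date range from the interval table would take up. The `ranges` dictionary simply restates that table and is not part of the notebook's own code.

```python
# Date ranges from the interval table (negative years are BCE)
ranges = {
    "Millennia": (-10000, -1000),
    "Century": (0, 1600),
    "Decade": (1700, 2000),
    "Year": (2001, 2017),
}

total_span = 2017 - (-10000)  # full width of a single linear time axis

# Percentage of the axis that each range would occupy
shares = {
    name: (end - start) / total_span * 100
    for name, (start, end) in ranges.items()
}

for name, share in shares.items():
    print(f"{name}: {share:.2f}% of the axis")
```

The yearly data would occupy well under 1% of the axis, which is why we give each interval its own plot.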

In [None]:
#this function adds traces to a plotly fig object

def add_traces(fig, df, row, col, traces, showlegend=False):
    colors = ['grey', 'teal', 'yellow', 'brown', 'blue', 'green']
    x = df.columns
    # enumerate gives us a counter i alongside each trace name
    for i, trace in enumerate(traces):
        fig.add_trace(
            go.Scatter(
                x=x,
                y=df.loc[trace].array,
                legendgroup="group" + str(i),
                name=trace,
                mode='lines+markers',
                stackgroup='one',
                showlegend=showlegend,
                line_color=colors[i],
            ),
            row=row,
            col=col,
        )
    return fig

In [None]:
anthromes = ["Dense Settlements", "Villages", "Croplands", "Rangeland", "Seminatural", "Wild"]
#create fig with subplots
fig = make_subplots(rows=2, cols=2,
                   subplot_titles=['Millennia', 'Centuries', 'Decades', 'Years'],
                   x_title="Year (negative dates are BCE)",
                   y_title="km squared")

#use the add_traces function to create the four plots
fig = add_traces(fig, df_millennia, 1, 1, anthromes, True)
fig = add_traces(fig, df_centuries, 1, 2, anthromes)
fig = add_traces(fig, df_decades, 2, 1, anthromes)
fig = add_traces(fig, df_years, 2, 2, anthromes)

#add title
fig.update_layout(title_text='Changes in Anthromes Across Time')
#show plot
fig.show()
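
In a stacked graph, each line is drawn on top of the ones before it, so the top line shows the total. The sketch below imitates that stacking with a running total in pandas. It only illustrates the idea and is not how plotly computes it internally. The values are invented.

```python
import pandas as pd

# Hypothetical areas for three categories at three dates (values invented)
toy = pd.DataFrame(
    {-1000: [1, 2, 97], 0: [2, 8, 90], 1000: [5, 15, 80]},
    index=["Villages", "Croplands", "Wild"],
)

# Running total down the rows: each row is the height at which
# that category's line would sit in a stacked plot
stacked = toy.cumsum(axis=0)
print(stacked)
```

Because the last row is the sum of all categories, a flat top line means the total land area stays constant while the categories within it change.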

## 5. Interpret
Below, we will discuss the plot we created and how to examine it.

First, when looking at these plots we must keep in mind that the interval between data points varies from plot to plot. The top left plot (where data points are separated by 1000-year intervals), covering 10000 BCE to 0, is very stable, with very little land devoted to settlements or agriculture. Between 0 and 1700 we see more change than in the previous 10000 years. The plot with the most change is the one between 1700 and 2000. Here, we see a dramatic shift away from seminatural lands to agricultural lands, and the emergence of more villages around the turn of the last century. Finally, between 2000 and 2017 we note little change.

To help us see this change more clearly, the creators of this dataset produced world maps showing land use. Look at the change between 10000 BCE and 2017.

![10000BCE](data/anthromes12k_010000_10000BC.png)
![2017](data/anthromes12k_022017_02017AD.png)
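
One way to put numbers on shifts like these is to convert each category's area into a percentage of the total at each date and then compare two dates. The sketch below uses invented values, not figures from the dataset, so only the method carries over.

```python
import pandas as pd

# Hypothetical areas at two dates (values invented for illustration)
toy = pd.DataFrame(
    {-10000: [0, 40, 160], 2017: [30, 90, 80]},
    index=["Croplands", "Seminatural", "Wild"],
)

# Divide each column by its total to get each category's share in percent
percent = toy.div(toy.sum(axis=0), axis=1) * 100

# Change in share, in percentage points, between the two dates
change = percent[2017] - percent[-10000]
print(change)
```

A positive number means the category takes up a larger share of the land at the later date, and a negative number means a smaller share.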

## 6. Communicate
### Reflect on the interpretation
What can data reveal about changes in land use over time?

**Continuity and change**
- What has stayed the same and what has changed across time?

**Cause and effect**
- What human activities and natural phenomena affect change in land use?
- How can human activities contribute to solutions to land-use change?

**Ethics**
- How can personal and societal choices impact change?
- How might land-use change impact society or the economy?

[![Callysto.ca License](https://github.com/callysto/curriculum-notebooks/blob/master/callysto-notebook-banner-bottom.jpg?raw=true)](https://github.com/callysto/curriculum-notebooks/blob/master/LICENSE.md)