### Please run this notebook in Jupyter's classic interface.
- To launch the classic interface go to `Help` -> `Launch Classic Notebook` and reopen the notebook.
- If you cannot find the `Launch Classic Notebook` option under the `Help` menu, it means you are already using the classic interface.

<img src="images/banner_cc_eng.png" width="890" align="center"/>

<a id='intro'></a>


# Carbon Cycle & Drought

### Programming exercises in Python using ICOS CO$_2$-data

This Jupyter Notebook is dedicated to explaining the carbon cycle and focuses on topics regarding daily and yearly fluctuations in the concentration and uptake of carbon dioxide. It contains definitions, short descriptions, figures and animations that describe the following terms: **carbon dioxide**, **carbon cycle**, **carbon sinks**, **carbon sources**, **photosynthesis** and **Gross Primary Production (GPP)**. You will be able to test your knowledge on carbon by taking the _Carbon Challenge Quiz_. In order to better comprehend the carbon cycle with real-life examples, the notebook also includes exercises using ICOS data from Hyltemossa research station in Sweden.

Calculating statistics over several thousand values by hand (i.e. using pen and paper) can be a cumbersome task, and the probability of making an error in the calculations grows as the number of values increases. Computer programming allows us to execute computations over large numbers of values faster and more accurately. The purpose of the following programming exercises is for you to gain first-hand experience of how easy it is to calculate statistics over relatively *large* datasets using just a few lines of code. Several step-by-step exercises have been developed to acquaint you with the basic principles of Python programming.

Programming-wise you will learn how to:

-  Import and read data from csv-files
-  Organize data in arrays/DataFrames
-  Extract data from arrays/DataFrames with the help of an index
-  Calculate statistics over the data values of an array/DataFrame
-  Create static and interactive plots


The programming exercises build on topics related to the daily and yearly patterns of uptake and release of carbon dioxide amongst different parts of an ecosystem. You will have the chance to test your newly acquired knowledge in programming and climate science in the final exercise, where you will be asked to help a climate scientist determine whether the drought during the summer of 2018 had any effect on the vegetation near Hyltemossa research station.

<br>
The notebook is divided into the following parts:

-  [1. What is carbon dioxide (CO$_2$)?](#co2_definition)


-  [2. Carbon cycle](#carbon_cycle_definition)


-  [3. A year in the life of Earth's CO$_2$](#co2_youtube)


-  [4. Quiz: Carbon challenge - How well do you know carbon?](#quiz_environmentalist)


-  [5. Exercises with ICOS-data from Hyltemossa research station](#exercise_icos_htm)
    -  [5.1. Import Python modules](#exercise_import_modules_py)
    -  [5.2. Read data from CSV-files and store them in Python arrays](#exercise_read_csv_to_pandas)
    -  [5.3. Indexing](#add_index_to_pandas)
    -  [5.4. Calculate statistics](#pandas_stats)
    -  [5.5. Plotting](#exercise_plot_data_bokeh)


-  [6. Exercise: Help a climate scientist](#final_exercise_py) 


-  [7. References](#references) 


<br>
<br>
<br>
<br>

## Instructions on How to Use this Jupyter Notebook

### <span style="color:#cb4154">Run the Notebook</span>
To execute this Jupyter Notebook, go to the top menu and click on **Kernel** and then **Restart & Run All**. 
<br>
<br>

<img src="images/restart_run_all_nb_pic.png" width="290" align="center"/>

<br>
<br>
<br>
<br>

### <span style="color:#cb4154">Run a single code-cell</span>
A Jupyter notebook contains code-cells. You can write Python code in a code-cell and then execute that code by clicking on the **Run**-button in the top menu. 
<br>
<br>

<img src="images/run_code_cell_nb.png" width="580" align="center"/>

<br>
<br>


### <span style="color:#cb4154">Active code-cell</span>
Note that when you click on the **Run**-button in the top menu, only the code inside the currently active code-cell will be executed. Active code-cells are highlighted in blue or green. To activate a code-cell, simply click on it.
<br>
<br>

<img src="images/marked_code_cell_nb_py_eng.png" width="580" align="center"/>

<br>
<br>


### <span style="color:#cb4154">Add a new code-cell</span>

To add a new code-cell under a currently active code-cell, click on the **"+"**-button in the top menu.
<br>
<br>

<img src="images/add_new_code_cell_nb.png" width="580" align="center"/>

<br>
<br>

<br>
<br>
<br>
<br>

<a id='co2_definition'></a>

## 1. What is carbon dioxide (CO$_2$)?


### 1.1. Definition
Carbon dioxide is a chemical compound that is composed of one carbon atom and two oxygen atoms. It is represented by the chemical formula: **CO$_2$**. Under normal temperature and pressure conditions, it appears as a colourless gas with a faint sharp odour and a sour taste [[1]](#references). Carbon dioxide is naturally present in the Earth's atmosphere in a low concentration and acts as a heat-trapping (greenhouse) gas [[1]](#references).


### 1.2. How is CO$_2$ produced
Carbon dioxide is released through human activities (e.g. burning fossil fuels, forest fires, deforestation, etc.) and natural processes (e.g. animal/plant respiration, fermentation and volcanic eruptions) [[2]](#references). At this point, it is important to state that the burning of biomass (i.e. the mass of biological organisms in an ecosystem at a given time) [[3]](#references) or carbon-containing materials does not necessarily lead to an increase of the carbon dioxide concentration in the Earth's atmosphere. This holds as long as the biomass is allowed to grow back and absorb the same amount of atmospheric carbon dioxide as before. However, burning fossil fuels (such as coal, oil or gas) releases carbon that has been outside of the carbon cycle for a very long time back into the atmosphere. When this carbon is not absorbed to produce new biomass (through processes such as plant photosynthesis), the total concentration of carbon dioxide in the Earth's atmosphere increases.


### 1.3. CO$_2$ and the greenhouse effect
Greenhouse gases like water vapour, carbon dioxide, methane, nitrous oxide or ozone are naturally present in the Earth's atmosphere and allow incoming solar radiation to pass through the atmosphere and reach the Earth's surface. The incoming solar radiation is then absorbed by the land and the oceans, heating the Earth. As a result, heat is radiated from Earth to space. Some of this heat is then trapped by the greenhouse gases in the atmosphere, keeping the Earth warm enough to sustain life (*greenhouse effect*) [[4]](#references). Without the greenhouse effect, the Earth's temperature would be too low to sustain any form of life. 


### 1.4. CO$_2$ and the consequences of the enhanced greenhouse effect
But what happens when the total concentration of greenhouse gases in the Earth's atmosphere begins to increase? Increased levels of greenhouse gases in the atmosphere lead to more heat being trapped, which ultimately causes the temperatures on the Earth's surface to increase too (*enhanced greenhouse effect*) [[5]](#references). A direct consequence of this effect is global warming, which in turn, is responsible for:


- thawing/melting glaciers and permafrost
- rising sea levels
- more frequently occurring extreme cases of precipitation or drought
- changing conditions for agriculture due to
    - desertification
    - change in the start and/or duration of the growing season 

Increased levels of carbon dioxide in the atmosphere are also linked to ocean acidification. Ocean acidification is a process where the ocean's pH level slowly decreases as the ocean absorbs increasingly higher amounts of atmospheric carbon dioxide. Ocean acidification reduces the amount of carbonate in the oceans, which makes it more difficult for marine organisms like corals or plankton to build their shells or skeletons [[4]](#references). More acidic conditions may also dissolve existing shells and affect the metabolic functions, growth and reproduction of other marine life (e.g. fish), ultimately disturbing the balance of the oceanic ecosystems.

<br>
<br>
<br>
<div style="text-align: right"> 
    <a href="#intro">Back to top</a>
</div>

<br>
<br>
<a id='carbon_cycle_definition'></a>


## 2. Carbon Cycle

### 2.1. Definition
Carbon dioxide (CO$_2$), carbon monoxide (CO) and methane (CH$_4$) are all included in the carbon cycle. The figure below shows how different chemical compounds of carbon are being transported from carbon sources to carbon sinks. **Carbon sources** are processes that are responsible for emitting carbon to the atmosphere. On the contrary, processes that absorb carbon from the atmosphere are described as **carbon sinks**. In the figure, carbon sinks are represented by blue arrows while carbon sources are represented by pink arrows.


### 2.2. Carbon sources
Carbon, in the form of carbon dioxide, is emitted to the atmosphere through human, animal and plant respiration. A large portion of carbon emissions to the atmosphere originates from burning fossil fuels. Forest fires are also responsible for emitting carbon, in the form of carbon dioxide, to the atmosphere. Ocean animals and plants also emit carbon dioxide through respiration. Grazing animals, and especially cows, emit methane by eructation (belching/burping) and/or flatulence (farting). Methane is also emitted by bacteria that can be found in human or animal faeces. Carbon dioxide and methane are emitted when detritivores/decomposers, like fungi or earthworms, decompose dead organic matter (e.g. twigs, leaves, animal parts, etc.) and turn it into soil.


### 2.3. Carbon sinks
Plants absorb carbon from the atmosphere, in the form of carbon dioxide, when they photosynthesize. The process during which plants absorb carbon dioxide, water and energy from the sun to produce oxygen and energy-rich organic compounds is called **photosynthesis**. This process is performed by terrestrial, aquatic and ocean plants. Photosynthesis can only take place when sunlight is available. Therefore, plants do not photosynthesize at night. Instead, during night time, plants only respire (i.e. take in oxygen and release carbon dioxide). Plants store carbon in their tissue (biomass) when they use the energy-rich organic compounds that they produced during photosynthesis to create new branches, leaves and roots or to expand the size of their trunk.


Carbon enters the soil in the form of dead organic matter. All living organisms (e.g. plants, animals and humans) consist of organic matter. When plants drop their leaves or twigs on the ground, they drop organic matter. The same thing happens when humans and animals drop urine or faeces. In marine ecosystems, dead organic matter that has fallen down to the seabed is turned into sediment. Carbon that is stored in soils can slowly be turned into oil and gas. However, this is a process that can take between 50 million and 500 million years to complete.

<br>

<img src="images/carbon_cycle_eng.png" width="900" align="center">
<br>
<font size="2.9" color="#9F8331"><p style="text-align:center"><b>Figure 1:</b> Carbon sinks & sources </p></font>
<br>
<br>
<br>
<br>
<div style="text-align: right"> 
    <a href="#intro">Back to top</a>
</div>

<br>
<br>
<a id='co2_youtube'></a>

## 3. A year in the life of Earth's carbon dioxide (NASA)
Click on the video below, to watch how the concentration of carbon dioxide in the Earth's atmosphere changes during the different seasons of a year. Observe the differences between the Northern and Southern hemisphere. 


Note that carbon dioxide is shown in a colour scale ranging from dark blue to light pink, whereas carbon monoxide is shown in a black to white colour scale. Higher carbon dioxide values are displayed in shades of reds and pinks, while higher carbon monoxide values are shown in lighter grays or whites.

In [None]:
############################################################################################################
################## Python & Javascript Code - handling code visibility (entire document)####################
############################################################################################################

# Import modules:
from IPython.display import HTML

HTML('''<script> $('div .input').hide() </script>''')

In [None]:
%%javascript
IPython.OutputArea.prototype._should_scroll = function(lines) {
    return false;
}

In [None]:
# Import modules
import numpy as np
import pandas as pd
from datetime import datetime
from bokeh.plotting import figure
from bokeh.models import ColumnDataSource, HoverTool, Label
from bokeh.io import show, reset_output, output_notebook

%matplotlib inline

reset_output()
output_notebook()

In [None]:
############################################################################################################
############################## Python & Javascript Code - Hide code-cell  ##################################
############################################################################################################

# Import modules:
from IPython.display import display, HTML

# Code for hiding a code-cell:
toggle_code_str = '''
<form action="javascript:code_toggle()"><input type="submit" id="toggleButton" value="Hide/Show code"></form>
'''

toggle_code_prepare_str = '''
    <script>
    function code_toggle() {
        if ($('div.cell.code_cell.rendered.selected div.input').css('display')!='none'){
            $('div.cell.code_cell.rendered.selected div.input').hide();
            $('#toggleButton').val('Show code');
        } else {
            $('div.cell.code_cell.rendered.selected div.input').show();
            $('#toggleButton').val('Hide code');
        }
    }
    </script>

'''

display(HTML(toggle_code_prepare_str + toggle_code_str))

# Call function to hide code-cells:
def toggle_code():
    display(HTML(toggle_code_str))
    
############################################################################################################
############################################################################################################
############################################################################################################





# Import module to display video:
from IPython.display import Video

# Show NASA video - "A Year in the Life of Earth's CO2":
Video("video/NASA A Year in the Life of Earths CO2.mp4", width=970, height=576)

<br>
<br>
<div style="text-align: right"> 
    <a href="#intro">Back to top</a>
</div>

<br>
<br>
<a id='quiz_environmentalist'></a>

## 4. Quiz: Carbon challenge
Test your knowledge on carbon by completing the following quiz. Note that questions 2 - 4 may have more than one correct answer.
Once you have answered all questions, click on the
<span style="color:white">
<span style="background-color:#3973ac">  Show results  </span></span>-button. <br>
The result reflects your level of environmental awareness. See if your score is high enough to save the penguins' habitat!

In [None]:
# Import quiz-function:
from tools.carbon_cycle_quiz import create_widget_form

# Call function to display quiz:
create_widget_form()

<br>
<br>
<div style="text-align: right"> 
    <a href="#intro">Back to top</a>
</div>

<br>
<br>
<a id='exercise_icos_htm'></a>


## 5. Exercises with CO$_2$-data from Hyltemossa ICOS station

The purpose of this part is to further introduce you to the patterns followed by the daily and annual cycle of carbon dioxide. To achieve this, you are going to calculate statistics over larger datasets of carbon dioxide concentration values from Hyltemossa research station for 2017 - 2019. You will visualize the time series by creating an interactive diagram and then use the tools provided to extract more information from the plotted values. Before you are able to calculate any statistics over the aforementioned values, you will have to import the corresponding datafile and store its content in an array. Python allows you to import data from a datafile and store it directly in an array by using ready-made functions (code). In Python, collections of such ready-made functions are called modules. A module has to be imported before it can be used. Sometimes it might be interesting to calculate statistics over shorter time intervals than a year. Data that is stored in a Python array can be filtered in different ways. Here you will learn how to filter data by time and calculate statistics over subsets of data.

You are going to work with data from [ICOS Hyltemossa research station](https://www.icos-sweden.se/station_hyltemossa.html). Before you continue, here is some summarized information about ICOS and ICOS stations. 

<br>
<img src='images/icos_map_2020_v2.png'>
<br>
<font size="2.9" color="#9F8331"><p style="text-align:center"><b>Figure 2:</b> ICOS station network </p></font>
<br>

Nowadays, a lot of research is done in order to comprehend the details behind changes in the carbon cycle. More specifically, scientists are interested in how changes in the vegetation and/or oceans may affect the concentration of CO$_2$ and other greenhouse gases in the atmosphere. [ICOS](https://www.icos-cp.eu/), which is an acronym for Integrated Carbon Observation System, is a European research infrastructure that conducts long term, standardised and high-precision greenhouse-gas measurements to map the carbon balance of Europe. At present, ICOS comprises over 130 stations in 13 countries. Stations may be located on land or onboard ships. [ICOS Sweden](https://www.icos-sweden.se/) is the Swedish contribution to this European effort and is a cooperation of several research institutes. Currently, ICOS Sweden contributes with measurements from 10 stations at 7 different locations.

<br>
<br>
<br>
<img src='images/htm_station_photo_mashup.png'>
<br>
<br>
<font size="2.9" color="#9F8331"><p style="text-align:center"><b>Figure 3:</b> Photos from ICOS Hyltemossa research station [<a href="#visual_element_references">credits</a>]</p></font>
<br>
<br>

Hyltemossa research station is part of the ICOS Sweden station network. The station is located south of the city of Perstorp, in the northwestern part of Scania county, in Sweden. It is surrounded by a 30-year-old managed spruce forest.

Measurements of carbon dioxide can be influenced by what is in the broader vicinity of a station. To minimize this effect, measuring instruments are placed high up on towers (see photo in the middle). Measurements can also be influenced by the wind speed and wind direction. For instance, if the wind blows from the direction of a city with high industrial activity, then the measured CO$_2$-concentration might be higher.

Zoom in on the map below to view what is near Hyltemossa research station. Are you able to find the following locations?

**1.** Nedre Sore lake <br>
**2.** Perstorp industrial park <br>
**3.** Ljungbyhed airport <br>



In [1]:
# Import modules:
import folium

# Create map object:
m = folium.Map(location=[56.097991, 13.420181], zoom_start=7)

# Add marker:
folium.Marker(location=[56.097991, 13.420181],
              popup='ICOS <br>Hyltemossa Research Station',
              icon=folium.Icon(color='darkred', icon='cloud')).add_to(m)

# Show map
m

<br>
<br>
<div style="text-align: right"> 
    <a href="#intro">Back to top</a>
</div>
<br>
<br>
<a id='exercise_import_modules_py'></a>


### 5.1. Import Python modules
Python is a programming language that includes built-in functions. Additional functions are organized in modules: a module can be described as a set of functions. To use these functions, you first need to import the module they belong to. Usually, modules are imported at the beginning of a Python program.

The next code-cell shows the syntax for importing Python modules. To import a whole module, use the syntax <code style="color:#CD5C5C">import datetime</code>. To import all functions from a module, type <code style="color:#CD5C5C">from datetime import *</code>. However, this is considered bad practice, so it is best avoided. To import a single object (e.g. a function or a class) from a module, type <code style="color:#CD5C5C">from datetime import datetime</code>.

When you import a module, it is possible to give it a different name after the keyword <code style="color:#CD5C5C">as</code>. Usually, the name provided after <code style="color:#CD5C5C">as</code> is an abbreviation of the module's official name. The following piece of code, <code style="color:#CD5C5C">import pandas as pd</code>, will import a module called _pandas_ and change its name to _pd_. This way, you do not have to type the full name of the module when you include it in your code. Your code will be easier to read if you decide to follow this practice, and will conform with most examples you will find for both _pandas (pd)_ and _numpy (np)_.

<br>

```python
#Import modules:
import numpy as np
import pandas as pd
from datetime import datetime
```
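
The three import styles described above can be compared side by side. The sketch below uses the standard-library `math` module purely for illustration (it is not used elsewhere in this notebook), and the alias `mt` is an arbitrary choice.

```python
# Style 1: import the whole module and use its full name as a prefix.
import math
root_1 = math.sqrt(16)

# Style 2: import the module under a shorter alias.
import math as mt
root_2 = mt.sqrt(16)

# Style 3: import a single function from the module and call it directly.
from math import sqrt
root_3 = sqrt(16)

# All three calls reach the same function, so the results are identical.
print(root_1, root_2, root_3)
```

Whichever style you choose, the function that is called is the same. Only the name you type in your code differs.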

<br>
<br>
<div style="text-align: right"> 
    <a href="#intro">Back to top</a>
</div>

<br>
<br>
<a id='exercise_read_csv_to_pandas'></a>


### 5.2. Read data from CSV-files and store them in Python arrays

#### <span style="color:#CD5C5C">What is a Pandas DataFrame ?</span>

Arrays are used in programming to store data in a structured way. An array consists of rows and columns. Python has different kinds of arrays. Here we are going to work with pandas DataFrames. This type of array allows you to store data with different data types in the same array (i.e., pandas DataFrame). It is, for example, permitted to store a column with datetime objects, a column with strings and a column with floats in the same DataFrame (see figure below).

<br>
<br>

<img src="images/pandas_py_icos_ex_eng.png" width="550" align="center">
<br>
<br>
<font size="2.9" color="#9F8331"><p style="text-align:center"><b>Figure 4:</b> Example of a pandas DataFrame with CO$_2$-measurements from ICOS Hyltemossa research station </p></font>
<br>
<br>

You can refer to values in a pandas DataFrame by using their corresponding column name and row number. For example, the value *412.985* in the DataFrame above, belongs to the column *co2* and its row number is *2*. Note that, in Python, the numbering of rows begins with *0* instead of *1*.
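
As a rough illustration of the points above, the following sketch builds a very small DataFrame by hand. Apart from the value *412.985* in row *2*, which is taken from the text, all values are made up for the example and are not real measurements. The name `example_df` is hypothetical.

```python
import pandas as pd

# Small illustrative DataFrame (made-up values, except 412.985 in row 2).
example_df = pd.DataFrame({
    'Site': ['HTM', 'HTM', 'HTM'],
    'DateTime': pd.to_datetime(['2017-01-01 00:00',
                                '2017-01-01 01:00',
                                '2017-01-01 02:00']),
    'co2': [413.1, 413.0, 412.985],
})

# Every column keeps its own data type (strings, datetime objects, floats):
print(example_df.dtypes)

# Refer to a single value by its row number (2) and column name ('co2'):
value = example_df.loc[2, 'co2']
print(value)
```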

Row numbers often serve as an index. An index is used to find and retrieve data. For now, it is OK to consider an index as one of the DataFrame's columns that contains unique values. In the example above, the index is numeric and consists of a set of integers. Every row has a different number (i.e. every row is identified by a different number). An index does not have to be numeric but it should not include duplicates. Columns that contain datetime objects (e.g. the *DateTime* column in the figure above) can also be used as an index. This is valid, as it is not possible for two different measurements to have been taken by the same instrument at the exact same timepoint.


Next we are going to explain the syntax of the code that is used to read data from csv-files and store it in pandas DataFrames. A csv-file is a text-file containing comma separated values. This type of file contains data that can be stored in arrays/DataFrames. Values in a csv-file are stored in as many rows as the rows included in its corresponding array/DataFrame. Values that belong to the same row but different columns, are separated by a "," (comma). Other characters (e.g. ";") may also be used as delimiters (i.e. a character that separates values). Usually, the first row in a csv-file contains the column names.


The <i>pandas</i> module provides a function <code style="color:#CD5C5C">read_csv()</code> that reads data from a csv-file and stores it in a pandas DataFrame. Remember that, in this case, the <i>pandas</i> module was renamed to <i>pd</i> at import.

<br>
<br>

<u>**Syntax:**</u>

```python
pd.read_csv("path to where the file is stored",
            header="row-number that contains column names",
            sep="delimiter",
            parse_dates=["name of column that includes time-related values"])
```

<br>

Note! The syntax ```pd.read_csv()``` is only valid if the pandas module was imported using the following syntax ```import pandas as pd```. If the module was imported as ```import pandas```, then the correct syntax to import data from a csv-file would be ```pandas.read_csv()```.
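
To see how the content of a csv-file maps onto a DataFrame without needing a file on disk, here is a minimal sketch that reads csv-formatted text from a string. The text in `csv_text` is made up for illustration and only borrows some of the column names used in this notebook. `io.StringIO` is a standard-library helper that makes a string behave like a file.

```python
import io
import pandas as pd

# Made-up csv text: ";" is the delimiter and row 0 holds the column names.
csv_text = """Site;SamplingHeight;DateTime;co2
HTM;150;2017-01-01 00:00:00;410.0
HTM;150;2017-01-01 01:00:00;411.5
"""

# Read the text as if it were a csv-file stored on disk.
demo_df = pd.read_csv(io.StringIO(csv_text),
                      header=0,                  # row 0 contains the column names
                      sep=';',                   # values are separated by ";"
                      parse_dates=['DateTime'])  # treat this column as datetime objects

print(demo_df)
print(demo_df.dtypes)
```

Each line of text becomes one row of the DataFrame, and each `;` marks the border between two columns.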


<br>
<br>


#### 5.2.1.  Read ICOS CO$_2$-data from CSV-file and store it in a pandas DataFrame
In order to use the ```pd.read_csv()``` function, you have to provide:

- the **path** to the datafile you wish to read. The path shows the location on your pc or on the server where the datafile is stored.

- the number of the row in the datafile that contains information about the column names, in ```header```. 

- the **delimiter** (i.e. the character that is used to separate values in the datafile), in ```sep```. 

- the name of the column that stores time-values, in ```parse_dates```. Python handles time-values in a specific way. Therefore, if your datafile includes a column with time-values, you have to inform Python that these values should be processed as datetime objects.

<br>

The datafile includes information for the following variables:
- **Site**: Abbreviation of the station name (e.g. *HTM* for Hyltemossa)
- **SamplingHeight**: Elevation (expressed as metres above ground level) where the sensor measures CO$_2$-concentration.
- **InstrumentId**: Id of the instrument that takes the measurements. Every instrument is assigned a unique id.
- **DateTime**: Exact timepoint when a measurement was taken.
- **co2**: CO$_2$-measurement at a given timepoint (expressed as μmol/mol, which is equivalent to ppm).

<br>
<br>

<span style="color:blue">**Example**</span>

```python
# Import csv-file with CO2-data to pandas DataFrame:
co2_df = pd.read_csv('data/carboncycle/co2_concentration/htm_150m_L2_co2',
                     header=0,
                     sep=';',
                     parse_dates=['DateTime'])

# Show the first 5 rows of the DataFrame:
co2_df.head(5)
```

In [None]:
# Import csv-file with CO2-data to pandas DataFrame:
co2_df = pd.read_csv('data/carboncycle/co2_concentration/htm_150m_L2_co2',
                     header=0,
                     sep=';',
                     parse_dates=['DateTime'])

# Show the first 5 rows of the DataFrame:
co2_df.head(5)

<br>

**Exercise 1:** Try to display the first 10 rows of the DataFrame.

**Optional exercise:** Try to only display the 1st row of the DataFrame.

In [None]:
################################
# Add button to hide/show code:
toggle_code()
################################



# Write your own code below:


<br>

#### 5.2.2.  How many rows does a DataFrame have?
With the help of ```len()```, you can find out how many rows a DataFrame has. ```len``` is an abbreviation of the English word *length* and, in this case, has the meaning of: *how long is this DataFrame?*.


**Syntax**:  <br>


```python
len(dataframe_name)
```


<span style="color:blue">**Example:**</span> 

```python
# Return the total number of rows included in the "co2_df" DataFrame:
len(co2_df)
```
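
If you also want to know the number of columns, a DataFrame's `shape` attribute returns both counts at once. The sketch below uses a made-up DataFrame called `toy_df` (a hypothetical name with illustrative values).

```python
import pandas as pd

# Made-up DataFrame with 4 rows and 1 column.
toy_df = pd.DataFrame({'co2': [410.0, 411.5, 409.8, 412.2]})

n_rows = len(toy_df)             # number of rows
rows_and_columns = toy_df.shape  # (number of rows, number of columns)

print(n_rows)
print(rows_and_columns)
```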

In [None]:
# Return the total number of rows included in the "co2_df" DataFrame:
len(co2_df)

<br>
<br>
<div style="text-align: right"> 
    <a href="#intro">Back to top</a>
</div>
<br>

<a id='add_index_to_pandas'></a>

### 5.3. Indexing
A numeric index of integers is automatically assigned to your DataFrame when you create it (see _row number_ in Figure 4). The first row has an index-value that is equal to "**0**". The second row has an index-value that is equal to "**1**". The third row has an index-value that is equal to "**2**", etc. 

Indices are used to select and retrieve values from a DataFrame. For example, you may get all values included in the **1st** row of the DataFrame **co2_df**, by using the command: <code style="color:#CD5C5C">co2_df.loc[0]</code>.


<br>

<span style="color:blue">**Example 1:**</span>

```python
# Return all values from the 1st row of the DataFrame:
co2_df.loc[0]
```

In [None]:
# Return all values from the 1st row of the DataFrame:
co2_df.loc[0]

<br>

<span style="color:blue">**Example 2:**</span>

```python
# Return the CO2-value from the 1st row of the DataFrame:
co2_df.co2.loc[0]
```
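
The same value can be reached in more than one way. As a small sketch on a made-up DataFrame (`toy_df` and its values are hypothetical), the three expressions below all return the CO$_2$-value of the 1st row. The last form passes the row number and the column name to `loc` together.

```python
import pandas as pd

# Made-up DataFrame with an automatically assigned numeric index (0, 1, 2).
toy_df = pd.DataFrame({'Site': ['HTM', 'HTM', 'HTM'],
                       'co2': [410.0, 411.5, 409.8]})

# All values from the 1st row:
first_row = toy_df.loc[0]

# Three equivalent ways to get the CO2-value from the 1st row:
co2_a = toy_df.co2.loc[0]       # column as attribute, then row
co2_b = toy_df['co2'].loc[0]    # column by name, then row
co2_c = toy_df.loc[0, 'co2']    # row and column in a single step

print(co2_a, co2_b, co2_c)
```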

In [None]:
# Return the CO2-value from the 1st row of the DataFrame:
co2_df.co2.loc[0]

<br>

**Exercise 2:** Try to extract the CO$_2$-value from the DataFrame's 8th row. <br> 
*Remember that Python's index for row numbers starts at **0**.*

**Exercise 3:** Try to extract the CO$_2$-value from the DataFrame's 1001st row.

<br>

-  Click on the button  ```Hide/Show Code``` to write your own code.
-  Execute your code by clicking on ```Run``` in the menubar at the top.
-  Copy-paste or type your answer to a question in the corresponding answer-field below. <br> Click on the ```Check answer```-button to see if your answer is correct.


In [None]:
################################
# Add button to hide/show code:
toggle_code()
################################


# Write your own code below:


In [None]:
# Import modules:
from ipywidgets import VBox
from tools.check_answer_widgets import create_coding_quiz_question, create_coding_quiz_question_dropdown, create_coding_quiz_question_true_false

# Display both answer-control boxes in the same column:
display(VBox([create_coding_quiz_question('Exercise 2', 414.511),
              create_coding_quiz_question('Exercise 3', 402.561)]))

**Define an index**

You can change the index of a DataFrame by using the built-in method <code style="color:#CD5C5C">set_index()</code>. An index can be a column containing numeric values, strings or datetime objects (i.e. special structures that are used to store date and time values). When choosing a column as the index, you should check that it contains a different value for every row (indices should not include duplicates).

**Syntax:**
<br>
<br>
$$ dataframe.set\_index(column) $$
<br>

<span style="color:blue">**Example:**</span>

```python
# Set the column "DateTime" as index:
co2_df_ind = co2_df.set_index('DateTime')

# Show DataFrame:
co2_df_ind.head(5)
```
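
Before choosing a column as index, it can be useful to check that it contains no duplicates. The following is a minimal sketch on a made-up DataFrame (`toy_df` and its values are hypothetical) that uses the pandas attribute `is_unique` for this check.

```python
import pandas as pd

# Made-up DataFrame with one measurement per hour.
toy_df = pd.DataFrame({
    'DateTime': pd.to_datetime(['2017-06-01 07:00',
                                '2017-06-01 08:00',
                                '2017-06-01 09:00']),
    'co2': [405.2, 404.9, 404.1],
})

# True if every value in the column occurs only once:
is_unique = toy_df['DateTime'].is_unique
print(is_unique)

# set_index() returns a new DataFrame; the original DataFrame is left unchanged.
toy_df_ind = toy_df.set_index('DateTime')
print(toy_df_ind)
```

Note that `set_index()` does not modify the original DataFrame, which is why the result is stored under a new name.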

In [None]:
# Set the column "DateTime" as the DataFrame's index:
co2_df_ind = co2_df.set_index('DateTime')

# Show DataFrame:
co2_df_ind.head()

**Filter a DataFrame using a datetime-index**


Use the following code syntax to extract all rows from a DataFrame that contain data for a given date and time:

<br>
<br>
$$dataframe[dataframe.index == datetime(year, month, day, hour, minute, second)]$$
<br>
<br>

Note that the code above will only work if your DataFrame has a column with datetime objects as index.

<span style="color:blue">**Example:**</span>

```python
# Show all rows with measurements taken on the 1st of June 2017 at 07:00:00:
co2_df_ind[co2_df_ind.index==datetime(2017, 6, 1, 7, 0, 0)]
```

In [None]:
# Show all rows with measurements taken on the 1st of June 2017 at 07:00:00:
co2_df_ind[co2_df_ind.index==datetime(2017, 6, 1, 7, 0, 0)]

**Exercise 4:** Try to only display rows with data for the 1st of June 2017 at 15:00:00. 

**Exercise 5:** Try to only display rows with data for the 31st of December 2017 at 21:00:00.


-  Click on the button  ```Hide/Show Code``` to write your own code.
-  Execute your code by clicking on ```Run``` in the menubar at the top.
-  Copy-paste or type your answer to a question in the corresponding answer-field below. <br> Click on the ```Check answer```-button to see if your answer is correct.

In [None]:
################################
# Add button to hide/show code:
toggle_code()
################################


# Write your own code below:


In [None]:
# Import widgets:
from ipywidgets import VBox

# Display both answer-control boxes in the same column:
display(VBox([create_coding_quiz_question('Exercise 4', 407.01),
              create_coding_quiz_question('Exercise 5', 414.307)]))

<br>
<br>
<br>

**Filter DataFrame by time periods using a datetime-index**


It is possible to filter a DataFrame by time periods using a datetime-index. The syntax below shows how to extract values for a given time period.

<br>
<br>

**Syntax:**

$$dataframe[datetime(year_{start}, month_{start}, day_{start}):datetime(year_{end}, month_{end}, day_{end}, hour_{end}, minute_{end})]$$
<br>
<br>

<span style="color:blue">**Example:**</span> 

```python
# Show all rows with data for: 1st of June 2017 at 00:00:00 - 1st of June 2017 at 23:59:00.
co2_df_ind[datetime(2017, 6, 1):datetime(2017, 6, 1, 23, 59)]
```
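
One detail worth knowing is that, with a sorted datetime-index, both the start and the end of the time period are included in the result. The sketch below illustrates this with hypothetical sample values (one value every 6 hours); all names are assumptions made for this example.

```python
from datetime import datetime, timedelta
import pandas as pd

# Hypothetical sample: one value every 6 hours, starting on the 1st of June 2017:
start = datetime(2017, 6, 1)
sample_times = [start + timedelta(hours=6 * i) for i in range(8)]
sample_df_ind = pd.DataFrame({'co2': [410.0 + i for i in range(8)]},
                             index=pd.DatetimeIndex(sample_times, name='DateTime'))

# All rows for the 1st of June 2017 (00:00, 06:00, 12:00 and 18:00):
one_day = sample_df_ind[datetime(2017, 6, 1):datetime(2017, 6, 1, 23, 59)]
print(len(one_day))      # 4

# The row at 12:00 is part of the result, since the end of the period is included:
until_noon = sample_df_ind[datetime(2017, 6, 1):datetime(2017, 6, 1, 12)]
print(len(until_noon))   # 3
```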

In [None]:
# Show all rows that contain measurements taken on the 1st of June 2017 (1 day):
co2_df_ind[datetime(2017, 6, 1):datetime(2017, 6, 1, 23, 59)]

<br>

**Exercise 6:** Try to only show rows with measurements taken on the 31st of December 2017 between 08:00 and 17:00. 

**Optional exercise:** Try to only show rows with measurements taken between the 31st of December 2017 at 21:30 and the 1st of January 2018 at 12:00.


In [None]:
################################
# Add button to hide/show code:
toggle_code()
################################


# Write your own code below:



<br>
<br>
<div style="text-align: right"> 
    <a href="#intro">Back to top</a>
</div>
<br>

### 5.4. Calculate statistics
It would be easy to estimate the highest and lowest CO$_2$-value in the DataFrame of figure 4. It would also be possible to calculate the mean CO$_2$-value using pen and paper. In contrast, it would not be as simple to calculate statistics over hundreds of thousands of values. This is where computer programming is needed.


Arrays and DataFrames in Python come equipped with a set of built-in functions that can be used to calculate statistics over the values of a column. Minimum ```min()```, maximum ```max()```, average ```mean()``` and summation ```sum()``` are typical examples of such built-in functions. The syntax below shows how to apply these built-in functions to a column of a pandas DataFrame.


**Syntax:**
<br>
<br>
$$dataframe.column.function()$$
<br>
<br>
<br>
<br>

<span style="color:blue">**Example:**</span> 

```python
# Compute the lowest value of the column "co2":
co2_df_ind.co2.min()
```
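
The four functions mentioned above can be compared on a column that is small enough to check by hand. This is a minimal sketch with hypothetical values; the name `sample_df` is an assumption made for this example.

```python
import pandas as pd

# Hypothetical sample values that are easy to check by hand:
sample_df = pd.DataFrame({'co2': [408.0, 410.0, 412.0, 414.0]})

lowest = sample_df.co2.min()     # 408.0
highest = sample_df.co2.max()    # 414.0
average = sample_df.co2.mean()   # 411.0
total = sample_df.co2.sum()      # 1644.0

print(lowest, highest, average, total)
```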

In [None]:
# Compute the lowest value of the column "co2":
co2_df_ind.co2.min()

<br>

**Exercise 7:** Try to compute the max value from all values in column *co2*.

**Exercise 8:** Try to compute the mean value of all values in column *co2*.

-  Click on the button  ```Hide/Show Code``` to write your own code.
-  Execute your code by clicking on ```Run``` in the menubar at the top.
-  Copy-paste or type your answer to a question in the corresponding answer-field below. <br> Click on the ```Check answer```-button to see if your answer is correct.

In [None]:
################################
# Add button to hide/show code:
toggle_code()
################################


# Write your own code below:



In [None]:
# Import widgets:
from ipywidgets import VBox

# Display both answer-control boxes in the same column:
display(VBox([create_coding_quiz_question('Exercise 7', 452.815),
              create_coding_quiz_question('Exercise 8', 411.86)]))

<br>

#### Filter a DataFrame by time and calculate statistics 

Python allows you to combine several operations in one line of code. The example below shows how you can filter a DataFrame by a time period and calculate statistics over the values of a column for that time period. Note that the DataFrame has to have an index with datetime objects for the following syntax to be valid.


**Syntax:**
```python

dataframe[datetime(year_start, month_start, day_start, hour_start):datetime(year_end, month_end, day_end, hour_end)].column.function()


```

<span style="color:blue">**Example:**</span>

This is an example of how to compute the average of all CO$_2$-values measured between 08:00 and 17:00 on December 31st 2018:

```python

co2_df_ind[datetime(2018, 12, 31, 8):datetime(2018, 12, 31, 17)].co2.mean()

```
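
If the one-line version feels hard to read, the same computation can be written in two steps. The sketch below uses hypothetical hourly values (not the ICOS data) and assumed variable names, and shows that both versions give the same result.

```python
from datetime import datetime
import pandas as pd

# Hypothetical hourly sample values for the 31st of December 2018, 06:00 - 19:00:
sample_times = [datetime(2018, 12, 31, hour) for hour in range(6, 20)]
sample_df_ind = pd.DataFrame({'co2': [400.0 + hour for hour in range(6, 20)]},
                             index=pd.DatetimeIndex(sample_times, name='DateTime'))

# Step 1: filter the DataFrame by time period (08:00 - 17:00):
period = sample_df_ind[datetime(2018, 12, 31, 8):datetime(2018, 12, 31, 17)]

# Step 2: calculate the average of the filtered values:
period_mean = period.co2.mean()

# Both steps combined in one line of code:
one_line_mean = sample_df_ind[datetime(2018, 12, 31, 8):datetime(2018, 12, 31, 17)].co2.mean()

print(period_mean, one_line_mean)   # 412.5 412.5
```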


In [None]:
# Calculate the average of the CO2 values measured between 08:00 and 17:00 on December 31st 2018:
co2_df_ind[datetime(2018, 12, 31, 8):datetime(2018, 12, 31, 17)].co2.mean()

<br>

**Exercise 9:** Try to compute the minimum value of column *co2*  for measurements taken between 08:00 and 14:00 on June 10th 2018.

**Exercise 10:** Try to compute the minimum value of column *co2*  for measurements taken between 08:00 and 14:00 on December 31st 2018.


-  Click on the button  ```Hide/Show Code``` to write your own code.
-  Execute your code by clicking on ```Run``` in the menubar at the top.
-  Copy-paste or type your answer to a question in the corresponding answer-field below. <br> Click on the ```Check answer```-button to see if your answer is correct.

In [None]:
################################
# Add button to hide/show code:
toggle_code()
################################


# Write your own code below:


In [None]:
# Import widgets:
from ipywidgets import VBox

# Display both answer-control boxes in the same column:
display(VBox([create_coding_quiz_question('Exercise 09', 393.825),
              create_coding_quiz_question('Exercise 10', 418.103)]))

<br>
<br>
<div style="text-align: right"> 
    <a href="#intro">Back to top</a>
</div>
<br>
<br>
<a id='exercise_plot_data_bokeh'></a>

### 5.5. Plotting

In this part, you will visualize the CO$_2$-values from the DataFrame you created above. There are many libraries for visualizing data in Python. These libraries contain ready-made code for visualizing data in different ways. In this case, you will be given a ready-made function that plots data in an interactive diagram.


A function is a block of code that will only be executed once the function is called. It may contain variables, operators (e.g. <code style="color:gray">+</code>, <code style="color:gray">-</code>, <code style="color:gray">*</code>, <code style="color:gray">/</code> or <code style="color:gray">%</code>), if-statements, for-loops, etc.


It is possible to pass values to a function using input parameters. A function may have zero, one or several input parameters. A function can return output as values with the help of the <code style="color:#CD5C5C">return</code>-statement. The return-statement is usually used at the end of a function to return a result.


The syntax below shows how to create a function in Python.


**Syntax:**

```python
def function_name(input_parameter1, input_parameter2):
    
    # Avoid naming the variable "sum", as that would hide Python's built-in sum():
    total = input_parameter1 + input_parameter2
    
    return total
```
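
A function does nothing until it is called. The sketch below defines a small function with the hypothetical name `add_values` and calls it in two ways: by passing the values in order, and by naming the input parameters.

```python
# Define a function with two input parameters:
def add_values(value1, value2):

    total = value1 + value2

    return total


# Call the function and store the returned value in a variable:
result = add_values(2, 3)
print(result)   # 5

# Input parameters can also be passed by name:
same_result = add_values(value1=2, value2=3)
print(same_result)   # 5
```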

<br>

In this case, we have already prepared a function for you to plot the CO$_2$-values. The function takes two input parameters: the *DataFrame* containing the CO$_2$-values and the *color* you wish the plot to have. Note that the purpose of this notebook is not for you to understand the code inside the function but rather to understand how to use the function. You are meant to become familiar with how to pass variables into a function using input parameters. The function will produce different results depending on the values of the input parameters. 


```python
# Call function to plot data from the "co2_df_ind" DataFrame.
# Enter a color of your preference for the plot.
plot(co2_df_ind, color='green')
```

In [None]:
# Function that produces an interactive plot of co2-data from a pandas DataFrame in a specific color:
def plot(df_L2, color):
    
    # Import modules:
    from datetime import datetime
    from bokeh.plotting import figure, show
    from bokeh.models import ColumnDataSource, HoverTool, Label
    from bokeh.io import output_notebook

    # Dictionaries with conversions of numbers to corresponding superscript or subscript version:
    SUB = str.maketrans("0123456789", "₀₁₂₃₄₅₆₇₈₉")
    SUP = str.maketrans("0123456789", "⁰¹²³⁴⁵⁶⁷⁸⁹")

    # Create a figure object:
    p = figure(plot_width=900,
               plot_height=500,
               x_axis_label='Time (UTC)', 
               y_axis_label='CO2'.translate(SUB)+' (' +'\u03BC'+ 'mol.mol-1'.translate(SUP) + ')',
               x_axis_type='datetime',
               title = 'CO2'.translate(SUB)+' mixing ratio (dry mole fraction) - Hyltemossa, Sweden, '+str(df_L2.SamplingHeight.iloc[0])+'m' ,
               tools='pan,box_zoom,wheel_zoom,reset,save')


    # Create empty list to store legend-info:
    legend_it = []


    # Extract time- and co2-values from DataFrame:
    x1 = df_L2.index.values
    y1 = df_L2.co2.values

    # Create a circle-glyph:
    r0 = p.circle(x1, y1, radius=.12, color=color)
    
    # Create a line-glyph:
    r1 = p.line(x1, y1,
                line_width=1, color=color)

    # Add tooltip:
    p.add_tools(HoverTool(tooltips=[
        ('Time (UTC)','@x{%Y-%m-%d %H:%M:%S}'),
        ('CO2'.translate(SUB),'@y{0.f}'),
        ],
        formatters={
            '@x'      : 'datetime', 
        },
        # Show tooltip when the mouse is vertically aligned to a glyph
        mode='vline'
        ))  

    # Format plot title:
    p.title.align = 'center'
    p.title.text_font_size = '13pt'
    p.title.offset = 15

    # Format x-axis and y-axis titles:
    p.xaxis.axis_label_text_font_style = 'normal'
    p.yaxis.axis_label_text_font_style = 'normal'
    p.xaxis.axis_label_standoff = 15 # Sets the distance of the label from the x-axis in screen units
    p.yaxis.axis_label_standoff = 15 # Sets the distance of the label from the y-axis in screen units

    # Set location for copyright label:
    label_opts = dict(x=0, y=10,
                      x_units='screen', y_units='screen')

    # Create copyright-label:
    caption1 = Label(text="© ICOS ERIC", **label_opts)
    caption1.text_font_size = '8pt'

    # Inactivate hover-tool:
    p.toolbar.active_inspect = None

    # Add copyright label to plot:
    p.add_layout(caption1, 'below')

    # Define output location:
    output_notebook()
    
    # Show plot:
    show(p)

In [None]:
# Function that produces a static plot of GPP-data from a pandas DataFrame in a specific color:
def plot_gpp(df, color='lightgray'):
    
    # Import modules:
    from matplotlib import pyplot as plt
    import matplotlib.dates as mdates
    from pandas.plotting import register_matplotlib_converters
    plt.style.use('seaborn-whitegrid')
    import numpy as np
    import pandas as pd
    
    # Create a python dictionary to transform numbers to corresponding superscript version:
    SUP = str.maketrans("0123456789", "⁰¹²³⁴⁵⁶⁷⁸⁹")

    # Call matplotlib converters to activate them:
    register_matplotlib_converters()

    # Create figure:
    fig = plt.figure(figsize=(20,8))
    
    # Create plot:
    plt.plot(df.index.values, df.GPP.values, color=color)
    
    # Set ticks on x-axis:
    plt.gca().set_xticks([df.index.values.min(),
                          df.index.values.min() + np.timedelta64(151, 'D'),
                          df.index.values.min() + np.timedelta64(242, 'D'),
                          df.index.values.min() + np.timedelta64(365, 'D'),
                          df.index.values.min() + np.timedelta64(517, 'D'),
                          df.index.values.min() + np.timedelta64(608, 'D'),
                          np.datetime64('2017-01-01T00:30:00.000000000'),
                          np.datetime64('2017-01-01T00:30:00.000000000') + np.timedelta64(151, 'D'),
                          np.datetime64('2017-01-01T00:30:00.000000000') + np.timedelta64(242, 'D'),
                          df.index.values.max() - np.timedelta64(365, 'D'),
                          df.index.values.max() - np.timedelta64(214, 'D'),
                          df.index.values.max() - np.timedelta64(123, 'D'),
                          df.index.values.max()])
    
    
    plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m-%d'))
    plt.gcf().autofmt_xdate()
    plt.xlabel('Time', fontsize=14, labelpad=20)
    plt.ylabel('GPP ('+'umol m-2 s-1'.replace('u', '\u03BC').translate(SUP)+')', fontsize=14, labelpad=20)
    plt.title('Carbon uptake (GPP) from Hyltemossa research station', fontsize=18, pad=20)
    
    # Show plot:
    plt.show()

In [None]:
# Call function to plot data from "co2_df_ind". Type what color you wish the plot to have.
plot(co2_df_ind, color='green')

__Exercise 11:__ <br>
Make sure to become familiar with the tools on the right side of the interactive plot in order to answer questions 2-5 below. You have to activate a tool before using it. To activate a tool, simply click on it. Active tools are highlighted with a blue line on the left side of the tool icon. Hover over a tool with the mouse to see its name.


1. Try to create the same plot with another color. Click on the ```Hide/Show Code```-button to write your own code.
2. Use Box-Zoom and Hover to zoom in on the plot and view the measured CO$_2$-value for March 6th 2018, at 06:00.
3. Use Box-Zoom and Hover to zoom in and see during which months the highest values are observed. What season do they belong to? Why are the highest values observed during that period?
4. Use Box-Zoom and Hover to zoom in and see during which months the lowest values are observed. What season do they belong to? Why are the lowest values observed during that period?
5. Use Box-Zoom and Hover to zoom in and view the observed CO$_2$-values for July 18th 2017. Observe that the values are higher when it is dark and lower when the sun is shining. What process do you think is responsible for this? Keep in mind that Hyltemossa station is surrounded by a spruce forest and that this was not a windy day.

<br>
<br>

-  Click on the button  ```Hide/Show Code``` to write your own code and answer question 11.1.
-  Execute your code by clicking on ```Run``` in the menubar at the top.

In [None]:
################################
# Add button to hide/show code:
toggle_code()
################################

# Write your own code for exercise 11.1. below:




<br>
<br>

#### Check your answers:
Click on the ```Check answer```-button to see if your answer is correct.

In [None]:
# Import modules:
from ipywidgets import VBox
from tools.check_answer_widgets import create_coding_quiz_question, create_coding_quiz_question_dropdown, create_coding_quiz_question_true_false
# Display all answer-control boxes in the same column:
display(VBox([create_coding_quiz_question('Exercise 11.2', 442.9),
              create_coding_quiz_question_dropdown('Exercise 11.3', ['autumn-winter', 'winter-spring', 'spring-summer', 'summer-autumn'], 'winter-spring'),
              create_coding_quiz_question_dropdown('Exercise 11.4', ['autumn-winter', 'winter-spring', 'spring-summer', 'summer-autumn'], 'summer-autumn'),
              create_coding_quiz_question_dropdown('Exercise 11.5', ['photosynthesis', 'respiration', 'car traffic', 'forest fire'], 'photosynthesis')]))

<br>
<br>
<div style="text-align: right"> 
    <a href="#intro">Back to top</a>
</div>
<br>
<br>
<a id='final_exercise_py'></a>

## 6. Help a climate scientist

During the summer of 2018, many places in Sweden experienced record high temperatures. It was also a summer with very little precipitation. When the air temperature gets too high and the soil too dry, many plants respond by limiting or completely stopping their photosynthetic activity (i.e. they stop growing). This is a survival mechanism, as plants tend to lose water during photosynthesis. Consequently, when water resources are scarce, plants focus on maintaining their existing biomass to survive and thus stop growing. When plants stop photosynthesizing, they also stop taking up carbon dioxide from the atmosphere.

Anna is a climate scientist and researcher. She wants to find out if the drought of the summer of 2018 had a negative effect on the vegetation near Hyltemossa research station. One way of finding out is to examine how much atmospheric carbon dioxide was taken up by the vegetation near Hyltemossa during that summer. In other words, this is an attempt to estimate how much plants photosynthesized during that summer. Gross Primary Production (GPP) of an ecosystem can be described as the amount of carbon that has been taken up from the atmosphere by plants during photosynthesis. 

Martin, who is the Principal Investigator (PI) responsible for running Hyltemossa station, has sent Anna a datafile with GPP-values covering the time period 2015 - 2018.

Anna was planning to plot these values and carry out a visual inspection, as a first attempt at spotting any differences in the uptake of CO$_2$ between 2018 and the previous years. She was also thinking of calculating the sum of the CO$_2$ uptake during the summer months of 2018 and comparing it to the corresponding sums of the previous years. 

At the moment, Anna is very busy with fieldwork.
Could you help her create the plot and perform the necessary computations?

<br>
<br>
<img src="images/scientist.png" width="180" align="right"/>
<br>

**<span style="color:blue">Step 1:</span>** Import Martin's datafile to an array (pandas DataFrame). Name the array **gpp**. <br>
-  Path to datafile: ```data/carboncycle/co2_uptake/htm_gpp``` <br>
-  Row-number in file containing column names: ```0``` <br>
-  Delimiter: ```;``` <br>
-  Column name containing time information: ```time```
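
As a reminder of how these four pieces of information map onto the parameters of `pd.read_csv()`, here is a minimal sketch that reads a tiny, made-up file from memory instead of from disk. The file content and the column name `value` are hypothetical and do not reflect what Martin's datafile actually contains.

```python
from io import StringIO
import pandas as pd

# Hypothetical file content (two rows, ";" as delimiter, column names in row 0):
csv_text = ("time;value\n"
            "2018-06-01 00:00:00;1.5\n"
            "2018-06-01 00:30:00;2.0\n")

# StringIO lets pandas read the text as if it were a file on disk:
sample = pd.read_csv(StringIO(csv_text),
                     header=0,             # row-number containing column names
                     sep=';',              # delimiter
                     parse_dates=['time']) # column to convert to datetime objects

# Check that the column "time" now contains datetime values:
print(sample.dtypes)
```

When reading a file from disk, the path to the datafile takes the place of `StringIO(csv_text)`.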

<br>

**<span style="color:blue">Step 2:</span>** Set the column ```time``` as index. Name the new array **gpp_ind**. <br>

<br>

**<span style="color:blue">Step 3:</span>** Plot all values in the array ```gpp_ind``` by using the function: ```plot_gpp(dataframe_name, color)```


**Question 1.** During what season, in principle, is the highest uptake of CO$_2$ (GPP-values) observed?<br>
**Question 2.** Are the values representing the CO$_2$ uptake during 2018 following the same pattern as the corresponding values for other years?


<br>


**<span style="color:blue">Step 4:</span>** Compute the sum of how much CO$_2$ was taken up by the vegetation during the summer months (June-August) for each year.

_**Tip!** The built-in DataFrame function ```sum()``` computes the sum of all values in a column. It is used in the same way as ```min()```, ```max()``` and ```mean()```._
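
If you would rather not repeat the same line of code for every year, a for-loop can do the repetition for you. The sketch below is one possible pattern, shown on made-up data: a hypothetical DataFrame `sample_ind` with a column named `uptake` that holds the value 1.0 for every day of 2016 and 2017. The names and values are assumptions for illustration only.

```python
from datetime import datetime, timedelta
import pandas as pd

# Hypothetical daily sample values for 2016 and 2017 (366 + 365 = 731 days):
first_day = datetime(2016, 1, 1)
sample_times = [first_day + timedelta(days=i) for i in range(731)]
sample_ind = pd.DataFrame({'uptake': [1.0] * 731},
                          index=pd.DatetimeIndex(sample_times, name='time'))

# Create an empty dictionary to store one sum per year:
summer_sums = {}

# Repeat the same filter-and-sum operation for every year:
for year in [2016, 2017]:
    summer = sample_ind[datetime(year, 6, 1):datetime(year, 8, 31, 23, 59)]
    summer_sums[year] = summer.uptake.sum()

# June, July and August have 30 + 31 + 31 = 92 days in total:
print(summer_sums)
```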



**Question 3.** What is the lowest sum of CO$_2$ uptake during the summer months?<br>
**Question 4.** What year does the lowest sum of CO$_2$ uptake during the summer months belong to?<br>
**Question 5.** Did the drought of the summer of 2018 affect the vegetation near Hyltemossa?

<br>
<br>

-  Click on the button  ```Hide/Show Code``` to write your own code.
-  Execute your code by clicking on ```Run``` in the menubar at the top.
-  Copy-paste or type your answer to a question in the corresponding answer-field below. <br> Click on the ```Check answer```-button to see if your answer is correct.

In [None]:
################################
# Add button to hide/show code:
toggle_code()
################################

# Write your own code below:



In [None]:
# Import widgets:
from ipywidgets import VBox

# Display all answer-control boxes in the same column:
display(VBox([create_coding_quiz_question_dropdown('Question 1', ['Spring', 'Summer', 'Autumn', 'Winter', 'Do not know'], 'Summer'),
              create_coding_quiz_question_true_false('Question 2', 'No'),
              create_coding_quiz_question('Question 3', 35762.356598700004),
              create_coding_quiz_question('Question 4', 2018),
              create_coding_quiz_question_true_false('Question 5', 'Yes')]))

<br>
<br>
<br>
<div style="text-align: right"> 
    <a href="#intro">Back to top</a>
</div>


<a id='references'></a>
<br>

## 7. References

### 7.1. Text references

**1.** The Editors of Encyclopaedia Britannica, "Carbon dioxide." Encyclopædia Britannica, Encyclopædia Britannica, inc., last updated on May 27, 2020, Retrieved April 27, 2020, from https://www.britannica.com/science/carbon-dioxide


**2.** Shaftel H., Jackson R., Callery S. and Bailey D., "Carbon dioxide", NASA Global Climate Change - Vital Signs of the Planet, last updated on April 06, 2020, Retrieved April 27, 2020, from https://climate.nasa.gov/vital-signs/carbon-dioxide/


**3.** IUPAC, Compendium of Chemical Terminology, 2nd ed. (the "Gold Book") (1997). Online corrected version:  (2006–) "biomass". doi:10.1351/goldbook.B00660 


**4.** Doyle, H. (Ed.). (2020, June 18). What Is the Greenhouse Effect? Earth Science Communications Team at NASA's Jet Propulsion Laboratory / California Institute of Technology, Retrieved September 07, 2020, from https://climatekids.nasa.gov/greenhouse-effect/


**5.** Australian Government, Department of Agriculture, Water and the Environment, A. (Ed.). (2020, January 01). Greenhouse effect. Retrieved September 07, 2020, from https://www.environment.gov.au/climate-change/climate-science-data/climate-science/greenhouse-effect

<br>

### 7.2. Data references

Biermann, T., Heliasz, M., Mölder, M., ICOS RI, 2019. ICOS ATC CO2 Release, Hyltemossa (150.0 m), 2017-04-17–2019-04-30, https://hdl.handle.net/11676/l0ysHf3ENUx1MIouesbfFAnG

Heliasz, M. and ICOS Ecosystem Thematic Centre: Drought-2018 ecosystem eddy covariance flux product from Hyltemossa, doi:https://doi.org/10.18160/17ff-96rt, 2020.

<br>
<br>
<br>


<a id='visual_element_references'></a>


###### icon & photo credits
<font size="0.9">CO$_2$-icon made by Freepik from www.flaticon.com</font> <br>
<font size="0.9">scientist-icon made by DataBase Center for Life Science (DBCLS) from http://togotv.dbcls.jp/ja/togopic.2017.18.html</font> <br>
<font size="0.9">Photos of ICOS Hyltemossa station, courtesy of Tobias Biermann tobias.biermann@cec.lu.se</font>


<br>
<br>
<br>
<div style="text-align: right"> 
    <a href="#intro">Back to top</a>
</div>

<br>
<br>
<br>
<br>
<br>
<br>


<img src="logos/ssc_proj_logos_eng.png" width="950"/>
<br>