**Course**: Data Visualization,   **Name**: XXX XXX,   **Date**: XX.XX.2019

# Comments regarding assignments

Each assignment consists of **two pieces**:
1. A jupyter notebook with practical exercises.
2. An OLAT questionnaire that contains questions regarding the material of the lecture and the notebook.

Modalities for credit points:
- To qualify for the exam (Prüfungsvoraussetzung), you have to obtain 80% of points in each assignment.
- Points are only given through the questionnaire in OLAT. Many questions will be related to material you learned or practiced in the notebook.
- While questionnaires are open, you can retake them until you have enough credit points to pass.

**Submission instructions**:
- Finish the practical exercises in the notebook.
- Fill in the OLAT questionnaire (which includes the submission of an HTML export of the notebook).
- No group work allowed! You may discuss strategies and solutions, but every student has to do their own implementation.

# Assignment 1 - Visualizing Data

The **goals** of the first assignment are:
- Get familiar with python programming in the jupyter notebook;
- Be able to create a data visualization using bokeh;
- Recreate an existing visualization and develop an eye for key features;
- Start thinking critically about design options.



To achieve these goals, your task is to create a visualization of the weather in Kaiserslautern in 2018. The visualization needs to be similar to the following chart from the New York Times (Jan. 11, 1981, p. 32; Tufte (1983), p. 30) and needs to be implemented in bokeh+pandas:

![New York city's weather for 1980 from the New York Times](http://euclid.psych.yorku.ca/SCS/Gallery/images/NYweather.jpg)

## 1. Starter Code - Minimal working example

The following pieces of code load the data for this assignment and generate a minimal chart for the temperature data. More details can be found in the [bokeh documentation](https://docs.bokeh.org/en/latest/docs/user_guide/quickstart.html).

First load all necessary python modules:

In [41]:
import pandas as pd

from bokeh.plotting import figure, output_notebook, show
from bokeh.models import Band, ColumnDataSource, PrintfTickFormatter, DatetimeTickFormatter, Label, LabelSet
from bokeh.layouts import column
from bokeh.models.tickers import MonthsTicker
from bokeh.transform import dodge

output_notebook()

Load the data given in csv-file format using the pandas library and display the first lines of the data table.

In [42]:
df_kl = pd.read_csv('KLweather2018.csv', parse_dates=['Timestamp'], index_col='Timestamp')
df_kl_prec = pd.read_csv('KLweather2018_monthlyPrecipitation.csv', parse_dates=['Timestamp'], index_col='Timestamp')

print(df_kl.head())
print(df_kl_prec.head())

            temp_min  temp_max  temp_normal_min  temp_normal_max  rel_humidity
Timestamp                                                                     
2018-01-01       4.8       9.3        -0.031034         4.875862         79.75
2018-01-01       4.8       9.3        -0.031034         4.875862         79.75
2018-01-02       4.3       6.4        -0.300000         4.996552         83.58
2018-01-03       5.4      10.7         0.310345         5.182759         83.46
2018-01-04       4.8      12.4        -0.351724         5.027586         90.50
                      prec  prec_normal
Timestamp                              
2018-01-16 00:00:00  128.4    60.400000
2018-02-14 12:00:00   13.7    48.414286
2018-03-16 00:00:00   41.7    53.444828
2018-04-15 12:00:00   30.5    47.268966
2018-05-16 00:00:00  108.9    65.924138
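
To see what `parse_dates` and `index_col` do without the course files, the following minimal sketch reads a few lines of inline CSV text. The sample rows and the name `df_demo` are made up for illustration; only the column layout mimics `KLweather2018.csv`.

```python
import io
import pandas as pd

# Made-up sample rows that only mimic the layout of KLweather2018.csv
csv_text = """Timestamp,temp_min,temp_max
2018-01-01,4.8,9.3
2018-01-02,4.3,6.4
2018-01-03,5.4,10.7
"""

# parse_dates converts the column to datetimes; index_col makes it the row index
df_demo = pd.read_csv(io.StringIO(csv_text), parse_dates=['Timestamp'], index_col='Timestamp')

# The index is now a DatetimeIndex, so rows can be addressed by date
print(type(df_demo.index).__name__)
print(df_demo.at[pd.Timestamp('2018-01-02'), 'temp_max'])
```

A datetime index is what allows bokeh to place the values on an axis created with `x_axis_type="datetime"`.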


Plot the temperature minimum as a line chart with bokeh using default settings. 

In [43]:
# create a figure
p = figure(plot_height=400, x_axis_type="datetime")

# define the type of glyph that is rendered and its data. here: a polyline
p.line(source=df_kl, x='Timestamp', y='temp_min')

# render the chart
show(p)

## 2. Customizing the temperature chart

As detailed above, your visualization needs to look similar to the one from the New York Times. This can be achieved by changing the graphical elements and styling their visual properties. Update the code below to make the temperature chart more similar:

### <font color=deeppink>**Tasks**</font>
- Depict the normal high and low temperatures as polylines.
- Label the two polylines. You may use the legend functionality.
- Depict the daily temperature range as an area.
- Label the y-axis.
- Style visual attributes (color, line style) to your liking.

Helpful resources:
- [Plotting with basic glyphs](https://docs.bokeh.org/en/latest/docs/user_guide/plotting.html) - Overview of glyph types that are implemented in bokeh; see the examples for all the graphical primitives that can be plotted directly.
- [Styling visual attributes](https://docs.bokeh.org/en/latest/docs/user_guide/styling.html) - See styling options for chart elements

In [61]:
def create_temperature_chart(df, width=900):
    '''Create a bokeh figure for temperature range and normal values.'''
    
    # create figure and data source
    p = figure(plot_width=width, plot_height=400, title='Kaiserslautern\'s Weather for 2018', tools=['xwheel_zoom'], 
           x_axis_type="datetime", x_axis_location="above", y_range=(-15,40))

    source = ColumnDataSource(df)

    # add graphical items
    p.line(source=source, x='Timestamp', y='temp_max', legend="Max Temperature", line_color="red")
    p.line(source=source, x='Timestamp', y='temp_min', legend="Min Temperature", line_color="blue")

    # shade the daily temperature range between min and max
    p.varea(source=source, x = "Timestamp", y1 = "temp_max", y2 = "temp_min", color = "grey")

    # lines indicating normal low and normal high
    p.line(source=source, x='Timestamp', y='temp_normal_max', line_color="black")
    p.line(source=source, x='Timestamp', y='temp_normal_min', line_color="black")

    # label the two normal lines; the screen positions are rough guesses, adjust as needed
    normal_max = Label(x=350, y=260, x_units='screen', y_units='screen',
                 text='Normal high', render_mode='css',
                 border_line_color='black', border_line_alpha=1.0,
                 background_fill_color='white', background_fill_alpha=1.0)

    normal_min = Label(x=350, y=100, x_units='screen', y_units='screen',
                 text='Normal low', render_mode='css',
                 border_line_color='black', border_line_alpha=1.0,
                 background_fill_color='white', background_fill_alpha=1.0)

    p.add_layout(normal_max)
    p.add_layout(normal_min)

    # find the days with the highest and lowest temperature (use the df argument, not the global df_kl)
    tmax_id = df['temp_max'].idxmax()
    tmin_id = df['temp_min'].idxmin()

    # collect positions and label texts for the two extremes
    temp = [df.at[tmax_id,'temp_max'], df.at[tmin_id,'temp_min']]
    t_id = [tmax_id, tmin_id]
    source = ColumnDataSource(data = dict(t_id = t_id,
                                          temp = temp,
                                          labels = ["High, {} : {}".format(t_id[0], temp[0]),
                                                    "Low, {} : {}".format(t_id[1], temp[1])]))
    p.scatter(x='t_id', y='temp', size=8, source=source)
    labels = LabelSet(x='t_id', y='temp', text='labels', level='glyph',
              x_offset=5, y_offset=5, source=source, render_mode='canvas',
                     border_line_color = "black", background_fill_color = "grey")
    p.add_layout(labels)
    # style visual attributes
    p.xaxis.ticker = MonthsTicker(months=list(range(12))) 
    p.xgrid.ticker = MonthsTicker(months=list(range(12))) 
    p.xaxis.formatter=DatetimeTickFormatter(months=["               %b"])
    p.xaxis.major_label_text_align = 'right'
    p.yaxis[0].formatter = PrintfTickFormatter(format="%2i°")
    p.yaxis.axis_label = "Temperature [°C]"
    p.title.text_font_size = "15pt"
    p.title.align = "center"
    
    return p

p = create_temperature_chart(df_kl)
show(p)

## 3. Filtering data and making annotations
The following piece of code demonstrates how to find maxima in a data column. Use this code to automatically find the highest and lowest temperature values in 2018 and place a mark in the chart above at these positions (e.g. circle the respective data points).

### <font color=deeppink>**Tasks**</font>
- Automatically filter the highest and lowest temperatures in Kaiserslautern in 2018.
- Integrate the code in the chart computation method above and mark the two detected positions.
- Add text labels to the positions. [Label documentation](https://docs.bokeh.org/en/latest/docs/user_guide/annotations.html#labels) for bokeh.

In [5]:
tmax_id = df_kl['temp_max'].idxmax()
tmin_id = df_kl['temp_min'].idxmin()
print("KL temperature maximum:", tmax_id, df_kl.at[tmax_id,'temp_max'])
print("KL temperature minimum:", tmin_id, df_kl.at[tmin_id,'temp_min'])
p = create_temperature_chart(df_kl)
show(p)

KL temperature maximum: 2018-08-04 00:00:00 35.5
KL temperature minimum: 2018-02-28 00:00:00 -14.0
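
The lookup can be tried in isolation on a tiny table. This is a minimal sketch with made-up values; the names `toy`, `low_day` and `high_day` are hypothetical. It shows that `idxmin`/`idxmax` return the index label (here a timestamp), which can serve directly as the x position of a mark, and how a shorter date string could be built for the label text.

```python
import pandas as pd

# Made-up values, only to illustrate the lookup
toy = pd.DataFrame(
    {'temp_min': [-2.0, -14.0, 1.5], 'temp_max': [3.0, -4.0, 9.0]},
    index=pd.to_datetime(['2018-02-27', '2018-02-28', '2018-03-01']),
)

# idxmin/idxmax return the index label (a Timestamp), not a row position
low_day = toy['temp_min'].idxmin()
high_day = toy['temp_max'].idxmax()

# Short day.month. strings keep the annotation text compact
low_label = "Low, {}: {}".format(low_day.strftime('%d.%m.'), toy.at[low_day, 'temp_min'])
high_label = "High, {}: {}".format(high_day.strftime('%d.%m.'), toy.at[high_day, 'temp_max'])

print(low_label)
print(high_label)
```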


## 4. Designing additional charts

Now design the charts for precipitation and relative humidity.

### <font color=deeppink>**Tasks**</font>
- Create the chart for precipitation. Try to design a bar chart using the hints below.
- Create the chart for humidity.

Hints for temporal x-axis:
- **Width of bars**: On a datetime axis the width is given in milliseconds. To get the required scaling, specify the width like: `width_ms = ndays*24*60*60*1000` (24 hours * 60 minutes * 60 seconds * 1000 milliseconds)
- **Position of bars**: You can shift the bars using the dodge function `x=dodge('Timestamp', value, range=p.x_range)`, where the first argument is the name of the column that holds the x positions. Note that you need to define an appropriate `value` by which to shift the bar.
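
The arithmetic behind both hints can be checked without bokeh. This is a minimal sketch; `ndays = 10` and the variable names are assumptions chosen for illustration, not values required by the assignment.

```python
from datetime import timedelta

MS_PER_DAY = 24 * 60 * 60 * 1000  # hours * minutes * seconds * milliseconds

ndays = 10                        # hypothetical bar width in days
bar_width = ndays * MS_PER_DAY    # the same width in milliseconds

# Two bars side by side around a timestamp: shift each by half a bar width
offset_2018 = -bar_width / 2
offset_normal = bar_width / 2

# Cross-check the conversion against the standard library
assert bar_width == timedelta(days=ndays).total_seconds() * 1000

print(bar_width, offset_2018, offset_normal)
```

With these offsets the two bars touch exactly at the timestamp, and together they span `2 * ndays` days, which should stay below the length of a month so that neighbouring months do not overlap.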

In [35]:
def create_precipitation_chart(df, width=900):
    '''Create a bokeh figure for monthly precipitation (2018 vs normal values).'''
    source = ColumnDataSource(df)
    p = figure(plot_width=width, plot_height=200, tools=['xwheel_zoom'], x_axis_type="datetime")

    # bar width: 10 days in milliseconds, so that two bars fit side by side within a month
    ndays = 10
    bar_width = float(ndays*24*60*60*1000)
    # plot 2018 and normal precipitation as neighbouring bars
    p.vbar(x=dodge('Timestamp', -bar_width/2, range=p.x_range), top='prec', width=bar_width, source=source,
       color="black")
    p.vbar(x=dodge('Timestamp', bar_width/2, range=p.x_range), top='prec_normal', width=bar_width, source=source,
       hatch_pattern = "left_diagonal_line", color = "white", line_color = "black")

    #labels
    # unit assumed to be mm: monthly totals above 100 cannot be inches
    citation = Label(x=350, y=150, x_units='screen', y_units='screen',
                 text='Precipitation in mm', render_mode='css',
                 border_line_color='black', border_line_alpha=1.0,
                 background_fill_color='white', background_fill_alpha=1.0)

    p.add_layout(citation)
    p.xaxis.ticker = MonthsTicker(months=list(range(12))) 
    p.xgrid.ticker = MonthsTicker(months=list(range(12)))
    p.xaxis.formatter=DatetimeTickFormatter(months=["               %b"])

    return p

show(create_precipitation_chart(df_kl_prec))

In [36]:
def create_humidity_chart(df, width=900):
    '''Create a bokeh figure for relative humidity.'''
    source = ColumnDataSource(df)

    p = figure(plot_width=width, plot_height=200, tools=['xwheel_zoom'], x_axis_type="datetime")
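    # one day in milliseconds, the unit of widths on a datetime axis
    day_ms = 24*60*60*1000
    p.vbar(x='Timestamp', top='rel_humidity', width=day_ms, source=source,
       color="grey")

    # draw the line from the same source instead of the global df_kl
    p.line(source=source, x='Timestamp', y='rel_humidity')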
    return p

show(create_humidity_chart(df_kl))

## 5. Combining multiple charts

In this last part, we combine the three charts you designed above.

### <font color=deeppink>**Tasks**</font>
- Create the combined weather chart for Kaiserslautern.
- Save a jpg/png-version or screenshot of this chart that can be uploaded in OLAT.

In [37]:
show(column(create_temperature_chart(df_kl), create_precipitation_chart(df_kl_prec),create_humidity_chart(df_kl)))