**Course**: Data Visualization

<div class="alert alert-info">
    
# Assignment 1 - Visualizing Data
</div>

The **goals** of the first assignment are:
- Get familiar with Python programming in the Jupyter notebook;
- Be able to create a data visualization using Bokeh;
- Recreate an existing visualization and develop an eye for key features;
- Start thinking critically about design options.



To achieve these goals, your task is to create a visualization of the weather in Kaiserslautern in 2018. The visualization should be similar to the following chart from the New York Times (Jan. 11, 1981, p. 32; Tufte (1983), p. 30) and needs to be implemented in bokeh+pandas:

![New York city's weather for 1980 from the New York Times](http://euclid.psych.yorku.ca/SCS/Gallery/images/NYweather.jpg)


<div class="alert alert-danger">

**Important**: No points are awarded for entering the correct answers in the notebooks, but you are strongly advised to work through the tasks thoroughly. They are designed to be encouraging and to prepare you for the exam, deepen your understanding of the methods, and build practical coding skills.
</div>

<div class="alert alert-success">
    
All tasks in this notebook are marked in green.
</div>

<div class="alert alert-info">
    
## 1. Starter Code - Minimal working example
</div>

The following pieces of code load the data for this assignment and generate a minimal chart for the temperature data. More details can be found in the [bokeh documentation](https://docs.bokeh.org/en/latest/docs/user_guide/quickstart.html).

First, load all necessary Python modules:

In [1]:
import pandas as pd

from bokeh.plotting import figure, output_notebook, show
from bokeh.models import Band, ColumnDataSource, PrintfTickFormatter, DatetimeTickFormatter, Label
from bokeh.layouts import column
from bokeh.models.tickers import MonthsTicker
from bokeh.transform import dodge

output_notebook()

Load the data, provided as CSV files, using the pandas library and display the monthly precipitation table.

In [2]:
df_kl = pd.read_csv('KLweather2018.csv', parse_dates=['Timestamp'], index_col='Timestamp')
df_kl_prec = pd.read_csv('KLweather2018_monthlyPrecipitation.csv', parse_dates=['Timestamp'], index_col='Timestamp')

#df_kl.head()
df_kl_prec

Timestamp,prec,prec_normal
2018-01-16 00:00:00,128.4,60.4
2018-02-14 12:00:00,13.7,48.414286
2018-03-16 00:00:00,41.7,53.444828
2018-04-15 12:00:00,30.5,47.268966
2018-05-16 00:00:00,108.9,65.924138
2018-06-15 12:00:00,81.5,67.137931
2018-07-16 00:00:00,41.6,60.521429
2018-08-16 00:00:00,40.7,57.653571
2018-09-15 12:00:00,27.0,52.596552
2018-10-16 00:00:00,11.1,65.413793
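
To see what `parse_dates` and `index_col` do in the call above, the following minimal sketch reads a small in-memory CSV instead of the real file. The two rows are copied from the table above; the variable names `csv_text` and `sample` are hypothetical and only serve the illustration.

```python
import io
import pandas as pd

# Hypothetical two-row sample; the column names mirror the precipitation table above.
csv_text = """Timestamp,prec,prec_normal
2018-01-16 00:00:00,128.4,60.4
2018-02-14 12:00:00,13.7,48.414286
"""

# parse_dates converts the column to datetimes, index_col makes it the row index
sample = pd.read_csv(io.StringIO(csv_text), parse_dates=['Timestamp'], index_col='Timestamp')

# The index is now a DatetimeIndex, so date components are directly accessible
print(type(sample.index).__name__)
print(sample.index.month.tolist())
```

Because the timestamps end up in the index, a `ColumnDataSource` built from such a DataFrame exposes them under the index name, which is why the charts below can refer to `x='Timestamp'`.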


Plot the temperature minimum as a line chart with bokeh using default settings. 

In [3]:
# create a figure
p = figure(plot_height=400, x_axis_type="datetime")

# define the type of glyph that is rendered and its data. here: a polyline
p.line(source=df_kl, x='Timestamp', y='temp_min')

# render the chart
show(p)

<div class="alert alert-info">
    
## 2. Customizing the temperature chart
</div>

As detailed above, your visualization should look like a modern version of the one from the New York Times. This can be achieved by changing the graphical elements and styling visual properties. In the function below some elements are already changed. Update the code to make the temperature chart even more similar:

<div class="alert alert-success">
    
- Depict the normal high and low temperatures as polylines.
- Label the two polylines. You may use the legend functionality.
- Depict the daily temperature range as an area.
- Label the y-axis.
- Style visual attributes (color, line style) to your liking.
    
</div>

Helpful resources:
- [Plotting with basic glyphs](https://docs.bokeh.org/en/latest/docs/user_guide/plotting.html) - Overview of glyph types that are implemented in bokeh; see the examples for all the graphical primitives that can be plotted directly.
- [Styling visual attributes](https://docs.bokeh.org/en/latest/docs/user_guide/styling.html) - See styling options for chart elements

In [4]:
def create_temperature_chart(df, width=900):
    '''Create a bokeh figure for temperature range and normal values.'''
    
    # create figure and data source
    p = figure(plot_width=width, plot_height=400, title='Kaiserslautern\'s Weather for 2018', tools=['xwheel_zoom'], 
           x_axis_type="datetime", x_axis_location="above", y_range=(-15,40))

    source = ColumnDataSource(df)

    # add graphical items
    p.line(source=source, x='Timestamp', y='temp_max', color='black')
    p.line(source=source, x='Timestamp', y='temp_min', color='black')
    p.varea(source=source, x='Timestamp', y1='temp_min', y2='temp_max', color='black')
    p.line(source=source, x='Timestamp', y='temp_normal_max', legend_label="temp_normal_max", color='orange')
    p.line(source=source, x='Timestamp', y='temp_normal_min', legend_label="temp_normal_min", color='blue')
    

    # mark min/max temperature (use the df argument rather than the global df_kl)
    tmax_id = df['temp_max'].idxmax()
    tmax = df.at[tmax_id, 'temp_max']

    tmin_id = df['temp_min'].idxmin()
    tmin = df.at[tmin_id, 'temp_min']
    
    # place a text label at each extreme, in data coordinates
    max_temp_label = Label(x=tmax_id, y=tmax, x_units='data', y_units='data',
                 text=f"Max temp: {tmax}°C on {tmax_id:%Y-%m-%d}", border_line_color='black')
    min_temp_label = Label(x=tmin_id, y=tmin, x_units='data', y_units='data',
                 text=f"Min temp: {tmin}°C on {tmin_id:%Y-%m-%d}", border_line_color='black')
    
    # style visual attributes
    p.xaxis.ticker = MonthsTicker(months=list(range(12))) 
    p.xgrid.ticker = MonthsTicker(months=list(range(12))) 
    p.xaxis.formatter=DatetimeTickFormatter(months=["               %b"])
    p.xaxis.major_label_text_align = 'right'
    p.yaxis[0].formatter = PrintfTickFormatter(format="%2i°")
    p.yaxis.axis_label = "Temperature [°C]"
    p.title.text_font_size = "15pt"
    p.title.align = "center"
    p.add_layout(max_temp_label)
    p.add_layout(min_temp_label)
    p.legend.location="top_left"
    
    return p

p = create_temperature_chart(df_kl)
show(p)

<div class="alert alert-info">
    
## 3. Filtering data and making annotations
</div>

The following piece of code demonstrates how to find the maximum and minimum of a data column. Use this code to automatically find the highest and lowest temperature values in 2018 and mark these positions in the chart above (e.g. circle the respective data points).

<div class="alert alert-success">
    
- Automatically filter the highest and lowest temperatures in Kaiserslautern in 2018.
- Integrate the code in the chart computation method above and mark the two detected positions.
- Add text labels to the positions. [Label documentation](https://docs.bokeh.org/en/latest/docs/user_guide/annotations.html#labels) for bokeh.
    
</div>

In [5]:
tmax_id = df_kl['temp_max'].idxmax()
print("KL temperature maximum:", tmax_id, df_kl.at[tmax_id,'temp_max'])

tmin_id = df_kl['temp_min'].idxmin()
print("KL temperature minimum:", tmin_id, df_kl.at[tmin_id,'temp_min'])

KL temperature maximum: 2018-08-04 00:00:00 35.5
KL temperature minimum: 2018-02-28 00:00:00 -14.0
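
As a small illustration of the lookup above, here is a sketch on made-up data. The DataFrame `toy` and its three values are hypothetical and not taken from the Kaiserslautern file. It shows that `idxmax`/`idxmin` return the index label (here a timestamp) rather than a row number, and that `.at` then fetches the matching value, which can be turned into label text.

```python
import pandas as pd

# Hypothetical three-day sample, not the real Kaiserslautern data.
toy = pd.DataFrame(
    {'temp_max': [3.0, 9.5, 7.2], 'temp_min': [-4.0, 1.5, -6.5]},
    index=pd.to_datetime(['2018-01-01', '2018-01-02', '2018-01-03']),
)

hottest_day = toy['temp_max'].idxmax()   # index label (a Timestamp), not a row number
coldest_day = toy['temp_min'].idxmin()

# .at looks up a single value by row label and column name
label_text = f"Max temp: {toy.at[hottest_day, 'temp_max']}°C on {hottest_day:%Y-%m-%d}"
print(label_text)
```

The timestamp and the value found this way can serve as the `x` and `y` position of a mark or text label in the chart.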


<div class="alert alert-info">

## 4. Designing additional charts
</div>

Now design the charts for precipitation and relative humidity.

<div class="alert alert-success">
    
- Create the chart for precipitation. Try to design a bar chart using the hints below.
- Create the chart for humidity.
    
</div>

Hints for temporal x-axis:
- **Width of bars**: The width is given in milliseconds. To get the required scaling, specify the width as `widthInDays = ndays*24*60*60*1000` (24 hours * 60 minutes * 60 seconds * 1000 milliseconds).
- **Position of bars**: You can shift the bars using the dodge function `x=dodge('Timestamp', value, range=p.x_range)`, where the first argument is the name of the column that holds the x positions. Keep in mind that you need to define an appropriate `value` (also in milliseconds) by which to shift the bar.
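
The arithmetic behind these two hints can be sketched as follows. The helper `days_to_ms` and the choice of 12 days per bar are assumptions for illustration only; any width that leaves room for two bars per month would do.

```python
from datetime import timedelta

def days_to_ms(ndays):
    """Convert a number of days to milliseconds (hypothetical helper)."""
    return ndays * 24 * 60 * 60 * 1000

bar_width = days_to_ms(12)   # width of one bar on the datetime axis
shift = bar_width / 2        # dodge value: half a bar to the left or right

# Cross-check the conversion against the standard library
assert bar_width == timedelta(days=12).total_seconds() * 1000
print(bar_width, shift)
```

Shifting one series by `-shift` and the other by `+shift` places the two bars side by side around the mid-month timestamp.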

In [6]:
def create_precipitation_chart(df, width=900):
    '''Create a bokeh figure for monthly precipitation (2018 vs normal values).'''
    
    # the DataFrame index becomes the 'Timestamp' column of the data source
    source = ColumnDataSource(df)
    
    p = figure(plot_width=width, plot_height=200, tools=['xwheel_zoom'], x_axis_type="datetime", title="Precipitation in inches")
    
    # bar width in milliseconds (12 days) on the datetime axis
    bar_width = 12*24*60*60*1000
    
    # draw each series once; dodge shifts the bars half a width left/right of the monthly timestamp
    p.vbar(x=dodge('Timestamp', -bar_width/2, range=p.x_range), width=bar_width, top='prec', bottom=0, source=source, color="black", legend_label='prec')
    p.vbar(x=dodge('Timestamp', bar_width/2, range=p.x_range), width=bar_width, top='prec_normal', bottom=0, source=source, color="gray", legend_label='prec normal')

    p.xaxis.ticker = MonthsTicker(months=list(range(12))) 
    p.xgrid.ticker = MonthsTicker(months=list(range(12))) 
    p.xaxis.formatter=DatetimeTickFormatter(months=["               %b"])
    p.yaxis.axis_label = "Inches"
    p.title.align = "center"
    return p

show(create_precipitation_chart(df_kl_prec))

In [7]:
def create_humidity_chart(df, width=900):
    '''Create a bokeh figure for relative humidity.'''
    
    p = figure(plot_width=width, plot_height=200, tools=['xwheel_zoom'], x_axis_type="datetime")
    df['zeros'] = 0
    p.varea(source=df, x='Timestamp', y1='rel_humidity', y2='zeros', color='black')
    p.yaxis.axis_label = "Humidity [%]"
    p.xaxis.ticker = MonthsTicker(months=list(range(12))) 
    p.xgrid.ticker = MonthsTicker(months=list(range(12))) 
    p.xaxis.formatter=DatetimeTickFormatter(months=["               %b"])
    return p

show(create_humidity_chart(df_kl))

<div class="alert alert-info">
    
## 5. Combining multiple charts
</div>

In this last part, we combine the three charts you designed above.

<div class="alert alert-success">
    
- Create the combined weather chart for Kaiserslautern.
- Save a jpg/png-version or screenshot of this chart that can be uploaded in OLAT.
    
</div>

In [9]:
show(column(create_temperature_chart(df_kl), create_precipitation_chart(df_kl_prec), create_humidity_chart(df_kl)))
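
Stacked this way, each chart keeps its own x-range, so zooming one chart does not move the others. One possible refinement, not required by the assignment, is to let the charts share a single range object. The sketch below uses two empty placeholder figures (`top` and `bottom` are hypothetical names) rather than the chart functions above.

```python
from bokeh.layouts import column
from bokeh.plotting import figure

# Two placeholder figures standing in for the charts built above.
top = figure(height=200, x_axis_type="datetime", tools=['xwheel_zoom'])
bottom = figure(height=200, x_axis_type="datetime", tools=['xwheel_zoom'],
                x_range=top.x_range)  # reuse the same range object to link zooming

# column() stacks the figures vertically; pass the result to show() to render it
layout = column(top, bottom)
```

To apply this to the chart functions above, each function would need an extra, optional argument for the shared range, which is a design choice left open here.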