**Course**: Data Visualization (Prof. Dr. Heike Leitte, Luisa Vollmer, RPTU Kaiserslautern),   **Name**: XXX XXX,   **Date**: DD.MM.YYYY

<div class="alert alert-info">
    
# Assignment 2 - Good Chart Design
</div>

The **goals** of the second assignment are:
- Practice visualization design critiques using a given visualization.
- Decompose a given chart into its components and analyze their design.
- Practice visual encoding theory by detecting marks and channels.
- Design an information-rich chart.

To achieve these goals, your task is to analyze, critique, and revise a given visualization that depicts the received and handled tickets (technical issues).

<div class="alert alert-danger">

**Important**: While no points will be awarded for typing the correct answers in the notebooks, it is highly advised to solve the tasks thoroughly. They are designed to be encouraging and to give you valuable practice for the exam, a better understanding of the methods, and hands-on coding experience.
</div>

<div class="alert alert-success">
    
All tasks in this notebook are marked in green.
</div>

<div class="alert alert-info">

## 1. Scenario and Starter Code
</div>

Imagine that you manage an information technology (IT) team. Your team receives tickets, or technical issues, from employees. In the past year, you've had two team members leave and decided at the time not to replace them. You have heard a rumbling of complaints from the remaining employees about having to "pick up the slack". You've just been asked about your hiring needs for the coming year and are wondering if you should hire a couple more people. First, you want to understand what impact the departure of individuals over the past year has had on your team's overall productivity. You plot the monthly trend of incoming tickets and those processed over the past calendar year. You see that there is some evidence your team's productivity is suffering from being short-staffed and now want to turn the quick-and-dirty visual you created into the basis for your hiring request.

Below is the code and chart of the initial ticket visualization. Read through the code and make sure that you understand it. 

In [2]:
import pandas as pd

from bokeh.plotting import figure, output_notebook, show
from bokeh.models import ColumnDataSource, LabelSet, PrintfTickFormatter, DatetimeTickFormatter, Label
from bokeh.models.tickers import MonthsTicker
from bokeh.transform import dodge
import datetime
output_notebook()

In [3]:
month = [i for i in range(1,13)]
received = [160,184,241,149,180,161,132,202,160,139,149,177]
processed = [160,184,237,148,181,150,123,156,126,121,124,140]



df = pd.DataFrame({'month': month, 
                   'received': received, 
                   'processed': processed})

In [4]:
p = figure(title="TICKET TREND", height=400)
p.vbar(source=df, x=dodge('month', -0.175, range=p.x_range), top='received', width=0.3, 
       legend_label="Ticket Volume Received")
p.vbar(source=df, x=dodge('month',  0.175, range=p.x_range), top='processed', width=0.3, 
       color="deeppink", legend_label="Ticket Volume Processed")

# remove the toolbar
p.toolbar.logo = None
p.toolbar_location = None

show(p)

<div class="alert alert-info">
    
## 2. Analysis of the original chart
</div>

### Message

<div class="alert alert-success">
    
Write down the message you want to convey with your chart
    
</div>

 <div class="alert alert-warning">

I want to show that more workers are needed to process all incoming tickets. The chart should make it visible that not all tickets can be processed.
</div>

### Design critique

<div class="alert alert-success">
   
- How well does the current design support the message you're trying to make?
- Can you spot problems with the current design?
    
</div>

<div class="alert alert-warning">

There is no label on the y-axis. The x-axis should show the month names instead of the month numbers. The bar chart only shows the unresolved tickets per month, not how they accumulate over the year.
</div>

### Analyze the chart design
Now go through each of the elements of the chart and check if the design is suitable. 

<div class="alert alert-success">
    
Make a list of all visible elements of the chart and problems with the design.
    
</div>

<div class="alert alert-warning">

- The contrast of the colors is unpleasant to the eye.
- The ticks of the x-axis are not chosen properly. There should be a tick for each month, not for every second month.
- A line chart that shows how the tickets accumulate over time would support my message better.
- The area under the line should be colored.
</div>

<div class="alert alert-info">
    
## 3. Improved design of helper elements
</div>

A major problem of the current chart is that it is too cluttered and full of competing information.
<div class="alert alert-success">
    
Improve all chart elements of the list above except for the data representation (bar elements) to achieve a better "background" design.

</div>

In [5]:
p = figure(title="TICKET VOLUME", height=400, x_axis_type='datetime')
df['month'] = [datetime.date(1900, x, 1) for x in df['month']]
p.vbar(source=df, x=dodge('month', -700000000/1.714, range=p.x_range), top='received', width=700000000, 
       legend_label="Received Tickets")
p.vbar(source=df, x=dodge('month',  700000000/1.714, range=p.x_range), top='processed', width=700000000, 
       color="deeppink", legend_label="Processed Tickets")
p.xaxis.ticker = MonthsTicker(months=list(range(12))) 
p.xgrid.ticker = MonthsTicker(months=list(range(12))) 
p.xaxis.formatter=DatetimeTickFormatter(months="%b")
# remove the toolbar
p.toolbar.logo = None
p.toolbar_location = None

p.yaxis.axis_label = "Tickets"
p.xaxis.axis_label = "Month"




show(p)
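
The literals `700000000` and `700000000/1.714` in the cell above are easier to read once their unit is known. Bokeh places datetime values on the axis as milliseconds since the Unix epoch, so bar widths and dodge offsets on a datetime axis are durations in milliseconds. The following is a minimal sketch of that conversion; `MS_PER_DAY` and `days_to_ms` are hypothetical names introduced here for illustration, and the 8-day and 4.5-day values are assumed alternatives, not values from the original cell.

```python
import datetime

# Number of milliseconds in one day (the unit of a Bokeh datetime axis).
MS_PER_DAY = datetime.timedelta(days=1).total_seconds() * 1000


def days_to_ms(days):
    """Convert a duration in days to milliseconds (hypothetical helper)."""
    return days * MS_PER_DAY


# The literals from the cell above, converted back to days for readability.
bar_width_ms = 700000000
dodge_offset_ms = 700000000 / 1.714

print(round(bar_width_ms / MS_PER_DAY, 2))     # bar width in days
print(round(dodge_offset_ms / MS_PER_DAY, 2))  # dodge offset in days

# Assumed alternative: derive similar values from named durations.
bar_width = days_to_ms(8)
dodge_offset = days_to_ms(4.5)
```

With values like these, each bar is roughly eight days wide and sits a little under five days to either side of the first of the month, which leaves a small gap between the two bars of a pair.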

<div class="alert alert-info">
    
## 4. Data Encoding
</div>

### Marks and channels

<div class="alert alert-success"> What are the marks and channels used to encode data? Write down the information as discussed in the lecture. </div>


<div class="alert alert-warning">

Marks
- The type of geometry -> in this example, bars are used

Channels
- Color (hue) -> received tickets are blue and processed tickets are pink
- Vertical length of the bars -> number of tickets
- Horizontal position of the bars -> month
</div>

### Creating new columns

The current dataframe only contains the raw data. Often you need to process the data further to obtain suitable input for your chart. Read the [ten minutes to pandas](https://pandas.pydata.org/docs/user_guide/10min.html#min) introduction to pandas programming.

Now you are able to do the following dataframe operations:

In [6]:
import numpy as np

a = [int(i*10) for i in np.random.random(6)]
b = [int(i*10) for i in np.random.random(6)]

df_test = pd.DataFrame({"A": a, "B": b})

# select a column
df_test['A']

# create a new column by adding two columns
df_test['A+B'] = df_test['A'] + df_test['B']

# create a new column by applying a function to an existing column
df_test['A+1'] = df_test['A'].apply(lambda x: x+1)
df_test['Size(A)'] = df_test['A'].apply(lambda x: "S" if x < 5 else "M")

df_test

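| | A | B | A+B | A+1 | Size(A) |
|---|---|---|---|---|---|
| 0 | 0 | 1 | 1 | 1 | S |
| 1 | 8 | 7 | 15 | 9 | M |
| 2 | 9 | 2 | 11 | 10 | M |
| 3 | 0 | 3 | 3 | 1 | S |
| 4 | 1 | 9 | 10 | 2 | S |
| 5 | 9 | 3 | 12 | 10 | M |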
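
The two `apply` calls above can also be written without a lambda. The sketch below is one possible alternative, not part of the original assignment: it uses element-wise Series arithmetic and `numpy.where`, and it fixes the input to the values shown in the example output so that the result is reproducible (the original cell draws random numbers).

```python
import numpy as np
import pandas as pd

# Values copied from the example output above (the original cell is random).
df_test = pd.DataFrame({"A": [0, 8, 9, 0, 1, 9], "B": [1, 7, 2, 3, 9, 3]})

# Arithmetic on a Series is applied element-wise, so no lambda is needed.
df_test["A+1"] = df_test["A"] + 1

# np.where picks "S" where the condition holds and "M" everywhere else.
df_test["Size(A)"] = np.where(df_test["A"] < 5, "S", "M")

# Compare against the apply-based version used in the cell above.
same_plus_one = df_test["A+1"].equals(df_test["A"].apply(lambda x: x + 1))
same_size = (
    df_test["Size(A)"] == df_test["A"].apply(lambda x: "S" if x < 5 else "M")
).all()
print(same_plus_one, same_size)
```

Both variants produce the same columns. The vectorized form tends to be faster on large dataframes, while `apply` remains the more general tool when the per-element logic does not map onto a built-in operation.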


<div class="alert alert-success"> 
    
Use these dataframe operations to create the following columns in the ticket dataframe:
- the difference between received and processed tickets
- the total number of open tickets, i.e., the sum of all tickets that have not been handled yet. See method [`cumsum()`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.cumsum.html) for a pandas Series.
    
</div>

In [22]:
df['received-processed'] = df['received'] - df['processed']
df['Unprocessed'] = df['received-processed'].cumsum()
df

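| | month | received | processed | received-processed | Unprocessed |
|---|---|---|---|---|---|
| 0 | 1900-01-01 | 160 | 160 | 0 | 0 |
| 1 | 1900-02-01 | 184 | 184 | 0 | 0 |
| 2 | 1900-03-01 | 241 | 237 | 4 | 4 |
| 3 | 1900-04-01 | 149 | 148 | 1 | 5 |
| 4 | 1900-05-01 | 180 | 181 | -1 | 4 |
| 5 | 1900-06-01 | 161 | 150 | 11 | 15 |
| 6 | 1900-07-01 | 132 | 123 | 9 | 24 |
| 7 | 1900-08-01 | 202 | 156 | 46 | 70 |
| 8 | 1900-09-01 | 160 | 126 | 34 | 104 |
| 9 | 1900-10-01 | 139 | 121 | 18 | 122 |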
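
To see what `cumsum()` computes here, the same arithmetic can be reproduced in plain Python. This is a small sketch for checking the numbers by hand, assuming the backlog is empty at the start of the year. The variable names `gap` and `backlog` are introduced only for this illustration.

```python
from itertools import accumulate

received = [160, 184, 241, 149, 180, 161, 132, 202, 160, 139, 149, 177]
processed = [160, 184, 237, 148, 181, 150, 123, 156, 126, 121, 124, 140]

# Monthly gap: tickets received minus tickets processed in the same month.
gap = [r - p for r, p in zip(received, processed)]

# Running total of the gap, i.e. the open backlog at the end of each month.
backlog = list(accumulate(gap))

print(gap)
print(backlog)

# The year-end backlog equals total received minus total processed.
assert backlog[-1] == sum(received) - sum(processed)
```

The running total reaches 184 open tickets in December, which is the figure the final chart highlights.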


### Alternative data encodings
<div class="alert alert-success"> Now change the presentation of the data. Select two new, suitable chart types for the ticket data. Stick to chart types that use the same x- and y-axis as defined above. Implement (at least) two different data encodings. What is your favorite? </div>

In [14]:
unresolvedChart = figure(title="TICKET VOLUME", height=400, x_axis_type='datetime')
unresolvedChart.line(source=df, x='month', y='Unprocessed', line_width=1, line_color='black')
unresolvedChart.varea(source=df, x='month', y1 = 0, y2='Unprocessed', fill_alpha=0.5, fill_color='red', legend_label="Unresolved Tickets in Backlog")
unresolvedChart.legend.location = 'top_left'
unresolvedChart.xaxis.ticker = MonthsTicker(months=list(range(12))) 
unresolvedChart.xgrid.ticker = MonthsTicker(months=list(range(12))) 
unresolvedChart.xaxis.formatter=DatetimeTickFormatter(months="%b")


# remove the toolbar
unresolvedChart.toolbar.logo = None
unresolvedChart.toolbar_location = None

unresolvedChart.yaxis.axis_label = "Tickets"
unresolvedChart.xaxis.axis_label = "Month"



show(unresolvedChart)

<div class="alert alert-info"> 
    
## 5. Revise the chart 
</div>

<div class="alert alert-success"> 
    
Now combine the results from the previous steps and create a final chart that you would send to your boss to ask for additional staff. Also think about adding additional information to your chart with labels etc.
</div>

In [21]:

# mark the final data point and report the year-end backlog in the legend
unresolvedChart.circle(x=df.at[11, 'month'], y=df.at[11, 'Unprocessed'], size=7,
                       legend_label=str(df.at[11, 'Unprocessed']) + ' unresolved tickets accumulated over the year')
#ll =Label(x=700000, y = 50, text = "i")
#unresolvedChart.add_layout(ll)
show(unresolvedChart)


**Caption**:

As we learned: Each figure needs a caption. <div class="alert alert-success"> Write a caption for your figure. Position it underneath your chart so you can take a screenshot of chart + caption. </div>

In [10]:
show(unresolvedChart)
print("Accumulation of unresolved tickets in the backlog due to a lack of workers.")

Accumulation of unresolved tickets in the backlog due to a lack of workers.
