**Course**: Data Visualization (Prof. Dr. Heike Leitte, Luisa Vollmer, RPTU Kaiserslautern),   **Name**: Faris Abu Ali,   **Date**: 30.11.2000

<div class="alert alert-info">
    
# Assignment 2 - Good Chart Design
</div>

The **goals** of the second assignment are:
- Practice visualization design critiques using a given visualization.
- Decompose a given chart into its components and analyze their design.
- Practice visual encoding theory by detecting marks and channels.
- Design an information-rich chart.

To achieve these goals, your task is to analyze, critique, and revise a given visualization that depicts the received and handled tickets (technical issues).

<div class="alert alert-danger">

**Important**: While no points will be awarded for typing the correct answers in the notebooks, it is highly advised to solve the tasks thoroughly. They are designed to be encouraging and to give you valuable preparation for the exam, a better understanding of the methods, and practice in coding.
</div>

<div class="alert alert-success">
    
All tasks in this notebook are marked in green.
</div>

<div class="alert alert-info">

## 1. Scenario and Starter Code
</div>

Imagine that you manage an information technology (IT) team. Your team receives tickets, or technical issues, from employees. In the past year, you've had two team members leave and decided at the time not to replace them. You have heard a rumbling of complaints from the remaining employees about having to "pick up the slack". You've just been asked about your hiring needs for the coming year and are wondering if you should hire a couple more people. First, you want to understand what impact the departure of individuals over the past year has had on your team's overall productivity. You plot the monthly trend of incoming tickets and those processed over the past calendar year. You see that there is some evidence your team's productivity is suffering from being short-staffed and now want to turn the quick-and-dirty visual you created into the basis for your hiring request.

Below is the code and chart of the initial ticket visualization. Read through the code and make sure that you understand it. 

In [1]:
import pandas as pd
from bokeh.plotting import figure, output_notebook, show
from bokeh.models import ColumnDataSource, LabelSet
from bokeh.transform import dodge

output_notebook()

In [2]:
month = [i for i in range(1,13)]
received = [160,184,241,149,180,161,132,202,160,139,149,177]
processed = [160,184,237,148,181,150,123,156,126,121,124,140]

df = pd.DataFrame({'month': month, 
                   'received': received, 
                   'processed': processed})

print(df)

    month  received  processed
0       1       160        160
1       2       184        184
2       3       241        237
3       4       149        148
4       5       180        181
5       6       161        150
6       7       132        123
7       8       202        156
8       9       160        126
9      10       139        121
10     11       149        124
11     12       177        140


In [3]:
p = figure(title="TICKET TREND", height=400)
p.vbar(source=df, x=dodge('month', -0.175, range=p.x_range), top='received', width=0.3,
       legend_label="Ticket Volume Received")
p.vbar(source=df, x=dodge('month',  0.175, range=p.x_range), top='processed', width=0.3, 
       color="deeppink", legend_label="Ticket Volume Processed")

# remove the toolbar
p.toolbar.logo = None
p.toolbar_location = None

show(p)
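
To make the `dodge` offsets in the cell above concrete, the following sketch computes the horizontal extent of both bars for a single month. It assumes that, on a numeric axis, `dodge` simply adds the offset to the `month` value and that `width` is centered on the resulting x position. The helper `bar_edges` is hypothetical and exists only for illustration.

```python
# Hypothetical helper: horizontal extent of a vbar centered at x + offset.
# Assumes dodge adds the offset to the numeric x value (numeric axis).
def bar_edges(x, offset, width):
    center = x + offset
    return center - width / 2, center + width / 2


month_value = 1
received_left, received_right = bar_edges(month_value, -0.175, 0.3)
processed_left, processed_right = bar_edges(month_value, 0.175, 0.3)

# Gap between the two bars of the same month
gap_within_month = processed_left - received_right

# Gap between this month's right bar and the next month's left bar
next_left, _ = bar_edges(month_value + 1, -0.175, 0.3)
gap_between_months = next_left - processed_right

print(round(gap_within_month, 3), round(gap_between_months, 3))
```

Under these assumptions, the two bars of one month are separated by a narrow gap and neighboring month groups by a wider one, which is what makes the grouping by month visible.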

<div class="alert alert-info">
    
## 2. Analysis of the original chart
</div>

### Message

<div class="alert alert-success">
    
Write down the message you want to convey with your chart
    
</div>

<div class="alert alert-warning">

I want to show the negative impact of the employees' departure by showing how the gap between the 'received' and 'processed' tickets has widened over time, which indicates that the remaining team members are no longer able to process all incoming tickets.
</div>

### Design critique

<div class="alert alert-success">
   
- How well does the current design support the message you're trying to convey?
- Can you spot problems with the current design?
    
</div>

<div class="alert alert-warning">
- It shows clearly that the gap between 'received' and 'processed' tickets is increasing.

Problems:
- The axes are not labeled. It is unclear what the x-axis values 1 to 12 represent: months, days, hours, or perhaps years? The same applies to the y-axis.
- It gives no information about how the number of employees changed over the period. Did it increase, decrease, or stay the same?
</div>

### Analyze the chart design
Now go through each of the elements of the chart and check if the design is suitable. 

<div class="alert alert-success">
    
Make a list of all visible elements of the chart and problems with the design.
    
</div>

<div class="alert alert-warning">

- The legend is fine.
- A title exists but is not very meaningful; it could be more informative.
</div>

<div class="alert alert-info">
    
## 3. Improved design of helper elements
</div>

A major problem of the current chart is that it is too cluttered and full of competing information.
<div class="alert alert-success">
    
Improve all chart elements of the list above except for the data representation (bar elements) to achieve a better "background" design.

</div>

In [4]:
from bokeh.models.tickers import MonthsTicker
from bokeh.models import DatetimeTickFormatter, PrintfTickFormatter

p = figure(title="The Impact of Employee Shortage on Productivity", height=400)
p.vbar(source=df, x=dodge('month', -0.175, range=p.x_range), top='received', width=0.3,
       legend_label="Ticket Volume Received")
p.vbar(source=df, x=dodge('month',  0.175, range=p.x_range), top='processed', width=0.3, 
       color="deeppink", legend_label="Ticket Volume Processed")

# remove the toolbar
p.toolbar.logo = None
p.toolbar_location = None

p.xaxis.axis_label = "Month"
p.xaxis.ticker= month
p.xaxis.major_label_text_align = 'right'

p.yaxis.axis_label = "Number of Tickets"

p.title.text_font_size = "12pt"
p.title.align = "center"

show(p)


<div class="alert alert-info">
    
## 4. Data Encoding
</div>

### Marks and channels

<div class="alert alert-success"> What are the marks and channels used to encode data? Write down the information as discussed in the lecture. </div>


<div class="alert alert-warning">
<strong>Marks</strong>:
Marks represent the basic geometric elements used to encode the data:

- Bars (rectangles/areas): These are used to represent the two datasets, "Ticket Volume Received" and "Ticket Volume Processed."

<strong>Channels</strong>:
Channels define the visual properties applied to the marks to encode data:

- Position (Vertical/Height): The height of each bar encodes the quantity of tickets (a quantitative variable).

- Color:
    - Blue represents "Ticket Volume Received."
    - Pink represents "Ticket Volume Processed."

- Orientation: The bars are vertically oriented.

- Grouping (Spatial Layout): Bars are grouped by month (along the horizontal axis), helping viewers compare data within and across months.

</div>

### Creating new columns

The current dataframe only contains the raw data. Often you need to process the data further to obtain suitable input for your visualization. Read the [ten minutes to pandas](https://pandas.pydata.org/docs/user_guide/10min.html#min) introduction to pandas programming.

Now you are able to do the following dataframe operations:

In [5]:
import numpy as np

a = [int(i*10) for i in np.random.random(6)]
b = [int(i*10) for i in np.random.random(6)]

df_test = pd.DataFrame({"A": a, "B": b})

# select a column
df_test['A']

# create a new column by adding two columns
df_test['A+B'] = df_test['A'] + df_test['B']

# create a new column by applying a function to an existing column
df_test['A+1'] = df_test['A'].apply(lambda x: x+1)
df_test['Size(A)'] = df_test['A'].apply(lambda x: "S" if x < 5 else "M")

df_test

   A  B  A+B  A+1 Size(A)
0  3  5    8    4       S
1  0  1    1    1       S
2  9  3   12   10       M
3  0  5    5    1       S
4  3  3    6    4       S
5  4  4    8    5       S
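
As a side note to the operations above, `apply` with a lambda runs once per row, while many such transformations can also be written on the whole column at once. The sketch below uses fixed example values instead of random ones (an assumption made so the result is reproducible) and compares both styles; `df_example` is a hypothetical name.

```python
import numpy as np
import pandas as pd

# Fixed example values (the notebook cell above uses random ones)
df_example = pd.DataFrame({"A": [3, 0, 9, 0, 3, 4]})

# Row-wise version, as in the cell above
size_apply = df_example["A"].apply(lambda x: "S" if x < 5 else "M")

# Vectorized version: np.where evaluates the condition on the whole column
size_where = pd.Series(
    np.where(df_example["A"] < 5, "S", "M"), index=df_example.index
)

# Simple arithmetic does not need apply at all
a_plus_one = df_example["A"] + 1

# Both approaches give the same labels
print((size_apply == size_where).all())
```

For a dataframe with twelve rows the difference is irrelevant, so either style is fine for this assignment.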


<div class="alert alert-success"> 
    
Use these dataframe operations to create the following columns in the ticket dataframe:
- the difference between received and processed tickets
- the total number of open tickets, i.e., the sum of all tickets that have not been handled yet. See method [`cumsum()`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.cumsum.html) for a pandas Series.
    
</div>

In [6]:
df['diff'] = df['received'] - df['processed']

df['total_open'] = df['diff'].cumsum()

df

    month  received  processed  diff  total_open
0       1       160        160     0           0
1       2       184        184     0           0
2       3       241        237     4           4
3       4       149        148     1           5
4       5       180        181    -1           4
5       6       161        150    11          15
6       7       132        123     9          24
7       8       202        156    46          70
8       9       160        126    34         104
9      10       139        121    18         122
10     11       149        124    25         147
11     12       177        140    37         184
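
A quick way to sanity-check the two new columns is to compare the last running total with the difference of the yearly sums, since both describe the backlog at the end of the year. The sketch below rebuilds the data in a separate, hypothetically named dataframe `df_check` so that it can run on its own.

```python
import pandas as pd

received = [160, 184, 241, 149, 180, 161, 132, 202, 160, 139, 149, 177]
processed = [160, 184, 237, 148, 181, 150, 123, 156, 126, 121, 124, 140]
df_check = pd.DataFrame({"received": received, "processed": processed})

df_check["diff"] = df_check["received"] - df_check["processed"]
df_check["total_open"] = df_check["diff"].cumsum()

# The last running total must equal the difference of the yearly totals
final_backlog = int(df_check["total_open"].iloc[-1])
yearly_gap = int(df_check["received"].sum() - df_check["processed"].sum())
assert final_backlog == yearly_gap

print(final_backlog)
```

With the ticket data from this notebook, both values are 184 open tickets at the end of month 12.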


### Alternative data encodings
<div class="alert alert-success"> Now change the presentation of the data. Select two new, suitable chart types for the ticket data. Stick to chart types that use the same x- and y-axis as defined above. Implement (at least) two different data encodings. What is your favorite? </div>

In [18]:
from bokeh.palettes import Blues8, Reds8

# 1. Line Chart (Alternative representation)
line_chart = figure(title="Received vs Processed Tickets Over Time", height=400, x_axis_label="Month", y_axis_label="Number of Tickets")
line_chart.line(source=df, x='month', y='received', legend_label="Received", line_width=2)
line_chart.line(source=df,x='month', y='processed', legend_label="Processed", color="deeppink", line_width=2)
line_chart.scatter(df['month'], df['received'], size=6)
line_chart.scatter(df['month'], df['processed'], size=6, color="deeppink")

# 2. Stacked Area Chart (Alternative representation, drawn with varea_stack)
stacked_bar_chart = figure(title="Received vs Processed Tickets Over Time", height=400, x_axis_label="Month", y_axis_label="Number of Tickets")
stacked_bar_chart.varea_stack(['processed', 'received'], source=df, x='month', color=("deeppink", Blues8[1]), legend_label=["Processed", "Received"])

# Representing the total open tickets over time
open_tickets_chart = figure(title="Total Open Tickets Over Time", height=400, x_axis_label="Month", y_axis_label="Number of Tickets")
open_tickets_chart.line(source=df, x='month', y='total_open', legend_label="Total Open", line_width=2, color="orange")
open_tickets_chart.scatter(df['month'], df['total_open'], size=6, color="orange")
open_tickets_chart.line(source=df, x='month', y='diff', legend_label="Received - Processed", line_width=2, color="lightgrey")
open_tickets_chart.scatter(df['month'], df['diff'], size=6, color="lightgrey")
open_tickets_chart.legend.location = "top_left"

diff_chart = figure(title="Monthly Difference Between Received and Processed Tickets", height=400, x_axis_label="Month", y_axis_label="Number of Tickets")
diff_chart.vbar(source=df, x='month', top='diff', width=0.5, color="orange")

for chart in [line_chart, stacked_bar_chart, open_tickets_chart, diff_chart]:
    chart.xaxis.axis_label = "Month"
    chart.xaxis.ticker= month
    chart.xaxis.major_label_text_align = 'right'

    chart.title.align = "center"
    # chart.title.text_font_size = "12pt"

    # remove the toolbar
    chart.toolbar.logo = None
    chart.toolbar_location = None


show(line_chart)
show(stacked_bar_chart)
show(open_tickets_chart)
show(diff_chart)
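
One point to keep in mind for the stacked encoding above: `varea_stack` draws each layer on top of the previous one, so stacking `processed` and `received` yields an upper boundary of `processed + received`, a quantity with no direct meaning in this scenario. A possible alternative, sketched here as an assumption rather than a required solution, is to stack the handled and the unhandled tickets so that the top of the stack equals `received`. The names `df_stack`, `handled`, and `unhandled` are hypothetical and not part of the original notebook.

```python
import pandas as pd

received = [160, 184, 241, 149, 180, 161, 132, 202, 160, 139, 149, 177]
processed = [160, 184, 237, 148, 181, 150, 123, 156, 126, 121, 124, 140]
df_stack = pd.DataFrame(
    {"month": list(range(1, 13)), "received": received, "processed": processed}
)

# Lower layer: processed volume, capped at the received volume
# (in month 5 one more ticket was processed than received)
df_stack["handled"] = df_stack[["received", "processed"]].min(axis=1)

# Upper layer: tickets received but not processed in the same month
df_stack["unhandled"] = (df_stack["received"] - df_stack["processed"]).clip(lower=0)

# The two layers now add up to the received volume in every month
layers_match = (df_stack["handled"] + df_stack["unhandled"] == df_stack["received"]).all()
print(layers_match)
```

These two columns could then be passed as stackers, for example `varea_stack(['handled', 'unhandled'], source=df_stack, x='month', ...)`, so that the upper layer directly shows the monthly shortfall.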

<div class="alert alert-info"> 
    
## 5. Revise the chart 
</div>

<div class="alert alert-success"> 
    
Now combine the results from the previous steps and create a final chart that you would send to your boss to ask for additional staff. Also think about adding additional information to your chart with labels etc.
</div>
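
For the labels mentioned in the task, it can help to compute the values worth annotating instead of reading them off the chart. The following sketch derives two candidate annotations from the ticket data: the month with the largest gap and the first month in which the gap exceeds a threshold. The threshold of 10 tickets and the name `df_notes` are assumptions chosen for illustration.

```python
import pandas as pd

received = [160, 184, 241, 149, 180, 161, 132, 202, 160, 139, 149, 177]
processed = [160, 184, 237, 148, 181, 150, 123, 156, 126, 121, 124, 140]
df_notes = pd.DataFrame(
    {"month": list(range(1, 13)), "received": received, "processed": processed}
)
df_notes["diff"] = df_notes["received"] - df_notes["processed"]

# Month with the largest gap between received and processed tickets
worst = df_notes.loc[df_notes["diff"].idxmax()]
worst_month = int(worst["month"])
worst_gap = int(worst["diff"])

# First month in which the gap exceeds an arbitrarily chosen threshold
threshold = 10  # assumption, not given in the assignment
first_over = int(df_notes.loc[df_notes["diff"] > threshold, "month"].iloc[0])

label_text = f"Largest gap: {worst_gap} tickets in month {worst_month}"
print(label_text)
print(first_over)
```

The resulting text and positions could then be placed on the chart with a label or annotation, which draws the reader's attention to the point where the team started falling behind.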

In [19]:
from bokeh.layouts import row, column, Spacer

padding = 20 # padding between the charts in pixels

# update the height and width of the charts
for chart in [line_chart, stacked_bar_chart, open_tickets_chart, diff_chart]:
    chart.height = 300
    chart.width = 420

# Create a layout with the charts
layout = column(
    row(line_chart, Spacer(width=padding), stacked_bar_chart),
    Spacer(height=padding),
    row(diff_chart, Spacer(width=padding), open_tickets_chart),
    # sizing_mode="stretch_both",
)

show(layout)

**Caption**:

As we learned: Each figure needs a caption. <div class="alert alert-success"> Write a caption for your figure. Position it underneath your chart so you can take a screenshot of chart + caption. </div>

In [33]:
from bokeh.layouts import row, column, Spacer
from bokeh.models import Div

padding = 20 # padding between the charts in pixels

# update the height and width of the charts
for chart in [line_chart, stacked_bar_chart, open_tickets_chart, diff_chart]:
    chart.height = 300
    chart.width = 500

caption = """
<style>
    .caption-text {
        font-size: 16px;
        line-height: 1.5;
        color: #333333;
    }
</style>
<div class="caption-text">
    The charts above illustrate the impact of the staff shortage on ticket processing over the past year.
    The line chart and the stacked area chart show the monthly trend of received and processed tickets, highlighting the widening gap between them.
    The bar chart shows the monthly difference between received and processed tickets, while the total open tickets chart shows the cumulative number of unresolved tickets.
</div>
"""

caption_div = Div(text=caption, width=1000)

# Create a layout with the charts
layout_with_caption = column(
    layout,
    Spacer(height=5),
    caption_div,
)

show(layout_with_caption)