# Staff allocation

In this notebook we define different functions to **query staff allocation** and plot **teaching distribution** in the School.

**Table of contents:**

+ [**1. Load dataset**](#load)
+ [**2. Individual staff**](#def)
+ [**3. All staff**](#all)
    + [**3a. Units per year**](#nb)
    + [**3b. Load per semester**](#nbsem)
    + [**3c. Total staff teaching load**](#nbtot)

In [None]:
import pandas as pd
import numpy as np

import seaborn as sns
import networkx as nx
import matplotlib.pyplot as plt

from matplotlib.pyplot import pie, axis, show
from IPython.display import display, HTML

%matplotlib inline
%config InlineBackend.figure_format = 'svg'
plt.rcParams['mathtext.fontset'] = 'cm'

# 1. Load dataset <a id='load'></a>

We first load the different files created with the following notebooks:

+ [**Listing Staff**](listingStaff.ipynb)
+ [**Listing Units**](listingUnits.ipynb)

In [None]:
# Staff file:
staff = pd.read_pickle('staff')

# Unit description files:
unitsS1 = pd.read_pickle('unitsS1')
unitsS2 = pd.read_pickle('unitsS2')
unitsOLE = pd.read_pickle('unitsOLE')
unitsHonors = pd.read_pickle('unitsHonors')

# 2. Individual staff <a id='def'></a>

The first function, **`getStaff`**, takes the `name` of a teaching staff member and plots the units of study (UoS) he/she is involved in over the year. It also returns the corresponding allocation table.

The function plots the staff load in two panels:

1. the first one shows the **number of units taught over the semesters**
2. the second shows the **load for each unit taught**

In [None]:
def getStaff(name=None, plot=True, staff=staff, unitsS1=unitsS1, unitsS2=unitsS2,
             unitsOLE=unitsOLE, unitsHonors=unitsHonors):
    '''
    Return the teaching allocation of one staff member over the considered year
    (one row per unit) and optionally plot it.
    '''
    viewstaff = staff[staff['name']==name]
    listunits = viewstaff['units'].values[0]
    listperc = viewstaff['perc'].values[0]
    position = viewstaff['position'].values[0]

    # Tables searched for each unit code; a later match overrides an earlier one.
    # Honours units are deliberately left out (unitsHonors is currently unused).
    tables = [('S.1', unitsS1), ('S.2', unitsS2), ('OLE', unitsOLE)]

    columns = ['name','position','code','lvl','load','weight',
               'cumload','semester','coordinator']
    sumload = 0.
    rows = []
    for k in range(len(listunits)):
        dd = None
        for label, units in tables:
            match = units[units['code']==listunits[k]]
            if len(match)>0:
                dd = match
                listsemester = label

        if dd is None:
            # Unit not found in any table (e.g. an Honours unit): skip it
            # instead of reusing the data of the previous unit.
            continue

        listlevel = dd['level'].values[0]
        listcoordinator = dd['coordinator'].values[0]
        # Share of the unit taught times the unit load, each divided by 100
        weight = (listperc[k]*dd['load'].values[0]/100.)/100.
        sumload += weight
        rows.append({'name':name,'position':position,'code':listunits[k],
                     'lvl': listlevel,'load': listperc[k],'weight':weight,
                     'cumload':sumload,'semester':listsemester,
                     'coordinator':listcoordinator})

    # Rows are collected in a list because DataFrame.append was removed in pandas 2.0
    staffdata = pd.DataFrame(rows, columns=columns)
    if plot:
        fig, ax = plt.subplots(figsize=(6,4), ncols=2, nrows=1,
                               gridspec_kw = {'width_ratios':[2, 5]})
        g0 = sns.countplot(x='semester', data=staffdata, ax=ax[0], palette='Blues')
        g0.set_xticklabels(g0.get_xticklabels(),rotation=30)
        sns.despine()

        g1 = sns.barplot(x='code', y='load', data=staffdata, 
                         ax=ax[1], palette='RdBu_r')
        g1.set_xticklabels(g1.get_xticklabels(),rotation=30)
        plt.show()
    
    return staffdata
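
As a worked example of the `weight` and `cumload` columns, the sketch below redoes the arithmetic of `getStaff` on invented numbers. The unit codes, shares and loads are hypothetical, and it assumes that both `perc` and `load` are stored as percentages, which is what the two divisions by 100 in the function suggest.

```python
# Hypothetical allocation: (unit code, share taught in %, unit load in %)
allocation = [('UNIT1001', 50.0, 100.0),
              ('UNIT2002', 25.0, 80.0)]

cumload = 0.0
rows = []
for code, perc, load in allocation:
    # Same formula as in getStaff: both percentages are turned into fractions
    weight = (perc * load / 100.) / 100.
    cumload += weight
    rows.append({'code': code, 'weight': weight, 'cumload': cumload})

for row in rows:
    print(row)
```

With these numbers, teaching half of a unit with a load of 100 contributes 0.5, and a quarter of a unit with a load of 80 contributes 0.2.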

Example of **`getStaff`** usage:

In [None]:
name = 'Bruce'
df = getStaff(name)

A second function, **`plotStaffGraph`**, plots the teaching load of the chosen staff member as a _network graph_...

In [None]:
def plotStaffGraph(df=None, name=None):
    '''
    Plotting for individual staff the teaching load in a network graph
    '''
    G = nx.from_pandas_edgelist(df, 'name', 'code', ['weight'])
    pos = nx.circular_layout(G)

    edges = G.edges()
    weights = [G[u][v]['weight']*5. for u,v in edges]

    plt.figure(1,figsize=(7,4)) 
    
    nx.draw_networkx_nodes(G,pos,
                           node_color='k',
                           node_size=4200)
    
    nx.draw_networkx_nodes(G,pos,
                           node_color='#A0CBE2',
                           node_size=4000)
    
    nx.draw_networkx_nodes(G,pos,nodelist=[name],
                           node_color='k',
                           node_size=5200)
    
    nx.draw_networkx_nodes(G,pos,nodelist=[name],
                           node_color='r',
                           node_size=5000)

    labels = nx.get_edge_attributes(G,'weight')

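    # with_labels is not an argument of draw_networkx_edges and is rejected
    # by recent networkx versions, so it is not passed here
    nx.draw_networkx_edges(G,pos,
            width=weights, edge_cmap=plt.cm.Blues)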
    nx.draw_networkx_edge_labels(G,pos,edge_labels=labels)
    nx.draw_networkx_labels(G,pos,font_size=11)
    plt.ylim(-1.5,1.5)
    plt.axis('off')
    plt.tight_layout()
    plt.show()

    return

Example of **`plotStaffGraph`** usage. Note that the thickness of the connecting lines is proportional to the teaching load...

In [None]:
plotStaffGraph(df, name)
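
To make the link between teaching load and line thickness explicit, here is a minimal sketch that builds the same kind of graph from an invented table and computes the widths without drawing anything. The staff name, unit codes and weights are hypothetical; the factor 5 is the one used in `plotStaffGraph`.

```python
import pandas as pd
import networkx as nx

# Hypothetical table with the same columns that plotStaffGraph reads
toy = pd.DataFrame({'name': ['Bruce', 'Bruce'],
                    'code': ['UNIT1001', 'UNIT2002'],
                    'weight': [0.5, 0.2]})

# One edge per row, carrying the 'weight' column as an edge attribute
G = nx.from_pandas_edgelist(toy, 'name', 'code', ['weight'])

# Line width of each edge, keyed by unit code: the weight scaled by 5
widths = {}
for u, v in G.edges():
    code = v if u == 'Bruce' else u
    widths[code] = G[u][v]['weight'] * 5.

print(widths)
```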

Another example, also showing an _HTML display_ of the staff allocation table...

In [None]:
name = 'Rey'
df = getStaff(name,plot=True)
display(HTML(df.to_html()))
plotStaffGraph(df,name)

# 3. All staff <a id='all'></a>

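We now create a **`pandas dataframe`** containing, for all staff, one row per unit taught together with its associated _weight_.

This `dataframe` is called **`allstaff`**. Note that its `unit` column stores the semester label (`S.1`, `S.2` or `OLE`), not the unit code.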

In [None]:
rows = []

for k in range(len(staff)):
    name = staff['name'][k]
    df = getStaff(name, plot=False)
    for p in range(len(df)):
        rows.append({'name':df['name'][p],'position':df['position'][p],'unit':df['semester'][p],
                     'weight':df['weight'][p]})

# Rows are collected in a list because DataFrame.append was removed in pandas 2.0
allstaff = pd.DataFrame(rows, columns=['name','position','unit','weight'])
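
The loop above can also be expressed by concatenating the per-staff tables in a single call. The sketch below uses a hypothetical stand-in, `toy_get_staff`, in place of `getStaff` so that it runs on its own; the names, positions and weights are invented.

```python
import pandas as pd

def toy_get_staff(name):
    # Hypothetical stand-in for getStaff(name, plot=False)
    data = {'Bruce': [('S.1', 0.5), ('S.2', 0.2)],
            'Rey': [('S.1', 0.3)]}
    return pd.DataFrame([{'name': name, 'position': 'Permanent',
                          'semester': sem, 'weight': w}
                         for sem, w in data[name]])

# Stack the per-staff tables, then keep and rename the columns used in allstaff
frames = [toy_get_staff(n) for n in ['Bruce', 'Rey']]
stacked = pd.concat(frames, ignore_index=True)
toy_allstaff = stacked[['name', 'position', 'semester', 'weight']]
toy_allstaff = toy_allstaff.rename(columns={'semester': 'unit'})
print(toy_allstaff)
```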

We can visualise the content of **`allstaff`** with the following line:

In [None]:
# Uncomment next line by deleting the # in front...
#display(HTML(allstaff.to_html()))

## 3a. Units per year <a id='nb'></a>

We now plot for all staff the number of UoS that are taught over the year...

In [None]:
ax = allstaff.groupby('unit')['name'].value_counts().unstack(0).plot.bar(stacked=True, width=0.7, 
                                                                    colormap='Set3', figsize=(10,4))
plt.legend(loc='center left', bbox_to_anchor=(1.0, 0.8))
ax.set_xlabel("Teaching staff")
ax.set_ylabel("Number of UoS")
plt.show()
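
The table behind this stacked bar chart can be inspected on a small invented example. The sketch below applies the same `groupby` / `value_counts` / `unstack` chain to hypothetical rows, and compares it with `pd.crosstab`, which is offered here only as an alternative and is not used in this notebook.

```python
import pandas as pd

# Hypothetical rows in the layout of allstaff ('unit' holds the semester label)
toy = pd.DataFrame({'name': ['Bruce', 'Bruce', 'Bruce', 'Rey'],
                    'unit': ['S.1', 'S.1', 'S.2', 'OLE']})

# Same chain as in the plot: count rows per (semester, name),
# then move the semester level to the columns
counts = toy.groupby('unit')['name'].value_counts().unstack(0)

# Staff/semester pairs that never occur are NaN in counts; crosstab gives 0
table = pd.crosstab(toy['name'], toy['unit'])

print(counts)
print(table)
```

Each row of the result is one staff member and each column one semester label, which is what `plot.bar(stacked=True)` turns into stacked segments.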

Another interesting plot compares **permanent versus contracted teaching staff**...

In [None]:
permanents = allstaff[allstaff['position']=='Permanent']
others = allstaff[allstaff['position']=='Other']

ax = permanents.groupby('unit')['name'].value_counts().unstack(0).plot.bar(stacked=True, width=0.7, 
                                                                    colormap='Set2', figsize=(8,4))
plt.legend(loc='center left', bbox_to_anchor=(1.0, 0.8))
ax.set_xlabel("Permanent Academics")
ax.set_ylabel("Number of UoS")
plt.show()

ax = others.groupby('unit')['name'].value_counts().unstack(0).plot.bar(stacked=True, width=0.7, 
                                                                    colormap='Set2', figsize=(4,4))
plt.legend(loc='center left', bbox_to_anchor=(1.0, 0.8))
ax.set_xlabel("Other Academics")
ax.set_ylabel("Number of UoS")
plt.show()

## 3b. Load per semester <a id='nbsem'></a>


### Semester 1:

We first compute a coordination load, set to **5% of the load of each UoS** and credited to its coordinator, as shown in the next cell...

In [None]:
unitsS1['coordload'] = unitsS1['load']*0.05

tmp = unitsS1.groupby( ["coordinator"],as_index=False).agg({"coordload": "sum"})

coordload = tmp.groupby(["coordinator"],as_index=False).agg({"coordload": "sum"})
coordload['coordload'] = coordload['coordload']/100.
coordload = coordload.rename(columns = {'coordinator':'name','coordload':'weight'})
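
As an illustration of the 5% rule, here is a minimal sketch on an invented unit table. The unit codes, coordinators and loads are hypothetical; only the column names follow the unit tables used in this notebook.

```python
import pandas as pd

# Hypothetical unit table with only the columns needed here
toy_units = pd.DataFrame({'code': ['UNIT1001', 'UNIT1002', 'UNIT2001'],
                          'coordinator': ['Bruce', 'Bruce', 'Rey'],
                          'load': [100., 50., 80.]})

# 5% of each unit load goes to its coordinator
toy_units['coordload'] = toy_units['load'] * 0.05

# Sum per coordinator, rescale by 100 and rename to the name/weight layout
toy_coord = toy_units.groupby('coordinator', as_index=False).agg({'coordload': 'sum'})
toy_coord['coordload'] = toy_coord['coordload'] / 100.
toy_coord = toy_coord.rename(columns={'coordinator': 'name', 'coordload': 'weight'})
print(toy_coord)
```

Here the first coordinator is credited 5% of 100 plus 5% of 50, which gives a weight of 0.075 after rescaling.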

This gives the following loads for the coordinators of units in **semester 1**:

In [None]:
colors = np.zeros((40,4))
colors[:20] =  plt.cm.tab20b(np.arange(20))
colors[20:] =  plt.cm.tab20c(np.arange(20))

totcoord = coordload.groupby(['name'])['weight'].sum()
ax = totcoord.sort_values(ascending=True).plot.bar(figsize=(8,5),legend=False, width=0.7, color=colors)
ax.set_xlabel("Teaching staff")
ax.set_ylabel("Coordination load")
plt.show()

In [None]:
# Uncomment next line by deleting the # in front...
#display(HTML(unitsS1.to_html()))

We now extract the staff teaching in semester 1. Note that OLEs and Honours are not considered in this calculation...

In [None]:
rows = []

for k in range(len(staff)):
    name = staff['name'][k]
    df = getStaff(name, plot=False)
    for p in range(len(df)):
        if df['semester'][p] == 'S.1':
            rows.append({'name':df['name'][p],'position':df['position'][p],'unit':df['semester'][p],
                         'weight':df['weight'][p]})

# Rows are collected in a list because DataFrame.append was removed in pandas 2.0
staffS1 = pd.DataFrame(rows, columns=['name','position','unit','weight'])
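
Since `allstaff` already holds one row per staff member and unit, with the semester label in its `unit` column, the same selection can be sketched with a boolean mask instead of a second loop. The rows below are invented for illustration.

```python
import pandas as pd

# Hypothetical rows in the layout of allstaff ('unit' holds the semester label)
toy_allstaff = pd.DataFrame({'name': ['Bruce', 'Bruce', 'Rey', 'Rey'],
                             'position': ['Permanent', 'Permanent', 'Other', 'Other'],
                             'unit': ['S.1', 'S.2', 'S.1', 'OLE'],
                             'weight': [0.5, 0.2, 0.3, 0.1]})

# Keep only the semester 1 rows; S.2 and OLE rows are dropped by the mask
toy_S1 = toy_allstaff[toy_allstaff['unit'] == 'S.1'].reset_index(drop=True)
print(toy_S1)
```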

In [None]:
# Uncomment next line by deleting the # in front...
#display(HTML(staffS1.to_html()))

We now add to the teaching load the coordination load which will give us the **total load**!

In [None]:
teachingload = staffS1.groupby(['name'],as_index=False).agg({"weight": "sum"})
# pd.concat replaces DataFrame.append, which was removed in pandas 2.0
teachingload = pd.concat([teachingload, coordload], ignore_index=True)
totload = teachingload.groupby(['name'])['weight'].sum()
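
The total is obtained by stacking the two tables and summing every row that belongs to the same person. A minimal sketch with invented loads follows; the names and weights are hypothetical, and the third person only coordinates, to show that such staff still appear in the total.

```python
import pandas as pd

# Hypothetical teaching and coordination loads, both with 'name' and 'weight'
toy_teaching = pd.DataFrame({'name': ['Bruce', 'Rey'], 'weight': [0.5, 0.3]})
toy_coord = pd.DataFrame({'name': ['Bruce', 'Finn'], 'weight': [0.075, 0.04]})

# Stack both tables, then add up the rows of each person
stacked = pd.concat([toy_teaching, toy_coord], ignore_index=True)
toy_total = stacked.groupby('name')['weight'].sum()
print(toy_total)
```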

And we plot our results:

In [None]:
#ax = totload.plot.bar(figsize=(10,5),legend=False, width=0.7, color=colors)
#ax.set_xlabel("Teaching staff for semester 1")
#ax.set_ylabel("Total load")
#plt.show()

ax = totload.sort_values(ascending=True).plot.bar(figsize=(10,5),legend=False, width=0.7, color=colors)
ax.set_xlabel("Teaching staff for semester 1")
ax.set_ylabel("Total load")
plt.show()

If we want to plot the staff load without the coordination load we can do it this way:

In [None]:
all_load = staffS1.groupby(['name'],as_index=False).agg({"weight": "sum"})
acadload = all_load.groupby(['name'])['weight'].sum()

ax = acadload.sort_values(ascending=True).plot.bar(figsize=(10,5),legend=False, width=0.7, color=colors)
ax.set_xlabel("All academic staff for semester 1")
ax.set_ylabel("Teaching load")
plt.show()

### Semester 2:

The same steps are repeated for semester 2...

In [None]:
unitsS2['coordload'] = unitsS2['load']*0.05

tmp = unitsS2.groupby( ["coordinator"],as_index=False).agg({"coordload": "sum"})

coordload = tmp.groupby(["coordinator"],as_index=False).agg({"coordload": "sum"})
coordload['coordload'] = coordload['coordload']/100.
coordload = coordload.rename(columns = {'coordinator':'name','coordload':'weight'})

In [None]:
colors = np.zeros((40,4))
colors[:20] =  plt.cm.tab20b(np.arange(20))
colors[20:] =  plt.cm.tab20c(np.arange(20))

totcoord = coordload.groupby(['name'])['weight'].sum()
ax = totcoord.sort_values(ascending=True).plot.bar(figsize=(8,5),legend=False, width=0.7, color=colors)
ax.set_xlabel("Teaching staff")
ax.set_ylabel("Coordination load")
plt.show()

In [None]:
rows = []

for k in range(len(staff)):
    name = staff['name'][k]
    df = getStaff(name, plot=False)
    for p in range(len(df)):
        if df['semester'][p] == 'S.2':
            rows.append({'name':df['name'][p],'position':df['position'][p],'unit':df['semester'][p],
                         'weight':df['weight'][p]})

# Rows are collected in a list because DataFrame.append was removed in pandas 2.0
staffS2 = pd.DataFrame(rows, columns=['name','position','unit','weight'])

teachingload = staffS2.groupby(['name'],as_index=False).agg({"weight": "sum"})
teachingload = pd.concat([teachingload, coordload], ignore_index=True)
totload = teachingload.groupby(['name'])['weight'].sum()

ax = totload.sort_values(ascending=True).plot.bar(figsize=(10,5),legend=False, width=0.7, color=colors)
ax.set_xlabel("Teaching staff for semester 2")
ax.set_ylabel("Total load")
plt.show()

## 3c. Total staff teaching load <a id='nbtot'></a>


#### Adding load for coordination

In [None]:
unitsS1['coordload'] = unitsS1['load']*0.05
unitsS2['coordload'] = unitsS2['load']*0.05
unitsOLE['coordload'] = unitsOLE['load']*0.05
unitsHonors['coordload'] = unitsHonors['load']*0.05

# Coordination load per coordinator in each table, stacked with pd.concat
# (DataFrame.append was removed in pandas 2.0)
tmp1 = pd.concat([units.groupby(["coordinator"],as_index=False).agg({"coordload": "sum"})
                  for units in (unitsS1, unitsS2, unitsOLE, unitsHonors)],
                 ignore_index=True)

coordload = tmp1.groupby(["coordinator"],as_index=False).agg({"coordload": "sum"})
coordload['coordload'] = coordload['coordload']/100.
coordload = coordload.rename(columns = {'coordinator':'name','coordload':'weight'})
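
Because the same calculation is applied to every unit table, it could be wrapped in a small helper. The sketch below is one possible way to do it; `coordination_load` and its `rate` parameter are hypothetical names that do not exist elsewhere in this notebook, and the two tables are invented.

```python
import pandas as pd

def coordination_load(tables, rate=0.05):
    # Hypothetical helper: stack the unit tables, credit rate * load to each
    # coordinator, rescale by 100 and return a name/weight table
    units = pd.concat(tables, ignore_index=True)
    units['weight'] = units['load'] * rate / 100.
    coord = units.groupby('coordinator', as_index=False)['weight'].sum()
    return coord.rename(columns={'coordinator': 'name'})

# Invented stand-ins for the semester 1 and semester 2 unit tables
toy_S1 = pd.DataFrame({'coordinator': ['Bruce', 'Rey'], 'load': [100., 80.]})
toy_S2 = pd.DataFrame({'coordinator': ['Bruce'], 'load': [50.]})

toy_coord = coordination_load([toy_S1, toy_S2])
print(toy_coord)
```

Passing a single table reproduces the per-semester calculation, while passing all tables gives the yearly total.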

In [None]:
colors = np.zeros((40,4))
colors[:20] =  plt.cm.tab20b(np.arange(20))
colors[20:] =  plt.cm.tab20c(np.arange(20))

totcoord = coordload.groupby(['name'])['weight'].sum()
ax = totcoord.sort_values(ascending=True).plot.bar(figsize=(8,5),legend=False, width=0.7, color=colors)
ax.set_xlabel("Teaching staff")
ax.set_ylabel("Coordination load")
plt.show()

Then we can plot the total teaching load for all staff...

In [None]:
teachingload = allstaff.groupby(['name'],as_index=False).agg({"weight": "sum"})
# pd.concat replaces DataFrame.append, which was removed in pandas 2.0
teachingload = pd.concat([teachingload, coordload], ignore_index=True)
totload = teachingload.groupby(['name'])['weight'].sum()

#ax = totload.plot.bar(figsize=(10,5),legend=False, width=0.7, color=colors)
#ax.set_xlabel("Teaching staff")
#ax.set_ylabel("Total load")
#plt.show()

ax = totload.sort_values(ascending=True).plot.bar(figsize=(10,5),legend=False, width=0.7, color=colors)
ax.set_xlabel("Teaching staff")
ax.set_ylabel("Total load")
plt.show()

Or for **full teaching academics** only (the `permanents` subset defined in section 3a, without the coordination load):

In [None]:
academicload = permanents.groupby(['name'],as_index=False).agg({"weight": "sum"})
acadload = academicload.groupby(['name'])['weight'].sum()

ax = acadload.sort_values(ascending=True).plot.bar(figsize=(10,5),legend=False, width=0.7, color=colors)
ax.set_xlabel("Full teaching academic staff")
ax.set_ylabel("Total load")
plt.show()

Or for the other staff (the `others` subset, also without the coordination load)...

In [None]:
othersload = others.groupby(['name'],as_index=False).agg({"weight": "sum"})
othload = othersload.groupby(['name'])['weight'].sum()

ax = othload.sort_values(ascending=True).plot.bar(figsize=(6,5),legend=False, width=0.7, color=colors)
ax.set_xlabel("Teaching academic staff")
ax.set_ylabel("Total load")
plt.show()