This notebook is rendered as HTML. To run it, please click: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/gunerilhan/gunerilhan.github.io/blob/master/img/uni.ipynb)

In [1]:
import pandas as pd
import numpy as np


### From the number of people self-isolating to the number of positive tests

The goal of this code is to convert the Covid-19 [numbers](https://www.kent.ac.uk/coronavirus/update-on-covid-19-cases) provided by the University into a metric that is comparable with government data. The University's reporting has several issues, including the pooling of different campuses and the reliability of the information. However, I want to focus on one issue: the University reports the number of people self-isolating, not the number of people who tested positive as in the government [data](https://coronavirus.data.gov.uk/). The University describes its data as follows:
> Each week we publish the number of cases of students and staff who are **currently** self-isolating following a **positive test** for Covid-19 that have been **reported** to us.

Let's try to convert the number of people self-isolating (a stock variable) into the number of positive tests in a week (a flow variable). To do so, I'll make a bunch of assumptions (hey, I'm a macroeconomist!).

- Let $x_t$ be the number of new positive cases in week $t$.

- Let $m$ be the required number of days of isolation after a positive test. Assume $7<m \leq 14$.

- Let $K_t$ be the number of people isolating at the beginning of week $t$.

- Suppose $K_0=0$ at the beginning of the term.

Then, $K_1 = x_0$, $K_2=x_0+x_1-\frac{(14-m)}{7}x_0$, and so on. (Implicit assumption: an equal number of people test positive on each day of a week.)

The law of motion of $K$ is

$$K_{t+1}=K_t+x_t-\frac{(14-m)}{7}x_{t-1}$$
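The law of motion can be checked with a small round trip: simulate $K$ forward from a known series of weekly cases, then invert the same equation to recover the series. This is a minimal sketch, not part of the original notebook; the function names `simulate_isolating` and `recover_cases` and the values in `weekly_cases` are made up for illustration.

```python
import numpy as np

def simulate_isolating(x, m):
    """Apply K_{t+1} = K_t + x_t - (14 - m) / 7 * x_{t-1}, starting from K_0 = 0."""
    K = np.zeros(len(x) + 1)
    for t in range(len(x)):
        x_prev = x[t - 1] if t > 0 else 0.0  # no cases before the term starts
        K[t + 1] = K[t] + x[t] - (14 - m) / 7 * x_prev
    return K

def recover_cases(K, m):
    """Invert the same equation: x_t = K_{t+1} - K_t + (14 - m) / 7 * x_{t-1}."""
    x = np.zeros(len(K) - 1)
    for t in range(len(x)):
        x_prev = x[t - 1] if t > 0 else 0.0
        x[t] = K[t + 1] - K[t] + (14 - m) / 7 * x_prev
    return x

weekly_cases = np.array([7.0, 14.0, 21.0, 14.0])  # hypothetical values
K = simulate_isolating(weekly_cases, m=10)
recovered = recover_cases(K, m=10)
print(K)          # stock of people isolating, K_0 .. K_4
print(recovered)  # should reproduce weekly_cases up to rounding
```

The round trip only confirms that the inversion is consistent with the law of motion as written; it says nothing about whether the law of motion describes the real data well.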

Here are the numbers reported by the University:


|Date 	|Students living on-campus |	Students living off-campus |	Staff |
|:---   | :---: | :---: | :---: |
|Monday 23 November |23 | 14 |3 | 
|Monday 16 November  |	35 |	22 |	2|
|Monday 9 November 	|31 |18 |0 |
|Monday 2 November 	|16 	|16 |	3|
|Monday 26 October 	|24 	|5 |	1|
|Monday 19 October 	|19 	|7 |	0|
|Monday 12 October 	|4 	|2 	|0|
|Wednesday 7 October |	2 |	6 |	1|

Using these numbers, let's calculate the number of weekly positive tests.

First, we need to create some variables:

In [2]:
dates = ['23/11/2020','16/11/2020','9/11/2020','2/11/2020','26/10/2020','19/10/2020','12/10/2020','7/10/2020']
on_campus, off_campus, staff = [23,35,31,16,24,19,4,2],[14,22,18,16,5,7,2,6],[3,2,0,3,1,0,0,1]

In [3]:
isolating = pd.DataFrame({'Students living on-campus':on_campus,
             'Students living off-campus':off_campus,
             'Staff':staff},index=pd.to_datetime(dates,format="%d/%m/%Y"))
isolating['Total'] = isolating.sum(axis=1)

In [4]:
isolating

Unnamed: 0,Students living on-campus,Students living off-campus,Staff,Total
2020-11-23,23,14,3,40
2020-11-16,35,22,2,59
2020-11-09,31,18,0,49
2020-11-02,16,16,3,35
2020-10-26,24,5,1,30
2020-10-19,19,7,0,26
2020-10-12,4,2,0,6
2020-10-07,2,6,1,9


In [5]:
# Now, assume people isolate for 10 days after a positive test
m = 10

In [6]:
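# Now calculate the weekly case numbers by inverting the law of motion:
#   x_t = K_{t+1} - K_t + (14 - m) / 7 * x_{t-1}
columns = isolating.columns
cases = pd.DataFrame(index=isolating.index[1:])
for col in columns:
    K_t = isolating[col].values[::-1]   # oldest report first
    n = len(K_t)
    x_t = np.zeros(n)
    x_t[0] = K_t[0]                     # K_0 = 0, so the first report equals x_0
    for i in range(1, n):
        x_t[i] = K_t[i] - K_t[i-1] + (14 - m) * x_t[i-1] / 7
    cases[col] = x_t[::-1][:-1]         # newest first again; drop x_0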
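
To see the recursion at work, take the Total column with $m=10$. The first report (7 October) gives $x_0 = 9$, and the next two steps, worked by hand from the reported totals 9, 6 and 26, are:

$$x_1 = 6 - 9 + \frac{4}{7}\cdot 9 = \frac{15}{7} \approx 2.14, \qquad x_2 = 26 - 6 + \frac{4}{7}\cdot\frac{15}{7} = \frac{1040}{49} \approx 21.22$$

Truncated to integers, these are the Total values 2 and 21 in the 2020-10-07 and 2020-10-12 rows of the results table.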

In [7]:
# Calculate total cases among students
cases['Student']=cases[['Students living on-campus','Students living off-campus']].sum(axis=1)

In [8]:
# The number of students and the number of staff
student_number = 19303
staff_number   = 6028

In [9]:
# Rates per 100,000 people, each group divided by its own population
cases['Student, rate'] = cases.Student/student_number*100000
cases['Staff, rate'] = cases.Staff/staff_number*100000

In [10]:
# Here are weekly case numbers
cases.astype(int)

Unnamed: 0,Students living on-campus,Students living off-campus,Staff,Total,Student,"Student, rate","Staff, rate"
2020-11-16,-4,-2,1,-6,-7,-39,27
2020-11-09,12,8,1,22,21,111,18
2020-11-02,15,8,-1,22,23,122,-26
2020-10-26,0,11,2,14,11,60,41
2020-10-19,14,0,0,16,15,79,14
2020-10-12,16,4,0,21,21,111,-4
2020-10-07,3,0,0,2,2,13,-7


Oops! Some of the case numbers are negative, which does not make sense. Maybe my simple model is wrong. What might solve the problem? If the University reported the number of positive cases directly, comparing its rates with those of the surrounding area would be much easier.
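
One quick check is whether the negative values depend on the assumed isolation length $m$. The sketch below reruns the same recursion on the Total column for each $m$ from 8 to 14 and prints the smallest implied weekly count. It is only an exploration under the notebook's own assumptions; the names `implied_cases` and `m_try` are introduced here for illustration.

```python
import numpy as np

# Total isolating per report, oldest first (7 October to 23 November)
total_isolating = np.array([9, 6, 26, 30, 35, 49, 59, 40], dtype=float)

def implied_cases(K, m):
    """Weekly cases implied by x_t = K_{t+1} - K_t + (14 - m) / 7 * x_{t-1}."""
    x = np.zeros(len(K))
    x[0] = K[0]  # first report taken as x_0, as in the notebook
    for i in range(1, len(K)):
        x[i] = K[i] - K[i - 1] + (14 - m) * x[i - 1] / 7
    return x

# Smallest implied weekly count for each candidate isolation length
for m_try in range(8, 15):
    x = implied_cases(total_isolating, m_try)
    print(m_try, round(x.min(), 2))
```

A negative minimum for a given $m$ means the recursion still produces an impossible week under that assumption.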

For comparison, here is a graph of Canterbury [case](https://coronavirus.data.gov.uk/details/cases?areaType=ltla&areaName=Canterbury) rates

![Image](https://gunerilhan.github.io/img/canterbury.png)
    