# Further Programming Project Report

Bayan, Elias, Margherita and Vandam.

In [3]:
from functions import showCV, showP, showCh

In [4]:
# showCV('Essex', '2020-12-30', '2022-04-27')
# showP('Essex')
# showCh()

# Introduction

The main motivation of this project was to gain the ability to compare the cases of COVID-19 with the number of vaccinations for different cities across the UK. 

In order to do this, we created a Python program that outputs graphs showing the number of new and cumulative COVID-19 cases and the number of vaccine doses taken, using data provided by the government. A pie chart is also shown to help visualise the proportion of doses taken. An animated choropleth map displays the intensity of COVID-19 in each city over the entire time range.

To obtain results that were both insightful and aesthetically appealing, we decided to use Plotly. We implemented Dash to make the Python program more interactive. This way, we were able to host the program on a website and make it more accessible to everyone.

Using the output of our code, we decided to analyse the effects of the vaccination rollout on the number of COVID-19 cases. We chose to analyse Essex, comparing two waves. The time range selector was used to identify the different waves, while the choropleth was used to identify the cities with the darkest shade of red (indicating the highest number of cases).

### Choropleth Map

In [None]:
showCh()

# First Wave

In [None]:
showCV('Essex', '2020-11-29', '2021-02-13')

Looking at the 'New Cases' line plot for Essex, we determined that the period of the first wave would be from 29/11/20 until 13/02/21. 

The range slider was used to isolate this period and get specific values for the start, end and peak of this particular wave. This also gave us a clear look at the exponential growth of the vaccine plot lines.

The peak occurs on 29/12/20, with 3,738 new COVID-19 cases, a 77.3% increase from the preceding day. Subsequently, the rate of change of COVID-19 cases decreases slowly and with fluctuations, taking around 2 weeks to become relatively stable.
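The peak date and the day-on-day percentage change quoted above can be read off the graph, but they could also be computed directly. The following is a minimal sketch using pandas; the six daily counts are made up for illustration and are not the Essex figures.

```python
import pandas as pd

# Hypothetical daily counts for illustration only; these are NOT the Essex figures.
new_cases = pd.Series(
    [100, 120, 150, 300, 280, 260],
    index=pd.date_range("2020-01-01", periods=6, freq="D"),
    name="newCasesBySpecimenDate",
)

# Day-on-day percentage change: (today - yesterday) / yesterday * 100.
daily_change_pct = new_cases.pct_change() * 100

peak_date = new_cases.idxmax()               # date of the highest daily count
peak_value = new_cases.loc[peak_date]        # number of new cases on that date
peak_rise = daily_change_pct.loc[peak_date]  # rise from the preceding day, in %

print(f"Peak on {peak_date:%d/%m/%y}: {peak_value} cases, "
      f"{peak_rise:.1f}% up on the previous day")
```

Applied to the filtered ‘newCasesBySpecimenDate’ column for one city, the same three steps would give the peak date, the peak value and the percentage rise on that day.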

As for the vaccines, growth was significantly slower when the two doses were first introduced, showing a general plateau in the number of cases, until around 29/12/20, when we can see a dramatic acceleration.

This highlights an inverse relationship between the rate of change of COVID-19 cases and the number of vaccine doses, suggesting that the decline in COVID-19 cases after the peak could be due to the significant growth in vaccination numbers.

# Second Wave

In [None]:
showCV('Essex', '2021-12-04', '2022-02-26')

We used the ‘New Cases’ line plot again to determine the time range of the second wave. For this instance, the time range lies approximately between 04/12/21 and 26/02/22. 

Using the top graph, titled ‘COVID-19 Cases’, it can be seen that the second-highest peak took place at around the same point in the year as the peak of the first wave for ‘New Cases’. We used the time range selector to zoom in on the wave and analyse its properties effectively.

The graph shows a rapid increase in the number of cases per day, reflecting a high daily rate of change. As we approach the end of the year, the effects of the holiday period become increasingly severe. New cases reach their peak on 29/12/21, in the run-up to New Year’s Eve, a period that usually brings large gatherings. On that day, the number of cases reached 6,390, an increase of approximately 18% from the previous day.

However, the number of new cases then falls very rapidly, reaching a general plateau of roughly 1,000-2,000 cases per day in approximately 1 week. This rapid decrease may be explained by the relatively high number of people who were vaccinated around this time. In fact, the ‘Vaccine Doses’ graph shows that the different vaccine doses follow similar trends, lying at around 0.5 to 1 million vaccinated people.

Compared to the first wave, the second peak was about 1.7 times as high (6,390 new cases against 3,738). Despite this, the time taken for cases to fall back to a low, gradually changing level was almost a week shorter in the second wave. This points to the positive effect of vaccines on mitigating, and responding to, a high peak.

Based on this, it is possible to conclude that as the number of vaccinations increases, high peaks are mitigated faster and more efficiently. This analysis was made possible by the code we created, which enabled us to output and organise relevant information that led to significant conclusions. Both the features of the graphs and their variety mattered, as each graph served a specific purpose that contributed to the final conclusions of the analysis.

# How does our code work?

##### The majority of our Python program can be found in app.py, which does not call upon any functions due to the nature of Dash. For this Jupyter Notebook, however, we have created functions.py. The functions in this file reuse code from our main program, app.py. The only changes are aesthetic ones and the way each function takes its input.

### Where did we get our data from?
- We used data provided by the government, which can be found at https://coronavirus.data.gov.uk/ and downloaded from https://coronavirus.data.gov.uk/details/download

### How did we import that data?
- When downloading the data from the government website, a CSV is presented to the user.
- We used pandas to read the CSV from the link.
- Instead of reading a local CSV, we chose to read the CSV from a URL. 
- Since the COVID-19 data is updated daily, this allowed us to present the most up-to-date data.
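As a rough illustration of the import step, the sketch below reads a CSV with pandas. `pd.read_csv` accepts a URL as well as a local path or file-like object, so the same function works for both. The name `load_covid_data`, the placeholder `CSV_URL` and the sample rows are assumptions made for this example; the real link is the one generated by the download page.

```python
import io

import pandas as pd

# Placeholder only: substitute the CSV link generated by the download page.
CSV_URL = "https://example.invalid/covid-data.csv"


def load_covid_data(source):
    """Read the CSV at `source` (a URL, path or file-like object) into a DataFrame."""
    df = pd.read_csv(source, parse_dates=["date"])
    # Sort oldest-first within each area so later steps can rely on the order.
    return df.sort_values(["areaName", "date"]).reset_index(drop=True)


# Offline demonstration with a two-row stand-in for the real file (values are made up).
sample_csv = io.StringIO(
    "areaCode,areaName,areaType,date,newCasesBySpecimenDate\n"
    "X00000001,Exampleshire,utla,2021-01-02,12\n"
    "X00000001,Exampleshire,utla,2021-01-01,10\n"
)
sample = load_covid_data(sample_csv)
print(sample)

# With network access, the live data would be read in the same way:
# df = load_covid_data(CSV_URL)
```

Reading from the URL each time the program starts means a slower start-up than a local file, but it avoids having to re-download the CSV manually every day.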

### What's in the Spreadsheet? (Column Names)
- areaCode - This is a unique code used to identify a particular city.
- areaName - The name of a city (e.g. Bristol, City of).
- areaType - The same for all cities, since we chose to use the Area Type of Upper Tier Local Authority.
- date - The particular date of data provided.
- cumCasesBySpecimenDate - The cumulative number of COVID-19 cases.
- cumPeopleVaccinatedFirstDoseByVaccinationDate - The number of 1st vaccine doses.
- cumPeopleVaccinatedSecondDoseByVaccinationDate - The number of 2nd vaccine doses.
- cumPeopleVaccinatedThirdInjectionByVaccinationDate - The number of 3rd vaccine doses.
- newCasesBySpecimenDate - The number of new cases on a particular day.

### How did we process that data?
- List of Cities:
  - In order to determine the list of cities, we compiled the values in the ‘areaName’ column of the CSV into a list. 
  - Converted the list into a set to remove duplicates.
  - Then sorted it to make it alphabetical.
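The three steps for the list of cities can be sketched in one line of Python. The small DataFrame below stands in for the real CSV and contains only a handful of example rows.

```python
import pandas as pd

# A few example rows standing in for the 'areaName' column of the real CSV.
df = pd.DataFrame(
    {"areaName": ["Essex", "Bristol, City of", "Essex", "Kent", "Bristol, City of"]}
)

# list -> set (drops duplicates) -> sorted list (alphabetical order)
cities = sorted(set(df["areaName"].tolist()))
print(cities)  # ['Bristol, City of', 'Essex', 'Kent']
```

An equivalent alternative would be `sorted(df["areaName"].unique())`, which skips the intermediate list.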


- Date Range:
  - We grabbed the date values from the ‘date’ column of the spreadsheet.
  - Then we used the .min() and .max() methods to determine the oldest and most recent dates we had data for.
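A minimal sketch of the date range step is shown below. The three dates are illustrative, and converting the column with `pd.to_datetime` is an assumption of this example (ISO-formatted date strings would also compare correctly as plain text).

```python
import pandas as pd

# Illustrative dates only; the real column holds one row per area per day.
df = pd.DataFrame({"date": ["2021-03-01", "2020-12-30", "2022-04-27"]})
df["date"] = pd.to_datetime(df["date"])

oldest = df["date"].min()   # earliest date present in the data
latest = df["date"].max()   # most recent date present in the data
print(oldest.date(), latest.date())
```

These two values are the natural limits for a date picker or range slider.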


- Line Graph 1 (Cumulative and New COVID-19 Cases):
  - In order to plot the graph of COVID-19 Cases for a particular city, we filtered the data for a city's name. This way, we would only have data for that city.
  - We used the filtered ‘date’ values for the x-axis
  - We used the filtered ‘cumCasesBySpecimenDate’ for the number of cumulative cases for that day. (y-axis)
  - We used the filtered ‘newCasesBySpecimenDate’ for the number of new cases for that day. (second y-axis)
  - Then those columns were plotted against each other using Plotly’s Scatter function.


- Line Graph 2 (Vaccine Doses):
  - Similar to the first line graph, using the filtered data.
  - We used the filtered ‘cumPeopleVaccinatedFirstDoseByVaccinationDate’ for the number of 1st Dose Vaccines. (y-axis)
  - We used the filtered ‘cumPeopleVaccinatedSecondDoseByVaccinationDate’ for the number of 2nd Dose Vaccines. (y-axis)
  - We used the filtered ‘cumPeopleVaccinatedThirdInjectionByVaccinationDate’ for the number of 3rd Dose Vaccines. (y-axis)
  - And ‘date’ for the x-axis
  - The data was plotted using Plotly’s Scatter function.


- Pie Charts (Vaccine Doses):
  - The index of the filtered data was reset so that the row labels ascended correctly from 0.
  - The most recent number of vaccine doses was obtained using the .loc[] indexer for each of the vaccination columns.
  - The most recent value for each dose was then put into a list called ‘values’.
  - Plotly’s Pie function was then able to plot ‘values’.
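The data preparation for the pie chart might look like the sketch below. It assumes that the most recent row should sit at label 0 after the index is reset, and that the latest row has no missing values; the helper name `latest_dose_values` and the sample rows are hypothetical.

```python
import pandas as pd

DOSE_COLUMNS = [
    "cumPeopleVaccinatedFirstDoseByVaccinationDate",
    "cumPeopleVaccinatedSecondDoseByVaccinationDate",
    "cumPeopleVaccinatedThirdInjectionByVaccinationDate",
]


def latest_dose_values(filtered):
    """Return the most recent value of each dose column as a list of ints."""
    # Newest row first, then renumber the rows 0, 1, 2, ... so that
    # label 0 always refers to the latest date (assumes no missing values there).
    ordered = filtered.sort_values("date", ascending=False).reset_index(drop=True)
    return [int(ordered.loc[0, column]) for column in DOSE_COLUMNS]


# Made-up rows for two areas on two dates.
df = pd.DataFrame({
    "areaName": ["Otherland", "Exampleshire", "Otherland", "Exampleshire"],
    "date": pd.to_datetime(["2021-01-01", "2021-01-01", "2021-01-02", "2021-01-02"]),
    DOSE_COLUMNS[0]: [50, 10, 60, 20],
    DOSE_COLUMNS[1]: [30, 5, 35, 8],
    DOSE_COLUMNS[2]: [0, 0, 2, 1],
})

# After filtering, the surviving rows keep their old labels (1 and 3),
# which is why the index is reset before using .loc[0, ...].
filtered = df[df["areaName"] == "Exampleshire"]
values = latest_dose_values(filtered)
labels = ["1st Dose", "2nd Dose", "3rd Dose"]
print(dict(zip(labels, values)))
```

The resulting `values` list, together with `labels`, is in the shape that Plotly's `go.Pie(labels=..., values=...)` trace expects.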


- Choropleth Graph:
  - A GeoJSON file of the UK was needed to outline each city on an interactive map.
  - We obtained the GeoJSON file by browsing the geoportal statistics website (https://geoportal.statistics.gov.uk/search?collection=Dataset&sort=name&tags=all(BDY_CTYUA%2CDEC_2021)), which offers downloads of UK geographical boundaries in various file formats. We selected the “Counties and Unitary Authorities (December 2021) UK BUC” map and pushed it to GitHub so that the program would not depend on a local file path.
  - Using the ‘areaCode’ data values, Plotly’s choropleth_mapbox function could match each row of data to the region with the same ‘areaCode’ value in the GeoJSON file.
  - The choropleth doesn't use the filtered data used before; instead, it uses data for all the cities across the UK.
  - Due to the sheer amount of data, only the rows for the first day of every month were kept, to reduce the number of plots.
  - Specifying the animation frames to be the filtered ‘date’ values allowed the choropleth map to be animated.
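The monthly filtering used to thin out the choropleth data can be sketched with pandas as follows. The date range and case counts are made up, and the extra `frame` column (dates formatted as text) is an assumption of this example rather than something taken from app.py.

```python
import pandas as pd

# Made-up daily rows for a single area, spanning parts of three months.
df = pd.DataFrame({"date": pd.date_range("2021-01-30", "2021-03-02", freq="D")})
df["areaCode"] = "X00000001"
df["newCasesBySpecimenDate"] = range(len(df))

# Keep only the rows that fall on the first day of a month.
monthly = df[df["date"].dt.day == 1].copy()

# Text labels read more cleanly than timestamps on an animation slider.
monthly["frame"] = monthly["date"].dt.strftime("%Y-%m-%d")
print(monthly[["frame", "newCasesBySpecimenDate"]])
```

A column such as `frame` could then be passed as the `animation_frame` argument of `plotly.express.choropleth_mapbox`, giving one frame per month instead of one per day.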

# Discussion & Conclusion

INSERT TEXT HERE