## Exploration of COVID-19 vaccines

I have separated them into two categories:

- mRNA vaccines: More information [here](https://www.youtube.com/watch?v=WOvvyqJ-vwo) and [here](https://doi.org/10.1038/nrd.2017.243)

- Viral vector vaccines: More information [here](https://www.youtube.com/watch?v=Q6qk6Wh6cXU) and [here](https://doi.org/10.3390/vaccines2030624)

More information on nucleic-acid vaccines can be found in these two scientific articles: [DNA](https://doi.org/10.1046/j.1365-2796.2003.01140.x), [RNA](https://doi.org/10.1016/j.vaccine.2012.04.060). A summary of most vaccines can also be found [here](https://doi.org/10.1016/S0140-6736(21)00306-8).

Lastly, I added results for vaccines that have not yet been peer-reviewed. The data for each of these vaccines comes from a press release issued by the company developing it. Since this data has not been reviewed by other scientists, it cannot be fully trusted to be accurate.

Each vaccine is listed by its common name (usually the company's name) together with its scientific name, which makes it easier to look up in search engines.

In particular, this notebook will explore the efficacy of the vaccines by replicating their results given the data from phase 3 clinical trials.
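
As a reference for the helper functions defined below, the quantity they compute can be written out explicitly. The symbols are introduced here for illustration and do not come from the trial articles: $c_v$ and $c_p$ are the numbers of cases in the vaccine and placebo groups, and $n_v$ and $n_p$ are the group sizes.

$$
VE = 1 - RR = 1 - \frac{c_v / n_v}{c_p / n_p} = \frac{c_p / n_p - c_v / n_v}{c_p / n_p}
$$

The first form corresponds to `efficacy_risk_ratio` and the last form to `efficacy_relative_risk`, so both should agree up to floating-point rounding. When the two groups have the same size ($n_v = n_p$), the expression reduces to $VE = 1 - c_v / c_p$, which is what `efficacy_simplified` computes.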

#### Helper Functions

In [37]:
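def efficacy_risk_ratio(vaccine_n, infected_vaccine, placebo_n, infected_placebo, risk_only=False):
    """Print the efficacy as 1 - risk ratio, or the risk ratio itself if risk_only is set."""
    vaccine_risk = infected_vaccine / vaccine_n  # attack rate in the vaccine group
    placebo_risk = infected_placebo / placebo_n  # attack rate in the placebo group
    risk_ratio = vaccine_risk / placebo_risk
    if risk_only:
        print("Risk ratio:")
        ve = risk_ratio
    else:
        print("Vaccine Efficacy using relative risk:")
        ve = 1 - risk_ratio
    print(ve * 100)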

In [2]:
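def efficacy_simplified(infected_vaccine, infected_placebo):
    """Print the efficacy from case counts alone; matches the risk-ratio efficacy only for equal group sizes."""
    ratio = infected_vaccine / infected_placebo
    ve = 1 - ratio
    print(ve * 100)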

In [3]:
def efficacy_relative_risk(vaccine_n, infected_vaccine, placebo_n, infected_placebo):
    vaccine_risk = infected_vaccine/vaccine_n
    placebo_risk = infected_placebo/placebo_n
    print("Vaccine Efficacy using relative risk:")
    ve = (placebo_risk - vaccine_risk) / placebo_risk
    print(ve * 100)

In [4]:
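from scipy.stats import beta
def clopper_pearson(i, n, confidence=0.95):
    """Print the Clopper-Pearson interval (in percent) for the proportion (n - i) / n.

    Called below as clopper_pearson(infected_vaccine, infected_placebo), so the
    interval is centered on 1 - infected_vaccine / infected_placebo.
    """
    k = n - i
    alpha = (1 - confidence) / 2
    lo = beta.ppf(alpha, k, n - k + 1)
    hi = beta.ppf(1 - alpha, k + 1, n - k)
    print(f"Confidence Intervals = ({lo * 100},{hi * 100})")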

In [5]:
from scipy.stats.mstats import mquantiles
def probabilistic_simulation(vaccine_n, infected_vaccine, placebo_n, infected_placebo, confidence=0.95):
    """Print the median and interval of the efficacy from 1000 Beta draws per group."""
    # Beta(cases + 1, non-cases + 1): infection probability under a uniform prior
    prob_infection_vaccine = beta.rvs(infected_vaccine + 1, vaccine_n - infected_vaccine + 1, size=1000)
    prob_infection_placebo = beta.rvs(infected_placebo + 1, placebo_n - infected_placebo + 1, size=1000)
    simulated_cases = 100 * (1 - prob_infection_vaccine / prob_infection_placebo)
    alpha = (1 - confidence) / 2
    lo_c, hi_c = [alpha, 1 - alpha]
    lo, med, hi = mquantiles(simulated_cases, prob=[lo_c, 0.5, hi_c])
    print("Vaccine Efficacy using simulation:")
    print(med)
    print(f"Confidence Intervals = ({lo},{hi})")

# mRNA Vaccines

## Pfizer-BioNTech vaccine - BNT162b2

Link to the scientific article [here](https://www.nejm.org/doi/full/10.1056/NEJMoa2034577)

## Data summary

Published on December 31, 2020

**Trial timeframe**: July 27, 2020, to November 14, 2020 (110 days)

- Interim analysis of the trial. Further publications are expected, with longer follow-up and results broken down by specific groups.

### Initial data

Doses were given 21 days apart.

**Participants**: 43,448 people. Age: 16 years or older

**Vaccinated**: 21,720

**Placebo**: 21,728

--------

### Median follow-up of 2 months

5,742 people were lost in the process (e.g. did not show up for vaccination, forgot about the follow-up, etc.)

#### First dose (day 1):

**Participants**: 37,706 people.

**Received dose 1**: 18,860 people.

**Received placebo dose 1**: 18,846 people.

#### People lost after first dose:

620 people were lost in the process after dose 1.

**Vaccinated**: 304 people.

- People with adverse effects: 28

- People who died: 1

**Placebo**: 316 people.

- People with adverse effects: 18

- People who died: 1

--------

#### Second dose (day 22):

**Participants**: 37,086 people.

**Received dose 2**: 18,556 people.

**Received placebo dose 2**: 18,530 people.

#### People lost after second dose:

143 people were lost in the process after dose 2.

**Vaccinated**: 48 people.

- People who died: 1

**Placebo**: 95 people.

- People who died: 2

People died from causes unrelated to COVID-19.

**Severe COVID-19 cases**: 10 (1 in vaccine group)


## Vaccine Efficacy (VE) of BNT162b2

To calculate the VE, the trial counted the COVID-19 cases that occurred from 7 days after the second dose onward. Keep in mind that the estimated median viral incubation period is 5 days. Also, the data used to calculate the VE differs slightly from the figures above, because the analysis required specific criteria:

- Ages between 16 and 64
- Excluded some people with preconditions (illness)

The excluded participants will be analysed separately by Pfizer at a later date.

In [6]:
vaccine_n = 18198
infected_vaccine = 8
placebo_n = 18325
infected_placebo = 162

In [7]:
efficacy_risk_ratio(vaccine_n, infected_vaccine, placebo_n, infected_placebo)

Vaccine Efficacy using relative risk:
95.02726524010913


In [8]:
efficacy_relative_risk(vaccine_n, infected_vaccine, placebo_n, infected_placebo)

Vaccine Efficacy using relative risk:
95.02726524010914


In [9]:
efficacy_simplified(infected_vaccine, infected_placebo)

95.06172839506173


Computing the confidence intervals:

The article states that the Clopper-Pearson method was used and cites a book for it.

For a more accessible description of the method, I suggest this [scientific article](https://doi.org/10.1111/sjos.12021); the `clopper_pearson` helper used below follows it.

The confidence interval obtained here does not exactly match the one reported by Pfizer, because theirs is adjusted for surveillance time. The adjustment is not explained in the article, so I could not replicate it.
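
One common way to bring exposure into such an interval is to apply Clopper-Pearson to the share of all cases that fall in the vaccine group and then convert that share to an efficacy. The sketch below does this with the standard library only (bisection on the binomial CDF instead of `beta.ppf`). It is not Pfizer's documented procedure: the function names are hypothetical, and the ratio of group sizes is used as a stand-in for the surveillance-time ratio, which this notebook does not have.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k + 1))

def solve_monotone(f, target, increasing):
    """Solve f(p) = target for p in [0, 1] by bisection, with f monotone in p."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if (f(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def exact_interval(k, n, confidence=0.95):
    """Clopper-Pearson interval for the proportion k/n."""
    alpha = (1 - confidence) / 2
    # Lower limit solves P(X >= k) = alpha; upper limit solves P(X <= k) = alpha.
    lower = 0.0 if k == 0 else solve_monotone(lambda p: 1 - binom_cdf(k - 1, n, p), alpha, True)
    upper = 1.0 if k == n else solve_monotone(lambda p: binom_cdf(k, n, p), alpha, False)
    return lower, upper

def efficacy_interval(vaccine_n, infected_vaccine, placebo_n, infected_placebo, confidence=0.95):
    """Efficacy interval in percent; assumes at least one case in the placebo group."""
    total_cases = infected_vaccine + infected_placebo
    share_lo, share_hi = exact_interval(infected_vaccine, total_cases, confidence)
    # Assumption: group sizes stand in for the surveillance time of each group.
    exposure_ratio = placebo_n / vaccine_n
    to_ve = lambda share: 100 * (1 - exposure_ratio * share / (1 - share))
    # A larger share of cases in the vaccine group means a lower efficacy.
    return to_ve(share_hi), to_ve(share_lo)

lower, upper = efficacy_interval(18198, 8, 18325, 162)
print(f"Efficacy interval = ({lower:.1f}%, {upper:.1f}%)")
```

Unlike the `clopper_pearson` helper, this version takes the group sizes into account, so it remains usable when the vaccine and placebo groups differ in size.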

In [10]:
clopper_pearson(infected_vaccine, infected_placebo)

Confidence Intervals = (90.50171730823362,97.84418800946054)


In [11]:
probabilistic_simulation(vaccine_n, infected_vaccine, placebo_n, infected_placebo)

Vaccine Efficacy using simulation:
94.73012874728929
Confidence Intervals = (89.97904400940322,97.56048127504116)


## Moderna vaccine - mRNA-1273

Link to the scientific article [here](https://www.nejm.org/doi/10.1056/NEJMoa2035389)

## Data summary

Published on February 4, 2021

**Trial timeframe**: July 27, 2020, to November 25, 2020 (121 days)

- This is also an interim analysis of the trial.

### Initial data

Doses were given 28 days apart.

**Participants**: 30,420 people. Age: 18 years or older

**Vaccinated**: 15,210

**Placebo**: 15,210

--------

### Median follow-up of 63 days

69 people were lost in the process (e.g. did not show up for vaccination, forgot about the follow-up, etc.)

#### First dose (day 1):

**Participants**: 30,351 people.

**Received vaccine dose 1**: 15,181 people.

**Received placebo dose 1**: 15,170 people.

#### People lost after first dose:

1,203 people were excluded in the process after dose 1. This means that they were positive for COVID-19 virus at baseline, had missing data, or were lost in the process.

**Vaccinated**: 631 people.

**Placebo**: 572 people.

--------

#### Second dose (day 29):

**Participants**: 29,148 people.

**Received vaccine dose 2**: 14,550 people.

**Received placebo dose 2**: 14,598 people.

#### People lost after second dose (Per-protocol analysis):

941 people were lost in the process after dose 2.

**Vaccinated**: 416 people.

**Placebo**: 525 people.

All the people who got the second dose (including lost cases) were included in a separate analysis (Modified intention-to-treat analysis).

**Fatalities**: 1 person.

**Severe COVID-19 cases**: 30 people.

All severe cases and fatalities happened in the placebo group.


## Vaccine Efficacy (VE) of mRNA-1273

To calculate the VE, the trial counted the COVID-19 cases that occurred from 7 days after the second dose onward for the per-protocol analysis, and from 14 days after the second dose onward for the intention-to-treat analysis. Keep in mind that the estimated median viral incubation period is 5 days. The data used to calculate the VE comes from the per-protocol population, that is, the participants remaining after the losses that followed the second dose.

This data also includes people older than 65, contrary to Pfizer. Moderna also uses the [Cox proportional hazards model](https://dx.doi.org/10.1128%2FAAC.48.8.2787-2792.2004), which unfortunately requires the participant-level trial data to fit the regression. This model provides the hazard ratio ($HR$), from which the efficacy follows as $VE = 1 - HR$. The model also provides confidence intervals, which we cannot replicate; we therefore use the simpler Clopper-Pearson method to approximate them and also run a probabilistic simulation of the data.
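
To make the probabilistic simulation concrete, here is a minimal sketch of the same idea using only the standard library. Like `probabilistic_simulation`, it draws each group's infection probability from a Beta distribution (cases + 1, non-cases + 1, which corresponds to a uniform prior) and summarises the resulting efficacy values. The function name, the fixed seed, and the number of draws are assumptions made for this illustration, so the numbers will not match the cell outputs exactly.

```python
import random

def simulate_efficacy(vaccine_n, infected_vaccine, placebo_n, infected_placebo,
                      confidence=0.95, draws=20000, seed=0):
    """Return (lower, median, upper) efficacy in percent from Beta draws."""
    rng = random.Random(seed)  # fixed seed (an assumption) so reruns give the same numbers
    samples = []
    for _ in range(draws):
        # One plausible infection probability per group
        p_vaccine = rng.betavariate(infected_vaccine + 1, vaccine_n - infected_vaccine + 1)
        p_placebo = rng.betavariate(infected_placebo + 1, placebo_n - infected_placebo + 1)
        samples.append(100 * (1 - p_vaccine / p_placebo))
    samples.sort()
    alpha = (1 - confidence) / 2
    # Empirical quantile: the sorted draw at the requested position
    pick = lambda q: samples[min(draws - 1, int(q * draws))]
    return pick(alpha), pick(0.5), pick(1 - alpha)

# Per-protocol counts used in the cells of this section
lower, median, upper = simulate_efficacy(14134, 11, 14073, 185)
print(f"median = {median:.1f}%, interval = ({lower:.1f}%, {upper:.1f}%)")
```

Strictly speaking, an interval obtained this way is a Bayesian credible interval rather than a frequentist confidence interval, which is one reason it differs slightly from the Clopper-Pearson result.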

### Per-protocol analysis

In [12]:
vaccine_n = 14134
infected_vaccine = 11
placebo_n = 14073
infected_placebo = 185

efficacy_risk_ratio(vaccine_n, infected_vaccine, placebo_n, infected_placebo)

Vaccine Efficacy using relative risk:
94.07971577067374


In [13]:
clopper_pearson(infected_vaccine, infected_placebo)

Confidence Intervals = (89.61026975138763,96.9947520495872)


In [14]:
probabilistic_simulation(vaccine_n, infected_vaccine, placebo_n, infected_placebo)

Vaccine Efficacy using simulation:
93.58831595971176
Confidence Intervals = (88.99624925771101,96.74944138428421)


### Intention-to-treat analysis

In [15]:
vaccine_n = 14550
infected_vaccine = 12
placebo_n = 14598
infected_placebo = 204

efficacy_risk_ratio(vaccine_n, infected_vaccine, placebo_n, infected_placebo)

Vaccine Efficacy using relative risk:
94.09824135839902


In [16]:
clopper_pearson(infected_vaccine, infected_placebo)

Confidence Intervals = (89.95021537999554,96.92389201601304)


In [17]:
probabilistic_simulation(vaccine_n, infected_vaccine, placebo_n, infected_placebo)

Vaccine Efficacy using simulation:
93.73372589914821
Confidence Intervals = (89.13599095360979,96.75586694726483)


# Viral vector Vaccines

## Oxford-AstraZeneca vaccine (ChAdOx1) - AZD1222

Link to the scientific article [here](https://doi.org/10.1016/S0140-6736(20)32661-1)

Latest preprint (non-peer-reviewed) article [here](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3777268)

## Data summary

Published on February 4, 2021

The article includes details of phase 1 trials that are not included in the analysis. Therefore, we focus on the phase 3 cohorts (COV002 and COV003), which took place in the UK and Brazil.

**Timeframe for all cohorts**: May 28, 2020, to Nov 21, 2020 (177 days)

### Initial data

**Participants**: 23,848 people

**Vaccinated**: 12,082 people

**Placebo**: 11,766 people

- This is also an interim analysis of the trial.

### Interim data

12,212 people were excluded from the interim analysis (23,848 initial participants minus the 11,636 in the cumulative cohorts below).

### **Cumulative cohorts**:

**Participants**: 11,636 people. Age: 18 years or older

**Vaccinated**: 5,807 people.

**Placebo**: 5,829 people.

### **COV002** (UK):

Two dosage groups: LD/SD and SD/SD

- LD stands for Low Dose and SD for Standard Dose

**LD/SD timeframe**: Started on May 31, 2020

**SD/SD timeframe**: Started on June 9, 2020

**LD/SD Participants**: 2,741 people. Age: 18 years or older

- **Vaccinated**: 1,367 people.

- **Placebo**: 1,374 people.

**SD/SD Participants**: 4,807 people. Age: 18 years or older

- **Vaccinated**: 2,377 people.

- **Placebo**: 2,430 people.

Median follow-up of 84 days (LD/SD group) and 69 days (SD/SD group)

### **COV003** (Brazil):

**Trial timeframe**: Started on June 23, 2020

**Participants**: 4,088 people. Age: 18 years or older

**Vaccinated**: 2,063 people.

**Placebo**: 2,025 people.

Median follow-up of 36 days

--------

**Fatalities**: 1 person.

**Severe COVID-19 cases**: 2 people.

**Hospitalisation**: 18 people.

2 hospitalisations happened in the vaccine group. All severe COVID-19 cases and fatalities happened in the placebo group.

## Vaccine Efficacy (VE) of AZD1222

To calculate the VE, the trial counted the COVID-19 cases that occurred from 14 days after the second dose onward. Keep in mind that the estimated median viral incubation period is 5 days. The data used to calculate the VE comes from all the cohorts combined.

This data also includes people older than 56. AstraZeneca uses a [Poisson regression with robust variance analysis](https://doi.org/10.1093/aje/kwh090), which unfortunately requires the participant-level trial data to fit the regression. The model provides confidence intervals, which we cannot replicate; we therefore use the simpler Clopper-Pearson method to approximate them.

In [38]:
vaccine_n = 5807
infected_vaccine = 30
placebo_n = 5829
infected_placebo = 101

efficacy_risk_ratio(vaccine_n, infected_vaccine, placebo_n, infected_placebo)

Vaccine Efficacy using relative risk:
70.18449907673737


In [19]:
clopper_pearson(infected_vaccine, infected_placebo)

Confidence Intervals = (60.3852871352311,78.98077935048298)


In [20]:
probabilistic_simulation(vaccine_n, infected_vaccine, placebo_n, infected_placebo)

Vaccine Efficacy using simulation:
69.45996322948899
Confidence Intervals = (55.87004490098043,80.40055640437863)


## The AstraZeneca vaccine has also been the target of much doubt in the media

### Example: thrombosis events (blood clots)

The data can be found in the Supplementary Material section of the scientific article linked above, specifically on page 20 of this [linked document](https://www.thelancet.com/cms/10.1016/S0140-6736(20)32661-1/attachment/75bff1ea-804f-4c66-adc1-2f7d0f9b4550/mmc1.pdf).

Previously, the vaccine efficacy measured how well the vaccine prevents COVID-19. Here the same calculation is applied to thrombosis events, so the result measures the relative reduction in blood clots in the vaccine group compared with the placebo group.
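
For a safety outcome it can be easier to read the risk ratio directly: a value below 1 means fewer events in the vaccine group, and a value above 1 means more. The sketch below computes the risk ratio with an approximate interval on the log scale (a normal approximation, sometimes called the Katz method). This is a different technique from the helpers in this notebook, and the function name is hypothetical.

```python
from math import exp, sqrt
from statistics import NormalDist

def risk_ratio_interval(vaccine_n, events_vaccine, placebo_n, events_placebo, confidence=0.95):
    """Return (risk_ratio, lower, upper) using a normal approximation on the log scale."""
    risk_ratio = (events_vaccine / vaccine_n) / (events_placebo / placebo_n)
    # Standard error of log(risk ratio); needs at least one event in each group
    se = sqrt(1 / events_vaccine - 1 / vaccine_n + 1 / events_placebo - 1 / placebo_n)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return risk_ratio, risk_ratio * exp(-z * se), risk_ratio * exp(z * se)

# Thrombosis counts from the supplementary material used in this section
rr, lower, upper = risk_ratio_interval(5807, 4, 5829, 8)
print(f"risk ratio = {rr:.2f}, interval = ({lower:.2f}, {upper:.2f})")
```

With only 4 and 8 events the interval is wide and includes 1, so these counts alone do not show a difference in either direction.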

In [47]:
vaccine_n = 5807
thrombosis_vaccine = 4
placebo_n = 5829
thrombosis_placebo = 8

efficacy_risk_ratio(vaccine_n, thrombosis_vaccine, placebo_n, thrombosis_placebo)

Vaccine Efficacy using relative risk:
49.81057344584122


In [48]:
clopper_pearson(thrombosis_vaccine, thrombosis_placebo)

Confidence Intervals = (15.701277048705803,84.2987229512942)


In [49]:
probabilistic_simulation(vaccine_n, thrombosis_vaccine, placebo_n, thrombosis_placebo)

Vaccine Efficacy using simulation:
44.40061000186899
Confidence Intervals = (-62.12550409115327,85.19205170053736)


## Sputnik V vaccine - Gam-COVID-Vac (rAd26,rAd5)

Link to the scientific article [here](https://doi.org/10.1016/S0140-6736(21)00234-8)

## Data summary

Published on February 2, 2021

**Trial timeframe**: Sept 7, 2020, to Nov 24, 2020 (78 days)

- This is also an interim analysis of the trial.

### Initial data

Doses were given 21 days apart.

**Participants**: 21,977 people. Age: 18 years or older (max 87)

**Vaccinated**: 16,501

**Placebo**: 5,476

--------

### Median follow-up of 48 days

115 people were lost in the process (e.g. did not show up for vaccination, forgot about the follow-up, etc.)

#### First dose (day 1):

**Participants**: 21,862 people.

**Received vaccine dose 1**: 16,427 people.

**Received placebo dose 1**: 5,435 people.

#### People lost after first dose:

1,996 people were excluded in the process after dose 1. This means that they were positive for COVID-19 virus at baseline, had missing data, or were lost in the process.

**Vaccinated**: 1,463 people.

**Placebo**: 533 people.

--------

#### Second dose (day 22):

**Participants**: 19,866 people.

**Received vaccine dose 2**: 14,964 people.

**Received placebo dose 2**: 4,902 people.

#### People lost after second dose:

7,570 people were lost in the process after dose 2.

**Vaccinated**: 5,706 people.

**Placebo**: 1,864 people.

All the people who got the second dose (including lost cases) were included in the efficacy analysis.

**Severe COVID-19 cases**: 20 people.

All severe cases happened in the placebo group.


## Vaccine Efficacy (VE) of Gam-COVID-Vac

To calculate the VE, the trial counted the COVID-19 cases that occurred from 21 days after the first dose (the day of the second dose) onward. Keep in mind that the estimated median viral incubation period is 5 days.

This data also includes people older than 60. I do not know which model was used to calculate the confidence intervals, so I use the simpler Clopper-Pearson method to approximate them.

In [21]:
vaccine_n = 14964
infected_vaccine = 16
placebo_n = 4902
infected_placebo = 62

efficacy_risk_ratio(vaccine_n, infected_vaccine, placebo_n, infected_placebo)

Vaccine Efficacy using relative risk:
91.54616240266962


In [22]:
clopper_pearson(infected_vaccine, infected_placebo)

Confidence Intervals = (61.5027546671032,84.47300065357939)


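As you can see, the vaccine efficacy lies outside the Clopper-Pearson confidence interval. This happens because `clopper_pearson` only receives the two case counts, which implicitly assumes equal group sizes, while here the vaccine group is about three times larger than the placebo group. The interval is therefore centered on $1 - 16/62 \approx 74\%$ rather than on the risk-ratio efficacy. A more conservative efficacy could be taken from within this range, keeping in mind that it understates the risk-ratio estimate for the same reason.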
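
To see where the gap comes from, the short sketch below compares the efficacy implied by the case counts alone (which is what `clopper_pearson` is centered on) with the efficacy once the unequal group sizes are taken into account. The variable names are chosen for this illustration only.

```python
vaccine_n, infected_vaccine = 14964, 16
placebo_n, infected_placebo = 4902, 62

case_ratio = infected_vaccine / infected_placebo  # all that clopper_pearson receives
size_ratio = placebo_n / vaccine_n                # the placebo group is roughly a third of the vaccine group

ve_counts_only = 100 * (1 - case_ratio)               # assumes equal group sizes
ve_risk_ratio = 100 * (1 - case_ratio * size_ratio)   # accounts for the group sizes

print(f"Efficacy from case counts alone: {ve_counts_only:.1f}%")
print(f"Efficacy from the risk ratio:    {ve_risk_ratio:.1f}%")
```

The first value sits inside the Clopper-Pearson interval above and the second matches the output of `efficacy_risk_ratio`, so the difference is explained by the group sizes rather than by the vaccine itself.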

In [23]:
print("Conservative Vaccine Efficacy:")
low_ci = 0.615027546671032
high_ci = 0.8447300065357939
cve = (low_ci + high_ci) * 50
print(cve)

Conservative Vaccine Efficacy:
72.9878776603413


We can compare it to the simulated data:

In [24]:
probabilistic_simulation(vaccine_n, infected_vaccine, placebo_n, infected_placebo)

Vaccine Efficacy using simulation:
91.34818721914768
Confidence Intervals = (85.88302629677906,94.97242928659664)


# Non-peer-reviewed vaccines (press releases)

## Johnson & Johnson Janssen vaccine - Ad26.COV2.S (JNJ-78436735)

Link to the press release [here](https://www.jnj.com/johnson-johnson-announces-single-shot-janssen-covid-19-vaccine-candidate-met-primary-endpoints-in-interim-analysis-of-its-phase-3-ensemble-trial)

Protocol [here](https://www.jnj.com/coronavirus/ensemble-1-study-protocol)

### Mild-Moderate

In [25]:
vaccine_n = 21895
infected_vaccine = 66
placebo_n = 21888
infected_placebo = 193

efficacy_risk_ratio(vaccine_n, infected_vaccine, placebo_n, infected_placebo)

Vaccine Efficacy using relative risk:
65.8140418175773


In [26]:
clopper_pearson(infected_vaccine, infected_placebo)

Confidence Intervals = (58.64776232985921,72.46444585712413)


In [27]:
probabilistic_simulation(vaccine_n, infected_vaccine, placebo_n, infected_placebo)

Vaccine Efficacy using simulation:
65.94608572116161
Confidence Intervals = (54.54184747042848,74.16487342476853)


### Severe

In [28]:
vaccine_n = 21895
infected_vaccine = 5
placebo_n = 21888
infected_placebo = 34

efficacy_risk_ratio(vaccine_n, infected_vaccine, placebo_n, infected_placebo)

Vaccine Efficacy using relative risk:
85.2988192308209


In [29]:
clopper_pearson(infected_vaccine, infected_placebo)

Confidence Intervals = (68.94342696048746,95.04715447438227)


In [30]:
probabilistic_simulation(vaccine_n, infected_vaccine, placebo_n, infected_placebo)

Vaccine Efficacy using simulation:
83.57070756582849
Confidence Intervals = (63.412322267664024,93.54657358982605)


## Novavax vaccine - NVX-CoV2373

Link to the press release [here](https://ir.novavax.com/news-releases/news-release-details/novavax-covid-19-vaccine-demonstrates-893-efficacy-uk-phase-3)

In [31]:
vaccine_n = 15000/2
infected_vaccine = 6
placebo_n = 15000/2
infected_placebo = 56

efficacy_risk_ratio(vaccine_n, infected_vaccine, placebo_n, infected_placebo)

Vaccine Efficacy using relative risk:
89.28571428571428


In [32]:
clopper_pearson(infected_vaccine, infected_placebo)

Confidence Intervals = (78.12435281547205,95.96520538091778)


In [33]:
probabilistic_simulation(vaccine_n, infected_vaccine, placebo_n, infected_placebo)

Vaccine Efficacy using simulation:
88.31627685641676
Confidence Intervals = (75.67528290381448,95.34086845240574)


## Covaxin vaccine - BBV152

Link to the press release [here](https://www.bharatbiotech.com/images/press/covaxin-phase3-efficacy-results.pdf)

In [34]:
vaccine_n = 25800/2
infected_vaccine = 7
placebo_n = 25800/2
infected_placebo = 36

efficacy_risk_ratio(vaccine_n, infected_vaccine, placebo_n, infected_placebo)

Vaccine Efficacy using relative risk:
80.55555555555556


In [35]:
clopper_pearson(infected_vaccine, infected_placebo)

Confidence Intervals = (63.97519838249748,91.80563602347911)


In [36]:
probabilistic_simulation(vaccine_n, infected_vaccine, placebo_n, infected_placebo)

Vaccine Efficacy using simulation:
79.67622786143498
Confidence Intervals = (57.065395614050686,91.98938970807016)
