#Correlation between universal BCG vaccination policy and reduced mortality for COVID-19

Authors: Aaron Miller, Mac Josh Reandelar, Kimberly Fasciglione, Violeta Roumenova, Yan Li, Gonzalo H Otazu
doi: https://doi.org/10.1101/2020.03.24.20042937 - Posted September 14, 2020.

The authors propose that national differences in COVID-19 impact could be partially explained by different national policies with respect to Bacillus Calmette-Guerin (BCG) vaccination. 

BCG vaccination has been reported to offer broad protection from other respiratory infections besides tuberculosis. They compared BCG vaccination policies with the morbidity and mortality for COVID-19 for middle-high and high-income countries. They found that countries without universal policies of BCG vaccination (Italy, the Netherlands, USA) have been more severely affected compared to countries with universal and long-standing BCG policies.

The difference cannot be accounted for by differences in disease onset, adoption of early social distancing policies, state of health services, nor income level. Reduced mortality suggests BCG vaccination could be a potential new tool in the fight against COVID-19.
https://www.medrxiv.org/content/10.1101/2020.03.24.20042937v2

#BCG Vaccination Protects against Experimental Viral Infection in Humans through the Induction of Cytokines Associated with Trained Immunity


Authors: Rob J.W. Arts, Simone J.C.F.M. Moorlag, Boris Novakovic, Yang Li, Shuang-Yin Wang, Marije Oosting, Vinod Kumar, Ramnik J. Xavier, Cisca Wijmenga, Leo A.B. Joosten, Chantal B.E.M. Reusken, Christine S. Benn, Peter Aaby, Marion P. Koopmans, Hendrik G. Stunnenberg, Reinout van Crevel, Mihai G. Netea. https://doi.org/10.1016/j.chom.2017.12.010



BCG vaccination of humans induces genome-wide epigenetic reprogramming in monocytes.


BCG-induced changes correlate with protection against experimental virus infection.


Viremia reduction correlates with IL-1β upregulation, indicative of trained immunity.


SNPs in IL1B affect the induction of trained immunity by BCG.

![](https://ars.els-cdn.com/content/image/1-s2.0-S1931312817305462-fx1.jpg)
https://www.sciencedirect.com/science/article/pii/S1931312817305462

In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import matplotlib.pyplot as plt 
import seaborn as sns
%matplotlib inline
import plotly.express as px
import plotly.graph_objects as go
import plotly.offline as py

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 20GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

Although COVID-19 deaths are concentrated in the elderly population, recent BCG vaccination in children might play a role in reducing mortality by decreasing transmission of the disease to the vulnerable population. Asymptomatic children in Germany, where BCG has not been given to children since 1998, had viral loads as high as those of adults, which suggests that unvaccinated children may be as infectious as adults. A reduction of the viral load in children might be a plausible mechanism by which BCG vaccination in children could reduce infections and mortality in a country.

A recent study determined that BCG vaccination in childhood did not reduce COVID-19 infection in young adults, a population with reduced mortality due to COVID-19. Patients with bladder cancer that have been treated with multiple doses of BCG constitute a population that is worth studying for possible protective effects against COVID-19.

If BCG were protective against COVID-19, why did COVID-19 spread in China despite a universal BCG policy in place since the 1950s?

There is still no proof that BCG inoculation at old age would boost defenses in elderly humans against COVID-19, but it seems to do so in guinea pigs against M. tuberculosis.
https://www.medrxiv.org/content/10.1101/2020.03.24.20042937v2.full.pdf

In [None]:
from colorama import Fore, Style

nRowsRead = 1000 # set to None to read the whole file
# Only the first nRowsRead rows of the BCG strain file are loaded for this preview
df = pd.read_csv('../input/hackathon/task_2-BCG_strain_per_country-8Nov2020.csv', delimiter=',', nrows = nRowsRead)
df.dataframeName = 'task_2-BCG_strain_per_country-8Nov2020.csv'
nRow, nCol = df.shape
print(f'There are {nRow} rows and {nCol} columns')
print(Fore.RED + 'Data shape: ',Style.RESET_ALL,df.shape)
df.head()

BCG vaccination has been shown to produce broad protection against viral infections and sepsis, raising the possibility that the protective effect of BCG may not be directly related to actions on COVID-19 but to associated co-occurring infections or sepsis.

The authors also found that BCG vaccination correlated with a reduction in the number of reported COVID-19 infections in a country, suggesting that BCG might confer some protection specifically against COVID-19. The broad use of the BCG vaccine across a population could reduce the number of carriers and, combined with other measures, could act to slow down or stop the spread of COVID-19.

However, an alternative explanation would be that COVID-19 in a person with BCG vaccination would have a milder presentation, reducing the possibility that such a case would even be detected in the first place.
https://www.medrxiv.org/content/10.1101/2020.03.24.20042937v2.full.pdf

In [None]:
df.isnull().sum()

#Handling Missing Values

In [None]:
# categorical features with missing values
categorical_nan = [feature for feature in df.columns if df[feature].isna().sum()>0 and df[feature].dtypes=='O']
print(categorical_nan)

In [None]:
# replacing missing values in categorical features
for feature in categorical_nan:
    df[feature] = df[feature].fillna('None')

In [None]:
df[categorical_nan].isna().sum()
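
The fill above only touches object-typed columns, so columns of any other dtype can still contain NaN afterwards. A minimal sketch of a per-column report that makes this visible; the toy frame, its column names, and the helper `missing_report` are hypothetical stand-ins, not part of the dataset:

```python
import numpy as np
import pandas as pd

# Toy frame standing in for df; column names and values are made up.
toy = pd.DataFrame({
    "strain": ["X", None, "Y", None],
    "year_changed": [1998.0, np.nan, 2005.0, 1975.0],
    "country": ["A", "B", "C", "D"],
})

def missing_report(frame):
    """Summarise missing values per column, worst first."""
    report = pd.DataFrame({
        "dtype": frame.dtypes.astype(str),
        "n_missing": frame.isna().sum(),
        "pct_missing": (frame.isna().mean() * 100).round(1),
    })
    # Keep only the columns that actually have gaps
    return report[report["n_missing"] > 0].sort_values("n_missing", ascending=False)

report = missing_report(toy)
print(report)
```

Running the same helper on `df` before and after the categorical fill would show which columns still need attention.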

In [None]:
#Code from Gabriel Preda
#plt.style.use('dark_background')
def plot_count(feature, title, df, size=1):
    f, ax = plt.subplots(1,1, figsize=(4*size,4))
    total = float(len(df))
    g = sns.countplot(x=df[feature], order=df[feature].value_counts().index[:20], palette='bone', ax=ax)
    g.set_title("Number and percentage of {}".format(title))
    if(size > 2):
        plt.xticks(rotation=90, size=8)
    for p in ax.patches:
        height = p.get_height()
        ax.text(p.get_x()+p.get_width()/2.,
                height + 3,
                '{:1.2f}%'.format(100*height/total),
                ha="center") 
    plt.show()

#Other columns were plotted one week ago in https://www.kaggle.com/mpwolke/azerbaijan-bcg-strain-1nov2020

In [None]:
plot_count("Source_apart_from_bcgatlas", "Source_apart_from_bcgatlas", df,4)

#Code by Carl McBride Ellis https://www.kaggle.com/carlmcbrideellis/absolute-beginners-titanic-eda-using-dabl

In [None]:
!pip install dabl
import dabl

In [None]:
dabl.detect_types(df)

#dabl does not plot anything for a target with 50 or more classes, so I chose the columns that could be plotted rather than the ones I wanted.
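
Given that limit, one way to shortlist candidate targets is to count the distinct values per column first. A minimal sketch, assuming the threshold of 50 mentioned in the note above; the toy frame and the helper name `plottable_targets` are hypothetical:

```python
import pandas as pd

def plottable_targets(frame, max_classes=50):
    """Return columns with at least 2 and fewer than max_classes distinct values."""
    n_unique = frame.nunique(dropna=True)
    keep = n_unique[(n_unique >= 2) & (n_unique < max_classes)]
    return keep.sort_values().index.tolist()

# Toy frame standing in for df; column names and values are made up.
toy = pd.DataFrame({
    "country": [f"country_{i}" for i in range(60)],  # 60 classes: too many
    "policy": ["current", "past", "never"] * 20,     # 3 classes: fine
    "constant": ["x"] * 60,                          # 1 class: nothing to compare
})
print(plottable_targets(toy))
```

Calling `plottable_targets(df)` would list the columns worth passing as `target_col`.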

In [None]:
dabl.plot(df, target_col="Source_apart_from_bcgatlas")

In [None]:
dabl.plot(df, target_col="mandatory_bcg_strain_2015-2020")

In [None]:
dabl.plot(df, target_col="mandatory_bcg_strain_1950-1960")

In [None]:
ax = df['mandatory_bcg_strain_2015-2020'].value_counts().plot.barh(figsize=(14, 6), color='orange')
ax.set_title('Mandatory BCG Strain 2015-2020 Distribution',color='green', size=18)
ax.set_ylabel('Mandatory BCG Strain 2015-2020', size=14)
ax.set_xlabel('Count', size=14)

In [None]:
ax = df['mandatory_bcg_strain_1950-1960'].value_counts().plot.barh(figsize=(16, 8), color='green')
ax.set_title('Mandatory BCG Strain 1950-1960 Distribution', size=18, color='orange')
ax.set_ylabel('Mandatory BCG Strain 1950-1960', size=10)
ax.set_xlabel('Count', size=10)

In [None]:
ax = df['BCG Atlas: Timing of 1st BCG?'].value_counts().plot.barh(figsize=(16, 8), color='purple')
ax.set_title('Timing of 1st BCG Distribution', size=18, color='red')
ax.set_ylabel('Timing of 1st BCG', size=10)
ax.set_xlabel('Count', size=10)

In [None]:
ax = df['Year of changes to BCG schedule'].value_counts().plot.barh(figsize=(16, 8), color='pink')
ax.set_title('Year of changes to BCG schedule Distribution', size=18, color='blue')
ax.set_ylabel('Year of changes to BCG schedule', size=10)
ax.set_xlabel('Count', size=10)

In [None]:
ax = df['BCG Atlas: BCG Recommendation Type'].value_counts().plot.barh(figsize=(16, 8), color='grey')
ax.set_title('BCG Recommendation Type Distribution', size=18, color='green')
ax.set_ylabel('BCG Recommendation Type', size=10)
ax.set_xlabel('Count', size=10)

In [None]:
from collections import Counter
BCG_data = df['BCG Atlas: BCG Strain']
BCG_count = pd.Series(dict(Counter(','.join(BCG_data).replace(' ,',',').replace(
    ', ',',').split(',')))).sort_values(ascending=False)

In [None]:
top20BCG = BCG_count.head(20)
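
The same tally can also be written with pandas string methods, which makes it easy to leave out the `'None'` placeholder introduced by the missing-value fill. A sketch on a toy series; the values and the helper name `count_strains` are made up for illustration:

```python
import pandas as pd

def count_strains(series, placeholder="None"):
    """Count individual entries in a column of comma-separated names."""
    parts = (
        series.dropna()
        .astype(str)
        .str.split(",")   # one list of names per row
        .explode()        # one name per row
        .str.strip()      # drop spaces around the commas
    )
    # Discard empty fragments and the missing-value placeholder
    parts = parts[(parts != "") & (parts != placeholder)]
    return parts.value_counts()

toy = pd.Series(["A, B", "B ,C", "None", "B"])
counts = count_strains(toy)
print(counts.head(20))
```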

In [None]:
#Code by Mohammad Inran Shaikh https://www.kaggle.com/shikhnu/data-analysis-and-visualization-netflix-data
from matplotlib import gridspec

fig = plt.figure(figsize=(20, 7))
gs = gridspec.GridSpec(nrows=1, ncols=2, height_ratios=[6], width_ratios=[10, 5])

ax = plt.subplot(gs[0])
sns.barplot(x=top20BCG.index, y=top20BCG.values, ax=ax, palette="RdGy")
ax.set_xticklabels(top20BCG.index, rotation=90)
ax.set_title('20 BCG Strains', fontsize=15, fontweight='bold')

ax2 = plt.subplot(gs[1])
ax2.pie(top20BCG, labels=top20BCG.index, shadow=True, startangle=0, colors=sns.color_palette("RdGy", n_colors=20),
       autopct='%1.2f%%')
ax2.axis('equal') 

plt.show()

#Code by Ünal Köroglu https://www.kaggle.com/nalkrolu/optimization-of-r-sugar-with-genetic-algorithm/notebook

In [None]:
!pip install mglearn

In [None]:
import mglearn

mglearn.plots.plot_grid_search_overview()
plt.show()

#I should have trained the models before. But I didn't.

In [None]:
mglearn.plots.plot_cross_validation();
plt.show()

In [None]:
X = df.drop(["BCG Atlas: Timing of 1st BCG?","BCG Atlas: BCG Strain"],axis=1) 
y = df["BCG Atlas: BCG Strain"]

In [None]:
for i in X.columns:
    print(X[i].min(),X[i].max())

#I thought I had removed the NaN values, but it seems I did not. Therefore I couldn't fit the Logistic Regression.
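
A logistic regression needs a numeric feature matrix without NaN. The sketch below shows one possible preparation step, assuming a median fill for numeric columns and one-hot encoding for text columns; these choices, the toy frame, and the helper `prepare_features` are illustrative assumptions, not what this notebook actually ran:

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Toy frame standing in for df; column names and values are made up.
toy = pd.DataFrame({
    "policy": ["current", "past", None, "current", "past", "current", "never", "past"] * 3,
    "start_year": [1950, np.nan, 1962, 1948, np.nan, 1955, np.nan, 1971] * 3,
    "strain": ["A", "B", "A", "A", "B", "A", "B", "B"] * 3,
})

def prepare_features(frame, target):
    """Return a NaN-free numeric feature matrix and the target column."""
    X = frame.drop(columns=[target]).copy()
    for col in X.columns:
        if pd.api.types.is_numeric_dtype(X[col]):
            X[col] = X[col].fillna(X[col].median())  # numeric: median fill
        else:
            X[col] = X[col].fillna("None")           # text: placeholder label
    # One-hot encode the text columns so every feature is numeric
    return pd.get_dummies(X, dtype=float), frame[target]

X, y = prepare_features(toy, "strain")
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)
lr = LogisticRegression(max_iter=1000)
lr.fit(X_train, y_train)
print(lr.score(X_test, y_test))
```

On the real data the target has many strain classes, several of them rare, so a stratified split may fail for classes with a single row, and a ROC curve would only apply to a binary version of the problem.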

In [None]:
from sklearn.linear_model import LogisticRegression
lr = LogisticRegression(max_iter=1000)
lr.fit(X_train,y_train)
print("Model Installed!")
print("Please Wait for Results..")
model(lr)

lr_disp = plot_roc_curve(lr, X_test, y_test)
plt.plot([0,1],[0,1],"--",color="k",alpha=0.7)
plt.show()

In [None]:
def ObjectiveFunction(X):
    global lr
    if lr.predict(list([X]))==1:
        return X[3]
    return 16

#Optimization of Residual Sugar with Genetic Algorithm (an attempt, since I wasn't able to fit the LR due to NaN).

In [None]:
!pip install geneticalgorithm

#lr (Logistic Regression) is not defined because I didn't remove the NaN values.

In [None]:
from geneticalgorithm import geneticalgorithm as ga

varbound=np.array([[4.6,15.9],[0.12,1.58],[0,1],[0.9,15.5],[0.012,0.611],[1.0,72.0],[6.0,289.0],[0.99007,1.00369],[2.74,4.01],[0.33,2.0],[8.4,14.9]])
vartype=np.array([['real'],['real'],['int'],['real'],['real'],['real'],['int'],['real'],['real'],['real'],['real']])
model=ga(function=ObjectiveFunction,dimension=11,variable_type_mixed=vartype,variable_boundaries=varbound)

model.run()

#No genetic algorithm for now.

In [None]:
#Code by Olga Belitskaya https://www.kaggle.com/olgabelitskaya/sequential-data/comments
from IPython.display import display,HTML
c1,c2,f1,f2,fs1,fs2=\
'#2B3A67','#42a7f5','Akronim','Smokum',30,15
def dhtml(string,fontcolor=c1,font=f1,fontsize=fs1):
    display(HTML("""<style>
    @import 'https://fonts.googleapis.com/css?family="""\
    +font+"""&effect=3d-float';</style>
    <h1 class='font-effect-3d-float' style='font-family:"""+\
    font+"""; color:"""+fontcolor+"""; font-size:"""+\
    str(fontsize)+"""px;'>%s</h1>"""%string))
    
    
dhtml('Programming is more than an important practical art. It is also a gigantic undertaking in the foundations of knowledge, Grace Hopper quote' )