#The “swine flu”, the “bird flu” and now the “bat flu”.

#Lessons for monitoring Covid-19 vaccine safety from the H1N1 pandemic

By DANIEL SALMON and JOSHUA M. SHARFSTEIN   OCTOBER 29, 2020

As the country and the world eagerly await vaccines to curb the Covid-19 pandemic and allow us to return to normal social and economic activities, preparing to monitor these vaccines for safety is a critical task.

Why is safety monitoring needed after vaccines are found to be very safe and effective in the rigorous process of testing them in randomized controlled trials? Those trials establish which vaccines are effective and identify their most common adverse effects. But adverse reactions that are uncommon, appear only after a delay, or occur in subpopulations excluded from or inadequately included in clinical trials may not emerge until the vaccine is being widely used.

Another reason is that when vaccinating a large number of people, some will have heart attacks or strokes, develop diabetes or dementia, experience new-onset epilepsy, or have other health issues. Rapid and rigorous science is needed to determine if these events were caused by the vaccine or would have happened regardless of vaccination.

The process for separating real adverse reactions from coincidental events must be rapid and credible.
https://www.statnews.com/2020/10/29/lessons-h1n1-monitoring-covid-19-vaccine-safety/
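
One common way to separate coincidental events from real signals is an observed-versus-expected comparison: estimate how many events the background rate alone would produce among vaccinated people, then ask how surprising the observed count is. The sketch below is only an illustration of that idea, not the method described by the article's authors. The background rate, cohort size, follow-up window, and observed count are all hypothetical numbers, and the function names are made up for this example.

```python
import math

def expected_events(background_rate_per_100k_py, n_vaccinated, window_days):
    """Events expected from the background rate alone (no vaccine effect)."""
    person_years = n_vaccinated * window_days / 365.25
    return background_rate_per_100k_py / 100_000 * person_years

def poisson_upper_tail(observed, expected):
    """P(X >= observed) for X ~ Poisson(expected)."""
    # Compute 1 - P(X <= observed - 1) by summing the Poisson pmf term by term
    cdf = 0.0
    term = math.exp(-expected)  # pmf at k = 0
    for k in range(observed):
        cdf += term
        term *= expected / (k + 1)  # pmf at k + 1
    return max(0.0, 1.0 - cdf)

# Hypothetical inputs, chosen only for illustration
rate = 2.0          # background cases per 100,000 person-years
n = 1_000_000       # people vaccinated
window = 42         # days of follow-up after vaccination
observed = 6        # cases reported in that window

expected = expected_events(rate, n, window)
p_value = poisson_upper_tail(observed, expected)
print(f"expected = {expected:.2f}, observed = {observed}, P(X >= observed) = {p_value:.3f}")
```

A small tail probability only flags a signal worth investigating; it does not by itself establish that the vaccine caused the events.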

In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 20GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

#Rigor, objectivity, and transparency.

By the time the pandemic had faded, the H1N1 vaccine helped millions of people stay healthy without a safety problem — real or coincidental — derailing the program. This system to identify safety issues was developed using the principles of good governance and public health practice to ensure trust: rigor, objectivity, and transparency.

The need for monitoring the safety of SARS-CoV-2 vaccines after they are licensed and are being widely used is far greater today than it was for H1N1. Public health officials had decades of experience with influenza and influenza vaccines, and much was already known about the safety of these vaccines. In contrast, no coronavirus vaccine has ever existed. And some of the new vaccines are based on novel technologies, such as mRNA or DNA, and new vectors, such as the adenovirus.
https://www.statnews.com/2020/10/29/lessons-h1n1-monitoring-covid-19-vaccine-safety/

In [None]:
# Plotly Packages
from plotly import tools
import plotly.figure_factory as ff
import plotly.graph_objs as go
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
init_notebook_mode(connected=True)

# Matplotlib and Seaborn
import matplotlib.pyplot as plt
import seaborn as sns
from string import ascii_letters

# Statistical Libraries
from scipy.stats import norm
from scipy.stats import skew
from scipy.stats import pearsonr  # scipy.stats.stats is a deprecated private path
from scipy import stats

#Vaccine hesitancy

Add in vaccine hesitancy, which has only grown as a problem over the last decade. In 2019, the World Health Organization declared vaccine hesitancy to be one of the top 10 global health threats. Politicization of the U.S.’s Covid-19 response, coupled with increasing distrust of the government, increases the risk of a Covid-19 vaccine safety scare undermining the vaccine program and public confidence.

We must prepare now for the launch of Covid-19 vaccines by setting up in advance these systems for vaccine safety monitoring to make sure the vaccines are very safe and to inspire the confidence necessary for a successful vaccination program.
https://www.statnews.com/2020/10/29/lessons-h1n1-monitoring-covid-19-vaccine-safety/

In [None]:
df = pd.read_csv("../input/flu-data/H1N1_Flu_Vaccines.csv")
df.head()

#COVID-19 vaccination lessons from Canada's largest-ever mass immunization effort

By Jackie Dunham

In the fall of 2009, Canada launched its largest-ever vaccination campaign to protect against an outbreak of influenza A, or H1N1, with varying degrees of success. There were delays in production, supply shortages, and difficulties administering the new vaccine.

According to Statistics Canada, by April 2010, the majority of Canadians (59 per cent or 16.5 million people) had not been vaccinated against the H1N1 virus, which was also known as the “swine flu.”

There were a number of reasons why there were delays in the rollout of the vaccine in Canada – timing being a key one – as the H1N1 virus spread in the spring of 2009 and the vaccine wasn’t ready until months later, in the fall, when health-care providers were trying to administer seasonal flu shots.

Earl Brown, a virologist and a former member of Canada’s H1N1 Pandemic Vaccine Task Group, added that manufacturers were also preoccupied with creating a vaccine for avian influenza H5N1, or the “bird flu,” when they were forced to suddenly turn their attention to the new swine flu outbreak.

While Canada’s ambitious vaccination program for H1N1 may have had some hiccups along the way, the experience may provide some valuable lessons for the administration of future vaccines, such as those already in the works to combat COVID-19.
https://www.ctvnews.ca/health/covid-19-vaccination-lessons-from-canada-s-largest-ever-mass-immunization-effort-1.5138256

In [None]:
df.isnull().sum()

#“Vaccine Nationalism.”

During the H1N1 pandemic, the federal government was criticized for relying on only one domestic vaccine supplier, GlaxoSmithKline (GSK), to manufacture the vaccine.

Countries often prefer to produce their own vaccines domestically, in case there are border closures or what can be called “vaccine nationalism.” However, the dependence on only one supplier comes with its own risks because any disruptions or interruptions in the production line can cripple the whole country’s supply.

In the case of GSK, there were difficulties bottling the vaccine at their Quebec plant, which caused delays.

An internal review of the Public Health Agency of Canada and Health Canada’s response to the H1N1 pandemic addressed the government’s use of a sole vaccine supplier. The review stated that, at the time, there was only one manufacturer “interested in establishing sufficient domestic capacity to manufacture enough vaccine to inoculate the entire population in the event of an influenza pandemic.”
https://www.ctvnews.ca/health/covid-19-vaccination-lessons-from-canada-s-largest-ever-mass-immunization-effort-1.5138256

In [None]:
plt.rcParams['figure.figsize'] = (15, 10)
plt.style.use('ggplot')

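# Correlate numeric columns only; object columns cannot be correlated directly
sns.heatmap(df.select_dtypes(include='number').corr(), annot=True)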
plt.title('Correlation Plot', fontsize = 20)
plt.show()

#Code by Shivangi Garg  https://www.kaggle.com/shivangigarg160997/pandas-profiling-and-to-choose-best-model

In [None]:
import pandas_profiling as pp 
profile = pp.ProfileReport(df) 
profile

#Handling Missing Values

In [None]:
# categorical features with missing values
categorical_nan = [feature for feature in df.columns if df[feature].isna().sum()>0 and df[feature].dtypes=='O']
print(categorical_nan)

#Vaccine Distribution

In preparation for a vaccine for the coronavirus, the Canadian government has already been pre-ordering supplies, such as vials and syringes, and working on logistics to store and transport the vaccine once it is ready.

“You have to be ready to get, store, distribute that vaccine and so that is something the public health addresses and they will have to be ready to achieve that,”

It’s a considerable undertaking with respect to Canada’s population of more than 35 million people.

“Not everyone will want a vaccine, not everyone will get it, but in broad numbers, 35 million vaccines would be needed or 20 million, and then if you need two doses, that’s double that.” 
https://www.ctvnews.ca/health/covid-19-vaccination-lessons-from-canada-s-largest-ever-mass-immunization-effort-1.5138256
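
The arithmetic in the quote can be made explicit. This is a minimal sketch: the 20 million and 35 million figures and the two-dose scenario come from the quote above, while the function name is just an illustrative choice.

```python
def doses_needed(people_vaccinated, doses_per_person=1):
    """Total doses required for a given number of people and doses each."""
    return people_vaccinated * doses_per_person

# Scenarios mentioned in the quote: 20 or 35 million people, one or two doses
for people in (20_000_000, 35_000_000):
    for doses in (1, 2):
        total = doses_needed(people, doses)
        print(f"{people:,} people x {doses} dose(s) = {total:,} doses")
```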

In [None]:
# replacing missing values in categorical features
for feature in categorical_nan:
    df[feature] = df[feature].fillna('None')

In [None]:
df[categorical_nan].isna().sum()

In [None]:
# Now handle numerical features with NaN values (any column with at least one NaN)
numerical_nan = [feature for feature in df.columns if df[feature].isna().sum()>0 and df[feature].dtypes!='O']
numerical_nan

In [None]:
df[numerical_nan].isna().sum()

In [None]:
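## Replacing the numerical missing values

for feature in numerical_nan:
    ## Use the median rather than the mean because it is robust to outliers
    median_value = df[feature].median()

    ## Assign back instead of using inplace=True on a column selection, which may not update df
    df[feature] = df[feature].fillna(median_value)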
    
df[numerical_nan].isnull().sum()
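
The same median imputation can also be written with scikit-learn's `SimpleImputer`, which is convenient when the fill values should be learned from the training split and reused on the test split. This is a minimal sketch on a made-up toy frame; the column names and values are hypothetical and are not taken from the H1N1 dataset.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Hypothetical toy data with one missing value per column and one outlier
toy = pd.DataFrame({
    "age": [25.0, np.nan, 40.0, 31.0],
    "household_size": [2.0, 3.0, np.nan, 100.0],
})

num_cols = toy.select_dtypes(include="number").columns

# strategy="median" fills each column with its own median
imputer = SimpleImputer(strategy="median")
toy[num_cols] = imputer.fit_transform(toy[num_cols])
print(toy)
```

Calling `fit` on the training rows and `transform` on the test rows keeps test-set statistics out of the imputation.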

In [None]:
from sklearn import linear_model
from sklearn.model_selection import train_test_split
import warnings
warnings.filterwarnings("ignore")

In [None]:
from sklearn.preprocessing import LabelEncoder

#fill in mean for any floats still missing (assign the result back; fillna returns a new Series)
for c in df.columns:
    if pd.api.types.is_float_dtype(df[c]):
        df[c] = df[c].fillna(df[c].mean())

#fill in -999 for any remaining missing values (categoricals)
df = df.fillna(-999)
# Label Encoding
for f in df.columns:
    if df[f].dtype=='object': 
        lbl = LabelEncoder()
        lbl.fit(list(df[f].values))
        df[f] = lbl.transform(list(df[f].values))
        
print('Labelling done.')

In [None]:
df = pd.get_dummies(df)

In [None]:
X, y = df.iloc[:,:-1],df.iloc[:,-1]

#Splitting data into Training and Testing data

In [None]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=1)
print('\n \n There are {} samples in the training set and {} samples in the test set'.format(X_train.shape[0], X_test.shape[0]))
print('\n \n There are {} target values in the training set and {} target values in the test set'.format(y_train.shape[0], y_test.shape[0]))

#Applying Linear Regression to Training Data

In [None]:
from sklearn.linear_model import LinearRegression
model = LinearRegression()
model = model.fit(X_train, y_train)

In [None]:
#Print the coefficients/weights for each feature/column of our model
print(model.coef_)

In [None]:
print(model.intercept_)

In [None]:
#print our predictions on our test data
y_pred = model.predict(X_test)
print(y_pred)

#Plotting a graph between Testing and Prediction Data

In [None]:
x_ax = range(len(X_test))
plt.scatter(x_ax, y_test, s=5, color="blue", label="original")
plt.plot(x_ax, y_pred, lw=0.8, color="red", label="predicted")
plt.legend()
plt.show()

#Getting Score for Model

In [None]:
from sklearn.metrics import r2_score
r2 = r2_score(y_test, y_pred)
print(r2)
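
To see what `r2_score` measures, it can be reproduced by hand: R² is one minus the ratio of the residual sum of squares to the total sum of squares around the mean of the true values. The sketch below uses small made-up arrays (not the notebook's data), and `r2_manual` is a hypothetical helper name.

```python
import numpy as np

def r2_manual(y_true, y_hat):
    """R^2 = 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    ss_res = np.sum((y_true - y_hat) ** 2)          # error left after the model
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # error of always predicting the mean
    return 1.0 - ss_res / ss_tot

y_true_demo = [0, 1, 0, 1]
perfect = r2_manual(y_true_demo, [0, 1, 0, 1])   # predictions match exactly
mean_only = r2_manual(y_true_demo, [0.5] * 4)    # always predict the mean
print(perfect, mean_only)
```

A score near 0 means the model does no better than predicting the mean, and a negative score means it does worse.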

#Calculating Errors

In [None]:
from sklearn.metrics import mean_absolute_error, mean_squared_error
print('Mean Squared Error:', mean_squared_error(y_test, y_pred)) 
print('Root Mean Squared Error:', np.sqrt(mean_squared_error(y_test, y_pred)))
print('Mean Absolute Error:', mean_absolute_error(y_test, y_pred))
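
The three error metrics above are simple averages over the prediction errors. As a minimal sketch with made-up values (not the notebook's data), they can be computed directly with NumPy:

```python
import numpy as np

# Hypothetical true values and predictions
y_true_demo = np.array([1.0, 0.0, 1.0, 0.0])
y_hat_demo = np.array([0.5, 0.0, 1.0, 1.0])

errors = y_true_demo - y_hat_demo

mse = np.mean(errors ** 2)       # mean of squared errors
rmse = np.sqrt(mse)              # square root puts it back in the target's units
mae = np.mean(np.abs(errors))    # mean of absolute errors

print('Mean Squared Error:', mse)
print('Root Mean Squared Error:', rmse)
print('Mean Absolute Error:', mae)
```

Because MSE squares each error, the single large miss in this example weighs more heavily in MSE and RMSE than in MAE.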

#After trying Linear Regression, try a model-selection function to look for better accuracy.

In [None]:
from sklearn.model_selection import GridSearchCV
from sklearn.model_selection import ShuffleSplit
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.svm import SVR

#After running the function below, the session stopped and didn't render anything else, so it is commented out and saved for future use.

In [None]:
# def find_best_model(X, Y):
#     models = {
#         'linear_regression': {
#             'model': LinearRegression(),
#             'parameters': {}
#         },
#         'decision_tree_regressor': {
#             'model': DecisionTreeRegressor(splitter='best'),
#             'parameters': {
#                 'max_depth': [5, 10]
#             }
#         },
#         'random_forest': {
#             'model': RandomForestRegressor(),
#             'parameters': {
#                 'n_estimators': [1, 5, 10, 15, 20, 30, 40, 50, 60, 70, 80, 90, 100]
#             }
#         },
#         'svr': {
#             'model': SVR(gamma='auto'),
#             'parameters': {
#                 'kernel': ['rbf', 'linear'],
#                 'C': [1, 10, 20]
#             }
#         }
#     }
#
#     scores = []
#     cv_shuffle = ShuffleSplit(n_splits=5, test_size=0.33, random_state=0)
#
#     for model_name, model_params in models.items():
#         gc = GridSearchCV(model_params['model'], model_params['parameters'], cv=cv_shuffle, return_train_score=False)
#         gc.fit(X, Y)
#         # The key must match the DataFrame column name below
#         scores.append({
#             'model': model_name,
#             'best_parameters': gc.best_params_,
#             'score': gc.best_score_
#         })
#
#     return pd.DataFrame(scores, columns=['model', 'best_parameters', 'score'])

#find_best_model(X_train, y_train)
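
A much smaller version of the same idea can be tried first to check that the search loop itself works. The sketch below is only an assumption-laden stand-in for the function above: it uses synthetic data from `make_regression` instead of the H1N1 dataset, only two candidate models, and a tiny grid (3 parameter combinations x 3 splits), so it should finish quickly.

```python
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import GridSearchCV, ShuffleSplit
from sklearn.tree import DecisionTreeRegressor

# Synthetic regression data, used only so the example is self-contained
X_demo, y_demo = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

# Each candidate is (estimator, parameter grid); an empty grid means "defaults only"
candidates = {
    "linear_regression": (LinearRegression(), {}),
    "decision_tree": (DecisionTreeRegressor(random_state=0), {"max_depth": [3, 5]}),
}

cv = ShuffleSplit(n_splits=3, test_size=0.33, random_state=0)

rows = []
for name, (estimator, grid) in candidates.items():
    search = GridSearchCV(estimator, grid, cv=cv)
    search.fit(X_demo, y_demo)
    rows.append({
        "model": name,
        "best_parameters": search.best_params_,
        "score": search.best_score_,  # mean cross-validated R^2 of the best setting
    })

results = pd.DataFrame(rows, columns=["model", "best_parameters", "score"])
print(results)
```

If this runs cleanly, the grid can be widened one model at a time to find which candidate makes the full search too slow.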

#Random Forest Model

In [None]:
#model = RandomForestRegressor(n_estimators= 1)
#model.fit(X_train,y_train)

In [None]:
#predictions = model.predict(X_test)

In [None]:
#predictions

In [None]:
#x_ax = range(len(X_test))
#plt.scatter(x_ax, y_test, s=5, color="blue", label="original")
#plt.plot(x_ax, predictions, lw=0.8, color="red", label="predicted")
#plt.legend()
#plt.show()

In [None]:
#Code by Olga Belitskaya https://www.kaggle.com/olgabelitskaya/sequential-data/comments
from IPython.display import display,HTML
c1,c2,f1,f2,fs1,fs2=\
'#eb3434','#eb3446','Akronim','Smokum',30,15
def dhtml(string,fontcolor=c1,font=f1,fontsize=fs1):
    display(HTML("""<style>
    @import 'https://fonts.googleapis.com/css?family="""\
    +font+"""&effect=3d-float';</style>
    <h1 class='font-effect-3d-float' style='font-family:"""+\
    font+"""; color:"""+fontcolor+"""; font-size:"""+\
    str(fontsize)+"""px;'>%s</h1>"""%string))
    
    
dhtml('Marília Prata, @mpwolke was Here' )