#Vaccine Innovation and Global Sustainability: Governance Challenges for Sustainable Development Goals

Authors: Akira Homma, Cristina Possas, Reinaldo Martins

August 2020 - DOI: 10.1093/oso/9780190949501.003.0011
In book: Science, Technology, and Innovation for Sustainable Development Goals (pp. 219-242). Publisher: Oxford University Press

Innovative preventive vaccines against emerging and neglected infectious diseases, such as Zika, dengue, influenza, and HIV/AIDS, are examined from a global sustainability perspective, aiming to integrate public health and innovation governance approaches.

Innovation-intensive vaccines with reduced adverse effects can have an enormous impact on life expectancy and on the quality of life of the global population, yet only one of the Sustainable Development Goals (SDGs), SDG3, refers directly to vaccines. This chapter nevertheless identifies seven other SDGs strongly related to vaccines and six additional SDGs related to them, for a total of 14 vaccine-related goals among the 17 SDGs.

Two of these goals are related to innovation and technological development of vaccines (SDG9 and SDG17). The authors examine vaccine performance indicators and the current technological and regulatory obstacles to achieving these goals, which particularly affect developing countries, and propose science, technology, and innovation (STI) governance strategies to close these gaps and increase access to vaccines. Policy recommendations are made for vaccine funding and for incentives for innovation, development, and vaccine production. Recommendations are also given for specific vaccine STI performance indicators and for strategies to achieve the 14 vaccine-related SDGs.
https://www.researchgate.net/publication/343763444_Vaccine_Innovation_and_Global_Sustainability_Governance_Challenges_for_Sustainable_Development_Goals

VACCINE UPRISING

The Vaccine Uprising was a popular riot between November 10 and 16, 1904, in the city of Rio de Janeiro, then the capital of Brazil. Its immediate pretext was a law that made vaccination against smallpox mandatory, but it is also associated with deeper causes, such as the urban reforms being carried out by Mayor Pereira Passos and the sanitation campaigns led by the physician Oswaldo Cruz.

Oswaldo Cruz, who took over the General Directorate of Public Health in 1903, was responsible for the city's sanitation campaign, which aimed to eradicate yellow fever, bubonic plague and smallpox. To this end, in June 1904, the government proposed a law making vaccination of the population mandatory. The bill sparked heated debates among lawmakers and the population, and despite a strong opposition campaign, it was passed.

The trigger of the revolt was the publication of a draft regulation for applying the mandatory vaccine in the newspaper A Notícia, on November 9, 1904.

On November 16, a state of siege was declared and mandatory vaccination was suspended. With systematic repression and the triggering cause removed, the movement ebbed. In the repression that followed the revolt, the police forces arrested a number of suspects and individuals considered disorderly, whether they were related to the revolt or not. The final tally was 945 people imprisoned on Ilha das Cobras, 30 dead, 110 injured and 461 deported to the state of Acre.
https://translate.google.com.br/translate?hl=en&sl=pt&u=https://pt.wikipedia.org/wiki/Revolta_da_Vacina&prev=search&pto=aue

![](https://1.bp.blogspot.com/-KqXI33x9tmk/XEeqOl_82VI/AAAAAAAAXCQ/x5IoyAaf1dk5oD9TvMp7iZ_uUdyVEB_UwCLcBGAs/s640/charge%2Brevolta%2Bda%2Bvacina%2B7.PNG)suportegeografico77.blogspot.com

In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import matplotlib.pyplot as plt
import plotly.express as px
import seaborn as sns
plt.style.use('fivethirtyeight')

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 20GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

#Vaccines: Biotechnology Market, Coverage, and Regulatory Challenges for Achieving Sustainable Development Goals

Authors: Akira Homma, Cristina Possas, Adelaide Maria de Souza Antunes, Jorge Magalhães, et al.

This article provides an overview, from bioeconomic and global sustainability perspectives, of the main constraints on the current global vaccine innovation system for achieving the Sustainable Development Goals (SDGs). Biotechnology market trends, gaps in vaccine coverage against emerging and neglected diseases, and patent protection and regulation are discussed. A structured long-term “public-return-driven” innovation model to overcome vaccine market failure is proposed.
https://www.researchgate.net/publication/336565744_Vaccines_Biotechnology_Market_Coverage_and_Regulatory_Challenges_for_Achieving_Sustainable_Development_Goals

In [None]:
# Plotly Packages
from plotly import tools
from plotly.subplots import make_subplots
import plotly.figure_factory as ff
import plotly.graph_objs as go
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
init_notebook_mode(connected=True)

# Matplotlib and Seaborn
import matplotlib.pyplot as plt
import seaborn as sns
from string import ascii_letters

# Statistical Libraries
from scipy.stats import norm
from scipy.stats import skew
from scipy.stats import pearsonr  # scipy.stats.stats is a deprecated private path
from scipy import stats

#Vaccines for neglected and emerging diseases in Brazil by 2030: the “valley of death” and opportunities for RD&I in Vaccinology 4.0

Authors: Akira Homma, Marcos da Silva Freire, Cristina Possas.

The authors examined the implications of the very low competitiveness of the Brazilian vaccine RD&I system, which precludes the development of all the important vaccines required by the National Immunization Program (NIP), severely impacting the healthcare of the population. 

In a country dramatically affected by the COVID-19 pandemic and by an exponential increase in emerging and neglected diseases, particularly among the poor, these RD&I constraints for vaccines become crucial governance issues. Such constraints are aggravated by a global scenario of limited commercial interest from multinational companies in vaccines for neglected and emerging diseases, which are falling into a “valley of death,” with only two vaccines produced in a pipeline of 240 vaccines. We stress that these constraints in the global pipeline are a window of opportunity for vaccine manufacturers in Brazil and other developing countries in the current paradigm transition towards Vaccinology 4.0. We conclude with recommendations for a new governance strategy supporting Brazilian public vaccine manufacturers in international collaborations for a sustainable national vaccine development and production plan by 2030.

In [None]:
df = pd.read_csv('../input/flu-data/H1N1_Flu_Vaccines.csv')
pd.set_option('display.max_columns', None)
df.head()

#Codes by Janio Martinez https://www.kaggle.com/janiobachmann/patient-charges-clustering-and-regression/notebook

In [None]:
#Code by Janio Martinez https://www.kaggle.com/janiobachmann/patient-charges-clustering-and-regression/notebook

# Distribution of the h1n1_vaccine column
h1n1_dist = df["h1n1_vaccine"].values
# log1p computes log(1 + x), which avoids -inf wherever the value is 0
logcharge = np.log1p(df["h1n1_vaccine"])



trace0 = go.Histogram(
    x=h1n1_dist,
    histnorm='probability',
    name="H1N1 Vaccine Distribution",
    marker = dict(
        color = '#FA5858',
    )
)
trace1 = go.Histogram(
    x=logcharge,
    histnorm='probability',
    name="H1N1 Vaccine Distribution using Log",
    marker = dict(
        color = '#58FA82',
    )
)

# make_subplots (imported from plotly.subplots) replaces the deprecated tools.make_subplots
fig = make_subplots(rows=2, cols=1,
                    subplot_titles=('H1N1 Vaccine Distribution','Log H1N1 Vaccine Distribution'),
                    print_grid=False)



fig.append_trace(trace0, 1, 1)
fig.append_trace(trace1, 2, 1)


fig['layout'].update(showlegend=True, title='H1N1 Vaccine Distribution', bargap=0.05)
iplot(fig, filename='custom-sized-subplot-with-subplot-titles')

#'Caution is needed when evaluating vaccines', by Ana Lucia Azevedo


According to Akira Homma (senior scientific adviser at Fiocruz), effective vaccines alone are not enough to control Covid-19 in Brazil. It will be necessary to ensure broad vaccine coverage and, above all, caution in the assessment of safety.


THE FAST-TRACK PROCESS

The fast-track process is necessary due to the urgency of the pandemic. But we will not have all the important information with this acceleration. The approval for the emergency use of vaccines responds to the anomalous situation that we live in, but caution is needed with the safety assessment.

Efficacy results can be evaluated in a shorter time, using statistical tools and comparing results in vaccinated and non-vaccinated people. The same holds for safety data on mild events, such as fever, pain at the injection site and tiredness. But this is not the case for rare events, which can be more serious and only become apparent once a much larger number of people have been vaccinated.

For this, there is stage four of vaccine development, when pharmacovigilance is carried out. For example, 10 million people are vaccinated and then followed over a set period of time. This is necessary because only at that scale will such effects, if any, be detectable. We also do not know the duration of the immunity conferred. That is why vaccines, even if approved on an emergency basis, will continue to be evaluated.

Vaccines are among humanity's greatest achievements, but they are not trivial to develop. That is why there are so few. But never has so much money been invested, so much research been done, or so many governments and institutions been involved. I see with great hope the arrival of a first wave of vaccines against Covid-19.
https://oglobo.globo.com/sociedade/coronavirus/e-preciso-cautela-na-avaliacao-das-vacinas-afirma-especialista-24772505
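
The point about rare events can be made concrete with a little arithmetic. The sketch below is an illustration and not part of the interview: it assumes every vaccinated person has the same small, independent probability of a serious event, and the rate of 1 in 100,000 and the group size of 30,000 are invented numbers. Only the 10 million figure comes from the example in the text.

```python
def prob_at_least_one_event(rate: float, n_people: int) -> float:
    """Probability of seeing at least one event among n_people,
    assuming independent events with the same per-person rate."""
    return 1.0 - (1.0 - rate) ** n_people

# Hypothetical rate: 1 serious event per 100,000 vaccinated people (assumption).
rate = 1 / 100_000

for n in (30_000, 1_000_000, 10_000_000):
    p = prob_at_least_one_event(rate, n)
    print(f"{n:>10,} people -> P(at least one event) = {p:.3f}")
```

Under these assumptions a trial-sized group would often see no event at all, while a population of millions almost certainly would, which is why monitoring continues after approval.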

In [None]:
#Code by Janio Martinez https://www.kaggle.com/janiobachmann/patient-charges-clustering-and-regression/notebook

# Distribution of the seasonal_vaccine column
seasonal_dist = df["seasonal_vaccine"].values
# log1p computes log(1 + x), which avoids -inf wherever the value is 0
logcharge = np.log1p(df["seasonal_vaccine"])



trace0 = go.Histogram(
    x=seasonal_dist,
    histnorm='probability',
    name="Seasonal Vaccine Distribution",
    marker = dict(
        color = '#FA5858',
    )
)
trace1 = go.Histogram(
    x=logcharge,
    histnorm='probability',
    name="Seasonal Vaccine Distribution using Log",
    marker = dict(
        color = '#58FA82',
    )
)

# make_subplots (imported from plotly.subplots) replaces the deprecated tools.make_subplots
fig = make_subplots(rows=2, cols=1,
                    subplot_titles=('Seasonal Vaccine Distribution','Log Seasonal Vaccine Distribution'),
                    print_grid=False)



fig.append_trace(trace0, 1, 1)
fig.append_trace(trace1, 2, 1)


fig['layout'].update(showlegend=True, title='Seasonal Vaccine Distribution', bargap=0.05)
iplot(fig, filename='custom-sized-subplot-with-subplot-titles')

#To plot the Basic Distplot we need to remove NaN values

In [None]:
# categorical features with missing values
categorical_nan = [feature for feature in df.columns if df[feature].isna().sum()>0 and df[feature].dtypes=='O']
print(categorical_nan)

#What can delay production in Brazil, by Diego Junqueira 

One of the most complicated parts is production. “Pharmaceutical companies want to get this product to market quickly. Their forecast is ultra-optimistic,” says Ricardo Gazzinelli (researcher at Fiocruz Minas and coordinator of the National Institute of Science and Technology for Vaccines), who heads the study of a Brazilian vaccine for covid-19 and has no involvement with the Oxford project.

When a vaccine is approved, there will be a demand for all inputs for many places in the world. Will there be enough inputs to produce all the necessary vaccine?

BRAZILIAN FACTORIES WILL HAVE TO BE ADAPTED

Both Fiocruz and Butantan need to renovate their factories not only to fill covid-19 vaccines, but mainly for the production of the raw material. These adaptations may delay the forecast to start vaccination in January.

Fiocruz, which started the year with a R$ 300 million budget cut, convinced the federal government to spend almost R$ 2 billion to buy the Oxford technology and renovate its vaccine factory. R$ 95.6 million will be spent on the renovation alone. In addition to making the new vaccine viable, Fiocruz will use the resources to expand its capacity, since its factory is at its limit.
https://translate.google.com.br/translate?hl=en&sl=pt&u=https://reporterbrasil.org.br/2020/08/vacina-da-covid-19-em-janeiro-saiba-o-que-pode-atrasar-a-producao-no-brasil/&prev=search&pto=aue

In [None]:
# replacing missing values in categorical features
for feature in categorical_nan:
    df[feature] = df[feature].fillna('None')

“Our final processing plants are at full capacity,” said Akira Homma, scientific advisor and former president of Fiocruz, in an interview with Repórter Brasil in February, before the pandemic.

This happened due to the various vaccines incorporated in recent years at Fiocruz (such as rotavirus, pneumococcal and tetraviral). In addition, yellow fever and measles vaccines have increased production due to the recent disease outbreaks in São Paulo. 

Marco Krieger, Vice President of Production and Innovation in Health at Fiocruz, says that the problems have been solved by extending work shifts, and that Fiocruz will be able to process 40 million monthly doses of the Oxford vaccine “without compromising any other projects.”

In the case of Butantan, adapting the factory to fill the Chinese vaccine will be simpler, since the institute has already mastered the technology. “Butantan has a production structure that is not adapted for this Covid-19 vaccine, so it needs to be adapted for large-scale production.”

The greatest difficulty for the institute will be renovating the factory to produce the raw material. The expectation is to spend R$ 130 million on the project from private donations, but the funds have not yet arrived.
https://translate.google.com.br/translate?hl=en&sl=pt&u=https://reporterbrasil.org.br/2020/08/vacina-da-covid-19-em-janeiro-saiba-o-que-pode-atrasar-a-producao-no-brasil/&prev=search&pto=aue

In [None]:
df[categorical_nan].isna().sum()

#The 5 Stages of COVID-19 Vaccine Development, by Hallie Levine

PRECLINICAL STAGE: HOW WILL THIS VACCINE WORK?

This research-intensive stage is designed to find natural or synthetic antigens—foreign substances that induce an immune reaction in your body—that trigger the same reaction an actual virus or bacteria would. Identifying the right antigen or antigens can often take up to four years.

PHASES 1/2A AND 2B: IS IT SAFE, AND WHAT'S THE RIGHT DOSE?

Phase 1 testing marks the first time the vaccine is tested in a small group of adults, usually between 20 and 80 people, to evaluate its safety and measure the immune response it generates. Phase 2a studies aim to determine the most effective dose and expand the safety experience with the vaccine.

PHASE 3: HOW EFFECTIVE IS THE VACCINE?

In this stage of the clinical trial, even more volunteers receive the vaccine to study whether it's effective.
Before volunteers are vaccinated, they will be tested to make sure they currently do not have the SARS-CoV-2 virus. Half of the group will be assigned to receive the vaccine; the other half will receive a placebo. Then they will all be followed closely for up to two years to see if they do develop COVID-19-related symptoms, such as fever, headache, shortness of breath, dry cough or gastrointestinal distress.

It may be that some people do go on to develop COVID-19, even after having been vaccinated, but they may have substantially milder symptoms than those who develop COVID-19 in a control group.

REGULATORY APPROVAL AND LICENSURE: IS IT READY FOR THE WORLD?

After a successful Phase 3 trial, vaccine manufacturers submit an application to regulatory bodies such as the European Commission or the U.S. Food and Drug Administration (FDA). At this stage, clinical trial data is reviewed to make sure the vaccine is safe and effective.

PHASE 4: WILL IT STAY SAFE DOWN THE ROAD?

Even after the vaccine is approved and licensed, regulatory agencies stay involved, continuing to monitor production; inspecting manufacturing facilities; and testing vaccines for potency, safety and purity.
https://www.jnj.com/innovation/the-5-stages-of-covid-19-vaccine-development-what-you-need-to-know-about-how-a-clinical-trial-works
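
The Phase 3 comparison described above is usually summarized as vaccine efficacy: one minus the ratio of the attack rate in the vaccinated group to the attack rate in the placebo group. A minimal sketch follows; the case counts and group sizes are invented for illustration and do not come from any real trial.

```python
def vaccine_efficacy(cases_vaccine: int, n_vaccine: int,
                     cases_placebo: int, n_placebo: int) -> float:
    """Efficacy = 1 - (attack rate in vaccine arm / attack rate in placebo arm)."""
    attack_rate_vaccine = cases_vaccine / n_vaccine
    attack_rate_placebo = cases_placebo / n_placebo
    return 1.0 - attack_rate_vaccine / attack_rate_placebo

# Hypothetical trial with 10,000 volunteers per arm (all numbers are made up).
efficacy = vaccine_efficacy(cases_vaccine=20, n_vaccine=10_000,
                            cases_placebo=100, n_placebo=10_000)
print(f"Estimated efficacy: {efficacy:.0%}")
```

An efficacy of zero would mean the two arms fell ill at the same rate; a real analysis would also report a confidence interval, which this sketch omits.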

In [None]:
# Let's first handle numerical features with NaN values
# (>0, matching the categorical check, so a column with a single NaN is not skipped)
numerical_nan = [feature for feature in df.columns if df[feature].isna().sum()>0 and df[feature].dtypes!='O']
numerical_nan

#Dependence on raw materials can lead to trade war 

#A market struggle that involves marketing, sales force, lobbying and political influence.

To start producing the ampoules, it is necessary to receive the raw material. While Butantan is waiting for partner Sinovac to build its factory to send the first shipments, Fiocruz still does not know where the first tons of Oxford vaccine concentrate will come from. 

Fiocruz and Astrazeneca (which has the vaccine production rights) recognized that the issue has not yet been defined. The problem in this case is that we may face a commercial dispute between countries for the raw material for the vaccine, which will be manufactured in different places around the world.

AstraZeneca has already sold at least 1.2 billion doses of the Oxford vaccine: 800 million to rich countries, 100 million to Brazil and 300 million to Gavi, an international alliance that provides vaccines to poor countries. Despite this, the multinational will not produce a single ampoule. Instead of renovating its own factory, the company chose to form partnerships with laboratories in other countries, including the United States and India.

What is certain is that the first lots will go to England, where AstraZeneca is based, and to the United States, which made the largest investment. “This is a market struggle; it involves marketing, sales force, lobbying and political influence. WHO has tried to civilize this brawl a bit,” says Reinaldo Guimarães, vice president of Abrasco (Brazilian Association of Public Health).
https://translate.google.com.br/translate?hl=en&sl=pt&u=https://reporterbrasil.org.br/2020/08/vacina-da-covid-19-em-janeiro-saiba-o-que-pode-atrasar-a-producao-no-brasil/&prev=search&pto=aue

In [None]:
df[numerical_nan].isna().sum()

#In past pandemics, history has shown that rich countries get access to vaccines first

The WHO is discussing with its 194 member countries how to ensure that, once there is a safe and effective product, everyone has access to a share of it from the start. The aim is balance because, in past pandemics, history has shown that wealthy countries get access to vaccines first, said Mariângela Simão, the Brazilian who serves as WHO assistant director-general for access to medicines, vaccines and pharmaceutical products.

“This is not going to be a free and unimpeded path, not least because production capacity is finite and it is necessary to know how to produce all the doses. And it is not just a vaccine at the factory, it is not just a substance; there must be glass and various accessory inputs that are essential. It is possible that the same thing that happened with respirators will happen again, with shortages in the market due to the great demand,” says Guimarães. An association of manufacturers of hospital products has already warned of the risk of a needle shortage.

Mazzei, from AstraZeneca, says that the raw material delivery schedule should respect the dates of payment by countries. “We know that globally there is a race for the acquisition of vaccines, but we will respect, if necessary, the order in which the efforts took place, working so that countries receive the products in the shortest possible time.”
https://translate.google.com.br/translate?hl=en&sl=pt&u=https://reporterbrasil.org.br/2020/08/vacina-da-covid-19-em-janeiro-saiba-o-que-pode-atrasar-a-producao-no-brasil/&prev=search&pto=aue

In [None]:
## Replacing the numerical Missing Values

for feature in numerical_nan:
    ## We will replace by using median since there are outliers
    median_value=df[feature].median()
    
    # Assign the result back; inplace=True on a column selection may not update df
    df[feature] = df[feature].fillna(median_value)
    
df[numerical_nan].isnull().sum()
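
The choice of the median over the mean can be illustrated on a toy column. This is a sketch with invented values, not the flu dataset: a single extreme value pulls the mean far from the typical observation, while the median stays close to it.

```python
import numpy as np
import pandas as pd

# Toy column with one outlier and one missing value (values are invented).
s = pd.Series([1.0, 2.0, 2.0, 3.0, 100.0, np.nan])

mean_filled = s.fillna(s.mean())      # NaN replaced by the mean of the other values
median_filled = s.fillna(s.median())  # NaN replaced by the median of the other values

print("filled with mean:  ", mean_filled.iloc[-1])
print("filled with median:", median_filled.iloc[-1])
```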

#Average technology transfer time is 10 years

#Two stages: The manufacture of the concentrated “syrup” and Final Step of diluting/filling/labeling/testing/distributing. 

By analogy with a soft drink factory, the production of a vaccine can be divided into two stages: first the manufacture of the concentrated “syrup”, then the final step of diluting, filling, labeling, testing and distributing.

The 220 million doses guaranteed so far in Brazil will go through formulation and packaging, which is far from simple because the product is sensitive to, for example, temperature variation. “The final processing of a vaccine can be as complex as the production of the IFA [active pharmaceutical ingredient] raw material itself,” says Krieger, from Fiocruz.

Once this filling stage is over, Brazil will learn to produce the raw material for the two vaccines. Fiocruz and Butantan promise to master this process in 12 months. However, that would be an unprecedented feat in Brazil, given that the two laboratories took an average of 10 years to assimilate the technology of other vaccines.

Today, Fiocruz participates in the transfer of technology for the development of rotavirus and pneumococcal vaccines. These partnerships started in 2008 and 2010, respectively, and have not yet been concluded. At the Butantan Institute, the agreement to produce the flu vaccine started in 1999 and was only concluded in 2013. "The transfer of technology is not as fast as we would like, but it depends on the interest of the parties involved", says Homma.


#Innovative technology never applied commercially

The Oxford vaccine brings an extra difficulty: it uses innovative technology never applied commercially, which can bring challenges that factories are unaware of. “The production of this type of vaccine is carried out today by universities and research laboratories, but nothing on a large scale”, says biomedical researcher Diego Moura Tanajura, professor at the Federal University of Sergipe.
https://translate.google.com.br/translate?hl=en&sl=pt&u=https://reporterbrasil.org.br/2020/08/vacina-da-covid-19-em-janeiro-saiba-o-que-pode-atrasar-a-producao-no-brasil/&prev=search&pto=aue

In [None]:
chronic = [df["chronic_med_condition"].values.tolist()]
group_labels = ['Chronic Medical Condition Distribution']

colors = ['#FA5858']

fig = ff.create_distplot(chronic, group_labels, colors=colors)
# Add title
fig['layout'].update(title='Normal Distribution <br> Central Limit Theorem Condition')

iplot(fig, filename='Basic Distplot')

#Inactivated virus technology

China's Sinovac vaccine uses inactivated virus technology, similar to others manufactured by Butantan, such as flu and rabies, which may represent a production advantage. Despite this, the São Paulo institute is unable to carry out all stages of CoronaVac (the name of the Chinese vaccine) production on a large scale. It is as if Butantan had the right oven but lacked an ingredient to make the complete recipe, in an analogy used by Covas. That missing ingredient is the raw material.

#The longer Brazil takes to master all stages of production, the longer dependence on foreign companies will last. 

And that will make all the difference, because, in the case of the Oxford vaccine, the raw material will only be sold at cost while the pandemic declaration by the WHO lasts. China, on the other hand, has already declared that it will not charge royalties if the country finds a vaccine against covid-19 (the product will be considered a global public good). 

Almost R$ 2 billion will be invested to enable the production of the Oxford vaccine in Brazil. About R$ 1.3 billion will be paid to AstraZeneca in exchange for the complete technology and raw material for the 100 million doses, according to the Ministry of Health; the multinational says it will receive a lower amount, close to R$ 1 billion.

NO REFUNDS IF THE VACCINE FAILS HUMAN TESTING.

It is a risky purchase. If the vaccine fails human testing, there will be no doses, and the money will not be returned. The investment is worthwhile, according to Fiocruz, because even if the covid-19 vaccine does not work, its technology could be used to research vaccines against other diseases, such as malaria, HIV and other types of coronavirus.
https://translate.google.com.br/translate?hl=en&sl=pt&u=https://reporterbrasil.org.br/2020/08/vacina-da-covid-19-em-janeiro-saiba-o-que-pode-atrasar-a-producao-no-brasil/&prev=search&pto=aue

In [None]:
# Restrict to numeric columns; recent pandas raises on text columns otherwise
corr = df.corr(numeric_only=True)

hm = go.Heatmap(
    z=corr.values,
    x=corr.index.values.tolist(),
    y=corr.index.values.tolist()
)


data = [hm]
layout = go.Layout(title="Correlation Heatmap")

fig = dict(data=data, layout=layout)
iplot(fig, filename='labelled-heatmap')

#The tests are not over yet

It is important to remember that there is still no vaccine for covid-19. There are 165 candidates, according to the WHO, 26 of them in human trials, six of which are in the last stage, including the Oxford and Sinovac vaccines.

There is little information about the preliminary results of the Chinese vaccine, which have only begun to be released. The last participant in the tests will be evaluated by Butantan only in October 2021, according to Piauí magazine.

Regarding the Oxford vaccine, preliminary results celebrated in July indicated good protection for people aged 18 to 55 years. But there is still no consolidated information on performance in the elderly, the main risk group for Covid-19.
https://translate.google.com.br/translate?hl=en&sl=pt&u=https://reporterbrasil.org.br/2020/08/vacina-da-covid-19-em-janeiro-saiba-o-que-pode-atrasar-a-producao-no-brasil/&prev=search&pto=aue

In [None]:
from sklearn.preprocessing import LabelEncoder

# Fill missing values in float columns with the column mean
for c in df.columns:
    if pd.api.types.is_float_dtype(df[c]):
        # fillna returns a new Series, so the result must be assigned back
        df[c] = df[c].fillna(df[c].mean())

#fill in -999 for categoricals
df = df.fillna(-999)
# Label Encoding
for f in df.columns:
    if df[f].dtype=='object': 
        lbl = LabelEncoder()
        lbl.fit(list(df[f].values))
        df[f] = lbl.transform(list(df[f].values))
        
print('Labelling done.')

#How many doses will guarantee immunity?

Another open question is the number of doses that will be necessary to guarantee protection. Preliminary results from the Oxford vaccine indicated that two injections guarantee more safety than one. And how long will this protection last? Will it be necessary to vaccinate the population every year? These are questions that will only be answered at the end of clinical tests, and the answers may require even greater effort from Brazilian laboratories.

Regardless of which vaccine will hit the market, experts warn that they are facing a feat never achieved. “There is no factory in the world capable of producing 8 billion doses. So, we really have to make an unprecedented effort.”

“What needs to enter people's minds is that, at first, there will not be a vaccine for everyone.” “I have been working for a long time in the area of access to medicine, and I would say that the barriers in these first two years will be related to having the necessary quantities to vaccinate the most vulnerable groups.”
https://translate.google.com.br/translate?hl=en&sl=pt&u=https://reporterbrasil.org.br/2020/08/vacina-da-covid-19-em-janeiro-saiba-o-que-pode-atrasar-a-producao-no-brasil/&prev=search&pto=aue

In [None]:
df = pd.get_dummies(df)

#Must Encode to avoid errors

#Comparing Independent Categorical Variables (ANOVA)

P-value: A p-value higher than 0.05 means that we fail to reject the null hypothesis; that is, there is no significant difference between the age groups when it comes to income poverty.
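
For reference, a one-way ANOVA on separate groups can be run directly with `scipy.stats.f_oneway`. The sketch below uses randomly generated groups rather than the flu data, and the group names are hypothetical; it only shows how the p-value is read against the 0.05 threshold.

```python
import numpy as np
from scipy.stats import f_oneway

rng = np.random.default_rng(0)

# Three hypothetical groups drawn from the same distribution (synthetic data).
group_a = rng.normal(loc=0.0, scale=1.0, size=200)
group_b = rng.normal(loc=0.0, scale=1.0, size=200)
group_c = rng.normal(loc=0.0, scale=1.0, size=200)

f_stat, p_value = f_oneway(group_a, group_b, group_c)
print(f"F = {f_stat:.3f}, p = {p_value:.3f}")

if p_value > 0.05:
    print("Fail to reject the null hypothesis: no significant difference in means.")
else:
    print("Reject the null hypothesis: at least one group mean differs.")
```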

In [None]:
import statsmodels.api as sm
from statsmodels.formula.api import ols


moore_lm = ols("age_group ~ income_poverty", data=df).fit()
print(moore_lm.summary())

In [None]:
# Create subplots
f, (ax1, ax2, ax3) = plt.subplots(ncols=3, figsize=(18,8))

# I wonder if the cluster that is on the top is from older people
sns.stripplot(x="age_group", y="h1n1_vaccine", data=df, ax=ax1, linewidth=1, palette="Reds")
ax1.set_title("Relationship between H1N1 Vaccine and Age")


sns.stripplot(x="age_group", y="h1n1_vaccine", hue="chronic_med_condition", data=df, ax=ax2, linewidth=1, palette="Set2")
ax2.set_title("Medical Condition, Age and H1N1 Vaccine")

sns.stripplot(x="h1n1_concern", y="h1n1_vaccine", hue="chronic_med_condition", data=df, ax=ax3, linewidth=1, palette="Set2")
ax3.legend_.remove()
ax3.set_title("H1N1 Concern and H1N1 Vaccine")

plt.show()

In [None]:
fig = ff.create_facet_grid(
    df,
    x='chronic_med_condition',
    y='h1n1_vaccine',
    color_name='age_group',
    show_boxes=False,
    marker={'size': 10, 'opacity': 1.0},
    #colormap={'55 - 64 Years': 'rgb(208, 246, 130)', '65+ Years': 'rgb(166, 246, 130)',  '18 - 34 Years': 'rgb(251, 232, 238)'}
)
#251, 232, 238


fig['layout'].update(title="Age vs H1N1 Vaccine", width=800, height=600, plot_bgcolor='rgb(251, 251, 251)', 
                     paper_bgcolor='rgb(255, 255, 255)')


iplot(fig, filename='facet - custom colormap')

In [None]:
# Let's store the original dataframe in another variable.
original_df = df.copy()



from sklearn.cluster import KMeans
from yellowbrick.cluster import KElbowVisualizer

fig = plt.figure(figsize=(12,8))

# K-means clustering (the elbow plot below helps choose k)
df.head()
original_df.head()

X = df[["h1n1_concern", "h1n1_vaccine"]]


# Instantiate the clustering model and visualizer
model = KMeans()
visualizer = KElbowVisualizer(model, k=(2,6))

visualizer.fit(X)    # Fit the data to the visualizer
visualizer.show()    # show() replaces the deprecated poof() in yellowbrick 1.0+
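
What the elbow visualizer plots can also be computed by hand: fit K-means for each candidate k, record the inertia (the within-cluster sum of squared distances), and look for the k where the curve stops dropping sharply. A minimal sketch on synthetic blobs, which stand in for the notebook's features as an assumption:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic 2-D data with three well-separated groups (illustration only).
X_demo, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.6, random_state=42)

inertias = {}
for k in range(2, 7):
    km = KMeans(n_clusters=k, n_init=10, random_state=42).fit(X_demo)
    inertias[k] = km.inertia_  # within-cluster sum of squared distances

for k, inertia in inertias.items():
    print(f"k={k}: inertia={inertia:.1f}")
```

On data like this the drop in inertia is steep up to the true number of groups and shallow afterwards; on survey-style features with few distinct values the elbow is usually much less clear.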

In [None]:
# Re-import what this cell uses so it can run on its own
from sklearn.cluster import KMeans
import matplotlib.pyplot as plt

kmeans = KMeans(n_clusters=3)  
kmeans.fit(X)

In [None]:
# Printing the Centroids
print(kmeans.cluster_centers_)

In [None]:
print(kmeans.labels_)

In [None]:
fig = plt.figure(figsize=(12,8))

plt.scatter(X.values[:,0], X.values[:,1], c=kmeans.labels_, cmap="Set1_r", s=25)
plt.scatter(kmeans.cluster_centers_[:,0] ,kmeans.cluster_centers_[:,1], color='black', marker="x", s=250)
plt.title("Kmeans Clustering \n Finding Unknown Groups in the Population", fontsize=16)
plt.show()

In [None]:
from sklearn.cluster import AgglomerativeClustering

X = df[["h1n1_concern", "h1n1_vaccine"]]

agglomerative_clustering = AgglomerativeClustering(n_clusters=4).fit(X)
agglomerative_clustering

In [None]:
from scipy.cluster.hierarchy import dendrogram, linkage

# 5% of the data 
sample_df = df.sample(frac=.05)

sample_X = sample_df[["h1n1_concern", "h1n1_vaccine"]]

sample_agglomerative_clustering = AgglomerativeClustering(n_clusters=4).fit(sample_X)
sample_agglomerative_clustering


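# Build the linkage matrix from the sampled observations, not from children_,
# so that the number of leaves matches the number of labels.
# 'ward' matches the default linkage of AgglomerativeClustering.
linked = linkage(sample_X, 'ward')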

In [None]:
agglomerative_clustering.labels_

In [None]:
print(plt.style.available)

In [None]:
plt.style.use("Solarize_Light2")

fig, (ax1, ax2) = plt.subplots(ncols=2, figsize=(16,6))

ax1.scatter(X.values[:,0], X.values[:,1], c=agglomerative_clustering.labels_, cmap="Set1_r", s=25)
ax1.set_title("Agglomerative Clustering", fontsize=16)

dendrogram(linked,  
            orientation='top',
            labels=sample_agglomerative_clustering.labels_,
            distance_sort='descending',
            show_leaf_counts=False,
          ax=ax2)

ax2.set_title("Dendrogram on Agglomerative Clustering")

plt.show()

That dendrogram looks like a doodle
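
A dendrogram with more than a thousand leaves is hard to read whatever the data. A minimal sketch, on synthetic points (an assumption, not the survey sample), of building the linkage matrix directly from the observations and collapsing the tree to a dozen leaves with SciPy's `truncate_mode`:

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage

rng = np.random.default_rng(0)
# Synthetic stand-in (assumption): 200 points in two loose 2-D groups.
points = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(100, 2)),
    rng.normal(loc=3.0, scale=0.5, size=(100, 2)),
])

# linkage() takes the observations (or condensed distances) as input.
Z = linkage(points, method="ward")

# Collapse the tree to 12 leaves so it stays legible; no_plot=True returns
# the layout as a dict instead of drawing it.
tree = dendrogram(Z, truncate_mode="lastp", p=12, no_plot=True)
print(Z.shape, len(tree["ivl"]))
```

Dropping `no_plot=True` and passing `ax=ax2` would draw the truncated tree on an existing axis.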

In [None]:
corr = df.corr()

In [None]:
mask = np.zeros_like(corr, dtype=bool)  # np.bool was removed in NumPy 1.24
mask[np.triu_indices_from(mask)] = True

# Set up the matplotlib figure
f, ax = plt.subplots(figsize=(12, 10))

# Generate a custom diverging colormap
cmap = sns.diverging_palette(220, 10, as_cmap=True)

# Draw the heatmap with the mask and correct aspect ratio
sns.heatmap(corr, mask=mask, cmap=cmap, vmax=.3, center=0,
            square=True, linewidths=.5, cbar_kws={"shrink": .5})

plt.title("Diagonal Correlation Matrix", fontsize=20)

plt.show()

In [None]:
fig = plt.figure(figsize=(12,8))

corr = df.corr()
ax = sns.heatmap(corr, linewidths=.5, cmap="RdBu", annot=True, fmt="g")
plt.title("Correlation Plot \n to Decide which features to drop", fontsize=16)
plt.show()

# Not very helpful. Not helpful at all!

In [None]:
model_without_census_msa = ols("h1n1_vaccine ~ h1n1_concern + age_group", data=df).fit()
print(model_without_census_msa.summary())

#Without census msa

In [None]:
# Age group on the x-axis; the response is h1n1_vaccine.
fig = plt.figure(figsize=(12,8))
fig = sm.graphics.plot_regress_exog(model_without_census_msa, "age_group", fig=fig)

In [None]:
fig, ax = plt.subplots(figsize=(12, 8))
fig = sm.graphics.plot_fit(model_without_census_msa, "age_group", ax=ax)

In [None]:
model_with_census_msa = ols("h1n1_vaccine ~ census_msa + h1n1_concern + age_group", data=df).fit()
print(model_with_census_msa.summary())

# With census_msa

In [None]:
# Age group on the x-axis; the response is h1n1_vaccine.
fig = plt.figure(figsize=(12,8))
fig = sm.graphics.plot_regress_exog(model_with_census_msa, "age_group", fig=fig)

In [None]:
fig, ax = plt.subplots(figsize=(12, 8))
fig = sm.graphics.plot_fit(model_with_census_msa, "age_group", ax=ax)
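
The two fits above differ only in whether `census_msa` is included. A minimal sketch of comparing such nested models side by side, assuming a synthetic dataframe whose column names mirror the notebook's (the values and the `msa_a`/`msa_b`/`msa_c` labels are hypothetical):

```python
import numpy as np
import pandas as pd
from statsmodels.formula.api import ols

rng = np.random.default_rng(0)
n = 500
# Synthetic stand-in data (assumption); only the column names match the notebook.
demo = pd.DataFrame({
    "h1n1_concern": rng.integers(0, 4, size=n),
    "age_group": rng.integers(0, 5, size=n),
    "census_msa": rng.choice(["msa_a", "msa_b", "msa_c"], size=n),
})
demo["h1n1_vaccine"] = (
    0.1 * demo["h1n1_concern"] + 0.02 * demo["age_group"] + rng.normal(0, 0.3, size=n)
)

small = ols("h1n1_vaccine ~ h1n1_concern + age_group", data=demo).fit()
full = ols("h1n1_vaccine ~ census_msa + h1n1_concern + age_group", data=demo).fit()

# Adjusted R^2 and AIC both penalize the extra census_msa terms.
comparison = pd.DataFrame(
    {"adj_r2": [small.rsquared_adj, full.rsquared_adj], "aic": [small.aic, full.aic]},
    index=["without census_msa", "with census_msa"],
)
print(comparison)
```

If the larger model does not improve adjusted R² or lower the AIC, the extra feature is probably not earning its place.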

In [None]:
# Let's see the skewness of h1n1_vaccine
not_normalized = skew(df['h1n1_vaccine'].values.tolist())
# log1p avoids log(0) = -inf when the column contains zeros
normalized = skew(np.log1p(df['h1n1_vaccine'].values.tolist()))



trace0 = go.Bar(
    x=['Not Normalized', 'Normalized'],
    y=[not_normalized, normalized],
    text=['Not Normalized Skewness', 'Normalized Skewness'],
    marker=dict(
        color='rgb(158,202,225)',
        line=dict(
            color='rgb(8,48,107)',
            width=1.5,
        )
    ),
    opacity=0.6
)

data = [trace0]
layout = go.Layout(
    title='H1N1 Vaccine Skewness <br> Normalized vs Not Normalized',
    yaxis=dict(
        title='Skewness Coefficient',
        titlefont=dict(
            size=16,
            color='rgb(107, 107, 107)'
        )
))

fig = go.Figure(data=data, layout=layout)

iplot(fig, filename='bar-direct-labels')
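
A log transform is a remedy for continuous, right-skewed values. A minimal sketch of what happens when the column is a 0/1 outcome instead, assuming a synthetic binary vector `y` (not the survey data): the sample skewness then depends only on the share of ones, and `log1p` merely rescales the two values.

```python
import numpy as np
from scipy.stats import skew

rng = np.random.default_rng(0)
# Synthetic 0/1 outcome (assumption): roughly 20% ones.
y = (rng.random(1000) < 0.2).astype(float)

p = y.mean()
# Skewness of a 0/1 sample with share of ones p.
closed_form = (1 - 2 * p) / np.sqrt(p * (1 - p))

print(skew(y), closed_form)   # the two values agree
print(skew(np.log1p(y)))      # unchanged: log1p maps {0, 1} to {0, ln 2}
```

For a binary column, then, the "normalized" and "not normalized" bars would be expected to match.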

In [None]:
# log1p avoids log(0) = -inf when h1n1_vaccine contains zeros
df['log_charges'] = np.log1p(df["h1n1_vaccine"])

model_with_logcharges = ols("log_charges ~ census_msa + h1n1_concern + age_group", data=df).fit()
print(model_with_logcharges.summary())

In [None]:
# Using age group to predict the log-transformed h1n1_vaccine
plt.style.use("dark_background")

fig = plt.figure(figsize=(12,8))
fig = sm.graphics.plot_regress_exog(model_with_logcharges, "age_group", fig=fig)

The end result: A potentially life-saving vaccine for COVID-19 that’s “been developed in months, which is something that, up until now, has been virtually unheard of.” “But there is an urgent medical need for us to do this, and do it safely.”
https://www.jnj.com/innovation/the-5-stages-of-covid-19-vaccine-development-what-you-need-to-know-about-how-a-clinical-trial-works

In [None]:
fig, ax = plt.subplots(figsize=(12, 8))
fig = sm.graphics.plot_fit(model_with_logcharges, "age_group", ax=ax)

#The time it will take to get the vaccine out and reach everyone is "the time for patience"

The time it will take to get the vaccine out and reach everyone is "the time for patience". Until that happens, our main allies against the new coronavirus will continue to be wearing masks, keeping hands, objects and environments clean, and social distancing.
https://translate.google.com.br/translate?hl=en&sl=pt&u=https://reporterbrasil.org.br/2020/08/vacina-da-covid-19-em-janeiro-saiba-o-que-pode-atrasar-a-producao-no-brasil/&prev=search&pto=aue

In [None]:
#Code by Olga Belitskaya https://www.kaggle.com/olgabelitskaya/sequential-data/comments
from IPython.display import display,HTML
c1,c2,f1,f2,fs1,fs2=\
'#eb3434','#eb3446','Akronim','Smokum',30,15
def dhtml(string,fontcolor=c1,font=f1,fontsize=fs1):
    display(HTML("""<style>
    @import 'https://fonts.googleapis.com/css?family="""\
    +font+"""&effect=3d-float';</style>
    <h1 class='font-effect-3d-float' style='font-family:"""+\
    font+"""; color:"""+fontcolor+"""; font-size:"""+\
    str(fontsize)+"""px;'>%s</h1>"""%string))
    
    
dhtml('Bless Fiocruz and Butantan Institute, @mpwolke was here' )