
#Low birth weight infants and Calmette-Guérin bacillus vaccination at birth: community study from Guinea-Bissau

By Adam Roth, Henrik Jensen, May-Lill Garly, Queba Djana, Cesário Lourenco Martins, Morten Sodemann, Amabelia Rodrigues, Peter Aaby
PMID: 15194836 DOI: 10.1097/01.inf.0000129693.81082.a0

In developing countries, low birth weight (LBW) children are often not vaccinated with Calmette-Guérin bacillus (BCG) at birth. Recent studies have suggested that BCG may have a nonspecific beneficial effect on infant mortality. The authors evaluated the consequences of not vaccinating LBW children at birth in Guinea-Bissau.

Between 1989 and 1999, 7138 children born at the central hospital had a birth weight registered. They assessed BCG coverage until 3 years of age. Data on tuberculin skin test (TST) for 297 children and BCG scar for 1319 children in the study population were reanalyzed for differences between normal birth weight (NBW) children and LBW children. They assessed the effect of early BCG vaccination on mortality to 12 months of age.

Among LBW children there were 1.5- to 3-fold more unvaccinated individuals than among NBW children up to 4 months of age. There was no overall difference between LBW and NBW children in TST or BCG scarring; LBW children vaccinated early may have had slightly reduced reactions to tuberculin. Among 845 LBW children, 182 had received BCG within the first week of life. Controlling for background factors and censoring at first diphtheria-tetanus-pertussis vaccination, measles vaccination or at 6 months of age (whichever came first), the mortality rate ratio for BCG-vaccinated versus unvaccinated LBW children was 0.17 (95 per cent confidence interval, 0.06-0.49), with an even stronger effect for LBW children vaccinated in the first week of life (mortality rate ratio, 0.07; 95 per cent confidence interval, 0.01-0.62).

The policy of not vaccinating with BCG at birth had a negative impact on vaccination coverage for LBW children. Early BCG vaccination had no large negative impact on TST and BCG scarring. Mortality was lower for BCG-vaccinated than for unvaccinated LBW children controlling for available background factors. BCG vaccination of LBW children may have a beneficial effect on survival that cannot be explained by protection against tuberculosis. Future studies should examine possible adverse effects from equalizing BCG policy for LBW and NBW children. https://pubmed.ncbi.nlm.nih.gov/15194836/
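As a rough sanity check on the figures quoted in the abstract, the reported confidence interval can be read on the log scale, where intervals for rate ratios are usually symmetric. The sketch below is not the authors' analysis: it assumes a standard Wald-type 95% interval (z = 1.96) and only reuses the numbers from the abstract.

```python
import math

# Figures quoted in the abstract
lbw_children = 845
vaccinated_first_week = 182
mrr, ci_low, ci_high = 0.17, 0.06, 0.49

# Share of LBW children who received BCG within the first week of life
share_early = vaccinated_first_week / lbw_children

# Assumption: a Wald-type 95% interval, symmetric on the log scale (z = 1.96)
z = 1.96
se_log = (math.log(ci_high) - math.log(ci_low)) / (2 * z)

# If the interval is symmetric in logs, its geometric midpoint sits near the estimate
geometric_midpoint = math.sqrt(ci_low * ci_high)

print(f"Vaccinated in first week: {share_early:.1%}")
print(f"Implied SE of log(MRR): {se_log:.3f}")
print(f"Geometric midpoint of CI: {geometric_midpoint:.3f} (reported MRR: {mrr})")
```

The geometric midpoint landing close to the reported ratio suggests the interval was built on the log scale, which is why it looks lopsided around 0.17 on the ordinary scale.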

![](https://encrypted-tbn0.gstatic.com/images?q=tbn%3AANd9GcTQXRC8iSNx0JbH7N0C_bA-bmOPqbJtVN1gG_3bi0536aQL9LEnM3_5eCRG0r8wOyxo0hMTP4NsQHrM00oac5MNIQ&usqp=CAU&ec=45687378)covid19gb.com

In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
import plotly.express as px
import plotly.graph_objects as go
import plotly.offline as py

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 5GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

#Coronavirus Drugs—To Have and Have Not by Joelle Dountio O.
Africa's least developed countries should apply a WTO patent waiver to import generic versions of future COVID-19 drugs.

![](https://www.thinkglobalhealth.org/sites/default/files/2020-05/Joelle.Ofimboudem-CoV-Patents-4.30.20-RTX7EKXB-THREE-TWO.jpg)https://www.thinkglobalhealth.org/tag/guinea-bissau

#Covid-19: State of emergency extended until August 24 in Guinea-Bissau

President Umaro Sissoco Embaló decided to prolong the state of emergency for the seventh time, "considering the evolution of the epidemiological situation in the country, translated by the increase in the number of confirmed cases".


The state of emergency ended this Saturday (25.07) at 00:00 local time. The head of state also took into account the fact that the country is in the rainy season, when there is an "increase in morbidity and mortality in the population" and when there is "a significant increase in respiratory infections and malaria".

"Although the situation described above deserves special attention, it should be noted that positive results were achieved in the fight against the Covid-19 pandemic thanks to a great and multiple human solidarity. The gains achieved must be maintained, consolidated and increased", says Umaro Sissoco Embaló in the presidential decree released to the press this Saturday.

International circulation

The Guinean President also announced that he decided to lift the suspension of international circulation, taking into account the decision of the Economic Community of West African States (ECOWAS), an organization of which Guinea-Bissau is a part, to open up space for cross-border circulation.

In the decree, the head of state maintains the mandatory use of an individual protection mask and respect for physical distance.

Umaro Sissoco Embaló declared the state of emergency in the country for the first time in March, after the first cases of infection with the new coronavirus had been confirmed.

Guinea-Bissau, with around two million inhabitants, has an accumulated total of almost 2,000 infections, including 26 fatalities. https://translate.google.com.br/translate?hl=en&sl=pt&u=https://www.dw.com/pt-002/coronav%25C3%25ADrus-na-guin%25C3%25A9-bissau/t-52892587&prev=search&pto=aue

In [None]:
pd.options.display.float_format = '{:.2f}'.format
from sklearn.model_selection import train_test_split
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import GridSearchCV
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.metrics import confusion_matrix
from sklearn.metrics import roc_auc_score
from sklearn.metrics import classification_report
from sklearn.metrics import precision_recall_curve
from sklearn.metrics import RocCurveDisplay
try:
    from sklearn.metrics import plot_roc_curve  # removed in scikit-learn 1.2
except ImportError:
    # Fall back to the replacement API, which takes the same (estimator, X, y) arguments
    plot_roc_curve = RocCurveDisplay.from_estimator

In [None]:
df = pd.read_csv('../input/hackathon/task_2-owid_covid_data-21_June_2020.csv')
df.head()

In [None]:
guinea = df[(df['location']=='Guinea-Bissau')].reset_index(drop=True)
guinea.head()

#Codes by Tanmay Deshpande https://www.kaggle.com/tanmay111999/handling-imbalanced-datasets-for-high-scores/comments

In [None]:
# Share of Guinea-Bissau rows (days) with at least one reported case vs. none
cases = (guinea['total_cases'] > 0).mean()*100
no_cases = (guinea['total_cases'] == 0).mean()*100
cases_percentage = [cases,no_cases]

#There are complaints of lack of conditions in a hospital dedicated to tuberculosis.

In Bissau, staff and patients at Raoul Follereau Hospital, dedicated to the treatment of tuberculosis, complain about the lack of conditions and protective materials. Hospital administration says that is not so.

On Monday (20.07), employees boycotted the operation of Raoul Follereau hospital, complaining about the lack of materials and minimum conditions for work, without protective masks and gloves.

The hospital is the only center specialized in tuberculosis treatment in Guinea-Bissau and is administered by the Italian NGO Aid to Health and Development (AHEAD), which has guaranteed free consultations, hospitalizations and treatments for patients with tuberculosis, who are also entitled to five meals a day, all for free.

But patients admitted to the hospital and heard by DW Africa report another reality. "They stopped food from coming from home, but they put a liter of lemon in the hospital food and, if we eat it, we feel stomach and chest pain. So what is the use of stopping food from coming from home, to prevent us from eating in here?" says a patient.

"Not everyone has a mosquito net here, I ordered it at home. What they gave me here is not good. But the breakfast they prepare for us is good and we eat," said another patient.

#Government abandons hospitals?

In Guinea-Bissau, there are several centers specialized in the treatment of different pathologies that are under the tutelage of NGOs or have been abandoned.

Doctor Hedwis Martins asks the Government to assume its responsibilities: "It is NGOs that help in sustainability, in the administrative functioning and in the functioning of hospital structures".

"Government should take over specialized institutions and see them as a source of revenue", urges Martins.

The doctor also argues that "it is only a matter of organizing and equipping them, to create conditions so that the tests that are requested are done in the hospital center and the patients can have a decent hospitalization". https://translate.google.com.br/translate?hl=en&sl=pt&u=https://www.dw.com/pt-002/coronav%25C3%25ADrus-na-guin%25C3%25A9-bissau/t-52892587&prev=search&pto=aue


In [None]:
fig,ax = plt.subplots(nrows = 1,ncols = 3,figsize = (20,5))
plt.subplot(1,3,1)
plt.pie(cases_percentage,labels = ['Cases','No Cases'],autopct='%1.1f%%',startangle = 90)
plt.title('Guinea-Bissau Total Cases Percentage')

plt.subplot(1,3,2)
# Recent seaborn versions require x/y to be passed as keyword arguments
sns.countplot(x = 'total_cases',data = guinea)
plt.xticks(rotation=45)
plt.title('Guinea-Bissau Total Cases')

plt.subplot(1,3,3)
sns.scatterplot(x = 'total_cases_per_million',y = 'total_deaths',data = guinea,hue = 'total_cases')
plt.title('Guinea-Bissau Total Cases Per Million vs Total Deaths w.r.t (with respect to) Total Cases')
plt.show()

In [None]:
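# Restrict to numeric columns; corr() fails on text columns in recent pandas versions
sns.heatmap(guinea.select_dtypes('number').corr(),cmap = 'RdBu',cbar = True)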

#w.r.t = with respect to

In [None]:
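# Correlate within the Guinea-Bissau rows; the full df would be aligned by index with rows from other countries
corr = guinea.select_dtypes('number').corrwith(guinea['total_cases']).sort_values(ascending = False).to_frame()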
corr.columns = ['Correlations']
plt.subplots(figsize = (5,25))
sns.heatmap(corr,annot = True,cmap = 'RdBu',linewidths = 0.4,linecolor = 'black')
plt.title('Guinea-Bissau Correlation w.r.t Total Cases')

#Bissau: Deputies donate 25% of salary to fight Covid-19

The National People's Assembly of Guinea-Bissau handed over a check for more than 40,000 euros to the High Commission for the Fight against Covid-19 for support in combating the pandemic caused by the new coronavirus.
Guinean deputies decided to contribute 25 per cent of their salary to help fight Covid-19, donating a total of 27,964,528 CFA francs (about 42,600 euros).

"I am grateful on behalf of the High Commissioner for the check resulting from the contribution of all the deputies of the Nation to the work of the fight against Covid-19," said the high commissioner for the fight against Covid-19, Magda Robalo da Silva.

In the words of thanks, the former Minister of Health asked the deputies to also contribute to the fight against the pandemic among their constituencies by raising awareness for prevention.

"Thank you very much on behalf of the Guinean people, whom you represent at the National People's Assembly," said Magda Robalo da Silva.

https://translate.google.com.br/translate?hl=en&sl=pt&u=https://www.dw.com/pt-002/coronav%25C3%25ADrus-na-guin%25C3%25A9-bissau/t-52892587&prev=search&pto=aue

In [None]:
sns.heatmap(guinea.isnull(),cmap = 'winter',cbar = False)

#Handling Missing Values

In [None]:
# Let's first handle numerical features with NaN values (any column with at least one missing value)
numerical_nan = [feature for feature in guinea.columns if guinea[feature].isna().sum()>0 and guinea[feature].dtypes!='O']
numerical_nan

In [None]:
guinea[numerical_nan].isna().sum()

In [None]:
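## Replacing the numerical Missing Values

for feature in numerical_nan:
    ## We will replace by using median since there are outliers
    ## (taken from the full dataset df, not only the Guinea-Bissau rows)
    median_value=df[feature].median()

    ## Assign back instead of inplace=True, which may not modify the DataFrame in recent pandas
    guinea[feature] = guinea[feature].fillna(median_value)

guinea[numerical_nan].isnull().sum()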
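The comment in the cell above prefers the median because of outliers. A minimal sketch of why, using only the standard library and made-up values (not taken from the dataset):

```python
import math
from statistics import mean, median

# Hypothetical daily values with one extreme outlier and two gaps (NaN)
values = [2.0, 3.0, float("nan"), 4.0, 500.0, float("nan"), 5.0]

observed = [v for v in values if not math.isnan(v)]
fill_mean = mean(observed)      # pulled far up by the single outlier
fill_median = median(observed)  # stays near the typical value

# Fill the gaps with the median, leaving observed values untouched
imputed = [fill_median if math.isnan(v) else v for v in values]

print("mean:", fill_mean, "median:", fill_median)
print(imputed)
```

Filling with the mean here would insert values around 100 into a series that is otherwise in the low single digits, which is the distortion the median avoids.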

In [None]:
# categorical features with at least one missing value
categorical_nan = [feature for feature in guinea.columns if guinea[feature].isna().sum()>0 and guinea[feature].dtypes=='O']
print(categorical_nan)

In [None]:
guinea[categorical_nan].isna().sum()

In [None]:
# replacing missing values in categorical features
for feature in categorical_nan:
    guinea[feature] = guinea[feature].fillna('None')

In [None]:
guinea[categorical_nan].isna().sum()

#Feature Selection

In [None]:
guinea = guinea[['total_cases', 'stringency_index','new_cases','new_cases_per_million','total_cases_per_million', 'total_deaths_per_million', 'total_deaths']]
guinea.head()

In [None]:
import imblearn
from collections import Counter
from imblearn.over_sampling import SMOTE
from imblearn.under_sampling import RandomUnderSampler
from imblearn.pipeline import Pipeline

imblearn.__version__

In [None]:
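from sklearn.metrics import RocCurveDisplay  # replaces plot_roc_curve, removed in scikit-learn 1.2

def model(classifier):
    """Fit on the global train split, then report cross-validated and held-out ROC AUC."""
    classifier.fit(x_train,y_train)
    prediction = classifier.predict(x_test)
    cv = RepeatedStratifiedKFold(n_splits = 10,n_repeats = 3,random_state = 1)
    print("CROSS VALIDATION SCORE : ",'{0:.2%}'.format(cross_val_score(classifier,x_train,y_train,cv = cv,scoring = 'roc_auc').mean()))
    # Note: this score uses hard class predictions rather than predicted probabilities
    print("ROC_AUC SCORE : ",'{0:.2%}'.format(roc_auc_score(y_test,prediction)))
    RocCurveDisplay.from_estimator(classifier, x_test,y_test)
    plt.title('ROC_AUC_PLOT')
    plt.show()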

#EVALUATION METRICS

For imbalanced datasets, we cannot rely on a traditional metric like accuracy, because the data is skewed towards the majority class and the model will favour the majority class in its predictions. After resampling, the data also contains duplicated and newly added data points, so accuracy would be misleading for evaluating the model. We will use the confusion matrix, the ROC curve and the ROC-AUC score instead. The ROC curve shows the relation between the true positive rate and the false positive rate. We will also use the F1 score, recall and precision.
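To make the point about accuracy concrete, here is a minimal pure-Python sketch. The labels and the two sets of predictions are hypothetical, and `binary_metrics` is a helper written for this illustration, not part of scikit-learn.

```python
def binary_metrics(y_true, y_pred):
    """Return accuracy, precision, recall and F1 from the confusion-matrix counts."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical imbalanced labels: 95 negatives followed by 5 positives
y_true = [0] * 95 + [1] * 5

# A "model" that always predicts the majority class
always_negative = [0] * 100

# A model that finds 4 of the 5 positives at the cost of 10 false alarms
better = [1] * 10 + [0] * 85 + [1] * 4 + [0] * 1

majority_scores = binary_metrics(y_true, always_negative)
better_scores = binary_metrics(y_true, better)
print(majority_scores)
print(better_scores)
```

The always-negative model scores higher on accuracy while detecting no positives at all, which only recall and F1 reveal.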

In [None]:
def model_evaluation(classifier):
    
    # CONFUSION MATRIX
    cm = confusion_matrix(y_test,classifier.predict(x_test))
    names = ['True Neg','False Pos','False Neg','True Pos']
    counts = [value for value in cm.flatten()]
    percentages = ['{0:.2%}'.format(value) for value in cm.flatten()/np.sum(cm)]
    labels = [f'{v1}\n{v2}\n{v3}' for v1, v2, v3 in zip(names,counts,percentages)]
    labels = np.asarray(labels).reshape(2,2)
    sns.heatmap(cm,annot = labels,cmap = 'Blues',fmt ='')
    
    # CLASSIFICATION REPORT
    print(classification_report(y_test,classifier.predict(x_test)))

#I changed to iloc 6 since there are 7 columns, after feature selection

In [None]:
over = SMOTE(sampling_strategy= 0.5)
under = RandomUnderSampler(sampling_strategy = 0.1)
features = guinea.iloc[:,:6].values
target = guinea.iloc[:,6].values

#Multi class classification problem can be solved in different ways:

Create a binary variable for each class and predict each one individually as a binary classification problem, then combine the results. This is not a good choice when there are many classes, because it takes considerable processing time. Such binary classifiers can be used for multiclass problems with the one-vs-all or all-vs-all reduction method. Logistic regression and decision tree algorithms are options here.

You can use algorithms like Naive Bayes, Neural Networks and SVM to solve a multiclass problem.

You can also use multi-layer modeling: first group the classes into different categories, then apply other modeling techniques on top. https://discuss.analyticsvidhya.com/t/which-algorithms-are-good-for-multiclass-classification-problems/2561

I tried Naive Bayes and SVM and they didn't work. I'll have to learn how to deal with binary variables and multilayers.
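The first approach (one binary variable per class, then combining the results) can be sketched in plain Python. The class names and scores below are hypothetical, and the two helper functions are written only for this illustration; in practice the scores would come from separately trained binary classifiers.

```python
def one_vs_rest_targets(labels):
    """Build one binary target vector per class: 1 for that class, 0 for the rest."""
    classes = sorted(set(labels))
    return {c: [1 if y == c else 0 for y in labels] for c in classes}

def combine(scores_per_class):
    """For each sample, pick the class whose binary model gave the highest score."""
    classes = list(scores_per_class)
    n_samples = len(scores_per_class[classes[0]])
    return [max(classes, key=lambda c: scores_per_class[c][i]) for i in range(n_samples)]

labels = ["low", "high", "medium", "low"]
targets = one_vs_rest_targets(labels)
print(targets)

# Hypothetical scores, as if three binary classifiers had each scored the four samples
scores = {
    "low":    [0.8, 0.1, 0.2, 0.6],
    "medium": [0.1, 0.2, 0.7, 0.3],
    "high":   [0.1, 0.7, 0.1, 0.1],
}
predicted = combine(scores)
print(predicted)
```

Each class needs its own fitted model, which is where the processing cost mentioned above comes from when the number of classes is large.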

In [None]:
steps = [('under', under),('over', over)]
pipeline = Pipeline(steps=steps)
features, target = pipeline.fit_resample(features, target)
Counter(target)
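For a binary target, a float `sampling_strategy` is the desired ratio of minority to majority samples after that resampling step. The arithmetic below is a sketch with made-up class counts; `resampled_counts` is a hypothetical helper, and the exact rounding inside imbalanced-learn may differ.

```python
def resampled_counts(n_majority, n_minority, under_ratio, over_ratio):
    """Class counts after undersampling the majority, then oversampling the minority.

    Each ratio is the target n_minority / n_majority after its step.
    """
    # Undersampling shrinks the majority until minority / majority == under_ratio
    n_majority_after = round(n_minority / under_ratio)
    # Oversampling grows the minority until minority / majority == over_ratio
    n_minority_after = round(n_majority_after * over_ratio)
    return n_majority_after, n_minority_after

# Hypothetical starting counts; the ratios match the `under` and `over` objects above
counts = resampled_counts(10_000, 100, under_ratio=0.1, over_ratio=0.5)
print(counts)
```

With these counts the majority class is cut to ten times the minority, and the minority is then synthesized up to half the remaining majority.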

In [None]:
x_train, x_test, y_train, y_test = train_test_split(features, target, test_size = 0.20, random_state = 2)

In [None]:
from sklearn.linear_model import LogisticRegression

In [None]:
classifier_lr = LogisticRegression(random_state = 0,C=10,penalty= 'l2')

In [None]:
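from imblearn.over_sampling import SMOTE
# sampling_strategy is passed by keyword; fit_sample was replaced by fit_resample
smote=SMOTE(sampling_strategy="minority")
X,Y=smote.fit_resample(x_train,y_train)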

In [None]:
model(classifier_lr)

In [None]:
model_evaluation(classifier_lr)

In [None]:
X_train = guinea.iloc[:, 2:].values.astype('float64')
y_train = guinea['total_cases'].values

In [None]:
from sklearn.preprocessing import QuantileTransformer

transformed = pd.DataFrame(QuantileTransformer(output_distribution='normal').fit_transform(X_train))

In [None]:
from sklearn.pipeline import make_pipeline
from sklearn.naive_bayes import GaussianNB

pipeline = make_pipeline(QuantileTransformer(output_distribution='normal'), GaussianNB())
pipeline.fit(X_train, y_train)

In [None]:
from sklearn.metrics import roc_curve, auc

fpr, tpr, thr = roc_curve(y_train, pipeline.predict_proba(X_train)[:,1])
plt.plot(fpr, tpr)
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Receiver Operating Characteristic Plot')
auc(fpr, tpr)
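The area under the ROC curve can also be read as the probability that a randomly chosen positive gets a higher score than a randomly chosen negative. The sketch below computes it that way in plain Python for a tiny hypothetical example; `roc_auc_pairwise` is an illustrative helper, not a scikit-learn function.

```python
def roc_auc_pairwise(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical binary labels and predicted scores
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

auc_value = roc_auc_pairwise(y_true, scores)
print(auc_value)
```

This pairwise view only applies to a binary target, and it compares every positive with every negative, so it is meant for intuition on small inputs rather than for large datasets.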

In [None]:
from sklearn.model_selection import cross_val_score

cross_val_score(pipeline, X_train, y_train, scoring='roc_auc', cv=5).mean()

In [None]:
#model_evaluation(classifier_svc)

In [None]:
df1 = pd.read_csv('../input/hackathon/task_2-Tuberculosis_infection_estimates_for_2018.csv', encoding='utf8')
df1.head()

In [None]:
Guinea = df1[(df1['country']=='Guinea-Bissau')].reset_index(drop=True)
Guinea.head()

In [None]:
sns.countplot(x="e_hh_size",data=Guinea,palette="ocean",edgecolor="black")
plt.title('Guinea-Bissau Estimated Household Size', weight='bold')
plt.xticks(rotation=45)
plt.yticks(rotation=45)
# changing the font size
sns.set(font_scale=1)

In [None]:
sns.countplot(x="e_prevtx_kids_pct_hi",data=Guinea,palette="autumn",edgecolor="black")
plt.title('Guinea-Bissau Preventive Treatment, Kids under 5y (%), High Bound', weight='bold')
plt.xticks(rotation=45)
plt.yticks(rotation=45)
# changing the font size
sns.set(font_scale=1)

In [None]:
df2 = pd.read_csv('../input/hackathon/task_2-BCG_world_atlas_data-bcg_strain-7July2020.csv', encoding='utf8')
df2.head()

In [None]:
guinea_bissau = df2[(df2['country_name']=='Guinea-Bissau')].reset_index(drop=True)
guinea_bissau.head()

In [None]:
sns.countplot(x="vaccination_timing",data=guinea_bissau,palette="flag",edgecolor="black")
plt.title('Guinea-Bissau BCG Vaccination Timing', weight='bold')
plt.xticks(rotation=45)
plt.yticks(rotation=45)
# changing the font size
sns.set(font_scale=1)

In [None]:
ls ../input/hackathon/task_1-google_search_txt_files_v2/GW/

In [None]:
Guinea_Bissau = '../input/hackathon/task_1-google_search_txt_files_v2/GW/Guinea-Bissau-pt-result-13.txt'

In [None]:
# Use a context manager so the file is closed after reading
with open(Guinea_Bissau, 'r', encoding='utf-8', errors='ignore') as f:
    text = f.read()

In [None]:
print(text[:2000])

In [None]:
#word cloud
from wordcloud import WordCloud
# Note: this overwrites `text` with the country names from the BCG atlas rows
text = " ".join(str(each) for each in guinea_bissau.country_name)
# Create and generate a word cloud image:
wordcloud = WordCloud(max_words=200,colormap='Set3', background_color="black").generate(text)
# A single figure is enough; the extra plt.figure calls only created empty figures
plt.figure(figsize=(15,10))
# Display the generated image:
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis("off")
plt.show()

That's it. Kaggle Notebook Runner: Marília Prata @mpwolke