#Situation Overview and Humanitarian Needs

Although the number of newly registered cases is progressively slowing, the epidemic is still ongoing: all 8 regions of the country are affected, and the disease tends to spread from the capital and other main cities towards rural areas.

So far, Niamey has reported the highest number of cumulative cases, but Zinder is becoming the region with the most active transmission, with new cases almost daily. Given the overall slowdown in new cases, the Niger government decided to suspend some of the measures in place to control COVID-19 transmission, and schools will resume their activities on 1 June.

UNICEF continues to work closely with the Government and its partners to respond to the ongoing outbreak in the country, which is already facing the consequences of multiple crises (nutrition, conflicts, natural disasters). In accordance with its updated response plan, UNICEF continues to support the Government, and particularly the Ministry of Health (MoH), in the fields of risk communication/community engagement (RCCE), infection prevention and control (IPC), supply and logistics, epidemiological surveillance and healthcare provision.

Moreover, the Country Office is working closely with the Ministry of Education to prepare the re-opening of schools and with Child Protection actors to ensure that the needs of children on the move are met. UNICEF is co-leading 3 of the 8 sub-committees established by the MoH (RCCE, IPC and logistics) at central and sub-national level and is an active member of the others. UNICEF is also participating in the UN COVID-19 crisis group. https://reliefweb.int/report/niger/niger-coronavirus-covid-19-situation-report-06-11-27-may-2020

#UNICEF/Juan Haro - In Niger, children are affected by many humanitarian crises.

![](https://global.unitednations.entermediadb.net/assets/mediadb/services/module/asset/downloads/preset/Libraries/Production+Library/15-05-2020_UNICEF-I328048_Niger.jpg/image1170x530cropped.jpg)https://news.un.org/pt/story/2020/06/1716942

In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
import plotly.express as px
import plotly.graph_objects as go
import plotly.offline as py

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 5GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

In [None]:
df1 = pd.read_csv('../input/hackathon/task_2-Tuberculosis_infection_estimates_for_2018.csv', encoding='utf8')
df1.head()

In [None]:
NE = df1[(df1['country']=='Niger')].reset_index(drop=True)
NE.head()

In [None]:
sns.countplot(x="e_hh_size",data=NE,palette="ocean",edgecolor="black")
plt.title('Niger Estimated Household Size', weight='bold')
plt.xticks(rotation=45)
plt.yticks(rotation=45)
# changing the font size
sns.set(font_scale=1)

In [None]:
sns.countplot(x="e_prevtx_kids_pct_hi",data=NE,palette="autumn",edgecolor="black")
plt.title('Niger Preventive Index Kids under 5y High Bound', weight='bold')
plt.xticks(rotation=45)
plt.yticks(rotation=45)
# changing the font size
sns.set(font_scale=1)

In [None]:
ls ../input/hackathon/task_1-google_search_txt_files_v2/NE/

In [None]:
Niger = '../input/hackathon/task_1-google_search_txt_files_v2/NE/Niger-fr-result-28.txt'

In [None]:
# Read the whole file; the context manager closes it afterwards
with open(Niger, 'r', encoding='utf-8', errors='ignore') as f:
    text = f.read()

In [None]:
print(text[:2000])

In [None]:
df = pd.read_csv('../input/ai4all-project/figures/classifier/auc_table.csv', encoding='ISO-8859-2')
df.head()

#New polio outbreak in Niger after vaccination suspended, Associated Press, posted Apr 29, 2020

LONDON (AP) — The World Health Organization says Niger has been struck by a new outbreak of polio, following the suspension of immunization activities during the COVID-19 pandemic.

The U.N. health agency reported that two children were infected by the highly infectious, water-borne disease and that one was paralyzed. The outbreak was sparked by a mutated virus that originated in the vaccine and was not connected to a previous polio epidemic Niger stopped last year, WHO said, in a statement last week.

“The poliovirus will inevitably continue to circulate and may paralyze more children as no high-quality immunization campaigns can be conducted in a timely manner,” said Pascal Mkanda, WHO’s coordinator of polio eradication in Africa.

In rare cases, the live virus in oral polio vaccine can evolve into a form capable of igniting new outbreaks among non-immunized children; stopping the epidemic requires more targeted vaccination. https://www.wric.com/health/un-new-polio-outbreak-in-niger-after-vaccination-suspended/

![](https://www.saudemais.tv/uploads/cache/noticia_0000017379-711x400.jpg)https://www.saudemais.tv/noticia/17379-covid-19-poliomielite-regressa-ao-niger-apos-suspensao-da-vacinacao-devido-a-pandemia

In [None]:
def most_frequent_values(data):
    """Return, per column, the non-null count, the most frequent value and its share."""
    total = data.count()
    tt = pd.DataFrame(total)
    tt.columns = ['Total']
    items = []
    vals = []
    for col in data.columns:
        counts = data[col].value_counts()  # sorted, most common value first
        items.append(counts.index[0])
        vals.append(counts.values[0])
    tt['Most frequent item'] = items
    tt['Frequency'] = vals
    tt['Percent from total'] = np.round(vals / total * 100, 3)
    return np.transpose(tt)

In [None]:
most_frequent_values(df)

In [None]:
targets = list(df.columns[0:])
targets

In [None]:
# categorical features
categorical_feat = [feature for feature in df.columns if df[feature].dtypes=='O']
print('Total categorical features: ', len(categorical_feat))
print('\n',categorical_feat)

In [None]:
from sklearn.preprocessing import LabelEncoder
categorical_col = ('Model', 'All', 'COVID vs No virus', 'Covid vs Other virus')
        
        
for col in categorical_col:
    label = LabelEncoder() 
    label.fit(list(df[col].values)) 
    df[col] = label.transform(list(df[col].values))

print('Shape all_data: {}'.format(df.shape))

In [None]:
plt.figure(figsize=(14,8))
sns.barplot(data=df,x='Model',y='COVID vs No virus',color=sns.color_palette('Set3')[0])
plt.title('COVID vs No Virus Model')
plt.xlabel('Model')
plt.ylabel('COVID vs No virus')
plt.xticks(rotation=45)
for i in range(df.shape[0]):
    count = df.iloc[i]['COVID vs No virus']
    plt.text(i,count+1,df.iloc[i]['COVID vs No virus'],ha='center')
    
from IPython.display import display, Markdown
display(Markdown("Most Number of COVID vs No Virus **20-50**"))

In [None]:
plt.figure(figsize = (10,8))
sns.set(style = "darkgrid")
plt.title("Distribution of COVID vs No virus", fontdict = {'fontsize':20})
ax = sns.countplot(x = "COVID vs No virus", hue = 'Model', data = df)

In [None]:
plt.figure(figsize = (10,8))
sns.barplot(x = 'COVID vs No virus', y = 'Model', data = df);

In [None]:
from sklearn.metrics import accuracy_score, roc_auc_score, roc_curve, confusion_matrix, auc
from sklearn.model_selection import train_test_split, cross_val_score, GridSearchCV
from sklearn.preprocessing import LabelEncoder, StandardScaler 

In [None]:
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.ensemble import RandomForestClassifier
from xgboost import XGBClassifier
from sklearn.neural_network import MLPClassifier

from warnings import filterwarnings
filterwarnings('ignore')

from sklearn.linear_model import LinearRegression
from sklearn.model_selection import learning_curve
from sklearn.metrics import r2_score, make_scorer

In [None]:
y = df["COVID vs No virus"]
X = df.drop(["COVID vs No virus"], axis=1)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.30, random_state = 42)

In [None]:
cart = DecisionTreeClassifier(max_depth = 12)

In [None]:
cart_model = cart.fit(X_train, y_train)

In [None]:
y_pred = cart_model.predict(X_test)

In [None]:
print('Decision Tree Model')

print('Accuracy Score: {}\n\nConfusion Matrix:\n {}\n\nAUC Score: {}'
      .format(accuracy_score(y_test,y_pred), confusion_matrix(y_test,y_pred), roc_auc_score(y_test,y_pred)))

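#ValueError: multi_class must be in ('ovo', 'ovr')

The cell above stops with this error: after label encoding, the target `COVID vs No virus` has more than two classes, and `roc_auc_score` accepts a multi-class target only when `multi_class` is set to `'ovr'` or `'ovo'` and it receives class probabilities instead of hard predictions.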
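
A minimal sketch of the multi-class form of the call, run on synthetic data instead of the notebook's table (the names `X_demo`, `y_demo`, `clf` and the dataset itself are assumptions for illustration). It first reproduces the error with hard predictions, then passes the full `predict_proba` matrix together with `multi_class="ovr"`.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic 3-class data, assumed for illustration only
X_demo, y_demo = make_classification(
    n_samples=300, n_features=6, n_informative=4, n_redundant=0,
    n_classes=3, random_state=42,
)
# stratify keeps every class present in the test split
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.30, stratify=y_demo, random_state=42
)

clf = DecisionTreeClassifier(max_depth=4, random_state=42).fit(X_tr, y_tr)

# Hard predictions with a multi-class target trigger the ValueError
try:
    roc_auc_score(y_te, clf.predict(X_te))
except ValueError as err:
    print("ValueError:", err)

# Class probabilities (one column per class) plus an explicit strategy work
proba = clf.predict_proba(X_te)
auc_ovr = roc_auc_score(y_te, proba, multi_class="ovr")
print("One-vs-rest AUC:", round(auc_ovr, 3))
```

On a very small table this may still fail if a class is missing from the test split, which is why the sketch stratifies the split.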

In [None]:
pd.DataFrame(data = cart_model.feature_importances_*100,
                   columns = ["Importances"],
                   index = X_train.columns).sort_values("Importances", ascending = False)[:20].plot(kind = "barh", color = "r")

plt.xlabel("Feature Importances (%)")

In [None]:
# Helper functions that fit a model and plot its ROC curve, so the cells below stay short.
def model(algorithm, X_train, X_test, y_train, y_test):
    alg = algorithm
    alg_model = alg.fit(X_train, y_train)
    global y_prob, y_pred
    y_prob = alg.predict_proba(X_test)[:,1]
    y_pred = alg_model.predict(X_test)

    print('Accuracy Score: {}\n\nConfusion Matrix:\n {}'
      .format(accuracy_score(y_test,y_pred), confusion_matrix(y_test,y_pred)))
    

def ROC(y_test, y_prob):
    
    false_positive_rate, true_positive_rate, threshold = roc_curve(y_test, y_prob)
    roc_auc = auc(false_positive_rate, true_positive_rate)
    
    plt.figure(figsize = (10,10))
    plt.title('Receiver Operating Characteristic')
    plt.plot(false_positive_rate, true_positive_rate, color = 'red', label = 'AUC = %0.2f' % roc_auc)
    plt.legend(loc = 'lower right')
    plt.plot([0, 1], [0, 1], linestyle = '--')
    plt.axis('tight')
    plt.ylabel('True Positive Rate')
    plt.xlabel('False Positive Rate')

#Model and ROC Curve Comparison

In [None]:
print('Model: Logistic Regression\n')
model(LogisticRegression(solver = "liblinear"), X_train, X_test, y_train, y_test)

#I stopped here because I got ValueError: n_splits=8 cannot be greater than the number of members in each class.
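
The cell that produced this error is not shown here, but the message matches the one scikit-learn's `StratifiedKFold` raises when every class has fewer members than the requested number of folds. A minimal sketch of one way around it, assuming toy labels (`y_toy`, `X_toy` are made up for illustration): cap the number of folds at the size of the smallest class.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Toy labels, assumed for illustration only: classes with 4, 3 and 3 members
y_toy = np.array([0, 0, 0, 0, 1, 1, 1, 2, 2, 2])
X_toy = np.arange(len(y_toy)).reshape(-1, 1)

# Requesting 8 folds would fail here, so cap at the smallest class size
smallest_class = int(np.bincount(y_toy).min())
n_splits = max(2, min(8, smallest_class))

cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=42)
fold_sizes = [len(test_idx) for _, test_idx in cv.split(X_toy, y_toy)]
print(n_splits, fold_sizes)
```

With only a handful of rows per class, as in a small summary table, cross-validated scores are unstable whatever the fold count, so this is a workaround for the error and not a substitute for more data.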

In [None]:
print('Model: Gaussian Naive Bayes\n')
model(GaussianNB(), X_train, X_test, y_train, y_test)

In [None]:
# Word cloud
from wordcloud import WordCloud
text = " ".join(str(each) for each in NE.country)
# Create and generate a word cloud image:
wordcloud = WordCloud(max_words=200, colormap='Set3', background_color="black").generate(text)
# Draw on a single figure
plt.figure(figsize=(15, 10))
# Display the generated image:
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis("off")
plt.show()

That's all. Kaggle Notebook Runner: Marília Prata @mpwolke