The WHO said on Friday (15 May 2020) that it is studying a possible link between COVID-19 and Kawasaki disease, an inflammatory syndrome that has affected some children around the world. So I decided to upload a dataset about Kawasaki disease.
"The initial hypotheses indicate that this syndrome may be linked to COVID-19 (…). We call on all doctors worldwide to work with their national authorities and WHO to be alert and better understand this syndrome in children," said WHO Director-General Tedros Adhanom Ghebreyesus at a virtual press conference in Geneva.
"It is crucial to characterize this clinical syndrome accurately and urgently, to understand its causality and describe treatment protocols," he added. 
https://translate.google.com.br/translate?hl=en&sl=pt&u=https://istoe.com.br/oms-diz-estudar-possivel-ligacao-entre-a-covid-19-e-a-doenca-de-kawasaki-em-criancas/&prev=search

In [None]:
#codes from Rodrigo Lima  @rodrigolima82
from IPython.display import Image
Image(url = 'https://encrypted-tbn0.gstatic.com/images?q=tbn%3AANd9GcRJ92t6pcZ6gUQz3JWdNNXKij04Du52AF8SqZjykTTD2vb-SIsA&usqp=CAU',width=400,height=400)

In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import seaborn as sns
import matplotlib.pyplot as plt
import plotly.graph_objs as go
import plotly.offline as py
import plotly.express as px

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 5GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

Kawasaki disease is a rare acute paediatric vasculitis, with coronary artery aneurysms as its main complication. The diagnosis is based on the presence of persistent fever, exanthema, lymphadenopathy, conjunctival injection, and changes to the mucosae and extremities.

In [None]:
nRowsRead = 1000 # specify 'None' if want to read whole file
df = pd.read_csv('../input/cusersmarildownloadskawasakicsv/kawasaki.csv', delimiter=';', encoding = "ISO-8859-1", nrows = nRowsRead)
df.dataframeName = 'kawasaki.csv'
nRow, nCol = df.shape
print(f'There are {nRow} rows and {nCol} columns')

In [None]:
df.head()


WHO and its global network of doctors have developed a preliminary definition and made a notification form available to doctors for any suspected case of Systemic Inflammatory Response Syndrome (SIRS).
The cases of the disease registered worldwide are rare and the role of the new coronavirus in the development of the infection remains unknown, said Michael Ryan, head of WHO emergency programs. 

"We don't know if it is the virus that attacks the cells or if it is an excessive immune response" that causes inflammation, as is the case with Ebola fever, he explained. 
The disease has puzzled health authorities in several countries for two weeks, even though children are rarely affected by severe forms of COVID-19.
After a first alert in the UK in late April, similar cases were reported in New York, Italy and Spain. Deaths are extremely rare: a five-year-old died in New York and a 14-year-old in London.
A first fatal case was reported on Friday in France. A nine-year-old boy died in Marseille (south) on May 8 of brain damage from a heart attack, Professor Fabrice Michel, head of the pediatric resuscitation service at La Timone hospital, told AFP. 
Serological tests showed that this child "had come into contact" with the coronavirus, but had not developed symptoms of COVID-19. 
The symptoms of the inflammatory syndrome are high fever, abdominal pain and digestive disorders, skin rash, conjunctivitis and a tongue that turns red, swells and looks like a raspberry.
Such symptoms are similar to those of Kawasaki disease, which affects children and causes inflammation of blood vessels. 
https://translate.google.com.br/translate?hl=en&sl=pt&u=https://istoe.com.br/oms-diz-estudar-possivel-ligacao-entre-a-covid-19-e-a-doenca-de-kawasaki-em-criancas/&prev=search

In [None]:
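# Correlate numeric columns only; text columns such as sample labels cannot be correlated
corr = df.select_dtypes(include='number').corr()
plt.figure(figsize=(10,4))
sns.heatmap(corr,annot=True,cmap='Reds')
plt.show()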

In [None]:
fig = go.Figure(data=[go.Scatter(
    x=df['Samples'][0:10],
    y=df['204252_at'][0:10],
    mode='markers',
    marker=dict(
        color=[145, 140, 135, 130, 125, 120,115,110,105,100],
        size=[100, 90, 70, 60, 60, 60,50,50,40,35],
        showscale=True
        )
)])
fig.update_layout(
    title='Kawasaki disease',
    xaxis_title="Samples",
    yaxis_title="204252_at",
)
fig.show()

In [None]:
import plotly.offline as pyo
import plotly.graph_objs as go
lowerdf = df.groupby('Samples').size()/df['204252_at'].count()*100
labels = lowerdf.index
values = lowerdf.values

# Use `hole` to create a donut-like pie chart
fig = go.Figure(data=[go.Pie(labels=labels, values=values, hole=.6)])
fig.show()

#Kawasaki-like disease: emerging complication during the COVID-19 pandemic, by Russell M Viner and Elizabeth Whittaker
Published: May 13, 2020. DOI: https://doi.org/10.1016/S0140-6736(20)31129-6

In Italy, ten cases were described (seven boys, three girls; mean age 7·5 years) of a Kawasaki-like disease occurring in Bergamo at the peak of the pandemic in the country (Feb 18 to April 20, 2020), a monthly incidence some 30-fold higher than observed for Kawasaki disease across the previous 5 years. Bergamo was the city with the highest rate of infections and deaths in Italy at that time. Within the cluster were five children who had features similar to Kawasaki disease (ie, non-purulent conjunctivitis, polymorphic rash, mucosal changes, and swollen extremities); however, another five children presented with fewer than three of the diagnostic clinical signs and were older than patients with classic Kawasaki disease. There was also a high proportion of shock, with five of ten children presenting with hypotension requiring fluid resuscitation, and two of ten children needing inotropic support. Two of ten children had a positive severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) PCR swab and eight of ten had a SARS-CoV-2-positive serology test; however, these tests were not done contemporaneously with the episode, so the clinical relevance is unclear.

The majority of patients with Kawasaki disease respond well to intravenous immunoglobulin; however, 10–20% require additional anti-inflammatory treatment. These differences raise the question as to whether this cluster is Kawasaki disease with SARS-CoV-2 as the triggering agent, or represents an emerging Kawasaki-like disease characterised by multisystem inflammation. The diagnosis of Kawasaki disease is based on clinical and laboratory criteria and is hindered by the lack of a diagnostic test. Understanding the pathophysiology of this emerging phenomenon might provide welcome insights into our understanding of Kawasaki disease.
https://www.thelancet.com/journals/lancet/article/PIIS0140-6736(20)31129-6/fulltext

In [None]:
fig = px.bar(df[['204252_at','Samples']].sort_values('Samples', ascending=False), 
                        y = "Samples", x= "204252_at", color='Samples', template='ggplot2')
fig.update_xaxes(tickangle=45, tickfont=dict(family='Rockwell', color='crimson', size=14))
fig.update_layout(title_text="Kawasaki disease")

fig.show()

Although the Article suggests a possible emerging inflammatory syndrome associated with COVID-19, it is crucial to reiterate—for parents and health-care workers alike—that children remain minimally affected by SARS-CoV-2 infection overall. Understanding this inflammatory phenomenon in children might provide vital information about immune responses to SARS-CoV-2 and possible correlates of immune protection that might have relevance both for adults and children. In particular, if this is an antibody-mediated phenomenon, there might be implications for vaccine studies, and this might also explain why some children become very ill with COVID-19, while the majority are unaffected or asymptomatic.
https://www.thelancet.com/journals/lancet/article/PIIS0140-6736(20)31129-6/fulltext

In [None]:
fig = px.bar(df, x= "204252_at", y= "Samples", color_discrete_sequence=['crimson'],)
fig.show()

#Once again, thanks to rossinEndrew for his SHAP values visualization.
https://www.kaggle.com/endrewrossin/fast-initial-lightgbm-model-to-detect-exam-result/comments

In [None]:
import shap
import lightgbm as lgb
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import KFold
import random

#An outbreak of severe Kawasaki-like disease at the Italian epicentre of the SARS-CoV-2 epidemic: an observational cohort study
By Lucio Verdoni, MD; Angelo Mazza, MD; Annalisa Gervasoni, MD; Laura Martelli, MD; Maurizio Ruggeri, MD; Matteo Ciuffreda, MD; et al.

Published: May 13, 2020. DOI: https://doi.org/10.1016/S0140-6736(20)31103-X

Children diagnosed after the SARS-CoV-2 epidemic began showed evidence of immune response to the virus, were older, had a higher rate of cardiac involvement, and had features of macrophage activation syndrome (MAS). The authors therefore showed that SARS-CoV-2 might cause a severe form of Kawasaki-like disease.

#Implications of all the available evidence

Outbreaks of Kawasaki-like disease might occur in countries affected by the SARS-CoV-2 pandemic, and might present outside the classic Kawasaki disease phenotype. This condition might be serious and requires prompt and more aggressive management. Future research on the cause of Kawasaki disease and similar syndromes should focus on immune responses to viral triggers. https://www.thelancet.com/journals/lancet/article/PIIS0140-6736(20)31103-X/fulltext 

In [None]:
df.isnull().sum()

In [None]:
df['Samples'] = df['Samples'].replace(['negative','positive'], [0,1])

#Statistical analysis

The Student's t test, the χ2 method, and Fisher's exact test were done when appropriate for statistical analysis to compare continuous and categorical variables. A p value of <0·05 was chosen as cutoff for significance. Data were analysed with SPSS (version 20.0) and GraphPad Prism (version 5.00 for Mac). The study was approved by the Bergamo Ethics Committee (registration number 37/20, 25/03/2020).
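
As a rough illustration of the three tests named above, the sketch below runs them with SciPy. This is only an assumption-laden example: the numbers are invented and are not data from the Bergamo study, the helper name `compare_groups` is hypothetical, and SciPy stands in for the SPSS and GraphPad Prism software that the authors actually used.

```python
import numpy as np
from scipy import stats

def compare_groups(values_a, values_b, table_2x2, alpha=0.05):
    """Apply the three tests to toy inputs and flag p values below alpha."""
    # Student's t test (equal variances assumed) for a continuous variable
    _, p_t = stats.ttest_ind(values_a, values_b, equal_var=True)
    # Chi-squared test of independence for a categorical 2x2 table
    _, p_chi2, _, expected = stats.chi2_contingency(table_2x2)
    # Fisher's exact test, the usual choice when expected counts are small
    _, p_fisher = stats.fisher_exact(table_2x2)
    p_values = {"t_test": float(p_t), "chi2": float(p_chi2), "fisher": float(p_fisher)}
    return {
        "p_values": p_values,
        "significant": {name: p < alpha for name, p in p_values.items()},
        "min_expected_count": float(np.min(expected)),
    }

# Hypothetical ages (years) in two groups; NOT data from the study
ages_group_1 = [3.0, 2.5, 4.0, 3.5, 2.0]
ages_group_2 = [7.5, 9.0, 6.0, 8.0, 10.5]
# Hypothetical 2x2 table: rows = group, columns = (feature present, feature absent)
toy_table = np.array([[5, 5], [1, 18]])

results = compare_groups(ages_group_1, ages_group_2, toy_table)
print(results["p_values"])
print("Smallest expected cell count:", results["min_expected_count"])
```

When the smallest expected cell count is below about 5, as in this toy table, the χ2 approximation is usually considered unreliable, which is presumably what "when appropriate" refers to in the choice between χ2 and Fisher's exact test.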

In [None]:
SEED = 99
random.seed(SEED)
np.random.seed(SEED)

In [None]:
dfmodel = df.copy()

# read the "object" columns and use labelEncoder to transform to numeric
for col in dfmodel.columns[dfmodel.dtypes == 'object']:
    le = LabelEncoder()
    dfmodel[col] = dfmodel[col].astype(str)
    le.fit(dfmodel[col])
    dfmodel[col] = le.transform(dfmodel[col])

#The Study findings

The findings have important implications for public health. The association between SARS-CoV-2 and Kawasaki-like disease should be taken into account when it comes to considering social reintegration policies for the paediatric population. However, the Kawasaki-like disease described there remains a rare condition, probably affecting no more than one in 1000 children exposed to SARS-CoV-2. This estimate is based on the limited data from the case series in this region.

The study has the limitations of a relatively small case series, requiring confirmation in larger groups. Genetic studies investigating the susceptibility of patients developing this disease to the triggering effect of SARS-CoV-2 should be done. Nonetheless, the authors reported a strong association between an outbreak of Kawasaki-like disease and the SARS-CoV-2 epidemic in the Bergamo province of Italy. Patients diagnosed with Kawasaki-like disease after the virus began spreading had a severe course, including Kawasaki disease shock syndrome (KDSS) and MAS, and required adjunctive steroid treatment. A similar outbreak of Kawasaki-like disease is expected in countries affected by the SARS-CoV-2 pandemic.
https://www.thelancet.com/journals/lancet/article/PIIS0140-6736(20)31103-X/fulltext

In [None]:
# change column names to alphanumeric (any other character becomes "_")
dfmodel.columns = ["".join(c if c.isalnum() else "_" for c in str(x)) for x in dfmodel.columns]

In [None]:
X = dfmodel.drop(['Samples','204252_at'], axis = 1)
y = dfmodel['Samples']

In [None]:
lgb_params = {
                    'objective':'binary',
                    'metric':'auc',
                    'n_jobs':-1,
                    'learning_rate':0.005,
                    'num_leaves': 20,
                    'max_depth':-1,
                    'subsample':0.9,
                    'n_estimators':2500,
                    'seed': SEED,
                    'early_stopping_rounds':100, 
                }

In [None]:
# choose the number of folds, and create a variable to store the auc values and the iteration values.
K = 5
folds = KFold(K, shuffle = True, random_state = SEED)
best_scorecv= 0
best_iteration=0

# Separate data in folds, create train and validation dataframes, train the model and calculate the mean AUC.
for fold , (train_index,test_index) in enumerate(folds.split(X, y)):
    print('Fold:',fold+1)
          
    X_traincv, X_testcv = X.iloc[train_index], X.iloc[test_index]
    y_traincv, y_testcv = y.iloc[train_index], y.iloc[test_index]
    
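    train_data = lgb.Dataset(X_traincv, y_traincv)
    val_data   = lgb.Dataset(X_testcv, y_testcv, reference=train_data)
    
    # verbose_eval was removed in LightGBM 4.0; the log_evaluation callback (LightGBM >= 3.3) replaces it
    LGBM = lgb.train(lgb_params, train_data, valid_sets=[train_data,val_data],
                     callbacks=[lgb.log_evaluation(period=250)])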
    best_scorecv += LGBM.best_score['valid_1']['auc']
    best_iteration += LGBM.best_iteration

best_scorecv /= K
best_iteration /= K
print('\n Mean AUC score:', best_scorecv)
print('\n Mean best iteration:', best_iteration)
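
The fold loop above uses plain `KFold`. With a small binary target, a fold can end up holding only one class, in which case AUC is undefined. The sketch below shows the same averaging pattern with stratified folds, which keep the class ratio roughly constant in every split. It is an illustration under stated assumptions only: the data are synthetic, scikit-learn's `LogisticRegression` stands in for LightGBM so that the block runs on its own, and `mean_cv_auc` is a hypothetical helper name.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold

def mean_cv_auc(X, y, n_splits=5, seed=99):
    """Average validation AUC over stratified folds."""
    folds = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    scores = []
    for train_idx, val_idx in folds.split(X, y):
        # Stand-in model; the notebook itself trains LightGBM here
        model = LogisticRegression(max_iter=1000)
        model.fit(X[train_idx], y[train_idx])
        proba = model.predict_proba(X[val_idx])[:, 1]
        scores.append(roc_auc_score(y[val_idx], proba))
    return float(np.mean(scores))

# Synthetic, imbalanced toy data: 30 negatives and 10 positives
rng = np.random.default_rng(99)
y_demo = np.array([0] * 30 + [1] * 10)
# Shift the positive class by two units so the classes are separable
X_demo = rng.normal(size=(40, 3)) + 2.0 * y_demo[:, None]

auc_demo = mean_cv_auc(X_demo, y_demo)
print("Mean stratified CV AUC:", auc_demo)
```

In the notebook, swapping `KFold` for `StratifiedKFold` would be a one-line change, since `folds.split(X, y)` already receives the target.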

In [None]:
lgb_params = {
                    'objective':'binary',
                    'metric':'auc',
                    'n_jobs':-1,
                    'learning_rate':0.05,
                    'num_leaves': 20,
                    'max_depth':-1,
                    'subsample':0.9,
                    'n_estimators':round(best_iteration),
                    'seed': SEED,
                    'early_stopping_rounds':None, 
                }

# Train the final model on the full dataset, not on the last fold's training split
train_data_final = lgb.Dataset(X, y)
LGBM = lgb.train(lgb_params, train_data_final)

In [None]:
print(LGBM)

In [None]:
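# telling which model to use
explainer = shap.TreeExplainer(LGBM)
# Calculating the Shap values of X features
shap_values = explainer.shap_values(X)
# Older shap versions return one array per class for binary LightGBM models; newer ones return a single array
shap_values_pos = shap_values[1] if isinstance(shap_values, list) else shap_values

In [None]:
shap.summary_plot(shap_values_pos, X, plot_type="bar")

In [None]:
shap.summary_plot(shap_values_pos, X)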

The SHAP code above comes from Endrew Rossin (rossinEndrew), whose SHAP values visualization I have reused in many notebooks. https://www.kaggle.com/endrewrossin/fast-initial-lightgbm-model-to-detect-exam-result/comments

The code below is from Thor God of Thunder.

In [None]:
cat = []
num = []
for col in df.columns:
    if df[col].dtype=='O':
        cat.append(col)
    else:
        num.append(col)  
        
        
num 

In [None]:
plt.style.use('dark_background')
for col in df[num].drop(['211803_at'],axis=1):
    plt.figure(figsize=(8,5))
    plt.plot(df[col].value_counts(),color='Red')
    plt.xlabel(col)
    plt.ylabel('211803_at')
    plt.tight_layout()
    plt.show()

In [None]:
fig = px.parallel_categories(df, color="211803_at", color_continuous_scale=px.colors.sequential.Viridis)
fig.show()

In [None]:
#codes from Rodrigo Lima  @rodrigolima82
from IPython.display import Image
Image(url = 'https://encrypted-tbn0.gstatic.com/images?q=tbn%3AANd9GcQdMfbJurUaH6uReRfU2MvTdf-_DonCd5p40WdGJAP80IaMZ-Fi&usqp=CAU',width=400,height=400)

In [None]:
#codes from Rodrigo Lima  @rodrigolima82
from IPython.display import Image
Image(url = 'https://encrypted-tbn0.gstatic.com/images?q=tbn%3AANd9GcRT5fE-RqxWGpXPEvOp-qQ2HnHg-gkgkOyrgjNKB1VbvZMCfjO3&usqp=CAU',width=400,height=400)

Kaggle Notebook Runner: Marília Prata  @mpwolke