<h1 style="background-color:#DC143C; font-family:'Brush Script MT',cursive;color:white;font-size:200%; text-align:center;border-radius: 50% 20% / 10% 40%">Amyotrophic Lateral Sclerosis Mortality</h1>

Amyotrophic Lateral Sclerosis Mortality in the United States, 2011-2014

Citation: Larson TC, Kaye W, Mehta P, Horton DK. Amyotrophic Lateral Sclerosis Mortality in the United States, 2011-2014. Neuroepidemiology. 2018;51(1-2):96-103. doi: 10.1159/000488891. Epub 2018 Jul 10. PMID: 29990963; PMCID: PMC6159829.

"The International Classification of Disease, 10th Revision (ICD-10) did not include a code specific for Amyotrophic lateral sclerosis (ALS) until 2017."

"The proportion of excluded records coded G12.2 but not ALS was 0.21, resulting in 24,328 ALS deaths. The overall age-adjusted mortality rate was 1.70 (95% CI 1.68-1.72). The rate among males was 2.09 (95% CI 2.05-2.12) and females was 1.37 (95% CI 1.35-1.40). The overall rate among whites was 1.84, blacks 1.03, and other races 0.70. For both sexes and all races, the rate increased with age and peaked among 75-79 year-olds. Rates tended to be greater in states at higher latitudes."

"Conclusions: Previous reports of ALS mortality in the United States showed similar age, sex, and race distributions but with greater age-adjusted mortality rates due to the inclusion of other diseases in the case definition. When using ICD-10 data collected prior to 2017, additional review of multiple-cause of death data is required for the accurate estimation of ALS deaths."

https://pubmed.ncbi.nlm.nih.gov/29990963/
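
Age-adjusted rates like those quoted above are commonly obtained by direct standardization: each age band's death rate is weighted by that band's share of a standard population. The sketch below illustrates the arithmetic only. The age bands, death counts, populations, and weights are made-up toy numbers, not values from Larson et al., and `age_adjusted_rate` is a hypothetical helper; whether the paper followed exactly this procedure is an assumption.

```python
def age_adjusted_rate(deaths, population, std_weights, per=100_000):
    """Directly standardized rate: weighted sum of age-specific rates."""
    total_weight = sum(std_weights.values())
    rate = 0.0
    for band, weight in std_weights.items():
        age_specific = deaths[band] / population[band]
        rate += age_specific * (weight / total_weight)
    return rate * per


# Toy inputs (hypothetical, for illustration only)
deaths = {"<55": 10, "55-74": 60, "75+": 30}
population = {"<55": 2_000_000, "55-74": 1_000_000, "75+": 500_000}
std_weights = {"<55": 0.70, "55-74": 0.22, "75+": 0.08}

# Crude rate ignores the age structure; the adjusted rate reweights it
crude = sum(deaths.values()) / sum(population.values()) * 100_000
adjusted = age_adjusted_rate(deaths, population, std_weights)
print(f"crude rate: {crude:.2f} per 100,000")
print(f"age-adjusted rate: {adjusted:.2f} per 100,000")
```

Because the toy population is older than the toy standard, the adjusted rate comes out lower than the crude one, which is the kind of shift standardization is meant to expose.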

In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.graph_objs as go
import plotly.offline as py
import plotly.express as px

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 20GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

In [None]:
df = pd.read_csv('../input/end-als/end-als/clinical-data/filtered-metadata/metadata/clinical/Mortality.csv', encoding='ISO-8859-2')
pd.set_option('display.max_columns', None)
df.head()

**<span style="color:#DC143C;">Column Names:</span>**

aut = Was an autopsy performed?  Yes: 1, No: 2, Unknown: 90

autdt = Date of Autopsy

autobt = Has a copy of the autopsy report been obtained?

autpmi = Post Mortem Interval (hours)

auttyp = Type of autopsy performed (Limited or complete)

diedcaus = Cause of death.

dieddt = Date of death.

icd10cm = ICD-10 CM Code for cause of death.

source = Data source: medical records, original collection, patient reported, unknown, or other.

sourcesp = Specify other data source

https://www.kaggle.com/alsgroup/end-als
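
The `aut` code book above (Yes: 1, No: 2, Unknown: 90) can be applied as a simple mapping so that counts and plots show labels instead of numbers. This is a minimal sketch on a toy frame; the values, the `aut_label` column name, and the "Uncoded" placeholder are assumptions made for illustration.

```python
import pandas as pd

# Code book taken from the column description above
AUT_LABELS = {1: "Yes", 2: "No", 90: "Unknown"}

# Toy frame standing in for the Mortality table (values are hypothetical)
toy = pd.DataFrame({"aut": [1, 2, 90, 2, 7]})

# Map codes to labels; any code outside the code book becomes "Uncoded"
toy["aut_label"] = toy["aut"].map(AUT_LABELS).fillna("Uncoded")
print(toy["aut_label"].value_counts())
```

The same mapping could be applied to `df["aut"]` before the median fill, since filling first would blur the difference between a recorded 90 (Unknown) and a missing entry.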

In [None]:
df.isnull().sum()

In [None]:
# Let's first handle numerical features with NaN values
# (> 0 so that columns with a single missing value are included too)
numerical_nan = [feature for feature in df.columns if df[feature].isna().sum()>0 and df[feature].dtypes!='O']
numerical_nan

In [None]:
## Replacing the numerical missing values

for feature in numerical_nan:
    # Fill with the median, which is less sensitive to outliers than the mean
    median_value = df[feature].median()
    # Assign the result back; inplace=True on a column selection may not update df
    df[feature] = df[feature].fillna(median_value)

df[numerical_nan].isnull().sum()

<h1 style="background-color:#DC143C; font-family:'Brush Script MT',cursive;color:white;font-size:200%; text-align:center;border-radius: 50% 20% / 10% 40%">Factors Predicting One-year Mortality in Amyotrophic Lateral Sclerosis Patients – Data From a Population-based Registry</h1>

Authors: Joachim Wolf; Anton Safer; Johannes C Wöhrle; Frederick Palm; Wilfred A Nix; Matthias Maschke; Armin J Grau

BMC Neurol. 2014;14(197)

"Survival in amyotrophic lateral sclerosis varies considerably. About one third of the patients die within 12 months after first diagnosis. The early recognition of fast progression is essential for patients and neurologists to weigh up invasive therapeutic interventions. In a prospective, population-based cohort of ALS patients in Rhineland-Palatinate, Germany, the authors identified significant prognostic factors at time of diagnosis that allow prediction of early death within first 12 months."

"Methods: Univariate analysis utilized the Log-Rank Test to identify association between candidate demographic and disease variables and one-year mortality. In a second step they investigated a multiple logistic regression model for the optimal prediction of one-year mortality rate."

"In the cohort of 176 ALS patients (mean age 66.2 years; follow-up 100%) one-year mortality rate from diagnosis was 34.1%. Multivariate analysis revealed that age over 75 years, interval between symptom onset and diagnosis below 7 months, decline of body weight before diagnosis exceeding 2 BMI units and Functional Rating Score below 31 points were independent factors predicting early death."

"Probability of early death within 12 months from diagnosis is predicted by advanced age, short interval between symptom onset and first diagnosis, rapid decline of body weight before diagnosis and advanced functional impairment."

https://www.medscape.com/viewarticle/834905
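
The four independent predictors quoted above come with explicit thresholds, so they can be written down as simple checks. The sketch below only tallies which of the four are present for a patient. It is not the authors' logistic regression model (no coefficients are quoted here), and the `Patient` class, its field names, and the example values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    age_years: float
    onset_to_diagnosis_months: float
    bmi_units_lost: float
    functional_rating_score: float


def risk_factors(p):
    """Return the names of the reported predictors present in this patient."""
    checks = {
        "age over 75 years": p.age_years > 75,
        "onset-to-diagnosis interval below 7 months": p.onset_to_diagnosis_months < 7,
        "weight decline exceeding 2 BMI units": p.bmi_units_lost > 2,
        "Functional Rating Score below 31": p.functional_rating_score < 31,
    }
    return [name for name, present in checks.items() if present]


# Hypothetical patient: 78 years old, diagnosed 5 months after onset
example = Patient(age_years=78, onset_to_diagnosis_months=5,
                  bmi_units_lost=1.0, functional_rating_score=35)
flags = risk_factors(example)
print(len(flags), flags)
```

An unweighted count treats all four factors as equally important, which the quoted abstract does not claim; it is meant only to make the thresholds concrete.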

In [None]:
# categorical features with missing values
categorical_nan = [feature for feature in df.columns if df[feature].isna().sum()>0 and df[feature].dtypes=='O']
print(categorical_nan)

In [None]:
# replacing missing values in categorical features
for feature in categorical_nan:
    df[feature] = df[feature].fillna('None')
    
df[categorical_nan].isna().sum()    

In [None]:
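# Target: cause of death (diedcaus); every other column is a feature

x = df.drop(['diedcaus'], axis=1)
# Select the target by name; df.iloc[:,0] would pick whichever column happens to be first
y = df['diedcaus']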

In [None]:
from sklearn.preprocessing import LabelEncoder

le = LabelEncoder()
# Encode each text column as integer codes; the encoder is refit for every column
for col in x.columns:
    if x[col].dtype == 'object':
        x[col] = le.fit_transform(x[col])
        
print(x)

#Code by Rafael Batista  https://www.kaggle.com/faelk8/prevendo-sobreviventes-do-titanic-autokeras-e-h2o/notebook

In [None]:
from sklearn.model_selection import train_test_split

x_df, x_test, y_df, y_test = train_test_split(x, y, test_size=0.2)

#AutoKeras

In [None]:
!pip install autokeras

In [None]:
import autokeras as ak

In [None]:
clf = ak.StructuredDataClassifier(overwrite=True, max_trials=100)
clf.fit(x_df,y_df,epochs=30)
predicted_y = clf.predict(x_test)
predicted_y

#val_accuracy: 0.0
Best val_accuracy So Far: 0.15789473056793213
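
A validation accuracy of about 0.16 is easier to judge next to a trivial baseline: the accuracy obtained by always predicting the most frequent class. A minimal sketch of that comparison follows; the labels are placeholder toy values, not the `diedcaus` column.

```python
import pandas as pd

# Hypothetical class labels standing in for the target y
y_toy = pd.Series(["cause_a", "cause_b", "cause_a", "cause_c",
                   "cause_b", "cause_a", "cause_d", "cause_e"])

# Always predicting the most frequent class gives this accuracy
counts = y_toy.value_counts()
majority_class = counts.idxmax()
baseline_accuracy = counts.max() / len(y_toy)
print(majority_class, baseline_accuracy)
```

With many distinct free-text causes of death and few rows per cause, a model that scores below this baseline has probably not learned anything useful from the features.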

#H2OAutoML

In [None]:
import h2o
from h2o.automl import H2OAutoML

In [None]:
import h2o
from h2o.automl import H2OAutoML  # this cell trains AutoML, not a single GBM

h2o.init()
h2o.cluster().show_status()

df_h2o = h2o.H2OFrame(df)

# convert response column to a factor
df_h2o["diedcaus"] = df_h2o["diedcaus"].asfactor()

# set the predictor names and the response column name
predictors = ['aut','autdt','dieddt','auttyp','source']
response = "diedcaus"

# split into train and validation sets
train, valid = df_h2o.split_frame(ratios = [.8])

# configure the AutoML run: stop after 10 models or 1800 seconds, whichever comes first
# (despite its name, cars_gbm holds an H2OAutoML object, not a single GBM)
cars_gbm = H2OAutoML(max_models=10, max_runtime_secs=1800)

# train on the chosen predictors and response, scoring against the validation frame
cars_gbm.train(x = predictors, y = response, training_frame = train, validation_frame = valid)

# show the full AutoML leaderboard (one row per trained model)
lb = cars_gbm.leaderboard
lb.head(rows=lb.nrows)

In [None]:
df_h2o = h2o.H2OFrame(df)
vl = cars_gbm.predict(df_h2o)
print(vl)
v = vl[0]
print(v)

In [None]:
vl = vl.as_data_frame()

In [None]:
vl

<h1 style="background-color:#DC143C; font-family:'Brush Script MT',cursive;color:white;font-size:200%; text-align:center;border-radius: 50% 20% / 10% 40%">Prognostic factors in ALS: A critical review</h1>

Citation: Chiò A, Logroscino G, Hardiman O, et al. Prognostic factors in ALS: A critical review. Amyotroph Lateral Scler. 2009;10(5-6):310-323. doi:10.3109/17482960802566824

"The discrepancies about ALS survival found in the published literature are mostly related to differences in study design. However, when considering only studies based on register methodology (more likely to report the full spectrum of the ALS population), the range of median survival is quite narrow (around 30 months from first symptom). Interestingly, these studies are also characterized by an older age of onset (62–67 years) than those based on other designs. However, in 10–20% of cases survival exceeds five years and, in 5–10%, 10 years."

"Despite the evidence of several publications, it is still impossible to predict with a good approximation the prognosis for an individual patient at the time of his/her diagnosis. However, several prognostic factors are well established."

"There is a general consensus that older age and bulbar onset are negatively related to ALS outcome, but the complex relationship between age, female gender and bulbar onset remains to be clarified. Also, the time delay from onset to diagnosis and the El Escorial diagnosis of definite ALS at the time of presentation, seem to have prognostic relevance, since they probably reflect a more rapid progression of the disorder."

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3515205/

In [None]:
#Code by Olga Belitskaya https://www.kaggle.com/olgabelitskaya/sequential-data/comments
from IPython.display import display,HTML
c1,c2,f1,f2,fs1,fs2=\
'#eb3434','#eb3446','Akronim','Smokum',30,15
def dhtml(string,fontcolor=c1,font=f1,fontsize=fs1):
    display(HTML("""<style>
    @import 'https://fonts.googleapis.com/css?family="""\
    +font+"""&effect=3d-float';</style>
    <h1 class='font-effect-3d-float' style='font-family:"""+\
    font+"""; color:"""+fontcolor+"""; font-size:"""+\
    str(fontsize)+"""px;'>%s</h1>"""%string))
    
    
dhtml('Thank you Rafael Batista for the script, @faelk8' )