# Childhood Autistic Spectrum Disorder Screening Prediction

In [1]:
import numpy as np
import pandas as pd

from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, accuracy_score

from keras.models import Sequential
from keras.layers import Dense

## Autism

Autism Spectrum Disorder (ASD) is a neurodevelopmental condition characterized by challenges with social skills, repetitive behaviors, speech, and nonverbal communication. Early diagnosis of neurodevelopmental disorders can improve treatment and significantly decrease the associated healthcare costs.

<img src="https://lirp.cdn-website.com/86df2c99/dms3rep/multi/opt/ASD-640w.png" style="width:30%;"/>

## Data set

This notebook uses the [Autistic Spectrum Disorder Screening Data for Children Data Set](https://archive.ics.uci.edu/ml/datasets/Autistic+Spectrum+Disorder+Screening+Data+for+Children++), which contains screening data for 292 patients. The attributes are:

* Age
* Gender
* Ethnicity
* Born with jaundice
* Family member with PDD (pervasive developmental disorders)
* Who is completing the test (parent, self, caregiver, medical staff, clinician)
* Country of residence 
* Used the screening app before 
* Screening Method Type based on age category (0 = toddler, 1 = child, 2 = adolescent, 3 = adult)
* Question 1 Answer 
* Question 2 Answer 
* Question 3 Answer 
* Question 4 Answer 
* Question 5 Answer 
* Question 6 Answer 
* Question 7 Answer 
* Question 8 Answer 
* Question 9 Answer 
* Question 10 Answer 
* Screening Score 

In [2]:
df = pd.read_csv('utils/autism_data.csv')

In [3]:
print(f'Shape: {df.shape}')
df.head()

Shape: (292, 21)


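| Unnamed: 0 | A1_Score | A2_Score | A3_Score | A4_Score | A5_Score | A6_Score | A7_Score | A8_Score | A9_Score | A10_Score | ... | gender | ethnicity | jundice | austim | contry_of_res | used_app_before | result | age_desc | relation | Class/ASD |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | ... | m | Others | no | no | Jordan | no | 5.0 | 4-11 years | Parent | NO |
| 1 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | ... | m | Middle Eastern | no | no | Jordan | no | 5.0 | 4-11 years | Parent | NO |
| 2 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | ... | m | ? | no | no | Jordan | yes | 5.0 | 4-11 years | ? | NO |
| 3 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | ... | f | ? | yes | no | Jordan | no | 4.0 | 4-11 years | ? | NO |
| 4 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | ... | m | Others | yes | no | United States | no | 10.0 | 4-11 years | Parent | YES |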


## Data Preprocessing

In [4]:
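# Drop the screening score and the age description columns.
df = df.drop(['result', 'age_desc'], axis=1)
# Remove rows containing NaN, then map the '?' placeholder to a sentinel value.
df.dropna(inplace=True)
df.replace('?', -99999, inplace=True)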
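As the preview of the data shows, unknown entries appear as the string `?` rather than as NaN, so `dropna` alone would not catch them. The sketch below counts both kinds of gaps per column. The `toy` frame and its values are hypothetical stand-ins for the real CSV, not data from the study.

```python
import pandas as pd

# Hypothetical toy frame; the real data is loaded from utils/autism_data.csv.
toy = pd.DataFrame({
    "A1_Score": [1, 0, 1, 1],
    "ethnicity": ["Others", "?", "Asian", "?"],
    "relation": ["Parent", "?", "Parent", "Self"],
})

# '?' is an ordinary string, so isna() (and therefore dropna()) ignores it.
nan_counts = toy.isna().sum()

# isin() flags the placeholder explicitly, column by column.
placeholder_counts = toy.isin(["?"]).sum()

print(nan_counts.to_dict())
print(placeholder_counts.to_dict())
```

Running a check like this before and after the `replace` call is one way to confirm that no placeholder survives preprocessing.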

In [5]:
X = df.drop(['Class/ASD'], axis=1)
y = df['Class/ASD']

In [6]:
# One-hot encode the categorical feature columns and the YES/NO target.
X_hot_encoded = pd.get_dummies(X).to_numpy()
Y_hot_encoded = pd.get_dummies(y).to_numpy()
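To illustrate what `pd.get_dummies` does here, the following sketch encodes a small hypothetical frame (the `toy` name and its values are assumptions for illustration). Numeric columns pass through unchanged, while each distinct string in a categorical column becomes its own indicator column. A placeholder such as `?` is therefore treated as one more category.

```python
import pandas as pd

# Hypothetical toy frame with one numeric and one categorical column.
toy = pd.DataFrame({
    "A1_Score": [1, 0, 1],
    "ethnicity": ["Others", "?", "Asian"],
})

# Numeric columns are kept; string columns expand to one column per value.
encoded = pd.get_dummies(toy)

print(encoded.columns.tolist())
print(encoded.shape)
```

This is why the encoded feature matrix is much wider than the original frame: every country, ethnicity, and relation value contributes a column.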

In [7]:
X_train, X_test, Y_train, Y_test = train_test_split(X_hot_encoded, Y_hot_encoded, test_size = 0.2)
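The split above is random, so the reported metrics can change from run to run. A possible variation, sketched here on synthetic data, fixes the seed and stratifies by class so both sets keep the same class ratio. The `random_state=42` value, the `stratify` argument, and the synthetic arrays are assumptions that do not appear in the original notebook.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 100 samples, 5 features, 70/30 class balance.
rng = np.random.default_rng(0)
X_demo = rng.random((100, 5))
labels = np.array([0] * 70 + [1] * 30)
Y_demo = np.eye(2, dtype=int)[labels]  # one-hot labels, shape (100, 2)

# random_state makes the split repeatable; stratify preserves class ratios.
X_tr, X_te, Y_tr, Y_te = train_test_split(
    X_demo, Y_demo, test_size=0.2, random_state=42, stratify=labels
)

print(X_tr.shape, X_te.shape)
print(Y_te.sum(axis=0))  # per-class counts in the test set
```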

## Neural Network Model

In [20]:
model = Sequential([
    Dense(16, input_dim=88, kernel_initializer='normal', activation='relu'),
    Dense(8, kernel_initializer='normal', activation='relu'),
    Dense(4, kernel_initializer='normal', activation='relu'),
    Dense(2, activation='softmax'),
])

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
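As a quick sanity check on model size, the number of trainable parameters in a stack of fully connected layers can be computed by hand: each layer has `inputs * units` weights plus `units` biases. The sketch below applies this to the layer widths used above (88 inputs, then 16, 8, 4, and 2 units). The variable names are illustrative only.

```python
# Layer widths: 88 input features followed by the four Dense layers.
layer_sizes = [88, 16, 8, 4, 2]

# Each dense layer has (n_in * n_out) weights plus n_out biases.
params_per_layer = [
    n_in * n_out + n_out
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
]
total_params = sum(params_per_layer)

print(params_per_layer)
print(total_params)
```

The total should agree with what `model.summary()` reports, provided the encoded feature matrix really has 88 columns (that is, `X_train.shape[1] == 88`).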

### Train

In [21]:
model.fit(X_train, Y_train, epochs=100, batch_size=5, verbose=0)

<keras.callbacks.History at 0x19ebb16c3d0>

### Test

In [22]:
# Round the softmax outputs to 0/1 and keep column 0 as the predicted label.
Y_pred = np.round(model.predict(X_test)).astype(int)[:, 0]

In [23]:
print(f'Accuracy: {accuracy_score(Y_test[:, 0], Y_pred):.2f}\n')
print(classification_report(Y_test[:, 0], Y_pred))

Accuracy: 0.86

              precision    recall  f1-score   support

           0       0.86      0.86      0.86        28
           1       0.87      0.87      0.87        30

    accuracy                           0.86        58
   macro avg       0.86      0.86      0.86        58
weighted avg       0.86      0.86      0.86        58
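When reading the report, it helps to know which class each column of the one-hot target represents. `pd.get_dummies` orders string categories alphabetically, so column 0 of the encoded target should correspond to `NO`, which would make label `1` in the report the non-ASD class. The sketch below checks the column order on a hypothetical label series and shows an alternative decoding with `argmax`, which maps softmax rows straight back to class names. The `probs` array is made-up output, not a result from the trained model.

```python
import numpy as np
import pandas as pd

# Hypothetical labels with the same values as the Class/ASD column.
labels = pd.Series(["NO", "YES", "NO", "YES"], name="Class/ASD")
one_hot = pd.get_dummies(labels)
class_names = one_hot.columns.tolist()  # column order of the encoded target

# Made-up softmax outputs, one row per sample, one column per class.
probs = np.array([
    [0.90, 0.10],
    [0.20, 0.80],
    [0.60, 0.40],
    [0.45, 0.55],
])

# argmax picks the most probable class index for each row.
pred_idx = probs.argmax(axis=1)
pred_labels = [class_names[i] for i in pred_idx]

print(class_names)
print(pred_labels)
```

Decoding to names this way also allows passing `target_names=class_names` to `classification_report`, so the rows are labelled by class rather than by 0 and 1.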

