### AQ-10 methods scores

The Autism-Spectrum Quotient, published by Baron-Cohen, Wheelwright, Skinner, Martin, & Clubley, was developed to assess the degree to which adults with *'normal'* intelligence have the traits associated with autism spectrum conditions. According to [Wikia](https://psychology.wikia.org/wiki/Autism_Spectrum_Quotient), "the test consists of fifty statements, each of which is in a forced-choice format. Each question allows the subject to indicate "Definitely agree", "Slightly agree", "Slightly disagree" or "Definitely disagree". Approximately half the questions are worded to elicit an "agree" response from normal individuals, and half to elicit a "disagree" response. The subject scores one point for each question which is answered "autistically" either slightly or definitely." The questions cover five domains associated with the autism spectrum: social skills; communication skills; imagination; attention to detail; and attention switching/tolerance of change.

The dataset, however, is based on the ten-item AQ-10. In it, 'Yes' indicates that the individual is on the Autism Spectrum, and 'No' is indicated when the final score is less than or equal to 7. The questions are available [here](https://www.nice.org.uk/guidance/cg142/resources/autism-spectrum-quotient-aq10-test-pdf-186582493).

A1 I often notice small sounds when others do not
A2 I usually concentrate more on the whole picture, rather than the small details
A3 I find it easy to do more than one thing at once
A4 If there is an interruption, I can switch back to what I was doing very quickly
A5 I find it easy to ‘read between the lines’ when someone is talking to me
A6 I know how to tell if someone listening to me is getting bored
A7 When I’m reading a story I find it difficult to work out the characters’ intentions
A8 I like to collect information about categories of things (e.g. types of car, types of bird, types of train, types of plant etc)
A9 I find it easy to work out what someone is thinking or feeling just by looking at their face
A10 I find it difficult to work out people’s intentions
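
As a rough illustration of how the ten items become a single score, the sketch below follows the scoring key printed on the linked NICE AQ-10 form: one point for agreeing (slightly or definitely) with items 1, 7, 8 and 10, and one point for disagreeing with items 2, 3, 4, 5, 6 and 9. The function name `aq10_score` and the response strings are assumptions made for this example; the dataset itself already stores each item as 0 or 1.

```python
# Items where agreeing scores a point; for all other items, disagreeing scores a point.
AGREE_ITEMS = {1, 7, 8, 10}
AGREE = {"definitely agree", "slightly agree"}
DISAGREE = {"definitely disagree", "slightly disagree"}


def aq10_score(responses):
    """Return the AQ-10 total (0-10) for a list of ten response strings."""
    if len(responses) != 10:
        raise ValueError("AQ-10 needs exactly ten responses")
    total = 0
    for item, answer in enumerate(responses, start=1):
        answer = answer.strip().lower()
        if answer not in AGREE | DISAGREE:
            raise ValueError(f"Unrecognised response for item {item}: {answer!r}")
        # Pick which kind of answer earns the point for this item.
        scoring_set = AGREE if item in AGREE_ITEMS else DISAGREE
        total += answer in scoring_set
    return total


# Hypothetical respondent who slightly agrees with every statement.
print(aq10_score(["Slightly agree"] * 10))
```

Agreeing with everything only earns points on the four "agree" items, which shows why the mixed wording matters.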



In [2]:
import numpy as np
import pandas as pd

#Visualisation
import matplotlib.pyplot as plt
import missingno  # third-party package; install with: pip install missingno
import seaborn as sns

#Model
from sklearn.model_selection import train_test_split


In [None]:
#load the dataset, treating '?' as a missing value so numeric columns parse as numbers
df = pd.read_csv("Autism_Data.arff", na_values="?")

In [None]:
df.head()

In [None]:
#column name spell check/ fixes
score_cols = {f"A{i}_Score": f"A{i}" for i in range(1, 11)}
df = df.rename(columns={
    **score_cols,
    "jundice": "jaundice",
    "austim": "autism",
    "contry_of_res": "country",
    "Class/ASD": "asd_classification",
})

In [None]:
#Missing data
missingno.matrix(df, figsize =(30,10))

In [None]:
df.dtypes

In [None]:
df.describe()

### Age Composition

The age column contains an implausible value of 383.0 and two NaN values.

In [None]:
#Check how many NaN values exist
df['age'].isnull().sum()

Remove the two rows with missing age values.

In [None]:
df.dropna(subset = ["age"], inplace=True)

In [None]:
#383 is not a plausible age; replace it with the mean of the remaining ages
mean_age = df.loc[df['age'] != 383.0, 'age'].mean()
df['age'] = df['age'].replace(383.0, mean_age)
df['age'] = df['age'].astype('int')
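
Because a value like 383 inflates the mean, it matters whether the replacement mean is computed with or without it. The toy series below uses invented ages, not the dataset, to make the difference visible. It is a sketch of the idea, not a recommendation for a particular imputation strategy.

```python
import pandas as pd

# Invented ages for illustration only; 383 stands in for the implausible entry.
ages = pd.Series([20.0, 30.0, 40.0, 383.0])

mean_with_outlier = ages.mean()
mean_without_outlier = ages[ages != 383.0].mean()

# Replace the outlier with the mean of the plausible values, then cast to int.
cleaned = ages.replace(383.0, mean_without_outlier).astype("int")

print(mean_with_outlier, mean_without_outlier)
print(cleaned.tolist())
```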

In [None]:
fig = plt.figure(figsize=(26,8))
sns.countplot(x="age", data=df)

### Country

Country of residence of the participants. Most participants are from the United States, the UAE, India, New Zealand and the UK.

In [None]:
fig = plt.figure(figsize=(12,15))
sns.countplot(y='country', data=df);
df.country.value_counts()

### Gender distribution

In [None]:
fig = plt.figure(figsize=(10,6))
sns.countplot(x="gender", data=df, facecolor=(0, 0, 0, 0), linewidth=5, edgecolor=sns.color_palette("dark", 3))

Gender is encoded as male (`m`) = 0 and female (`f`) = 1.
According to the dataset, 367 male and 337 female participants took part.

In [None]:
new_df = pd.DataFrame({'gender':df.gender.map(dict(f=1,m=0))})
df.update(new_df)

In [None]:
df['gender'] = df['gender'].astype('int')

### Jaundice at birth

The bar chart shows how many participants had jaundice at birth.

Jaundice at birth is encoded as yes = 1 and no = 0.
According to the dataset, 69 of the 704 individuals were born with jaundice.
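
The yes/no encoding used for this and the following columns can be sketched as a small helper. `encode_binary` is a hypothetical name and the toy frame is invented. The sketch fails loudly when a value is missing from the mapping, instead of leaving a string behind that would break a later integer cast.

```python
import pandas as pd


def encode_binary(series, mapping):
    """Map a two-valued column to integers, raising if any value is unmapped."""
    encoded = series.map(mapping)
    if encoded.isna().any():
        unmapped = sorted(series[encoded.isna()].astype(str).unique())
        raise ValueError(f"Values without a mapping: {unmapped}")
    return encoded.astype("int")


# Invented rows for illustration only.
toy = pd.DataFrame({"jaundice": ["yes", "no", "no"], "gender": ["f", "m", "f"]})
toy["jaundice"] = encode_binary(toy["jaundice"], {"yes": 1, "no": 0})
toy["gender"] = encode_binary(toy["gender"], {"f": 1, "m": 0})
print(toy)
```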

In [None]:
fig = plt.figure(figsize=(25,6))
sns.countplot(y='jaundice', data=df);
df.jaundice.value_counts()

In [None]:
new_df= pd.DataFrame({'jaundice':df.jaundice.map(dict(yes=1,no=0))})
df.update(new_df)

In [None]:
df['jaundice'] = df['jaundice'].astype('int')

In [None]:
sns.barplot(x='gender', y='jaundice', data=df)

### Autism

The `autism` column records whether an immediate family member has a Pervasive Developmental Disorder (PDD): yes = 1, no = 0.
According to the dataset, 91 of the 704 individuals had a family member with PDD.

In [None]:
new_df= pd.DataFrame({'autism':df.autism.map(dict(yes=1,no=0))})
df.update(new_df)

In [None]:
df['autism'] = df['autism'].astype('int')

In [None]:
fig = plt.figure(figsize=(25,6))
sns.countplot(y='autism', data=df);
df.autism.value_counts()

### ASD Classification

In [None]:
new_df = pd.DataFrame({'asd_classification':df.asd_classification.map(dict(YES=1,NO=0))})
df.update(new_df)

In [None]:
df['asd_classification'] = df['asd_classification'].astype('int')

In [None]:
fig = plt.figure(figsize=(25,6))
sns.countplot(y='asd_classification', data=df);
df.asd_classification.value_counts()

### Used App Before

Whether the participant has used a screening app before.

In [None]:
new_df = pd.DataFrame({'used_app_before':df.used_app_before.map(dict(yes=1,no=0))})
df.update(new_df)

In [None]:
df['used_app_before'] = df['used_app_before'].astype('int')

In [None]:
fig = plt.figure(figsize=(25,6))
sns.countplot(y='used_app_before', data=df);
df.used_app_before.value_counts()

In [None]:
df.age_desc.value_counts()

### Problem Columns: Ethnicity and Relation

In [None]:
plt.figure(figsize =(15,10))
sns.countplot(x= 'ethnicity',data = df)
df['ethnicity'].value_counts()

In [None]:
plt.figure(figsize =(15,10))
sns.countplot(x= 'relation', data = df)
df['relation'].value_counts()

In [None]:
df.dtypes

In [None]:
df.drop(['country', 'ethnicity', 'used_app_before' , 'age_desc','relation'],axis=1, inplace=True)
df

In [None]:
fig = plt.figure(figsize=(12,10))
sns.heatmap(df.corr())

In [None]:
#Split predictor variables (iloc selects by position: columns 1 to 14, so column 0 is left out)
X= df.iloc[:,1:15] 
X

In [None]:
#Split outcome variables
Y=df.loc[:,"asd_classification"]
Y

In [None]:
#Hold out 20% of the rows for testing; random_state makes the split reproducible
x_train, x_test, y_train, y_test = train_test_split(X, Y, test_size=0.2, random_state=0)