# ASD Children traits - Initial Analysis

This notebook contains an initial evaluation of the dataset. The primary goal is to answer an open question: **What can we learn from this data?**

## Loading data
First we'll import the required libraries and load the data into a pandas DataFrame:

In [1]:
import pandas as pd

In [2]:
df = pd.read_csv('./assets/data_csv.csv', header=0, sep=',')

In [6]:
df.shape

(1985, 28)

In [3]:
df.describe()

| | CASE_NO_PATIENT'S | A1 | A2 | A3 | A4 | A5 | A6 | A7 | A8 | A9 | A10_Autism_Spectrum_Quotient | Social_Responsiveness_Scale | Age_Years | Qchat_10_Score | Childhood Autism Rating Scale |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 1985.0 | 1985.0 | 1985.0 | 1985.0 | 1985.0 | 1985.0 | 1985.0 | 1985.0 | 1985.0 | 1985.0 | 1985.0 | 1976.0 | 1985.0 | 1946.0 | 1985.0 |
| mean | 993.0 | 0.299244 | 0.238287 | 0.213098 | 0.27204 | 0.278589 | 0.306297 | 0.345088 | 0.243829 | 0.25995 | 0.446348 | 3.074393 | 9.624685 | 4.234841 | 1.701763 |
| std | 573.164462 | 0.458042 | 0.426143 | 0.4096 | 0.445123 | 0.448418 | 0.461071 | 0.475517 | 0.429499 | 0.438717 | 0.497238 | 3.680263 | 4.302416 | 2.898247 | 1.015367 |
| min | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 1.0 |
| 25% | 497.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 7.0 | 2.0 | 1.0 |
| 50% | 993.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 9.0 | 4.0 | 1.0 |
| 75% | 1489.0 | 1.0 | 0.0 | 0.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.0 | 1.0 | 1.0 | 5.0 | 14.0 | 6.0 | 2.0 |
| max | 1985.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 10.0 | 18.0 | 10.0 | 4.0 |


Now let's take a look at the columns and see what data is available to us:

In [4]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1985 entries, 0 to 1984
Data columns (total 28 columns):
 #   Column                                              Non-Null Count  Dtype  
---  ------                                              --------------  -----  
 0   CASE_NO_PATIENT'S                                   1985 non-null   int64  
 1   A1                                                  1985 non-null   int64  
 2   A2                                                  1985 non-null   int64  
 3   A3                                                  1985 non-null   int64  
 4   A4                                                  1985 non-null   int64  
 5   A5                                                  1985 non-null   int64  
 6   A6                                                  1985 non-null   int64  
 7   A7                                                  1985 non-null   int64  
 8   A8                                                  1985 non-null   int64  
 9

## Understanding the data
We have 28 columns at our disposal and most of them seem to be self-explanatory. However, a few are specific to the tests used to generate the data, so we need some research to know what they mean.

We'll take a detour now to figure out their meaning, as the [dataset's Kaggle page](https://www.kaggle.com/datasets/uppulurimadhuri/dataset) does not have enough information about them.

### The AQ-10 test
The AQ-10 is a short questionnaire that primary care practitioners can use to decide whether a child should be referred for a specialist autism assessment. Initially designed for self-report by adults, it can also be answered by parents and professionals as an initial screening for children. According to a study published by Cambridge University Press (Taylor et al., 2020, listed under Sources), it is not reliable enough to be the sole clinical diagnostic tool, but it is a way to initiate screening.

Although it's not explicit in the dataset, we might be able to link the columns to the questions as follows:

- A1: S/he often notices small sounds when others do not
- A2: S/he usually concentrates more on the whole picture, rather than the small details
- A3: In a social group, s/he can easily keep track of several different people’s conversations
- A4: S/he finds it easy to go back and forth between different activities
- A5: S/he doesn’t know how to keep a conversation going with his/her peers
- A6: S/he is good at social chit-chat
- A7: When s/he is read a story, s/he finds it difficult to work out the character’s intentions or feelings
- A8: When s/he was in preschool, s/he used to enjoy playing games involving pretending with other children
- A9: S/he finds it easy to work out what someone is thinking or feeling just by looking at their face 
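- A10: S/he finds it hard to make new friends --> **WARNING: The dataset has no column named exactly `A10`; the closest match is `A10_Autism_Spectrum_Quotient`, which ranges from 0 to 1 like A1 to A9, so it may hold this item**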
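
To check which columns could hold the ten items, a small sketch can match column names against the `A<n>` pattern. The column list below is copied from the `df.describe()` header earlier in this notebook; in practice `df.columns` would be passed instead. `find_item_columns` is a hypothetical helper, and treating a name such as `A10_Autism_Spectrum_Quotient` as item 10 is an assumption based only on its prefix.

```python
import re

# Column names copied from the df.describe() header (numeric columns only).
columns = [
    "CASE_NO_PATIENT'S", "A1", "A2", "A3", "A4", "A5", "A6", "A7", "A8", "A9",
    "A10_Autism_Spectrum_Quotient", "Social_Responsiveness_Scale", "Age_Years",
    "Qchat_10_Score", "Childhood Autism Rating Scale",
]


def find_item_columns(names):
    """Map item numbers 1-10 to columns named 'A<n>' or starting with 'A<n>_'.

    Hypothetical helper: a prefix match is only a hint, not proof that the
    column really stores that AQ-10 item.
    """
    pattern = re.compile(r"^A(\d+)(?:_|$)")
    found = {}
    for name in names:
        match = pattern.match(name)
        if match and 1 <= int(match.group(1)) <= 10:
            found[int(match.group(1))] = name
    return found


item_columns = find_item_columns(columns)
# Item numbers with no candidate column at all.
missing_items = [n for n in range(1, 11) if n not in item_columns]
```

Any item number left in `missing_items` would need a closer look at the remaining, non-numeric columns.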

### Scoring
Only 1 point can be scored for each question. Score 1 point for Definitely or Slightly Agree on each of items 1, 5, 7 and 10. Score 1 point for Definitely or Slightly Disagree on each of items 2, 3, 4, 6, 8 and 9. If the individual scores 6 or above, the responder should consider referring them for a specialist diagnostic assessment.
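
The scoring rule above can be sketched as a small function. This is a minimal illustration, not the dataset author's method: the answer labels, the `aq10_score` and `should_refer` names, and the dict-based input are all assumptions, since the dataset does not state how raw answers were recorded.

```python
# Items that score on agreement vs. disagreement, as stated in the rule above.
AGREE_ITEMS = {1, 5, 7, 10}
DISAGREE_ITEMS = {2, 3, 4, 6, 8, 9}

# Hypothetical answer labels (assumed; not taken from the dataset).
AGREE = {"definitely agree", "slightly agree"}
DISAGREE = {"definitely disagree", "slightly disagree"}

REFERRAL_THRESHOLD = 6


def aq10_score(responses):
    """Return the AQ-10 total for a dict mapping item number (1-10) to an answer label."""
    total = 0
    for item in range(1, 11):
        answer = responses[item].strip().lower()
        if answer not in AGREE | DISAGREE:
            raise ValueError(f"Unexpected answer for item {item}: {responses[item]!r}")
        if item in AGREE_ITEMS and answer in AGREE:
            total += 1
        elif item in DISAGREE_ITEMS and answer in DISAGREE:
            total += 1
    return total


def should_refer(score):
    """A total of 6 or above suggests referral for a specialist assessment."""
    return score >= REFERRAL_THRESHOLD


# Made-up answers, for illustration only.
example = {
    1: "Definitely Agree", 2: "Slightly Disagree", 3: "Definitely Disagree",
    4: "Slightly Agree", 5: "Slightly Agree", 6: "Definitely Disagree",
    7: "Definitely Disagree", 8: "Slightly Agree", 9: "Slightly Disagree",
    10: "Definitely Agree",
}
example_score = aq10_score(example)
```

If the 0/1 values in the dataset's A columns are already scored points (an assumption the dataset does not confirm), the total would simply be their row-wise sum.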

#### Sources

- Dr. Natalie Engelbrecht ND RP - 2020 [The AQ-10 | Embrace Autism](https://embrace-autism.com/aq-10/)
- Carrie Allison Ph.D., Bonnie Auyeung Ph.D., Simon Baron-Cohen Ph.D. - 2012 [Toward Brief “Red Flags” for Autism Screening: The Short Autism Spectrum Quotient and the Short Quantitative Checklist in 1,000 Cases and 3,000 Controls](https://www.sciencedirect.com/science/article/abs/pii/S0890856711010331)
- Emily C. Taylor, Lucy A. Livingston, Rachel A. Clutterbuck, Punit Shah - 2020 [Psychometric concerns with the 10-item Autism-Spectrum Quotient (AQ10) as a measure of trait autism in the general population](https://www.cambridge.org/core/journals/experimental-results/article/psychometric-concerns-with-the-10item-autismspectrum-quotient-aq10-as-a-measure-of-trait-autism-in-the-general-population/2E2F8CF1ECEF65BBB867F49A65A2A3D4#)
- Bonnie Auyeung Ph.D., Simon Baron-Cohen Ph.D., Sally Wheelwright, Carrie Allison Ph.D. - 2007 [The Autism Spectrum Quotient: Children’s Version (AQ-Child)](https://docs.autismresearchcentre.com/papers/2008_Auyeung_etal_ChildAQ.pdf)
