## Brain Volume, Mental State Examination, and Education Attainment as Possible Diagnostic Predictors for Alzheimer's Disease

### Introduction:
Dementia is a chronic condition associated with aging and brain atrophy, involving widespread neuropsychological deficits that significantly hinder daily activities (Reuben et al., 2010). Our project focuses on Alzheimer's disease, the most common form of dementia. Unfortunately, Alzheimer's disease has neither a definitive diagnosis nor a cure (Ash, 2007). This exploratory analysis aims to identify predictors that could help with the early detection and prevention of Alzheimer's disease.

Specifically, we want to examine whether (1) total brain volume, (2) Mini-Mental State Examination score, and (3) educational attainment can predict an individual's dementia status. If these variables show high predictive strength for classifying dementia groups, they may serve as an accessible and convenient technique for detecting early signs of Alzheimer's disease in clinical settings.

We will use the longitudinal tabular dataset Dementia Classification: Compare Classifiers from Kaggle.com (Deepak N, 2018), stored in the file "oasis_longitudinal.csv". This is a real data set that comes directly from the MRI Open Access Series of Imaging Studies (OASIS-2, 2009). It consists of 15 variables and 373 rows, sampled from 150 participants aged 60 to 96, and records whether or not each participant has dementia. The table below describes the variables:

<img src=https://raw.githubusercontent.com/churancc/Dementia_Project/main/Table%20column%20descriptors.png width="500">
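The dimensions quoted above (373 rows, 15 variables, 150 participants) can be checked once the file is available. The following is a minimal sketch, not the analysis code itself: `summarise_shape()` is a hypothetical helper, the participant identifier column name `Subject ID` is an assumption about the file's header, and the toy data frame only stands in for the real data so that the snippet runs without the CSV.

```r
# Hypothetical helper: summarise the shape of a longitudinal table.
# `id_col` is an assumed column name; adjust it to match the file's header.
summarise_shape <- function(df, id_col = "Subject ID") {
  list(
    n_rows = nrow(df),
    n_cols = ncol(df),
    n_participants = length(unique(df[[id_col]]))
  )
}

# Toy data standing in for the real file: 3 visits from 2 participants.
toy <- data.frame(
  `Subject ID` = c("A", "A", "B"),
  Visit = c(1, 2, 1),
  check.names = FALSE  # keep the space in the column name
)
shape <- summarise_shape(toy)

# With the real data, the same check runs only if the file is present
# in the working directory (the location is an assumption).
path <- "oasis_longitudinal.csv"
if (file.exists(path)) {
  oasis <- read.csv(path, check.names = FALSE)
  print(summarise_shape(oasis))
}
```

Because the data are longitudinal, the number of rows exceeds the number of participants: each participant contributes one row per visit.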

### Methods & Results:
#### Exploratory data analysis (EDA) 
We begin by loading our data set into Jupyter and tidying it before investigating it further. Using the R programming language in Jupyter, we summarize the main characteristics of the data set and create visualizations for the variables of interest.

##### Loading & Tidying Data
We load the packages tidyverse, purrr, and tidymodels in Jupyter R. To ensure that our results are reproducible, we set the seed to 999.
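As a small side illustration of why the seed matters, the sketch below draws the same random sample twice using base R only. The variable names are ours, and the sample itself is arbitrary; the point is only that resetting the seed reproduces the same random sequence, which is what makes later steps such as data splitting repeatable.

```r
# Draw a random sample after setting the seed.
set.seed(999)
first_draw <- sample(1:100, 5)

# Reset the seed to the same value and draw again.
set.seed(999)
second_draw <- sample(1:100, 5)

# Same seed, same sequence of random numbers.
identical(first_draw, second_draw)  # TRUE
```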


In [1]:
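# Data wrangling and plotting (attaches ggplot2, dplyr, tidyr, readr, purrr, ...)
library(tidyverse)
# Functional programming helpers (already attached by tidyverse)
library(purrr)
# Modelling framework (rsample, tune, workflows, ...)
library(tidymodels)

# Fix the random seed so that results are reproducible
set.seed(999)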

── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──

✔ ggplot2 3.3.6     ✔ purrr   0.3.4
✔ tibble  3.1.7     ✔ dplyr   1.0.9
✔ tidyr   1.2.0     ✔ stringr 1.4.0
✔ readr   2.1.2     ✔ forcats 0.5.1

── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()

── Attaching packages ────────────────────────────────────── tidymodels 1.0.0 ──

✔ broom        1.0.0     ✔ rsample      1.0.0
✔ dials        1.0.0     ✔ tune         1.0.0
✔ infer        1.0.2     ✔ workflows    1.0.0
[32m✔