# Determining the Appropriate Fitness Testing Protocol

The goal of this project is to create a classification model to determine the appropriate fitness test protocol. Using features commonly assessed during routine clinical visits, the model should classify individuals as being **low fit** or **normal/high fit**. Those individuals classified as being low fit would then perform an entirely walking-based test while everyone else would perform a test that involves running.

I began with features typically expected to be related to fitness:
- Body mass index (BMI)
- Resting heart rate
- Age
- Binary cardiovascular disease status
- Binary physical activity status (meeting recommendations or not)
- Binary asthma status
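
As a minimal sketch of how these six features might be assembled into one numeric row for a classifier, the snippet below uses a hypothetical `ClinicalRecord` container. All field names, the feature order, and the example values are assumptions for illustration, not the project's actual data schema. BMI is computed with the standard formula of weight in kilograms divided by height in metres squared.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ClinicalRecord:
    """One individual's routine-visit measurements (hypothetical field names)."""
    weight_kg: float
    height_m: float
    resting_hr_bpm: float
    age_years: float
    has_cvd: bool
    meets_activity_recs: bool
    has_asthma: bool


def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight (kg) divided by height (m) squared."""
    return weight_kg / height_m ** 2


def to_feature_row(rec: ClinicalRecord) -> List[float]:
    """Return features in the order: BMI, resting HR, age, CVD, activity, asthma.

    Binary statuses are encoded as 0.0 (no) or 1.0 (yes).
    """
    return [
        bmi(rec.weight_kg, rec.height_m),
        rec.resting_hr_bpm,
        rec.age_years,
        float(rec.has_cvd),
        float(rec.meets_activity_recs),
        float(rec.has_asthma),
    ]


# Made-up example values, for illustration only.
example = ClinicalRecord(
    weight_kg=70.0,
    height_m=1.75,
    resting_hr_bpm=62.0,
    age_years=45.0,
    has_cvd=False,
    meets_activity_recs=True,
    has_asthma=False,
)
row = to_feature_row(example)
```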

![MVP_pairplot.jpg](attachment:MVP_pairplot.jpg)

I then fit a **k-nearest neighbors (kNN)** model, which produced the following metrics on the test set:
- Accuracy: 78.9%
- Precision: 59.8%
- Recall: 46.0%
- F1: 52.0%
- ROC AUC Score: 68.9%

![MVP_kNN_confusion.jpg](attachment:MVP_kNN_confusion.jpg)
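
Accuracy, precision, recall, and F1 can all be derived from the four cells of a confusion matrix. The sketch below shows the standard formulas, treating low fit as the positive class (an assumption about how the labels were coded). The function names and the counts are hypothetical and are not the project's actual confusion matrix.

```python
from typing import Dict


def f1_score_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Compute standard metrics from confusion matrix counts.

    tp: low-fit individuals correctly classified as low fit
    fp: normal/high-fit individuals wrongly classified as low fit
    fn: low-fit individuals wrongly classified as normal/high fit
    tn: normal/high-fit individuals correctly classified
    """
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1_score_from(precision, recall),
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
metrics = classification_metrics(tp=40, fp=20, fn=60, tn=280)

# Consistency check using the precision and recall reported for the kNN model.
knn_f1 = f1_score_from(0.598, 0.460)
```

Under this formula, the reported kNN precision (59.8%) and recall (46.0%) give an F1 of about 52.0%, which agrees with the value listed above.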

For this project, **the greater concern is correctly classifying individuals who have low fitness.** This is because starting a fitness test at too easy an intensity has fewer consequences, and introduces fewer inaccuracies, than starting at an intensity that is too difficult.

As such, the primary performance metric is recall for the low-fit class: the proportion of truly low-fit individuals that the model correctly identifies.

The kNN model's recall was low at 46.0%, meaning it missed more than half of the low-fit individuals in the test set.

Next, I created a **logistic regression** model. Its recall improved on that of the kNN model, particularly after adjusting the decision threshold (metrics on the test set):
- Accuracy: 76.0%
- Precision: 46.2%
- Recall: 82.0%
- F1: 59.1%
- ROC AUC Score: 78.2%

![MVP_Logistic_confusion.jpg](attachment:MVP_Logistic_confusion.jpg)
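
Adjusting the decision threshold can be sketched as follows. A logistic regression model outputs a probability for the positive class (assumed here to be low fit), and an individual is labelled low fit when that probability meets the threshold. The labels, probabilities, and the 0.3 threshold below are hypothetical; the threshold actually used in this project is not stated here. With a scikit-learn classifier, the probabilities would typically come from `predict_proba(X)[:, 1]`.

```python
from typing import List, Tuple


def predict_at_threshold(probs: List[float], threshold: float) -> List[int]:
    """Label as positive (1 = low fit) when the probability meets the threshold."""
    return [1 if p >= threshold else 0 for p in probs]


def precision_recall(y_true: List[int], y_pred: List[int]) -> Tuple[float, float]:
    """Return (precision, recall) for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical labels (1 = low fit) and predicted probabilities of low fit.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
probs = [0.90, 0.60, 0.40, 0.31, 0.70, 0.35, 0.32, 0.20, 0.10, 0.05]

# Default threshold of 0.5 versus a lowered, hypothetical threshold of 0.3.
default_pr = precision_recall(y_true, predict_at_threshold(probs, 0.5))
lowered_pr = precision_recall(y_true, predict_at_threshold(probs, 0.3))

print(f"threshold 0.5: precision={default_pr[0]:.3f}, recall={default_pr[1]:.3f}")
print(f"threshold 0.3: precision={lowered_pr[0]:.3f}, recall={lowered_pr[1]:.3f}")
```

On this toy data, lowering the threshold raises recall while precision falls. That is the same direction of trade-off seen in the logistic regression results, which show higher recall and lower precision than the kNN model.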

**These results suggest that a logistic regression model outperforms a kNN model on recall, the primary metric, and could be used to determine the appropriate fitness test protocol for an individual.**

However, there is still room to raise recall further without reducing precision or accuracy.

**The next steps** are to explore feature engineering, alternative feature selection, and other classification models (e.g., random forests).