In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from kneed import KneeLocator
from scipy.cluster.hierarchy import dendrogram, linkage
from sklearn.cluster import DBSCAN, AgglomerativeClustering, KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import MinMaxScaler
from umap import UMAP

# HOW CAN ADULTS SUFFERING FROM DEPRESSION BE CLUSTERED?

<img src="ethan-sykes-TdM_fhzmWog-unsplash.jpg" alt="Drawing" style="width: 50%; height:100%;"/>

# Table of Contents

* [Introduction](#introduction)
* [Research Question](#hypothesis)
* [Data](#data)
* [Methods](#methods)
* [Results](#results)
* [Discussion and Recommendations](#discussion)
* [References](#references)
* [Appendix](#appendix)

<a class="anchor" id="introduction"></a> 
# Introduction

At least 17 million adults in the US suffer from depression each year.<sup>[2](#references)</sup> According to the NCBI article <u>Depression: How effective are antidepressants?</u><sup>[1](#references)</sup>, only 40 to 60 percent of those with severe depression who take antidepressants notice an improvement in their symptoms. In this project, I explore how adults suffering from depression can be clustered. If clearly separated clusters can be found, treatments could be tailored to each group to better help those suffering from depression. This topic is of personal interest to me because my husband suffers from chronic depression.

<a class="anchor" id="hypothesis"></a> 
# Research Question

How can adults suffering from depression be clustered?

<a class="anchor" id="data"></a> 
# Data

To answer this question, I use data from the 2019 National Health Interview Survey (NHIS), Sample Adult Interview.<sup>[3](#references)</sup>

In [None]:
df = pd.read_csv('data/adult19.csv')
df.info()
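
Once the survey file is loaded, a natural next step is to restrict it to respondents who report depression and summarize that subgroup. The sketch below illustrates the idea on a small toy frame rather than on `data/adult19.csv`. The column names `DEPEV_A` and `AGEP_A`, the code `1` for "yes", and the helper `describe_subgroup` are assumptions for illustration and should be checked against the NHIS 2019 codebook before use.

```python
import pandas as pd

# Toy stand-in for data/adult19.csv. Column names and response codes are
# assumptions and must be verified against the NHIS 2019 codebook.
toy = pd.DataFrame({
    "DEPEV_A": [1, 2, 1, 9, 2, 1],      # assumed: 1 = yes, 2 = no, other codes = unknown
    "AGEP_A": [34, 51, 67, 45, 29, 58],  # assumed: respondent age in years
})


def describe_subgroup(frame, flag_col="DEPEV_A", yes_code=1):
    """Keep rows where flag_col equals yes_code and summarize their numeric columns."""
    subgroup = frame[frame[flag_col] == yes_code]
    return subgroup, subgroup.describe()


depressed, summary = describe_subgroup(toy)
print(f"{len(depressed)} of {len(toy)} rows kept")
print(summary)
```

On the real data, the same helper could be called as `describe_subgroup(df)`, provided the assumed column and code turn out to be correct.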

<a class="anchor" id="methods"></a> 
# Methods

Explain the steps needed to test the hypotheses. This includes any data wrangling, tests, and visualizations that you will need to definitively reject or fail to reject your null hypotheses.

<a class="anchor" id="results"></a> 
# Results

Now, you can walk through the results of the methods. State the results of your tests and explain whether these results mean that you reject or fail to reject the null. Also include compelling tables or graphs to illustrate your findings.

<a class="anchor" id="discussion"></a> 
# Discussion and Recommendations

What insights can interested parties get from this research? What would you recommend for further research?

<a class="anchor" id="references"></a>   
# References

1. InformedHealth.org [Internet]. Cologne, Germany: Institute for Quality and Efficiency in Health Care (IQWiG); 2006-. Depression: How effective are antidepressants? [Updated 2020 Jun 18]. Available from: https://www.ncbi.nlm.nih.gov/books/NBK361016/

2. National Institute of Mental Health. Major Depression. https://www.nimh.nih.gov/health/statistics/major-depression

3. National Center for Health Statistics. (April 5, 2021) National Health Interview Survey. https://www.cdc.gov/nchs/nhis/2019nhis.htm

4. Sharma, A. (2020, September 8). How to Master the Popular DBSCAN Clustering Algorithm for Machine Learning. Analytics Vidhya. https://www.analyticsvidhya.com/blog/2020/09/how-dbscan-clustering-works/

5. Mullin, T. (2020, July 9). DBSCAN Parameter Estimation Using Python. Medium. https://medium.com/@tarammullin/dbscan-parameter-estimation-ff8330e3a3bd

6. Schubert, E., Sander, J., Ester, M., Kriegel, H.-P., & Xu, X. (2017). DBSCAN Revisited, Revisited: Why and How You Should (Still) Use DBSCAN. ACM Trans. Database Syst. 42, 3, Article 19 (July 2017), 21 pages. https://doi.org/10.1145/3068335



<a class="anchor" id="appendix"></a> 
# Appendix