Let's explore unsupervised learning on a myopia dataset.
First we need to process the raw data so that it fits the machine learning models. The K-means clustering algorithm is then used for exploration.
This activity is broken down into four parts:
- Part 1: Prepare the Data
- Part 2: Apply Dimensionality Reduction
- Part 3: Perform a Cluster Analysis with K-means
- Part 4: Make a Recommendation
- Read myopia.csv into a Pandas DataFrame.
- Remove the "MYOPIC" column from the dataset.
- Standardize your dataset so that columns that contain larger values do not influence the outcome more than columns with smaller values.
- Perform dimensionality reduction with PCA.
- Further reduce the dataset dimensions with t-SNE and visually inspect the results. To do this, run t-SNE on the principal components, which are the output of the PCA transformation.
- Create an elbow plot to identify the best number of clusters.
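
The preparation, reduction, and clustering steps above can be sketched roughly as follows. This is a minimal sketch, not the project's actual notebook. It uses randomly generated stand-in data with hypothetical `feature_0` to `feature_5` columns in place of myopia.csv. The 90% explained-variance threshold for PCA, the t-SNE perplexity of 30, and the range of k from 1 to 10 are all assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
from sklearn.preprocessing import StandardScaler

# Stand-in data (an assumption). With the real file, use instead:
#   df = pd.read_csv("myopia.csv")
rng = np.random.default_rng(0)
df = pd.DataFrame(
    rng.normal(size=(200, 6)),
    columns=[f"feature_{i}" for i in range(6)],  # hypothetical column names
)
df["MYOPIC"] = rng.integers(0, 2, size=200)

# Part 1: set the label column aside and standardize the remaining features.
labels = df["MYOPIC"]
X = df.drop(columns=["MYOPIC"])
X_scaled = StandardScaler().fit_transform(X)

# Part 2: PCA keeping enough components to explain 90% of the variance
# (the threshold is an assumption), then t-SNE on the principal components.
pca = PCA(n_components=0.90)
X_pca = pca.fit_transform(X_scaled)

tsne = TSNE(n_components=2, perplexity=30, random_state=0)
X_tsne = tsne.fit_transform(X_pca)

# Part 3: record the K-means inertia for each candidate k.
ks = list(range(1, 11))
inertias = []
for k in ks:
    km = KMeans(n_clusters=k, n_init=10, random_state=0)
    km.fit(X_pca)
    inertias.append(km.inertia_)

# Table behind the elbow plot: one row per k.
elbow = pd.DataFrame({"k": ks, "inertia": inertias})
```

With matplotlib installed, `elbow.plot(x="k", y="inertia")` would draw the elbow curve, and a scatter plot of `X_tsne[:, 0]` against `X_tsne[:, 1]` colored by `labels` would give the visual inspection of the t-SNE output.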
- Looking at the elbow curve, we can say that there aren't any highly differentiated clusters.
- The same can be concluded from the scatter plot generated from the t-SNE features. We can also see that the myopic and non-myopic data are blended together and do not form any distinct clusters.
- A deep learning Sequential model would be a better choice for this problem.