Personality Clustering and Labeling

Clustering personalities and labelling the clusters based on sampled survey responses, using hierarchical and K-means clustering.

Description

An interactive online personality test was conducted (2016-2018), based on the “Big-Five Factor Markers” from the International Personality Item Pool (IPIP). The “Big-Five Factor Markers” represent the five broad traits under which many narrower personality traits are commonly grouped: extraversion, neuroticism, agreeableness, conscientiousness, and openness to experience. The participants' answers were recorded, with their approval, for research use. Statistical analysis of such data can yield insights for applied psychology. For problem statement 2 in the report, I perform a clustering analysis on this dataset to group the participants by their responses to the test. Because the dataset is very large, I draw an appropriately sized sample, build clusters on it, and validate them by assigning cluster labels and visualising the results. Finally, I use the fitted clusters and their assigned labels to make predictions for unseen observations.
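The workflow described above (sample, cluster, predict for unseen observations) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in problem2.R: the data frame `responses` is a synthetic stand-in for the survey data, and the sample size of 1000, `k = 5`, Ward linkage, and the nearest-centroid prediction rule are all choices made here for illustration only.

```r
# Hypothetical stand-in data: 5000 respondents, 50 items scored 1 to 5.
# The real survey data is not part of this repository.
set.seed(42)
n_items <- 50
responses <- as.data.frame(
  matrix(sample(1:5, 5000 * n_items, replace = TRUE), ncol = n_items)
)

# Draw a random subset for clustering and hold out a few unseen rows.
idx <- sample(nrow(responses), 1000)   # sample size is an assumption
train <- responses[idx, ]
unseen <- responses[-idx, ][1:5, ]

# Standardise the items, keeping the training mean and sd for later reuse.
train_scaled <- scale(train)
centre <- attr(train_scaled, "scaled:center")
spread <- attr(train_scaled, "scaled:scale")

# K-means with an assumed number of clusters.
k <- 5
km <- kmeans(train_scaled, centers = k, nstart = 10, iter.max = 50)

# Hierarchical clustering (Ward linkage, an assumption), cut into k groups.
hc <- hclust(dist(train_scaled), method = "ward.D2")
hc_labels <- cutree(hc, k = k)

# Predict for unseen rows: scale them with the training statistics and
# assign each one to the nearest k-means centroid (squared Euclidean distance).
unseen_scaled <- scale(unseen, center = centre, scale = spread)
nearest_centroid <- function(x, centers) {
  which.min(colSums((t(centers) - x)^2))
}
predicted <- apply(unseen_scaled, 1, nearest_centroid, centers = km$centers)

table(km$cluster)   # cluster sizes from k-means
table(hc_labels)    # cluster sizes from hierarchical clustering
predicted           # cluster index assigned to each unseen row
```

On random stand-in data the clusters carry no meaning; with real responses, each cluster would be labelled by inspecting its centroid on the five trait scales before using it for prediction.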

This analysis was done as part of the Statistical Learning course at Dalarna University, within the master's programme in data science.

Getting Started

Usage

The analysis is presented in the Problem 2 section of AnalysisReport.pdf, and the code for it is in problem2.R. The code is written in R, so R is required to run it; tools such as RStudio are optional. The images used in the report are also provided as separate files for easy access.

Note: Problem 1 in the report covers the collinearity problem in linear regression and the derivation of the QDA decision rule from Bayes' rule. problem1.R contains the code for that problem. Both can be ignored.

Note that this repository does not include the data used for the analysis.

License

This project is licensed under the MIT License - see the LICENSE file for details.

