Personality Recognition

Overview

This project builds a ‘neuroticism’ classifier, using simple and interpretable logistic regression as a baseline, and investigates which feature sets, resources, and learning techniques are useful for extracting personality from text (including social media data). It also serves as a sandbox for exploring natural language processing (NLP) techniques that improve on the baseline model.
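As an illustration of the baseline idea, here is a minimal sketch of a bag-of-words logistic regression classifier trained with stochastic gradient descent. It is not the code from the notebooks: the toy sentences, the labels, and every function name (`tokenize`, `build_vocab`, `featurize`, `train`, `classify`) are assumptions made up for this example, and it uses only the Python standard library so that it runs without extra dependencies.

```python
import math
from collections import Counter

def tokenize(text):
    # Lowercased whitespace tokenization; a real pipeline would do more.
    return text.lower().split()

def build_vocab(texts):
    # Map each distinct token to a feature index.
    vocab = {}
    for text in texts:
        for tok in tokenize(text):
            vocab.setdefault(tok, len(vocab))
    return vocab

def featurize(text, vocab):
    # Sparse bag-of-words vector: {feature_index: count}; unknown tokens are dropped.
    counts = Counter(tok for tok in tokenize(text) if tok in vocab)
    return {vocab[tok]: float(c) for tok, c in counts.items()}

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(x, weights, bias):
    # Probability of the positive class for one sparse example.
    z = bias + sum(weights[i] * v for i, v in x.items())
    return sigmoid(z)

def train(X, y, n_features, lr=0.1, epochs=300, l2=0.01):
    # Plain SGD on the logistic loss, with L2 decay on the active weights.
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        for x, label in zip(X, y):
            err = predict_proba(x, weights, bias) - label
            for i, v in x.items():
                weights[i] -= lr * (err * v + l2 * weights[i])
            bias -= lr * err
    return weights, bias

# Toy data invented for illustration only (1 = neurotic, 0 = not neurotic).
texts = [
    "i feel anxious and worried about everything",
    "so stressed and nervous today",
    "worried sick and anxious again",
    "calm relaxed evening with friends",
    "feeling relaxed and happy today",
    "happy calm morning walk",
]
labels = [1, 1, 1, 0, 0, 0]

vocab = build_vocab(texts)
X = [featurize(t, vocab) for t in texts]
weights, bias = train(X, labels, len(vocab))

def classify(text):
    # Threshold the predicted probability at 0.5.
    return int(predict_proba(featurize(text, vocab), weights, bias) >= 0.5)
```

Calling `classify("some status update")` returns 0 or 1 for a new piece of text. Because each learned weight is tied to a single word, the model is easy to inspect, which is the property that makes logistic regression a convenient baseline before trying richer features or other learners.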

Project Background

The report and the problem are based on the article “Workshop on Computational Personality Recognition: Shared Task”, which addresses predicting personality trait values from social media statuses using the well-known Big 5 personality traits (also known as OCEAN, after the five traits it defines: Openness, Conscientiousness, Extraversion, Agreeableness, and Neuroticism). The article describes two datasets (Essays and myPersonality), each providing user data with gold-standard personality labels, which 8 different groups used for personality recognition. The goal of the workshop was for each group to develop personality recognition solutions on the same datasets in whatever way it saw fit. The idea arose from the observation that much of the research in personality recognition relied on varying resources and techniques, which made results difficult to compare across studies. At the conclusion of the workshop, each group presented its work along with the performance increase or decrease it obtained, and these results serve as a benchmark against which future projects in the field can compare themselves.

