Implementation of a hierarchical CNN-based model to detect Big Five personality traits

Deep Learning-Based Document Modeling for Personality Detection from Text

This code implements the model described in the paper "Deep Learning-Based Document Modeling for Personality Detection from Text" for detecting the Big Five personality traits:

  • Extroversion
  • Neuroticism
  • Agreeableness
  • Conscientiousness
  • Openness

Requirements

Preprocessing

process_data.py prepares the data for training. It requires three command-line arguments:

  1. Path to the Google word2vec file (GoogleNews-vectors-negative300.bin)
  2. Path to the essays.csv file containing the annotated dataset
  3. Path to the mairesse.csv file containing Mairesse features for each sample/essay

The script writes its output to a pickle file, essays_mairesse.p.

Example:

python process_data.py ./GoogleNews-vectors-negative300.bin ./essays.csv ./mairesse.csv
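
Because the script takes three positional paths in a fixed order, it can help to check the inputs before starting a long preprocessing run. The following is a minimal sketch, not part of the repository: the helper names missing_inputs and build_preprocess_command are hypothetical, and the only thing taken from the README is the script name and the argument order.

```python
import os


def missing_inputs(paths):
    """Return the paths that do not point to an existing file."""
    return [p for p in paths if not os.path.isfile(p)]


def build_preprocess_command(word2vec_path, essays_path, mairesse_path):
    """Build the argument list for process_data.py in the documented order.

    Hypothetical helper: it only assembles the command and does not run it.
    """
    return ["python", "process_data.py",
            word2vec_path, essays_path, mairesse_path]
```

The resulting list could be passed to subprocess.call once missing_inputs returns an empty list for the three paths.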

Training

conv_net_train.py trains and tests the model. It requires three command-line arguments:

  1. Mode:
    • -static: word embeddings will remain fixed
    • -nonstatic: word embeddings will be trained
  2. Word Embedding Type:
    • -rand: randomly initialized word embeddings (the dimension is hardcoded to 300; to change it, modify the default value of k on line 111 of process_data.py)
    • -word2vec: 300-dimensional Google pre-trained word embeddings
  3. Personality Trait:
    • 0: Extroversion
    • 1: Neuroticism
    • 2: Agreeableness
    • 3: Conscientiousness
    • 4: Openness

Example:

python conv_net_train.py -static -word2vec 2
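
Each run trains a classifier for a single trait, so covering all five traits takes five invocations. The sketch below shows one way to assemble those commands from the flags and trait indices listed above. It is an illustration under assumptions, not code from the repository: TRAITS, MODES, EMBEDDINGS, build_train_command and all_trait_commands are hypothetical names, and the script name follows the conv_net_train.py file described in this section.

```python
# Trait names in the index order documented above (0 to 4).
TRAITS = ["Extroversion", "Neuroticism", "Agreeableness",
          "Conscientiousness", "Openness"]
MODES = ("-static", "-nonstatic")
EMBEDDINGS = ("-rand", "-word2vec")


def build_train_command(mode, embedding, trait_index):
    """Validate the three arguments and return the command as a list."""
    if mode not in MODES:
        raise ValueError("mode must be one of %s" % (MODES,))
    if embedding not in EMBEDDINGS:
        raise ValueError("embedding must be one of %s" % (EMBEDDINGS,))
    if trait_index not in range(len(TRAITS)):
        raise ValueError("trait index must be between 0 and 4")
    return ["python", "conv_net_train.py", mode, embedding, str(trait_index)]


def all_trait_commands(mode, embedding):
    """Return (trait name, command) pairs, one per personality trait."""
    return [(name, build_train_command(mode, embedding, index))
            for index, name in enumerate(TRAITS)]
```

For instance, all_trait_commands("-static", "-word2vec") yields five commands that can be run one after another, for example with subprocess.call.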

Citation

If you use this code in your work, please cite the paper "Deep Learning-Based Document Modeling for Personality Detection from Text" using the following entry:

@ARTICLE{7887639, 
 author={N. Majumder and S. Poria and A. Gelbukh and E. Cambria}, 
 journal={IEEE Intelligent Systems}, 
 title={{Deep} Learning-Based Document Modeling for Personality Detection from Text}, 
 year={2017}, 
 volume={32}, 
 number={2}, 
 pages={74-79}, 
 keywords={feedforward neural nets;information filtering;learning (artificial intelligence);pattern classification;text analysis;Big Five traits;author personality type;author psychological profile;binary classifier training;deep convolutional neural network;deep learning based method;deep learning-based document modeling;document vector;document-level Mairesse features;emotionally neutral input sentence filtering;identical architecture;personality detection;text;Artificial intelligence;Computational modeling;Emotion recognition;Feature extraction;Neural networks;Pragmatics;Semantics;artificial intelligence;convolutional neural network;distributional semantics;intelligent systems;natural language processing;neural-based document modeling;personality}, 
 doi={10.1109/MIS.2017.23}, 
 ISSN={1541-1672}, 
 month={Mar},}