Pisco: Personality Identification in Source Code


Pisco is a program that predicts the personality traits of software developers from their Java code. It was developed for the PR-SOCO challenge 2016. Our research paper can be accessed here.

Abstract of our paper

We developed an approach to automatically predict the personality traits of Java developers based on their source code for the PR-SOCO challenge 2016. The challenge provides a data set consisting of source code annotated with its developers' personality traits (neuroticism, extraversion, openness, agreeableness, and conscientiousness). Our approach adapts features from the authorship identification domain and utilizes features that were specifically engineered for the PR-SOCO challenge. We experiment with two learning methods: linear regression and k-nearest neighbors regressor. The results are reported in terms of the Pearson product-moment correlation and root mean square error.
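To make the evaluation setup in the abstract concrete, here is a minimal sketch of a k-nearest neighbors regressor together with the two reported metrics. It is not taken from the Pisco code base: the function names (`pearson`, `rmse`, `knn_predict`), the choice of Euclidean distance, and the toy feature vectors and scores are all assumptions made for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when one side is constant
    return cov / math.sqrt(var_x * var_y)


def rmse(predicted, actual):
    """Root mean square error between predictions and gold scores."""
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))


def knn_predict(train_x, train_y, query, k=3):
    """Predict a score as the mean target of the k nearest training vectors.

    Assumption: Euclidean distance over raw (unscaled) feature vectors.
    """
    by_distance = sorted(
        zip(train_x, train_y),
        key=lambda pair: math.dist(pair[0], query),
    )
    nearest = by_distance[:k]
    return sum(y for _, y in nearest) / len(nearest)


if __name__ == "__main__":
    # Hypothetical data: one feature per developer, one trait score each.
    train_x = [(4.0,), (6.5,), (9.0,), (12.0,)]
    train_y = [40.0, 45.0, 55.0, 60.0]
    test_x = [(5.0,), (10.0,)]
    test_y = [42.0, 58.0]

    predictions = [knn_predict(train_x, train_y, x, k=2) for x in test_x]
    print("Pearson:", pearson(predictions, test_y))
    print("RMSE:", rmse(predictions, test_y))
```

Pearson correlation rewards predictions that rank developers in the right order, while RMSE penalizes the absolute distance from the gold scores, so the two can disagree on which model is better.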


Requirements:

  • docker
  • knife (will be automatically pulled from Docker Hub)

How to set up the project:

This will install all required dependencies.

cd pisco
make build

How to evaluate all features

python --train_corpus=openness --recognizer=linear_regression --features all

How to evaluate single features

Style Features

make run
python --train_corpus=openness --recognizer=linear_regression --features mean_length_of_method_names
python --train_corpus=openness --recognizer=linear_regression --features mean_length_of_method_parameter_names
python --train_corpus=openness --recognizer=linear_regression --features mean_length_of_field_names
python --train_corpus=openness --recognizer=linear_regression --features mean_length_of_local_variable_names_in_methods
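As an illustration of what a style feature such as `mean_length_of_method_names` measures, the sketch below approximates it with a regular expression. This is an assumption-laden stand-in, not Pisco's extractor (which may well use a real Java parser): the pattern, the `NOT_A_TYPE` set, and both function names are hypothetical, and the regex ignores comments, string literals, and abstract or interface methods without a body.

```python
import re

# Rough shape of a Java method declaration with a body:
#   <return type> <name>(<params>) [throws ...] {
METHOD_DECL = re.compile(
    r"([\w<>\[\].,?]+)\s+(\w+)\s*\((?:[^()]*)\)\s*"
    r"(?:throws\s+[\w.,\s]+?)?\s*\{"
)

# Tokens that can sit where a return type or name would be but are not one.
# Filtering on them drops constructors ("public Foo() {"), anonymous classes
# ("new Foo() {"), and control flow ("else if (...) {").
NOT_A_TYPE = {
    "public", "protected", "private", "static", "final", "abstract",
    "synchronized", "native", "new", "return", "else", "if", "for",
    "while", "switch", "catch", "throw",
}


def method_names(java_source):
    """Return the names of method declarations found by the rough pattern."""
    names = []
    for match in METHOD_DECL.finditer(java_source):
        return_type, name = match.group(1), match.group(2)
        if return_type in NOT_A_TYPE or name in NOT_A_TYPE:
            continue
        names.append(name)
    return names


def mean_length_of_method_names(java_source):
    """Average number of characters per method name, 0.0 if none are found."""
    names = method_names(java_source)
    return sum(map(len, names)) / len(names) if names else 0.0


SAMPLE = """
public class Greeter {
    private String name;

    public Greeter(String name) {
        this.name = name;
    }

    public String greet() {
        if (name.isEmpty()) {
            return "hi";
        }
        return "hi " + name;
    }

    static int add(int a, int b) {
        return a + b;
    }
}
"""

if __name__ == "__main__":
    print(method_names(SAMPLE))
    print(mean_length_of_method_names(SAMPLE))
```

The other name-length features listed above could follow the same pattern with a different extraction step, although a parser-based approach would be more robust than regular expressions on real submissions.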

Structure Features

make run
python --train_corpus=openness --recognizer=linear_regression --features mean_number_of_classes
python --train_corpus=openness --recognizer=linear_regression --features cyclomatic_complexity
python --train_corpus=openness --recognizer=linear_regression --features mean_number_of_methods
python --train_corpus=openness --recognizer=linear_regression --features mean_number_of_method_parameters
python --train_corpus=openness --recognizer=linear_regression --features mean_length_of_methods
python --train_corpus=openness --recognizer=linear_regression --features mean_number_of_fields
python --train_corpus=openness --recognizer=linear_regression --features mean_number_of_local_variables_in_methods
python --train_corpus=openness --recognizer=linear_regression --features duplicate_code_measure
python --train_corpus=openness --recognizer=linear_regression --features contains_IDE_template_text
python --train_corpus=openness --recognizer=linear_regression --features ratio_of_external_libraries
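For readers unfamiliar with the `cyclomatic_complexity` feature, the following sketch shows one common approximation: one plus the number of decision points in the code. How Pisco actually computes the feature is not stated here, so treat the token list, the comment and string stripping, and the function names as assumptions. The count is taken over the whole input rather than per method, and generic wildcards such as `List<?>` would be miscounted as ternary operators.

```python
import re

# Spans that must not contribute decision points.
NON_CODE = re.compile(
    r'"(?:\\.|[^"\\\n])*"'   # string literals
    r"|'(?:\\.|[^'\\\n])*'"  # char literals
    r"|//[^\n]*"             # line comments
    r"|/\*.*?\*/",           # block comments
    re.DOTALL,
)

# Branching keywords, short-circuit operators, and the ternary operator.
DECISION_POINTS = re.compile(r"\b(?:if|for|while|case|catch)\b|&&|\|\||\?")


def strip_non_code(java_source):
    """Replace comments and literals with a space so they are not counted."""
    return NON_CODE.sub(" ", java_source)


def approx_cyclomatic_complexity(java_source):
    """Approximate McCabe complexity as 1 + number of decision points."""
    code = strip_non_code(java_source)
    return 1 + len(DECISION_POINTS.findall(code))


SAMPLE = """
int sign(int x) {
    // if this comment were counted, the result would be wrong
    if (x > 0 && x < 100) {
        return 1;
    } else if (x < 0) {
        return -1;
    }
    return 0;
}
"""

if __name__ == "__main__":
    print(approx_cyclomatic_complexity(SAMPLE))
```

In the sample, the two `if` keywords and the `&&` operator are the decision points; the `if` inside the comment is removed before counting.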

Misc Features

make run
python --train_corpus=openness --recognizer=linear_regression --features mean_number_of_empty_classes
python --train_corpus=openness --recognizer=linear_regression --features ratio_of_unparsable_sections

Creating Submission Files

Copy the required config file into the configs folder of a specific run and execute:

python --run_folder runs/run1


If you want to cite us in your work, please use the following BibTeX entry:

  author    = {Matthias Liebeck and
               Pashutan Modaresi and
               Alexander Askinadze and
               Stefan Conrad},
  title     = {{Pisco: {A} Computational Approach to Predict Personality Types from
               Java Source Code}},
  booktitle = {Working notes of {FIRE} 2016 - Forum for Information Retrieval Evaluation},
  pages     = {43--47},
  year      = {2016},
  url       = {},





