Skip to content
Personality Recognition in Source Code
Python Other
  1. Python 98.8%
  2. Other 1.2%
Branch: master
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Type Name Latest commit message Commit time
Failed to load latest commit information.
data Add method names of 1-2 folds Sep 11, 2016
pisco
runs
tests
.gitattributes updated .gitattributes Jul 21, 2016
.gitignore
.travis.yml
Dockerfile
License.md Add license file Feb 22, 2017
Makefile
README.md
evaluate.py
gridsearch.py Fix pep8 Feb 22, 2017
requirements.txt
setup.cfg
setup.py Update author emails Feb 22, 2017

README.md

Pisco: Personality Identification in Source Code

Build Status license

Pisco is a program that identifies the personality trait of a software developer based on his or her Java code. It was developed for the PR-SOCO challenge 2016. Our research paper can be accessed here.

Abstract of our paper

We developed an approach to automatically predict the personality traits of Java developers based on their source code for the PR-SOCO challenge 2016. The challenge provides a data set consisting of source code with their associated developers' personality traits (neuroticism, extraversion, openness, agreeableness, and conscientiousness). Our approach adapts features from the authorship identification domain and utilizes features that were specically engineered for the PR-SOCO challenge. We experiment with two learning methods: linear regression and k-nearest neighbors regressor. The results are reported in terms of the Pearson product-moment correlation and root mean square error.

Dependencies

  • docker
  • knife (will be automatically pulled from Docker Hub)

How to setup the project:

This will install all required dependencies.

cd pisco
make build

How to evaluate all features

python evaluate.py --train_corpus=openness --recognizer=linear_regression --features all

How to evaluate single features

Style Features

make run
python evaluate.py --train_corpus=openness --recognizer=linear_regression --features mean_length_of_method_names
python evaluate.py --train_corpus=openness --recognizer=linear_regression --features mean_length_of_method_parameter_names
python evaluate.py --train_corpus=openness --recognizer=linear_regression --features mean_length_of_field_names
python evaluate.py --train_corpus=openness --recognizer=linear_regression --features mean_length_of_local_variable_names_in_methods

Structure Features

make run
python evaluate.py --train_corpus=openness --recognizer=linear_regression --features mean_number_of_classes
python evaluate.py --train_corpus=openness --recognizer=linear_regression --features cyclomatic_complexity
python evaluate.py --train_corpus=openness --recognizer=linear_regression --features mean_number_of_methods
python evaluate.py --train_corpus=openness --recognizer=linear_regression --features mean_number_of_method_parameters
python evaluate.py --train_corpus=openness --recognizer=linear_regression --features mean_length_of_methods
python evaluate.py --train_corpus=openness --recognizer=linear_regression --features mean_number_of_fields
python evaluate.py --train_corpus=openness --recognizer=linear_regression --features mean_number_of_local_variables_in_methods
python evaluate.py --train_corpus=openness --recognizer=linear_regression --features duplicate_code_measure
python evaluate.py --train_corpus=openness --recognizer=linear_regression --features contains_IDE_template_text
python evaluate.py --train_corpus=openness --recognizer=linear_regression --features ratio_of_external_libraries

Misc Features

make run
python evaluate.py --train_corpus=openness --recognizer=linear_regression --features mean_number_of_empty_classes
python evaluate.py --train_corpus=openness --recognizer=linear_regression --features ratio_of_unparsable_sections

Creating Submission Files

Copy the required config file into the configs folder or a specific run and execute:

python run.py --run_folder runs/run1

Citation

I you want to cite us in your work, please use the following BibTeX entry:

@inproceedings{pisco,
  author    = {Matthias Liebeck and
               Pashutan Modaresi and
               Alexander Askinadze and
               Stefan Conrad},
  title     = {{Pisco: {A} Computational Approach to Predict Personality Types from
               Java Source Code}},
  booktitle = {Working notes of {FIRE} 2016 - Forum for Information Retrieval Evaluation},
  pages     = {43--47},
  year      = {2016},
  url       = {http://ceur-ws.org/Vol-1737/T1-8.pdf},
},
You can’t perform that action at this time.