Django-Admin based web application supporting automatic code analysis, code reading, and code-comments coherence evaluation.
The application has been developed to support the creation of the Coherence Dataset, a publicly available benchmark containing annotated (code, comments) pairs and their corresponding Coherence evaluations.
The entire code base of the project has been developed in Python 3 and Django 1.6.x.
Other dependencies are:
celery and rabbitmq: necessary in the code analysis phase, which is activated every time a new project is uploaded;
Antlr3 (included): to support Java code analysis (Note: so far, only the analysis of Java code is supported).
To set up the entire project, a properly configured Python environment that satisfies all the package dependencies is required.
For the sake of simplicity, these dependencies have been collected in the requirements.txt file by the pip freeze command.
Thus, to install all the dependencies, execute the following command:
pip install -r requirements.txt
To avoid polluting the main system (Python) environment, it is highly recommended to create a dedicated virtual environment:
virtualenv -p python3 <DEST_DIR>
source <DEST_DIR>/bin/activate
pip install -r requirements.txt
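Before installing the requirements, it can be useful to confirm that the virtual environment is actually active. The following is a minimal sketch, not part of the project: the helper name `in_virtual_environment` is hypothetical, and it relies only on standard `sys` attributes.

```python
import sys


def in_virtual_environment():
    """Return True when the interpreter runs inside a virtual environment."""
    # Legacy virtualenv releases set sys.real_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    # venv and recent virtualenv releases make sys.base_prefix differ
    # from sys.prefix when an environment is active.
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtual_environment())
```

If the script reports `False` after `source <DEST_DIR>/bin/activate`, the activation most likely happened in a different shell session.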
Once the environment has been properly set up, it is finally necessary to create a database using PostgreSQL (9.3). For further details on this, please refer to the official documentation for your machine and operating system.
Please refer to the DATABASES directive in the Main Settings (code_comments_coherence/code_comments_coherence/settings.py) for the database name and the corresponding authentication parameters.
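For orientation, a DATABASES directive for PostgreSQL in Django 1.6 generally has the shape sketched below. Every value here (database name, user, password, host, port) is a hypothetical placeholder, not the project's actual configuration; the real values are the ones found in settings.py.

```python
# Sketch of a Django 1.6 DATABASES directive for PostgreSQL.
# All values are hypothetical placeholders; use those in settings.py.
DATABASES = {
    "default": {
        # PostgreSQL backend shipped with Django 1.6 (requires psycopg2).
        "ENGINE": "django.db.backends.postgresql_psycopg2",
        "NAME": "coherence_db",        # hypothetical database name
        "USER": "coherence_user",      # hypothetical database user
        "PASSWORD": "change-me",       # hypothetical password
        "HOST": "localhost",           # assumes a local PostgreSQL server
        "PORT": "5432",                # PostgreSQL default port
    }
}
```

The database and user created in PostgreSQL must match the NAME, USER, and PASSWORD entries, otherwise Django will fail to connect.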
The code ships with a set of initial data to re-create the database from scratch.
The data are provided in the form of fixtures, located in the source_code_analysis/fixture folder.
This folder contains the initial_data archive and the corresponding instructions to unpack it.
To recreate the entire database (structure and data), execute the following command:
python manage.py syncdb
In case it is necessary to upload the source code of new software projects, it is required to have a set of Celery workers running on a RabbitMQ queue.
To spawn server(s), please take a look at the RUNNING_Servers.md file.
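As a rough illustration of how Celery is pointed at a RabbitMQ queue, the sketch below builds an AMQP broker URL. It assumes a Celery 3.x-style configuration, where the setting is named `BROKER_URL`, and a local RabbitMQ server with its default `guest` account, port, and virtual host. The helper `build_broker_url` is hypothetical, and the project's actual broker settings may differ.

```python
def build_broker_url(user="guest", password="guest",
                     host="localhost", port=5672, vhost="/"):
    """Compose an AMQP URL; the defaults mirror a stock RabbitMQ install."""
    # The virtual host follows the slash after the port, so the default
    # vhost "/" yields a URL ending in "//".
    return "amqp://{0}:{1}@{2}:{3}/{4}".format(user, password, host, port, vhost)


# Assumed Celery 3.x-style setting name; adjust to the installed version.
BROKER_URL = build_broker_url()

if __name__ == "__main__":
    print(BROKER_URL)
```

Credentials other than the defaults should be supplied through the function arguments rather than hard-coded in a shared settings file.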
A. Corazza, V. Maggio, G. Scanniello, On the Coherence Between Comments and Implementations in Source Code,
In Procs of 41st EUROMICRO Conference on Software Engineering and Advanced Applications (SEAA),
26-28 Aug. 2015, Madeira (Portugal) DOI: 10.1109/SEAA.2015.20