gecco2015

This project contains the source code necessary for reproducing the results from our paper "An Efficient Structural Diversity Technique for Genetic Programming." It also includes the configuration files used for each experiment discussed in the paper.

See LICENSE.txt for license information. NOTE: Although this code can be used as-is, it is not intended to be a general GP library and is not documented as one. Its intended use is reproducing our results and, potentially, extending the technique. For a well-documented, general GP (and, more broadly, EC) framework, we recommend a larger project such as ECJ.

DEPENDENCIES:

Java 1.6 or above

Apache Maven - for building the project (and automatically gathering required dependencies)

Matplotlib - for plotting results (tested on Linux with version 1.3.1 and on Windows x86 using Anaconda Python 2.1.0)

Apache Commons Lang 3

Apache Log4j 1.2.17

JUnit 4.4

BUILDING THE CODE:

The easiest way to build this project into an executable jar file is to use Apache Maven:

Download and install Apache Maven (http://maven.apache.org/)
Navigate to the top-level directory of this project and execute "mvn clean package". Assuming that Maven is installed correctly and you have the correct permissions, this downloads all required dependencies and packages them into the executable jar file.

RUNNING THE CODE:

After you have built the executable jar file as described above, simply execute the following:

java -jar target/gp-research-0.1.jar conf/CONFIG_FILE.properties

This uses the config file located at conf/CONFIG_FILE.properties and saves the resulting output to the directory "output" by default. To use a different output directory, specify a path with the outputDir=DIRECTORY option; the directory may be new, provided you have the appropriate permissions. This is useful when automating execution to perform several independent trials.
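As a rough sketch of automating several independent trials, the Python snippet below builds one command line per trial, each with its own output directory. It assumes that outputDir=DIRECTORY is accepted as an extra argument after the config file path, which this README does not confirm; if the option belongs in the .properties file instead, adjust accordingly. The function name build_trial_commands and the trial_N directory naming are illustrative only.

```python
import os


def build_trial_commands(config_path, base_output_dir, num_trials,
                         jar_path="target/gp-research-0.1.jar"):
    """Return one java command (as an argument list) per independent trial.

    ASSUMPTION: outputDir=DIRECTORY is passed as an extra command-line
    argument after the config file; verify this against the code before use.
    """
    commands = []
    for trial in range(num_trials):
        # Hypothetical naming scheme: one sub-directory per trial.
        output_dir = os.path.join(base_output_dir, "trial_%d" % trial)
        commands.append(["java", "-jar", jar_path, config_path,
                         "outputDir=" + output_dir])
    return commands


if __name__ == "__main__":
    for cmd in build_trial_commands("conf/CONFIG_FILE.properties", "output", 3):
        # Print the command; to launch it, pass cmd to subprocess.check_call.
        print(" ".join(cmd))
```

Keeping each trial in its own directory means the plotting scripts can later be pointed at a parent directory of runs without any trial overwriting another.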

NOTE: The configuration files were all taken directly from our experiments, which were run in a high-performance computing environment. You may need to adjust the number of threads (numThreads option in the configuration files) for your system.
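If you need to adjust numThreads across many configuration files, a small helper along these lines may save some manual editing. This is only a sketch: it assumes the option appears on its own line as numThreads=VALUE (standard Java .properties syntax), and the function name set_num_threads is hypothetical rather than part of this project.

```python
import re

# Matches a line such as "numThreads=32" or "numThreads = 32" and captures
# everything up to (but not including) the value.
_NUM_THREADS_LINE = re.compile(r"^([ \t]*numThreads[ \t]*[=:][ \t]*).*$",
                               re.MULTILINE)


def set_num_threads(properties_text, num_threads):
    """Return properties_text with the numThreads value replaced.

    ASSUMPTION: the option is written on its own line in key=value form.
    If no numThreads line is found, the setting is appended instead.
    """
    replacement = r"\g<1>%d" % num_threads
    new_text, count = _NUM_THREADS_LINE.subn(replacement, properties_text)
    if count == 0:
        if new_text and not new_text.endswith("\n"):
            new_text += "\n"
        new_text += "numThreads=%d\n" % num_threads
    return new_text


if __name__ == "__main__":
    # "someOtherOption" is a made-up key used purely for illustration.
    example = "someOtherOption=1\nnumThreads = 32\n"
    print(set_num_threads(example, 4))
```

To apply it, read a file from conf/, pass its contents through set_num_threads, and write the result to a copy so the original experiment configuration stays intact.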

NOTE: When using multiple threads, the fitness evaluation count will likely differ slightly between runs even when the random seed is explicitly set to the same value, because thread execution order is not guaranteed. We do not set random seeds in our experiments because each run is independent; this is only something to be aware of if you experiment with random seeds.

GENERATING PLOTS:

We have also included all the scripts necessary for recreating the plots from the paper. These scripts are intended to run on a directory containing data files from multiple runs of the code. The scripts are located in the postRunScripts directory:

compareFitness.py - Plots mean best fitness over time.

Example usage: python compareFitness.py directory1,directory2,directory3 label1,label2,label3 outputDirectory

convergence.py - Plots bar charts of the mean number of fitness evaluations required to find a solution, for the runs in which a solution was found. Also creates (1) a file containing pairwise Mann-Whitney U test p-values for every input directory supplied, and (2) a file containing the mean number of fitness evaluations required.

Example usage: python convergence.py directory1,directory2,directory3 label1,label2,label3 outputDirectory

successRates.py - Creates (1) a text file containing the success rates (percentage of runs in which a solution was found) for each input directory supplied, and (2) a file containing pairwise Fisher's Exact test p-values for every input directory supplied.

Example usage: python successRates.py directory1,directory2,directory3 label1,label2,label3 outputDirectory

treeTagPlots.py - Plots the mean density of the most dense genetic marker over time (in generations), as described in our paper. Note: The LEGEND_CODE corresponds to the Matplotlib legend location integer code, which is used to position the legend. We used 1 (upper right) for results from our approach and 4 (lower right) for results from standard GP.

Example usage: python treeTagPlots.py inputDirectory LEGEND_CODE outputDirectory
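The three comparison scripts above (compareFitness.py, convergence.py, successRates.py) share the same argument shape: a comma-separated list of directories, a matching comma-separated list of labels, and an output directory. The sketch below shows one way to assemble such a command from Python lists. The helper name build_plot_command and the example directory names are hypothetical, and the check that the two lists have equal length is our assumption about what the scripts expect.

```python
def build_plot_command(script, directories, labels, output_directory):
    """Build the argument list for one of the comparison scripts.

    ASSUMPTION: each input directory needs exactly one label, in the same
    order, and neither directories nor labels contain commas.
    """
    if len(directories) != len(labels):
        raise ValueError("expected one label per directory")
    for name in list(directories) + list(labels):
        if "," in name:
            raise ValueError("commas are used as separators: %r" % name)
    return ["python", script,
            ",".join(directories), ",".join(labels), output_directory]


if __name__ == "__main__":
    # Directory and label names here are placeholders, not project paths.
    cmd = build_plot_command("compareFitness.py",
                             ["runsA", "runsB"],
                             ["approach A", "approach B"],
                             "plots")
    print(cmd)
```

Returning a list rather than a single string lets the command be passed directly to subprocess.check_call, so labels that contain spaces need no extra shell quoting.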
