Merge bug mining into master (#135)
* slight modifications for build script

* fix a few little things making them work a little better

* adjust some of these variables

* might not even use this bugmine.sanity.check after all; it seems to have a problem getting the major classpath, but that shouldn't be too much of an issue really

* clean that up

* document that and remove dummy param, it was silly anyway

* update some basic project lib stuff

* update paths on that and remove trailing spaces and backslashes

* be a little more verbose when skipping existing entries

* it's actually just trigger, not triggering

* all ant tasks return boolean for success not a return status code

* adapt this a little

* fix issues that cause obvious failure, still failing now but just not getting the right results

* fix get class list and update readme to reflect any changes to bug mining that I forgot about

* move dependency module list into main readme, add glossary to main readme

* update paths one last time

* minimize patch and promote to directory scripts updated

* copy notice about new build file location

* update method fingerprint

* remind myself to remove this

* remove framework/build-scripts

* copy overview to main readme

* don't need to just diff the src directory

* do the same in the lang project module

* bugmine.sanity.check was just for testing; it isn't even used now and can be removed

* update comment

* Revert change in string interpolation syntax

* In checkout_id update variable name and comment

* Updated a comment in Project module.

* Removed a delegation method: checkout_vid is already implemented and checkout_id is never called by the bug-mining scripts.

* Some simplifications and consistency changes.

* Consistently use the work directory, if provided.

* Infer commit-db and build file location from work_dir; expect work_dir to be provided in each Project constructor.

* Let ant report that error

* Remove outdated comment

* Fixed broken Project module.

* Added another variable for the projects directory.

* Started removing the work_dir attribute, which is replaced by a global PROJECTS_DIR attribute.

* Remove failing, flaky, and random tests in Project module.

* More fixes in core Modules; dynamically determine the directory layout.

* Added the adapted build.xml template.

* Updated Project template.

* Updated

* Renamed directory mapping for Math.

* Updated Math and Lang project modules.

* Made template consistent with Lang and Math project modules.

* Updated the documentation.

* Updated two more scripts, which still require some fine tuning.

* Updated the core framework to support bootstrapping in the bug-mining scripts.

* Don't hard-code the location of the project build file.

* Fixed init script; create all necessary dirs in the create-project script.

* Pass over documentation.

* Fixed a layout issue.

* Skip invalid candidates.

* Fixed a typo.

* Improved script documentation; consistently use -b for bug ids.

* Tweaked the README.

* Minor fixes and output tweaks.

* Clone the repo when creating the project.

* More documentation improvements.

* Updated the get-trigger script; tweaked perldoc of analyze-project script.

* More documentation updates.

* Removed PatchReader from get-class-list script and updated documentation.

* Pass over main README.

* Create all directories when setting up a project; properly set D4J core variables in get-class script.

* PatchReader is no longer required.

* Update the project_repos README after cloning the repo.

* Updated the minimize-patch script.

* Updated file headers.

* Minimize changes in the PR since we eliminated the confusing use of prog_root and work_dir -- we should rename the prog_root attribute to work_dir, though.

* More prog_root -> work_dir changes.

* Updated Chart project module.

* Added the init subroutine, which is called by the bug-mining scripts, to the template.

* Updated documentation.

* prog_root -> work_dir.

* Fixed Chart project module.

* Don't apply co_hook in Vcs module.

* Reverted changes that are no longer required.

* Added the cached directory layout map for Chart.

* More information about locales (addresses #129).

* Don't hard-code paths to meta data in util scripts.

* Properly set environment variables in get-class-list script.

* Run fewer bugs for Math to avoid travis timeouts.

* Added a todo in the Project module.

* Updated Time module.

* Use revision IDs to identify broken builds in Time.

* Temporarily disable a couple tests.

* Some build files in Time are broken; added the functionality back in to fix these.

* Added the generated directory layout map.

* Renamed Mockito files.

* Added a todo.

* Renamed a subroutine.

* Updated Mockito module.

* Improved project module template.

* Updated Closure module.

* More descriptive names for build stages.

* Fixed a typo in travis config; allocate more time for Closure defects.

* Fixed a regex typo.

* Added the directory layout for Closure.

* Final dir-layout for Closure.

* Added one more test that fails non-deterministically on Travis.

* Moved glossary of D4J terms to bug-mining README.

* Some clarifications in README.

* Fewer jobs for Mockito.

* Split a batch of 10 Closure bugs that always time out on Travis into two batches.

* Renamed promote script and update the documentation; still need to update the code.

* Updated the README.

* Fixed a typo in project module template.

* In download issues script add new param into documentation and require it for github

* In download issues script append org to uri unless it's part of project already

* Updated README and added TODOs.

* Correct download issues script to require an org for github (through project or through new org arg)

* Update readme and download issues script to be more obvious about difference between project id and tracker id

* Renamed and updated get-metadata script.

* Fixed minimize-patch script.

* Smaller change in README.

* Fixed a typo in the main build file.

* Fleshing out bug mining README

* Added more information on bug-mining README

Analyzer-related information will be added later.

* Fixed typos in formatter.

* Added build-file-analyzer to lib, and updated gitignore.

* Updated initialize-revision to call analyzer.

* Updated initialize-revision to download dependencies.

* Updated Formatter to handle special cases where test names passed in are missing package names.

* Require tracker_id in merge-commit-db

* Ignore any other special params on commit db when looking for revs

* Updated to use analyzer produced test patterns.

* Updated README.

* All new bugs entered will get tracker id

* Small change in Formatter and removed unnecessary files.

* Use OS-specific path separator to find build file.

* Changed project directory to relative path

* Fix call to create_project in promote to db

* Fix missing variable declaration in

* Add tracker id at the rev list database

* Adding in tracker name but it isn't working for whatever reason

* Using n is not permitted for arguments with getopts, who would have guessed that

* Allow case-insensitive user input.

* Modified README.

* Reorganize test directories and add new test cases for analyzer.

* Add one more test case.

* Update analyzer build file.

* Add one more edge case, and refactored code.

* Changed error message to print with debugger.

* Add one more test case.

* Moved subtask helper method to util.

* Move method to util.

* Analyze src, test, src_output, and test_output directories.

* Update test cases to support new functions, and update analyzer.jar.

* Fix wrong target input and hot fix on getting src.test.dir.

* Update README and analyzer.jar.

* Can supply src directory to the merge-commit-db script for projects with a non-standard src directory

* Clarified a few bits and added a few TODOs.

* Clarified 'project name' format.

* Attempt to normalize documentation, options, and code.

* Pass over scripts that initialize revisions.

* Removed invalid instruction.

* All bug-mining scripts are executable and don't require an explicit call of perl.

* Fixed broken merge.

* Improved the bug-mining README.

* Restored meta data for Time.

* Removed a file that should not be under version control

* Minor tweak in README.

* Moved build-analyzer utility program to another repository.

* Avoid keeping data of a failure bid.

* Revert a previous commit and export variables.

* Added info messages.

* Fail if the build system is not ant or maven.

* Fixed variables.

* Augmented the list of generic files that should be copied over.

* Re-use existing function to copy files.

* Augmented list of revision specific files.

* Copy over 'lib' directory.

* Copy over the repository directory.

* Fixed regex to identify the directory name.

* Update project_repos README file.

* Copy project submodule.

* Collect issue URLs.
Fixed collection of issues from Google Code.

* Populate 'commit-db' with issue-tracker ID and URL.

* Promote a commit-db with issue-tracker ids and issue urls.

* Older commits first.

* -r parameter is mandatory.

* Attempt to simplify the step-by-step tutorial.

* Fix.

* Perform a couple sanity checks after minimizing a patch.

* Removed debug messages.

* Override global constant; otherwise the checkout command is not able to find the project repository.

* Copy over the 'relevant_tests' directory.

* Added test cases for the bug-mining framework.

* Java-7 support.

* ${d4j.home}/framework/projects/ -> ${d4j.dir.projects}/

* Added test_bug_mining to travis configuration file.

* The bug-mining framework might require Apache Maven on the PATH to mine projects/bugs that use Maven as their build system.

* Attempt to debug Travis failure.

* Added module List::Util to the list of required Perl modules as the script in the bug-mining framework requires two additional functions only available in recent versions: the 'all' function is available since version 1.33 and 'pairmap' function is available since 1.29.

* Relaxed assertions of trigger_tests and failing_tests.

* Attempt to address the "Can't locate in @INC" issue.

* Deleted debug message.

* Utility script requires diffstat to determine the modified files.

* Inform bug-mining users that diffstat must be installed.

* Several command-line options were not enabled. Some are still not used, but I allow them to be set, and have enabled the query option to actually work, as that is needed for projects like GSON that do not label issues as 'bug'

* Extracted issue IDs from the commit history.

* Report the oldest bug report id. Commit c7a581e55fc988bd90fa4bb1b0acece5181b7c5f addressed 3 bug reports: #60 (created on 2008-10-20), #65 (created on 2009-01-31), and #102 (created on 2010-11-04).

* Fix: report issue IDs and not pull request IDs.

* Use the official google archive URL.

* Added subroutines to extract a bug report ID and a bug report URL.

* Fixed Chart commit-db file: all rows should have commas, even if no ID/URL is available.

* Re-use existing subroutines to extract a bug report id/url.

* Fixed commit-db path.

* Removed unused use.

* Updated version number.
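
One item above says that when a commit addresses several bug reports, the oldest report is the one recorded. A minimal sketch of that selection, using the three report IDs and creation dates quoted in the commit message; the hash name and the use of `reduce` are illustrative assumptions, not the script's actual code:

```perl
use strict;
use warnings;
use List::Util qw(reduce);

# Report IDs and creation dates quoted in the commit message above.
# The %created hash is a hypothetical structure used only for this sketch.
my %created = (
    60  => '2008-10-20',
    65  => '2009-01-31',
    102 => '2010-11-04',
);

# ISO 8601 dates (YYYY-MM-DD) sort correctly as plain strings, so a
# string comparison is enough to find the earliest creation date.
my $oldest = reduce { $created{$a} le $created{$b} ? $a : $b } keys %created;

print "Oldest bug report: #$oldest\n";
```

For the dates listed, this selects report #60.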
tomecho authored and rjust committed Jan 18, 2019
1 parent d024b18 commit 8830c99
Showing 71 changed files with 6,802 additions and 720 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -3,4 +3,5 @@ major/
7 changes: 7 additions & 0 deletions .travis.yml
@@ -4,6 +4,12 @@ perl:

os: linux

- maven
- diffstat

- /home/travis/.java/
@@ -35,6 +41,7 @@ jobs:
- script: carton exec ./
- script: carton exec ./
- script: carton exec ./
- script: carton exec ./
# Verify that all bugs are reproducible (run multiple jobs for projects that
# take a long time to finish).
- stage: verify-bugs
7 changes: 6 additions & 1 deletion
@@ -1,4 +1,4 @@
Defects4J -- version 1.3.1 [![Build Status](](
Defects4J -- version 1.4.0 [![Build Status](](
Defects4J is a collection of reproducible bugs and a supporting infrastructure
with the goal of advancing software engineering research.
@@ -151,6 +151,11 @@ provides the following scripts:
| [run_evosuite]( | Generate test suites using EvoSuite |
| [run_randoop]( | Generate test suites using Randoop |

Mining and contributing additional bugs to Defects4J
The bug-mining [README](framework/bug-mining/ details the bug-mining process.

Additional resources

1 change: 1 addition & 0 deletions cpanfile
@@ -3,3 +3,4 @@ requires 'DBD::CSV', '>= 0.48';
requires 'URI', '>= 1.72';
requires 'JSON', '>= 2.97';
requires 'JSON::Parse', '>= 0.55';
requires 'List::Util', '>= 1.33';
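
The `>= 1.33` bound matches the commit message: `all` is available since List::Util 1.33 and `pairmap` since 1.29, so 1.33 is the lowest version providing both. A minimal sketch of the two functions follows; the option list is hypothetical and is not taken from the bug-mining scripts:

```perl
use strict;
use warnings;
# Requesting version 1.33 here fails at compile time on older installs.
use List::Util 1.33 qw(all pairmap);

# Hypothetical key/value list, used only for illustration.
my @opts = (p => 'Lang', b => 1, w => '/tmp/work');

# pairmap consumes the list two items at a time:
# $a holds the key and $b holds the value of each pair.
my @rendered = pairmap { "-$a $b" } @opts;

# all is true only if the block holds for every element.
my $all_defined = all { defined $_ } @opts;

print join(' ', @rendered), "\n";
print $all_defined ? "all values defined\n" : "undefined value found\n";
```

Declaring the minimum version in the `use` line as well as in the cpanfile makes a too-old List::Util fail early with a clear message instead of an undefined-subroutine error at run time.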
11 changes: 11 additions & 0 deletions framework/bin/d4j/d4j-info
@@ -77,12 +77,17 @@ my $BID = $cmd_opts{b};
my $project = Project::create_project($PID);
my $revision_id;
my $revision_date;
my $bug_report_id;
my $bug_report_url;
# Check version id
if (defined $BID) {
$BID =~ /^(\d+)$/ or die "Wrong bug_id format: $BID! Expected: \\d+";
# Obtain revision ID and date
$revision_id = $project->lookup("${BID}f");
$revision_date = $project->{_vcs}->rev_date($revision_id);
# Obtain bug report ID and url
$bug_report_id = $project->bug_report_id($BID);
$bug_report_url = $project->bug_report_url($BID);

@@ -112,6 +117,12 @@ if (defined $BID) {
print("Revision date (fixed version):\n");
print("Bug report id:\n");
print("Bug report url:\n");
print("Root cause in triggering tests:\n");
foreach my $i (0..$#trigger) {
next unless $trigger[$i] =~ /(--- )(.*)/;
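
The `bug_report_id` and `bug_report_url` calls added to d4j-info presumably read the issue-tracker ID and URL that this commit adds to each commit-db row. The Project module's implementation is not part of this excerpt, so the following is only a sketch under assumptions: the column order (`bug_id,buggy_rev,fixed_rev,report_id,report_url`), the sample rows, and the `lookup_report` helper are all hypothetical. It shows why the commit message insists that every row keeps its commas even when no ID or URL is available:

```perl
use strict;
use warnings;

# Assumed row layout (not confirmed by this diff):
#   bug_id,buggy_rev,fixed_rev,report_id,report_url
# Hypothetical rows; the second has no bug report but keeps its commas.
my @rows = (
    '1,abc123,def456,LANG-747,https://example.org/browse/LANG-747',
    '2,0a1b2c,3d4e5f,,',
);

# Hypothetical helper: return (report_id, report_url) for a bug ID,
# with undef standing in for an empty field.
sub lookup_report {
    my ($bid, @lines) = @_;
    foreach my $line (@lines) {
        # A limit of -1 keeps trailing empty fields, so a row without a
        # report still splits into exactly five columns.
        my @cols = split /,/, $line, -1;
        next unless @cols == 5 && $cols[0] eq $bid;
        my ($id, $url) = @cols[3, 4];
        return (length($id) ? $id : undef, length($url) ? $url : undef);
    }
    return (undef, undef);
}

my ($id, $url) = lookup_report(2, @rows);
printf "Bug report id: %s\n",  defined $id  ? $id  : 'NA';
printf "Bug report url: %s\n", defined $url ? $url : 'NA';
```

With a fixed column count, a missing report is an empty field rather than a malformed row, which keeps the lookup free of special cases.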