forked from ASSERT-KTH/drr

Tool & data on the correctness of Defects4J patches generated by program repair tools

License

Notifications You must be signed in to change notification settings

Zustin/drr

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Automated Patch Assessment for Program Repair

A tool for automatically assessing the correctness of patches generated by program repair systems. We treat the human-written patch as the ground truth oracle and use Random tests based on the Ground Truth (RGT) to assess machine-generated patches. See the paper "Automated Patch Assessment for Program Repair at Scale".
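
The decision rule implied by RGT can be sketched as follows. This is a minimal illustration, not the tool's implementation: the function name `assess_patch` and the dictionary of test outcomes are hypothetical, and the sketch assumes that the tests were generated from the human-patched program and pass on it, so a failure on a machine patch signals behavior that differs from the ground truth.

```python
from typing import Dict, List, Tuple


def assess_patch(rgt_outcomes: Dict[str, bool]) -> Tuple[str, List[str]]:
    """Classify a patch from RGT test outcomes (hypothetical helper).

    rgt_outcomes maps a test name to True if the test passed on the
    machine-patched program. The tests are assumed to pass on the
    human-patched (ground truth) program.
    """
    failing = sorted(name for name, passed in rgt_outcomes.items() if not passed)
    # Any failing ground-truth test indicates a behavioral difference
    # from the human patch, so the patch is reported as overfitting.
    verdict = "overfitting" if failing else "no overfitting detected"
    return verdict, failing
```

Note that a patch with no failing test is only reported as "no overfitting detected": passing the generated tests does not prove the patch correct.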

If you use this repo, please cite:

@Article{Ye2021EMSE,
    author = {Ye, He and Martinez, Matias and Monperrus, Martin},
    title = {Automated Patch Assessment for Program Repair at Scale},
    journal = {Empirical Software Engineering},
    volume = {26},
    issn = {1573-7616},
    doi = {10.1007/s10664-020-09920-w},
    year = {2021}
}

Folder Structure

├── Patches: 257 patches from Dcorrect and 381 patches from Doverfitting
│ 
├── RGT: incl. tests from Evosuite2019, Randoop2019, EvosuiteASE15, RandoopASE15 and EvosuiteEMSE18
│   
├── DiffTGen
│   ├── Results: the overfitting patches found by running DiffTGen
│   ├── runDrr.py: a script to reproduce the DiffTGen experiment (see details below)
│ 
├── statistics: our experimental statistics for all RQs
│ 
└── run.py: a script to reproduce all experiments

Prerequisites

  • JDK 1.7
  • OS: Linux or macOS
  • Set the environment variable DEFECTS4J_HOME="home_of_defects4j"
  • Add defects4j as a submodule and check out commit 486e2b4 (please note that our experiments depend on several Defects4J commands):
git submodule add https://github.com/rjust/defects4j
cd defects4j
git reset --hard 486e2b49d806cdd3288a64ee3c10b3a25632e991
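
Before running the experiments, it can help to confirm that DEFECTS4J_HOME is set and points to a directory. The check below is a minimal sketch under that assumption; `find_defects4j_home` is a hypothetical helper and is not part of run.py.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_defects4j_home(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return the directory named by DEFECTS4J_HOME (hypothetical helper)."""
    # Read from the real environment unless a mapping is supplied.
    env = os.environ if env is None else env
    value = env.get("DEFECTS4J_HOME")
    if not value:
        raise RuntimeError("DEFECTS4J_HOME is not set")
    home = Path(value)
    if not home.is_dir():
        raise RuntimeError(f"DEFECTS4J_HOME is not a directory: {home}")
    return home
```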

Run

To assess an individual patch for Defects4J:

./run.py patch_assessment <patch_id> <dataset:Dcorrect|Doverfitting> <RGT:ASE15_Evosuite|ASE15_Randoop|EMSE18_Evosuite|2019_Evosuite|2019_Randoop>  
example:  ./run.py patch_assessment patch1-Lang-35-ACS.patch Dcorrect 2019_Evosuite
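
When assessing many patches from a script, the arguments can be validated against the choices listed in the usage line before run.py is invoked. The sketch below only builds the argument list; `build_command`, `DATASETS`, and `RGT_SUITES` are hypothetical names introduced for illustration, and the allowed values are copied from the usage line above.

```python
from typing import List

# Allowed values, taken from the usage line of run.py.
DATASETS = ("Dcorrect", "Doverfitting")
RGT_SUITES = (
    "ASE15_Evosuite",
    "ASE15_Randoop",
    "EMSE18_Evosuite",
    "2019_Evosuite",
    "2019_Randoop",
)


def build_command(task: str, patch_id: str, dataset: str, rgt: str) -> List[str]:
    """Build the argument list for ./run.py after checking the choices."""
    if dataset not in DATASETS:
        raise ValueError(f"unknown dataset: {dataset}")
    if rgt not in RGT_SUITES:
        raise ValueError(f"unknown RGT suite: {rgt}")
    return ["./run.py", task, patch_id, dataset, rgt]


cmd = build_command(
    "patch_assessment", "patch1-Lang-35-ACS.patch", "Dcorrect", "2019_Evosuite"
)
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the repository root.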

To perform different sanity checks:

./run.py applicable_check
./run.py plausible_check

To identify flaky tests:

./run.py flaky_check <patch_id> <dataset:Dcorrect|Doverfitting> <RGT:ASE15_Evosuite|ASE15_Randoop|EMSE18_Evosuite|2019_Evosuite|2019_Randoop>  
example:  ./run.py flaky_check patch1-Lang-35-ACS.patch Dcorrect 2019_Evosuite
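
As background, a test is usually called flaky when its outcome changes across repeated runs on the same program. The sketch below illustrates that general notion only; it is not necessarily how flaky_check is implemented, and `find_flaky_tests` and its input format are hypothetical.

```python
from typing import Dict, List, Set


def find_flaky_tests(runs: List[Dict[str, bool]]) -> List[str]:
    """Return tests whose pass/fail outcome differs across repeated runs.

    Each element of runs maps a test name to True (passed) or False
    (failed) for one execution of the same test suite on the same program.
    """
    outcomes: Dict[str, Set[bool]] = {}
    for run in runs:
        for name, passed in run.items():
            outcomes.setdefault(name, set()).add(passed)
    # A test is flagged as flaky when it has both passed and failed.
    return sorted(name for name, seen in outcomes.items() if len(seen) > 1)
```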

To reproduce our experiments with RGT patch assessment:

RQ1: ./run.py RQ1
RQ3: ./run.py RQ3
RQ4: ./run.py RQ4
RQ5: cd ./statistics && ./RQ5-randomness-script.py <Evosuite2019|Randoop2019>

Results

Credits

  • For more details about Defects4J, see the original repository of the Defects4J benchmark.
  • For more details about DiffTGen, see the original DiffTGen repository.
