Guidance (methods, hints and tips)

This document is for reviewers using Github to assign badges to SE publications.

F.A.Q: Why use Github?

We prefer tools like Github over tools like EasyChair, HotCrp, etc., since Github allows for the collaborative, interactive process required to prepare artifacts for widespread use.

Also, if reviewers create anonymous ids for themselves, then Github easily supports blind review.

Goal

We say that artifact evaluation is a collaborative process where reviewers and authors work together to double check that research artifacts are ready to be used (or have been used) by other teams.

This is not an adversarial process.

  • Everything we deal with here relates to materials that have passed peer review at other venues.
  • Hence we expect a very large "acceptance" rate, i.e., most authors will get the badges that they request.

This is an interactive process where reviewers and authors may have multiple interactions, e.g.,

  • Reviewers point out some small issue that stops an artifact installing on a new platform.
  • Authors fix up their artifacts to remove that issue.

Actors

The review process described here has three actors: track chairs, reviewers, and authors.


Process for Track chairs:

  1. Recruitment: gather your review team. Collect anonymous Github ids from each reviewer.
  2. Create repo: make a public repo with one sub-directory for each badge that might be assigned.
    • Those badges may change from conference to conference and might include the following.
      • reusable
      • available
      • replicated
      • reproduced
    • For the definitions of the badges above, see the ICSE'20 artifacts track cfp.
  3. Define your tracking labels: these labels will be the process marks that track the stages of the review process.
    • The labels you use may change from conference to conference and might include the following. Note that the first four track the artifact's progress through the review process and the last five are final decisions (added at the end of the review process):
        1. InitialInstalls
        2. NeedsFirstReview
        3. NeedsSecondReview
        4. AuthorComment
      • nobadge
      • reusable
      • available
      • replicated
      • reproduced
    • One way to script the badge sub-directories and these labels is sketched after this list.
  4. Watch for pull requests from authors: ensure that their materials are submitted to unique sub-directory names under the badge directories.
  5. Assignment: to assign reviewers to artifacts, create one (and only one) issue per submission. At the top of that issue, add a link to the sub-directory for the artifact. Assign some reviewers and the authors to the issue.
  6. Monitor: watch the comment process, stamping out fruitless discussions.
    • During InitialInstalls, allow author/reviewer interaction (so the installs can be debugged).
    • During NeedsFirstReview and NeedsSecondReview, delete any author comments (so reviewers can reflect on their submission without author distraction).
    • During AuthorComment, allow author/reviewer interaction (so the artifacts can be improved and, where possible, reviewer issues can be resolved).
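
For track chairs who prefer to script steps 2 and 3, here is a minimal sketch that creates the badge sub-directories and the tracking labels. It assumes the repo is already checked out locally, a personal access token is available in the GITHUB_TOKEN environment variable, and a hypothetical repo name; the .gitkeep files and the label color are illustrative choices, not part of the process.

```python
# chairs_setup.py -- a minimal sketch of steps 2 and 3 above (assumptions noted below).
import os
import pathlib
import requests

REPO = "your-org/artifact-evaluation"   # hypothetical owner/repo name
BADGES = ["reusable", "available", "replicated", "reproduced"]
TRACKING_LABELS = ["InitialInstalls", "NeedsFirstReview",
                   "NeedsSecondReview", "AuthorComment"]
DECISION_LABELS = ["nobadge"] + BADGES

# Step 2: one sub-directory per badge. A .gitkeep file keeps otherwise-empty
# directories visible to git (a common convention, not a requirement).
for badge in BADGES:
    badge_dir = pathlib.Path(badge)
    badge_dir.mkdir(exist_ok=True)
    (badge_dir / ".gitkeep").touch()

# Step 3: create the tracking and decision labels via the Github REST API.
headers = {
    "Authorization": f"token {os.environ['GITHUB_TOKEN']}",
    "Accept": "application/vnd.github+json",
}
for name in TRACKING_LABELS + DECISION_LABELS:
    response = requests.post(
        f"https://api.github.com/repos/{REPO}/labels",
        headers=headers,
        json={"name": name, "color": "ededed"},  # color is an arbitrary choice
    )
    response.raise_for_status()
```

The labels can, of course, also be created by hand under the repo's Issues tab; the script is just a convenience when the same label set is reused across conferences.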

Process for Reviewers:

  1. Create ids: reviewers create an anonymous id for themselves and pass that id to the track chairs.
  2. InitialInstalls: once assigned to an issue, see if you can use the artifact. Use author/reviewer interaction (so the installs can be debugged).
  3. NeedsFirstReview / NeedsSecondReview: comment on the artifact (perhaps using the sample comments shown below). Ignore any author comments (so you can reflect on submissions without author distractions).
  4. AuthorComment: use author/reviewer interaction (so the artifacts can be improved and, where possible, reviewer issues can be resolved).


Process for Authors:

  1. Prepare the artifact: According to Wilson et al., artifacts can take many forms, such as simple text files, checklists to guide questionnaires, scripts, packages, containers, etc. You should choose an artifact format that works best for you.

    • That said, in their INSTALL.md document, authors should make a one-sentence note explaining why they did not submit their artifact as a VM or Linux container (if that was the case). There are certainly times when these formats are not appropriate, but in many research domains they can be used, and they can greatly improve the testing process for reviewers. That note could state whether the issue was lack of time, a preference for package systems (like PIP) or some homebrew build system, lack of knowledge, poor fit, etc.
  2. Check out: check out the repo maintained by the track chairs into your own local branch.

  3. Document:

    • Authors document their artifact using some files specified by the track chairs. Those files may change from conference to conference and might include:
      • CONFLICTS.md: list of review committee members that are conflicted with the authors (and should not review the submission).
      • CONTACTS.md: emails for authors.
      • LICENSE.md: usage permissions.
      • INSTALL.md: where to get the artifact; how to install it.
      • STATUS.md: if applying for multiple badges, list those here.
      • README.md: introductory notes on the artifact, and perhaps, tutorials on how to use it. Example1. Example2.
    • Those files are added to a sub-directory of the submissions directory in the repo (one way to scaffold them is sketched after this list).
    • Commit those changes (in your branch) back to Github.
  4. Pull request: issue a pull request to the master branch of the repo.

  5. Watch for help requests from reviewers (in the issue devoted to your submission).

  6. Then leave the reviewers alone during their review period. Note that track chairs will delete your comments during this period (so reviewers can reflect on your submission without your distractions).

  7. Once your submission's issues are labelled AuthorComment, feel free to interact extensively with the reviewers.
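
For authors who want a quick starting point for step 3, the sketch below writes stub versions of the documentation files. The badge directory, submission name, and placeholder text are assumptions for illustration only; follow the track chairs' instructions for the exact sub-directory your submission should use.

```python
# scaffold_submission.py -- a minimal sketch of step 3 above (paths and text are illustrative).
import pathlib

# Hypothetical location: a sub-directory named after your paper under a badge directory.
submission = pathlib.Path("reusable") / "my-paper-id"
submission.mkdir(parents=True, exist_ok=True)

placeholders = {
    "CONFLICTS.md": "List review committee members conflicted with the authors.\n",
    "CONTACTS.md": "Emails for the authors.\n",
    "LICENSE.md": "Usage permissions for the artifact.\n",
    "INSTALL.md": "Where to get the artifact and how to install it.\n",
    "STATUS.md": "The badges being applied for.\n",
    "README.md": "Introductory notes on the artifact and, perhaps, a short tutorial.\n",
}
for name, text in placeholders.items():
    target = submission / name
    if not target.exists():        # never overwrite files you have already written
        target.write_text(text)
```

After filling in the stubs, commit the sub-directory in your branch and open the pull request described in step 4.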


Tips for Authors

This section lists standard comments that reviewers have made about artifacts in past artifact evaluations. A wise author might review that list to see if their artifact might attract negative or positive reviewer comments.

This list comes from Erin Dahlgren's excellent report Getting Research Software to Work: A Case Study on Artifact Evaluation for OOPSLA 2019 and lists some common comments that reviewers make about artifacts.

Common Negative Comments

  • Environment
    • Not enough resources. Reviewers didn’t have enough physical resources to test the artifact locally in a reasonable amount of time.
    • Issues with software dependencies. Reviewers struggled to find and install the right software dependencies, sometimes preventing them from setting up and testing the artifact.
    • Works in limited environments. Reviewers had difficulty getting access to proprietary operating systems and running benchmarks locally.
  • Format
    • Issues with VM or container. Reviewers encountered errors when they tried to set up VMs and containers, or the VMs and containers slowed down the review process.
    • Problems with docs. Reviewers encountered typos in instructions, and missing or unclear instructions.
    • Errors in scripts. Reviewers wasted time debugging typos and unexpected errors in helper scripts.
    • Too complicated. Reviewers struggled to complete complicated instructions to test complicated artifacts.
  • Execution
    • Long running tests. Reviewers struggled to complete tests that took hours or days.
    • Issues compiling or running. Reviewers encountered errors when they tried to compile or run the artifact.
    • Ignored errors. Reviewers weren’t confident about artifacts that emitted errors, even if the results produced were correct.
    • Downloads during execution. Reviewers spent a long time running test cases that downloaded data on the fly. Reviewers were also worried that such data would not always be available.

Common Positive Comments

  • Self-contained. Reviewers were enthusiastic about artifacts that required minimal setup and worked seamlessly.
  • Lightweight. Reviewers benefited from artifacts that required minimal storage space. They also appreciated being able to download small parts of a large artifact.
  • Comprehensive documentation. Reviewers praised clear, easy to follow documentation that covered most if not all aspects of the artifact.