
Ten years reproducibility challenge: paper #41 #32

Closed
civodul opened this issue Apr 28, 2020 · 38 comments


civodul commented Apr 28, 2020

Original article: Storage Tradeoffs in a Collaborative Backup Service for Mobile Devices (2006)

PDF URL: article.submitted.pdf
Metadata URL: metadata.yaml
Code URL: https://gitlab.inria.fr/lcourtes-phd/edcc-2006-redone

Scientific domain: Fault tolerance, Software engineering
Programming language: C, Scheme
Suggested editor:


civodul commented Apr 28, 2020

This corresponds to ReScience/ten-years#1 (comment).


rougier commented May 4, 2020

Thanks for your submission. I've modified your post to include a link to your new article and metadata. We'll assign an editor soon.

@otizonaizit Could you handle this submission for the Ten Years Reproducibility Challenge (only one reviewer needed)?

otizonaizit (Member) commented:

I can edit this

otizonaizit self-assigned this May 13, 2020
otizonaizit (Member) commented:

@ogrisel : would you be up to review this?

otizonaizit (Member) commented:

@sabinomaggi : would you be up to review this?

sabinomaggi commented:

Ok, I think I can do that.

sabinomaggi commented:

Within one week / ten days?

otizonaizit (Member) commented:

sure!

sabinomaggi commented:

@otizonaizit

  • Very nice paper! May I contact the author for questions, clarifications, and the like? (I guess I can, but better to be sure.)
  • Should the review be written in this thread?


otizonaizit commented May 20, 2020 via email

sabinomaggi commented:

Hi @civodul
it was a real pleasure to read your paper, a very impressive reproduction, indeed!
GNU Guix seems very interesting, even if I wonder whether the average scientist (myself included!) would be able to manage it instead of using a Docker container or a Python virtual environment.

Before delving into the actual review of the paper, I would like to manage the code/licensing issue.

The 10 Years Challenge requires all participants to make both the original and the revised code available in a single repository, and to explicitly license it under an open-source license. However, your GitLab account https://gitlab.inria.fr/lcourtes-phd/ contains two separate repositories for the original and revised code, and in both of them you reserve all rights to the code.

Frankly, I don't know whether having two separate repositories rather than one is a real problem (I hope the editor @otizonaizit can help on this), but you definitely need to add an open license to your revised code. If possible, also to your original code (of course, the licenses could be the same).

otizonaizit (Member) commented:

Frankly, I don't know whether having two separate repositories rather than one is a real problem (I hope the editor @otizonaizit can help on this), but you definitely need to add an open license to your revised code. If possible, also to your original code (of course, the licenses could be the same).

Given that the repo with the original code contains only a single commit, i.e. there is no history to preserve, I'd prefer to see both original and new code in the same repo.

Following the challenge guidelines, the ideal setup is for the initial commit of the repo to add the old code, with the following commits being the adaptations that document what needed to be changed to reproduce the results.
And yes, a license should be added, at least to the new code :-)

@civodul : can you do this?


civodul commented May 22, 2020

Thanks @sabinomaggi and @otizonaizit for taking a look!

The files in https://gitlab.inria.fr/lcourtes-phd/edcc-2006-redone already had a GPLv3+ license header, but I've now added a top-level COPYING file for clarity. Apologies for the omission! This repository contains the only new pieces of code.

The original code from 2006 is used as-is. It comes from these two repositories:

  1. The libchop repository, under the GNU GPLv3-or-later. The 2006 revision being tested lacks explicit licensing because at the time it was private software; it was released under the GPLv3+ in 2010, once I had been authorized to do so.
  2. The chop-eval repository had never been released. It is an extension of libchop and as such falls under the GPLv3+ as well, which I've now made explicit.

I realize I took a fairly unconventional approach by reusing the original code as-is. It seems that the "Software" section of the guidelines expected that code would likely have to be modified, which I made a point of not doing here. The section also seems to assume that there is no version control history available for the source code, whereas I was able to recover the original version control history.

Let me know what you think!

sabinomaggi commented:

Hi @civodul
I think that the license issue is fixed, thanks for the quick reply!

I realize I took a fairly unconventional approach by reusing the original code as-is. It seems that the "Software" section of the guidelines expected that code would likely have to be modified, which I made a point of not doing here.

I don't think this is an issue here, since you were able to reproduce the paper as requested. What unfortunately is not immediately apparent is how you were able to reproduce the original paper. If I understand correctly, the 2006 paper performed all calculations by running the lout code from within the source of the paper itself -- more or less as can be done now with R Markdown, Jupyter Notebooks or Papermill (or even from LaTeX itself with the proper packages) -- while now you have translated everything to Guix while keeping the actual lout code unchanged. Could you please confirm whether I am right?

The original code from 2006 is used as-is. It comes from these two repositories:[...]

As for the double repository, I suggest two alternative routes to overcome the problem:

  • create a new repo on GitHub, GitLab or whatever with only two commits, the first containing only the original 2006 code/paper and the second with the current code/paper submitted to the ReScience challenge;
  • add to the README file in https://gitlab.inria.fr/lcourtes-phd/edcc-2006-redone a note stating that the code in the repository is an improvement/update of the original code found in https://gitlab.inria.fr/lcourtes-phd/edcc-2006, with a link to the latter repo.

The first approach might seem more cumbersome and requires duplication of code, which is always bad practice, but frankly I think it is also much cleaner and complies in full with the rules of the challenge.

otizonaizit (Member) commented:

@civodul : I second @sabinomaggi 's proposal: if you create a single repo with two commits, it is going to be much easier in the future for people to find the code and understand the paper. The original repos of course do not need to disappear, if you value the commits and the development history.

otizonaizit (Member) commented:

@sabinomaggi : let's assume that the repo issue will be fixed very soon. Could you proceed with the actual review? Thanks! ;-)

sabinomaggi commented:

Here is my review of the paper. I have tried to follow as much as possible the reviewers' guidelines of ReScience C, together with the additional checks suggested by @pdebuyl within the first actual review of a 10 Years Challenge paper (#11).

Replication. The paper reproduces the results of an article published in 2006 about the design of the storage layer of a mobile backup service. Within the source text of the paper, the original article contained all the code needed to automatically perform the analysis of several storage pipelines, produce the figures accompanying the paper, and generate the PDF of the paper itself. In the review I'll pay particular attention to this latter self-generation aspect, since the Ten Years Challenge is focused on reproducibility issues rather than on the actual scientific content of the original paper.

The approach taken by the author for the present reproduction is to rebuild the original authoring toolchain and make it work with more modern tools on the original, untouched computation code. All this happens within the build process that produces the actual article under review. Such an unconventional approach is very interesting (albeit fairly complex!), and I believe it is still well within the rules of the Challenge.

Today, the topic of reproducible computation/automated report generation is very popular (the present Challenge is clear proof of it) and is supported by several different tools, but I guess that 14 years ago the situation was fairly different.

Reproducibility of the replication. To run the code I installed the GNU Guix distribution in a Virtual Machine and ran all the commands reported in the article. Everything worked perfectly, without any issue.

Strangely, the date of all files produced by the toolchain corresponds to the beginning of Unix time (1970-01-01), while everything else in GNU Guix has the current date. I don't know if this is a quirk of the Virtual Machine or of GNU Guix, but it surely does not affect the quality of the replication.

Modifications needed to reproduce the results. None.

Clarity of code and instructions. The current code is based on Scheme, a variant of LISP, which is notoriously very readable, almost as if it were English text. Comments are sparse, but they would not add much to the readability of the code.

Clarity and completeness of the accompanying article. The article is clear and well written; however, it feels addressed to a specialized computer-science audience rather than to the general readership expected here.

The rebuild of the authoring toolchain and the reproduction of the original experimental results are presented in detail, but the author does not present the reasons for following this self-generated approach, confining them to a brief discussion in the last sections of the paper. It would be better to add an Introduction section where the author discusses the general motivation of the original paper and of his self-generated approach, possibly describing the state of the art of reproducible computation/automated report generation in 2006, with the pros and cons of his solution and of the alternative tools available back then (and maybe also today).

The same holds true for GNU Guix, which is described in some detail only in Section 4.1 and not in Section 2.2, where the author first mentions the tool. In my opinion, a short general introduction to Guix in Section 2.2 (or in the Introduction itself), while leaving the more technical description of Section 4.1 untouched, would add clarity to the present paper.

It might be my fault but, even if I have used Autotools for several years, I cannot fully understand the meaning of this sentence in Section 2.2, "The good thing with the Autotools is that a user building from a source tarball does not need to install the Autotools. However, no release of libchop had been published as a source tarball back then, and building from a version-control checkout does require the Autotools."

It is a real pity that all the text boxes of Figure 1 are so small that a large zoom factor is required just to make them barely legible. Would it be possible to tweak the code that generates the figure in order to create larger text boxes and avoid so much wasted empty space?

As a final note, I wonder if, and how much, the author's approach to reproducible computation/automated report generation is feasible for the average scientist, in particular when compared to tools with a smoother learning curve, such as Docker containers, Jupyter notebooks, R Markdown documents and the like. A brief analysis of this topic, with a clear presentation of the advantages of the author's approach, in the Discussion section would be worthwhile.

Availability and licensing of past source code. The original source code is available. The license is not defined, but as far as I understand this is not a mandatory requirement for the original code.

Availability and licensing of updated source code. The updated source code is available and is licensed under the GNU General Public License v3.0.


civodul commented May 26, 2020

Hi @sabinomaggi,

I don't think this is an issue here, since you were able to reproduce the paper as requested. What unfortunately is not immediately apparent is how you were able to reproduce the original paper. If I understand correctly, the 2006 paper performed all calculations by running the lout code from within the source of the paper itself -- more or less as can be done now with R Markdown, Jupyter Notebooks or Papermill (or even from LaTeX itself with the proper packages) -- while now you have translated everything to Guix while keeping the actual lout code unchanged. Could you please confirm whether I am right?

No, but that makes me realize I could have been clearer.

Lout is a document typesetting system that I used to typeset the original paper (via another authoring tool, Skribilo). It has nothing to do with the code evaluated in the paper, nor with the scripts used to run the benchmarks. More to the point: https://gitlab.inria.fr/lcourtes-phd/edcc-2006 is not relevant to this work, as mentioned in Section 1. It contains the source code of the original article, nothing else, and it is not reused at all here. I guess the lesson is that Section 1 could state this more clearly, or perhaps even omit the reference to https://gitlab.inria.fr/lcourtes-phd/edcc-2006?

What I "translated" to Guix is the plumbing to build the artifacts in this paper: building and deploying the software (libchop and its dependencies), "building" the benchmark results, and building the PDF with LaTeX.

Your review rightfully hints at a lack of clarity when it comes to describing this, which I'm willing to address.

As for the double repository, I suggest two alternative routes to overcome the problem:

* create a new repo on GitHub, GitLab or whatever with only two commits, the first containing only the original 2006 code/paper and the second with the current code/paper submitted to the ReScience challenge.

* Add to the README file in https://gitlab.inria.fr/lcourtes-phd/edcc-2006-redone a note stating that the code in the repository is an improvement/update of the original code found in https://gitlab.inria.fr/lcourtes-phd/edcc-2006, with a link to the latter repo.

The first approach might seem more cumbersome and requires duplication of code, which is always bad practice, but frankly I think it is also much cleaner and complies in full with the rules of the challenge.

I've added links now.

I think part of the motivation for this article is to show that all this software, whether we call it "library", "program", "script", or "article", is intimately tied together, and how Guix allows you to express those connections. Picking two repositories out of this software stack and merging them into one would seem to me as dissonant compared to that vision.

Thanks a lot for your feedback and for the thorough review!

sabinomaggi commented:

Hi @civodul
sorry for the misunderstanding about lout; I admit I googled for lout and skb without finding anything, and I had to infer what happened by looking at the code. And when one looks at the lout sources, it is evident that they are computer code, more or less as LaTeX (or TikZ) is a typesetting system but also a computer language.

But well, this philosophical discussion doesn't matter much per se; what is really important is whether it could help to improve the paper.

What I "translated" to Guix is the plumbing to build the artifacts in this paper: building and deploying the software (libchop and its dependencies), "building" the benchmark results, and building the PDF with LaTeX.

What really concerns me is more fundamental. When reading your paper I had the strong impression that it was a tool to rebuild everything from scratch, i.e. that it performed all calculations and rebuilt the figures and the final PDF document from within itself (an automated report generator, in current speak). In other words, the original paper was also the code that performed all the calculations reported in the article (through the chop-eval script referred to in the paper). Therefore, rebuilding the paper with today's tools, while keeping libchop and chop-eval unaltered, automatically meant reproducing the original results from the paper itself. And this is what I tried to stress throughout the review.

If this is not the case, as it seems now, we have a few problems:

  • first of all, according to the rules of the Challenge, you should make available all the scripts used to perform the actual analysis described in the 2006 paper (namely chop-eval and whatever else) in the project repository referenced by this paper (presently https://gitlab.inria.fr/lcourtes-phd/edcc-2006-redone);
  • I am not absolutely sure about libchop but, as you have used a specific 2006 version of the library, I guess that, in order to allow everyone to redo your work, that particular version should be incorporated into the same project repository;
  • on the other hand, the "code" of the original paper itself does not need to be integrated in the project repository, provided there is a public link to the published PDF file;
  • all the changes/scripts etc. needed to make the old code run today should be added as a separate commit to the same project repository, following the editor's (and my own) suggestion;
  • last but not least, I should update my review and you definitely need to make your paper clearer for the general audience of this conference.


civodul commented May 26, 2020

What really concerns me is more fundamental. When reading your paper I had the strong impression that it was a tool to rebuild everything from scratch, i.e. that it performed all calculations and rebuilt the figures and the final PDF document from within itself (an automated report generator, in current speak). In other words, the original paper was also the code that performed all the calculations reported in the article (through the chop-eval script referred to in the paper). Therefore, rebuilding the paper with today's tools, while keeping libchop and chop-eval unaltered, automatically meant reproducing the original results from the paper itself. And this is what I tried to stress throughout the review.

I'd like to stress that the source of the original paper is not used at all here and is irrelevant. This is what I attempted to express in Section 1:

https://gitlab.inria.fr/lcourtes-phd/edcc-2006 contains the source code of the paper itself. It turned out to not be useful for this replication.

I understand this was unclear, hence my suggestions to remedy that. What do you think?

Conversely, this new paper contains everything to go from nothing to PDF—and I really mean it, because the guix time-machine command given in the article builds (or downloads) everything from scratch. In other words, https://gitlab.inria.fr/lcourtes-phd/edcc-2006-redone is self-contained.
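For readers wanting to see what this self-containedness looks like in practice, the whole pipeline boils down to the single command from the repository's README (the same one quoted later in this thread); channels.scm pins the exact Guix revision, and guix.scm describes the build:

```sh
# Pin Guix itself to the revision recorded in channels.scm, then
# build everything described by guix.scm: libchop and its
# dependencies, the benchmark results, and the final PDF.
guix time-machine -C channels.scm -- build -f guix.scm
```

This is a CLI fragment, not runnable without a working Guix installation; beyond that one prerequisite, the build is self-contained.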

The author guidelines read:

Add your old software as a single commit (ideally the initial one) of a new source repository.

Again, I believe the assumption was that the "old software" wasn't under version control.

Also, what I tried to demonstrate is that the software that I wrote is more than just the code and scripts I wrote: it's also the whole software stack beneath it, without which the software I wrote won't run. Thus, I do not think that chop-eval and libchop need special treatment compared to, say, Guile, the C library, and the C compiler that are part of that stack.

last but not least, I should update my review and you definitely need to make your paper clearer for the general audience of this conference.

Understood, and I very much agree. I will work on this and other issues you pointed out and let you know when I have an updated version.

Thanks!


khinsen commented May 26, 2020

Jumping in because I have read this paper carefully as well, not with the goal of reviewing, but with the goal of learning more about Guix. BTW, my own contribution to the challenge also uses Guix to provide a reproducible baseline for the future, but in a more pedestrian (and arguably more "standard") way, which is perhaps of interest to those wanting to learn more about Guix.

My view on the repository debate is that @civodul has basically demonstrated that we weren't careful enough in writing the guidelines for this challenge. We were indeed considering the case of old software requiring patches to run on modern systems. We did not at all consider the machinery to make formally non-reproducible computations reproducible, by automating steps and formalizing dependencies that were initially undocumented or described only informally. @civodul's submission contains only this formalization, whereas the original code (which is https://gitlab.inria.fr/lcourtes-phd/chop-eval) required no changes. I propose to grant @civodul an exemption from the rule in the guideline for the merit of having demonstrated that it is insufficient!

As for @sabinomaggi's question of whether this is accessible to ordinary scientists, my personal answer is no: this is advanced reproducibility wizardry. Which is also why I find this paper interesting. See it as a proof of concept for future technology, which remains to be developed into something user-friendly. There are open issues as discussed in the paper, but I do believe that this approach will become the norm in the future. Remember that you read it first in ReScience C!

Finally, a minor suggestion for @civodul's revision: the discussion of Mohammad Akhlaghi's "template" would profit from an update since it has in the meantime been released under the name Maneage.

sabinomaggi commented:

@khinsen You are absolutely right, the approach used by @civodul is very bright and promising.

To me, the goal of the debate above was twofold:

  1. to try to understand which code had to be included in the submission, as per the original guidelines. Your comment and your proposal are more than welcome, as they ease this task. After all, guidelines are made to be broken!
  2. to improve the presentation in order to make the paper accessible to a wider audience of scientists who are not formally trained in computer science. The average scientist like myself might not understand everything @civodul did, but he/she could at least get the feeling that there is a whole world to be explored outside of Docker containers, Jupyter notebooks or virtual environments.

I am eager to read the updated version of the paper.


civodul commented Jun 2, 2020

Hi @sabinomaggi,

I pushed an updated version that tries to address the various points you raised. In particular, the introduction and Section 2 provide an overview of GNU Guix, and the new "Related Work" section compares it to other tools commonly used in the context of reproducible research.

In Section 4.2, I added an example code snippet to illustrate one aspect of how Guix is used here. I was unsure whether to go further, but thought it was perhaps not the right place to explain the programming interface. Let me know what you think.

Thanks also @khinsen for the kind words and for the perspective you bring!

Ludo'.

sabinomaggi commented:

@otizonaizit @civodul @khinsen
The present version of the paper addresses all questions raised during the review process. I have no further comments and I recommend publication.

otizonaizit (Member) commented:

Great. So the paper is hereby accepted for publication. @civodul : I'll get back to you early next week to finalize the publication.

otizonaizit (Member) commented:

@civodul : I need you to perform an additional step before publication.

I have updated the article metadata. I have no access to the GitLab repo at INRIA where your article lives, so I can't make a pull request there. Instead, I pushed my modifications here:
https://github.com/otizonaizit/edcc-2006-redone

There is a single commit there with the updated metadata. You should merge that commit to your repo, refresh the PDF with the new metadata, rename the article from article.submitted.pdf to article.pdf, and push the whole thing to your repo.

I will do the rest :-)


civodul commented Jun 11, 2020

Hello @otizonaizit,

I've merged the commit and pushed article.pdf.

I noticed two minor issues: "Received" and "Published" on the first page are just a hyphen, and I didn't have an ORCID, so I've created one (even though I'm skeptical about the need for a central database entrusted with personal information) and updated metadata.yaml accordingly.

Let me know if anything else should be done!

otizonaizit (Member) commented:

I've merged the commit and pushed article.pdf.

thanks!

I noticed two minor issues: "Received" and "Published" on the first page are just a hyphen

Ouch, this is not a minor issue, unfortunately. The dates seem fine in the metadata.yaml file, and they get correctly compiled into the LaTeX template file when I run the metadata file through the yaml-to-latex.py converter, so the dates should eventually be visible in the PDF file. Can you try to debug the issue and see what goes wrong on your machine? Maybe something is still not 100% correct in the Guix setup? I can't publish it like that: we really need those dates.

and I didn't have an ORCID, so I've created one (even though I'm skeptical about the need for a central database entrusted with personal information) and updated metadata.yaml accordingly.

OK. Thanks!

Speaking of which, it would actually be good if you applied for a DOI from Zenodo and/or a Software Heritage identifier for your article repo. This info should also be added to the metadata.yaml. Another thing that you may want to add is the DOI of the original article you are replicating here.

So, another little bit of effort and we will be soon done with this :-)


civodul commented Jun 11, 2020

I fixed the date issue, added the DOI of the original article and a Software Heritage intrinsic identifier. It looks good to me now, let me know if anything is amiss!

The date issue was interesting, and of course 100% reproducible in the Guix framework. :-) It turns out that article.py would attempt to use dateutil.parser and otherwise silently fall back to another code path that returns an empty string. The fix was to add dateutil to the environment that builds metadata.tex.
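The silent-fallback pattern described here is worth spelling out. A minimal sketch (hypothetical code, not the actual article.py) of both the failure mode and the fix:

```python
import importlib.util


def parse_date_fragile(text):
    """Sketch of the failure mode: use dateutil when available,
    otherwise silently fall back to an empty string, which shows
    up in the PDF as a bare hyphen next to "Received"/"Published"."""
    if importlib.util.find_spec("dateutil") is not None:
        from dateutil.parser import parse
        return parse(text).strftime("%d %B %Y")
    return ""  # silent fallback: no error, just a missing date


def parse_date_strict(text):
    """Sketch of the fix: make the dependency compulsory, so a
    missing python-dateutil aborts the build loudly instead of
    silently degrading the output."""
    from dateutil.parser import parse  # raises ImportError if absent
    return parse(text).strftime("%d %B %Y")
```

Making the import unconditional, as khinsen later did in the rescience/articles scripts, trades one extra dependency for a loud failure instead of a silently wrong PDF.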

otizonaizit (Member) commented:

The date issue was interesting, and of course 100% reproducible in the Guix framework. :-) It turns out that article.py would attempt to use dateutil.parser and otherwise silently fall back to another code path that returns an empty string. The fix was to add dateutil to the environment that builds metadata.tex.

Good catch! Could you open an issue (and even a PR if you know how to fix it!) here https://github.com/rescience/template ?

The paper is published! @rougier : could you merge my PR
ReScience/rescience.github.io#81 and update the website? I don't have permission to do it myself, thanks! ;)


khinsen commented Jun 11, 2020

@civodul Thanks for spotting (and fixing) this problem!

@otizonaizit I just pushed a partial fix in http://github.com/rescience/articles, by making dateutil compulsory (and adding it to the list of dependencies in the README). With that fix, PDFs compiled by the editors always have the correct dates. We could do the same in https://github.com/rescience/template for author-generated PDFs, but at the price of making authors' lives more difficult by requiring them to install one more dependency. For drafts, the dates don't matter, so authors who don't compile the final PDF are fine without dateutil.

BTW, we should have better instructions in https://github.com/rescience/template, in particular a list of dependencies for the Python scripts.

otizonaizit (Member) commented:

We could do the same in https://github.com/rescience/template for author-generated PDFs, but at the price of making authors' lives more difficult by requiring them to install one more dependency. For drafts, the dates don't matter, so authors who don't compile the final PDF are fine without dateutil.

Well, yes, but given we ask them to install PyYAML and make, I think dateutil is really not that big of an issue, don't you think? I think that the setup for editors and authors should be as similar as possible. Ideally, the authors should be able to create the final PDF, no? Given that they already have to be able to generate the version we use for review, there's not much of a difference in then re-generating the PDF for final publication...
I mean, in this case @civodul generated the final PDF: given that I don't control his setup, I could not know that he could not generate the PDF correctly with dates. So requiring dateutil removes this last source of confusion, no?


civodul commented Jun 12, 2020

I mean, in this case @civodul generated the final PDF: given that I don't control his setup, I could not know that he could not generate the PDF correctly with dates.

In fact, you do control my setup because you have its source code. :-)

But yeah, I guess that explicitly requiring dateutil is the right thing.


wdkrnls commented Jul 3, 2020

I ran into an issue when I tried to build with:

guix time-machine -C channels.scm -- build -f guix.scm

Here is the snippet from the log:

https://paste.debian.net/1154968/

There were also many references to ldconfig: command not found in the log text.

Maybe this was just an unreliable download? I was able to run guix build -f article/guix.scm without problems.


civodul commented Jul 3, 2020 via email


wdkrnls commented Jul 3, 2020

wdkrnls notifications@github.com skribis:
I ran into an issue when I tried to build: https://paste.debian.net/1154968/.
Could you show the exact command you used? (You need to make sure to use the provided ‘channels.scm’ file via ‘guix time-machine’.) Is this on x86_64? Thanks for trying it out!

Sorry for not being so precise. Since I was following the readme, this was actually the output of:

guix time-machine -C channels.scm -- build -f guix.scm

Yes, this was on x86_64.


khinsen commented Aug 19, 2020

@otizonaizit Since this paper is now published, can we close this issue?

otizonaizit (Member) commented:

Sure!
