cleaning up introduction and removing content that goes off track #57

vsoch · 2020-04-02T22:57:10Z

This pull request is a first shot at cleaning up the manuscript, namely:

The introduction sidetracked into talking about research compendia, which was distracting and out of scope for an introduction that should focus on leading into building data science containers.
I removed lines that (although possibly true or meaningful) didn't add to the flow of the paper.

I can add comments explaining any specific choice for reviewers who are interested.

Signed-off-by: vsoch <vsochat@stanford.edu>

nuest

Thanks @vsoch - good job. I'll do another run-through and try to remove some repeated content and close the remaining issues today. Besides the open issues, the only thing missing now is a better example Dockerfile for the article.

nuest · 2020-04-15T16:26:57Z

ten-simple-rules-dockerfiles.Rmd

@@ -1,5 +1,5 @@
 ---
-title: "Ten Simple Rules for Writing Dockerfiles for Reproducible Research"
+title: "Ten Simple Rules for Writing Dockerfiles for Reproducible Data Science"


nuest · 2020-04-15T16:31:25Z

ten-simple-rules-dockerfiles.Rmd

 By providing this recipe, authors of scientific articles greatly improve their work's level of documentation, transparency, and reusability.
 Such practice is one important part of common practices for scientific computing [@wilson_best_2014; @wilson_good_2017], with the result that it is much more likely both the author and others are able to reproduce and extend an analysis workflow.
 The containers built from these recipes are portable encapsulated snapshots of a specific computing environment.
 Such containers have been demonstrated for capturing scientific notebooks [@rule_ten_2019] and reproducible workflows [@sandve_ten_2013].
-Research compendia also allow for proper citation of the used computing environment, which is not possible within containers alone.
-Best practices are still a work in progress [cf. @katz_software_2018], but you should try your best to give credit to creators of software you rely on by following recommendations of projects such as CodeMeta ([https://codemeta.github.io/](https://codemeta.github.io/)) and the Citation File Format ([https://citation-file-format.github.io/](https://citation-file-format.github.io/)).


Killed one of my darlings here... good job :-).

nuest · 2020-04-15T16:40:21Z

ten-simple-rules-dockerfiles.Rmd

@@ -369,7 +337,7 @@ You can view all labels with [`docker inspect`](https://docs.docker.com/engine/r
 Labels serve as structured metadata that can be leveraged by services, e.g., https://microbadger.com/labels.
 For example, software versions, license, and maintainer contact information are commonly seen and very useful if a `Dockerfile` is discovered out of context.
 While you can add arbitrarily complex information with labels, for research compendia the user-facing documentation is much more important.
-If you want to earn extra points, and you never know what future algorithms will be able to make sense of, include global identifiers such as [ORCID identifiers](https://orcid.org/) for people, a DOI of the research compendium, e.g., [reserved on Zenodo](https://help.zenodo.org/) before publishing the research compendium, or your funding agency's grant number.
+Important metadata that might be more utilized with future tools includes global identifiers such as [ORCID identifiers](https://orcid.org/), DOIs of the research compendium, e.g., [reserved on Zenodo](https://help.zenodo.org/), or a funding agency's grant number.


I've re-added the research compendium link here, but am fine with not using the term throughout the article.
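
For readers of this thread, the kind of labels discussed in the hunk above might look roughly like the following. This is a minimal sketch under stated assumptions, not the article's example: the base image, the author name, and all label values are placeholders, the `org.opencontainers.image.*` keys come from the OCI image specification's predefined annotations, and the `org.example.*` keys are hypothetical names chosen only for illustration, since the hunk does not prescribe keys for ORCID, DOI, or grant metadata. The ORCID shown is the fictitious sample identifier used in ORCID's own documentation.

```dockerfile
# Hypothetical example: the base image and every label value are placeholders.
FROM rocker/r-ver:3.6.2

# Keys taken from the OCI image specification's predefined annotations.
LABEL org.opencontainers.image.authors="Jane Doe <jane.doe@example.org>" \
      org.opencontainers.image.licenses="MIT" \
      org.opencontainers.image.version="1.0.0"

# Custom keys (assumed names, not a standard) carrying global identifiers.
LABEL org.example.author.orcid="https://orcid.org/0000-0002-1825-0097" \
      org.example.compendium.doi="10.5281/zenodo.XXXXXXX" \
      org.example.funding.grant="GRANT-NUMBER"
```

Once an image is built from such a file, the labels should show up in the output of `docker inspect`, as the hunk notes.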

nuest · 2020-04-15T16:51:45Z

ten-simple-rules-dockerfiles.Rmd

 Depositing the image next to other project files, i.e., data, code, and the used `Dockerfile`, in a public repository makes them likely to be preserved, but is is highly unlikely that over time you will be able to recreate it precisely from the accompanying `Dockerfile`.
 Publishing the image and the contained metadata therein (e.g., the Docker version used) may even allow future science historians to emulate the Docker runtime environment.
-Applying proper preservation strategies (cf. [@emsley_framework_2018]) can be highly complex, but simply running an image "as-is", i.e. with the default command and entrypoint (see \ruleref{rule:interactive}), and observing the output is quite likely to work for many years into the future.


Another one of my darlings 💀 !! I agree, though, that container preservation as a topic is not yet mature enough to present to the target audience of this article.

vsoch · 2020-04-15T17:37:05Z

Woohoo! Thank you @nuest !

cleaning up introduction and removing content that goes off track

23e4fcf

Signed-off-by: vsoch <vsochat@stanford.edu>

vsoch mentioned this pull request Apr 3, 2020

Add note about audience for paper to introduction #55

Closed

nuest added 4 commits April 15, 2020 18:29

Move reference

6f24f8a

Don't want to imply that donoho talked about containerization

Add rc URL on first occurence

41d7e04

update use of research compendium

1fb28ce

fix typo

36315d8

also one sentence per line

nuest approved these changes Apr 15, 2020

nuest merged commit f811f9a into master Apr 15, 2020

nuest deleted the cleaning-up-sections branch May 13, 2020 06:42
