
submission: suppdata #195

Closed
15 of 19 tasks
willpearse opened this issue Feb 7, 2018 · 44 comments

Comments

@willpearse

willpearse commented Feb 7, 2018

Summary

  • What does this package do? (explain in 50 words or less):
    Downloads supplementary materials from published papers using their DOIs as a reference. This facilitates reproducible analyses.

  • Paste the full DESCRIPTION file inside a code block below:

Package: suppdata
Type: Package
Title: Downloading Supplementary Data from Published Manuscripts
Version: 0.9-0
Date: 2018-02-02
Author: William D. Pearse, Scott Chamberlain
Maintainer: William D. Pearse <will.pearse@gmail.com>
Description: Downloads supplementary materials from manuscripts,
    using papers' DOIs as references. Includes some code to download
    from other Internet APIs (e.g., Xeno-Canto, EPMC).
License: MIT + file LICENSE
URL: https://github.com/ropensci/fulltext
BugReports: https://github.com/ropensci/fulltext/issues
VignetteBuilder: knitr
LazyLoad: yes
Suggests:
    knitr (>= 1.6),
    testthat (>= 2.0.0),
    fulltext (>= 0.1.4.9000)
Imports:
    httr (>= 1.0.0),
    xml2 (>= 1.2.0),
    jsonlite (>= 1.5),
    RCurl (>= 1.95-4.10),
    rcrossref (>= 0.8.0)
RoxygenNote: 6.0.1
  • URL for the package: https://github.com/willpearse/suppdata

  • Please indicate which category or categories from our package fit policies this package falls under, and why (e.g., data retrieval, reproducibility). If you are unsure, we suggest you make a pre-submission inquiry:

    • data retrieval: because it downloads data from online scientific publications.
    • reproducibility: because by downloading raw data, it facilitates reproducible analysis scripts
  • Who is the target audience and what are scientific applications of this package?

    • Audience: anyone (re-)analysing published scientific data
    • Application: Conducting reproducible analyses, database creation scripts
  • Are there other R packages that accomplish the same thing? If so, how does
    yours differ or meet our criteria for best-in-category?

Originally, this code was part of fulltext, but has now been split out. That is the only package I'm aware of that does this.

  • If you made a pre-submission enquiry, please paste the link to the corresponding issue, forum post, or other discussion, or @tag the editor you contacted.
    @sckott
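For orientation, the package's core workflow can be sketched in a few lines. This is a hedged illustration only: the DOI and the `si` argument follow the figshare example in the package's README, a network connection is required, and the exact return behaviour may differ between versions.

```r
# Hedged sketch of suppdata's basic usage. The DOI and argument names are
# taken from the package README's figshare example and are illustrative;
# running this requires a network connection.
library(suppdata)

# Download the first supplement attached to a figshare-hosted paper,
# identified by its DOI; suppdata() returns the local file path.
path <- suppdata("10.6084/m9.figshare.979288", si = 1)

# The supplement is now an ordinary local file.
file.exists(path)
```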

Requirements

Confirm each of the following by checking the box. This package:

  • does not violate the Terms of Service of any service it interacts with.
    • This package does download data from journal publishers such as Wiley. I have heard stories/rumours of them getting very unhappy about people downloading articles from their websites at high speed. I'm not aware that this package violates their terms/conditions, but I want to flag it here. This code has already been part of an rOpenSci package (fulltext) without any problems, so I don't believe this is an issue.
  • has a CRAN and OSI accepted license.
  • contains a README with instructions for installing the development version.
  • includes documentation with examples for all functions.
  • contains a vignette with examples of its essential functions and uses.
  • has a test suite.
  • has continuous integration, including reporting of test coverage, using services such as Travis CI, Coveralls and/or CodeCov.
  • I agree to abide by ROpenSci's Code of Conduct during the review process and in maintaining my package should it be accepted.

Publication options

  • Do you intend for this package to go on CRAN?
  • Do you wish to automatically submit to the Journal of Open Source Software? If so:
    • The package has an obvious research application according to JOSS's definition.
    • The package contains a paper.md matching JOSS's requirements with a high-level description in the package root or in inst/.
    • The package is deposited in a long-term repository with the DOI: 10.5281/zenodo.1168713
  • Do you wish to submit an Applications Article about your package to Methods in Ecology and Evolution? If so:
    • The package is novel and will be of interest to the broad readership of the journal.
    • The manuscript describing the package is no longer than 3000 words.
    • You intend to archive the code for the package in a long-term repository which meets the requirements of the journal.
    • (Please do not submit your package separately to Methods in Ecology and Evolution)

Detail

  • Does R CMD check (or devtools::check()) succeed? Paste and describe any errors or warnings:

  • Does the package conform to rOpenSci packaging guidelines? Please describe any exceptions:
    Yes it does; all of this code was previously part of the rOpenSci package fulltext.

  • If this is a resubmission following rejection, please explain the change in circumstances:

  • If possible, please provide recommendations of reviewers - those with experience with similar packages and/or likely users of your package - and their GitHub user names:

@noamross
Contributor

noamross commented Feb 7, 2018

Thanks @willpearse for your submission (and creation of this package at our behest!). Things look good, but I haven't been able to run all of our tests because of this build error, which I get both locally and on r-hub:

* checking for file ‘suppdata/DESCRIPTION’ ... OK
* preparing ‘suppdata’:
* checking DESCRIPTION meta-information ... OK
* installing the package to build vignettes
* creating vignettes ... ERROR
Error in texi2dvi(file = file, pdf = TRUE, clean = clean, quiet = quiet,  : 
  Running 'texi2dvi' on 'suppdata-intro.tex' failed.
LaTeX errors:
! LaTeX Error: File `ae.sty' not found.

Type X to quit or <RETURN> to proceed,
or enter new name. (Default extension: sty)

! Emergency stop.
<read *> 
         
l.30 \ifthenelse
                {\boolean{Sweave@inconsolata}}{%^^M
Calls: <Anonymous> -> texi2pdf -> texi2dvi
Execution halted

It's odd that this doesn't show up on Travis. Anything non-standard in your vignette Rnw?

Also - to get ahead of things: Please add coverage checks to your CI and the coverage badge to your README. (See usethis::use_coverage)

@noamross
Contributor

noamross commented Feb 7, 2018

So I tried building the vignette in a couple of fresh Docker images: rocker/verse (which has LaTeX pre-installed) and rocker/tidyverse (which does not), using tinytex to install TeX. Both gave me the same error, which means your vignette somehow relies on something that is not in either of those standard environments but is present on Travis and your machine. Oddly, the ae package, which should provide the ae.sty file, is installed with tinytex.


@noamross
Contributor

noamross commented Feb 7, 2018

OK, I've solved this by installing texinfo into my Docker container, but that doesn't seem like a long-term solution. I'll ask reviewers to see if they have issues and we'll return to it if they do. I might suggest a switch to a markdown vignette if the problem persists.

Editor checks:

  • Fit: The package meets criteria for fit and overlap
  • Automated tests: Package has a testing suite and is tested via Travis-CI or another CI service. Please add code coverage to your CI and a badge in your readme (see usethis::use_coverage)
  • License: The package has a CRAN or OSI accepted license
  • Repository: The repository link resolves correctly
  • Archive (JOSS only, may be post-review): The repository DOI resolves correctly
  • Version (JOSS only, may be post-review): Does the release version given match the GitHub release (v1.0.0)?

Editor comments

Here's the goodpractice::gp() output. Nothing to prevent moving to review but these should be straightforward fixes. I will go ahead and find reviewers.

 ✖ omit "Date" in DESCRIPTION. It is not required and it gets invalid quite
    often. A build date will be added to the package when you perform `R CMD build` on
    it.
  ✖ avoid long code lines, it is bad for readability. Also, many people prefer
    editor windows that are about 80 characters wide. Try make your lines shorter than 80
    characters

    R/journals.R:9:1
    R/journals.R:18:1
    R/journals.R:24:1
    R/journals.R:33:1
    R/journals.R:36:1
    ... and 47 more lines

  ✖ avoid sapply(), it is not type safe. It might return a vector, or a list,
    depending on the input data. Consider using vapply() instead.

    R/suppdata.R:126:17
    R/suppdata.R:135:17

  ✖ fix this R CMD check NOTE: Namespace in Imports field not imported from:
    ‘RCurl’ All declared Imports should be used.
───────────────────────────────────────────
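The sapply()/vapply() point flagged by goodpractice above can be demonstrated with a small, self-contained sketch (generic R, not code from the package):

```r
# sapply() chooses its return type from the data: usually a vector,
# but it returns an empty list for empty input -- the type changes silently.
sapply(c("a", "bb"), nchar)    # named integer vector
sapply(list(), nchar)          # list()

# vapply() declares the expected type and per-element length up front,
# so it always returns that type or fails loudly.
vapply(c("a", "bb"), nchar, integer(1))
```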

Reviewers: @sarahsupp @rossmounce
Due date: 2018-03-16


@noamross
Contributor

noamross commented Feb 8, 2018

Oh, one other thing, I also get the following error while testing (goodpractice runs as CRAN so it doesn't show up):

test-all.r:52: warning: suppdata fails well
404: Resource not found. - (nonsense)

I believe this is due to a breaking API change in testthat and you want expect_warning().
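A minimal sketch of the suggested fix (the fetch_si() helper below is a hypothetical stand-in, not suppdata's internals; assumes testthat >= 2.0 is installed):

```r
library(testthat)

# Hypothetical stand-in for a downloader that warns (rather than errors)
# when the remote resource is missing.
fetch_si <- function(doi) {
  warning("404: Resource not found. - (", doi, ")")
  invisible(NULL)
}

# Under testthat 2.0, assert the warning explicitly with expect_warning()
# instead of letting it surface as a test failure.
test_that("suppdata fails well", {
  expect_warning(fetch_si("nonsense"), "404")
})
```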


@noamross
Contributor

noamross commented Feb 9, 2018

Oh, you can also put the review badge on your README:

[![](https://badges.ropensci.org/195_status.svg)](https://github.com/ropensci/onboarding/issues/195)


@noamross
Contributor

noamross commented Feb 23, 2018

Reviewers assigned! Thank you @sarahsupp and @rossmounce for agreeing to review.
Due date: 2018-03-16

@willpearse
Author

willpearse commented Feb 23, 2018

Great! Thanks very much!

@rossmounce

rossmounce commented Mar 9, 2018

  • As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

  • A statement of need clearly stating problems the software is designed to solve and its target audience in README
  • Installation instructions: for the development version of package and any non-standard dependencies in README
  • Vignette(s) demonstrating major functionality that runs successfully locally
  • Function Documentation: for all exported functions in R help
  • Examples for all exported functions in R Help that run successfully locally
  • Community guidelines including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with URL, BugReports and Maintainer (which may be autogenerated via Authors@R).
For packages co-submitting to JOSS

The package contains a paper.md matching JOSS's requirements with:

  • A short summary describing the high-level functionality of the software
  • Authors: A list of authors with their affiliations
  • A statement of need clearly stating problems the software is designed to solve and its target audience.
  • References: with DOIs for all those that have one (e.g. papers, datasets, software).

Functionality

  • Installation: Installation succeeds as documented.
  • Functionality: Any functional claims of the software have been confirmed.
  • Performance: Any performance claims of the software have been confirmed.
  • Automated tests: Unit tests cover essential functions of the package
    and a reasonable range of inputs and conditions. All tests pass on the local machine.
  • Packaging guidelines: The package conforms to the rOpenSci packaging guidelines

Final approval (post-review)

  • The author has responded to my review and made changes to my satisfaction. I recommend approving this package.

Estimated hours spent reviewing: 20


Review Comments

There is a clear need for an R package like this, to download suppdata based upon a publication identifier such as a DOI.

As it currently stands this package is too thin in my opinion - it needs more development before being accepted in JOSS. It barely goes beyond what is already possible with utils::download.file (https://github.com/willpearse/suppdata/issues/20#issuecomment-371820342). The statement of need should be clearer and more explicit about the subject disciplines covered and the scale of analyses intended.

AFAIK there is no example given of how to set the download directory parameter correctly (https://github.com/willpearse/suppdata/issues/14).

The README.md file is woefully insufficient on the intended scope/usage of the package, and on informational content in general; it needs much more than a single overly simplistic example (https://github.com/willpearse/suppdata/issues/11, https://github.com/willpearse/suppdata/issues/18, https://github.com/willpearse/suppdata/issues/20#issuecomment-371820342). I had a look at other JOSS packages just to be sure about this, e.g. https://github.com/ropensci/drake and https://github.com/Molmed/checkQC, and both README files are far more helpfully detailed. AFAIK GitHub is a valid first point of entry for a user wanting to use the software.

On a higher level, some thought needs to be given to what and who (discipline-wise) this package is for, and whether it could reasonably fulfil a real use-case at the moment, e.g. comparing supplementary data between PeerJ and Nature (https://github.com/willpearse/suppdata/issues/19). Even if one scoped the intention of this package down to just 'ecology', it still couldn't fulfil an ecological meta-analysis use-case. I hate journal rankings, and this is in no way an endorsement of them, but just for the sake of argument: looking at what one citation-based metric (SJR) considers the 'top 50' ecology journals, I'd say this package covers suppdata at fewer than half (http://www.scimagojr.com/journalrank.php?category=2303).

I humbly suggest, given the scale and diversity of suppdata at academic journals, that more thought is given to documenting how external community contributors can get involved with the project to increase the diversity and coverage of journals. I have given lots of concrete examples as a constructive starting point (https://github.com/willpearse/suppdata/issues/2, https://github.com/willpearse/suppdata/issues/18).

FYI linking data with literature is a hot topic at the moment. Have a look at http://www.scholix.org/ for instance. This might provide a nice pathway in future.

As I said in issue 18 (https://github.com/willpearse/suppdata/issues/18), if Will would like, I'd love to help with increasing journal coverage if I can, but I'm not too familiar with implementing regular expressions in R, so I'm not sure how best to contribute. This package has a lot of potential though.

@noamross
Contributor

noamross commented Mar 10, 2018

Thank you for your thoughtful review, @rossmounce! @willpearse, you can wait until @sarahsupp's review is in to respond, and I'll provide an editor's summary as well.


@sarahsupp

sarahsupp commented Mar 12, 2018

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

  • As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

  • A statement of need clearly stating problems the software is designed to solve and its target audience in README
  • Installation instructions: for the development version of package and any non-standard dependencies in README
  • Vignette(s) demonstrating major functionality that runs successfully locally
  • Function Documentation: for all exported functions in R help
  • Examples for all exported functions in R Help that run successfully locally
  • Community guidelines including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with URL, BugReports and Maintainer (which may be autogenerated via Authors@R).
For packages co-submitting to JOSS

The package contains a paper.md matching JOSS's requirements with:

  • A short summary describing the high-level functionality of the software
  • Authors: A list of authors with their affiliations
  • A statement of need clearly stating problems the software is designed to solve and its target audience.
  • References: with DOIs for all those that have one (e.g. papers, datasets, software).

Functionality

  • Installation: Installation succeeds as documented.
  • Functionality: Any functional claims of the software have been confirmed.
  • Performance: Any performance claims of the software have been confirmed.
  • Automated tests: Unit tests cover essential functions of the package
    and a reasonable range of inputs and conditions. All tests pass on the local machine.
  • Packaging guidelines: The package conforms to the rOpenSci packaging guidelines

Final approval (post-review)

  • The author has responded to my review and made changes to my satisfaction. I recommend approving this package.

Estimated hours spent reviewing: 2.5


Review Comments

As someone who has done many studies using published supplementary data, I think this package is a great idea! While it seems to meet a clear need, it is lacking a Statement of Need with a clear description of who the intended user is and how the package will help automate their analytical process in R. Some of this info is presented in the paper.md file, but not in the README.md.

For example, while it seems like many major publishers are supported (e.g., Wiley, Science, Biorxiv), with decent representation from biological/ecological sources, it is unclear to me whether this package would be most beneficial only to those fields, or whether it has strong enough representation for use in other STEM research areas or quantitative research in the social sciences. It would be excellent to see something like this serve a wide diversity of fields. Even within ecology, I could see myself using this package for a relatively small analysis that used one or a few datasets from supported journals, with streamlined code from raw data to analysis; but for larger meta-analyses, only a portion of the datasets I'd need access to look like they'd be supported as the package currently stands. Do the authors plan to add more publishers and data repositories to the package, or are there some limitations that prevent doing so?

In the R help files for help(suppdata), I was not sure what the authors meant when they said "I'm aware that there isn't perfect overlap between these publishers and the rest of the package; I plan to correct this in the near future."

The help file was very useful, and since some of the publishers require different ways of calling the data, the examples were much needed and useful. There are a few small typos in the help file (SCott, FigShare).

The package conforms to most of the rOpenSci packaging guidelines, but is missing some of the requested information (see check boxes above).

I could not run the vignette successfully locally, which I tried to do using R Sweave's Compile PDF. When I tried to compile the PDF, I had errors associated with:

  • Line 38 Citation 'crabs' on page 1 undefined on input line 38.
  • Line 105 Font T1/cmr/m/n/12=ecrm1200 at 12.0pt not loadable: Metric (TFM) file not found.

My disclaimer is that I do not have a lot of experience with .Rnw files, so perhaps it is a user (me) error.

Overall, I think the package has a lot of great potential, but perhaps needs a bit more documentation and if it is feasible, greater connection to supplemental data hosted across additional publishers and repositories.

@rossmounce

rossmounce commented Mar 12, 2018

Thanks @sarahsupp , I also had problems with the vignette but wasn't sure if it was just me and my particular setup, so I didn't say anything about it.

@noamross
Contributor

noamross commented Mar 13, 2018

Thanks for your review, @sarahsupp!

Let me chime in on the biggest issue that was highlighted in both Sarah and Ross's reviews: that of journal coverage and the ability to collect supplemental data en masse. This is a "long tail" problem, with which we are quite familiar at rOpenSci. Ross's suggestion that one way to deal with this is to facilitate community contributions is a good one: provide good documentation, in a vignette or CONTRIBUTING file, of how to provide a publisher plugin with appropriate tests, along with up-front documentation of the journals currently covered and the next ones to prioritize. As an rOpenSci package we'd be happy to facilitate some of this maintenance - reviewing pull requests, using our fora to encourage others to contribute, etc. - so the long-term burden won't be all on @willpearse :)

I note that a project both @rossmounce and I were involved with, quickscrape, went down this road, going so far as to write a JSON spec for web-site scrapers for different publishers. That repo may be a good resource for some additional patterns for suppdata.

Thanks for the mention of Scholix - rOpenSci may build something specific for it in the future (pinging @sckott, who may already know about it). We would, of course, prefer if a central standard and clearinghouse allowed us to get most Supplementary Data. I believe our fulltext package is able to find full-text articles through CrossRef for some publishers, but still needs suppdata-like handlers for many that don't enable full-text access this way.

@willpearse
Author

willpearse commented Mar 13, 2018

Great, thank you all for this. I'm running around doing fieldwork prep right now, so please don't be surprised if you don't hear from me for a fortnight or so. Thanks again!

@sckott
Contributor

sckott commented Mar 15, 2018

It is indeed a really hard problem to programmatically sort out links for articles, and even harder for supp. materials. In fulltext, we take a variety of approaches, starting from the highest level (Crossref) and then, if not found there, moving down to various other tools, such as publisher-specific code in the package, or an API I made (https://ftdoi.org/) to sort out URL patterns per publisher. Neither that API nor, as far as I can tell, quickscrape has any logic for supp. materials. A community-contributed set of logic does seem best as a standalone thing - usable from any programming language via an API like ftdoi.org or any other.

What % of supp. materials files are now in Figshare/Dryad/other repositories? How useful is it to ping those APIs or to ping Datacite API?

@rossmounce

rossmounce commented Mar 16, 2018

@sckott "What % of supp. materials files are now in Figshare/Dryad/other repositories?"

That's a really good question I'd also love to have an answer to. To re-state it slightly: what proportion of SI is at third-party repositories versus journal-hosted?

The confusing bit is that for some journals e.g. PLOS the SI is available at both figshare and the journal platform itself.

I have a feeling that for, say, articles published in 2014, the percentage of SI that is journal-hosted is greater than the percentage that is third-party hosted. But maybe that (horrid) situation is improving somewhat. Articles published in 2017 might be expected to have a higher proportion of SI at third-party repositories relative to articles published in 2014 and 2010...? There is of course disciplinary variation and journal-to-journal variation. It depends highly upon one's scoping!

@willpearse
Author

willpearse commented Apr 8, 2018

I thank the reviewers and editor for their comments and suggested improvements to suppdata. I think that, as a result of their suggestions, the package is in much better shape. Thank you!

Below I respond to what the editor highlighted as the over-arching concern, and after that I respond to more specific issues raised by the reviewers and the editor.

Supporting more publishers and the package being "too thin"

I agree with the reviewers and editor that suppdata needs more publisher wrappers. To address this need going forward, I have created a detailed wiki-page for the package that describes how to go about adding new wrappers, in the hope that the community can help with this. This is referenced in the readme file, and can be found here: https://github.com/willpearse/suppdata/wiki.

Dr Supp raised a concern that the current coverage might be insufficient to facilitate a large meta-analysis, and Dr Mounce raised similar concerns about the top-50 journals in ecology. While I, of course, agree that suppdata is not appropriate for everything, I have written two packages - NATDB and NACDB - that, combined, load on the order of 130 studies' datasets using suppdata. So I agree that suppdata isn't perfect, but it is useful (at least to me and my collaborators) right now.

Dr Mounce raises the concern that suppdata doesn't cover as many publishers as ContentMine. He says ContentMine covers ~30 publishers; suppdata covers ~10 (https://github.com/willpearse/suppdata/issues/18). ContentMine is a company with multiple employees (at the time of writing, they are recruiting to fill two positions); I am a single developer, so to be honest I'm quite pleased that suppdata covers even a third of the number of publishers that ContentMine does. For clarity, I add that my understanding is that ContentMine is similar in scope and aim to what the package fulltext carries out, not what suppdata does, but I would be grateful to be corrected if I am wrong.

I do agree that more work is needed on suppdata; my hope is that the package can get that help as a member of the rOpenSci ecosystem. I add that I was asked by rOpenSci to submit this code as a separate package, so I presume that is also the hope of others at rOpenSci.

More specific comments

  • I have added more documentation to the package on its GitHub page. I would be grateful if the reviewers could let me know if this is sufficient for them, and if not where they would like more documentation. I don't think this should be a difficult package to use, and that's why I've tried to keep the documentation short, sharp, and sweet. I could easily have failed in my attempt, and so, again, any suggestions would be gratefully received!
  • Vignette build problems.
    I have corrected the "crabs" citation error, but it is not immediately obvious to me what the error Dr Supp found on line 105 relates to: there is no line 105 in the vignette file. All I know is that the vignette builds on Travis, which implies that whatever is stopping the reviewers from building the package is specific to their systems. There are a few things the reviewers could try:
    • Make sure they have LaTeX installed on their system (see https://support.rstudio.com/hc/en-us/articles/200532257-Customizing-LaTeX-Options and links)
    • If they're on Linux, try running sudo apt install libpoppler-cpp-dev, which is a PDF rendering library that they might be missing.
    • If neither of those steps works, the reviewers could let me know what operating system they're using and I'll look into it further. Regardless, I have added a PDF to the GitHub repo, so it is not necessary to build the vignette from source.
  • Lack of examples for user-facing functions. All user-facing functions have examples (there is only one: suppdata). They are wrapped in "dontrun" blocks to avoid having all the CRAN mirrors spamming servers with download requests for the same files. This is described in the documentation, and if users run the examples themselves they do work. CRAN prefers examples not to take longer than five seconds anyway; if I removed them from this "dontrun" block I would likely get into trouble with CRAN. If the reviewers are unhappy with this, I'm happy to make whatever changes they suggest, as long as they won't get me into trouble when I submit to CRAN.
  • I have changed the version number to 1.0-0
  • The only complaint from goodpractice::gp is that only 93% of the package is unit tested.
  • I have added a review badge to the README, along with a covr badge
  • I have added a statement of need to the README ("a more detailed set of motivations for suppdata")
  • The phrase "I'm aware that there isn't perfect overlap between these publishers and the rest of the package; I plan to correct this in the near future" was a hangover from when this code was part of fulltext; thank you for finding this, and I have now removed it from the help file.
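The "dontrun" arrangement described above looks roughly like this in roxygen (an illustrative sketch, not the package's actual documentation; the DOI is the README's figshare example):

```r
#' Download a paper's supplementary material
#'
#' @examples
#' \dontrun{
#' # Wrapped in \dontrun so CRAN's checks do not hammer publishers'
#' # servers; users can still copy and run the example themselves.
#' suppdata("10.6084/m9.figshare.979288", si = 1)
#' }
```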

@noamross
Contributor

noamross commented Apr 13, 2018

Thanks for your response, @willpearse! @rossmounce and @sarahsupp, please let us know if Will's response addresses your comments.

A quick suggestion, Will: I suggest moving content from the GH wiki to a CONTRIBUTING.md file and README/other documentation, which will travel more transparently with clones and forks of the repo.

@willpearse
Author

willpearse commented Apr 13, 2018

Thanks for that; I've migrated the wiki over now.

@sarahsupp

sarahsupp commented Apr 16, 2018

Hi @willpearse. I think the new wiki pages will be really useful in encouraging folks to contribute and in giving more description of why and how to use the suppdata package. Nice addition!

For the vignette build issue, I was working on Linux, and I think it's entirely possible that the PDF rendering library was missing. I can check later, and I'll try to get back to you if I figure out what the problem was. The PDF should help in case others have this problem too, or aren't used to the process of building the vignette from source.

While I mentioned previously that the number of publishers currently limits the usefulness of the package for large meta-analyses, I do think it is a great start and can already be useful for many projects, and I wouldn't consider this a major concern for moving forward. In general, I think that suppdata is looking good, and I'm excited to see how this grows as more can contribute to it and expand the publisher base from which data can be accessed.

@noamross
Contributor

noamross commented Apr 25, 2018

@rossmounce Can you let us know if Will's update addresses your comments?

@rossmounce

rossmounce commented Apr 26, 2018

Thanks for the ping. I think Will's updates mostly address my comments.

It's a great start and I look forward to seeing the usefulness of this package grow over time with more scraper methods to cover a wider range of publishers.

@noamross
Contributor

noamross commented Apr 29, 2018

Approved! Thank you @willpearse, @sarahsupp, and @rossmounce.

To-dos:

  • Transfer the repo to the rOpenSci organization under "Settings" in your repo. I have invited you to a team that should allow you to do so. You'll be made admin once you do.
  • Update badges, CI and coverage links, and URLs to point to the new repo address.
  • If you would like to, and if your reviewers agree, you may choose to credit their reviews in the DESCRIPTION file by adding them as 'rev'-type contributors (example)
  • Make a new Zenodo archive from the updated repo
  • Submit to JOSS and note this review thread

Welcome aboard! I think that it would be great to have a blog post about suppdata, especially as a mechanism to get more contributors to participate and add more publisher coverage. If you are interested in writing one, let us know and @stefaniebutland will be in touch about content and timing.

@willpearse
Author

willpearse commented Apr 30, 2018

Fantastic! Thank you all very much for your help!
@sarahsupp and @rossmounce, could you let me know if you would like to be a rev-type contributor (see above)? Once I hear from you about that, I can submit the package to JOSS.

@willpearse
Author

willpearse commented Apr 30, 2018

@noamross awkwardly, I just went to settings, clicked 'transfer ownership', gave it to ropensci (or so I thought), and was told that I didn't have permission to create a repo there. Have I clicked the wrong button? Sorry!...

@sarahsupp

sarahsupp commented Apr 30, 2018

@rossmounce

rossmounce commented May 1, 2018

Yes, I'm OK with being acknowledged as a rev-type contributor. Thanks for asking.

@willpearse
Author

willpearse commented May 1, 2018

Great, thank you both very much! I'll go ahead and make all those changes now.

@willpearse
Author

willpearse commented May 1, 2018

Submitted! Thanks very much for your help!

@noamross
Contributor

noamross commented May 2, 2018

@stefaniebutland Will has expressed interest in writing a blog post for suppdata. I've pointed him to the guidance in the roweb2 repo. Let him know what the schedule is!

@noamross noamross closed this as completed May 2, 2018
@stefaniebutland
Member

stefaniebutland commented May 14, 2018

@willpearse I apologize for my delay in responding. I am temporarily putting most blog post reviews and scheduling on hold until after our late-May unconference. In the meantime, if it's helpful to you, you could proceed with a draft while it's relatively fresh in your mind. There is so much value to readers in these posts by package authors!

I anticipate resuming non-unconference blog post reviews in mid-June for publishing in July and August.

@willpearse
Author

willpearse commented May 16, 2018

Thanks for this. I'm happy to leave this for a while; please do get back in touch with me closer to publication time. I am still adding things to the package, so I don't really want to spend time writing a blog post asking for help developing features when it's reasonably likely I'll have implemented a lot of them by then. I'm still very happy to write the blog post, but I want to make sure it's optimally useful. Thanks for your help!

@stefaniebutland
Member

stefaniebutland commented May 16, 2018

Thank you @willpearse. Your strategy makes perfect sense. I'll get back in touch in June.

@willpearse
Author

willpearse commented May 16, 2018

@stefaniebutland
Member

stefaniebutland commented Jul 18, 2018

@willpearse Are you still interested in contributing a blog post? Still adding to the package or ready to write? I'm now starting to line up posts for publication in late August, September.

@willpearse
Author

willpearse commented Jul 18, 2018

@stefaniebutland
Member

stefaniebutland commented Jul 18, 2018

No problem. I'll ping you one last time in 4-6 weeks.

@willpearse
Author

willpearse commented Jul 18, 2018

@stefaniebutland
Member

stefaniebutland commented Aug 21, 2018

@willpearse how about now? 😀 We would love to have a post about suppdata, but I understand this goes beyond the time already committed to the review process. I have blog post publication dates available Sept 18 and Oct 2, 9, and 16. If you're still interested, let me know your preferred date and mark your calendar to submit a draft a week in advance.
