Code Deposit - Github Integration #2739

Open
leeper opened this issue Nov 15, 2015 · 21 comments

Comments

Member

@leeper leeper commented Nov 15, 2015

Zenodo now provides a really convenient way to archive a GitHub repository using git tags (i.e., GitHub releases), and with it a way to attach a DOI to the repository. Being able to do the same with Dataverse would be awesome.

The reason I thought of it is that I was considering building a layer into the R client that would make it convenient to archive a version of a local git repository using the Dataverse SWORD API, but if this was all implemented natively within Dataverse that would probably be even better.

Member

@mercecrosas mercecrosas commented Nov 16, 2015

@leeper yes, this is something we've been considering, and you are right that Zenodo does this very well. Thanks for pointing it out and creating the issue.

Member

@mercecrosas mercecrosas commented Nov 16, 2015

👍

@mercecrosas mercecrosas added this to the In Review milestone Nov 30, 2015
@scolapasta scolapasta removed this from the Not Assigned to a Release milestone Jan 28, 2016
Member

@pdurbin pdurbin commented Jan 13, 2017

I mentioned to @christophergandrud this morning that @leeper had opened this issue. At some point we should all put our heads together on this. 😄

Member

@pdurbin pdurbin commented Jun 25, 2017

@leeper @christophergandrud shoot, we should have talked about this during the Community Meeting! @leeper now that you've added the "dataverse" package to CRAN, do you have any more thoughts on this issue? How can we unblock it?

Member Author

@leeper leeper commented Jun 25, 2017

From an API perspective, this should be pretty easy because it's just a matter of doing git checkout on the appropriate tag, zipping the contents (sans the .git folder) and dumping to the right SWORD endpoint.

It might make sense in the user interface as a plugin (as @pdurbin and I talked about for the Dropbox add file dialog) that does this from a specified git repo.
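A minimal sketch of the steps described above (check out a tag, zip the working tree without the .git folder, and post the zip to SWORD), written in Python purely for illustration. The function names are hypothetical. The endpoint path, headers, and token-as-username authentication are assumptions about how a Dataverse SWORD deposit is addressed and should be checked against the installation's API guide. The request is only built here, not sent.

```python
import base64
import subprocess
import urllib.request
import zipfile
from pathlib import Path


def checkout_tag(repo_dir, tag):
    """Check out a tag in an existing local clone (needs git on PATH)."""
    subprocess.run(["git", "-C", repo_dir, "checkout", tag], check=True)


def zip_without_git(repo_dir, zip_path):
    """Zip the working tree, skipping .git; return the archived relative paths."""
    root = Path(repo_dir)
    out = Path(zip_path).resolve()
    archived = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(root.rglob("*")):
            rel = path.relative_to(root)
            if ".git" in rel.parts or not path.is_file():
                continue  # skip git metadata and directory entries
            if path.resolve() == out:
                continue  # do not archive the zip into itself
            zf.write(path, rel.as_posix())
            archived.append(rel.as_posix())
    return archived


def build_sword_request(server, doi, zip_path, api_token):
    """Build (but do not send) a POST of the zip to a dataset's edit-media URL.

    Assumed: the endpoint path, the SimpleZip packaging header, and basic
    auth with the API token as username and an empty password.
    """
    url = f"{server}/dvn/api/data-deposit/v1.1/swordv2/edit-media/study/{doi}"
    auth = base64.b64encode(f"{api_token}:".encode()).decode()
    headers = {
        "Authorization": f"Basic {auth}",
        "Content-Type": "application/zip",
        "Content-Disposition": f"filename={Path(zip_path).name}",
        "Packaging": "http://purl.org/net/sword/package/SimpleZip",
    }
    data = Path(zip_path).read_bytes()
    return urllib.request.Request(url, data=data, headers=headers, method="POST")
```

To actually deposit, one would call `checkout_tag(...)`, then `zip_without_git(...)`, then pass the built request to `urllib.request.urlopen`. As an alternative to the checkout-and-zip steps, `git archive --format=zip` can produce a zip of a tag directly, without touching the working tree.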

Member

@pdurbin pdurbin commented Jun 25, 2017

@leeper maybe I'm just hearing what I want to hear, but are you saying that you think it's possible to implement this feature entirely client-side, such as within https://github.com/IQSS/dataverse-client-r ? If so, can we move this issue to that repo?

Member Author

@leeper leeper commented Jun 25, 2017

Let me try to make an example using R and then report back on this issue about how well that goes.


@christophergandrud christophergandrud commented Jun 26, 2017

Nonetheless, it would be great if this was ultimately language agnostic.

(Sorry, off topic, but honestly my dream would be if Dataverse could act as a remote git repository).

Member

@pdurbin pdurbin commented Jun 26, 2017

@christophergandrud interesting. I guess supporting git would be language agnostic. Yes, this is all off topic but please see my "A Thought Experiment: Datasets As Git Repos" at https://docs.google.com/document/d/18WDIS8hrFJvMJBcnRuQ8NfD-VxGq32vJ9WwlEgyyWZs/edit?usp=sharing which I originally shared at https://groups.google.com/d/msg/dataverse-community/5zJrr03R9ZE/6ahp8ZgQwt8J .

@leeper I see you opened IQSS/dataverse-client-r#16 . Thanks! Please keep us posted.

Contributor

@dlmurphy dlmurphy commented Jun 26, 2018

An example of a real-world use case, from our notes on a UX interview we conducted with an Astrophysics librarian in 2016 that touched on this issue:

The researchers she works with primarily use Zenodo because of its GitHub integration. “Zenodo has a hook into GitHub. If you’re putting your code on GitHub, you can mint a DOI for a release of your code, and then it’ll be indexed by the Astrophysics Data System (which is like the Pubmed of Astronomy).” The researcher’s software is required to process the data, which is in the form of FITS images. You need the images AND the code for the data to be meaningful.

Member

@pdurbin pdurbin commented Jun 28, 2018

@leeper I keep thinking about the diagram you showed at the Dataverse Community Meeting (from https://osf.io/xfj5h/ ), how there was a mix of code and non-code (data.csv, paper, slideshow, website, citations, etc) in what I understand to be the recommendation for organizing your dataset in your field. For this "code deposit" feature, are you thinking you'd want anything that looks like this or are you thinking that you'd want a "code only" dataset that doesn't have your data, your paper, your slides, etc? Here's the diagram:

[Image: leeper]

Others are welcome to comment on how this feature should work as well! I'm just asking Thomas since he opened it. 😄

Member Author

@leeper leeper commented Jun 28, 2018

I'd want to deposit the whole project, with its folder/file hierarchy, into a single dataset.

Member

@pdurbin pdurbin commented Jun 28, 2018

@leeper cool, thanks. Would https://github.com/leeper/rio be a good example of a repo that you'd consider depositing into Dataverse if/when this feature were available? Or are there other repos that would be better examples?

Member

@mercecrosas mercecrosas commented Jun 28, 2018

Contributor

@dlmurphy dlmurphy commented Jul 11, 2018

Here's our design team's document for summarizing what we know about this issue and considering next steps:

https://docs.google.com/document/d/1Wa8OJBftzJs_v9QDeccRanx6S0MNoiuPjqnRaL20cNk/edit

@mheppler mheppler changed the title from "Zenodo-style Github Integration" to "Code Deposit - Github Integration" Sep 19, 2018
Member

@pdurbin pdurbin commented Sep 20, 2018

A couple things:

  • rOpenSci just published a blog post called "Building Reproducible Data Packages with DataPackageR" that says, "When a manuscript is submitted based on a specific version of a data package, one can make a GitHub release and automatically push it to sites like zenodo so that it is permanently archived." https://ropensci.org/blog/2018/09/18/datapackager/
  • "Papers with code" seems like an interesting dataset that's highly related to this issue: https://github.com/zziz/pwc . I wonder if we should ask @zziz if there are any plans to publish this dataset in a repository.

Hat tip to @amoeba from @whole-tale for putting both of these on my radar!


@zziz zziz commented Sep 20, 2018

Dear @pdurbin, the dataset is already available in the repository as a CSV file, but I don't recommend using it just yet. I started this project very recently and there is still some work to be finished. You are welcome to use it right now, but I recommend checking back in a month.

Contributor

@poikilotherm poikilotherm commented Nov 8, 2018

My 2 cents to this: please keep in mind (at least for later extension):

Contributor

@poikilotherm poikilotherm commented Dec 21, 2018

Please let me bring to your attention that there are efforts to integrate GitLab with Zenodo (based on Invenio at CERN). Maybe the changes needed on the GitLab side could be useful for, or even aligned with, Dataverse?

See especially this comment.

Member

@pdurbin pdurbin commented Feb 14, 2020

I promised @ethomson and @neovintage from GitHub that I would link up the notes from the fantastic meeting we had with them today. Here they are: https://groups.google.com/d/msg/dataverse-community/uEJRcNoghjY/RVnXkuxoBgAJ

Contributor

@poikilotherm poikilotherm commented Aug 10, 2020

Heads up that @djbrooke and @jggautier discussed this briefly today, as this is related to #7077.
Please see my notes, including a link to a doc by the marvelous @jggautier, where he describes what still needs to be done to provide this feature.

10 participants