This repository has been archived by the owner on Jun 23, 2020. It is now read-only.

dataset hosting #7

Open
pdurbin opened this issue Sep 5, 2017 · 3 comments

Comments

@pdurbin

pdurbin commented Sep 5, 2017

Hi! I just heard that freeCodeCamp has datasets at http://breakingintostartups.com/quincy-free-code-camp/ and found this "2017 New Coder Survey" easily enough. I suppose that GitHub is not a bad place to host data, but I'm wondering if you've considered hosting the data in a data repository such as the ones listed at https://www.nature.com/sdata/policies/repositories . This is just a thought, and I'm biased because I work on a data repository, but I just thought I'd put this bug in your ear. Great episode! 😄

@erictleung
Member

@pdurbin thanks for the suggestion! As an academic myself, I'm all for citable data repositories. I was considering Zenodo because of its seemingly easy integration with GitHub. Do you have any suggestions on where to deposit this data set, or any general advice on depositing it?

cc/ @QuincyLarson

@pdurbin
Author

pdurbin commented Sep 6, 2017

@erictleung I've heard good things about Zenodo. Citation is certainly important. You might want to take a peek at a recent comparative review of data repositories at https://docs.google.com/spreadsheets/d/1KptHzDHIdB3s1v5m1mMwphcwXhOVWdkRYdjEWW1dqrE/edit?usp=sharing made by the Dataverse team and blogged about at https://dataverse.org/blog/comparative-review-various-data-repositories . Full disclosure: I work on the Dataverse code! If you're aware of other comparisons like this, please let me know. Also, for what it's worth, IQSS/dataverse#2739 has some discussion about Zenodo-style GitHub integration.

I haven't actually looked at your dataset yet so I can't really comment. It seems like social science data to me. 😄

@evaristoc

evaristoc commented Oct 4, 2017

@pdurbin Hi man! Long time! Sorry for not showing up... I have been working through a few projects of my own.

I'm also someone with an academic background.

@pdurbin - regarding your idea... What if we keep copies in both places, let's say Zenodo and GitHub?

The other question I have is one of availability. I am personally reluctant to use platforms that prevent people from working on the data. And the usual audience for part of the data is not necessarily academic.

I would like to find a solution that makes the datasets easy to use not just for academic work but also for practice exercises or even building small products, and that at the same time encourages correct citation when required.

Not all the data produced in freeCodeCamp should be included, IMO. It depends on the value of the datasets for different communities (business, individuals, academia).
