Identify datasets for potential inclusion in the ODL #28

Open
Daniel-Mietchen opened this issue Oct 24, 2018 · 8 comments
Labels
discoverability How to discover data or their existence documentation How things (are supposed to) work infrastructure That which will only be noticed if it isn't working policy Basic rules and guidelines on how the Open Data Lab operates

Comments

@Daniel-Mietchen
Member

One way to start looking into this would be to check open resources like

On that basis, we could then decide (see also the ODL inclusion criteria, as per #18) whether we'd like to go for datasets scoring high and/or low/average on those scales.

@Daniel-Mietchen Daniel-Mietchen added documentation How things (are supposed to) work infrastructure That which will only be noticed if it isn't working discoverability How to discover data or their existence labels Oct 24, 2018
@Daniel-Mietchen Daniel-Mietchen added this to Needs triage in Reference datasets via automation Oct 24, 2018
@Daniel-Mietchen Daniel-Mietchen changed the title Identify datasets that may be worth including in the ODL Identify datasets for potential inclusion in the ODL Oct 24, 2018
@Daniel-Mietchen Daniel-Mietchen added this to To do in Public beta via automation Oct 24, 2018
@Daniel-Mietchen Daniel-Mietchen added this to Needs triage in Private beta via automation Oct 24, 2018
@Daniel-Mietchen Daniel-Mietchen moved this from Needs triage to High priority in Private beta Oct 24, 2018
@Daniel-Mietchen Daniel-Mietchen moved this from Needs triage to High priority in Reference datasets Oct 24, 2018
@Daniel-Mietchen Daniel-Mietchen added the policy Basic rules and guidelines on how the Open Data Lab operates label Oct 24, 2018
@Daniel-Mietchen
Member Author

Another potential candidate: http://retractiondatabase.org/ — described by some as "antediluvian".

@Daniel-Mietchen
Member Author

Datasets and code involved in projects for which there is a bug bounty, e.g. https://rubenarslan.github.io/posts/2018-10-26-on-making-mistakes-and-my-bug-bounty-program/ .

@Daniel-Mietchen
Member Author

allofplos, as per https://github.com/PLOS/allofplos

@Daniel-Mietchen
Member Author

https://doi.org/10.5061%2Fdryad.n5g39d7 - probably the most comprehensive public dataset about Hemimastigophora to date

@Daniel-Mietchen
Member Author

"Teaching data science with real world datasets"
https://twitter.com/emcandre/status/1068139908836012032

@Daniel-Mietchen
Member Author

Here is some inspiration from the kinds of data and related services hosted at IDigInfo's data portal:

Projects
Private beta
  
High priority
Public beta
  
To do
Reference datasets
High priority

1 participant