Is the CyVerse Curated Data Repository right for my data?
Before applying for a permanent identifier in CyVerse Curated Data through the Data Commons, answer the following series of questions.
Question 1: Do you have a CyVerse account?
- Are you a registered CyVerse user? If not, register at user.cyverse.org.
- If so, have you used the Discovery Environment (DE)?
- The tools for submitting data to Data Commons Curated Data are simple to use and available as part of the DE. At a minimum, you should be able to upload and organize your data using the DE or iCommands, and be able to apply a metadata template.
Question 2: Is your data ready for publication?
- Is the dataset complete, stable, and ready for public consumption?
- Are you and all contributors to the dataset prepared to move the data into the public domain (meaning that anyone can access and use the data for any purpose, including commercial purposes)?
- Have you sufficiently documented how the data was created such that other scientists in your field will be able to reuse it?
- If there is a standard or commonly used format for your datatype, is your data in that format? If no standard exists, is your data in a format that can be easily used by most people with open source software (e.g., tables as a CSV or text file, rather than a Microsoft Excel spreadsheet)?
- Is your data organized in a clear and reasonable structure that others will be able to understand?
- If you answered no to any of these questions, your dataset is not yet ready for a permanent identifier through Data Commons Curated Data. Please continue to work on your dataset until it meets these requirements. Data will be reviewed by a curator to ensure that it meets these requirements.
If you would like to make your data public, but it is not complete and/or stable, you may request data hosting in the Data Commons.
Question 3: Is your data suitable for reuse in scientific analyses?
- Is your data of a type and format that allows it to be reused in other analyses?
- Are you prepared to supply metadata for your dataset?
- Does your dataset or metadata include sufficient instructions (e.g., a Readme file) such that someone in your field can understand how to reuse the data?
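Some of the format and documentation checks in Questions 2 and 3 can be run locally before you upload. The following is a minimal Python sketch, not a CyVerse tool: the function name `check_dataset` and the extension list are hypothetical, and it only looks for a top-level Readme file and for Excel spreadsheets that could be exported as CSV. It does not replace curator review.

```python
from pathlib import Path

# Hypothetical list of extensions treated as spreadsheet formats
# that could be exported as CSV or plain text instead.
SPREADSHEET_EXTENSIONS = {".xls", ".xlsx"}


def check_dataset(root):
    """Return a list of human-readable warnings for the dataset folder at `root`."""
    root = Path(root)
    warnings = []

    # Look for a top-level Readme file (any extension, case-insensitive).
    has_readme = any(
        p.is_file() and p.name.lower().startswith("readme")
        for p in root.iterdir()
    )
    if not has_readme:
        warnings.append("No README file found at the top level.")

    # Flag spreadsheet files anywhere in the folder tree.
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix.lower() in SPREADSHEET_EXTENSIONS:
            warnings.append(f"Consider converting to CSV: {path.relative_to(root)}")

    return warnings
```

Calling `check_dataset("my_dataset")` on a local folder (the folder name is a placeholder) returns an empty list when no issues are found.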
Question 4: Is there a canonical repository for your data?
- Does a canonical repository exist for your data? Examples include NCBI, EBI, and MG-RAST.
- If a canonical repository exists, you should use it. CyVerse is there to help fill a gap, not replace an existing resource.
If you answered No
- If you answered no to any of these questions, your data may be suitable for a DOI, but not through Data Commons Curated Data. You should consider other repositories that are not geared specifically toward data analysis, such as your institution's library, Dryad, or Figshare.
If your data was generated by, or served as input to, an analysis algorithm or software that you developed yourself, please consider making the method available through CyVerse infrastructure (e.g., the Discovery Environment or Atmosphere) as well.