Where should datasets be uploaded? #13

Open
jgvictores opened this issue Jan 27, 2017 · 4 comments

@jgvictores
Member

jgvictores commented Jan 27, 2017

Here are some options (updated from roboticslab-uc3m/xgnitive#23):

  1. Zenodo: The option we chose in the issue mentioned above. It generates DOIs and is popular in the machine learning community. An example from XGNITIVE: https://zenodo.org/record/168156#.WIt3FlwmRh5
  2. ResearchGate: Not sure if it still generates DOIs.
  3. Mendeley Data: A new player in this area.

PS: While we used to publish at https://sourceforge.net/projects/roboticslab/files/Datasets/, these are more modern and probably better solutions.

@David-Estevez

David-Estevez commented Jan 28, 2017

Are there any limitations on dataset size?

For some datasets, such as the garment 3D scans, this would be a key factor in settling on one solution.

@jgvictores
Member Author

From the Zenodo FAQ:

> We currently accept up to 50GB per dataset (you can have multiple datasets); there is no size limit on communities. However, we don't want to turn away larger use cases. If you would like to upload larger files, please contact us, and we will do our best to help you.
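
A minimal sketch for checking whether a local dataset folder fits under that limit before uploading. The 50 GB figure comes from the FAQ text quoted above and may have changed since; reading "GB" as 10**9 bytes is an assumption, and the function names and the folder path in the usage note are hypothetical.

```python
import os

# Limit taken from the FAQ quote above; "GB" = 10**9 bytes is an assumption.
ZENODO_LIMIT_BYTES = 50 * 10**9


def dataset_size_bytes(root):
    """Sum the sizes of all regular files under `root` (symlinks are skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def fits_limit(size_bytes, limit_bytes=ZENODO_LIMIT_BYTES):
    """Return True if the dataset size does not exceed the given limit."""
    return size_bytes <= limit_bytes
```

For example, `fits_limit(dataset_size_bytes("garment_scans"))` would report whether a hypothetical `garment_scans` folder could go up as a single dataset, or whether it would need splitting or a request to the Zenodo team.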

@PeterBowman
Member

In the past, we used to upload those datasets to our RL-UC3M server. Isn't this an option nowadays?

@jgvictores
Member Author

> In the past, we used to upload those datasets to our RL-UC3M server. Isn't this an option nowadays?

Not really. It required a certain permission/access level on the server, which is not easy to maintain. Additionally, scalability is an issue (I got complaints about the heavy traffic generated by our largest file, a Robot Devastation .iso). My current recommendations would be:

  • Informal (work in progress, internal, etc.): Google Drive within the @ing.uc3m.es domain. It's easy and popular.
  • Formal release: Zenodo. It's easy and popular.

I could expand on the pros and cons of these options and of others, but it would be pretty long and redundant with the above. This just documents my most intuitive and up-to-date conclusions.
