
Bulk edits to a dataset #387

Closed
chrisgorgo opened this issue Feb 5, 2018 · 5 comments

Comments

@chrisgorgo
Contributor

Every so often a modification of a dataset will require touching many different files. One example of such an update is adding a column to every single _events.tsv file in a dataset with 50 participants. The current interface is good at adding a new folder with a new subject, or a single file, but not at bulk changes. We need to support the use case of bulk file modification. Some ideas:

  • upgrade the upload resume to use file hashes and upload only modified files
  • rely only on the CLI (Support CLI uploads #200) and use wildcards
  • rely on git-annex (pull requests?)
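Until bulk editing is supported directly, the motivating example above (adding a column to every _events.tsv) can be done against a local copy of the dataset before re-uploading. A minimal sketch, assuming a local BIDS directory and a constant-valued new column (the column name and value here are just placeholders):

```python
import csv
from pathlib import Path

def add_column(dataset_dir, column, value):
    """Append a constant-valued column to every *_events.tsv under dataset_dir."""
    for tsv in Path(dataset_dir).rglob("*_events.tsv"):
        with tsv.open(newline="") as f:
            rows = list(csv.reader(f, delimiter="\t"))
        rows[0].append(column)       # extend the header row
        for row in rows[1:]:
            row.append(value)        # same value for every event row
        with tsv.open("w", newline="") as f:
            csv.writer(f, delimiter="\t").writerows(rows)
```

Run against a local copy, then re-upload the dataset; a per-event value would need per-row logic instead of a constant.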
@kimberlylray

Hi Chris, this is a bit redundant with my comment on NeuroStars, but I wasn't sure if it would be helpful to have this documented here (in a bit more detail) as well.
This could probably be assigned to a number of issues...
The online platform took a long time to upload my initial 15+ sessions of imaging data. That was understandable given it was probably over 30 GB, but I had to refresh the upload multiple times. Since then, I've started uploading 1 subject at a time (as we acquire the data), and it seems to work just fine.
But now I'm running into 2 issues:

  1. Our study is longitudinal (3 sessions total), and I can't add the later sessions unless I delete the subject completely and re-upload all of the sessions at once.
  2. We haven't yet created BIDS compliant events.tsv files. When we upload that data at a later time, it would be useful to be able to select multiple files at once for upload.

If these changes can be made in the online interface, then great! But I'd also be willing to try some command line ways to upload data to OpenNeuro. It might help speed up uploads on my end, and hopefully it helps y'all with troubleshooting?

@chrisgorgo
Contributor Author

Thanks for your feedback @kimberlylray - you will see updates here when we have a solution ready for testing.

@olgn olgn added in progress and removed next labels Mar 5, 2018
@JohnKael JohnKael added backlog and removed next labels Mar 7, 2018
@JohnKael JohnKael added next and removed backlog labels Mar 20, 2018
@JohnKael JohnKael added this to the 2018 Sprint 7 - April 5 milestone Mar 20, 2018
@nellh nellh self-assigned this Apr 3, 2018
@JohnKael JohnKael added in progress and removed next labels Apr 10, 2018
@nellh
Contributor

nellh commented Apr 23, 2018

This is partially implemented in #200.

Here's an example of how this will work there:

  • openneuro sync --dataset ds000001 local-directory: replaces any changed files and adds any new files, but does not delete anything in the target.
  • openneuro sync --dataset ds000001 --delete local-directory: fully syncs the local files with the remote, removing any remote files that do not exist in the source.
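Replacing "any changed files" implies some change detection between local and remote. As a rough illustration (not the actual OpenNeuro implementation), one way to plan such a sync is to compare content hashes, where the remote side is represented as a path-to-hash map:

```python
import hashlib
from pathlib import Path

def file_hashes(root):
    """Map each relative file path under root to its SHA-256 hex digest."""
    return {
        str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in Path(root).rglob("*") if p.is_file()
    }

def plan_sync(local_dir, remote_hashes, delete=False):
    """Return (paths to upload, paths to remove) against a remote hash map."""
    local_hashes = file_hashes(local_dir)
    # Upload anything missing from the remote or whose content differs.
    upload = [p for p, h in local_hashes.items() if remote_hashes.get(p) != h]
    # With delete, also remove remote files absent from the local copy.
    removals = [p for p in remote_hashes if p not in local_hashes] if delete else []
    return sorted(upload), sorted(removals)
```

This mirrors the two modes above: without delete, only new and changed files are transferred; with delete, remote-only files are queued for removal.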

I'm thinking about how this could be used for #555 as well:

openneuro sync --dataset ds000001 --prefix sub-01/anat anat would add any new files to the sub-01/anat directory in that dataset. It could be combined with --delete to fully sync the local and remote directories. What do you think of this @chrisfilo?

@chrisgorgo
Contributor Author

Sounds good to me. We need to make sure that:

  1. Bulk editing is well advertised (on dataset pages the logged-in user has edit access to, and in the FAQ)
  2. There is documentation with examples (this could be on github as a README.md or similar)

@chrisgorgo
Contributor Author

Superseded by #1067
