UKB storage to upload processed bulk data #11

sina-mansour · 2021-11-24T00:58:05Z

Following on the suggestion by @Lestropie (this commit):

RS: As per discussion, need to find out how much data can be uploaded per subject to UKB
(and indeed what volume of data could potentially be hosted elsewhere). Any temporaries that
are not to be later hosted anywhere are better off being stored on a RAM file system.
My typical approach here is to load all input data into a scratch directory that I can force
to be in /tmp/, store all intermediate files and final outputs there, and only upon script
completion do I then write the desired derivatives to the location requested by the user.
I then only retain the scratch directory if the user explicitly requests that it be retained.
Your structure here checks for the pre-existence of calculated files, which is useful when you
are testing perturbations to the script, but for final deployment this ability is not as high
a priority.
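For reference, the scratch-directory pattern described above could look roughly like the following. This is only a minimal sketch, not the script in this repository: `run_in_scratch`, its parameters, and the `process` callable are hypothetical names introduced for illustration, and `scratch_root` would be pointed at `/tmp/` or a RAM file system as suggested.

```python
import shutil
import tempfile
from pathlib import Path


def run_in_scratch(inputs, output_dir, process, keep_scratch=False, scratch_root=None):
    """Run `process` in a scratch directory and export only its derivatives.

    `process(scratch)` is a hypothetical callable that writes intermediates
    into `scratch` and returns the paths of the files worth keeping.
    `scratch_root=None` falls back to the system temporary directory.
    """
    scratch = Path(tempfile.mkdtemp(prefix="scratch_", dir=scratch_root))
    try:
        # Stage every input inside the scratch directory first.
        for src in inputs:
            shutil.copy2(src, scratch / Path(src).name)

        derivatives = process(scratch)

        # Only after processing has completed are the desired derivatives
        # written to the location requested by the user.
        output_dir = Path(output_dir)
        output_dir.mkdir(parents=True, exist_ok=True)
        return [Path(shutil.copy2(p, output_dir / Path(p).name)) for p in derivatives]
    finally:
        # Retain the scratch directory only on explicit request.
        if not keep_scratch:
            shutil.rmtree(scratch, ignore_errors=True)
```

With this layout nothing is written to `output_dir` if `process` raises, and intermediates never leave the scratch directory unless `keep_scratch=True`.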

sina-mansour · 2021-11-24T04:50:10Z

We're currently storing all intermediate files on the scratch file system. The following processed data are stored, or will be stored, for public release:

Native atlases:
- Surface atlases registered to the native space are available as NIfTI volumetric hard parcellations and will be released as such
- These atlases include 20 from Schaefer et al. as well as the HCP MMP1.0 atlas
Functional time-series:
- We have provided the resting-state time-series for all of the atlases (we decided to provide the time-series rather than a correlation-based connectivity measure, as this increases the possible use cases)
- We have also provided a global signal time-series for studies aiming to apply global signal regression
Structural connectivity:
- We will provide high-resolution endpoints in native/MNI for all mapped streamlines (only the ends of tractograms, to reduce size)
- We will also provide a wide range of connectivity measures (streamline count, FBC, density, length, etc.) mapped to different atlases.
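
To illustrate how the released endpoints relate to an atlas-level measure such as streamline count, here is a minimal sketch. It assumes each streamline's two endpoints have already been assigned parcel labels, with labels running from 1 to `n_parcels` and 0 meaning "no parcel"; that labelling convention and the function name `streamline_count_matrix` are assumptions for illustration, not the pipeline's actual implementation.

```python
def streamline_count_matrix(endpoint_labels, n_parcels):
    """Count streamlines connecting each pair of parcels.

    endpoint_labels: iterable of (label_a, label_b) pairs, one per streamline.
    Labels are assumed to be 1..n_parcels, with 0 meaning "unassigned".
    """
    counts = [[0] * n_parcels for _ in range(n_parcels)]
    for a, b in endpoint_labels:
        if a == 0 or b == 0:
            continue  # skip streamlines with an unassigned endpoint
        i, j = a - 1, b - 1
        counts[i][j] += 1
        if i != j:
            counts[j][i] += 1  # mirror the entry so the matrix stays symmetric
    return counts
```

Because only the two endpoints of each streamline are needed for this kind of mapping, the same endpoint file can be reused with any of the atlases.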

We'll need to ensure that we can somehow upload three sets of bulk data for every individual back to the UKB storage:

- atlases
- functional time-series
- structural connectivity measures

@caioseguin would you be able to ask UK Biobank whether they will accept these uploads, and whether there are limits (e.g. on data volume per subject) that we need to adhere to?
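
To put a number on the request, the per-subject volume of the three sets could be tallied with something like the sketch below. The directory layout (one folder per subject containing `atlases`, `functional` and `structural` subfolders) and the function names are hypothetical; the real output layout may differ.

```python
from pathlib import Path


def directory_size_bytes(root):
    """Total size in bytes of all regular files under `root`."""
    return sum(p.stat().st_size for p in Path(root).rglob("*") if p.is_file())


def per_subject_upload_size(subject_dir, groups=("atlases", "functional", "structural")):
    """Size in bytes of each data group for one subject, plus the total.

    The group names are assumed subfolder names, not the actual layout.
    """
    subject_dir = Path(subject_dir)
    sizes = {g: directory_size_bytes(subject_dir / g) for g in groups}
    sizes["total"] = sum(sizes.values())
    return sizes
```

Running this over a handful of processed subjects would give a representative per-subject figure to quote when asking about limits.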

caioseguin · 2021-11-24T09:39:20Z

I will ask them and get back to you.

sina-mansour · 2023-09-06T09:01:09Z

This issue has been dormant for a while. The latest update is that we were able to return the results to UK Biobank over a secure SFTP connection (using MediaFlux).

UKB has informed us that the resource should be made available in a new release (planned for November 2023).

sina-mansour added the "Awaiting discussion" label Nov 24, 2021

sina-mansour mentioned this issue Nov 24, 2021

Connectivity measures to be sampled #10

Closed

Lestropie mentioned this issue Nov 28, 2021

MSMT CSD runtime #14

Closed

sina-mansour closed this as completed Sep 6, 2023
