As we discussed in the deployment meeting, we are moving towards storing data in zarr or ome-zarr format. As we start to think about optimizing the storage, we should also begin thinking about how to structure the subgroups of zarr datastores so that a data reader/writer module can load data into the correct place.
For example, Ivan has different Stokes data stored as subgroups in a larger datastore that holds all of the data for a single FOV. These subgroups have names such as "Stokes_1_FOV1", and to load one of these arrays you have to specify the name of its subgroup. For a data reader, we need either a standardized naming format for these subgroups or a set of subgroup attributes, so that we can scan through a datastore and find the correct subgroups to load.
I suggest that we come up with a standardized way of assigning attributes to these subgroups, so that we do not have to adhere to strict naming conventions. For example, suppose we save computed Stokes channels as different subgroups in a datastore with the structure store/Stokes/Stokes0_zstack, where Stokes0_zstack is the array. We can assign attributes in the following way:
import zarr
store_path = '/home/camfoltz2/Stokes_FOV1.zarr'
store = zarr.open(store_path)
store['Stokes']['Stokes0_zstack'].attrs['Type'] = 'S0'  # label this array as the S0 channel
Calling .attrs['Type'] = 'S0' creates an attribute of the array named 'Type' and sets it to 'S0'. You can then search through this dictionary of attributes when loading data. Attribute values do not have to be strings; they can also be lists and numbers, as long as they are JSON-serializable.
I think a good idea here would be to standardize the attributes of all of our data, and have the data I/O module assign these standardized attributes to arrays upon saving. @bryantChhun and others, what do you think? I think we will need some type of data labeling as we move to the zarr storage format.