
Using the intern.array API

Jordan Matelsky edited this page Apr 28, 2021 · 4 revisions

The intern package has an array API that makes it easy to upload and download data from BossDB.

To start, install intern (for example, with pip install intern). Then, from a Python session, you can access public data:

from intern import array

em = array("bossdb://morgan2020/lgn/em")
data = em[5025:5026, 100016:100528, 99977:100489]

The text that looks like bossdb://X/Y/Z is called a "URI". You can think of it like a folder path that identifies one dataset inside a project. In this example, we downloaded data from the morgan2020 project. You can see a full list of public datasets here.
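
To make the bossdb://X/Y/Z shape concrete, here is a minimal sketch that splits a URI into its three segments. The helper name split_bossdb_uri is hypothetical (it is not part of intern), and the labels collection, experiment, and channel are an assumption based on the my_collection/my_experiment/my_channel example later on this page. It only handles the short three-segment form shown here.

```python
from typing import Tuple


def split_bossdb_uri(uri: str) -> Tuple[str, str, str]:
    """Split 'bossdb://X/Y/Z' into (X, Y, Z).

    Hypothetical helper for illustration only; not part of intern.
    """
    prefix = "bossdb://"
    if not uri.startswith(prefix):
        raise ValueError(f"Expected a URI starting with {prefix!r}, got {uri!r}")

    # Drop the scheme, then split the remainder on slashes.
    parts = uri[len(prefix):].strip("/").split("/")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"Expected exactly three segments, got {uri!r}")

    # Assumed naming: collection / experiment / channel.
    collection, experiment, channel = parts
    return collection, experiment, channel


print(split_bossdb_uri("bossdb://morgan2020/lgn/em"))
```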

If you want to know how big a dataset is before you start downloading, you can just use the numpy-style shape attribute:

em.shape

Even though this dataset spans many teravoxels, intern will only download the parts that you request using the numpy-style indexing (like above).
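
Because only the requested region is transferred, it can help to estimate how large a cutout will be before asking for it. The sketch below does that arithmetic for the index ranges used in the first example. The helper names are hypothetical, and the default of one byte per voxel is an assumption (it matches a "uint8" dataset; other datatypes need a different value).

```python
from math import prod
from typing import Sequence, Tuple

Bounds = Sequence[Tuple[int, int]]


def cutout_shape(bounds: Bounds) -> Tuple[int, ...]:
    """Shape of a cutout given (start, stop) pairs, one per axis."""
    return tuple(stop - start for start, stop in bounds)


def cutout_bytes(bounds: Bounds, bytes_per_voxel: int = 1) -> int:
    """Approximate in-memory size of the cutout.

    bytes_per_voxel=1 is an assumption that matches uint8 data.
    """
    return prod(cutout_shape(bounds)) * bytes_per_voxel


# Same ranges as em[5025:5026, 100016:100528, 99977:100489]
bounds = [(5025, 5026), (100016, 100528), (99977, 100489)]
print(cutout_shape(bounds), cutout_bytes(bounds))
```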

For a more in-depth tutorial, visit bossdb.org/get-started, or check out the video tutorials on the BossDB YouTube channel.

You can also run an interactive example on Binder.

Complete argument list reference

Argument | Type | Description
channel | intern.resource.boss.ChannelResource | The channel from which data will be downloaded.
resolution | int = 0 | The native resolution or MIP to use.
volume_provider | VolumeProvider | The remote-like volume provider to use.
axis_order | str = AxisOrder.ZYX | The axis-ordering to use for data cutouts. Defaults to ZYX. DOES NOT affect the voxel_size or extents arguments to this constructor.
create_new | bool = False | Whether to create a new dataset. Defaults to False.
dtype | str | Only required if create_new = True. Specifies the numpy-style datatype for this new dataset (e.g. "uint8").
description | str | Only required if create_new = True. Sets the description for the newly-created collection, experiment, channel, and coordframe resources.
extents | Optional[Tuple[int, int, int]] | Only required if create_new = True. Specifies the total dataset extents of this new dataset, in ZYX order.
voxel_size | Optional[Tuple[int, int, int]] | Only required if create_new = True. Specifies the voxel dimensions of this new dataset, in ZYX order.
voxel_unit | Optional[str] | Only required if create_new = True. Specifies the voxel-dimension unit. For example, "nanometers".
downsample_levels | int = 6 | The number of downsample levels.
downsample_method | Optional[str] | The type of downsample to use. If unset, defaults to 'anisotropic'.
coordinate_frame_name | Optional[str] | If set, the name to use for the newly created coordinate frame. If not set, the name of the coordinate frame will be chosen automatically.
coordinate_frame_desc | Optional[str] | If set, the description text to use for the newly created coordinate frame. If not set, the description will be chosen automatically.
collection_desc | Optional[str] | The description text to use for a newly created collection. If not set, the description will be chosen automatically.
experiment_desc | Optional[str] | The description text to use for a newly created experiment. If not set, the description will be chosen automatically.
source_channel | Optional[str] | The channel to use as the source for this new channel, if create_new is True and this is going to be an annotation channel (dtype!=uint8).
boss_config | Optional[dict] | The BossRemote configuration dict to use in order to authenticate with a BossDB remote. This option is mutually exclusive with the VolumeProvider configuration. If the volume_provider arg is set, this will be ignored.

Pointing to an existing dataset

from intern import array

# Save a cutout to a numpy array in ZYX order:
em = array("bossdb://witvliet2020/Dataset_4/em")
data = em[210:220, 7969:8993, 11532:12556]
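
Coordinates are often written down in XYZ order, while the cutout above is indexed in ZYX order (the default axis_order). The sketch below builds the ZYX index from named X, Y, and Z ranges so the axes are harder to mix up. The helper zyx_index is hypothetical and not part of intern; it relies only on the fact that a tuple of slices is what Python passes to square-bracket indexing.

```python
from typing import Tuple

Range = Tuple[int, int]


def zyx_index(x: Range, y: Range, z: Range) -> Tuple[slice, slice, slice]:
    """Build a ZYX-ordered index from named (start, stop) ranges.

    Hypothetical helper for illustration only.
    """
    return (slice(*z), slice(*y), slice(*x))


index = zyx_index(x=(11532, 12556), y=(7969, 8993), z=(210, 220))
print(index)

# With an array opened as in the example above, this would request the
# same region as em[210:220, 7969:8993, 11532:12556]:
# data = em[index]
```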

Creating a new dataset

Create a new dataset with 2048 voxels in the Z direction, 4096 in Y, and 5123 in X (the extents don't have to be powers of two). Datasets are initialized with all zeros.

from intern import array

dataset = array(
    "bossdb://my_collection/my_experiment/my_channel",
    create_new=True,
    extents=[2048, 4096, 5123],  # total dataset size in voxels, ZYX order
    voxel_size=[2, 1, 1],  # voxel dimensions, ZYX order
)
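
Before writing data into a dataset like this one, it can be useful to confirm that the block you plan to upload fits inside the extents. The following is a minimal sketch of that check; region_within_extents is a hypothetical helper, not part of intern. The commented-out last line assumes that uploads use numpy-style slice assignment, mirroring the indexing used for downloads, which this page does not show explicitly.

```python
from typing import Sequence, Tuple


def region_within_extents(
    origin_zyx: Sequence[int],
    shape_zyx: Sequence[int],
    extents_zyx: Sequence[int],
) -> Tuple[slice, ...]:
    """Return ZYX slices for a block, or raise if it leaves the dataset.

    Hypothetical helper for illustration only.
    """
    slices = []
    for axis, start, size, limit in zip("ZYX", origin_zyx, shape_zyx, extents_zyx):
        stop = start + size
        if start < 0 or size <= 0 or stop > limit:
            raise ValueError(
                f"{axis} range {start}:{stop} is outside the dataset extent 0:{limit}"
            )
        slices.append(slice(start, stop))
    return tuple(slices)


extents = [2048, 4096, 5123]  # same extents as the dataset created above
region = region_within_extents((0, 0, 5000), (10, 512, 123), extents)
print(region)

# Assumption: uploads use numpy-style assignment with a matching-shape array.
# dataset[region] = block
```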