Reading Public Data Examples

William Silversmith edited this page Apr 12, 2020 · 15 revisions

If you're new to CloudVolume, it can be helpful to try using it with a public dataset to get a feel for it. Start by following the instructions to install CloudVolume, then come back here!

We'll use the dataset by Kasthuri et al. for these examples. (Open Viewer)

Kasthuri, Narayanan, et al. "Saturated reconstruction of a volume of neocortex." Cell 162.3 (2015): 648-661.

You can find additional datasets at The Open Connectome Project.

Downloading EM Images

Neuroglancer datasets are often addressed as precomputed://protocol://bucket/dataset/layer. When initializing a CloudVolume instance, we can omit the precomputed:// part. In this example, the dataset uses the gs:// protocol, which means it lives on Google Cloud Storage, but s3:// and other protocols are possible as well. For public datasets hosted on Google Cloud Storage, replacing the gs:// prefix with https://storage.googleapis.com/ lets you read them without an authentication token.
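The path rewrite described above can be sketched as a small helper. This is only an illustration: to_public_https is a hypothetical name, not part of CloudVolume, and it assumes the bucket is publicly readable.

```python
def to_public_https(cloudpath: str) -> str:
    """Rewrite a gs:// cloudpath as a public https://storage.googleapis.com URL.

    Hypothetical helper for illustration; not part of CloudVolume.
    """
    # Drop the optional Neuroglancer format prefix.
    prefix = "precomputed://"
    if cloudpath.startswith(prefix):
        cloudpath = cloudpath[len(prefix):]

    # Only Google Cloud Storage paths are rewritten; others pass through unchanged.
    if cloudpath.startswith("gs://"):
        return "https://storage.googleapis.com/" + cloudpath[len("gs://"):]
    return cloudpath


print(to_public_https(
    "precomputed://gs://neuroglancer-public-data/kasthuri2011/image_color_corrected"
))
```

The printed URL is the same one passed to CloudVolume in the example that follows.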

In this example, we download a 512x512x64 voxel patch of electron microscopy images from the Kasthuri et al. dataset at the highest resolution.

from cloudvolume import CloudVolume, view

# 1. Initialize a CloudVolume object which will know how to read from this dataset layer. 
cv = CloudVolume(
    'https://storage.googleapis.com/neuroglancer-public-data/kasthuri2011/image_color_corrected',
    progress=True, # shows progress bar
    cache=True, # cache to disk to avoid repeated downloads
    # parallel=True, # uncomment to try parallel download!
)

# 2. Download context around the point in the Neuroglancer link above
#    into a numpy array.
# argument one is the (x,y,z) coordinate from neuroglancer
# mip=resolution level (smaller mips are higher resolution, highest is 0)
# size is in voxels
img = cv.download_point( (5188, 9096, 1198), mip=0, size=(512, 512, 64) )

# 3. Visualize the image! 
# Open your browser to http://localhost:8080 to view
# Press ctrl-C to continue script execution.
view(img)

# 4. When you're done experimenting, clean up the space we used on disk.
# cv.cache.flush() 
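To make the point, size, and mip arguments more concrete, here is a rough sketch of the arithmetic in plain numpy, with no download involved. It assumes the cutout is centered on the requested point and that each mip level downsamples by a fixed factor of 2x2x1. That factor is an illustrative assumption, not something confirmed for this dataset. The names centered_bbox and point_at_mip are hypothetical helpers, not CloudVolume functions.

```python
import numpy as np


def centered_bbox(point, size):
    """Return (minpt, maxpt) of a box of `size` voxels centered on `point`."""
    point = np.asarray(point, dtype=np.int64)
    half = np.asarray(size, dtype=np.int64) // 2
    return point - half, point + half


def point_at_mip(point, mip, factor=(2, 2, 1)):
    """Scale a mip 0 coordinate to a coarser mip.

    Assumes the same per-axis downsampling factor at every level.
    """
    point = np.asarray(point, dtype=np.int64)
    return point // (np.asarray(factor, dtype=np.int64) ** mip)


minpt, maxpt = centered_bbox((5188, 9096, 1198), (512, 512, 64))
print(minpt, maxpt)

# The same location expressed in mip 1 voxels under the assumed 2x2x1 factor.
print(point_at_mip((5188, 9096, 1198), mip=1))
```

Under these assumptions, a larger mip covers the same physical region with fewer voxels, which is why smaller mips are higher resolution.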

Downloading Segmentation & Meshes

Working with segmentation is very similar, but you may also have access to meshes and skeletons. This dataset only has meshes, so we won't demonstrate skeletons in this example. In the previous example, we were reading from the image_color_corrected layer, but in this one we'll read from the ground_truth layer.

from cloudvolume import CloudVolume, view 

cv = CloudVolume(
    'https://storage.googleapis.com/neuroglancer-public-data/kasthuri2011/ground_truth',
    progress=True, # shows progress bar
    cache=True, # cache to disk to avoid repeated downloads
    # parallel=True, # uncomment to try parallel download!
)

img = cv.download_point( (5188, 9096, 1198), mip=0, size=(512, 512, 64) )

# segmentation=True puts the microviewer in segmentation mode.
# With segmentation=False, the labels would be displayed as a
# raw image, which might look very dark if the label values
# are small.
view(img, segmentation=True)

# Get as mesh object
mesh = cv.mesh.get(13)
# Save to disk at ./13.obj which can be visualized in MeshLab or Blender
cv.mesh.save(13, file_format='obj')

# cv.cache.flush()
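Once downloaded, a segmentation cutout is an array of integer labels, so ordinary numpy operations apply to it. The sketch below uses a small synthetic array as a stand-in for img so it runs without a download. Treating 0 as background is an assumption here, a common convention rather than something this dataset guarantees.

```python
import numpy as np

# Synthetic stand-in for a downloaded segmentation cutout (x, y, z).
labels = np.zeros((4, 4, 2), dtype=np.uint32)
labels[:2, :, :] = 13
labels[2:, :2, :] = 7

# Count voxels per label, skipping 0 (assumed to be background).
ids, counts = np.unique(labels, return_counts=True)
voxels_per_label = {int(i): int(c) for i, c in zip(ids, counts) if i != 0}
print(voxels_per_label)

# Boolean mask selecting a single segment, e.g. label 13.
mask = labels == 13
print(int(mask.sum()))
```

The label values here (7 and 13) are tiny compared to the range of a 32-bit integer, which is why the same data would look nearly black if rendered as a raw image.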

FlyWire Variation

This is a simplified example for new users in the FlyWire (flywire.ai) community.

You'll first need to place the credentials file for the dataset you are working on at ~/.cloudvolume/secrets/chunkedgraph-secret.json. This file provides authenticated access to the server that holds the proofread representation of neurons.
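Before connecting, it can help to confirm that the file is in the expected place. This is a minimal sketch using only the standard library. chunkedgraph_secret_path is a hypothetical helper, not part of CloudVolume, and it checks only that the file exists, not that its contents are valid.

```python
from pathlib import Path


def chunkedgraph_secret_path(home=None):
    """Build the expected secret path under `home` (defaults to the user's home)."""
    home = Path(home) if home is not None else Path.home()
    return home / ".cloudvolume" / "secrets" / "chunkedgraph-secret.json"


path = chunkedgraph_secret_path()
if path.is_file():
    print(f"Found secret at {path}")
else:
    print(f"No secret found at {path}")
```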

from cloudvolume import CloudVolume

segid = 720575940631525604

# use_https makes sure that when Google Cloud Storage servers are accessed,
# it is through the public https interface rather than the authenticated
# interface. This is for the raw images, not the graph representation
# of the proofread labels.
cv = CloudVolume('URL_GOES_HERE', progress=True, use_https=True)

# Passing file_format explicitly ensures an OBJ is written,
# saved to disk as 720575940631525604.obj
cv.mesh.save(segid, file_format='obj')