Spatial Queries for Skeletons #874
william-silversmith
started this conversation in
Ideas
Hi Jeremy,
I've been talking to some of the folks at AIBS about the use of skeletons in Neuroglancer and there are some features they've been asking for to visualize dense skeletons. For example, they might want to visualize a large number of skeletons in a given bounding box or to query a bounding box for what skeleton ids are present.
A lot of this can be done using the architecture backing polylines. For example, Diffusion Tensor Imaging (an MRI technique) generates hundreds of thousands of brain-spanning, dense, branch-free streamlines that can be treated as polylines. Skeletons, however, are branched structures and don't quite fit this mold. Skeletons also need to be treated differently in terms of low-resolution representations: you can't just use random points or you lose topological integrity (and of course, that means that below a certain resolution the simplifier must be smarter and pick the least destructive edge restructuring).
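To illustrate the point about topological integrity, here is a minimal sketch of one possible downsampling strategy: keep every branch point and endpoint, and thin out only the vertices along unbranched runs. The function name `simplify_skeleton` and the `stride` parameter are hypothetical, the input is assumed to be an acyclic edge list of integer vertex ids, and this is not how Neuroglancer or CloudVolume currently simplify skeletons.

```python
from collections import defaultdict

def simplify_skeleton(edges, stride=4):
    """Thin a skeleton while preserving its branching structure.

    edges: iterable of (vertex_id, vertex_id) pairs (assumed acyclic).
    stride: keep every `stride`-th vertex along each unbranched run.
    Returns a new edge list over the surviving vertices.
    """
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)

    # Branch points (degree > 2) and endpoints (degree 1) are never dropped.
    critical = {v for v, nbrs in adj.items() if len(nbrs) != 2}

    new_edges = []
    walked = set()  # (critical vertex, first step) pairs already traversed
    for start in critical:
        for first in adj[start]:
            if (start, first) in walked:
                continue
            # Follow the degree-2 chain until the next critical vertex.
            path = [start, first]
            prev, cur = start, first
            while cur not in critical:
                a, b = adj[cur]
                prev, cur = cur, (b if a == prev else a)
                path.append(cur)
            # Mark both directions so the run is not emitted twice.
            walked.add((path[0], path[1]))
            walked.add((path[-1], path[-2]))
            # Subsample the interior, always keeping both ends of the run.
            kept = path[:-1][::stride] + [path[-1]]
            new_edges.extend(zip(kept, kept[1:]))
    return new_edges
```

Random point selection would have no such guarantee: dropping a branch vertex disconnects the tree. Below the resolution where even the runs between branch points collapse, a smarter edge restructuring is still needed, which this sketch does not attempt.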
I was thinking that a possible extension to skeletons could be along the same architecture as polylines with a "spatial index" hierarchy (a term which is used differently in the context of Neuroglancer than in CloudVolume where it means a JSON or SQL index for querying ids).
That is, skeletons are chunked, placed into some kind of container brick, and loaded into shards using the identity hash. We could design a simple container format ourselves, but one existing format that is almost as simple is https://github.com/seung-lab/mapbuffer/ which allows for efficient filtering by ID on disk via binary search.
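As a rough illustration of the "filter by ID via binary search" idea, here is a sketch of a toy container: a count, then a fixed-width index sorted by ID, then the payloads. The layout, `pack_container`, and `lookup` are all hypothetical and are not the actual mapbuffer byte format or API.

```python
import struct
from typing import Dict, Optional

HEADER = struct.Struct("<Q")    # number of entries
ENTRY = struct.Struct("<QQQ")   # (id, byte offset, byte length)

def pack_container(items: Dict[int, bytes]) -> bytes:
    """Serialize {id: payload} with an index sorted by id."""
    ids = sorted(items)
    offset = HEADER.size + ENTRY.size * len(ids)
    index = bytearray()
    body = bytearray()
    for label in ids:
        payload = items[label]
        index += ENTRY.pack(label, offset, len(payload))
        body += payload
        offset += len(payload)
    return HEADER.pack(len(ids)) + bytes(index) + bytes(body)

def lookup(buf, label: int) -> Optional[bytes]:
    """Binary search the sorted index; only touched entries are decoded."""
    (n,) = HEADER.unpack_from(buf, 0)
    lo, hi = 0, n
    while lo < hi:
        mid = (lo + hi) // 2
        key, offset, size = ENTRY.unpack_from(buf, HEADER.size + mid * ENTRY.size)
        if key == label:
            return bytes(buf[offset:offset + size])
        if key < label:
            lo = mid + 1
        else:
            hi = mid
    return None
```

Because `lookup` only uses `unpack_from` and slicing, `buf` could also be an `mmap` object, so a single fragment can be pulled from a brick without parsing the rest of it.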
As I mentioned above, CloudVolume also includes a JSON index for querying IDs in space, which is very useful, particularly when segmentation is not available for skeletons. Perhaps we could either directly use the same format or devise a slightly more efficient one. The CloudVolume format is a grid of JSON files formatted like
{ "12481249281": [ 0, 0, 0, 100, 100, 100 ], ... }
which lets you query for axis-aligned bounding box intersections very finely, though I admit that parsing entire files in Python can become slow, primarily due to Python object creation. I use the pysimdjson library to try to accelerate some queries.
Some alternatives might include a binary version of this, omitting the bounding box component for compactness, or some kind of R-tree or octree to make filtering more efficient. In the simplest case, we could omit this kind of spatial index entirely and use the raw data cleverly. However, I've found in some simple experiments that searching shards for label data is unusably slow, but perhaps I could be more clever about parallelizing or condensing it.
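For concreteness, here is a minimal sketch of how one file of the JSON grid format described above could be queried for the IDs that intersect a bounding box. The helper names are hypothetical, the boxes are assumed to be `[x0, y0, z0, x1, y1, z1]` with inclusive bounds, and the standard `json` module stands in for pysimdjson; none of this is CloudVolume's actual query code.

```python
import json

def bbox_intersects(a, b):
    """True if two [x0, y0, z0, x1, y1, z1] boxes overlap or touch.

    Bounds are treated as inclusive, which is an assumption of this sketch.
    """
    return all(a[i] <= b[i + 3] and b[i] <= a[i + 3] for i in range(3))

def query_index_file(text, query_bbox):
    """Return the sorted integer IDs in one index file that hit query_bbox."""
    index = json.loads(text)  # {"<id>": [x0, y0, z0, x1, y1, z1], ...}
    return sorted(
        int(label)
        for label, bbox in index.items()
        if bbox_intersects(bbox, query_bbox)
    )

example = '{"12481249281": [0, 0, 0, 100, 100, 100], "7": [200, 200, 200, 300, 300, 300]}'
hits = query_index_file(example, [50, 50, 50, 150, 150, 150])
```

A full query would first pick the grid files whose cells overlap the query box and then union the per-file results. Every key and box becomes a Python object in this approach, which is the cost a binary or tree-based index would try to avoid.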
Let me know what you think!
Thanks so much Jeremy.
Will