# Precomputed_annotation_sharded
The purpose of this notebook is to figure out how to create sharded precomputed annotation layers and to determine when they are needed for performance in terms of number of annotations. Refer to this document: https://github.com/google/neuroglancer/blob/master/src/neuroglancer/datasource/precomputed/annotations.md for the general structure of these precomputed annotation layers.

So far, I have been able to successfully create a precomputed annotation layer without sharding. This works well for up to about 10^5 annotations, but above that the load time starts to suffer. To get faster load times, sharding is recommended. There are multiple steps to take to enable sharding. Here are some things that I know do NOT change. Note that I'm not using relationships.

In the info file the following keys stay the same:
- "@type"
- "annotation_type"
- "dimensions"
- "lower_bound"
- "upper_bound"
- "properties"

That leaves the remaining keys, which may change:
- "by_id"
- "spatial"

Here is an example public layer which uses sharding: https://storage.googleapis.com/neuroglancer-20191211_fafbv14_buhmann2019_li20190805/. To download the info file do:
```
gsutil cp gs://neuroglancer-20191211_fafbv14_buhmann2019_li20190805/info .
```

In this example info file the `by_id` key looks like this:
```
"by_id" : {
  "key" : "by_id",
  "sharding" : {
     "@type" : "neuroglancer_uint64_sharded_v1",
     "data_encoding" : "gzip",
     "hash" : "murmurhash3_x86_128",
     "minishard_bits" : 13,
     "minishard_index_encoding" : "gzip",
     "preshift_bits" : 0,
     "shard_bits" : 5
  }
},
```

And there are multiple entries to the `spatial` key. The docs state that this key is defined as:
```
Array of JSON objects specifying the spatial index levels from coarse to fine
```
The coarse to fine suggests that this works like downsampling where we have multiple mip levels.
The first of these objects looks like what we are used to from not sharding:
```
{
     "chunk_size" : [ 884397, 442585, 282601 ],
     "grid_shape" : [ 1, 1, 1 ],
     "key" : "spatial0",
     "limit" : 10000
  },
```
But unlike in our example, this chunk size is huge and needs downsampling. According to the docs this should be the coarsest level (i.e. largest chunks with fewest of them). That does appear to be the case since this chunk size is the size of `upper_bound` - `lower_bound` . The `grid_shape` is defined as `Array of rank positive integers specifying the number of cells along each grid dimension for this spatial index level.` So a cell must be how the space is divided up. So in this case there is just a single cell over the entire volume. This must be a very low resolution chunk since not all points would be able to be rendered. 
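The relationship between the layer extent, `grid_shape`, and `chunk_size` can be checked with a few lines of Python. This is only a sketch: the helper name `chunk_size_for_grid` is made up for illustration, and I'm assuming the extent of this layer equals the `spatial0` chunk size quoted above, as the paragraph suggests.

```python
def chunk_size_for_grid(extent, grid_shape):
    """Size of one grid cell if `extent` is split evenly into `grid_shape` cells."""
    return [e / g for e, g in zip(extent, grid_shape)]

# Extent assumed equal to the spatial0 chunk size from the example info file
extent = [884397, 442585, 282601]

print(chunk_size_for_grid(extent, [1, 1, 1]))  # one cell covering the whole volume
print(chunk_size_for_grid(extent, [2, 1, 1]))  # two cells along x
```

With a grid shape of `[1, 1, 1]` the cell size is the full extent, and every doubling of a grid dimension halves the cell size along that axis.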

The next entry to the `spatial` array looks like this:
```
{
     "chunk_size" : [ 442198.5, 442585, 282601 ],
     "grid_shape" : [ 2, 1, 1 ],
     "key" : "spatial1",
     "limit" : 10000,
     "sharding" : {
        "@type" : "neuroglancer_uint64_sharded_v1",
        "data_encoding" : "gzip",
        "hash" : "identity",
        "minishard_bits" : 0,
        "minishard_index_encoding" : "gzip",
        "preshift_bits" : 0,
        "shard_bits" : 0
     }
  },
```
Here we see this is `spatial1` and the chunk size is half the size in the x dimension. The grid shape indicates two cells in that dimension. The other dimensions are the same as `spatial0`. Note the addition of the `sharding` key to this dictionary. This is how we tell Neuroglancer that this is a sharded level. In fact all of the subsequent spatial levels have this key. The keys of this sharding dictionary are explained here: https://github.com/google/neuroglancer/blob/master/src/neuroglancer/datasource/precomputed/sharded.md#sharding-specification. The keys are:

- "@type": Must be "neuroglancer_uint64_sharded_v1".
- "preshift_bits": Specifies the number of low-order bits of the chunk ID that do not contribute to the hashed chunk ID. The hashed chunk ID is computed as hash(chunk_id >> preshift_bits).
- "hash": Specifies the hash function used to map chunk IDs to shards. Must be one of:
  - "identity": The identity function.
  - "murmurhash3_x86_128": The MurmurHash3_x86_128 hash function applied to the shifted chunk ID in little endian encoding. The low 8 bytes of the resultant hash code are treated as a little endian 64-bit number.
- "minishard_bits": Specifies the number of bits of the hashed chunk ID that determine the minishard number. The number of minishards within each shard is equal to 2**minishard_bits. The minishard number is equal to bits [0, minishard_bits) of the hashed chunk id.
- "shard_bits": Specifies the number of bits of the hashed chunk ID that determine the shard number. The number of shards is equal to 2**shard_bits. The shard number is equal to bits [minishard_bits, minishard_bits+shard_bits) of the hashed chunk ID.
- "minishard_index_encoding": Specifies the encoding of the "minishard index". If specified, must be "raw" (to indicate no compression) or "gzip" (to indicate gzip compression). If not specified, equivalent to "raw".
- "data_encoding": Specifies the encoding of the actual chunk data, in the same way as "minishard_index_encoding". In the case of multiscale meshes, this encoding applies to the manifests but not to the mesh fragment data.
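To make the bit ranges above concrete, here is a minimal sketch of how a chunk ID would map to a shard and minishard number. It only covers the `"identity"` hash (MurmurHash3 is not implemented here), and the function name `shard_location` is my own, not something from Neuroglancer.

```python
def shard_location(chunk_id, preshift_bits, minishard_bits, shard_bits):
    """Return (shard, minishard) for a chunk ID, assuming the "identity" hash."""
    hashed = chunk_id >> preshift_bits  # identity hash: no further mixing
    # minishard number: bits [0, minishard_bits) of the hashed chunk ID
    minishard = hashed & ((1 << minishard_bits) - 1)
    # shard number: bits [minishard_bits, minishard_bits + shard_bits)
    shard = (hashed >> minishard_bits) & ((1 << shard_bits) - 1)
    return shard, minishard

# spatial1 above: all bit counts are 0, so every chunk lands in shard 0, minishard 0
print(shard_location(5, preshift_bits=0, minishard_bits=0, shard_bits=0))

# by_id above: shard_bits=5 and minishard_bits=13
print(2 ** 5, "shards with", 2 ** 13, "minishards each")
```

So for `spatial1` there is a single shard holding both grid cells, while the `by_id` index is spread over 32 shards.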

There are a number of things I still don't understand here, but maybe I can understand them by comparing the contents of the layer shown here: https://storage.googleapis.com/neuroglancer-20191211_fafbv14_buhmann2019_li20190805/ with what's in the info file. Let's start with `spatial0`. There is a folder called `spatial0` whose only content is the file `0_0_0`. This is the same as when we were not sharding. It's not clear whether this file contains every single point. The downloaded file is very small (480K), so I bet it holds just a small fraction of the points. Let's see if we can use `struct.unpack()` to look at the contents of that binary file. We should be able to read the number of annotations in the file, since it is stored at the beginning of the file as a little-endian uint64. I was able to do that with:
```
with open(spatial0_0_0_0_file, "rb") as f:
    byte = f.read(8) # the first 8 bytes represent the uint64 value for total number of annotations
    print(struct.unpack('<Q',byte))
```
And the answer is 10071 annotations in this file. 
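As a sanity check on the little-endian decoding, the same count can be recovered by hand from the 8 header bytes. The byte string below is the value of `byte` printed in this notebook; the rest is just a cross-check, not part of the original workflow.

```python
import struct

header = b"W'\x00\x00\x00\x00\x00\x00"  # the first 8 bytes of spatial0/0_0_0

(count,) = struct.unpack('<Q', header)

# Two independent ways of reading the same little-endian integer
assert count == int.from_bytes(header, 'little')
assert count == 0x57 + 0x27 * 256  # 'W' is 0x57 and "'" is 0x27
print(count)
```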

OK, so either there are many more annotations than this 

Questions for Jeremy: 
- How do you decide which points to show in the coarsest spatial0/0_0_0 file? 
- What are the by_id/00.shard files? In the unsharded example I don't even have this by_id folder.

In [3]:
import numpy as np
import os
import csv
import struct
import json
from cloudvolume import CloudVolume
import matplotlib.pyplot as plt
import neuroglancer
%matplotlib inline

In [4]:
spatial0_0_0_0_file = '/home/ahoag/ngdemo/demo_bucket/test_annotations/neuroglancer-20191211_fafbv14_buhmann2019_li20190805/spatial0/0_0_0'

In [13]:
with open(spatial0_0_0_0_file, "rb") as f:
    byte = f.read(8) # the first 8 bytes represent the uint64 value for total number of annotations
    print(struct.unpack('<Q',byte))
#     while byte != b"":
#         # Do stuff with byte.
#         byte = f.read(1)

(10071,)


In [11]:
byte

b"W'\x00\x00\x00\x00\x00\x00"

(10071,)

In [2]:
# get the raw-space cells file and load it in
animal_id = 4
pth=os.path.join('/jukebox/wang/Jess/lightsheet_output',
        '201904_ymaze_cfos','processed',f'an{animal_id}','clearmap_cluster_output',
        'cells.npy')
converted_points = np.load(pth)

In [3]:
converted_points

array([[ 459, 1398,   50],
       [ 459, 1443,   50],
       [ 462, 1412,   49],
       ...,
       [1546, 1242,  569],
       [1547, 1316,  570],
       [1646, 1328,  574]])

In [4]:
len(converted_points)

524170

In [None]:
np.random.shuffle(converted_points) # does it in place

## Just coordinates - something we already know how to do
The reference for the `struct.pack()` format codes can be found here: https://docs.python.org/3.1/library/struct.html#module-struct, where short means 16-bit, long means 32-bit and long long means 64-bit.

In [12]:
# We already know how to encode just the coordinates. Do it like so for the first 1000 points
filename = '/home/ahoag/ngdemo/demo_bucket/test_annotations/test_coords/spatial0/0_0_0'
coordinates = converted_points[0:1000]
total_count = len(coordinates)
with open(filename,'wb') as outfile:
    buf = struct.pack('<Q',total_count)
    pt_buf = b''.join(struct.pack('<3f',x,y,z) for (x,y,z) in coordinates)
    buf += pt_buf
    id_buf = struct.pack('<%sQ' % len(coordinates), *range(len(coordinates)))
    buf += id_buf
    outfile.write(buf)
print(f"wrote {filename}")

wrote /home/ahoag/ngdemo/demo_bucket/test_annotations/test_coords/spatial0/0_0_0
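To double check the layout written above (count, then the coordinates, then the ids), here is a small in-memory round trip. The function names `encode_points` and `decode_points` are hypothetical helpers for this sketch, and it only handles POINT annotations with no properties.

```python
import struct

def encode_points(points):
    """Encode points as: uint64 count, float32 xyz per point, then uint64 ids."""
    n = len(points)
    buf = struct.pack('<Q', n)
    buf += b''.join(struct.pack('<3f', x, y, z) for (x, y, z) in points)
    buf += struct.pack(f'<{n}Q', *range(n))
    return buf

def decode_points(buf):
    """Inverse of encode_points: return (count, coordinates, ids)."""
    (n,) = struct.unpack_from('<Q', buf, 0)
    coords = [struct.unpack_from('<3f', buf, 8 + 12 * i) for i in range(n)]
    ids = list(struct.unpack_from(f'<{n}Q', buf, 8 + 12 * n))
    return n, coords, ids

buf = encode_points([(459, 1398, 50), (459, 1443, 50)])
print(len(buf))            # 8 bytes of count + 2*12 bytes of coordinates + 2*8 bytes of ids
print(decode_points(buf))
```

The expected file size, `8 + 12*n + 8*n` bytes, is a quick way to check a file on disk without decoding it.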


In [13]:
# and the info file needs to look like this:
info = {
  "@type": "neuroglancer_annotations_v1",
  "annotation_type": "POINT",
  "by_id": {
    "key": "by_id"
  },
  "dimensions": {
    "x": [
      "5e-06",
      "m"
    ],
    "y": [
      "5e-06",
      "m"
    ],
    "z": [
      "1e-05",
      "m"
    ]
  },
  "lower_bound": [
    0,
    0,
    0
  ],
  "properties": [],
  "relationships": [],
  "spatial": [
    {
      "chunk_size": [
        2160,
        2560,
        687
      ],
      "grid_shape": [
        1,
        1,
        1
      ],
      "key": "spatial0",
      "limit": 1
    }
  ],
  "upper_bound": [
    2160,
    2560,
    687
  ]
}

In [40]:
info_filename = '/home/ahoag/ngdemo/demo_bucket/test_annotations/test_coords/info'
with open(info_filename,'w') as outfile:
    json.dump(info,outfile,indent=2)

Got this to work. Next let's try to add a single property, cell type.
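Since only the `properties` list changes between the info files in the rest of this notebook, a small helper could build the dict instead of retyping it each time. This is a convenience sketch: `make_info` is a name I made up, and the bounds, dimensions, and `limit` are simply copied from the info file above.

```python
import json

def make_info(properties, limit=1):
    """Build the unsharded info dict used in this notebook for a given properties list."""
    bounds = [2160, 2560, 687]
    return {
        "@type": "neuroglancer_annotations_v1",
        "annotation_type": "POINT",
        "by_id": {"key": "by_id"},
        "dimensions": {"x": ["5e-06", "m"], "y": ["5e-06", "m"], "z": ["1e-05", "m"]},
        "lower_bound": [0, 0, 0],
        "upper_bound": list(bounds),
        "properties": properties,
        "relationships": [],
        "spatial": [
            {"chunk_size": list(bounds), "grid_shape": [1, 1, 1],
             "key": "spatial0", "limit": limit}
        ],
    }

print(json.dumps(make_info(properties=[]), indent=2))
```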

## Single property -- cell type

Let's make this a small unsigned integer property with values ranging from 0 to 31 (the cells below end up storing it as a uint16). We will randomly assign cell types to each of the 1000 cells. This comment explains how to order the encoding when you have multiple properties: https://github.com/google/neuroglancer/issues/227#issuecomment-913895464:
```
In order to minimize the padding bytes required, properties that require 4 byte alignment (uint32, int32, float32) are encoded first, followed by properties that require 2 byte alignment (uint16, int16), followed by properties that require 1 byte alignment (uint8, int8, rgb, rgba). For a given alignment, the properties are encoded in the order in which the properties are specified in the info file.
```

Since we only have a single property this doesn't matter yet. We do, however, need to know where in the byte string to put the properties. From this file: https://github.com/google/neuroglancer/blob/master/src/neuroglancer/datasource/precomputed/annotations.md#multiple-annotation-encoding the answer is that each annotation's properties are written directly after its coordinates, within the same per-annotation entry.

In [49]:
filename = '/home/ahoag/ngdemo/demo_bucket/test_annotations/test_singleprop/spatial0/0_0_0'
coordinates = converted_points[0:1000]
total_count = len(coordinates)
cell_types = np.random.randint(0,32,(1000,1))
# combine the coordinates and cell types into a single array
cell_array = np.hstack((coordinates,cell_types))
with open(filename,'wb') as outfile:
    buf = struct.pack('<Q',total_count) # 64-bit little endian
    pt_buf = b''.join(struct.pack('<3fH2B',x,y,z,c,0,0) for (x,y,z,c) in cell_array) 
    buf += pt_buf
    id_buf = struct.pack('<%sQ' % len(coordinates), *range(len(coordinates)))
    buf += id_buf
    outfile.write(buf)
print(f"wrote {filename}")

wrote /home/ahoag/ngdemo/demo_bucket/test_annotations/test_singleprop/spatial0/0_0_0


Now write the info file. According to this: https://github.com/google/neuroglancer/blob/master/src/neuroglancer/datasource/precomputed/annotations.md#info-json-file-format, the properties key must have this structure:
```
"properties": Array of JSON objects, each with the following members:
"id": String value specifying unique identifier for the property. Must match the regular expression /^[a-z][a-zA-Z0-9_]*$/.
"type": String value specifying the property type. Must be one of: rgb (represented as 3 uint8 values), rgba (represented as 4 uint8 values), uint8, int8, uint16, int16, uint32, int32, or float32.
"description": Optional. String value specifying textual description of property shown in UI.
"enum_values": Optional. If "type" is a numeric type (not "rgb" or "rgba"), this property may specify an array of values (compatible with the specified data type). These values correspond to the labels specified by "enum_labels", which are shown in the UI.
"enum_labels": Must be specified if, and only if, "enum_values" is specified. Must be an array of strings of the same length as "enum_values" specifying the corresponding labels for each value.

```
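The rules quoted above are easy to get wrong by hand, so here is a rough checker for a single property definition. It is only a sketch of those documented rules; `check_property` and the message strings are my own and not part of Neuroglancer.

```python
import re

ID_PATTERN = re.compile(r'[a-z][a-zA-Z0-9_]*')
PROPERTY_TYPES = {'rgb', 'rgba', 'uint8', 'int8', 'uint16', 'int16',
                  'uint32', 'int32', 'float32'}

def check_property(prop):
    """Return a list of problems found in one property definition (empty if none)."""
    problems = []
    if not ID_PATTERN.fullmatch(prop.get('id', '')):
        problems.append('bad id')
    if prop.get('type') not in PROPERTY_TYPES:
        problems.append('bad type')
    has_values = 'enum_values' in prop
    has_labels = 'enum_labels' in prop
    if has_values != has_labels:
        problems.append('enum_values and enum_labels must be given together')
    elif has_values:
        if prop.get('type') in ('rgb', 'rgba'):
            problems.append('enum_values requires a numeric type')
        if len(prop['enum_values']) != len(prop['enum_labels']):
            problems.append('enum_values and enum_labels differ in length')
    return problems

print(check_property({'id': 'celltype', 'type': 'uint16'}))
print(check_property({'id': 'CellType', 'type': 'string'}))
```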

In [50]:
info = {
  "@type": "neuroglancer_annotations_v1",
  "annotation_type": "POINT",
  "by_id": {
    "key": "by_id"
  },
  "dimensions": {
    "x": [
      "5e-06",
      "m"
    ],
    "y": [
      "5e-06",
      "m"
    ],
    "z": [
      "1e-05",
      "m"
    ]
  },
  "lower_bound": [
    0,
    0,
    0
  ],
  "properties": [
      {"id":"celltype",
      "type":"uint16"
      }
  ],
  "relationships": [],
  "spatial": [
    {
      "chunk_size": [
        2160,
        2560,
        687
      ],
      "grid_shape": [
        1,
        1,
        1
      ],
      "key": "spatial0",
      "limit": 1
    }
  ],
  "upper_bound": [
    2160,
    2560,
    687
  ]
}

In [51]:
info_filename = '/home/ahoag/ngdemo/demo_bucket/test_annotations/test_singleprop/info'
with open(info_filename,'w') as outfile:
    json.dump(info,outfile,indent=2)

I can see the celltype property! The cell type shows up as a float, which in my case isn't ideal since I want to show it as an integer. How can I structure the packing so that I get three floats followed by a uint16 integer? What about `<3fH` in the byte encoding and `uint16` in the props dict? Nope, that didn't show any of the points. Maybe I need padding bytes to bring the size of each entry up to a multiple of 4. So two zero bytes?

Yes, that did it! OK, let's try multiple properties.
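The number of padding bytes does not have to be worked out by hand: `struct.calcsize` gives the size of a format string, and the remainder modulo 4 tells us how many zero bytes are missing. The helper name `pad_bytes` is just for this sketch.

```python
import struct

def pad_bytes(fmt, multiple=4):
    """Number of zero bytes needed to round the entry size up to `multiple`."""
    return -struct.calcsize(fmt) % multiple

print(struct.calcsize('<3fH'), pad_bytes('<3fH'))      # entry without padding
print(struct.calcsize('<3fH2B'), pad_bytes('<3fH2B'))  # entry with two padding bytes
```

This relies on the `<` prefix, which tells `struct` to use standard sizes and no implicit alignment, so the only padding is what we add ourselves.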

## Multiple properties -- cell type, cell size
For multiple properties, the order of the properties is prioritized by byte size first then order in the info file at a given byte size, as per this comment:
https://github.com/google/neuroglancer/issues/227#issuecomment-913895464: 
```
In order to minimize the padding bytes required, properties that require 4 byte alignment (uint32, int32, float32) are encoded first, followed by properties that require 2 byte alignment (uint16, int16), followed by properties that require 1 byte alignment (uint8, int8, rgb, rgba). For a given alignment, the properties are encoded in the order in which the properties are specified in the info file.
```
So let's say we want to use a float32 for the cell size and a uint16 for the cell type. We would do:
```
struct.pack('<4fH2B', x, y, z, cell_size, cell_type, 0, 0)
```
where the final `2B` are padding bytes to make the total number of bytes divisible by 4.
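The ordering rule and the padding can be combined into one helper that derives the struct format from the `properties` list. This is a sketch under my reading of the comment quoted above; `record_format`, `ALIGNMENT`, and `CODE` are hypothetical names, and it assumes POINT annotations with float32 coordinates.

```python
import struct

# Alignment and struct code for each property type
ALIGNMENT = {'uint32': 4, 'int32': 4, 'float32': 4, 'uint16': 2, 'int16': 2,
             'uint8': 1, 'int8': 1, 'rgb': 1, 'rgba': 1}
CODE = {'uint32': 'I', 'int32': 'i', 'float32': 'f', 'uint16': 'H', 'int16': 'h',
        'uint8': 'B', 'int8': 'b', 'rgb': '3B', 'rgba': '4B'}

def record_format(properties, rank=3):
    """Return (property ids in encoding order, struct format) for one POINT annotation."""
    # sorted() is stable, so properties with equal alignment keep their info-file order
    ordered = sorted(properties, key=lambda p: -ALIGNMENT[p['type']])
    fmt = f'<{rank}f' + ''.join(CODE[p['type']] for p in ordered)
    pad = -struct.calcsize(fmt) % 4
    if pad:
        fmt += f'{pad}B'  # padding bytes, to be packed as zeros
    return [p['id'] for p in ordered], fmt

props = [{'id': 'celltype', 'type': 'uint16'}, {'id': 'size', 'type': 'float32'}]
order, fmt = record_format(props)
print(order, fmt, struct.calcsize(fmt))
```

The resulting format `<3ffH2B` describes the same bytes as `<4fH2B`; the floats are just not merged into a single repeat count.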

Let's randomly generate some cell sizes and try to write this out

In [79]:
coordinates

array([[ 870, 1762,  407],
       [ 270,  803,  226],
       [ 296, 1538,  469],
       ...,
       [ 679, 1678,  152],
       [ 883, 1923,  307],
       [ 516, 1180,  202]])

In [81]:
coordinates.

array([[ 870., 1762.,  407.],
       [ 270.,  803.,  226.],
       [ 296., 1538.,  469.],
       ...,
       [ 679., 1678.,  152.],
       [ 883., 1923.,  307.],
       [ 516., 1180.,  202.]], dtype=float32)

In [85]:
np.hstack((coordinates.astype('f'),cell_sizes.astype('f'),cell_types.astype('uint16')))

array([[ 870.       , 1762.       ,  407.       ,   44.201786 ,
          22.       ],
       [ 270.       ,  803.       ,  226.       ,   37.620766 ,
          19.       ],
       [ 296.       , 1538.       ,  469.       ,   97.47168  ,
          11.       ],
       ...,
       [ 679.       , 1678.       ,  152.       ,   36.99874  ,
          14.       ],
       [ 883.       , 1923.       ,  307.       ,    3.7889192,
          30.       ],
       [ 516.       , 1180.       ,  202.       ,   86.47215  ,
           9.       ]], dtype=float32)

In [89]:
filename = '/home/ahoag/ngdemo/demo_bucket/test_annotations/test_multiprops/spatial0/0_0_0'
coordinates = converted_points[0:1000]
total_count = len(coordinates)
cell_sizes = np.random.uniform(0,100,(1000,1))
cell_types = np.random.randint(0,32,(1000,1))
# combine the coordinates, cell sizes, and cell types into a single array
cell_array = np.hstack((coordinates,cell_sizes,cell_types))
with open(filename,'wb') as outfile:
    buf = struct.pack('<Q',total_count) # 64-bit little endian
    pt_buf = b''.join(struct.pack('<4fH2B',x,y,z,s,int(c),0,0) for (x,y,z,s,c) in cell_array) 
    buf += pt_buf
    id_buf = struct.pack('<%sQ' % len(coordinates), *range(len(coordinates)))
    buf += id_buf
    outfile.write(buf)
print(f"wrote {filename}")

wrote /home/ahoag/ngdemo/demo_bucket/test_annotations/test_multiprops/spatial0/0_0_0


In [88]:
# Now write the info file. Not clear what the order of the properties should be. 
# I think since we have two properties at different byte sizes then the order will 
# be figured out by order of byte size so the order in the info file doesn't actually 
# matter. Can try both orders and see what happens. 
info = {
  "@type": "neuroglancer_annotations_v1",
  "annotation_type": "POINT",
  "by_id": {
    "key": "by_id"
  },
  "dimensions": {
    "x": [
      "5e-06",
      "m"
    ],
    "y": [
      "5e-06",
      "m"
    ],
    "z": [
      "1e-05",
      "m"
    ]
  },
  "lower_bound": [
    0,
    0,
    0
  ],
  "properties": [
      {"id":"celltype",
      "type":"uint16"
      },
      {"id":"size",
      "type":"float32"
      }
  ],
  "relationships": [],
  "spatial": [
    {
      "chunk_size": [
        2160,
        2560,
        687
      ],
      "grid_shape": [
        1,
        1,
        1
      ],
      "key": "spatial0",
      "limit": 1
    }
  ],
  "upper_bound": [
    2160,
    2560,
    687
  ]
}

In [90]:
info_filename = '/home/ahoag/ngdemo/demo_bucket/test_annotations/test_multiprops/info'
with open(info_filename,'w') as outfile:
    json.dump(info,outfile,indent=2)

This order worked!!

If we had two float32 type properties, the order we would put them in the struct string would be the same as the order in which they appear in the info file. Let's try that

In [91]:
filename = '/home/ahoag/ngdemo/demo_bucket/test_annotations/test_multiprops/spatial0/0_0_0'
coordinates = converted_points[0:1000]
total_count = len(coordinates)
cell_sizes = np.random.uniform(0,100,(1000,1))
cell_stds = np.random.uniform(0,1,(1000,1))
cell_types = np.random.randint(0,32,(1000,1))
# combine the coordinates, cell sizes, cell stds, and cell types into a single array
cell_array = np.hstack((coordinates,cell_sizes,cell_stds,cell_types))
with open(filename,'wb') as outfile:
    buf = struct.pack('<Q',total_count) # 64-bit little endian
    pt_buf = b''.join(struct.pack('<5fH2B',x,y,z,s,std,int(c),0,0) for (x,y,z,s,std,c) in cell_array) 
    buf += pt_buf
    id_buf = struct.pack('<%sQ' % len(coordinates), *range(len(coordinates)))
    buf += id_buf
    outfile.write(buf)
print(f"wrote {filename}")

wrote /home/ahoag/ngdemo/demo_bucket/test_annotations/test_multiprops/spatial0/0_0_0


In [92]:
# Now write the info file. The two float32 properties (size, std) share the same
# alignment, so their relative order here must match their order in the struct
# format string. celltype (uint16) has a different alignment, so its position
# in this list does not matter.
info = {
  "@type": "neuroglancer_annotations_v1",
  "annotation_type": "POINT",
  "by_id": {
    "key": "by_id"
  },
  "dimensions": {
    "x": [
      "5e-06",
      "m"
    ],
    "y": [
      "5e-06",
      "m"
    ],
    "z": [
      "1e-05",
      "m"
    ]
  },
  "lower_bound": [
    0,
    0,
    0
  ],
  "properties": [
      {"id":"celltype",
      "type":"uint16"
      },
      {"id":"size",
      "type":"float32"
      },
      {"id":"std",
      "type":"float32"
      }
  ],
  "relationships": [],
  "spatial": [
    {
      "chunk_size": [
        2160,
        2560,
        687
      ],
      "grid_shape": [
        1,
        1,
        1
      ],
      "key": "spatial0",
      "limit": 1
    }
  ],
  "upper_bound": [
    2160,
    2560,
    687
  ]
}

In [93]:
info_filename = '/home/ahoag/ngdemo/demo_bucket/test_annotations/test_multiprops/info'
with open(info_filename,'w') as outfile:
    json.dump(info,outfile,indent=2)

That worked. It appears that the properties are limited to float and int types. However, there is some functionality in the `enum_values` key which might allow for strings.

## Adding string properties

```
"enum_values": Optional. If "type" is a numeric type (not "rgb" or "rgba"), this property may specify an array of values (compatible with the specified data type). These values correspond to the labels specified by "enum_labels", which are shown in the UI.
"enum_labels": Must be specified if, and only if, "enum_values" is specified. Must be an array of strings of the same length as "enum_values" specifying the corresponding labels for each value.
```
So I think this might offer a way to render the cell types as strings without encoding anything else. Let's try this with a new info file:

In [121]:
# Same info file as above, but with enum_values and enum_labels added to the
# celltype property so that each integer value is shown with a string label.
# The binary annotation file written earlier is reused unchanged.
info = {
  "@type": "neuroglancer_annotations_v1",
  "annotation_type": "POINT",
  "by_id": {
    "key": "by_id"
  },
  "dimensions": {
    "x": [
      "5e-06",
      "m"
    ],
    "y": [
      "5e-06",
      "m"
    ],
    "z": [
      "1e-05",
      "m"
    ]
  },
  "lower_bound": [
    0,
    0,
    0
  ],
  "properties": [
      {"id":"celltype",
      "type":"uint16",
       "enum_values":[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31],
       "enum_labels":['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', 'A', 'B', 'C', 'D', 'E', 'F']
      },
      {"id":"size",
      "type":"float32"
      },
      {"id":"std",
      "type":"float32"
      }
  ],
  "relationships": [],
  "spatial": [
    {
      "chunk_size": [
        2160,
        2560,
        687
      ],
      "grid_shape": [
        1,
        1,
        1
      ],
      "key": "spatial0",
      "limit": 1
    }
  ],
  "upper_bound": [
    2160,
    2560,
    687
  ]
}

In [122]:
info_filename = '/home/ahoag/ngdemo/demo_bucket/test_annotations/test_multiprops/info'
with open(info_filename,'w') as outfile:
    json.dump(info,outfile,indent=2)

This worked! It shows the label and the enum integer value in the properties panel.
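The two enum arrays above were typed out by hand. They could also be generated, which makes it harder to end up with arrays of different lengths. A minimal sketch that reproduces the same 32 values and labels:

```python
import string

enum_values = list(range(32))
# 26 lowercase letters followed by 'A' through 'F', as in the info file above
enum_labels = list(string.ascii_lowercase) + list(string.ascii_uppercase[:6])

celltype_property = {
    "id": "celltype",
    "type": "uint16",
    "enum_values": enum_values,
    "enum_labels": enum_labels,
}
print(len(enum_values), len(enum_labels), enum_labels[26:])
```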

Now let's explore the rgb properties. 

## Multiple properties -- cell type, cell size and cell color

The docs say that an rgb property is encoded as 3 uint8 values and rgba property as 4 uint8 values. What is the struct format code for a uint8 value? It is the `B` code (see: https://docs.python.org/2/library/struct.html and this comment: https://github.com/google/neuroglancer/issues/227#issuecomment-913152865). Since uint8 is the smallest of the byte sizes compared to the other properties, it goes last in the format string. Let's just say we want to encode rgba and skip rgb. To encode coordinates (float32), cell size (float32), cell type (uint16), and rgba (4x uint8), the format string should be:
```
b''.join(struct.pack('<4fH4B2B',x,y,z,s,int(c),int(r),int(g),int(b),int(a),0,0) for (x,y,z,s,c,r,g,b,a) in cell_array)
```
where the final `2B` are two padding bytes to make the sum of bytes divisible by 4 (I think I could have just used `6B` instead of `4B2B` but it would be less clear). The breakdown of byte sizes is:
```
4f = 4x4 = 16
H = 2
4B = 4x1 = 4
2B = 2x1 = 2
```
The sum of which is `16+2+4+2 = 24` which is divisible by 4.
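One way to confirm this layout is to pack a couple of entries and decode them again. The sketch below uses `struct.iter_unpack` with the `x` format code (a pad byte that `struct` skips on unpacking) in place of the trailing `2B`. The sample values are made up.

```python
import struct

write_fmt = '<4fH4B2B'  # as used when writing: padding passed explicitly as two zeros
read_fmt = '<4fH4B2x'   # same layout, but the two padding bytes are skipped when reading

# x, y, z, size, celltype, r, g, b, a (made-up sample values)
records = [(1.0, 2.0, 3.0, 4.5, 7, 10, 20, 30, 255),
           (4.0, 5.0, 6.0, 8.25, 31, 40, 50, 60, 255)]

data = b''.join(struct.pack(write_fmt, *rec, 0, 0) for rec in records)
decoded = list(struct.iter_unpack(read_fmt, data))

print(struct.calcsize(write_fmt), len(data))
print(decoded)
```

Each entry is 24 bytes, and the decoded tuples match what was written, with the padding dropped.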

In [132]:
filename = '/home/ahoag/ngdemo/demo_bucket/test_annotations/test_multiprops_color/spatial0/0_0_0'
coordinates = converted_points[0:1000]
total_count = len(coordinates)
cell_sizes = np.random.uniform(0,100,(1000,1))
cell_types = np.random.randint(0,32,(1000,1))
rgba = np.random.randint(0,256,(1000,4))
# combine the coordinates, cell sizes, cell types, and rgba colors into a single array
cell_array = np.hstack((coordinates,cell_sizes,cell_types,rgba))
with open(filename,'wb') as outfile:
    buf = struct.pack('<Q',total_count) # 64-bit little endian
    pt_buf = b''.join(struct.pack('<4fH4B2B',x,y,z,s,int(c),int(r),int(g),int(b),int(a),0,0) for (
        x,y,z,s,c,r,g,b,a) in cell_array) 
    buf += pt_buf
    id_buf = struct.pack('<%sQ' % len(coordinates), *range(len(coordinates)))
    buf += id_buf
    outfile.write(buf)
print(f"wrote {filename}")

wrote /home/ahoag/ngdemo/demo_bucket/test_annotations/test_multiprops_color/spatial0/0_0_0


Now the info file:

In [136]:
# The order of the properties here is up to us and is not related to the order in the struct format string
# That is because there are no properties of the same byte size. If there were, then the order
# in the struct format string would need to match the relative order of those properties in the info file
info = {
  "@type": "neuroglancer_annotations_v1",
  "annotation_type": "POINT",
  "by_id": {
    "key": "by_id"
  },
  "dimensions": {
    "x": [
      "5e-06",
      "m"
    ],
    "y": [
      "5e-06",
      "m"
    ],
    "z": [
      "1e-05",
      "m"
    ]
  },
  "lower_bound": [
    0,
    0,
    0
  ],
  "properties": [
      {"id":"celltype",
      "type":"uint16",
       "enum_values":[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31],
       "enum_labels":['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', 'A', 'B', 'C', 'D', 'E', 'F']
      },
      {"id":"size",
      "type":"float32"
      },
      {"id":"color",
      "type":"rgba"
      }
  ],
  "relationships": [],
  "spatial": [
    {
      "chunk_size": [
        2160,
        2560,
        687
      ],
      "grid_shape": [
        1,
        1,
        1
      ],
      "key": "spatial0",
      "limit": 1
    }
  ],
  "upper_bound": [
    2160,
    2560,
    687
  ]
}

In [137]:
info_filename = '/home/ahoag/ngdemo/demo_bucket/test_annotations/test_multiprops_color/info'
with open(info_filename,'w') as outfile:
    json.dump(info,outfile,indent=2)

That worked!