Writing to new Neuroglancer dataset in C++ #94

LarsKoeppel · 2023-04-18T10:04:50Z

I am trying to write to a new Neuroglancer dataset from C++.

For this I open the dataset with the following code. The info file is correctly generated and placed in the output folder.

auto contextSpec = tensorstore::Context::Spec::FromJson({{"cache_pool", {{"total_bytes_limit", 100000000}}}});
tensorstore::Context context = tensorstore::Context(*contextSpec);

auto store =
tensorstore::Open({{"driver", "neuroglancer_precomputed"},
                   {"kvstore", {{"driver", "file"},
                                {"path", destdir}}},
                   {"dtype", "uint8"},
                   {"schema",
                    {{"chunk_layout", {{"grid_origin", {0, 0, 0, 0}},
                                       {"inner_order", {3, 2, 1, 0}},
                                       {"read_chunk", {{"shape", {chunkSize[0], chunkSize[1], chunkSize[2], numChannels}}}},
                                       {"write_chunk", {{"shape", {chunkSize[0], chunkSize[1], chunkSize[2], numChannels}}}}
                      }},
                     {"codec", {{"driver", "neuroglancer_precomputed"}, {"encoding", "raw"}}},
                     {"dimension_units", {{3, "nm"}, {3, "nm"}, {1, "nm"}, {}}},
                     {"domain", {{"exclusive_max", {sizeX, sizeY, sizeZ, numChannels}},
                                 {"inclusive_min", {0, 0, 0, 0}},
                                 {"labels", {"x", "y", "z", "channel"}}
                      }}}
                   }},
                  context,
                  tensorstore::OpenMode::create,
                  tensorstore::RecheckCached{false},
                  tensorstore::ReadWriteMode::write).result();

After this I prepare an array and write it to the dataset with the code below. However, nothing is written to the file system, and the result has the status kMovedFromString: "Status accessed after move."

std::vector<int64_t> shape = {numChannels, chunkSize[2], chunkSize[1], chunkSize[0]};
auto arr = tensorstore::Array(dataPrint.data(), shape, tensorstore::c_order); 

auto writeFuture = tensorstore::Write(tensorstore::UnownedToShared(arr), store);
auto result = writeFuture.result();

What am I missing?


LarsKoeppel · 2023-04-18T14:22:33Z

I managed to write some data. I still have to check whether it is correct or whether the order is messed up, but this is how far I have come.

auto interval1 = tensorstore::Dims(0).HalfOpenInterval(start1, end1);
auto interval2 = tensorstore::Dims(1).HalfOpenInterval(start2, end2);
auto interval3 = tensorstore::Dims(2).HalfOpenInterval(start3, end3);

std::vector<int64_t> shape = {chunkSize[0], chunkSize[1], chunkSize[2], numChannels};
auto arr = tensorstore::Array(dataPrint.data(), shape, tensorstore::fortran_order); 

auto writeFuture = tensorstore::Write(tensorstore::UnownedToShared(arr), store | interval1 | interval2 | interval3);
writeFuture.commit_future.value();

auto result = writeFuture.result();
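
To check whether the order is messed up, it may help to compare the strides that the two layouts imply. The sketch below uses only the standard library, not TensorStore. The helper names COrderStrides, FortranOrderStrides and ElementOffset are hypothetical, and strides are counted in elements rather than bytes. It illustrates that a shape of {x, y, z, channel} in Fortran order puts x fastest in memory, which is the same memory layout as {channel, z, y, x} in C order, although the dimension order seen by the store differs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Element strides for a row-major (C order) layout:
// the last dimension varies fastest in memory.
std::vector<int64_t> COrderStrides(const std::vector<int64_t>& shape) {
  std::vector<int64_t> strides(shape.size());
  int64_t stride = 1;
  for (std::size_t i = shape.size(); i-- > 0;) {
    strides[i] = stride;
    stride *= shape[i];
  }
  return strides;
}

// Element strides for a column-major (Fortran order) layout:
// the first dimension varies fastest in memory.
std::vector<int64_t> FortranOrderStrides(const std::vector<int64_t>& shape) {
  std::vector<int64_t> strides(shape.size());
  int64_t stride = 1;
  for (std::size_t i = 0; i < shape.size(); ++i) {
    strides[i] = stride;
    stride *= shape[i];
  }
  return strides;
}

// Offset (in elements) of a multi-dimensional index within the flat buffer.
int64_t ElementOffset(const std::vector<int64_t>& index,
                      const std::vector<int64_t>& strides) {
  assert(index.size() == strides.size());
  int64_t offset = 0;
  for (std::size_t i = 0; i < index.size(); ++i) {
    offset += index[i] * strides[i];
  }
  return offset;
}
```

For a shape of {4, 3, 2}, the C order strides are {6, 2, 1} and the Fortran order strides are {1, 4, 12}. Stepping by one along the first dimension in Fortran order, or along the last dimension in C order, moves to the next element of the buffer.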

laramiel · 2023-04-19T02:47:49Z

It's hard to know exactly what you want, but there are a few examples which may help you. At some point we may try to make them more prominent and expand them. See also, for example, the small benchmark we use for quick read/write tests, or the individual tests used in the drivers:

You might try using a somewhat reduced spec to start with. neuroglancer_precomputed should have some reasonable defaults:

Off the cuff (so, not at all tested), you want something like:

// Open or create a tensorstore.
TENSORSTORE_CHECK_OK_AND_ASSIGN(
    auto store,
    tensorstore::Open(
        {
            {"driver", "neuroglancer_precomputed"},
            {"kvstore",
             {
                 {"driver", "memory"},
                 {"path", "prefix/"},
             }},
            {"multiscale_metadata",
             {
                 {"data_type", "uint8"},
                 {"num_channels", 4},
                 {"type", "image"},
             }},
            {"scale_metadata",
             {
                 {"resolution", {1, 1, 1}},
                 {"encoding", "raw"},
                 {"chunk_size", {16, 16, 16}},
                 {"size", {64, 64, 64}},
             }},
        },
        tensorstore::OpenMode::open | tensorstore::OpenMode::create,
        tensorstore::ReadWriteMode::read_write)
        .result());

// shared_array is an array of uint8_t with dimensions {2,3,2,2}.
auto shared_array = tensorstore::MakeArray<std::uint8_t>(
    {{{{0x71, 0x72}, {0x81, 0x82}},
      {{0x91, 0x99}, {0xa1, 0xa2}},
      {{0xb1, 0xb2}, {0xc1, 0xc2}}},
     {{{0x11, 0x12}, {0x21, 0x22}},
      {{0x31, 0x32}, {0x41, 0x42}},
      {{0x51, 0x52}, {0x61, 0x62}}}});

// Write the array to the region starting at {0,0,0,0}.
TENSORSTORE_CHECK_OK(
    tensorstore::Write(shared_array,
                       store | tensorstore::AllDims().SizedInterval(
                                   {0, 0, 0, 0}, {2, 3, 2, 2}))
        .commit_future.result());

LarsKoeppel · 2023-04-20T09:23:13Z

Thank you very much for the provided information. It was really helpful.

Now I want to expand on this example and use sharding. For this I found this issue: #13
It is mentioned there that work is being done on automatically determining the sharding parameters.
Is this already available? How can it be used?

Or should I calculate the parameters myself and pass them to tensorstore::Open, as in line 1281 of the driver_test?

jbms · 2023-05-18T19:34:44Z

Yes, there is now support for that.

If you want to specify the shard size explicitly, you can specify a write_chunk shape in the schema that is a multiple of the read chunk shape, and tensorstore will determine the corresponding sharding parameters automatically. Note, however, that the neuroglancer precomputed sharded format unfortunately has some limitations on the possible write chunk shapes, due to how shards are represented in terms of the Morton code. For example, if your read chunk shape is (64, 64, 64, numChannels), you can make the write_chunk shape (128, 128, 128, numChannels), but not (64, 128, 64, numChannels). Since it may be tricky to figure out exactly which write chunk shapes are valid, if you don't care precisely what the shape is, you can specify it in the schema as a soft constraint.
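
To give a rough idea of where the limitation comes from, here is a minimal standalone sketch of the compressed Morton code as described in the Neuroglancer precomputed sharded format specification. It is not TensorStore's implementation, and the names BitsForGridSize, CompressedMortonCode and BlockShapeForLowBits are hypothetical. It also assumes, as a simplification, that a shard groups the chunks whose codes differ only in their lowest bits (as with the identity hash), and that the chunk grid has at least two chunks along each dimension.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Number of bits needed to address grid positions in [0, grid_size)
// along one dimension. Assumes grid_size <= 2^63.
int BitsForGridSize(uint64_t grid_size) {
  int bits = 0;
  while ((uint64_t{1} << bits) < grid_size) ++bits;
  return bits;
}

// Interleaves the bits of the x, y, z chunk grid position, lowest bit first,
// skipping dimensions that have no bits left. Assumes at most 64 bits total.
uint64_t CompressedMortonCode(const std::array<uint64_t, 3>& pos,
                              const std::array<int, 3>& bits) {
  uint64_t code = 0;
  int out_bit = 0;
  for (int j = 0; j < 64; ++j) {
    for (int dim = 0; dim < 3; ++dim) {
      if (j < bits[dim]) {
        assert(out_bit < 64);
        code |= ((pos[dim] >> j) & uint64_t{1}) << out_bit;
        ++out_bit;
      }
    }
  }
  return code;
}

// Extent, in chunks, of the block of grid positions whose codes differ
// only in the lowest `low_bits` bits.
std::array<uint64_t, 3> BlockShapeForLowBits(const std::array<int, 3>& bits,
                                             int low_bits) {
  std::array<uint64_t, 3> shape = {1, 1, 1};
  int used = 0;
  for (int j = 0; j < 64 && used < low_bits; ++j) {
    for (int dim = 0; dim < 3 && used < low_bits; ++dim) {
      if (j < bits[dim]) {
        shape[dim] *= 2;  // this code bit doubles the extent along `dim`
        ++used;
      }
    }
  }
  return shape;
}
```

Under these assumptions, 1, 2 and 3 low bits give blocks of (2, 1, 1), (2, 2, 1) and (2, 2, 2) chunks. With a 64-voxel read chunk, that corresponds to (128, 64, 64), (128, 128, 64) and (128, 128, 128) voxels. A block of (1, 2, 1) chunks, that is (64, 128, 64) voxels, never appears, because the x bit always comes before the y bit.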
