Writing to new Neuroglancer dataset in C++ #94

Open
LarsKoeppel opened this issue Apr 18, 2023 · 4 comments

Comments

@LarsKoeppel

I am trying to write to a new Neuroglancer dataset from C++.

To do this, I open the dataset with the following code. The info file is correctly generated and placed in the output folder.

auto contextSpec = tensorstore::Context::Spec::FromJson({{"cache_pool", {{"total_bytes_limit", 100000000}}}});
tensorstore::Context context = tensorstore::Context(*contextSpec);

auto store =
tensorstore::Open({{"driver", "neuroglancer_precomputed"},
                   {"kvstore", {{"driver", "file"},
                                {"path", destdir}}},
                   {"dtype", "uint8"},
                   {"schema",
                    {{"chunk_layout", {{"grid_origin", {0, 0, 0, 0}},
                                       {"inner_order", {3, 2, 1, 0}},
                                       {"read_chunk", {{"shape", {chunkSize[0], chunkSize[1], chunkSize[2], numChannels}}}},
                                       {"write_chunk", {{"shape", {chunkSize[0], chunkSize[1], chunkSize[2], numChannels}}}}
                      }},
                     {"codec", {{"driver", "neuroglancer_precomputed"}, {"encoding", "raw"}}},
                     {"dimension_units", {{3, "nm"}, {3, "nm"}, {1, "nm"}, {}}},
                     {"domain", {{"exclusive_max", {sizeX, sizeY, sizeZ, numChannels}},
                                 {"inclusive_min", {0, 0, 0, 0}},
                                 {"labels", {"x", "y", "z", "channel"}}
                      }}}
                   }},
                  context,
                  tensorstore::OpenMode::create,
                  tensorstore::RecheckCached{false},
                  tensorstore::ReadWriteMode::write).result();

After this, I prepare an array and write it to the dataset with the code below. However, nothing is written to the file system, and the result has the status kMovedFromString: "Status accessed after move."

std::vector<int64_t> shape = {numChannels, chunkSize[2], chunkSize[1], chunkSize[0]};
auto arr = tensorstore::Array(dataPrint.data(), shape, tensorstore::c_order); 

auto writeFuture = tensorstore::Write(tensorstore::UnownedToShared(arr), store);
auto result = writeFuture.result();

What am I missing?

@LarsKoeppel
Author

LarsKoeppel commented Apr 18, 2023

I managed to write some data. I still have to check whether it is correct or whether the order is messed up, but this is how far I have come.

auto interval1 = tensorstore::Dims(0).HalfOpenInterval(start1, end1);
auto interval2 = tensorstore::Dims(1).HalfOpenInterval(start2, end2);
auto interval3 = tensorstore::Dims(2).HalfOpenInterval(start3, end3);

std::vector<int64_t> shape = {chunkSize[0], chunkSize[1], chunkSize[2], numChannels};
auto arr = tensorstore::Array(dataPrint.data(), shape, tensorstore::fortran_order); 

auto writeFuture = tensorstore::Write(tensorstore::UnownedToShared(arr), store | interval1 | interval2 | interval3);
writeFuture.commit_future.value();

auto result = writeFuture.result();

@laramiel
Collaborator

laramiel commented Apr 19, 2023

It's hard to know exactly what you want, but there are a few examples which may help you. At some point we may try to make them more prominent and expand them. See also, for example, the small benchmark we use for quick read/write tests, or the individual tests used in the drivers.

You might try using a somewhat reduced spec to start with. neuroglancer_precomputed should have some reasonable defaults.

Off the cuff (so, not at all tested), you want something like:

// Open or create a tensorstore.
TENSORSTORE_CHECK_OK_AND_ASSIGN(
    auto store,
    tensorstore::Open(
        {
            {"driver", "neuroglancer_precomputed"},
            {"kvstore",
             {
                 {"driver", "memory"},
                 {"path", "prefix/"},
             }},
            {"multiscale_metadata",
             {
                 {"data_type", "uint8"},
                 {"num_channels", 4},
                 {"type", "image"},
             }},
            {"scale_metadata",
             {
                 {"resolution", {1, 1, 1}},
                 {"encoding", "raw"},
                 {"chunk_size", {16, 16, 16}},
                 {"size", {64, 64, 64}},
             }},
        },
        tensorstore::OpenMode::open | tensorstore::OpenMode::create,
        tensorstore::ReadWriteMode::read_write)
        .result());

// shared_array is an array of uint8_t with dimensions {2,3,2,2}.
auto shared_array = tensorstore::MakeArray<std::uint8_t>(
    {{{{0x71, 0x72}, {0x81, 0x82}},
      {{0x91, 0x99}, {0xa1, 0xa2}},
      {{0xb1, 0xb2}, {0xc1, 0xc2}}},
     {{{0x11, 0x12}, {0x21, 0x22}},
      {{0x31, 0x32}, {0x41, 0x42}},
      {{0x51, 0x52}, {0x61, 0x62}}}});

// Write the array to {0,0,0,0}.
TENSORSTORE_CHECK_OK(
    tensorstore::Write(shared_array,
                       store | tensorstore::AllDims().SizedInterval(
                                   {0, 0, 0, 0}, {2, 3, 2, 2}))
        .commit_future.result());

@LarsKoeppel
Author

Thank you very much for the provided information. It was really helpful.

Now I want to expand on this example and use sharding. For this I found this issue: #13
It mentions that work is being done on automatically determining the sharding parameters.
Is this already available? How can it be used?

Or should I calculate the parameters myself and pass them to tensorstore::Open, as in line 1281 of the driver_test?

@jbms
Collaborator

jbms commented May 18, 2023

Yes, there is now support for that.

If you want to specify the shard size explicitly, you can specify a write_chunk shape in the schema that is a multiple of the read chunk shape, and tensorstore will determine the corresponding sharding parameters automatically. Note, however, that the neuroglancer precomputed sharded format unfortunately has some limitations on the possible write chunk shapes, due to how shards are represented in terms of the Morton code. For example, if your read chunk shape is (64, 64, 64, numChannels), you can make the write_chunk shape (128, 128, 128, numChannels), but not (64, 128, 64, numChannels). Since it may be tricky to figure out exactly which write chunk shapes are valid, you can specify the shape in the schema as a soft constraint if you don't care precisely what it is.


3 participants