Boolean Mask ROI Tagging for Images #290
Comments
Hi @adityachivu, I am sorry to say that nix does not currently support tagging with ROI masks. Storing this information is easy, but it cannot be used for tagging and automated data retrieval in the way the Tag and MultiTag entities support. We are discussing changing the linking mechanism and have also discussed exactly the use case you describe.
Thank you for your prompt response! I would love to have your suggestion on how to do the above. I was thinking of using a unique block for each 3D image and storing the boolean mask as a separate DataArray of the same dimensionality as my image. My HDF structure would then consist of multiple blocks, where the block index corresponds to the timestamp and each block contains two DataArrays, one with the actual data and the other with the ROI mask. What could be a better alternative? An immediate drawback I see is that different "sessions" cannot be distinguished at the block level of the hierarchy. Thank you for your time!
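The layout proposed above (an image array plus a boolean mask of the same shape) can be sketched with plain NumPy. This is only an illustration of the masking idea and does not use the NIX API; the array names and the tiny shape are made up for the example.

```python
import numpy as np

# Hypothetical small stand-in for one 3D image: (num_slices, height, width).
image = np.arange(2 * 4 * 5, dtype=np.uint16).reshape(2, 4, 5)

# Boolean ROI mask with the same shape as the image.
roi_mask = np.zeros(image.shape, dtype=bool)
roi_mask[:, 1:3, 2:4] = True  # mark a 2 x 2 region in every slice

# Boolean indexing returns the ROI voxels as a flat 1-D array.
roi_values = image[roi_mask]

# Statistics restricted to the ROI.
roi_mean = roi_values.mean()
```

Storing `roi_mask` next to `image` keeps the ROI information available, but, as noted in the reply above, applying it remains a manual step during analysis.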
Hi @adityachivu, that would be one option. Does your ROI mask change over time? If it does not, you could do it a bit more efficiently. Am I correct to assume that the first three dimensions are the x-, y- and z-extent? Would the fourth dimension be time? Or do you also have a number of color channels? That would make it 5-D, right?
Given this, you have all the data in one
You could then retrieve the tagged data from the tag during analysis:
If you want to keep order in your file and have lots of entities, or at least want to make it easy to find those entities that belong together, you may want to use

I hope this helps. If you need more information, let me know. Also feel free to post questions in our IRC channel gnode at freenode.
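If the ROI mask does not change over time, as asked above, a single 3D mask can be shared by all time steps. The following is a minimal NumPy sketch of that idea, not the NIX tagging mechanism; the array names and the tiny shape are invented for illustration.

```python
import numpy as np

# Hypothetical tiny stack: (num_images, num_slices, height, width).
stack = np.ones((3, 2, 4, 5), dtype=np.uint16)
stack[1] *= 2
stack[2] *= 3

# One 3D mask shared by all time steps.
mask = np.zeros(stack.shape[1:], dtype=bool)
mask[:, 1:3, 2:4] = True

# Indexing the last three axes with the mask yields (num_images, n_roi_voxels).
roi_over_time = stack[:, mask]

# Mean ROI intensity for each time step.
mean_per_step = roi_over_time.mean(axis=1)
```

Only one mask array needs to be stored in this case, instead of one per 3D image.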
Hi @jgrewe. I was able to convert our data to NIX after reading your detailed explanation. Thank you very much for it. The data is 4D, i.e. I have slices of 2D b/w images (single channel) over time. I have now defined a generator that sequentially reads each 3D image, i.e. one time step. I found this surprisingly slow and wanted to know whether I am doing something incorrectly:

    # data.shape = [num_images, num_slices, height, width]
    def img_generator(data_arr):
        shape = data_arr.shape  # [1500, 20, 1628, 1292], dtype = np.uint16
        for i in range(shape[0]):
            # Read one 3D image (a single time step) from the file.
            yield data_arr[i, :, :, :]

    # nix_file is an already opened NIX file.
    data_arr = nix_file.blocks['test'].data_arrays['test_data']
    img_gen = img_generator(data_arr)
    for img in img_gen:
        pass
    # Total time: 21m
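For orientation, the shape, dtype and total time quoted above can be turned into a rough data volume and average read rate. This is simple arithmetic on the uncompressed in-memory size; the size on disk may differ, for example with compression enabled.

```python
# Shape and dtype as given in the comment above.
num_images, num_slices, height, width = 1500, 20, 1628, 1292
bytes_per_value = 2  # np.uint16

bytes_per_image = num_slices * height * width * bytes_per_value
total_bytes = num_images * bytes_per_image

read_seconds = 21 * 60  # reported total read time
throughput_mb_s = total_bytes / read_seconds / 1e6

print(f"one 3D image: {bytes_per_image / 1e6:.1f} MB")    # 84.1 MB
print(f"whole dataset: {total_bytes / 1e9:.1f} GB")        # 126.2 GB
print(f"average read rate: {throughput_mb_s:.0f} MB/s")    # 100 MB/s
```

An average of roughly 100 MB/s suggests that the 21 minutes are largely explained by the sheer amount of data rather than by the generator itself.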
Hi @adityachivu, I do not think you are doing anything wrong. This is a huge dataset. I just experimented with creating data of that size in C++, with and without storing it in nix.

    #include <cstdint>
    #include <boost/multi_array.hpp>

    int main() {
        typedef boost::multi_array<int16_t, 4> array_type;
        typedef array_type::index index;
        // One 3D image: 20 slices of 1628 x 1292 values.
        array_type data(boost::extents[1][20][1628][1292]);
        // Fill the array ten times, each time with a different value.
        for (int16_t value = 0; value < 10; ++value) {
            for (index i = 0; i < 20; ++i)
                for (index j = 0; j < 1628; ++j)
                    for (index k = 0; k < 1292; ++k)
                        data[0][i][j][k] = value;
        }
        return 0;
    }

This runs for about 50 seconds on my laptop. Writing it to nix adds a few seconds, and switching on compression adds another 4 s. I have not tried reading so far. Would you mind doing a similar test in Python, just creating a few numpy.ones() arrays with that shape?
Well, numpy.ones is much faster; using random values takes about 15 s. I guess this needs some research...
    img = data[0, 0, 0, 0]
    img = data[0, 0, 0, :]
    img = data[0, 0, :, 0]
    img = data[0, 0, :, :]
    img = data[0, :, 0, 0]  # Error
    img = data[0, :, 0, :]  # Error
    img = data[0, :, :, 0]
    img = data[0, :, :, :]
    img = data[:, 0, 0, 0]  # Error
    img = data[:, 0, 0, :]  # Error
    img = data[:, 0, :, :]  # Too large, didn't try
    img = data[:, :, 0, 0]  # TypeError: Can't broadcast (1500, 20, 1, 1) -> (1500, 20, 1)

The error traceback is the following
They are not mutually exclusive. When opening the file you can specify which backend to use. From the stack trace it is obvious that you are using h5py directly, not the nix C++ library. Do I understand correctly that your NIX vs. NumPy test reads from file in the NIX case and creates the data in memory in the NumPy case? If so, then the file I/O is not too bad...
Sorry for being unclear. I compared only read times between NumPy memory maps and NIX (19m vs. 21m respectively in my case). I haven't compared write times, but with NIX it takes around 65 seconds to write each image to file, about 1600 minutes in total. This is just for your information. It seems that read times are optimal. Further, could you please tell me how to specify the backend? I wasn't able to find out how. Also, would you like me to open a separate issue for the read errors, as they are unrelated to the original issue? Thank you very much for your prompt responses!
Hello @adityachivu. You can specify a backend when opening a file:

    f = nix.File.open('ImageStack.nix', nix.FileMode.Overwrite, backend='h5py')  # or backend='hdf5'

That would be great, thanks. If you could include a minimal working example to help me investigate the issue, I'd greatly appreciate it.
The write times are indeed very long. Would you mind pasting the code? As I said, in the C++ example above, writing the 10 x 20 x 1628 x 1292 values takes roughly one minute including compression. Of these 60 s, most of the time is spent creating the dataset. So it should end up at about 15 minutes in total, assuming linearity. One issue might be the chunking of the data. Depending on how you created the DataArray, the chunk size will be different. For example, if you initialize the DataArray with an initial size of (1, 1, 1, 1), then writing a single stack of shape (1, 20, 1628, 1292) will require a huge number of I/O operations.
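To get a feeling for the chunking effect described above, the number of chunks touched by one write can be estimated with a little arithmetic. This sketch assumes that the chunk shape equals the initial DataArray size and that the write starts on a chunk boundary; `chunks_touched` is a hypothetical helper, and the per-slice chunk shape is an illustrative alternative, not a NIX default.

```python
import math

def chunks_touched(write_shape, chunk_shape):
    """Number of chunks covered by a write that starts on a chunk boundary."""
    return math.prod(math.ceil(w / c) for w, c in zip(write_shape, chunk_shape))

stack_shape = (1, 20, 1628, 1292)

# Chunk shape equal to the initial size mentioned above.
tiny = chunks_touched(stack_shape, (1, 1, 1, 1))

# Hypothetical alternative: one chunk per 2D slice.
per_slice = chunks_touched(stack_shape, (1, 1, 1628, 1292))

print(tiny)       # 42067520
print(per_slice)  # 20
```

Under these assumptions a single stack is spread over about 42 million chunks in the first case, compared with 20 in the second, which would be consistent with write times of the order reported in this thread.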
Hopefully I read the documentation thoroughly enough, but I was wondering if nix supports boolean mask ROI tagging for images? Thanks!