Large files issue #505
Comments
What chunk size did you use? This will control how many files you create. Also, a code snippet of your info file definition in particular would help debug what was going on. |
Interesting, maybe I should try 1024 instead? I used chunk_size=[256, 256, 1] for chunking; the code snippet for the info file definition is quoted in the reply below.
|
(7370/256)*(8768/256)*(1621/1) = 1,598,347 chunks.
do you want single z sections to be the chunks?
Right now you have ~1MB chunk files (256*256*16 bits)/(1024*1024 bits/MB)
You could probably get away with 5 MB chunks.
128x128x16 would be ~400K chunks.
…On Wed, Nov 3, 2021 at 6:00 PM manoaman ***@***.***> wrote:
Interesting, maybe I should try 1024 instead? I used chunk_size=[256,
256, 1] for chunking. The following is the code snippet for the info file
definition.
info = CloudVolume.create_new_info(
    num_channels=1,
    layer_type='image',  # 'image' or 'segmentation'
    data_type='uint16',  # can pick any popular uint
    encoding='raw',  # other options: 'jpeg', 'compressed_segmentation' (req. uint32 or uint64)
    resolution=[4000, 4000, 4000],  # X,Y,Z values in nanometers
    voxel_offset=[0, 0, 0],  # values X,Y,Z values in voxels
    chunk_size=[256, 256, 1],  # rechunk of image X,Y,Z in voxels
    volume_size=[7370, 8768, 1621]  # X,Y,Z size in voxels
)
|
I overall think Forrest has a good suggestion, but I think bits and bytes might be mixed up. 512x512x1 chunks would give you a 4x reduction in the number of files and would be 512x512x1 x 2 bytes = 512 KiB each without compression. Chunking in Z can give better performance while scrolling, though the initial upload is a little more complex to do. If you think you'd like to compact the number of files even further (it wasn't clear to me whether the file quota was per folder or for your whole account), after you upload you can use Igneous to transfer to the sharded format. |
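For anyone following along, here is a small sketch of the arithmetic behind these file-count and chunk-size estimates. The volume dimensions and uint16 dtype come from the info file quoted above; the helper itself is purely illustrative:
import math

volume_size = (7370, 8768, 1621)   # X, Y, Z voxels, from the info file above
bytes_per_voxel = 2                # uint16

def chunk_stats(chunk_size):
    # One file is written per chunk along each axis (rounded up at the edges).
    counts = [math.ceil(v / c) for v, c in zip(volume_size, chunk_size)]
    n_files = counts[0] * counts[1] * counts[2]
    # Uncompressed size of one full chunk file.
    chunk_kib = chunk_size[0] * chunk_size[1] * chunk_size[2] * bytes_per_voxel / 1024
    return n_files, chunk_kib

for cs in [(256, 256, 1), (512, 512, 1), (128, 128, 16), (256, 256, 16)]:
    n_files, chunk_kib = chunk_stats(cs)
    print(f"{cs}: ~{n_files:,} files, {chunk_kib:.0f} KiB per uncompressed chunk")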
Thank you @fcollman !! Initially, with a 256x256 chunk size, CloudVolume exited, so the quota was against a single folder. (I think the failure has to do with the storage's upper bound on how many files can be created.)
I intend to do the chunking in Z with Igneous. Can you tell me a little bit more about how to do that? Thank you all!! |
Glad we were able to help! The sharded format is a method for storing many chunks in a single file while still retaining random access to individual chunks. There's a slight performance penalty, but CloudVolume can read them just like the regular chunked format. You won't be able to write the sharded format easily without specialized knowledge except through Igneous (so no patching missing tiles). As an example of how to use Igneous to generate the sharded version: the QUEUE variable is either an AWS sqs:// queue or a file folder that will be populated with queue files. You can read more here.
Make sure you have the latest Igneous version as there was a bug fix in the last update. I tried to make sure that the shard generation takes a reasonable amount of RAM by sizing the files appropriately. The default uncompressed target size is 3.5GB each (it could use up to 2x that; the generated shard will be smaller due to compression). You can see more options for the transfer in the CLI --help output. One other thing to keep in mind is that downsampling sharded volumes generates only one additional level of hierarchy at a time. This can introduce a small integer truncation error per level. The regular downsampling method avoids this issue for 5 mips at a time. This is because generating multiple sharded levels at a time would use unreasonable amounts of memory. You can read more about sharding here: https://github.com/seung-lab/cloud-volume/wiki/Sharding:-Reducing-Load-on-the-Filesystem |
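As a small illustration of the point above that CloudVolume reads sharded and unsharded layers the same way, here is a minimal sketch; the layer path is hypothetical:
from cloudvolume import CloudVolume

# Hypothetical local precomputed layer; gs:// or s3:// paths work the same way.
vol = CloudVolume("file:///data/my_dataset/image", mip=0, progress=True)

# The slicing API is identical whether the layer on disk is sharded or unsharded;
# CloudVolume figures out which shard (or which chunk files) to read.
cutout = vol[0:512, 0:512, 100:116]
print(cutout.shape, cutout.dtype)   # (512, 512, 16, 1) for a single-channel volume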
Oh cool, I did not know Igneous was available from pip install. I've given it a try with the CLI but ran into an error.
|
Hi m, The pip install / CLI version of Igneous is newer, so not everyone has learned about it yet. I'm glad you find it convenient! Can you provide a more complete command? It's a little hard to debug without seeing the path that triggered the error. |
Oops, sorry about that. I've given it several tries after working through the protocol and format warnings, and this is what I'm seeing so far. The CLI command is something like the following:
source_dir contains the files chunked in Z.
Trying to follow this rule.
|
You can write simply:
|
Okay. I tried different combinations of FORMAT and PROTOCOL prefixes, and also without the prefixes. It turns out I had to explicitly specify them. The following command seemed to run okay.
Waiting on the queued tasks to finish.
|
That is fantastic! Just FYI, you can monitor queue progress with the ptq status command (ptq = Python Task Queue). |
The process still seems to be running and here is what I see from ptq status. I'll give it some time and check back later. Looks completed from the status? |
|
It's done! It doesn't automatically exit.
…On Sat, Nov 6, 2021, 9:15 PM manoaman ***@***.***> wrote:
The process still seems to be running and here is what I see from ptq
status .... I'll try and give some time to check back later. Looks
completed from the status?
Inserted: 140
Enqueued: 0 (0.0% left)
Completed: 140 (100.0%)
Leased: 0 (--%) of queue
|
Hi Will, I tested it out this morning in Neuroglancer and the sharded file format loads great. I do see the reduction in the number of files generated and in the total size the folder takes up on the storage (144 GB to 106 GB, 413,193 files to 141 files). This is nice.
One thing I realized is that the chunks (loading tiles) in Neuroglancer seem to have gotten much smaller. Is this because of the chunk size I specified? This is the Python script where I generate the precomputed chunks in Z.
And after that, the Igneous CLI command.
|
Hi m, Yep, the chunk size parameter is what controls the size of the tiles. Unfortunately, you'll need to generate a new shard layer; the existing one can't be modified in place. You can do this either from the original tiles or from the existing shards as a source. |
Okay. Let me try increasing the chunk size to 512,512,64 to see the change. It seems to be taking longer this time, so I will have to see how the results come out. 😁 This is a bit off topic: before getting the image stack into CloudVolume/Igneous, I had a tough time splitting a multi-page TIFF about 245GB in size. It took about 900GB~1TB of memory allocated on the high performance computing node, using ImageMagick for the splitting. (Anything smaller in memory resulted in "out of memory" errors.) Is this typical when handling large files before even getting to the chunking stage? |
Those are pretty big chunks (33 MB) so your Neuroglancer loading may become pretty slow. Might I suggest something closer to 256 x 256 x 16 (2 MB), 256 x 256 x 32 (4 MB), or 512 x 512 x 16 (8 MB)? I'll admit I haven't worked with very large single TIFF files myself; usually the files are split into single image slices. However, if you find a good TIFF library and use the right features from it, it should be very possible to work with the stack slice by slice instead of reading the whole thing into memory at once. You might have some luck perusing the documentation for the tifffile python package. If the images are not compressed, it seems you can read them as a memory mapped file: cgohlke/tifffile#52 If the package doesn't have what you need, its documentation also links to a number of other scientific TIFF packages that might have what you want. |
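To make the slice-by-slice idea concrete, here is a rough sketch using the tifffile package mentioned above together with CloudVolume. The paths are hypothetical, and it assumes the precomputed layer already has an info file with a Z chunk size of 1 (as in the uploads earlier in this thread) so that single-slice writes are chunk aligned:
import numpy as np
import tifffile
from cloudvolume import CloudVolume

tiff_path = "big_stack.tif"                     # hypothetical multi-page TIFF
layer_path = "file:///data/precomputed/image"   # hypothetical precomputed layer

vol = CloudVolume(layer_path, mip=0)

with tifffile.TiffFile(tiff_path) as tif:
    for z, page in enumerate(tif.pages):
        plane = page.asarray()   # reads only this slice, shape (rows=Y, cols=X)
        plane = plane.T          # reorder to CloudVolume's (X, Y) convention
        # Upload one Z slice at a time (dtype must match the layer's data_type).
        vol[:, :, z : z + 1] = plane[:, :, np.newaxis]

# For uncompressed TIFFs, tifffile.memmap(tiff_path) can map the whole stack
# without loading it into RAM (see the linked tifffile issue).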
Agh, thank you for reminding me about that. I should have considered the size of the chunks... From your experience, do you suggest each chunk be somewhere between 2 MB and 8 MB in size? (In fact, I was getting an error with 512,512,64 so I ended up using 512,512,16 instead.) It does seem like reading slice by slice would be a better approach for tackling large files. More z-stacks will bring more challenges for sure. 😱 Thank you again for suggesting the tifffile package! |
I think it depends a lot on the expected storage technology and internet connection. I think somewhere around 500 KiB to 2 MiB is a good range if you have gigabit. To cover an XY plane, you'll need to download at least dozens of chunks, so fully consuming your bandwidth with a few chunks isn't ideal. You can go higher; it's just that the latency will become more noticeable. It's also important not to go too thick in Z, as that will increase latency somewhat uselessly. If you push that too far, Neuroglancer will limit the number of chunks downloaded because too much memory would be used by non-visible depth. Everything is chunking, from the bottom of computer architecture to high-minded stuff like petascale volumes. 😁 Hope the package is helpful! |
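A quick back-of-the-envelope illustration of the latency reasoning above (my own numbers, ignoring request overhead and compression):
# Approximate time to pull a single chunk over a ~1 Gbit/s connection.
GIGABIT_BYTES_PER_S = 1_000_000_000 / 8   # ~125 MB/s

for chunk_mib in (0.5, 2, 8, 33):
    seconds = chunk_mib * 2**20 / GIGABIT_BYTES_PER_S
    print(f"{chunk_mib:>4} MiB chunk: ~{seconds * 1000:.0f} ms each at gigabit")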
Hi Will, Sorry to bother you again with continuous questions. If the shard format is going to be used in the first place, would it still matter what chunk sizes I specify in the pre-chunking stages with CloudVolume and Igneous (create_transfer_tasks)? Will the final sharded output depend on them? Thanks! |
The final transfer command that creates the shards will also use whatever chunk size you specify. The previous chunk sizes are irrelevant, so you should pick them to be convenient for the initial uploading. |
Hi @william-silversmith , It's been a while, but I should have asked this question to begin with. What is the largest file size a 3D volumetric image can be before processing in CloudVolume for practical Neuroglancer viewing? What I mean by "practical" here is that chunks are fully loaded in the browser without hitting the RAM limit (no black tiles in the display). In this example, the 3D volumetric image (TIFF) was 245GB in size. Is that too large to begin with? |
I see Jeremy answered your question in the linked discussion and I agree with him. Make sure to downsample your volume after uploading the initial set of tiles (pick a chunk size like 128x128x64). If you are still having problems visualizing the data, run downsampling again using the top mip level that was generated in the last step. This will build an even taller image pyramid. Once the pyramid is sufficiently tall, you will have no problems at all. |
Hi @william-silversmith , what do you mean by running downsampling again, in CloudVolume/Igneous terms? Are you referring to DownsampleTask? https://github.com/seung-lab/igneous#downsampling-downsampletask. Will a DownsampleTask work after rechunking (https://github.com/seung-lab/igneous#data-transfer--rechunking-transfertask)? So in the actual code, would it be something like this?
Thank you Will, |
Hi m, I think you will find it easier to use the Igneous command line interface if you can. The transfer tasks will automatically create a few levels of downsamples, so if you run downsampling from mip 0 again, you probably won't see much improvement. You'll probably also enjoy using FileQueue more as you can stop and restart jobs without starting again from the beginning. With XY dimension chunk size 128 and a task size of 1024, you should expect three downsamples to be generated. Try this:
igneous image xfer SRC DEST --mip 0 --chunk-size 128,128,64 --shape 1024,1024,64 --queue ./queue
igneous -p 8 execute -x ./queue
igneous image downsample SRC --mip 3 --num-mips 4 --queue ./queue
igneous -p 8 execute -x ./queue |
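For anyone who prefers the Python API that the question above was asking about, a rough equivalent of these CLI commands might look like the sketch below. The function and parameter names follow the patterns in the Igneous README linked earlier, but treat them as assumptions and check the signatures in your installed version:
from taskqueue import LocalTaskQueue
import igneous.task_creation as tc

src = "file:///data/precomputed/src"    # hypothetical source layer
dest = "file:///data/precomputed/dest"  # hypothetical destination layer

tq = LocalTaskQueue(parallel=8)

# Roughly mirrors: igneous image xfer SRC DEST --mip 0 --chunk-size 128,128,64 --shape 1024,1024,64
tasks = tc.create_transfer_tasks(src, dest, chunk_size=(128, 128, 64), shape=(1024, 1024, 64))
tq.insert(tasks)
tq.execute()

# Roughly mirrors: igneous image downsample SRC --mip 3 --num-mips 4
tasks = tc.create_downsampling_tasks(dest, mip=3, num_mips=4)
tq.insert(tasks)
tq.execute()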
Hi @william-silversmith , Okay, I've tried testing with a 3D image volume (7332 x 10131 x 3900; TIFF stacks, 329GB in total size) and I don't know if I succeeded at the downsampling stage. The file sizes don't seem to change in the destination folder before and after the downsample. I see an error running a CLI command, so I could be choosing the chunk or shape sizes incorrectly... Would you be able to advise what I am doing wrong? Here are the steps I took; the CloudVolume parameters, CLI commands, and destination folder contents are quoted in full in the reply below.
Error message: quite a few cloudvolume.exceptions.EmptyVolumeException errors printed on the terminal due to missing chunks.
|
Hi m,
The empty volume error appears if your image does not completely fill the space or if you're pointed at an incorrect location.
I noticed you reset the chunk size on the command line after you set it in the info file, which may not be what you want.
You can try using the --fill-missing flag, which will write zeroed data instead of throwing an exception.
For the transfer step, you can also try using --sharded to reduce the number of files written dramatically, though no downsamples will be generated from that step (so they will all need to be done via the downsample command).
…On Thu, Jun 30, 2022, 5:10 PM manoaman ***@***.***> wrote:
Hi Will,
Okay, I've tried testing with 3D image volume (7332 x 10131 x 3900; tiff
stacks, 329GB in total size) and I don't know if I succeeded at a
downsampling stage. The files sizes don't seem to change in the destination
folder before and after the downsample. I see an error running a CLI
command so I could be designing the chunk or shape sizes incorrectly...
Would you be able to advise what am I doing wrong?
Here are the steps I took.
------------------------------
1. Run CloudVolume to chunk XY dimension. (I chose 1024,1024,1 so that
I won't face I/O error on creating too many files. It seems that 1,000,000
files is the storage limit. Maybe configurable on the storage to increase
this limit to allow smaller chunks?)
*Configured parameters in a CloudVolume script:*
chunk_size=[1024, 1024, 1],
volume_size=[7332, 10131, 3900],
*Output files/folders in the destination folder:*
$ ls
1800_1800_2000 info progress provenance
------------------------------
2. Next, rechunked on XYZ with Igneous CLI.
*CLI:*
$ igneous image xfer SRC DEST --mip 0 --chunk-size 128,128,64 --shape 1024,1024,64 --queue ./queue
$ igneous -p 36 execute -x ./queue
*Output files/folders in the destination folder:*
$ du -sh ./
35M ./14400_14400_2000
2.9G ./1800_1800_2000
569M ./3600_3600_2000
138M ./7200_7200_2000
24K ./info
24K ./provenance
------------------------------
3. Lastly, the failing step. Downsample.
*CLI:*
$ igneous image downsample SRC --mip 3 --num-mips 4 --queue ./queue
$ igneous -p 36 execute -x ./queue
*Output files/folders in the destination folder:*
$ du -sh ./
35M ./14400_14400_2000
2.9G ./1800_1800_2000
569M ./3600_3600_2000
138M ./7200_7200_2000
24K ./info
24K ./provenance
*Error message:*
Quite a few cloudvolume.exceptions.EmptyVolumeException printed on the
terminal due to missing chunks.
ERROR FunctionTask(('igneous.tasks.image.image', 'DownsampleTask'),[],{'layer_path': 'file:///folder_name', 'mip': 3, 'shape': [2048, 2048, 64], 'offset': [0, 0, 1088], 'axis': 'z', 'fill_missing': False, 'sparse': False, 'delete_black_uploads': False, 'background_color': 0, 'dest_path': None, 'compress': None, 'factor': [2, 2, 1]},"327db0a1-27d1-4394-a20e-05c4cb7a2cea") raised 14400_14400_2000/0-128_256-384_1088-1152
Traceback (most recent call last):
File "/homedir/.conda/envs/igneous/lib/python3.7/site-packages/taskqueue/taskqueue.py", line 375, in poll
task.execute(*execute_args, **execute_kwargs)
File "/homedir/.conda/envs/igneous/lib/python3.7/site-packages/taskqueue/queueablefns.py", line 78, in execute
self(*args, **kwargs)
File "/homedir/.conda/envs/igneous/lib/python3.7/site-packages/taskqueue/queueablefns.py", line 87, in __call__
return self.tofunc()()
File "/homedir/.conda/envs/igneous/lib/python3.7/site-packages/igneous/tasks/image/image.py", line 467, in DownsampleTask
factor=factor,
File "/homedir/.conda/envs/igneous/lib/python3.7/site-packages/igneous/tasks/image/image.py", line 426, in TransferTask
src_bbox, agglomerate=agglomerate, timestamp=timestamp
File "/homedir/.conda/envs/igneous/lib/python3.7/site-packages/cloudvolume/frontends/precomputed.py", line 709, in download
bbox.astype(np.int64), mip, parallel=parallel, renumber=bool(renumber)
File "/homedir/.conda/envs/igneous/lib/python3.7/site-packages/cloudvolume/datasource/precomputed/image/__init__.py", line 183, in download
background_color=int(self.background_color),
File "/homedir/.conda/envs/igneous/lib/python3.7/site-packages/cloudvolume/datasource/precomputed/image/rx.py", line 281, in download
green=green, secrets=secrets, background_color=background_color
File "/homedir/.conda/envs/igneous/lib/python3.7/site-packages/cloudvolume/datasource/precomputed/image/rx.py", line 560, in download_chunks_threaded
green=green,
File "/homedir/.conda/envs/igneous/lib/python3.7/site-packages/cloudvolume/scheduler.py", line 104, in schedule_jobs
return schedule_threaded_jobs(fns, concurrency, progress, total)
File "/homedir/.conda/envs/igneous/lib/python3.7/site-packages/cloudvolume/scheduler.py", line 30, in schedule_threaded_jobs
tq.put(updatefn(fn))
File "/homedir/.conda/envs/igneous/lib/python3.7/site-packages/cloudvolume/threaded_queue.py", line 257, in __exit__
self.wait(progress=self.with_progress)
File "/homedir/.conda/envs/igneous/lib/python3.7/site-packages/cloudvolume/threaded_queue.py", line 227, in wait
self._check_errors()
File "/homedir/.conda/envs/igneous/lib/python3.7/site-packages/cloudvolume/threaded_queue.py", line 191, in _check_errors
raise err
File "/homedir/.conda/envs/igneous/lib/python3.7/site-packages/cloudvolume/threaded_queue.py", line 153, in _consume_queue
self._consume_queue_execution(fn)
File "/homedir/.conda/envs/igneous/lib/python3.7/site-packages/cloudvolume/threaded_queue.py", line 180, in _consume_queue_execution
fn()
File "/homedir/.conda/envs/igneous/lib/python3.7/site-packages/cloudvolume/scheduler.py", line 23, in realupdatefn
res = fn()
File "/homedir/.conda/envs/igneous/lib/python3.7/site-packages/cloudvolume/datasource/precomputed/image/rx.py", line 528, in process
decode_fn, decompress
File "/homedir/.conda/envs/igneous/lib/python3.7/site-packages/cloudvolume/datasource/precomputed/image/rx.py", line 509, in download_chunk
background_color=background_color)
File "/homedir/.conda/envs/igneous/lib/python3.7/site-packages/cloudvolume/datasource/precomputed/image/rx.py", line 582, in decode
mip, background_color,
File "/homedir/.conda/envs/igneous/lib/python3.7/site-packages/cloudvolume/datasource/precomputed/image/rx.py", line 629, in _decode_helper
raise EmptyVolumeException(input_bbox)
cloudvolume.exceptions.EmptyVolumeException: 14400_14400_2000/0-128_256-384_1088-1152
|
For converting into the shard format, the following command (quoted in the reply below) did not convert to .shard files; I still see .gz files in the folder. Execution of the tasks looked okay without throwing any errors.
Obviously, I can't shard all the levels at once. How do I go about converting each mip level to the shard format?
|
Hi m,
I think it would make sense to either use larger chunks or transfer using the --sharded flag. This will circumvent your limitation.
You then generate all mip levels by downsampling as sharded, one at a time. However, the upper levels have many fewer chunks, so it may be convenient to generate only the first mip as sharded and the upper levels as unsharded.
Will
…On Fri, Oct 7, 2022, 4:17 PM manoaman ***@***.***> wrote:
For converting into shard format, the following command did not convert to
.shard files. I still see .gz files in the folder. Execution of tasks
looked okay without throwing any errors.
igneous image downsample file:///nfs/precomputed/1024x1024x1_128x128x64 --mip 0 --num-mips 1 --queue ./queue --sharded
Obviously, I can't shard all the levels at once. How do I go about to
convert each mip level to shard formats?
igneous image downsample file:///nfs/precomputed/1024x1024x1_128x128x64 --mip 0 --num-mips 5 --queue ./queue --sharded
igneous: sharded downsamples only support producing one mip at a time.
|
Ohh, I see. It is okay to mix sharded and unsharded formats at different mip levels. I didn't think about that, and that sure makes sense. Thank you Will for your advice! |
If you specify chunk-size on the command line it will overwrite your previous settings. If you already set it, just omit it from the command.
shape corresponds to the size of a task and should be a power-of-two multiple of the chunk size to ensure standalone downsamples generate efficiently. Regardless, the shape must be chunk aligned or errors will result.
For the downsample command, each additional mip level quadruples the size of the task (though it will cap at the maximum number of downsamples once a single downsample is a chunk).
It looks like your info file is ok and is maximally downsampled for your chunk size.
Are you sure your queue is totally empty? Maybe it didn't finish processing.
One more tip: if you set --encoding png you can save almost 2x the disk space losslessly at the expense of compression speed.
…On Fri, Jul 1, 2022, 1:59 PM manoaman ***@***.***> wrote:
Thanks for the feedback, Will.
--fill-missing option did work. Thank you.
I didn't quite understand what you mean by "reset the chunk size". What chunk size should I be using in the first place? I think I'm a little confused about how to define chunk sizes when transitioning from a CloudVolume script to the Igneous CLI, and about the use of the --shape option.
I noticed you reset the chunk size on the command line after you set it in the info file which may not be what you want.
The results I see in the viewer so far are partially loaded chunks that stop loading, so I'm not sure if I succeeded in downsampling. I went as deep as --num-mips 8 and here are the generated files.
du -sh ./*
29M ./14400_14400_2000
2.0G ./1800_1800_2000
19M ./28800_28800_2000
487M ./3600_3600_2000
6.1M ./57600_57600_2000
117M ./7200_7200_2000
24K ./info
24K ./provenance
$ cat ./info
{
"data_type": "uint16",
"num_channels": 1,
"scales": [
{
"chunk_sizes": [
[
128,
128,
64
]
],
"encoding": "raw",
"key": "1800_1800_2000",
"resolution": [
1800,
1800,
2000
],
"size": [
7332,
10131,
3900
],
"voxel_offset": [
0,
0,
0
]
},
{
"chunk_sizes": [
[
128,
128,
64
]
],
"encoding": "raw",
"key": "3600_3600_2000",
"resolution": [
3600,
3600,
2000
],
"size": [
3666,
5066,
3900
],
"voxel_offset": [
0,
0,
0
]
},
{
"chunk_sizes": [
[
128,
128,
64
]
],
"encoding": "raw",
"key": "7200_7200_2000",
"resolution": [
7200,
7200,
2000
],
"size": [
1833,
2533,
3900
],
"voxel_offset": [
0,
0,
0
]
},
{
"chunk_sizes": [
[
128,
128,
64
]
],
"encoding": "raw",
"key": "14400_14400_2000",
"resolution": [
14400,
14400,
2000
],
"size": [
917,
1267,
3900
],
"voxel_offset": [
0,
0,
0
]
},
{
"chunk_sizes": [
[
128,
128,
64
]
],
"encoding": "raw",
"key": "28800_28800_2000",
"resolution": [
28800,
28800,
2000
],
"size": [
459,
634,
3900
],
"voxel_offset": [
0,
0,
0
]
},
{
"chunk_sizes": [
[
128,
128,
64
]
],
"encoding": "raw",
"key": "57600_57600_2000",
"resolution": [
57600,
57600,
2000
],
"size": [
230,
317,
3900
],
"voxel_offset": [
0,
0,
0
]
}
],
"type": "image"
}
Any thoughts?
|
Hi Will, do you know how to get around these errors? One error occurs when … Thank you,
|
Hi m, That function … In the second error, the chunk size appears to be zero? This seems like either an info file error or a mistake I made clamping values somewhere. |
Attaching the info files. |
Hi m, I wasn't able to reproduce the sharded issue, but I found a bug in my task shape calculation when the specified z chunk size was greater than the computed chunk shape. I released igneous-pipeline==4.10.0 which contains a fix for that issue. Thanks for reporting the bug (and providing the info files which allowed me to reproduce it easily)! Let me know if that helps. |
Hi Will, I'm glad to hear that I could contribute!! And thank you for the quick release!! I'll upgrade my Igneous to the latest version. -m |
Hello @william-silversmith I encountered a new error which I haven't experienced before, and I have a couple of questions. I'm tackling a 3D volume with dimensions of (41088 x 28416 x 10240), uint8. I initially used cloud-volume and chunked the xy plane in (4096 x 4096 x 1), and then used Igneous with a (128 x 128 x 64) chunk size, which hit the following error during the process. This was run on a high-RAM compute machine (1TB RAM, 36 CPU cores), and I'm running it again at the moment to reproduce the issue on a different machine.
Perhaps running the failed task queue jobs again would fix it? Any guidance here would be appreciated. Thank you! Happy Halloween!
|
Hi m, Can you show me the parameters you are using for downsampling? If you use the newest version of Igneous, it should try to keep these tasks to a reasonable memory limit. On the old version, you can try setting num_mips to a smaller number (3 or 4). The error you are encountering is likely an inability for malloc to find a contiguous memory region of that size. Smaller memory segments will probably make this go away. |
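As a rough illustration of the "each additional mip quadruples the task" rule mentioned earlier in this thread (my own back-of-the-envelope sketch of the older behavior, not a model of Igneous internals; the shape and dtype are placeholders):
# A downsample task reads a source region at the starting mip; each extra mip it
# produces doubles that region in X and Y, so memory grows ~4x per additional mip.
def est_task_memory_gib(single_mip_shape=(2048, 2048, 64), bytes_per_voxel=1, num_mips=1):
    x, y, z = single_mip_shape
    scale = 2 ** (num_mips - 1)   # extra doubling in X and Y per additional mip
    return (x * scale) * (y * scale) * z * bytes_per_voxel / 2**30

for mips in (2, 5, 7):
    print(f"num_mips={mips}: roughly {est_task_memory_gib(num_mips=mips):,.0f} GiB per task (uint8)")
That growth is why dropping num_mips to a smaller number (or using a version that caps task memory) can avoid the contiguous-allocation failure.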
Yes. I ran the following commands in sequence. That reminds me that I have not updated igneous-pipeline; the version I used is 4.17.0. Perhaps I should go ahead and upgrade to 4.20.1.
|
The line with num_mips 7 is possibly the culprit. Each additional mip
requires 4x the memory. I'd change that to 4 or 5 and see if it helps.
…On Tue, Oct 31, 2023, 5:57 PM manoaman ***@***.***> wrote:
Yes. I ran the following commands in sequence. That reminds me I have not
updated the igneous-pipeline. The version I used is Version: 4.17.0.
Perhaps I should go ahead and upgrade to 4.20.1.
igneous image xfer file:///nfs/3d_volume/xy_precomputed/ file:///nfs/3d_volume/ch0/ --mip 0 --chunk-size 128,128,64 --fill-missing --queue ./queue --sharded &&
igneous -p 36 execute -x ./queue &&
ptq purge ./queue &&
igneous image downsample file:///nfs/3d_volume/ch0/ --mip 0 --num-mips 2 --volumetric --fill-missing --queue ./queue &&
igneous -p 36 execute -x ./queue &&
ptq purge ./queue &&
igneous image downsample file:///nfs/3d_volume/ch0/ --mip 2 --num-mips 7 --volumetric --fill-missing --queue ./queue &&
igneous -p 36 execute -x ./queue &&
ptq purge ./queue;
|
Hi @william-silversmith , It looks like the low mip levels (400_400_800, 800_800_1600, 1600_1600_3200) create over 1 million chunks per folder and break things on the storage side. I'm going to change the chunk size from 128,128,64 to 128,128,128 instead. However, I cannot figure out how to convert the second and third level chunks (800_800_1600, 1600_1600_3200) into shard-format chunks. Can you advise how I can put specific levels into the shard format? The sequence of commands I'm using is attached. Thank you,
|
Hi m, You can create sharded downsamples by adding the --sharded flag to the downsample command. |
Is this correct for the first three mip levels?
|
Yes, that should work. The volumetric flag doesn't get tested too regularly, let me know if you run into problems. |
I've tested igneous image downsample and it appears that with the --sharded option, the view on the two other planes becomes corrupted in Neuroglancer; it's almost as if the same images are overlaid many times with the offset slightly shifted, and it looks blurry (details and screenshots quoted in the reply below). Looks like I can get away without using the --sharded option with igneous image downsample, but then the number of chunks becomes really large. Do you have any other suggestions on how to downsample images and use the shard format volumetrically? Thank you, |
That's really weird! Can you show me the command you are using and the info
file?
Does the problem resolve when you zoom in?
…On Tue, Nov 28, 2023, 11:41 AM manoaman ***@***.***> wrote:
Hi @william-silversmith <https://github.com/william-silversmith>
I've tested igneous image downsample and it appears that with --sharded
option, the view on two other planes (xy, yz) become corrupted viewing from
Neuroglancer. It's almost as if same images are overlayed many times with
offset slightly shifted and looks blurry. If I try to add --sharded
option to subsequent downsample commands on a deeper mip levels, the image
corruption becomes worse.
Looks like I can get away with this without using --sharded option with igneous
image downsample. However, the number of chunks becomes really large. Do
you have any other suggestions on downsample images and use shard format
volumetrically? I was thinking of using shard format for the first 2~3
levles (mip 0,1,2).
Thank you,
-m
Screenshot.2023-11-28.at.8.33.19.AM.png (view on web)
<https://github.com/seung-lab/cloud-volume/assets/47464840/10b89f22-68f0-4df5-a3b1-4bea4cc34ce8>
Screenshot.2023-11-28.at.8.33.04.AM.png (view on web)
<https://github.com/seung-lab/cloud-volume/assets/47464840/6ecb6483-2187-4b6e-814f-8b589f566fc5>
|
Hi Will, this is the command with the info file attached. Any clue as to what I might be processing incorrectly?
|
Huh. This is pretty weird. Can you give it a try without using |
Okay, running the modified command lines. Probably it'll finish tomorrow so I'll get back to you once I see the results. |
Hi @william-silversmith , the images (xz, yz planes) still appear the same as before. Could this be an issue with using "--sharded" during downsample, since I did not see this effect without it? I used igneous-pipeline (4.20.1).
|
Does the volume look like you would expect when you scroll in the z section
on the xy plane? Are you sure this isn't the data? Take a look at it with
only the highest resolution images.
…On Wed, Nov 29, 2023, 11:42 AM manoaman ***@***.***> wrote:
Hi @william-silversmith <https://github.com/william-silversmith> , images
(xz, yz planes) still appear the same as before. Could this be the issue
with using "--sharded" during downsample since I did not see this effect? I
used igneous-pipeline (4.20.1).
igneous image xfer file:///nafs/precomputed_xy/ file:///nfs/precomputed_xyz/ --mip 0 --chunk-size 128,128,128 --fill-missing --queue ./queue --sharded &&
igneous -p 36 execute -x ./queue &&
ptq purge ./queue &&
igneous image downsample file:///nfs/precomputed_xyz/ --mip 0 --num-mips 1 --sharded --volumetric --fill-missing --queue ./queue &&
igneous -p 36 execute -x ./queue &&
ptq purge ./queue &&
igneous image downsample file:///nfs/precomputed_xyz/ --mip 1 --num-mips 1 --sharded --fill-missing --queue ./queue &&
igneous -p 36 execute -x ./queue &&
ptq purge ./queue &&
igneous image downsample file:///nfs/precomputed_xyz/ --mip 2 --num-mips 1 --fill-missing --queue ./queue &&
igneous -p 36 execute -x ./queue &&
ptq purge ./queue &&
igneous image downsample file:///nfs/precomputed_xyz/ --mip 3 --num-mips 1 --fill-missing --queue ./queue &&
igneous -p 36 execute -x ./queue &&
ptq purge ./queue &&
igneous image downsample file:///nfs/precomputed_xyz/ --mip 4 --num-mips 1 --fill-missing --queue ./queue &&
igneous -p 36 execute -x ./queue &&
ptq purge ./queue &&
igneous image downsample file:///nfs/precomputed_xyz/ --mip 5 --num-mips 1 --fill-missing --queue ./queue &&
igneous -p 36 execute -x ./queue &&
ptq purge ./queue;
Screenshot.2023-11-29.at.8.28.56.AM.png (view on web)
<https://github.com/seung-lab/cloud-volume/assets/47464840/fb92b4a8-bf54-4815-b7c4-d65025e63010> Screenshot.2023-11-29.at.8.34.54.AM.png
(view on web)
<https://github.com/seung-lab/cloud-volume/assets/47464840/5b579fd0-9f4f-4756-9229-bd155d4aafe7>
info.txt
<https://github.com/seung-lab/cloud-volume/files/13503158/info.txt>
|
Yes, the xy plane looks okay to me. Looking at the xz and yz planes while gradually zooming in, I see the borders of the images get corrected at the highest resolution; I no longer see the blurred borders there. (And maybe at the second highest too, though from looking at the chunk statistics it is kind of hard to tell.) When I compared with the version of the chunks generated without the "--sharded" option in igneous image downsample, I did not see this effect. |
Let me try a different image to see if I can reproduce the same issue. I know for sure that the first two mip levels generate more than a million chunks, which causes an issue on the storage. There might be errors I overlooked in subsequent downsample steps... it is kind of difficult to debug at the moment. |
This is a really good idea. If you can reproduce something, I can definitely help debug it.
|
Okay, debugging in progress. I'm trying to simplify the test by limiting the downsample commands to the first two~three mip levels. The following command lines were used …
Next, I started over by running with … (questionable line) … and I will move on to test without … |
Hi @william-silversmith , the following commands generated a result which was satisfying in the Neuroglancer viewer. As you mentioned, I think there is an issue with using --sharded together with the volumetric (2x2x2) downsample.
One thing I found weird after running the questionable command: I see the task queue reporting more completed tasks than were inserted. This is something I have not seen before.
I know the 2-2-1 downsample works, but is there an alternative way I can accomplish a 2-2-2 downsample with the shard format for optimal viewing in Neuroglancer? |
I think this will take some debugging but I am otherwise occupied at the moment unfortunately... If possible, I would recommend sticking with 2x2x1 downsampling for shards for now. |
Hello,
I tried running CloudVolume on a TIFF stack which is 245GB in size (155MB each for about 1600+ slices). I realized the number of chunk files created in a directory has hit 1,000,001, and that seems to be an upper bound on what I can create. (Probably this value is configurable, but I'm not sure. Any thoughts?) The following is the error I see while running CloudVolume.
Should the TIFF files be downsized before running CloudVolume? If you could advise me on possible approaches, it would be nice to know.
Thank you!
-m