CacheDataset shared cache issues

The changes introduced in PR https://github.com/Project-MONAI/MONAI/pull/5630 make the shared cache very inconvenient to use and can lead to crashes.

Previously, we introduced a shared cache (via ListProxy) that significantly speeds up training and validation workflows during multi-GPU training. It required only a minimal user change: setting CacheDataset(runtime_cache=True).
https://github.com/Project-MONAI/MONAI/pull/5365



With the changes merged in PR https://github.com/Project-MONAI/MONAI/pull/5630, all of that simplicity was removed, and shared cache allocation and management are left to the user. The user now has to allocate the cache, synchronize it between processes, and even manually set it to the proper length.

Internally, CacheDataset sets self.cache_num to keep track of the number of cached elements. Since PR https://github.com/Project-MONAI/MONAI/pull/5630, self.cache_num and self._cache are disconnected: they need not have the same length, and the user does not know self.cache_num when allocating the cache, so cannot pick the proper length.
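To make the mismatch concrete, here is a toy stand-in for a dataset that tracks cache_num separately from an externally supplied cache list. This is only a sketch: ToyCachedDataset and its doubling "transform" are hypothetical and are not MONAI's implementation. It shows only how an unchecked length difference turns into an IndexError at lookup time.

```python
class ToyCachedDataset:
    """Hypothetical stand-in, NOT MONAI's CacheDataset."""

    def __init__(self, data, cache_num, runtime_cache):
        self.data = data
        # Number of items the dataset intends to cache.
        self.cache_num = min(cache_num, len(data))
        # Cache supplied by the caller; its length is never checked.
        self._cache = runtime_cache

    def _transform(self, item):
        # Placeholder for an expensive preprocessing step.
        return item * 2

    def __getitem__(self, index):
        if index < self.cache_num:
            # Raises IndexError when len(self._cache) <= index.
            if self._cache[index] is None:
                self._cache[index] = self._transform(self.data[index])
            return self._cache[index]
        # Items beyond cache_num are computed on every access.
        return self._transform(self.data[index])


# Cache with one slot per cached item: lookups succeed.
ok = ToyCachedDataset([1, 2, 3], cache_num=3, runtime_cache=[None] * 3)
first = ok[0]

# Empty cache while cache_num > 0: the first lookup fails.
broken = ToyCachedDataset([1, 2, 3], cache_num=3, runtime_cache=[])
try:
    broken[0]
    failed = False
except IndexError:
    failed = True
```

In this toy, a cache longer than cache_num does not crash; the extra slots are simply never used, which is the silent length disconnect described above.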

The new (and only) way to use a shared cache is CacheDataset(runtime_cache=list_proxy), which has the following problems:

1. Bug: len(list_proxy) == 0 and self.cache_num > 0 leads to a crash.
2. Bug: len(list_proxy) < self.cache_num leads to a crash.
3. Potential bug: len(list_proxy) > self.cache_num leaves the lengths disconnected, which can lead to unforeseen bugs in the future.
4. Major inconvenience: the user has to allocate the cache manually as Manager().list() in the master process and pass it to the children, or allocate it in a child process and synchronize it with manual broadcasting, and then ensure that it has the proper length. These steps are the same for all users and use cases, so everyone who wants shared-memory caching now has to write the same redundant code.
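As a rough illustration of the boilerplate from item 4, the sketch below allocates a Manager().list() of a fixed length in the master process and passes it to child processes, using only the Python standard library. The helper names (make_shared_cache, worker, run_demo) are hypothetical, and the "fork" start method is an assumption chosen so that the snippet runs on POSIX without an import guard; real multi-GPU launchers often use "spawn" instead. In actual use, the resulting list proxy would be the object passed as runtime_cache.

```python
import multiprocessing as mp


def make_shared_cache(manager, cache_num):
    """Allocate a ListProxy with one empty slot per item to be cached."""
    return manager.list([None] * cache_num)


def worker(rank, world_size, shared_cache):
    # Each process fills only its own slots (round-robin by rank).
    for i in range(rank, len(shared_cache), world_size):
        shared_cache[i] = f"item-{i}-from-rank-{rank}"


def run_demo(cache_num=4, world_size=2):
    # Assumption: the "fork" start method (POSIX only).
    ctx = mp.get_context("fork")
    with ctx.Manager() as manager:
        # Allocated once in the master process, then handed to the children.
        shared_cache = make_shared_cache(manager, cache_num)
        procs = [
            ctx.Process(target=worker, args=(rank, world_size, shared_cache))
            for rank in range(world_size)
        ]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        # Copy into a plain list before the manager shuts down.
        return shared_cache[:]
```

Calling run_demo() returns a plain list in which every slot was written by one of the child processes. The caller still has to choose cache_num, which is exactly the value that is hard to know from outside the dataset.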





CacheDataset shared cache issues #5633
