Split the chunk offset tables to reduce memory usage while dumping #5

Closed
ph0llux opened this issue Apr 7, 2023 · 0 comments
ph0llux commented Apr 7, 2023

To reduce memory usage while dumping a very large amount of data (e.g. a 20 TB physical disk) to a single segment, the chunk offset table should be split into multiple tables, and these tables should be written periodically into the Zff container. Currently, the full table is cached in memory while dumping the data; a single table entry needs 2*8 bytes, so with a chunk size of 32 kB you need about 500 MB of memory for each TB of data.
At the end of the segment there should be an additional table which contains the appropriate offsets to the chunk offset tables.
Because these tables are sorted maps (HashMaps or BTreeMaps), they can be variable in size.
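A minimal sketch of the idea in Rust, independent of the actual zff API: the names `SplitOffsetTableWriter` and `MAX_ENTRIES_PER_TABLE` are hypothetical, as is the on-disk encoding. It only illustrates flushing partial offset tables periodically and writing a final "table of tables" at the end of the segment.

```rust
use std::collections::BTreeMap;
use std::io::{Seek, Write};

/// Hypothetical limit: flush the in-memory table once it holds this many
/// entries (e.g. every ~1 GB of data at a 32 kB chunk size).
const MAX_ENTRIES_PER_TABLE: usize = 32_768;

/// Writes chunk offsets as several partial tables instead of keeping a
/// single table for the whole segment in memory.
struct SplitOffsetTableWriter {
    /// chunk number -> chunk offset (sorted map, so tables may vary in size)
    current_table: BTreeMap<u64, u64>,
    /// segment offsets of every partial table already written
    table_offsets: Vec<u64>,
}

impl SplitOffsetTableWriter {
    fn new() -> Self {
        Self { current_table: BTreeMap::new(), table_offsets: Vec::new() }
    }

    /// Record one chunk; flush the partial table once it is full.
    fn push<W: Write + Seek>(&mut self, out: &mut W, chunk_no: u64, offset: u64)
        -> std::io::Result<()>
    {
        self.current_table.insert(chunk_no, offset);
        if self.current_table.len() >= MAX_ENTRIES_PER_TABLE {
            self.flush_table(out)?;
        }
        Ok(())
    }

    /// Serialize the current partial table and remember where it starts.
    fn flush_table<W: Write + Seek>(&mut self, out: &mut W) -> std::io::Result<()> {
        if self.current_table.is_empty() {
            return Ok(());
        }
        let table_start = out.stream_position()?;
        for (chunk_no, offset) in &self.current_table {
            out.write_all(&chunk_no.to_le_bytes())?;
            out.write_all(&offset.to_le_bytes())?;
        }
        self.table_offsets.push(table_start);
        self.current_table.clear();
        Ok(())
    }

    /// At the end of the segment: flush the last partial table and append
    /// the additional table with the offsets of all chunk offset tables.
    /// Returns the offset of that lookup table.
    fn finish<W: Write + Seek>(mut self, out: &mut W) -> std::io::Result<u64> {
        self.flush_table(out)?;
        let lookup_start = out.stream_position()?;
        out.write_all(&(self.table_offsets.len() as u64).to_le_bytes())?;
        for table_offset in &self.table_offsets {
            out.write_all(&table_offset.to_le_bytes())?;
        }
        Ok(lookup_start)
    }
}
```

With this scheme, peak memory for the offset data is bounded by one partial table (here 32,768 entries * 16 bytes = 512 kB) plus one `u64` per flushed table, instead of ~500 MB per TB of dumped data.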

@ph0llux ph0llux added the Zffv3 label Apr 7, 2023
@ph0llux ph0llux added this to the Zffv3 milestone Apr 7, 2023
@ph0llux ph0llux closed this as completed Dec 27, 2023