pass around metadata #765

ThomasWaldmann · 2016-03-17T22:00:30Z

in borg, data flows through the different stages / layers, but metadata about the files / the content is mostly missing. we could use a metadata dict and pass it around with the data, e.g. as a tuple (meta, data).

e.g. currently, the compression component is set up statically: you choose it via a commandline param.
instead, an entry in the meta dict could determine the compression that will be used.
the entry could default to whatever the commandline says, but it could also be changed dynamically (e.g. the file reader knows the file is an .mp3 that can't be compressed any further, so it sets meta["compression"]='none' for that data).
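
A minimal sketch of that idea, under stated assumptions: `read_file`, `compress`, `DEFAULT_COMPRESSION` and `INCOMPRESSIBLE` are hypothetical names, not borg's API, and zlib merely stands in for whatever compressor the commandline selected.

```python
import zlib

# Hypothetical: stands in for the compression chosen via commandline param.
DEFAULT_COMPRESSION = "zlib"

# Hypothetical: extensions the reader treats as already compressed.
INCOMPRESSIBLE = (".mp3",)


def read_file(name, data):
    """Reader stage: attach a meta dict to the data it produces."""
    meta = {"compression": DEFAULT_COMPRESSION}
    if name.lower().endswith(INCOMPRESSIBLE):
        # the reader overrides the static default for this data
        meta["compression"] = "none"
    return meta, data


def compress(meta, data):
    """Compression stage: the meta entry decides what happens to the data."""
    if meta["compression"] == "none":
        return meta, data
    if meta["compression"] == "zlib":
        return meta, zlib.compress(data)
    raise ValueError("unknown compression: %r" % meta["compression"])


# .mp3 data passes through unchanged, other data gets compressed
meta, out = compress(*read_file("song.mp3", b"\xff\xfb" * 100))
assert out == b"\xff\xfb" * 100
meta, out = compress(*read_file("notes.txt", b"a" * 1000))
assert zlib.decompress(out) == b"a" * 1000
```

The point of the design is that the compression stage no longer needs its own static configuration: it only looks at what arrives together with the data.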

it could also be used for sparse files, so that hole=True/False gets passed around.

Chunk is a namedtuple of (meta, data), create chunks using mkchunk(data, **meta). This does not yet have any visible functionality, meta is always empty dict right now.
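
The shape described in that commit message can be sketched as follows. Only the names `Chunk` and `mkchunk` and the (meta, data) field order come from the description; the `hole` keyword is a hypothetical example entry borrowed from the sparse file idea, and this is not the code from the actual commit.

```python
from collections import namedtuple

# Chunk is a namedtuple of (meta, data).
Chunk = namedtuple("Chunk", "meta data")


def mkchunk(data, **meta):
    """Create a Chunk; keyword arguments become the meta dict."""
    return Chunk(meta, data)


plain = mkchunk(b"some data")             # meta is an empty dict
hole = mkchunk(b"\0" * 4096, hole=True)   # hypothetical meta entry

# Stages can unpack a chunk like a plain (meta, data) tuple.
meta, data = hole
assert plain.meta == {}
assert meta == {"hole": True} and len(data) == 4096
```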

pass meta-data around, fixes #765

ThomasWaldmann · 2020-12-13T18:27:59Z

related to #14

ThomasWaldmann · 2020-12-13T18:59:45Z

The infrastructure added in #934 was removed again in #2364. Trying to find out why...

ThomasWaldmann · 2020-12-13T19:04:53Z

Can't find out why.

I guess we need this for sparse handling (and maybe also for other metadata later).

@enkore do you remember?

ThomasWaldmann · 2021-01-15T22:18:04Z

Anyway, in #5620 I re-added a little bit of it to support communicating between chunker and hasher.

After the hasher, everything is still as it was - the compressor will not get any metadata yet.

ThomasWaldmann · 2023-04-02T20:05:25Z

Considering the metadata (about sparseness) produced by the chunker (mostly by the fixed chunker, a bit less by the buzhash chunker):

- DATA: the metadata is not useful to the compressor, as it will need to compress the data anyway.
- ZEROS / SPARSE HOLE: the compressor will compress the first all-zeros chunk of each <size>, and that chunk will be stored in the repo. any other all-zeros chunk of the same size will be deduplicated, so we do not need to pass this metadata to the compressor either.
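
The deduplication argument can be illustrated with a toy content-addressed store. This is a sketch under assumptions: `ToyRepo` is hypothetical, and plain SHA-256 and zlib stand in for borg's chunk id hash and compressor.

```python
import hashlib
import zlib


class ToyRepo:
    """Hypothetical content-addressed store, not borg's repository API."""

    def __init__(self):
        self.objects = {}        # chunk id -> compressed chunk
        self.compress_calls = 0

    def add_chunk(self, data):
        chunk_id = hashlib.sha256(data).digest()
        if chunk_id not in self.objects:
            # only a chunk that was not seen before reaches the compressor
            self.compress_calls += 1
            self.objects[chunk_id] = zlib.compress(data)
        return chunk_id


repo = ToyRepo()
size = 4096
ids = [repo.add_chunk(bytes(size)) for _ in range(1000)]  # 1000 all-zeros chunks

assert len(set(ids)) == 1          # all of them map to the same chunk id
assert repo.compress_calls == 1    # compressed and stored only once
```

An all-zeros chunk of a different size hashes to a different id, so it would be compressed and stored once as well.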

ThomasWaldmann self-assigned this Mar 17, 2016

This was referenced Mar 17, 2016

advanced sparse file support #14

Open

file type based chunking / compression heuristic #82

Closed

ThomasWaldmann mentioned this issue Apr 17, 2016

small-steps integration of multithreading #929

Open

ThomasWaldmann closed this as completed in a345b34 Apr 18, 2016

ThomasWaldmann added a commit that referenced this issue Apr 18, 2016

Merge pull request #934 from ThomasWaldmann/pass-meta

87e211f

pass meta-data around, fixes #765

ThomasWaldmann reopened this Dec 13, 2020

ThomasWaldmann added this to the 2.0.0b6 milestone Apr 2, 2023

ThomasWaldmann closed this as completed Apr 2, 2023

ThomasWaldmann removed this from the 2.0.0b6 milestone Apr 2, 2023
