chunked: store cache as binary and use a bloom filter #1870

Merged: 8 commits, Apr 19, 2024

Commits on Apr 19, 2024

  1. chunked: move cache file generation to separate function

    Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
    giuseppe committed Apr 19, 2024
    Commit 397943b
  2. chunked: store digest in binary format

    Use the binary representation of each digest; this reduces the file
    size by ~25%.
    
    Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
    giuseppe committed Apr 19, 2024
    Commit 3347254
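
To illustrate why a binary digest is smaller than its textual form, here is a minimal Python sketch. It is not the code from this commit: the helper names `digest_to_binary` and `binary_to_digest` are hypothetical, and it assumes digests are written as `sha256:` followed by 64 hex characters. The per-digest saving it shows is larger than the ~25% whole-file figure quoted above, presumably because the cache file holds more than digests.

```python
import hashlib


def digest_to_binary(digest: str) -> bytes:
    """Convert a 'sha256:<hex>' string to its 32 raw bytes (hypothetical helper)."""
    algo, _, hex_part = digest.partition(":")
    if algo != "sha256" or len(hex_part) != 64:
        raise ValueError(f"unsupported digest: {digest!r}")
    return bytes.fromhex(hex_part)


def binary_to_digest(raw: bytes) -> str:
    """Inverse of digest_to_binary: rebuild the textual form from raw bytes."""
    if len(raw) != 32:
        raise ValueError("expected 32 bytes for a SHA-256 digest")
    return "sha256:" + raw.hex()


# Example digest computed from arbitrary sample data.
text_digest = "sha256:" + hashlib.sha256(b"example chunk").hexdigest()
raw_digest = digest_to_binary(text_digest)

# The textual form needs 71 characters, the binary form 32 bytes.
print(len(text_digest), len(raw_digest))
```

The conversion is lossless, so a reader of the cache can rebuild the textual digest whenever it needs one.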
  3. chunked: store file offset and length in binary format

    This reduces the cache file size by ~25%.
    
    Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
    giuseppe committed Apr 19, 2024
    Commit e6793e3
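
As a rough illustration of storing an offset and a length in binary, the following Python sketch packs each pair into a fixed-size record. The layout (two little-endian unsigned 64-bit integers) and the names `RECORD`, `pack_location`, and `unpack_location` are assumptions made for this example; the commit's actual on-disk layout is not described on this page.

```python
import struct

# Assumed record layout: offset and length as little-endian unsigned 64-bit ints.
RECORD = struct.Struct("<QQ")


def pack_location(offset: int, length: int) -> bytes:
    """Encode one (offset, length) pair as a fixed-size 16-byte record."""
    return RECORD.pack(offset, length)


def unpack_location(buf: bytes, index: int) -> tuple:
    """Decode the record at position `index` without scanning earlier records."""
    return RECORD.unpack_from(buf, index * RECORD.size)


# Sample (offset, length) pairs, chosen arbitrarily.
locations = [(0, 4096), (4096, 1_048_576), (1_052_672, 512)]
table = b"".join(pack_location(off, size) for off, size in locations)

print(RECORD.size, len(table))
print(unpack_location(table, 1))
```

Besides being compact, fixed-size records can be addressed by index, so a reader can jump straight to the entry it wants instead of parsing everything before it.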
  4. chunked: add implementation for a bloom filter

    Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
    giuseppe committed Apr 19, 2024
    Commit 6668761
  5. chunked: use a bloom filter to speed up lookup

    Use a bloom filter to speed up the lookup of digests in a cache file.
    
    The biggest advantage is that it reduces page faults on the mmap'ed
    cache file.
    
    Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
    giuseppe committed Apr 19, 2024
    Commit e9a96e0
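
The idea behind the two bloom filter commits can be sketched in Python as follows. This is a generic Bloom filter, not the implementation added in this pull request: the class name, the parameters `num_bits` and `num_hashes`, the double-hashing scheme, and the `lookup` helper are all assumptions made for illustration. The point is that a small bit array can answer "definitely not present" for most missing digests, so the larger mmap'ed table is only touched when a match is possible.

```python
import hashlib


class BloomFilter:
    """Generic Bloom filter sketch; all parameters are illustrative."""

    def __init__(self, num_bits: int, num_hashes: int):
        if num_bits <= 0 or num_hashes <= 0:
            raise ValueError("num_bits and num_hashes must be positive")
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: bytes):
        # Double hashing: derive every bit position from two 64-bit values.
        h = hashlib.sha256(item).digest()
        h1 = int.from_bytes(h[:8], "little")
        h2 = int.from_bytes(h[8:16], "little") | 1  # force a non-zero step
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: bytes) -> bool:
        # False means definitely absent; True means possibly present.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def lookup(digest: bytes, bloom: BloomFilter, slow_lookup):
    """Consult the filter first; call the expensive lookup only on a maybe."""
    if not bloom.might_contain(digest):
        return None  # definitely absent: the large table is never touched
    return slow_lookup(digest)


bloom = BloomFilter(num_bits=8192, num_hashes=4)
known = [hashlib.sha256(f"chunk-{i}".encode()).digest() for i in range(100)]
for d in known:
    bloom.add(d)

print(all(bloom.might_contain(d) for d in known))
```

A Bloom filter never reports a stored item as absent, but it can report an absent item as possibly present, so the slow lookup must still verify every positive answer.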
  6. chunked: store file names separately

    Store file names separately so that the same file path is stored
    only once in the cache file.
    
    After this change, the cache file measured on the fedora:{38,39,40}
    images is on average ~6% smaller.
    
    Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
    giuseppe committed Apr 19, 2024
    Commit 59ac039
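
One common way to store each path only once is a name table that records refer to by index. The Python sketch below shows that technique under stated assumptions: `NameTable`, `intern`, and the sample entries are hypothetical, and the sketch assumes the same path can appear in several records (for instance when several chunks belong to one file). It is not the layout used by this commit.

```python
class NameTable:
    """Deduplicating table of file names; records store an index, not the path."""

    def __init__(self):
        self._index = {}   # path -> position in self.names
        self.names = []    # each distinct path, stored exactly once

    def intern(self, name: str) -> int:
        """Return the index of `name`, adding it on first use."""
        idx = self._index.get(name)
        if idx is None:
            idx = len(self.names)
            self._index[name] = idx
            self.names.append(name)
        return idx


# Hypothetical (digest, path) entries; two of them share a path.
entries = [
    ("d1", "usr/bin/bash"),
    ("d2", "usr/bin/bash"),
    ("d3", "usr/lib64/libc.so.6"),
]

table = NameTable()
records = [(digest, table.intern(path)) for digest, path in entries]

print(table.names)
print(records)
```

The saving grows with the number of records that share a path, since each repeat costs one small index instead of the full string.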
  7. chunked: store file locations as binary

    This reduces the cache file size by ~3%.
    
    Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
    giuseppe committed Apr 19, 2024
    Commit 9619a53
  8. chunked: bump version number for cache file

    Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
    giuseppe committed Apr 19, 2024
    Commit 065a2f3