
Allow larger files #6

Open
@killercup


As of #1 and its implementation in #2, the u64 value that the index gives us is split into two halves: 32 bits for the position in the archive and 32 bits for the length of the file. This means 4GB for the archive in general, and 4GB max size per file. Here are some ways to change these limits.

Increase addressable archive size by enforcing write alignment

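If we align the start of each file written to the archive to a multiple of 2^n bytes, the lowest n bits of every address are always zero, so we get n more bits to use for addressing. For example: align files so their address ends with …0000 (16-byte alignment) and we can shift all addresses right by 4 bits before storing them, yielding 2^4=16 times the addressable archive size (64GB instead of 4GB with 32bit addresses). This of course introduces zeroed gaps of up to 2^n-1 bytes between files in the archive.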
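A rough sketch of the alignment idea, not the actual code from #2. `ALIGN_BITS`, `align_up`, `encode_offset` and `decode_offset` are names made up for illustration, and I'm assuming the address stays in a 32bit field:

```rust
/// Hypothetical alignment exponent: files start at multiples of 2^4 = 16 bytes.
const ALIGN_BITS: u32 = 4;

/// Round `offset` up to the next multiple of 2^ALIGN_BITS.
/// (Ignores overflow for offsets close to u64::MAX.)
fn align_up(offset: u64) -> u64 {
    let mask = (1u64 << ALIGN_BITS) - 1;
    (offset + mask) & !mask
}

/// Store an aligned byte offset in 32 bits by dropping its low zero bits.
/// Returns `None` if the offset is unaligned or does not fit.
fn encode_offset(offset: u64) -> Option<u32> {
    let mask = (1u64 << ALIGN_BITS) - 1;
    if offset & mask != 0 {
        return None;
    }
    let shifted = offset >> ALIGN_BITS;
    if shifted > u64::from(u32::MAX) {
        None
    } else {
        Some(shifted as u32)
    }
}

/// Recover the real byte offset from the stored 32-bit value.
fn decode_offset(stored: u32) -> u64 {
    u64::from(stored) << ALIGN_BITS
}
```

The writer would call `align_up` on the current end of the archive before appending a file and pad the gap with zeroes. With these numbers the largest encodable offset is just below 2^36 bytes (64GB).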

Using more bits for addressing

Instead of splitting the 64bit integer into two 32bit integers, we might as well split it into, for example, 40bit and 24bit -- shifting the limits to 1TB archives containing files up to 16MB. This should work very well for the rustdoc use case.
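To make the 40/24 split concrete, here is a minimal sketch. `OFFSET_BITS`, `pack` and `unpack` are hypothetical names, and putting the offset in the high bits is just my assumption, not necessarily how #2 lays things out:

```rust
/// Hypothetical split: 40 bits of offset, 24 bits of length.
const OFFSET_BITS: u32 = 40;
const LEN_BITS: u32 = 64 - OFFSET_BITS;

/// Pack offset and length into one u64 (offset in the high bits).
/// Returns `None` if either value exceeds its field.
fn pack(offset: u64, len: u64) -> Option<u64> {
    if offset >= (1u64 << OFFSET_BITS) || len >= (1u64 << LEN_BITS) {
        return None;
    }
    Some((offset << LEN_BITS) | len)
}

/// Split a packed u64 back into (offset, length).
fn unpack(value: u64) -> (u64, u64) {
    let len_mask = (1u64 << LEN_BITS) - 1;
    (value >> LEN_BITS, value & len_mask)
}
```

Returning `None` instead of silently truncating means an oversized file shows up as an error at index-build time rather than as a corrupt entry later.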

This can of course be combined with the alignment option described above: with a 36bit address, a 4bit shift, and a 28bit length we get (2^36)*(2^4)=2^40=1TB archive files containing 16 byte aligned files up to 2^28 bytes (about 268MB, i.e. 256MiB).
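To double-check the arithmetic for the combined variant, a small sketch with the two knobs as parameters. The `Layout` type and its methods are invented for this example, and it assumes `offset_bits` is between 1 and 63 and `offset_bits + align_bits` stays below 64:

```rust
/// Hypothetical description of how the u64 is split.
#[derive(Clone, Copy)]
struct Layout {
    /// Bits used for the (shifted) archive offset.
    offset_bits: u32,
    /// Files are aligned to 2^align_bits bytes.
    align_bits: u32,
}

impl Layout {
    fn len_bits(self) -> u32 {
        64 - self.offset_bits
    }

    /// Exclusive upper bound for byte offsets into the archive.
    fn archive_limit(self) -> u64 {
        1u64 << (self.offset_bits + self.align_bits)
    }

    /// Exclusive upper bound for the length of a single file.
    fn file_limit(self) -> u64 {
        1u64 << self.len_bits()
    }

    /// Pack an aligned offset and a length; `None` if out of range or unaligned.
    fn pack(self, offset: u64, len: u64) -> Option<u64> {
        let align_mask = (1u64 << self.align_bits) - 1;
        if offset & align_mask != 0
            || offset >= self.archive_limit()
            || len >= self.file_limit()
        {
            return None;
        }
        Some(((offset >> self.align_bits) << self.len_bits()) | len)
    }

    /// Recover (byte offset, length) from a packed value.
    fn unpack(self, value: u64) -> (u64, u64) {
        let len_mask = (1u64 << self.len_bits()) - 1;
        ((value >> self.len_bits()) << self.align_bits, value & len_mask)
    }
}
```

`Layout { offset_bits: 36, align_bits: 4 }` gives the numbers from this section, and `Layout { offset_bits: 32, align_bits: 0 }` gives the current 4GB/4GB limits, so the same code could be used to compare options.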

Patching fst to allow other value types

The fst docs mention that in the future, it should be possible to map to something other than a u64. We can make that future happen. This seems to be the most complicated and time-consuming of all the options, though :)


Please correct my math, it's late and I had a few beers.


Labels: enhancement (New feature or request)
