Provide .ls endpoint #2

dasJ · 2020-01-14T14:13:47Z

Proposed solution

The upstream NixOS cache provides an .ls endpoint which contains directory indexes for the nar files.
Tools like nix-index use that. It would be convenient to have the same functionality in eris so nix-index can be extended to fetch from multiple caches.
Example URL: http://cache.nixos.org/wi96xcbm63zccfxi5f648b9pkak9d62k.ls is a listing of http://cache.nixos.org/wi96xcbm63zccfxi5f648b9pkak9d62k.narinfo

Alternatives considered

Don't implement it I guess

Additional context

Looking at the nix-index source code, the file is either uncompressed, brotli-compressed, or xz-compressed: https://github.com/bennofs/nix-index/blob/master/src/hydra.rs#L201
The example above is just a JSON with the file structure: curl -L -o - http://cache.nixos.org/wi96xcbm63zccfxi5f648b9pkak9d62k.ls | brotli -d
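To make the three encodings concrete, here is a minimal Python sketch of a client-side decoder. It is not taken from nix-index. `decode_listing` is a hypothetical helper, and the detection order (xz magic bytes, then plain JSON, then brotli as a last resort) is an assumption chosen because brotli streams have no magic number. The `brotli` module is a third-party package and is only imported if the first two attempts fail.

```python
import json
import lzma

XZ_MAGIC = b"\xfd7zXZ\x00"  # fixed 6-byte header of the .xz container format


def decode_listing(raw: bytes) -> dict:
    """Decode a .ls body that may be plain JSON, xz, or brotli (hypothetical helper)."""
    if raw.startswith(XZ_MAGIC):
        return json.loads(lzma.decompress(raw))
    try:
        # Uncompressed case: the body is already JSON.
        return json.loads(raw)
    except ValueError:
        pass
    # Brotli has no magic number, so it is tried last.
    import brotli  # third-party package, assumed to be installed
    return json.loads(brotli.decompress(raw))
```

A real client would more likely trust the HTTP Content-Encoding header when the server sends one, and fall back to sniffing only when it is absent.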

thoughtpolice · 2020-01-20T20:52:06Z

Eris is currently stateless, so this requires a (recursive!) filesystem query for every .ls request to stat everything, in order to build the JSON representation. (Recursive stat is exactly how Nix itself builds this representation; look up listNar in the Nix source code.)

This approach is "fine" for Hydra because it can generate these indexes at the time it builds the derivation, just before uploading them into the S3 bucket: the cost to query once the file has been built is zero. That isn't the case for us. stat is, generally, going to be very expensive -- mostly because it has to access the filesystem metadata blocks to get at the information, and on most filesystems that isn't going to be cheap (e.g. most metadata isn't going to fit in cache lines, things like that), and in fact many times it probably implies locking somewhere. And in practice it's going to be very bad on any derivation that has "shallow" directories with many, many files inside, which will be one of the worst cases. I wouldn't be surprised if you could easily DDoS/lock up Eris in such instances with repeated queries by just consuming all the HTTP worker threads with IO stalls.

This request is quite reasonable on the face of it all, but probably quite hard to implement well -- I expect this to be very, very high overhead per .ls request (meaning nix-index is probably going to thrash the hell out of it), and I don't see a good way to immediately implement it.
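For illustration, the recursive stat walk described in this comment could look roughly like the Python sketch below. This is not Eris code (Eris is a Mojo application) and not Nix's listNar. The JSON field names (`version`, `root`, `type`, `entries`, `size`, `executable`, `target`) are written from memory of the listing format and should be treated as assumptions. The `narOffset` field that real listings carry is omitted, because it depends on the NAR layout and cannot be derived from stat alone. `list_path` and `build_listing` are hypothetical names.

```python
import json
import os
import stat


def list_path(path: str) -> dict:
    """Describe one filesystem entry using lstat, recursing into directories."""
    st = os.lstat(path)  # one metadata lookup per entry: this is the costly part
    if stat.S_ISLNK(st.st_mode):
        return {"type": "symlink", "target": os.readlink(path)}
    if stat.S_ISDIR(st.st_mode):
        entries = {}
        for name in sorted(os.listdir(path)):
            entries[name] = list_path(os.path.join(path, name))
        return {"type": "directory", "entries": entries}
    node = {"type": "regular", "size": st.st_size}
    if st.st_mode & stat.S_IXUSR:
        node["executable"] = True
    return node


def build_listing(store_path: str) -> str:
    """Serialise the whole tree under store_path as one JSON document."""
    return json.dumps({"version": 1, "root": list_path(store_path)})
```

Every file and directory costs at least one `lstat` call, so a flat directory with many files turns a single .ls request into that many metadata lookups.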

thoughtpolice · 2020-01-20T20:57:40Z

One way to do this that might not lock up the HTTP worker threads could be to asynchronously launch a task that performs this lookup in the background, and passes data back/notifies the worker thread responsible for the client when it's done, so it can return the info. This means that the actual worker threads for the Mojo server can handle other requests in the meantime. But on the other hand, it doesn't solve the underlying problem, which is that querying this information on-demand simply isn't very efficient in the first place, and has other issues (how many processes should you leave outstanding at once? Should you allow bursting e.g. with token buckets? What happens if enough processes just eat up all your IOPS anyway and everything grinds to a halt? Etc etc.)
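As a rough sketch of the background-task idea, here is the same shape in Python's asyncio rather than Mojo, so it is an analogy and not a proposed patch. The blocking walk runs in a worker thread, so the event loop stays free for other requests, and a semaphore caps how many walks are outstanding at once. `MAX_OUTSTANDING`, `count_entries`, `handle_ls`, and `main` are hypothetical names, and the cap of 4 is an arbitrary placeholder.

```python
import asyncio
import os

MAX_OUTSTANDING = 4  # hypothetical cap on concurrent listing jobs


def count_entries(path: str) -> int:
    """Stand-in for the expensive recursive stat walk (blocking)."""
    total = 0
    for _root, dirs, files in os.walk(path):
        total += len(dirs) + len(files)
    return total


async def handle_ls(path: str, limiter: asyncio.Semaphore) -> int:
    """Run the blocking walk off the event loop, at most MAX_OUTSTANDING at a time."""
    async with limiter:
        return await asyncio.to_thread(count_entries, path)


async def main(paths):
    # Create the semaphore inside the running loop and share it across requests.
    limiter = asyncio.Semaphore(MAX_OUTSTANDING)
    return await asyncio.gather(*(handle_ls(p, limiter) for p in paths))
```

This only bounds concurrency. It does nothing about the per-request cost of the walk itself, and requests beyond the cap simply queue behind the semaphore.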

dasJ added the A-enhancement New feature or request label Jan 14, 2020

thoughtpolice added E-hard Hard issues P-low Low priority issues labels Jan 20, 2020

thoughtpolice added this to the Unknown milestone Feb 12, 2020
