implement CacheStorage? #2071

jimmywarting · 2023-04-14T15:23:46Z

I was wondering if you would be willing to look into implementing the Cache Storage api

it was something i was looking into in the past... but it got abandoned.
it is pretty much tightly coupled with Request/Response classes

and i would like to use it to avoid having to make extra requests if i have the key (Request) cached locally.

it's also in fact tightly coupled with the cache option in Request too.
it would be neat if this could just work with some default storage and Request / Responses where saved locally on the disc somewhere.

fetch(url, {
  cache: "no-cache", // *default, no-cache, reload, force-cache, only-if-cached
})

manual working with caches works to...

    caches.match(request).then((response) => {
      // caches.match() always resolves
      // but in case of success response will have value
      if (response !== undefined) {
        return response;
      } else {
        return fetch(event.request)
          .then((response) => {
            // response may be used only once
            // we need to save clone to put one copy in cache
            // and serve second one
            let responseClone = response.clone();

            caches.open("v1").then((cache) => {
              cache.put(event.request, responseClone);
            });
            return response;
          })
          .catch(() => caches.match("/gallery/myLittleVader.jpg"));
      }
    })

(taken from mdn)

maybe caches.open('node:undici') could store the default request / responses from fetch when using the cache option?

The text was updated successfully, but these errors were encountered:

ronag · 2023-04-14T15:29:04Z

I don't have time to do it myself but I'd be happy to see something like this added.

KhafraDev · 2023-04-14T15:29:27Z

I'll look into it.

KhafraDev · 2023-04-14T15:37:54Z

I believe we would need sqlite or some alternative to implement this correctly, which makes it a non-starter: "Furthermore, because CacheStorage requires file-system access...". Deno also uses sqlite for this.

jimmywarting · 2023-04-14T15:39:30Z

Oh, didn't know deno had cache storage, that is cool :)
then i think NodeJS should also definitely have it too!

KhafraDev · 2023-04-14T15:40:56Z

If node was to implement some form of sqlite database this would be actionable, but we can't really do anything better than an in-memory cache (which is worse than a userland alternative).

ronag · 2023-04-14T15:41:25Z

RocksDB with the BlobDB variant would be a perfect fit IMHO

ronag · 2023-04-14T15:42:01Z

@mcollina Any thoughts? Would vendoring a db into node core even be an option?

jimmywarting · 2023-04-14T15:44:14Z

I think it would be good if the nodes http-loader could utilize fetch + CacheStorage.
But also for other things such as caching static api responses like some config etc.

KhafraDev · 2023-04-14T15:46:09Z

I think it would be useful, but better as another library. Nothing requires fetch internals afaict so it should be possible. Adding a postinstall script and a million build dependencies doesn't seem like a good idea for undici.

jimmywarting · 2023-04-14T15:49:02Z

... but better as another library. Nothing requires fetch internals afaict so it should be possible

What if you want to use: fetch(url, { cache: 'only-if-cached' })?
fetch would have to do something...

KhafraDev · 2023-04-14T15:53:10Z

I was talking about the CacheStorage implementation itself, not really integrating it into fetch, which might need fetch internals. There are a number of fetch options that are either ignored or have no effect, so setting cache won't do anything. It is also possible for another library to wrap undici's fetch and implement the cache option themselves, similar as to what's been done with cookie jars and whatnot.

KhafraDev · 2023-04-14T19:00:32Z

Here's what I came up with before I got tired of working on it: https://github.com/KhafraDev/node-cachestorage

It implements CacheStorage and Cache. None of Cache is implemented other than matching so you can't really test it out. If node ever got a database of some sort I'd probably copy this implementation into undici so feel free to contribute 😄.

mcollina · 2023-04-14T20:34:07Z

To be honest, I don't think a naive implementation needs a database for this. We could just build it on top of a temporary folder and hashed URLs.

KhafraDev · 2023-04-14T22:12:59Z

yeah, but then it's a worse version of something that can be done elsewhere. It also adds a lot more complexity to an already complicated (badly written) spec.

jimmywarting · 2023-04-15T08:19:12Z

Having a IndexedDB would be something useful.
I know it's not the absolut loveliest api \w callbacks and what not. but it sure can save a lot of things structuralClone is able to clone. it includes: Circular ref, Object Literals, BigInts, Map, Set, ArrayBuffer, TypedArray, Blob, Files, Error, regex, FileSystem handles and some crypto stuff, etc.

and other databases are not so good at saving javascript objects that are structural cloneable. it goes very much hand in hand with the v8.Serializer to make it a searchable.

I could definitely see me building a CacheStorage on top of IndexedDB.
There are quite a bunch of npm package that expects that it exist so it can store curtain things without really having to care about the environment it's operating in. it's almost like how every android app have a built in nosql database
So it is much easier to build polyfills for this kind of stuff.

ronag · 2023-04-15T08:30:06Z

@benjamingr does implementing a db make us more web compliant?

jimmywarting · 2023-04-15T08:43:26Z

Now when i think about it... @mcollina have a good argument about using the file system.
Responses could be very large so being able to pipe the body to/from the disc without allocating much memory would also be useful. some ppl download pretty large stuff.

and doing things such as:

const responseClone = response.clone()
cache.put(request, responseClone)
return response

lets you operate on it much earlier in parallel then if you had to put everything in a database and wait for it to be copied over with a single ArrayBuffer or something

The cache storage pretty much lets you iterate over all Request/Responses and sees them.
so if the Responses could be constructed with a lazy stream or heck even a disc based blob then that would be a plus

So a file based storage would not actually be that bad if it's stream/memory friendlier

KhafraDev · 2023-04-15T15:07:12Z

I mentioned it previously but the CacheStorage/Cache spec is not written as concisely/cleanly as the fetch spec. It's actually much worse than the websocket spec, which is impressive. None of the steps are clear and it feels like steps are skipped over/not implemented (look at Cache.addAll, the spec doesn't mention caching the responses but it expects you to). Adding another layer of complexity here makes it even harder to implement.

Referencing Deno: their implementation is massively outdated and doesn't include many methods, either on Cache or CacheStorage. They're missing Cache.prototype.add/Cache.prototype.addAll, CacheStorage.prototype.matchAll, and probably a few others. It's also behind a flag I suspect, because logging globalThis.caches returns undefined.

Not to mention that the only way to open file-backed blobs is through fs.openAsBlob which was added in v19.8.0 and is still experimental.

However if you want to implement it (could be a ~~fun~~ weekend project), I have a branch that implements the outlines of both Cache and CacheStorage, with webidl conversions and everything needed.

ronag · 2023-04-15T15:32:48Z

I think using e.g. the chromium implementation is the best spec 🤷‍♂️

jimmywarting · 2023-04-15T15:51:26Z

maybe it dose not necessary have to be just a disc- based blob. openAsBlob would take a long time until everybody could actually start using it... Response / fetch just probably see blobs in the init and convert the body to streams anyway? body = blob.stream() or something and Response also support streams and async iterables too. so it could maybe just be as simple as just doing something like new Response(fs.createReadStream(path))

KhafraDev · 2023-04-15T16:07:48Z

However we also need to cache the body (and request headers and method) because each cache match is meant to return a new Response.

I took a look at the chromium implementation and it looks like each cache entry creates 3 temporary files (1 for the body, 1 for headers, and 1 for other request stuff). But since it's in c++ it's kinda hard to translate that to js.

jimmywarting · 2023-04-15T20:29:24Z

the Request body isn't saved, right? that's how i remembered it... only the request header & request url is saved?
would it not just be possible to save just 2 files... one for Response body and then a json with request and response info...

// 1.json
{
  url: "https://example.com/cat.png"
  request: {
    headers,
    url
  },
  response: {
    headers,
    status,
    statusText,
    url: response.url // the final response url after redirects
  },
}

// 1.bin
the hole response body...

KhafraDev · 2023-04-15T21:00:36Z

Yep, request body doesn't need to be saved, I was referring to the response body there. I'm not sure why they use 3 files, maybe I just misread because I can't find the comment again.

Their implementation creates a directory for each cache (ie. await caches.open('v1') will create a v1 directory, which is where the files are then saved. But we also don't want to name the files 1.json, 1.bin etc. because it wouldn't be very performant to do a cache lookup.

Some limitations/design details/questions:

We might not be able to create directories/files based on the hostname. On windows, for example, this is limited to 255 or 256 characters. It's also possible just to throw an error is the cache name is too long. Maybe we can hash it in a way that it's guaranteed to fit within that length?
One url can be cached multiple times depending on the vary header, so we should create new folders for each url (or move to a separate folder when 1 url is cached multiple times?).

benjamingr · 2023-04-17T11:21:51Z

@ronag

@benjamingr does implementing a db make us more web compliant?

Not as far as I'm aware but it would make it easier to write code that works both in browsers/node which is the point of web compliance in the first place.

jimmywarting added the enhancement New feature or request label Apr 14, 2023

KhafraDev mentioned this issue Apr 17, 2023

Cache storage #2076

Merged

KhafraDev closed this as completed in #2076 Apr 20, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

implement CacheStorage? #2071

implement CacheStorage? #2071

jimmywarting commented Apr 14, 2023

ronag commented Apr 14, 2023

KhafraDev commented Apr 14, 2023

KhafraDev commented Apr 14, 2023

jimmywarting commented Apr 14, 2023

KhafraDev commented Apr 14, 2023

ronag commented Apr 14, 2023 •

edited

ronag commented Apr 14, 2023

jimmywarting commented Apr 14, 2023

KhafraDev commented Apr 14, 2023

jimmywarting commented Apr 14, 2023 •

edited

KhafraDev commented Apr 14, 2023

KhafraDev commented Apr 14, 2023

mcollina commented Apr 14, 2023

KhafraDev commented Apr 14, 2023

jimmywarting commented Apr 15, 2023

ronag commented Apr 15, 2023

jimmywarting commented Apr 15, 2023 •

edited

KhafraDev commented Apr 15, 2023

ronag commented Apr 15, 2023

jimmywarting commented Apr 15, 2023

KhafraDev commented Apr 15, 2023

jimmywarting commented Apr 15, 2023

KhafraDev commented Apr 15, 2023 •

edited

benjamingr commented Apr 17, 2023

implement CacheStorage? #2071

implement CacheStorage? #2071

Comments

jimmywarting commented Apr 14, 2023

ronag commented Apr 14, 2023

KhafraDev commented Apr 14, 2023

KhafraDev commented Apr 14, 2023

jimmywarting commented Apr 14, 2023

KhafraDev commented Apr 14, 2023

ronag commented Apr 14, 2023 • edited

ronag commented Apr 14, 2023

jimmywarting commented Apr 14, 2023

KhafraDev commented Apr 14, 2023

jimmywarting commented Apr 14, 2023 • edited

KhafraDev commented Apr 14, 2023

KhafraDev commented Apr 14, 2023

mcollina commented Apr 14, 2023

KhafraDev commented Apr 14, 2023

jimmywarting commented Apr 15, 2023

ronag commented Apr 15, 2023

jimmywarting commented Apr 15, 2023 • edited

KhafraDev commented Apr 15, 2023

ronag commented Apr 15, 2023

jimmywarting commented Apr 15, 2023

KhafraDev commented Apr 15, 2023

jimmywarting commented Apr 15, 2023

KhafraDev commented Apr 15, 2023 • edited

benjamingr commented Apr 17, 2023

ronag commented Apr 14, 2023 •

edited

jimmywarting commented Apr 14, 2023 •

edited

jimmywarting commented Apr 15, 2023 •

edited

KhafraDev commented Apr 15, 2023 •

edited