Implement `only-if-cached` mode for HTTP #1954

bbockelm · 2023-03-11T20:31:04Z

This implements the HTTP only-if-cached mode, allowing the client to prevent triggering a download from the origin.

The intended use case is one where pulling from the origin is expensive and the client may prefer to try another cache first.

This is based on top of #1953 - to be reviewed after that work is committed.

The HTTP `cache-control` header has a well-defined set of potential values. This helper class will allow us to centralize its parsing.
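As a rough illustration of what centralized parsing could look like, here is a minimal sketch. The names `CacheFlag`, `normalize` and `parseCacheControl` are hypothetical and are not the interface of `XrdOucCacheDirective`; the sketch also assumes a plain comma-separated value and does not handle quoted arguments that themselves contain commas.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstdint>
#include <string>

// Hypothetical flag set for illustration; not the XrdOucCacheDirective API.
enum CacheFlag : std::uint8_t {
    kNoCache      = 1 << 0,
    kNoStore      = 1 << 1,
    kOnlyIfCached = 1 << 2,
};

// Trim spaces/tabs and lowercase a token (directive names are case-insensitive).
static std::string normalize(const std::string &raw) {
    auto begin = raw.find_first_not_of(" \t");
    if (begin == std::string::npos) return "";
    auto end = raw.find_last_not_of(" \t");
    std::string out = raw.substr(begin, end - begin + 1);
    std::transform(out.begin(), out.end(), out.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return out;
}

// Parse a comma-separated cache-control value into a bitmask.
// Unknown directives are ignored; anything after '=' is dropped.
std::uint8_t parseCacheControl(const std::string &header) {
    std::uint8_t flags = 0;
    std::size_t pos = 0;
    while (pos <= header.size()) {
        std::size_t comma = header.find(',', pos);
        if (comma == std::string::npos) comma = header.size();
        std::string token = normalize(header.substr(pos, comma - pos));
        token = token.substr(0, token.find('='));
        if (token == "no-cache") flags |= kNoCache;
        else if (token == "no-store") flags |= kNoStore;
        else if (token == "only-if-cached") flags |= kOnlyIfCached;
        pos = comma + 1;
    }
    return flags;
}
```

With this sketch, `parseCacheControl("no-store, Only-If-Cached")` yields `kNoStore | kOnlyIfCached`, and a value holding only unrecognized directives yields `0`.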

This provides the PFC with the ability to parse the `cache-control` CGI, using the XrdOucCacheDirective helper class, and understand the no-store and no-cache directives. *Note* this handles the simple cases only -- if the file in the cache was already opened by another client then we will serve from that file according to the original open rules.

Fixes xrootd#1886

With this, clients will receive the RFC standard 504 Gateway Timeout if the file is not already cached. This subtly changes the semantics of the creation time in the cinfo file to be when the first block of data is written as opposed to when the cinfo is created (as opening but not reading the file will create the cinfo file).

ccaffy · 2023-03-13T16:29:38Z

I am OK with the HTTP part. I never worked with the XrdPfc library so I leave it to somebody more expert on that :)

osschar · 2023-03-13T23:21:44Z

For only-if-cached ... isn't a cache supposed to return error when some non-cached data is requested? This could be implemented as a flag in read-request from IO.

I understand this might be a problem for xcache, as there might be a hole in a file: the read could go smoothly for a long time before it hits a hole in the cached data.

Again (as for #1953) I think this should not be a state of XrdPfcFile. We can find a way to disable prefetching on such IO.

And, also again, for direct mode proxies we should find a way to coalesce clients with the same cache-control to the same IO. One has to think at this point how various control flags interact ... and which are clearly invalid, e.g., no-cache&only-if-cached.
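To make the "clearly invalid" case concrete, a minimal sketch of such a check might look like the following. The flag constants and `isContradictory` are hypothetical names used only for illustration; which other combinations should be rejected is left open here, as it is in the discussion.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag bits for illustration only.
constexpr std::uint8_t kNoCache = 1, kNoStore = 2, kOnlyIfCached = 4;

// only-if-cached forbids contacting the origin, while no-cache asks for
// revalidation with the origin, so the two cannot both be honored.
constexpr bool isContradictory(std::uint8_t flags) {
    return (flags & kOnlyIfCached) && (flags & kNoCache);
}
```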

bbockelm · 2023-03-13T23:40:44Z

> For only-if-cached ... isn't a cache supposed to return error when some non-cached data is requested?

Possibly, but this is fairly impractical. Remember the HTTP response is sent up front. So, anything after the first byte and you lose the ability to send an error.

So, for now, calling “any cached byte means the response is cached” is good enough for me.
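A minimal sketch of that approximation, assuming the decision is made once at open time: `statusForOpen` and its parameters are hypothetical and do not correspond to any function in the PR.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical inputs: whether only-if-cached was requested and how many
// bytes of the file the cache currently holds.
int statusForOpen(bool onlyIfCached, std::uint64_t cachedBytes) {
    // Nothing cached and the client refuses an origin fetch: 504 Gateway Timeout.
    if (onlyIfCached && cachedBytes == 0) return 504;
    // Any cached byte counts as "cached"; later holes are not reported,
    // because the status line has already been sent by then.
    return 200;
}
```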

> And, also again, for direct mode proxies we should find a way to coalesce clients with the same cache-control to the same IO.

That would require some serious surgery of the OFS layer to have multiple file handles per file, and it’d break the threading contracts … I think the approximations done here are not perfect but are better than the more serious surgery.

osschar · 2023-03-14T08:22:16Z

OK, agreed about the semantics. Let's see how we deal with #1953 and then proceed.

osschar · 2023-03-14T08:39:00Z

And I shamefully admit I don't know how http file transfer works ... I was thinking about this from the perspective of xroot protocol (as these flags would apply there as well) where every sub read gets a status code.

Or you were only thinking of using http?

bbockelm · 2023-03-14T12:32:04Z

> Or you were only thinking of using http?

I would like the feature to be generally useful for any protocol. However, since the driver is better HTTP 1.1 compliance, unless there's a need to push the functionality all the way through to the reads themselves, it seems simpler to leave it as a file-level property. I can see some downsides around code complexity in the read path.

abh3 · 2023-03-14T14:56:23Z

To that end, could the actual flags be fully documented in a "TO-DO" file explaining what each one does? If this is applicable to file level only then it's easily extended to xroot protocol. At the moment no one (other than JT) is asking for this kind of functionality and when it is asked it's in the context of HTTP. I also agree that we don't want to start changing everything to provide 100% consistency in terms of cache control. From what I understand, these flags are considered "best effort" and might be ignored if they cannot be fulfilled even in the HTTP world. In the end, it all winds up being a compromise.

abh3

I guess this is in the same comment as the max-age pull request. The same comments apply here.

amadio · 2023-11-01T10:29:59Z

Thank you for the idea, @bbockelm, this has been implemented in #2104 and will be in the next feature release.

bbockelm added 4 commits March 11, 2023 11:13

Add helper class for parsing cache-control header

1e20fbd

The HTTP `cache-control` header has a well-defined set of potential values. This helper class will allow us to centralize its parsing.

Add support for the cache-control header

e63f575

Fixes xrootd#1886

abh3 requested review from osschar, abh3 and ccaffy March 12, 2023 06:20

bbockelm mentioned this pull request Mar 13, 2023

Implement max-age directive #1957

Closed

abh3 reviewed Apr 14, 2023

alja mentioned this pull request Oct 13, 2023

Implement only-if-cached cache control using XrdPfcFsctl #2104

Merged

amadio closed this Nov 1, 2023
