-
Notifications
You must be signed in to change notification settings - Fork 149
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement only-if-cached
mode for HTTP
#1954
Conversation
The HTTP `cache-control` header has a well-defined set of potential values. This helper class will allow us to central its parsing.
This provides the PFC with the ability to parse the `cache-control` CGI, using the XrdOucCacheDirective helper class, and understand the no-store and no-cache directives. *Note* this handles the simple cases only -- if the file in the cache was already opened by another client then we will serve from that file according to the original open rules.
With this, clients will receive the RFC standard 504 Gateway Timeout if the file is not already cached. This subtly changes the semantics of the creation time in the cinfo file to be when the first block of data is written as opposed to when the cinfo is created (as opening but not reading the file will create the cinfo file).
I am OK with the HTTP part. I never worked with the XrdPfc library so I leave it to somebody more expert on that :) |
For only-if-cached ... isn't a cache supposed to return error when some non-cached data is requested? This could be implemented as a flag in read-request from IO. I understand this might be a problem for xcache as there might be hole in a file causing the read to be going smoothly for a long time before it hits a hole in cached data. Again (as for #1953) I think this should not be a state of XrdPfcFile. We can find a way to disable prefetching on such IO. And, also again, for direct mode proxies we should find a way to coalesce clients with the same cache-control to the same IO. One has to think at this point how various control flags interact ... and which are clearly invalid, e.g., no-cache&only-if-cached. |
Possibly, but this is fairly impractical. Remember the HTTP response is sent up front. So, anything after the first byte and you lose the ability to send an error. So, for now, calling “any cached byte means the response is cached” is good enough for me.
that would require some serious surgery of the OFS layer to have multiple file handles per file and it’d break the threading contracts … I think the approximations done here are not perfect but better than the more serious surgery. |
OK, agreed about the semantics. Let's see how we deal with #1953 and then proceed. |
And I shamefully admit I don;t know how http file transfer works ... I was thinking about this from the perspective of xroot protocol (as these flags would apply there as well) where every sub read gets a status code. Or you were only thinking of using http? |
I would like the feature to be generally useful for any protocol. However, since the driver is better HTTP 1.1 compliance, unless there's a need to push the functionality all the way through to the reads themselves, it seems simpler to leave it as a file-level property. I can see some downsides around code complexity in the read path. |
To that end. Could the actual flags be fully documented in a "TO-DO" file
explaining what each one does. If this is applicable to file level only
then it's easily extended to xroot protocol. At the moment no one (other
than JT) is asking for this kind of functionality and when it is asked
it's in the context of HTTP. I also agree that we don't want to start
changing everthing to provide 100% consistency in terms of cache control.
From what I understand, these flags are considered "best effort" and might
be ignored if they cannot be fullfilled even in the HTTP world world.
In the end, it all winds up being a compromise.
…On Tue, 14 Mar 2023, Brian P Bockelman wrote:
> Or you were only thinking of using http?
I would like the feature to be generally useful for any protocol. However, since the driver is better HTTP 1.1 compliance, unless there's a need to push the functionality all the way through to the reads themselves, it seems simpler to leave it as a file-level property. I can see some downsides around code complexity in the read path.
--
Reply to this email directly or view it on GitHub:
#1954 (comment)
You are receiving this because your review was requested.
Message ID: ***@***.***>
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this is in the same comment as the max-age pull request. The same comments apply here.
This implements the HTTP
only-if-cached
mode, allowing the client to prevent triggering a download from the origin.This is envisioned to be used in the case where pulling from the origin is expensive and the client may want to decide to go to another cache first.
This is based on top of #1953 - to be reviewed after that work is committed.