Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement only-if-cached mode for HTTP #1954

Closed
wants to merge 4 commits into from

Conversation

bbockelm
Copy link
Contributor

This implements the HTTP only-if-cached mode, allowing the client to prevent triggering a download from the origin.

This is envisioned to be used in the case where pulling from the origin is expensive and the client may want to decide to go to another cache first.

This is based on top of #1953 - to be reviewed after that work is committed.

The HTTP `cache-control` header has a well-defined set of potential
values.  This helper class will allow us to central its parsing.
This provides the PFC with the ability to parse the `cache-control`
CGI, using the XrdOucCacheDirective helper class, and understand
the no-store and no-cache directives.

*Note* this handles the simple cases only -- if the file in the
cache was already opened by another client then we will serve from that
file according to the original open rules.
With this, clients will receive the RFC standard 504 Gateway Timeout
if the file is not already cached.

This subtly changes the semantics of the creation time in the cinfo
file to be when the first block of data is written as opposed to when
the cinfo is created (as opening but not reading the file will create
the cinfo file).
@ccaffy
Copy link
Contributor

ccaffy commented Mar 13, 2023

I am OK with the HTTP part. I never worked with the XrdPfc library so I leave it to somebody more expert on that :)

@osschar
Copy link
Contributor

osschar commented Mar 13, 2023

For only-if-cached ... isn't a cache supposed to return error when some non-cached data is requested? This could be implemented as a flag in read-request from IO.

I understand this might be a problem for xcache as there might be hole in a file causing the read to be going smoothly for a long time before it hits a hole in cached data.

Again (as for #1953) I think this should not be a state of XrdPfcFile. We can find a way to disable prefetching on such IO.

And, also again, for direct mode proxies we should find a way to coalesce clients with the same cache-control to the same IO. One has to think at this point how various control flags interact ... and which are clearly invalid, e.g., no-cache&only-if-cached.

@bbockelm
Copy link
Contributor Author

For only-if-cached ... isn't a cache supposed to return error when some non-cached data is requested?

Possibly, but this is fairly impractical. Remember the HTTP response is sent up front. So, anything after the first byte and you lose the ability to send an error.

So, for now, calling “any cached byte means the response is cached” is good enough for me.

And, also again, for direct mode proxies we should find a way to coalesce clients with the same cache-control to the same IO.

that would require some serious surgery of the OFS layer to have multiple file handles per file and it’d break the threading contracts … I think the approximations done here are not perfect but better than the more serious surgery.

@osschar
Copy link
Contributor

osschar commented Mar 14, 2023

OK, agreed about the semantics. Let's see how we deal with #1953 and then proceed.

@osschar
Copy link
Contributor

osschar commented Mar 14, 2023

And I shamefully admit I don;t know how http file transfer works ... I was thinking about this from the perspective of xroot protocol (as these flags would apply there as well) where every sub read gets a status code.

Or you were only thinking of using http?

@bbockelm
Copy link
Contributor Author

Or you were only thinking of using http?

I would like the feature to be generally useful for any protocol. However, since the driver is better HTTP 1.1 compliance, unless there's a need to push the functionality all the way through to the reads themselves, it seems simpler to leave it as a file-level property. I can see some downsides around code complexity in the read path.

@abh3
Copy link
Member

abh3 commented Mar 14, 2023 via email

Copy link
Member

@abh3 abh3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this is in the same comment as the max-age pull request. The same comments apply here.

@amadio
Copy link
Member

amadio commented Nov 1, 2023

Thank you for the idea, @bbockelm, this has been implemented in #2104 and will be in the next feature release.

@amadio amadio closed this Nov 1, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants