-
-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement rewrite API #2789
Implement rewrite API #2789
Conversation
I've implemented a basic gzip compression rewriter, which provides an API similar to what was discussed in #2716. FileServer::new("static", Options::None)
.rewrite(NormalizeDirs)
.rewrite(Index("index.txt"))
.rewrite(CachedCompression::new()), Full example, with implementation: https://github.com/the10thWiz/rocket-caching-layer This particular implementation generates the compressed version of the file once it's been requested once, and responds with the uncompressed version in the meantime. It's not too hard to imagine different implementations, with different trade-offs. I built this as a proof-of-concept, to show that the new API actually worked for the intended use-case. I've also marked most of the |
I'm about to review this. Note that I tried to implement this API myself some time ago and ran into a bunch of issues. Here's my attempt, in case it helps with anything: https://paste.rs/wEcVu.rs Hopefully you've manage to overcome the limitations I found! |
I'm not sure what the exact limitations you came across, but here are a few things I've done differently:
There are a few points we need to decide on before this implementation can be finalized:
|
The entire patch for my rewrite API implementation is here: https://paste.rs/AbR41.patch I'll be referring to it in my review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The primary difference I see between this API and the one I implemented, at a more fundamental level, is the degree to which it seeks to make FileServer
completely generic. In my implementation, FileServer
becomes purely a sequence of rewrites:
+pub struct FileServer {
+ rewriters: Vec<Arc<dyn Rewriter>>,
+}
Whereas here, we still have the notion of a root and some options. I think to consider this a successful API, we need something that looks more like the former: we want to be able to take a path from the client and progressively apply rewrites until we have the response we're looking for, with those rewrites being controlled completely in a generic manner.
To that effect, I do think we need to make FileServer
be:
pub struct FileServer {
rewriters: Vec<Arc<dyn Rewriter>>,
}
And then be able to recover the existing functionality as a series of rewrites that we provide out of the box.
The second core difference I note is in the treatment of responses, in particular with respect to headers. In my changeset, I introduce changes to NamedFile
in which I add a HeaderMap
to the struct
:
+pub struct NamedFile {
+ file: File,
+ path: PathBuf,
+ headers: HeaderMap<'static>,
+}
(In reality I also added Metadata
, but I think that was a mistake.)
I think this makes sense as it creates parity between NamedFile
and FileServer
. My intent was to allow the rewriter to return a Path
and optionally a HeaderMap
and then use those to construct a NamedFile
that gets used to generate the response. This proved to not be enough as we also need to support rewriting into redirects, and thus returning a Response
was added.
In your implementation, you have an enum
that effectively replaces the need for Response
in my changeset.
I think the right approach is somewhere in-between.
At the end of my experiment, and after taking a look at the code here, I'm left with the impression that the rewrite API I suggested is fundamentally trying to be monadic in Option<FileResponse>
of some sort, and that we should really approach that API a little more directly. That is, to boil things down to a sequence of calls of the form Option<FileResponse>
->
Option<FileResponse>
, where FileResponse
looks something like FileServerResponse
:
enum FileResponse<'a> {
File(File<'a>),
Redirect(Redirect<'a>),
}
struct File<'a> {
path: Cow<'a, Path>,
headers: HeaderMap<'a>,
}
Then, FileServer
is really just an Option<FileResponse<'a>>
that starts as a Some(FileResponse::File(File::from(req.uri())))
(or something morally equivalent) and proceeds through a series of and_then()
s, map()s
, and filter()
s. I think this presentation would also make the ordering argument a bit clearer as we're already used to the ordering consequences between these operations.
If this path is followed, then the Rewrite
trait becomes:
pub trait Rewrite: Send + Sync + Any {
fn rewrite<'r>(&self, response: Option<FileResponse<'r>>) -> Option<FileResponse<'r>>;
}
It can be implemented for all function of the same form (for<'r> F: Fn(Option<FileResponse<'r>>) -> Option<FileResponse<'r>>
) as well as the bind
form (FileResponse<'r> -> Option<FileResponse<'r>>
) and the map
form (FileResponse<'r> -> FileResponse<'r>
). We can also specialize implementations for common forms such as File<'a> -> File<'a>
, File<'a> -> Option<File<'a>>
, and File<'a> -> FileResponse<'r>
.
This would mean we can implement the basic usage as something like:
FileServer::new()
.filter_file(|file| !file.path.components().any(|c| c.starts_with('.')))
.map_file(|file| if file.path.is_dir() && !file.path.ends_with('/') {
Redirect(Redirect::to(file.path + '/'))
} else {
File(file)
})
.map_file(|file| if file.path.is_dir() && file.path.join("index.html").exists() {
File(file.path.join("index.html"))
} else {
file
})
Of course, we'd want to provide these as reusable components, but the point is that the implementation becomes significantly more trivial.
I think the APIs we have proposed (the two rewrite traits) are to some extent isomorphic. My trait has the request and file server root as additional parameters, but I'm pretty sure we could eliminate that without any issues. (I've already refactored away the need for The reason I opted to go with an explicit I opted to retain With that in mind, pub struct FileServer {
root: PathBuf,
rewrites: Vec<Arc<dyn Rewriter>>,
rank: isize,
} I think including Overall, I agree with most of the API changes you propose. The only changes I'm not willing to lose is hiding dot files by default, and some kind of startup check for whether the root directory exists. |
We don't need to force filtering dot files by default: it would suffice to make the default constructor include a rewrite that filters dotfiles. This way we avoid negative logic.
0.6 is right around the corner. And in general, we shouldn't compromise on the API to avoid a breaking change unless there's a huge benefit to be had, which I don't feel is the case here.
I think
|_| File(self.0)
Absolutely.
I think this is the case for
I agree, and it doesn't need to be in conflict with the changes I propose. For the former, we can include it as part of the default constructor, not as part of |
I've implemented changes based on this discussion, and cleaned up the API quite a bit. There are a couple specific pain points with the new API, but they are some pretty minor nits, and should be fixable without any sweeping changes. First, Second, I hoped to avoid any kind of Finally, I'm going to need to rewrite the documentation and examples for |
@SergioBenitez I think I've got this to a pretty good place now. I'm pretty sure there aren't any subtle regressions left, but I'm not 100% sure. The test suite does pass now, so that's a good sign. Looking forward, do we want to add a pre-computed compression rewrite to core or contrib? I'm happy to open another PR for an implementation. If not, I'll likely create and publish a separate crate. |
17d886d
to
54b80b2
Compare
I'm updated this PR (using the changes you've made), to pass CI. Overall, I like most of the changes. I'm not sure how I feel about To that end, I've changed a few things:
|
core/lib/src/fs/rewrite.rs
Outdated
impl<'r> From<File<'r>> for Option<Rewrite<'r>> { | ||
fn from(value: File<'r>) -> Self { | ||
Some(Rewrite::File(value)) | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the utility of this impl
? I don't see it used anywhere, and I'm not sure we really want to hide the functionality under an opaque .into()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The other impl (From<Redirect>
for Option<Rewrite>
) is actually used in at least one doc-test.
This makes it possible for a rewrite to return File { .. }.into()
, rather than Some(File { .. }.into())
. I think it makes the API more succinct, without losing any information. I think it might be worth looking at the other doc-tests, and using this impl if appropriate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When adding documentation, now both are used (in the doc-tests for FileServer::filter
and FileServer::map
respectively).
core/lib/src/fs/server.rs
Outdated
/// Constructs a new `FileServer` that serves files from the file system | ||
/// `path` with a default rank. | ||
/// | ||
/// Adds a set of default rewrites: | ||
/// - [`filter_dotfiles`]: Hides all dotfiles. | ||
/// - [`prefix(path)`](prefix): Applies the root path. | ||
/// - [`normalize_dirs`]: Normalizes directories to have a trailing slash. | ||
pub fn directory<P: AsRef<Path>>(path: P) -> Self { | ||
Self::empty() | ||
.filter(|f, _| f.is_visible()) | ||
.rewrite(Prefix::checked(path)) | ||
.rewrite(TrailingDirs) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I had implemented this exact method but I removed it because I doubted its utility, given that this would make new()
exactly Self::directory(path).rewrite(DirIndex::unconditional("index.html"))
. If we're going to keep this, we should document the two methods well with why/when you would use one or the other. And we should use it in new()
's implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assumed something like that might have happened. It was still used in one of the doc-tests (of DirIndex
specifically). I think it's worth including, but I'm thinking it might be worth changing the name in addition to writing better docs.
core/lib/src/fs/rewrite.rs
Outdated
/// # Windows Note | ||
/// | ||
/// This does *not* check the file metadata on any platform, so hidden files | ||
/// on Windows will not be detected. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we make it so that it does check the metadata?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is bit more complex than the dotfiles check. First, it would only work after Prefix
(much like is_dir
or exists
), and the current dotfiles check is typically done before Prefix
.
Second, to match the behavior of dotfiles, we would likely want to also check all parent folders, which would likely represent a non-trivial amount of work. Right now, each Rewrite
requires at most one syscall, and I think that's a reasonable limit.
Overall, I'm not sure if it's worthwhile. We could provide an alternate method that does check fs metadata, but I'm not sure how much value it would provide. We would also likely need to create a more complex test for this functionality, since I'm not sure git preserves Windows file properties.
Apparently, no_compile doesn't prevent the doctest from being compiled
- Rewriter is now Monadic on `Option<FileResponse>` - `new` (+variants) now adds a proper set of default rewrites - filter_dotfiles now removed dotfiles - Compatibility has been broken
Changes `File` to contain the full `Origin` uri from the request, so normalize_dir doesn't have to deal with platform specific behavior. This is also a more convient way to handle Redirects, as well as provide more consistent access to the query parameters.
It doesn't do what I need it do.
- Add #[non_exaustive] to allow adding more FileResponses in the future - Improve readability of `FileServer::handle` - Take advantage of `NameFile`'s handling missing files - Add `strip_trailing_slash` to ensure files can be read from disk
- Change name of `missing_root` - Update documentation to reference non-panicing variant - Fix issue caused by attempting to open directories
- Introduce `FileMap` and `FileFilter` traits
- `Rewrite` and `Rewriter` are no longer re-exported from `fs`. - Renamed `FileServer::directory` to `FileServer::new_without_index`. - Added missing documentation in a number of places.
ae40009
to
fb8df12
Compare
Merged! 65e3b87 |
The goal of this PR is to implement a Rewrite API, following the basic outline presented in #2716.
The goal is to replace the existing
Options
onFileServer
to instead use a rewrite based API. This will makeFileServer
pluggable, making it possible to implement things like cached compression implementable outside of Rocket.The basic usage example is as follows:
There are a couple of major limitations to the rewrite trait, specifically the file to be served must exist on the local file system, and the trait is synchronous, so it should not do any IO when rewriting paths. However, I would argue that if you want to move beyond these, you should just implement your own
Handler
, and fully replaceFileServer
.