Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use AnyP::Uri::Decode() for urlpath checks #1660

Draft
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

kinkie
Copy link
Contributor

@kinkie kinkie commented Feb 2, 2024

No description provided.

Copy link
Contributor Author

@kinkie kinkie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was runtime-tested as well, with configuration:

acl test1 urlpath_regex ^/foo
acl test2 urlpath_regex ^/bar
http_access allow test1
http_access deny test2

and commands

$ http_proxy=http://localhost:3128/ curl -v http://www.squid-cache.org/bar
$ http_proxy=http://localhost:3128/ curl -v http://www.squid-cache.org/foo

* more consistent with other similar code
* more consistent with the conversion function used
xfree(esc_buf);
return result;
auto decodedPath = AnyP::Uri::Decode(checklist->request->url.path());
return data->match(decodedPath.c_str());
Copy link
Contributor

@rousskov rousskov Feb 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens when url.path() contains an invalid encoding? Will this Decode() call throw or do we filter out those cases elsewhere? If this Decode() call throws, how does the overall transaction outcome differ compared to the outcome produced by official code?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the current code the exception will jump out to the exception handler for the async Dialer, terminating whichever access checks were being performed and the transaction.

Copy link
Contributor Author

@kinkie kinkie Feb 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question!
With this code in, the transaction is aborted. What current-squid does is try to interpret it let it pass, so that then Apache can throw a 400 bad request error. I guess what squid SHOULD do is just return a 400 bad request error

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Faced with an invalid encoding, Squid code should do several things, depending on that code location.

  • Uri::Decode() should throw. That is already done.

  • Acl::Node::match() caller probably should convert exceptions to "I do not know" answers/errors. I cannot check all the details right now, so I cannot tell whether that is already done. It might be. If it is not done, it should be done in a dedicated PR.

  • URI parsing code should probably decode paths and either throw on invalid ones (and store decoded ones) or mark them specially so that they are matched and forwarded "as is". Handling invalid path encoding feels similar to handling URI path whitespace, so we may want to be consistent with that feature, especially if we like it's default behavior. Any related changes probably deserve a dedicated PR (at least).

  • This particular Decode() caller should not call Decode() at all or should continue to call Decode() as it does in the current PR code, depending on whether the input path is guaranteed to be decoded already (see the bullet above).

We may have a situation where PR-modified code lines are correct, but the PR should not be merged until other PRs prepare the rest of Squid for this PR change.

In either case, an invalid HTTP request (target path) should result in an error response; it should not kill the transaction.

Please do keep these caveats and complexities in mind as you select Decode() upgrade targets. Not all upgrades will be as easy as the first one. As the PR author, you are responsible for thinking about and, where applicable, testing exceptions introduced by Decode(). Mentioning, in the PR description, why they are not relevant or how they are handled is usually a good idea and may reduce the number of review iterations.

Copy link
Contributor

@rousskov rousskov Feb 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amos: In the current code the exception will jump out to the exception handler for the async Dialer, terminating whichever access checks were being performed and the transaction.

The exception will be called by JobDialer::dial() only if the current Acl::UrlPathCheck::match() call was triggered by firing of an async job call. Not all tasks have been converted to async jobs. Not all job callbacks have been converted to AsyncCalls. Legacy tasks/callbacks do not have a general exception handling mechanism like JobDialer::dial(). I have not checked carefully, but it is likely that these new match() exceptions thrown while processing some ACL-driven directives will kill Squid worker (because they will not be caught at the transaction or lower level).

Alex: Acl::Node::match() caller probably should convert exceptions to "I do not know" answers/errors. I cannot check all the details right now, so I cannot tell whether that is already done. It might be. If it is not done, it should be done in a dedicated PR.

I checked a bit more and could not find that conversion code in Acl::Node::matches() or nearby. It is probably missing and should be added in a dedicated PR.

@rousskov rousskov added the S-waiting-for-author author action is expected (and usually required) label Feb 3, 2024
@kinkie kinkie marked this pull request as draft February 3, 2024 12:17
@kinkie kinkie added the M-ignored-by-merge-bots https://github.com/measurement-factory/anubis/blob/master/README.md#pull-request-labels label Mar 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
M-ignored-by-merge-bots https://github.com/measurement-factory/anubis/blob/master/README.md#pull-request-labels S-waiting-for-author author action is expected (and usually required)
Projects
None yet
3 participants