Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add option to remove query strings from access log #3576

Closed
tsaarni opened this issue Apr 14, 2021 · 8 comments · Fixed by #3694
Closed

Add option to remove query strings from access log #3576

tsaarni opened this issue Apr 14, 2021 · 8 comments · Fixed by #3694
Labels
kind/feature Categorizes issue or PR as related to a new feature. lifecycle/needs-triage Indicates that an issue needs to be triaged by a project contributor.

Comments

@tsaarni
Copy link
Member

tsaarni commented Apr 14, 2021

Feature request

Envoy will log the complete request path, including the query string like ?supersecret=password in this example

[2021-04-14T16:36:00.361Z] "GET /foo?supersecret=password HTTP/1.1" 200 - 0 463 6 3 "-" "HTTPie/1.0.3" "837aa8dc-344f-4faa-b7d5-c9cce1028519" "localhost:8080" "127.0.0.1:8081"

It would be great to provide an option to remove query string from the access log, since it may contain sensitive information introducing a potential vulnerability. If the problem cannot be fixed in the proxied application itself, the vulnerability could be mitigated by sanitazing the access logs.

Potential options

  1. Contour could add HTTP filter called Header-To-Metadata and use it to remove query string from :PATH header and set the result in dynamic metadata, see formatter: print request header without query string envoyproxy/envoy#15711 (comment). Then Contour could provide the user an option to change the default access log format string, where they would replace %REQ(X-ENVOY-ORIGINAL-PATH?:PATH)% (including the query string) with %DYNAMIC_METADATA(projectcontour.io:path)% (query string sanitized)

  2. Proceed with the PR formatter: print request header without query string envoyproxy/envoy#15711 (or something similar) which intends to introduce a new extension command e.g. REQ_WITHOUT_QUERY for the purpose of sanitizing the headers. Next step would be for Contour to enable this extension in the access log configuration within HTTP Connection Manager. Then provide the user an option to change the default access log format string where they would replace %REQ(X-ENVOY-ORIGINAL-PATH?:PATH)% with %REQ_WITHOUT_QUERY(X-ENVOY-ORIGINAL-PATH?:PATH)%

Option (1) is possible to implement with existing versions of Envoy, while (2) would introduce a version dependency.

@tsaarni tsaarni added kind/feature Categorizes issue or PR as related to a new feature. lifecycle/needs-triage Indicates that an issue needs to be triaged by a project contributor. labels Apr 14, 2021
@youngnick
Copy link
Member

Thanks for this issue @tsaarni. I'm leaning towards option 2, maybe with a boolean to strip query strings from the access logs? The only other option I see is full configurability of the access log format, which, while worthwhile, would need some more work.

@tsaarni
Copy link
Member Author

tsaarni commented May 19, 2021

Thanks @youngnick!

My progress with the Envoy PR has been slow but with the information I have now, I think it would make sense to make the text based access log format configurable in Contour.

My reasoning:

  1. There already is similar configuration for JSON based access log that gives user an option to freely define the format. Adding option for text based access log could be seen as feature parity.
  2. The Envoy extension I'm working on could in theory become compile time option. If we would have boolean option in Contour to strip the query string, it could require Envoy binary with the extension enabled at compile time.
  3. The Envoy extension is using the new "formatter" extension point. In future there might be other new formatter extensions, similar to %REQ_WITHOUT_QUERY()%. Giving users the ability to customize access log format string, gives them access to all of future formatters without impacts to Contour (even when someone would implement proprietary extension in their Envoy binary)

It seems like adding access log format configurability could be quite straightforward. I created very initial draft / WiP change that I'll submit for comments. I'd be curious to hear if this kind of feature would be acceptable and discuss how to proceed.

@youngnick
Copy link
Member

Thanks for the draft PR, @tsaarni, and I'm sorry it took so long to get back to you.

For the JSON logging config, we actually do a bunch of validation on the fields supplied, because:

  • some fields can have arguments
  • if the person supplying the config makes any mistakes with field names, and we pass the bad config to Envoy, which rejects it, Contour has no way to know (because of internal/grpc: properly handle Envoy ACK/NACK protocol #1176). So, in that case, the person configuring logging would only know when no logs show up, or when they're formatted incorrectly or something.

The design for the more complex JSON logging is available if you want to take a look.

We've got a similar problem here, I think. Now, #1176 is not as serious as I previously thought (in that sending a bad config will only result in the bad piece of config being rejected), but it still seems to me like we might need to do some preemptive parsing. @tsaarni, you've used this functionality quite a bit, what is the behavior like when there's a problem with a field? Is it just that field that doesn't get replaced?

@tsaarni
Copy link
Member Author

tsaarni commented Jun 2, 2021

you've used this functionality quite a bit, what is the behavior like when there's a problem with a field? Is it just that field that doesn't get replaced?

@youngnick It does not look too bad to me:

When I include invalid field INVALID(:METHOD) in access log format, I get following error in Contour log:

ERRO[0001] Error adding/updating listener(s) ingress_http: Not supported field in StreamInfo: INVALID(:METHOD)  code=13 connection=1 context=xds node_id=envoy-ns5wj node_version=v1.18.3 response_nonce=1 version_info=2

and at the same time in Envoy log:

[2021-06-02 06:09:21.507][1][warning][config] [source/common/config/grpc_subscription_impl.cc:127] gRPC config for type.googleapis.com/envoy.config.listener.v3.Listener rejected: Error adding/updating listener(s) ingress_http: Not supported field in StreamInfo: INVALID(:METHOD)

The listener was rejected but since it was an update, the existing listener remained and traffic was not disturbed. And at least for this kind of error, I think the error message is not too bad since it points out the exact invalid field. After correcting the format and restarting Contour, the XDS update was successful and Envoy picked up also the new access log format, so need to restart Envoy, just like you wrote in #1176.

If this was initial start or if Envoy(s) would have restarted at the same time and lost its current configuration, then traffic would not be served but I guess the behavior would be likely the same with Contour-level validation as well? I assume configuration errors like this are not common enough to make it worthwhile to build logic that could recognize invalid format and gracefully remove access logger while still configuring listeners without access log?

@tsaarni
Copy link
Member Author

tsaarni commented Jun 3, 2021

Edit: I incorrectly said previously "Giving users the ability to customize access log format string, gives them access to all of future formatters without impacts to Contour" but I got confused again. Since the Envoy formatter extension specifically requires each extension to be activated - there is always an impact to Contour.

If the access log configuration part within HTTP connection manager previously looked like (in YAML format to visualize):

access_log:
  - name: fileaccesslog
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.access_loggers.file.v3.FileAccessLog
      path: /dev/stdout
      log_format:
        text_format_source:
          inline_string: "CUSTOM FORMAT STRING\n"

Next an formatter extension was added, the configuration would need to be added with formatters array

access_log:
  - name: fileaccesslog
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.access_loggers.file.v3.FileAccessLog
      path: /dev/stdout
      log_format:
        text_format_source:
          inline_string: "CUSTOM FORMAT STRING USING FORMATTER EXTENSION\n"
        formatters:
          - name: envoy.formatter.req_without_query
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.formatter.req_without_query.v3.ReqWithoutQuery

Maybe Contour could search through custom formats that user set (both text and JSON format) and if finding REQ_WITHOUT_QUERY keyword, only then it would add the formatters configuration. Then Envoy wouldn't reject the configuration if running with version that does not have the extension compiled.

Currently it seems that my formatter extension could become compiled in by default, but I guess some users might still prefer disabling extensions they don't use.

@youngnick
Copy link
Member

That sounds like a good idea @tsaarni, I'd like to see that.

@tsaarni
Copy link
Member Author

tsaarni commented Jun 18, 2021

That sounds like a good idea @tsaarni, I'd like to see that.

I've now proceeded with this idea in #3694. The Envoy extension also was now merged to master and it is enabled by default.

@youngnick
Copy link
Member

I'll review that one, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/feature Categorizes issue or PR as related to a new feature. lifecycle/needs-triage Indicates that an issue needs to be triaged by a project contributor.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants