rgw: add access log to the beast frontend #33083
Force-pushed: 7f1b4e7 to 773ee51
added missing HTTP_REFERER header
jenkins test make check
src/rgw/rgw_process.cc (outdated)
    @@ -312,6 +313,22 @@ int process_request(rgw::sal::RGWRadosStore* const store,
    handler->put_op(op);
    rest->put_handler(handler);

    // access log line elements begin per Apache Combined Log Format with additions following
    static bool beast_framework = s->cct->_conf->rgw_frontends.find("beast") != string::npos;
    if ( beast_framework ) {
i think it would be preferable to move this code out of the generic process_request() code path and into the beast frontend itself. the rgw_frontends configuration is very flexible and allows you to run multiple frontends (e.g. beast and civetweb simultaneously), so checking the config string for "beast" only tells us that beast is configured, not that it served this request. there isn't a reliable way to detect which frontend issued the request at this point
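To illustrate the concern, here is a minimal sketch using stand-in types only. `config_mentions_beast`, `Frontend`, and the example config string are hypothetical and are not Ceph code. It shows that a check on the process-wide frontends string is true for every request once beast is configured, whereas a log written by the frontend itself is attributed correctly.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the check in the diff: it inspects the
// process-wide frontends setting, not the request being handled.
bool config_mentions_beast(const std::string& rgw_frontends) {
  return rgw_frontends.find("beast") != std::string::npos;
}

// Hypothetical model of a frontend that records its own access-log lines,
// so attribution never depends on the shared configuration string.
struct Frontend {
  std::string name;
  std::vector<std::string> access_log;

  void handle(const std::string& request_line) {
    access_log.push_back(name + ": " + request_line);
  }
};
```

With both frontends configured, the config check cannot distinguish them, while each `Frontend` instance only ever logs the requests it handled.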
src/rgw/rgw_process.cc (outdated)
    << ACCOUNTING_IO(s)->get_bytes_sent() + ACCOUNTING_IO(s)->get_bytes_received() << " "
    << (referer_hdr ? "\"" : "") << rgw_env.get("HTTP_REFERER", "-") << (referer_hdr ? "\"" : "") << " "
    << (user_agent_hdr ? "\"" : "") << rgw_env.get("HTTP_USER_AGENT", "-") << (user_agent_hdr ? "\"" : "")
    << " " << rgw_env.get("HTTP_RANGE", "-");
within the beast frontend, handle_connection() should have access to most of this information between the parser, real_client and real_client_io variables. the only things we don't have directly are the http_ret (which we could cache from the call to rgw::asio::ClientIO::send_status()) and the user_id (which civetweb doesn't show either)
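A minimal sketch of the caching idea, assuming a client I/O class whose send_status() takes a status code and a status name. `StatusCachingClient` and `http_status()` are hypothetical names for illustration, not the real rgw::asio::ClientIO interface.

```cpp
#include <cstddef>

// Hypothetical stand-in for a client I/O class; NOT the real
// rgw::asio::ClientIO. It remembers the status code it was asked to send
// so the frontend can log it after the request completes.
class StatusCachingClient {
  int cached_status_ = 0;  // 0 means no status line has been sent yet

 public:
  std::size_t send_status(int status, const char* status_name) {
    cached_status_ = status;
    // A real implementation would write the status line to the socket
    // here and return the number of bytes written.
    (void)status_name;
    return 0;
  }

  int http_status() const { return cached_status_; }
};
```

The access log code in the frontend could then read `http_status()` instead of relying on a req_state it does not share with process_request().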
Force-pushed: 1b87a62 to e544444
@cbodley Thank you, moved to
This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
unstale please
src/rgw/rgw_asio_frontend.cc (outdated)
    << (referer_hdr ? "\"" : "") << rgw_env.get("HTTP_REFERER", "-") << (referer_hdr ? "\"" : "") << " "
    << (user_agent_hdr ? "\"" : "") << rgw_env.get("HTTP_USER_AGENT", "-") << (user_agent_hdr ? "\"" : "")
    << " " << rgw_env.get("HTTP_RANGE", "-");
    ldout(cct, 1) << "beast: " << hex << &req << dec << ": " << rgw::crypt_sanitize::log_content(buf.str()) << dendl;
i'm not convinced that we need the crypt_sanitize part, because none of this output should contain key material from server side encryption requests. we have test coverage in teuthology that scans our rgw logs for these keys to detect leaks, so we can remove this part and verify whether or not it's needed
src/rgw/rgw_asio_frontend.cc (outdated)
    buf << rgw_env.get("REMOTE_ADDR", "") << " - - [" << s.time << "] \"" << s.info.method
    << " " << s.info.request_uri << (!s.info.request_params.empty() ? "?" : "") << s.info.request_params
    << " HTTP/" << rgw_env.get("HTTP_VERSION", "-") << "\" " << s.err.http_ret << " "
    << ACCOUNTING_IO(&s)->get_bytes_sent() + ACCOUNTING_IO(&s)->get_bytes_received() << " "
we don't need to call ACCOUNTING_IO() to get our rgw::io::Accounter*. the RGWRestfulIO client inherits rgw::io::AccountingFilter, which inherits rgw::io::Accounter, so you can call client.get_bytes_sent() and client.get_bytes_received() directly
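The inheritance argument can be sketched with simplified stand-in classes. The types below only mirror the names in the comment and are assumptions for illustration: the real rgw::io classes are more involved, and `record_sent`, `record_received`, `RestfulClient`, and `total_bytes` are hypothetical.

```cpp
#include <cstdint>

// Simplified stand-in for the accounting interface.
struct Accounter {
  virtual ~Accounter() = default;
  virtual std::uint64_t get_bytes_sent() const = 0;
  virtual std::uint64_t get_bytes_received() const = 0;
};

// Simplified stand-in for a filter that tracks byte counts.
class AccountingFilter : public Accounter {
  std::uint64_t sent_ = 0;
  std::uint64_t received_ = 0;

 public:
  void record_sent(std::uint64_t n) { sent_ += n; }
  void record_received(std::uint64_t n) { received_ += n; }
  std::uint64_t get_bytes_sent() const override { return sent_; }
  std::uint64_t get_bytes_received() const override { return received_; }
};

// The client derives from the filter, so the accounting getters are
// available on the client object without any cast or lookup macro.
class RestfulClient : public AccountingFilter {};

std::uint64_t total_bytes(const RestfulClient& client) {
  return client.get_bytes_sent() + client.get_bytes_received();
}
```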
src/rgw/rgw_asio_frontend.cc (outdated)
    bool referer_hdr = rgw_env.get("HTTP_REFERER") != nullptr;
    buf << rgw_env.get("REMOTE_ADDR", "") << " - - [" << s.time << "] \"" << s.info.method
    << " " << s.info.request_uri << (!s.info.request_params.empty() ? "?" : "") << s.info.request_params
    << " HTTP/" << rgw_env.get("HTTP_VERSION", "-") << "\" " << s.err.http_ret << " "
s is a different instance of req_state than the one in process_request(), so some of these values may be empty or inconsistent. for example, s.err.http_ret is always 200 here
src/rgw/rgw_asio_frontend.cc (outdated)
    s.cio = &client;
    std::stringstream buf;
    bool user_agent_hdr = rgw_env.get("HTTP_USER_AGENT") != nullptr;
    bool referer_hdr = rgw_env.get("HTTP_REFERER") != nullptr;
the beast::http::message that we read in with the parser has access to the method, http version, headers, etc. you can access this message with parser.get(), instead of going through the RGWEnv
i pushed an extra commit cbodley@44ec389 to branch https://github.com/cbodley/ceph/commits/wip-pr-33083 that uses the asio/beast types directly for this information
Force-pushed: be92fa8 to 4774c33
jenkins test make check
investigating build issue on RHEL7 platform
Force-pushed: 4774c33 to c780d73
Teuthology run is OK (issues unrelated):
Add to the Beast frontend an access log line similar to CivetWeb's,
adhering as closely as possible to the Apache Combined Log Format.

Fixes: https://tracker.ceph.com/issues/45920

rgw: use beast message for access log
(cherry picked from commit 44ec389)

Co-authored-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Mark Kogan <mkogan@redhat.com>
Force-pushed: c780d73 to 5ea7bb8
^ just updated commit message
Add to the Beast frontend an access log line similar to CivetWeb's,
adhering as closely as possible to the Apache Combined Log Format
(https://httpd.apache.org/docs/current/logs.html#common).
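For reference, a rough sketch of how such a line can be assembled. `AccessLogEntry`, `quoted_or_dash`, and `format_combined` are hypothetical names and this is not the code merged in this PR. Following the outdated diff in this thread, an absent Referer or User-Agent header is written as a bare `-` and a present one is wrapped in double quotes; the remote logname and user fields are always `-`.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical container for the fields of one access-log line.
struct AccessLogEntry {
  std::string remote_addr;
  std::string timestamp;     // already formatted, e.g. "10/Oct/2000:13:55:36 -0700"
  std::string method;
  std::string uri;
  std::string http_version;  // e.g. "1.1"
  int status = 0;
  std::uint64_t bytes = 0;
  std::string referer;       // empty means the header was absent
  std::string user_agent;    // empty means the header was absent
};

// Quote a header value, or emit a bare "-" when it is missing.
std::string quoted_or_dash(const std::string& value) {
  return value.empty() ? std::string("-") : "\"" + value + "\"";
}

// Build: addr - - [time] "METHOD uri HTTP/ver" status bytes referer user-agent
std::string format_combined(const AccessLogEntry& e) {
  std::ostringstream out;
  out << e.remote_addr << " - - [" << e.timestamp << "] \""
      << e.method << ' ' << e.uri << " HTTP/" << e.http_version << "\" "
      << e.status << ' ' << e.bytes << ' '
      << quoted_or_dash(e.referer) << ' ' << quoted_or_dash(e.user_agent);
  return out.str();
}
```

Note that the diff in this thread logs bytes sent plus bytes received and appends the Range header, both of which are additions beyond the Apache format; the sketch leaves the Range field out.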
example output log line of the requests by civetweb and what would be in beast as per the PR below:
Fixes: https://tracker.ceph.com/issues/45920
Co-authored-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Mark Kogan <mkogan@redhat.com>