Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rgw: add access log to the beast frontend #33083

Merged
merged 1 commit into from Jun 8, 2020

Conversation

mkogan1
Copy link
Contributor

@mkogan1 mkogan1 commented Feb 5, 2020

Add to the Beast frontend an access log line similar to CivetWeb.
attempting to adhere as much as possible to the Apache Combined Log Format
(https://httpd.apache.org/docs/current/logs.html#common)

example output log line of the requests by civetweb and what would be in beast as per the PR below:

2020-02-05T16:44:47.081+0200 7fb3e5aa8700  1 beast: 0x7fb3e5aa0740: 127.0.0.1 - cosbench [2020-02-05T16:44:47.045337+0200] "PUT /hotsauce-bench000000000002/0000
00005137 HTTP/1.1" 200 4096 - "aws-sdk-go/1.25.36 (go1.12.13; linux; amd64)" -
2020-02-05T16:44:47.081+0200 7fb3e5aa8700  1 civetweb: 0x4549138: 127.0.0.1 - - [05/Feb/2020:16:44:47 +0200] "PUT /hotsauce-bench000000000002/000000005137 HTTP/
1.1" 200 206 - aws-sdk-go/1.25.36 (go1.12.13; linux; amd64)

Fixes: https://tracker.ceph.com/issues/45920

Co-authored-by: Casey Bodley cbodley@redhat.com
Signed-off-by: Mark Kogan mkogan@redhat.com

Checklist

  • References tracker ticket
  • Updates documentation if necessary
  • Includes tests for new functionality or reproducer for bug

Show available Jenkins commands
  • jenkins retest this please
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard backend
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox

@mkogan1
Copy link
Contributor Author

mkogan1 commented Feb 5, 2020

added missing HTTP_REFERER header

@mkogan1
Copy link
Contributor Author

mkogan1 commented Feb 6, 2020

jenkins test make check

@@ -312,6 +313,22 @@ int process_request(rgw::sal::RGWRadosStore* const store,
handler->put_op(op);
rest->put_handler(handler);

// access log line elements begin per Apache Combined Log Format with additions following
static bool beast_framework = s->cct->_conf->rgw_frontends.find("beast") != string::npos;
if ( beast_framework ) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think it would be preferable to move this code out of the generic process_request() code path, and into the beast frontend itself. the rgw_frontends configuration is very flexible, and would allow you to run multiple frontends (ie beast and civetweb simultaneously). there isn't a reliable way to detect which frontend issued the request at this point

<< ACCOUNTING_IO(s)->get_bytes_sent() + ACCOUNTING_IO(s)->get_bytes_received() << " "
<< (referer_hdr ? "\"" : "") << rgw_env.get("HTTP_REFERER", "-") << (referer_hdr ? "\"" : "") << " "
<< (user_agent_hdr ? "\"" : "") << rgw_env.get("HTTP_USER_AGENT", "-") << (user_agent_hdr ? "\"" : "")
<< " " << rgw_env.get("HTTP_RANGE", "-");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

within the beast frontend, handle_connection() should have access to most of this information between the parser, real_client and real_client_io variables. the only stuff we don't have directly would be the http_ret (which we could cache from the call to rgw::asio::ClientIO::send_status() and the user_id (which civetweb doesn't show either)

@mkogan1 mkogan1 force-pushed the wip-rgw-add-beast-access-logs branch 2 times, most recently from 1b87a62 to e544444 Compare February 18, 2020 11:08
@mkogan1 mkogan1 requested a review from cbodley February 18, 2020 11:56
@mkogan1
Copy link
Contributor Author

mkogan1 commented Feb 18, 2020

@cbodley Thank you, moved to handle_connection()

@stale
Copy link

stale bot commented Apr 18, 2020

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
If you are a maintainer or core committer, please follow-up on this pull request to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

@stale stale bot added the stale label Apr 18, 2020
@mkogan1
Copy link
Contributor Author

mkogan1 commented Apr 19, 2020

unstale please

@stale stale bot removed the stale label Apr 19, 2020
<< (referer_hdr ? "\"" : "") << rgw_env.get("HTTP_REFERER", "-") << (referer_hdr ? "\"" : "") << " "
<< (user_agent_hdr ? "\"" : "") << rgw_env.get("HTTP_USER_AGENT", "-") << (user_agent_hdr ? "\"" : "")
<< " " << rgw_env.get("HTTP_RANGE", "-");
ldout(cct, 1) << "beast: " << hex << &req << dec << ": " << rgw::crypt_sanitize::log_content(buf.str()) << dendl;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm not convinced that we need the crypt_sanitize part, because none of this output should contain key material from server side encryption requests. we have test coverage in teuthology that scans our rgw logs for these keys to detect leaks, so we can remove this part and verify whether or not it's needed

buf << rgw_env.get("REMOTE_ADDR", "") << " - - [" << s.time << "] \"" << s.info.method
<< " " << s.info.request_uri << (!s.info.request_params.empty() ? "?" : "") << s.info.request_params
<< " HTTP/" << rgw_env.get("HTTP_VERSION", "-") << "\" " << s.err.http_ret << " "
<< ACCOUNTING_IO(&s)->get_bytes_sent() + ACCOUNTING_IO(&s)->get_bytes_received() << " "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't need to call ACCOUNTING_IO() to get our rgw::io::Accounter*. the RGWRestfulIO client inherits rgw::io::AccountingFilter which inherits rgw::io::Accounter, so you can call client.get_bytes_sent/received() directly

bool referer_hdr = rgw_env.get("HTTP_REFERER") != nullptr;
buf << rgw_env.get("REMOTE_ADDR", "") << " - - [" << s.time << "] \"" << s.info.method
<< " " << s.info.request_uri << (!s.info.request_params.empty() ? "?" : "") << s.info.request_params
<< " HTTP/" << rgw_env.get("HTTP_VERSION", "-") << "\" " << s.err.http_ret << " "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s is a different instance of req_state than the one in process_request(), so some of these values may be empty or inconsistent. for example, s.err.http_ret is always 200 here

s.cio = &client;
std::stringstream buf;
bool user_agent_hdr = rgw_env.get("HTTP_USER_AGENT") != nullptr;
bool referer_hdr = rgw_env.get("HTTP_REFERER") != nullptr;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the beast::http::message that we read in with the parser has access to the method, http version, headers, etc. you can access this message with parser.get(), instead of going through the RGWEnv

@cbodley
Copy link
Contributor

cbodley commented Apr 23, 2020

i pushed an extra commit cbodley@44ec389 to branch https://github.com/cbodley/ceph/commits/wip-pr-33083 that uses the asio/beast types directly for this information

@mkogan1 mkogan1 force-pushed the wip-rgw-add-beast-access-logs branch 4 times, most recently from be92fa8 to 4774c33 Compare June 3, 2020 16:18
@mkogan1
Copy link
Contributor Author

mkogan1 commented Jun 4, 2020

jenkins test make check

@mkogan1
Copy link
Contributor Author

mkogan1 commented Jun 4, 2020

investigating build issue on RHEL7 platform

[ 90%] Built target rgw
[ 90%] Building CXX object src/osd/CMakeFiles/osd.dir/Session.cc.o
[ 90%] Built target ceph-mon
[ 90%] Building CXX object src/osd/CMakeFiles/osd.dir/SnapMapper.cc.o
/home/jenkins-build/build/workspace/ceph-pull-requests/src/rgw/rgw_asio_frontend.cc:103:13: warning: In the GNU C Library, "major" is defined
 by <sys/sysmacros.h>. For historical compatibility, it is
 currently defined by <sys/types.h> as well, but we plan to
 remove this soon. To use "major", include <sys/sysmacros.h>
 directly. If you did not intend to use a system-defined macro
 "major", you should undefine it after including <sys/types.h>.
  103 |     : major(version / 10), minor(version % 10) {}
      |             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/jenkins-build/build/workspace/ceph-pull-requests/src/rgw/rgw_asio_frontend.cc:103:13: warning: In the GNU C Library, "minor" is defined
 by <sys/sysmacros.h>. For historical compatibility, it is
 currently defined by <sys/types.h> as well, but we plan to
 remove this soon. To use "minor", include <sys/sysmacros.h>
 directly. If you did not intend to use a system-defined macro
 "minor", you should undefine it after including <sys/types.h>.
  103 |     : major(version / 10), minor(version % 10) {}
      |             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /usr/include/x86_64-linux-gnu/sys/types.h:205,
                 from /usr/include/stdlib.h:394,
                 from /usr/include/c++/9/cstdlib:75,
                 from /usr/include/c++/9/ext/string_conversions.h:41,
                 from /usr/include/c++/9/bits/basic_string.h:6493,
                 from /usr/include/c++/9/string:55,
                 from /usr/include/c++/9/stdexcept:39,
                 from /usr/include/c++/9/array:39,
                 from /usr/include/c++/9/tuple:39,
                 from /usr/include/c++/9/bits/unique_ptr.h:37,
                 from /usr/include/c++/9/memory:80,
                 from /usr/include/c++/9/thread:39,
                 from /home/jenkins-build/build/workspace/ceph-pull-requests/src/rgw/rgw_asio_frontend.cc:5:
/home/jenkins-build/build/workspace/ceph-pull-requests/src/rgw/rgw_asio_frontend.cc: In constructor '{anonymous}::http_version::http_version(unsigned int)':
/home/jenkins-build/build/workspace/ceph-pull-requests/src/rgw/rgw_asio_frontend.cc:103:7: error: class '{anonymous}::http_version' does not have any field named 'gnu_dev_major'
  103 |     : major(version / 10), minor(version % 10) {}
      |       ^~~~~
/home/jenkins-build/build/workspace/ceph-pull-requests/src/rgw/rgw_asio_frontend.cc:103:28: error: class '{anonymous}::http_version' does not have any field named 'gnu_dev_minor'
  103 |     : major(version / 10), minor(version % 10) {}
      |                            ^~~~~
src/rgw/CMakeFiles/radosgw.dir/build.make:182: recipe for target 'src/rgw/CMakeFiles/radosgw.dir/rgw_asio_frontend.cc.o' failed
make[3]: *** [src/rgw/CMakeFiles/radosgw.dir/rgw_asio_frontend.cc.o] Error 1
CMakeFiles/Makefile2:30767: recipe for target 'src/rgw/CMakeFiles/radosgw.dir/all' failed
make[2]: *** [src/rgw/CMakeFiles/radosgw.dir/all] Error 2
make[2]: *** Waiting for unfinished jobs....

@mkogan1 mkogan1 force-pushed the wip-rgw-add-beast-access-logs branch from 4774c33 to c780d73 Compare June 4, 2020 12:38
@mkogan1
Copy link
Contributor Author

mkogan1 commented Jun 8, 2020

Add to the Beast frontend an access log line similar to CivetWeb.
attempting to adhere as much as possible to the Apache Combined Log
Format.

Fixes: https://tracker.ceph.com/issues/45920

rgw: use beast message for access log
(cherry picked from commit 44ec389)

Co-authored-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Mark Kogan <mkogan@redhat.com>
@mkogan1 mkogan1 force-pushed the wip-rgw-add-beast-access-logs branch from c780d73 to 5ea7bb8 Compare June 8, 2020 11:23
@mkogan1 mkogan1 removed the DNM label Jun 8, 2020
@mkogan1
Copy link
Contributor Author

mkogan1 commented Jun 8, 2020

^ just updated commit message

@ivancich ivancich added the wip-eric-testing-1 for ivancich testing label Jun 8, 2020
@cbodley cbodley merged commit f541833 into ceph:master Jun 8, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
4 participants