Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DNM] tracing: use LTTng for logging #22458

Closed
wants to merge 1 commit into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
3 changes: 2 additions & 1 deletion CMakeLists.txt
Expand Up @@ -66,7 +66,8 @@ endif(WITH_MANPAGE)

include_directories(
${PROJECT_BINARY_DIR}/src/include
${PROJECT_SOURCE_DIR}/src)
${PROJECT_SOURCE_DIR}/src
${PROJECT_SOURCE_DIR}/build)

if(OFED_PREFIX)
include_directories(SYSTEM ${OFED_PREFIX}/include)
Expand Down
5 changes: 5 additions & 0 deletions ceph.spec.in
Expand Up @@ -174,6 +174,8 @@ BuildRequires: liboath-devel
BuildRequires: libtool
BuildRequires: libxml2-devel
BuildRequires: libuuid-devel
BuildRequires: lttng-tools-devel
BuildRequires: lttng-ust-devel
BuildRequires: make
BuildRequires: ncurses-devel
BuildRequires: parted
Expand Down Expand Up @@ -1175,6 +1177,7 @@ ${CMAKE} .. \
-DWITH_MANPAGE=ON \
-DWITH_PYTHON3=%{python3_version} \
-DWITH_MGR_DASHBOARD_FRONTEND=OFF \
-DWITH_LTTNG_LOGGING=ON \
%if %{with python2}
-DWITH_PYTHON2=ON \
%else
Expand Down Expand Up @@ -1330,6 +1333,8 @@ rm -rf %{buildroot}
%{_libdir}/libos_tp.so*
%{_libdir}/libosd_tp.so*
%endif
%{_libdir}/liblogging_tp.so*
%{_libdir}/libceph_logging_tp.so*
%config(noreplace) %{_sysconfdir}/logrotate.d/ceph
%if 0%{?fedora} || 0%{?rhel}
%config(noreplace) %{_sysconfdir}/sysconfig/ceph
Expand Down
3 changes: 3 additions & 0 deletions src/CMakeLists.txt
Expand Up @@ -290,6 +290,7 @@ add_subdirectory(json_spirit)
include_directories(SYSTEM "${CMAKE_SOURCE_DIR}/src/xxHash")
include_directories(SYSTEM "${CMAKE_SOURCE_DIR}/src/rapidjson/include")

add_subdirectory(fmt)
if(WITH_SEASTAR)
find_package(fmt 5.2.1 QUIET)
if(NOT fmt_FOUND)
Expand Down Expand Up @@ -405,6 +406,8 @@ endif()
add_library(common STATIC ${ceph_common_objs})
target_link_libraries(common ${ceph_common_deps})

#target_link_libraries(common dispatchqueue_tp)

add_library(ceph-common SHARED ${ceph_common_objs})
target_link_libraries(ceph-common ${ceph_common_deps})
# appease dpkg-shlibdeps
Expand Down
3 changes: 3 additions & 0 deletions src/ceph_osd.cc
Expand Up @@ -55,6 +55,8 @@

namespace {

TracepointProvider::Traits dispatchqueue_tracepoint_traits("libdispatchqueue_tp.so", "dispatchqueue_tracing");

TracepointProvider::Traits osd_tracepoint_traits("libosd_tp.so",
"osd_tracing");
TracepointProvider::Traits os_tracepoint_traits("libos_tp.so",
Expand Down Expand Up @@ -653,6 +655,7 @@ int main(int argc, const char **argv)
init_async_signal_handler();
register_async_signal_handler(SIGHUP, sighup_handler);

TracepointProvider::initialize<dispatchqueue_tracepoint_traits>(g_ceph_context);
TracepointProvider::initialize<osd_tracepoint_traits>(g_ceph_context);
TracepointProvider::initialize<os_tracepoint_traits>(g_ceph_context);
#ifdef WITH_OSD_INSTRUMENT_FUNCTIONS
Expand Down
6 changes: 5 additions & 1 deletion src/common/hobject.cc
Expand Up @@ -355,7 +355,11 @@ int cmp(const hobject_t& l, const hobject_t& r)
return 0;
}


ghobject_t::operator std::string() const {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, we could use CRTP for this idiom, like

template <typename T>
struct Printable {
  operator std::string() const {
    std::stringstream out;
    out << *static_cast<const T*>(this);
    return out.str();
  }
};

struct Bar : Printable<Bar>
{};

ostream& operator<<(ostream& os, const Bar&) {
  return os << "bar!";
}

so we don't need to repeat this method for every class that needs to be printed in the new logging system.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that would make sense. OTOH this comment suggests to use a str() method instead of relying on the implicit operator std::string() to avoid surprises. What are your thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i am also for the str() method, or at least make the operator std::string explicit, like

explicit operator std::string() const { 
 /// ....
}

. @mogeb have you considered reusing a single trace point everywhere and send the fmt string to it, i.e. use it as an alternative of syslog(3) without the ... parameters. if we are fine with this, we can avoid all these troubles. but the downside of this apporach.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tchaikov sure, that's the very first solution I explored (pre-Cephalocon). I initially changed the implementation of dout() to just record a tracepoint. With that approach, none of the actual has to be changed. Another alternative could be to use LTTng's tracef() (similar to printf), but it also needs/uses strings.

My objective here though is to avoid passing strings around for performance reasons.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. I think if the legacy dout lines degenerate into a slow string op, we can focus our efforts on replacing dout calls in the fast path and leave the less frequently-run and slow-path code alone until later (e.g., peering).

Copy link
Contributor

@tchaikov tchaikov Jul 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks guys. after re-reading tracing/tracetool/{README.md,parsetool.py}, now i understand the objective and rationale behind this design.

Copy link
Contributor

@tchaikov tchaikov Aug 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mogeb i agree with @liewegas on this. and i was thinking about how to ease the pain to add a logging message in ceph in future. i even explored the possibility to use libclang to parse the source for creating the .tp file. with it, we can coin the event name using the filename, and lineno, and deduce the name of argument type from the argument in the trace_*(...) expression, so we can extract information interesting to us and create the .tp , .h and _impl.h files for us. but this dependency is kind of heavy weight for us at this moment. probably we can revisit it in future.

but, we can always have a compromise here by differentiating following two use cases:

  1. the logging in critical path: to use trace_${event_name}(loglevel, subsys, [arg_type, arg_name]...), so we don't need to render the values of primitive types in string before passing them to LTTng.
  2. the logging in non-critical path: to use trace(loglevel, subsys, fmt_str, [arg]...). we can use fmt::format() offered by libfmt in the pasrsetool.py. as libfmt is already included by ceph as a submodule. and make trace() a template function. which calls tracepoint(subsys, ${default_event_name}_at_${log_evel}) , assuming all combinations of ${default_event_name}_at_${log_level} TRACEPOINT_EVENTs of different ${subsys}s and ${log_level}s combinations have been defined by parsetool.py. they only have a single TP_FIELD.

in this way, we can have

  • a unified logging backend, so developers or users don't need to check different sources for tracing an issue, and
  • a flexible front-end, so developers don't need to worry about the name of a trace event, and the argument type of event payload for all logging messages. they only need to do so when it's really needed.

what do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the clang approach would be great. Sounds good, let's rethink it later as you said.

For having a critical and non-critical paths: just to make sure I understand, the non-critical path uses a formatted string using the arguments' operator <<() functions, correct? If that's the case, we could change the implementation of dout to record a tracepoint instead of using trace(loglevel, subsys, fmt_str, [arg]...). This becomes even easier for the developers as nothing changes: dout() just takes the string passed to it and ends up calling tracepoint() at the end. I'd need to figure out the details, but I was able to achieve that at the very beginning. Any thought?

Copy link
Contributor

@tchaikov tchaikov Aug 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the non-critical path uses a formatted string using the arguments' operator <<() functions, correct?

yes. at this moment, we can use the argument's operator<<(), so minimum change is needed. in short and long term, i'd suggest use libfmt's fmt::format() instead, it has better performance than that of std::iostream. probably we can expose two more interfaces

  1. just for the backward compatibility: `dout(1) << "ceph rock" << 5 << '!' << dendl;
  2. as the faster one: trace(bluestore, 1, "{0} rock{1}{2}", "ceph", 5, '!');
  3. as the fatest one: trace_ceph_rocks(1, bluestore, string, who, "ceph", int, s, 5, uint_8, wow, '!', "%s rock %d%d");

so, i'd suggest start using the trace() or trace_${event_name} interface in the new code. and replace the implementaion of dout(1) for the existing code.

std::stringstream out;
out << *this;
return out.str();
}

// This is compatible with decode for hobject_t prior to
// version 5.
Expand Down
1 change: 1 addition & 0 deletions src/common/hobject.h
Expand Up @@ -464,6 +464,7 @@ struct ghobject_t {
(*this) = temp;
}

operator std::string() const;
void encode(ceph::buffer::list& bl) const;
void decode(ceph::buffer::list::const_iterator& bl);
void decode(json_spirit::Value& v);
Expand Down
4 changes: 4 additions & 0 deletions src/common/options.cc
Expand Up @@ -603,6 +603,10 @@ std::vector<Option> get_global_options() {
"log_graylog_host",
"log_graylog_port"}),

Option("dispatchqueue_tracing", Option::TYPE_BOOL, Option::LEVEL_ADVANCED)
.set_default(false)
.set_description(""),

Option("err_to_graylog", Option::TYPE_BOOL, Option::LEVEL_BASIC)
.set_default(false)
.set_description("send critical error log lines to remote graylog server")
Expand Down
6 changes: 6 additions & 0 deletions src/common/types.cc
Expand Up @@ -29,4 +29,10 @@ ostream &operator<<(ostream &lhs, const shard_id_t &rhs)
return lhs << (unsigned)(uint8_t)rhs.id;
}

byte_u_t::operator std::string() {
stringstream out;
out << *this;
return out.str();
}

#endif
3 changes: 3 additions & 0 deletions src/include/config-h.in.cmake
Expand Up @@ -178,6 +178,9 @@
/* Define if you want to use LTTng */
#cmakedefine WITH_LTTNG

/* Define if you want to use LTTng for logging instead of dout */
#cmakedefine WITH_LTTNG_LOGGING

/* Define if you want to OSD function instrumentation */
#cmakedefine WITH_OSD_INSTRUMENT_FUNCTIONS

Expand Down
7 changes: 7 additions & 0 deletions src/include/interval_set.h
Expand Up @@ -724,6 +724,13 @@ class interval_set {
other = std::move(m);
}

public:
operator std::string() const {
std::stringstream out;
out << *this;
return out.str();
}

private:
// data
uint64_t _size;
Expand Down
36 changes: 36 additions & 0 deletions src/include/tracing/ceph_logging_impl.h
@@ -0,0 +1,36 @@
#ifndef CEPH_LOGGING_IMPL_H
#define CEPH_LOGGING_IMPL_H

#include <string>
#include <stdarg.h>
#include <fmt/include/fmt/format.h>
#include <fmt/include/fmt/ostream.h>

#include "tracing/ceph_logging.h"

using fmt::format;

#define trace(ll, ss, fmt, ...) __trace(ll, ss, format(fmt, __VA_ARGS__))
#define trace_error(ss, fmt, ...) __trace(-1, ss, format(fmt, __VA_ARGS__))

//static inline void trace(int loglevel, string subsys, const char *fmt, void *arg, ...)
static inline void __trace(int loglevel, string subsys, string str)
{
str = subsys + ": " + str;
if (loglevel <= -1) {
str = "Error: " + str;
tracepoint(ceph_logging, log_error, (char*)str.c_str());
} else if (loglevel <= 1) {
tracepoint(ceph_logging, log_critical, (char*)str.c_str());
} else if (loglevel <= 5) {
tracepoint(ceph_logging, log_warning, (char*)str.c_str());
} else if (loglevel <= 10) {
tracepoint(ceph_logging, log_info, (char*)str.c_str());
} else if (loglevel <= 15) {
tracepoint(ceph_logging, log_debug, (char*)str.c_str());
} else {
tracepoint(ceph_logging, log_verbose, (char*)str.c_str());
}
}

#endif // CEPH_LOGGING_IMPL_H
1 change: 1 addition & 0 deletions src/include/types.h
Expand Up @@ -469,6 +469,7 @@ inline std::ostream& operator<<(std::ostream& out, const si_u_t& b)
struct byte_u_t {
uint64_t v;
explicit byte_u_t(uint64_t _v) : v(_v) {};
operator std::string();
};

inline std::ostream& operator<<(std::ostream& out, const byte_u_t& b)
Expand Down
6 changes: 6 additions & 0 deletions src/include/utime.h
Expand Up @@ -225,6 +225,12 @@ class utime_t {
return ts;
}

operator std::string() {
stringstream out;
localtime(out);
return out.str();
}

void sleep() const {
struct timespec ts;
to_timespec(&ts);
Expand Down
3 changes: 3 additions & 0 deletions src/include/uuid.h
Expand Up @@ -68,6 +68,9 @@ struct uuid_d {

void dump(ceph::Formatter *f) const;
static void generate_test_instances(std::list<uuid_d*>& o);
operator std::string() {
return to_string();
}
};
WRITE_CLASS_ENCODER(uuid_d)

Expand Down
19 changes: 19 additions & 0 deletions src/log/Log.cc
Expand Up @@ -22,6 +22,25 @@

#include <iostream>

#ifdef WITH_LTTNG_LOGGING
#define TRACEPOINT_DEFINE
#define TRACEPOINT_PROBE_DYNAMIC_LINKAGE
#include "tracing/bluestore.h"
#include "tracing/bluestore_blob.h"
#include "tracing/bluestore_gc.h"
#include "tracing/bluestore_lru_cache.h"
#include "tracing/bluestore_twoqcache.h"
#include "tracing/bitmapallocator.h"
#include "tracing/bmap_freelist_manager.h"
#include "tracing/ceph_logging.h"
#include "tracing/asyncconnection.h"
#include "tracing/dispatchqueue.h"
#undef TRACEPOINT_PROBE_DYNAMIC_LINKAGE
#undef TRACEPOINT_DEFINE
#else
#define tracepoint(...)
#endif

#define MAX_LOG_BUF 65536

namespace ceph {
Expand Down
4 changes: 2 additions & 2 deletions src/mgr/DaemonState.cc
Expand Up @@ -44,12 +44,12 @@ void DeviceState::set_life_expectancy(utime_t from, utime_t to, utime_t now)
life_expectancy = make_pair(from, to);
life_expectancy_stamp = now;
if (from != utime_t()) {
metadata["life_expectancy_min"] = from;
metadata["life_expectancy_min"] = stringify(from);
} else {
metadata["life_expectancy_min"] = "";
}
if (to != utime_t()) {
metadata["life_expectancy_max"] = to;
metadata["life_expectancy_max"] = stringify(to);
} else {
metadata["life_expectancy_max"] = "";
}
Expand Down
50 changes: 38 additions & 12 deletions src/msg/DispatchQueue.cc
Expand Up @@ -16,10 +16,20 @@
#include "DispatchQueue.h"
#include "Messenger.h"
#include "common/ceph_context.h"
#include "common/TracepointProvider.h"

#define dout_subsys ceph_subsys_ms
#include "common/debug.h"

#ifdef WITH_LTTNG_LOGGING
#include "include/tracing/dispatchqueue_impl.h"
#else
#define trace(...)
#define trace_error(...)
#endif


//static TracepointProvider::Traits tracepoint_traits("libdispatchqueue_tp.so", "dispatchqueue_tracing");

/*******************
* DispatchQueue
Expand All @@ -38,18 +48,34 @@ double DispatchQueue::get_max_age(utime_t now) const {

uint64_t DispatchQueue::pre_dispatch(const ref_t<Message>& m)
{
ldout(cct,1) << "<== " << m->get_source_inst()
<< " " << m->get_seq()
<< " ==== " << *m
<< " ==== " << m->get_payload().length()
<< "+" << m->get_middle().length()
<< "+" << m->get_data().length()
<< " (" << ceph_con_mode_name(m->get_connection()->get_con_mode())
<< " " << m->get_footer().front_crc << " "
<< m->get_footer().middle_crc
<< " " << m->get_footer().data_crc << ")"
<< " " << m << " con " << m->get_connection()
<< dendl;
// ldout(cct,1) << "<== " << m->get_source_inst()
// << " " << m->get_seq()
// << " ==== " << *m
// << " ==== " << m->get_payload().length()
// << "+" << m->get_middle().length()
// << "+" << m->get_data().length()
// << " (" << ceph_con_mode_name(m->get_connection()->get_con_mode())
// << " " << m->get_footer().front_crc << " "
// << m->get_footer().middle_crc
// << " " << m->get_footer().data_crc << ")"
// << " " << m << " con " << m->get_connection()
// << dendl;
stringstream str_m;
stringstream str_source_inst;
str_m << *m;
str_source_inst << m->get_source_inst();
trace_pre_dispatch_info(1, dispatchqueue,
string, str_source_inst, str_source_inst.str(),
uint64_t, seq, m->get_seq(),
string, str_m, str_m.str(),
uint32_t, payload_len, m->get_payload().length(),
uint32_t, middle_len, m->get_middle().length(),
uint32_t, data_len, m->get_data().length(),
string, con_mode, ceph_con_mode_name(m->get_connection()->get_con_mode()),
uint32_t, front_crc, m->get_footer().front_crc,
uint32_t, middle_crc, m->get_footer().middle_crc,
uint32_t, data_crc, m->get_footer().data_crc,
"<== {} {} === {} +++ {} + {} + {} ({} {} {} {})");
uint64_t msize = m->get_dispatch_throttle_size();
m->set_dispatch_throttle_size(0); // clear it out, in case we requeue this message.
return msize;
Expand Down
6 changes: 6 additions & 0 deletions src/msg/Message.cc
Expand Up @@ -285,6 +285,12 @@ void Message::dump(Formatter *f) const
f->dump_string("summary", ss.str());
}

Message::operator std::string() const {
std::stringstream out;
out << *this;
return out.str();
}

Message *decode_message(CephContext *cct, int crcflags,
ceph_msg_header& header,
ceph_msg_footer& footer,
Expand Down
1 change: 1 addition & 0 deletions src/msg/Message.h
Expand Up @@ -501,6 +501,7 @@ class Message : public RefCountedObject {
virtual void dump(ceph::Formatter *f) const;

void encode(uint64_t features, int crcflags, bool skip_header_crc = false);
operator std::string() const;
};

extern Message *decode_message(CephContext *cct, int crcflags,
Expand Down
27 changes: 21 additions & 6 deletions src/msg/async/AsyncConnection.cc
Expand Up @@ -47,6 +47,13 @@ ostream& AsyncConnection::_conn_prefix(std::ostream *_dout) {
<< ").";
}

#ifdef WITH_LTTNG_LOGGING
#include "include/tracing/asyncconnection_impl.h"
#else
#define trace(...)
#define trace_error(...)
#endif

// Notes:
// 1. Don't dispatch any event when closed! It may cause AsyncConnection alive even if AsyncMessenger dead

Expand Down Expand Up @@ -517,12 +524,20 @@ void AsyncConnection::accept(ConnectedSocket socket,
int AsyncConnection::send_message(Message *m)
{
FUNCTRACE(async_msgr->cct);
lgeneric_subdout(async_msgr->cct, ms,
1) << "-- " << async_msgr->get_myaddrs() << " --> "
<< get_peer_addrs() << " -- "
<< *m << " -- " << m << " con "
<< this
<< dendl;
// lgeneric_subdout(async_msgr->cct, ms,
// 1) << "-- " << async_msgr->get_myaddrs() << " --> "
// << get_peer_addrs() << " -- "
// << *m << " -- " << m << " con "
// << this
// << dendl;
trace_send_message(1, asyncconnection,
void*, cct, async_msgr->cct,
string, myaddrs, async_msgr->get_myaddrs(),
string, peer_addrs, get_peer_addrs(),
string, message, *m,
void*, m, m,
void*, this_, this,
"{} {} -- {} --> {} -- {} -- {} con {}");

// optimistic think it's ok to encode(actually may broken now)
if (!m->get_priority())
Expand Down