-
Notifications
You must be signed in to change notification settings - Fork 6.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow Replayer to report the results of TraceRecords. #8657
Conversation
@autopear has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
@autopear has updated the pull request. You must reimport the pull request before landing. |
@autopear has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
@autopear has updated the pull request. You must reimport the pull request before landing. |
|
||
private: | ||
std::vector<Status> multi_status_; | ||
std::vector<std::string> values_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess you want this to be able to present bytes/sec. I think we are OK starting with QPS, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might be used for bytes/sec, but it's not the major goal. Currently you can get the TraceRecord execution status from Replayer::Next
and Replayer::Execute
, but you have no way to obtain the actual DB execution status from these 2 APIs. Also you cannot get any execution status from Replayer::Replay
.
The TraceRecordResult
API set provides an interface to query the execution result from TraceRecord::Handler
. If it is the execution handler, then you can get the DB calls returned status and/or values.
If we use Handlers for TraceAnalyzer as well, TraceRecordResult
may be used for parsing or outputting to stdout or file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understood, thanks. Exposing the query result via TraceRecordResult
is fine with me to be symmetric with how we expose the query parameters via TraceRecord
API. Although Replayer::Next()
, Replayer::Execute()
, and Replayer::Replay()
all return Status
, they do things like hide NotFound
from Get()
, or not expose the breakdown of MultiGet()
return Status
es.
edit: This comment is just to document my agreement on the PR purpose; nothing needs to be addressed here.
virtual uint64_t GetLatency() const; | ||
|
||
private: | ||
uint64_t latency_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I forget, did we discuss what is the benefit of moving latency measurement to RocksDB? In any case it'd be good to write that in the description.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I feel it's better to report the start/end timestamp of the execution. Then the user can use these values to order the execution result for analysis, or compute latency, etc for other types of analysis.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NowNanos()
uses CLOCK_MONOTONIC
, which is not real time:
https://linux.die.net/man/3/clock_gettime
And it may not be accurate between threads (cores): https://stackoverflow.com/questions/46893072/clock-gettimeclock-monotonic-monotonicity-across-cores-threads
Typically we cannot get current absolute time at nanos level, only to the Micros level.
So is it okay to use Micros? or switch back to latency?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NowNanos()
usesCLOCK_MONOTONIC
, which is not real time:
https://linux.die.net/man/3/clock_gettime
And it may not be accurate between threads (cores): https://stackoverflow.com/questions/46893072/clock-gettimeclock-monotonic-monotonicity-across-cores-threadsTypically we cannot get current absolute time at nanos level, only to the Micros level.
So is it okay to use Micros? or switch back to latency?
Changed to NowMicros()
. Is this OK?
…not sleep if replay is falling behind.
@autopear has updated the pull request. You must reimport the pull request before landing. |
@autopear has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
virtual uint64_t GetLatency() const; | ||
|
||
private: | ||
uint64_t latency_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NowNanos()
uses CLOCK_MONOTONIC
, which is not real time:
https://linux.die.net/man/3/clock_gettime
And it may not be accurate between threads (cores): https://stackoverflow.com/questions/46893072/clock-gettimeclock-monotonic-monotonicity-across-cores-threads
Typically we cannot get current absolute time at nanos level, only to the Micros level.
So is it okay to use Micros? or switch back to latency?
@autopear has updated the pull request. You must reimport the pull request before landing. |
@autopear has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
@autopear has updated the pull request. You must reimport the pull request before landing. |
@autopear has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great overall, mostly comments on the API
|
||
private: | ||
std::vector<Status> multi_status_; | ||
std::vector<std::string> values_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understood, thanks. Exposing the query result via TraceRecordResult
is fine with me to be symmetric with how we expose the query parameters via TraceRecord
API. Although Replayer::Next()
, Replayer::Execute()
, and Replayer::Replay()
all return Status
, they do things like hide NotFound
from Get()
, or not expose the breakdown of MultiGet()
return Status
es.
edit: This comment is just to document my agreement on the PR purpose; nothing needs to be addressed here.
@autopear has updated the pull request. You must reimport the pull request before landing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work!
@autopear has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
@autopear has updated the pull request. You must reimport the pull request before landing. |
@autopear has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
Summary: `Replayer::Execute()` can directly returns the result (e.g, request latency, DB::Get() return code, returned value, etc.) `Replayer::Replay()` reports the results via a callback function. New interface: `TraceRecordResult` in "rocksdb/trace_record_result.h". `DBTest2.TraceAndReplay` and `DBTest2.TraceAndManualReplay` are updated accordingly. Pull Request resolved: facebook/rocksdb#8657 Reviewed By: ajkr Differential Revision: D30290216 Pulled By: autopear fbshipit-source-id: 3c8d4e6b180ec743de1a9d9dcaee86064c74f0d6
Replayer::Execute()
can directly returns the result (e.g, request latency, DB::Get() return code, returned value, etc.)Replayer::Replay()
reports the results via a callback function.New interface:
TraceRecordResult
in "rocksdb/trace_record_result.h".DBTest2.TraceAndReplay
andDBTest2.TraceAndManualReplay
are updated accordingly.