Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow Replayer to report the results of TraceRecords. #8657

Closed
wants to merge 13 commits into from

Conversation

autopear
Copy link
Contributor

Replayer::Execute() can directly returns the result (e.g, request latency, DB::Get() return code, returned value, etc.)
Replayer::Replay() reports the results via a callback function.

New interface:
TraceRecordResult in "rocksdb/trace_record_result.h".

DBTest2.TraceAndReplay and DBTest2.TraceAndManualReplay are updated accordingly.

@facebook-github-bot
Copy link
Contributor

@autopear has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@autopear has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@autopear has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@autopear has updated the pull request. You must reimport the pull request before landing.


private:
std::vector<Status> multi_status_;
std::vector<std::string> values_;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess you want this to be able to present bytes/sec. I think we are OK starting with QPS, though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be used for bytes/sec, but it's not the major goal. Currently you can get the TraceRecord execution status from Replayer::Next and Replayer::Execute, but you have no way to obtain the actual DB execution status from these 2 APIs. Also you cannot get any execution status from Replayer::Replay.

The TraceRecordResult API set provides an interface to query the execution result from TraceRecord::Handler. If it is the execution handler, then you can get the DB calls returned status and/or values.

If we use Handlers for TraceAnalyzer as well, TraceRecordResult may be used for parsing or outputting to stdout or file.

Copy link
Contributor

@ajkr ajkr Aug 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood, thanks. Exposing the query result via TraceRecordResult is fine with me to be symmetric with how we expose the query parameters via TraceRecord API. Although Replayer::Next(), Replayer::Execute(), and Replayer::Replay() all return Status, they do things like hide NotFound from Get(), or not expose the breakdown of MultiGet() return Statuses.

edit: This comment is just to document my agreement on the PR purpose; nothing needs to be addressed here.

virtual uint64_t GetLatency() const;

private:
uint64_t latency_;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I forget, did we discuss what is the benefit of moving latency measurement to RocksDB? In any case it'd be good to write that in the description.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I feel it's better to report the start/end timestamp of the execution. Then the user can use these values to order the execution result for analysis, or compute latency, etc for other types of analysis.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NowNanos() uses CLOCK_MONOTONIC, which is not real time:
https://linux.die.net/man/3/clock_gettime
And it may not be accurate between threads (cores): https://stackoverflow.com/questions/46893072/clock-gettimeclock-monotonic-monotonicity-across-cores-threads

Typically we cannot get current absolute time at nanos level, only to the Micros level.
So is it okay to use Micros? or switch back to latency?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NowNanos() uses CLOCK_MONOTONIC, which is not real time:
https://linux.die.net/man/3/clock_gettime
And it may not be accurate between threads (cores): https://stackoverflow.com/questions/46893072/clock-gettimeclock-monotonic-monotonicity-across-cores-threads

Typically we cannot get current absolute time at nanos level, only to the Micros level.
So is it okay to use Micros? or switch back to latency?

Changed to NowMicros(). Is this OK?

db/db_test2.cc Outdated Show resolved Hide resolved
include/rocksdb/trace_record.h Outdated Show resolved Hide resolved
include/rocksdb/trace_record_result.h Outdated Show resolved Hide resolved
include/rocksdb/trace_record_result.h Outdated Show resolved Hide resolved
include/rocksdb/utilities/replayer.h Outdated Show resolved Hide resolved
trace_replay/trace_record_result.cc Outdated Show resolved Hide resolved
trace_replay/trace_record_result.cc Outdated Show resolved Hide resolved
@facebook-github-bot
Copy link
Contributor

@autopear has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@autopear has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

include/rocksdb/trace_record.h Outdated Show resolved Hide resolved
include/rocksdb/trace_record.h Outdated Show resolved Hide resolved
trace_replay/trace_record_result.cc Outdated Show resolved Hide resolved
virtual uint64_t GetLatency() const;

private:
uint64_t latency_;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NowNanos() uses CLOCK_MONOTONIC, which is not real time:
https://linux.die.net/man/3/clock_gettime
And it may not be accurate between threads (cores): https://stackoverflow.com/questions/46893072/clock-gettimeclock-monotonic-monotonicity-across-cores-threads

Typically we cannot get current absolute time at nanos level, only to the Micros level.
So is it okay to use Micros? or switch back to latency?

@facebook-github-bot
Copy link
Contributor

@autopear has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@autopear has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@autopear has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@autopear has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Copy link
Contributor

@ajkr ajkr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great overall, mostly comments on the API

include/rocksdb/trace_record_result.h Outdated Show resolved Hide resolved
include/rocksdb/trace_record.h Outdated Show resolved Hide resolved

private:
std::vector<Status> multi_status_;
std::vector<std::string> values_;
Copy link
Contributor

@ajkr ajkr Aug 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood, thanks. Exposing the query result via TraceRecordResult is fine with me to be symmetric with how we expose the query parameters via TraceRecord API. Although Replayer::Next(), Replayer::Execute(), and Replayer::Replay() all return Status, they do things like hide NotFound from Get(), or not expose the breakdown of MultiGet() return Statuses.

edit: This comment is just to document my agreement on the PR purpose; nothing needs to be addressed here.

utilities/trace/replayer_impl.cc Outdated Show resolved Hide resolved
trace_replay/trace_record_handler.cc Outdated Show resolved Hide resolved
include/rocksdb/utilities/replayer.h Outdated Show resolved Hide resolved
@facebook-github-bot
Copy link
Contributor

@autopear has updated the pull request. You must reimport the pull request before landing.

Copy link
Contributor

@ajkr ajkr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work!

@facebook-github-bot
Copy link
Contributor

@autopear has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@autopear has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@autopear has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@autopear merged this pull request in d10801e.

@autopear autopear deleted the replayer_result branch August 19, 2021 04:06
yoori pushed a commit to yoori/rocksdb that referenced this pull request Nov 26, 2023
Summary:
`Replayer::Execute()` can directly returns the result (e.g, request latency, DB::Get() return code, returned value, etc.)
`Replayer::Replay()` reports the results via a callback function.

New interface:
`TraceRecordResult` in "rocksdb/trace_record_result.h".

`DBTest2.TraceAndReplay` and `DBTest2.TraceAndManualReplay` are updated accordingly.

Pull Request resolved: facebook/rocksdb#8657

Reviewed By: ajkr

Differential Revision: D30290216

Pulled By: autopear

fbshipit-source-id: 3c8d4e6b180ec743de1a9d9dcaee86064c74f0d6
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants