Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added a graph of command time statistics to grafana #1751

Merged
merged 5 commits into from
Jul 18, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
30 changes: 22 additions & 8 deletions src/pika_admin.cc
Original file line number Diff line number Diff line change
Expand Up @@ -43,6 +43,14 @@ static std::string ConstructPinginPubSubResp(const PikaCmdArgsType& argv) {
return resp.str();
}

static double MethodofCommandStatistics(const uint64_t time_consuming, const uint64_t frequency) {
return (static_cast<double>(time_consuming) / 1000.0) / static_cast<double>(frequency);
}

static double MethodofTotalTimeCalculation(const uint64_t time_consuming) {
return static_cast<double>(time_consuming) / 1000.0;
}

enum AuthResult {
OK,
INVALID_PASSWORD,
Expand Down Expand Up @@ -742,6 +750,8 @@ void InfoCmd::Do(std::shared_ptr<Slot> slot) {
info.append("\r\n");
InfoExecCount(info);
info.append("\r\n");
InfoCommandStats(info);
info.append("\r\n");
InfoCPU(info);
info.append("\r\n");
InfoReplication(info);
Expand Down Expand Up @@ -1254,15 +1264,19 @@ void InfoCmd::InfoCommandStats(std::string& info) {
tmp_stream.precision(2);
tmp_stream.setf(std::ios::fixed);
tmp_stream << "# Commandstats" << "\r\n";
for (auto& iter : *g_pika_server->GetCommandStatMap()) {
if (iter.second.cmd_count != 0) {
tmp_stream << "cmdstat_" << iter.first << ":"
<< "calls=" << iter.second.cmd_count << ",usec="
<< iter.second.cmd_time_consuming
<< ",usec_per_call="
<< static_cast<double>(iter.second.cmd_time_consuming) / static_cast<double>(iter.second.cmd_count)
<< "\r\n";
for (auto iter : *g_pika_server->GetCommandStatMap()) {
if (iter.second.cmd_count != 0) {
tmp_stream << iter.first << ":"
<< "calls=" << iter.second.cmd_count << ", usec="
<< MethodofTotalTimeCalculation(iter.second.cmd_time_consuming)
<< ", usec_per_call=";
if (!iter.second.cmd_time_consuming) {
tmp_stream << 0 << "\r\n";
} else {
tmp_stream << MethodofCommandStatistics(iter.second.cmd_time_consuming, iter.second.cmd_count)
<< "\r\n";
}
}
}
info.append(tmp_stream.str());
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, the code patch you provided seems to address the addition of a new function called InfoCommandStats and the modification of the existing function void InfoCmd::Do(std::shared_ptr<Slot> slot).

Here are a few points to consider for your code review:

  1. Naming: The function name InfoCommandStats suggests that it provides information related to command statistics. Make sure the function accurately represents its purpose.

  2. String Concatenation: Instead of repeated info.append("\r\n"), you can use info += "\r\n" for brevity and readability.

  3. Code Formatting: Maintain consistent code formatting for better readability. In the code you provided, indentation is inconsistent. Ensure proper indentation throughout the code.

  4. Iteration: In the for loop, consider using a reference to avoid unnecessary object copying. Change auto iter to auto& iter in the for loop declaration.

  5. Avoid Unnecessary Cast: In the line static_cast<double>(iter.second.cmd_time_consuming) == 0, casting the value to double to compare with zero is not necessary. You can directly compare it with zero without casting.

  6. Division by Zero: Be cautious about potential division by zero. Make sure to handle the case where iter.second.cmd_count is zero to avoid division by zero. In the updated code, a check has been implemented to handle this scenario.

  7. Code Documentation: Consider adding comments or documentation to clarify the purpose and functionality of the added code, especially if it's complex or not immediately obvious.

Remember to thoroughly test your code changes before deploying them to ensure they work as intended and do not introduce any unexpected bugs.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, the code patch you provided seems fine and does not contain any obvious bugs. However, there are a couple of improvement suggestions:

  1. Formatting: The indentation in the patch appears inconsistent, which can make the code harder to read and maintain. Make sure to use consistent indentation throughout the code.

  2. Clarify variable iteration with value reference: In the line for (auto iter : *g_pika_server->GetCommandStatMap()), it would be more efficient and clearer to iterate by reference using auto& instead of by value. This avoids unnecessary copying of map elements.

  3. Simplification of division by 1000: To convert microseconds to milliseconds, you can simplify the code by dividing iter.second.cmd_time_consuming by 1000 directly in the loop, rather than repeating the division in each usage. This improves both readability and performance.

Here's an updated version of the code that incorporates these suggestions:

void InfoCmd::InfoCommandStats(std::string& info) {
    std::ostringstream tmp_stream;
    tmp_stream.precision(2);
    tmp_stream.setf(std::ios::fixed);

    tmp_stream << "# Commandstats" << "\r\n";
    for (const auto& iter : *g_pika_server->GetCommandStatMap()) {
        if (iter.second.cmd_count != 0) {
            tmp_stream << iter.first << ":"
                       << "calls=" << iter.second.cmd_count
                       << ", usec=" << iter.second.cmd_time_consuming / 1000.0
                       << ", usec_per_call="
                       << (iter.second.cmd_time_consuming / 1000.0) / iter.second.cmd_count
                       << "\r\n";
        }
    }

    info.append(tmp_stream.str());
}

Remember to apply consistent formatting and ensure appropriate error handling and boundary checks for your specific use case.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code review of the given patch is as follows:

  1. MethodofCommandStatistics():

    • It calculates the average time per command based on the time-consuming and frequency.
    • This method seems fine, but it would be better to provide a more descriptive name for this function.
  2. MethodofTotalTimeCalculation():

    • It calculates the total time in seconds based on the time-consuming.
    • This method seems fine, but it would be better to provide a more descriptive name for this function.
  3. InfoCmd::InfoCommandStats():

    • The loop iterating over *g_pika_server->GetCommandStatMap() can be improved by using a reference parameter instead of creating a copy.
    • A check is added before printing the "usec_per_call" value to handle potential division by zero errors.
  4. Overall, the code looks good, and there are no apparent bug risks. However, consider the following suggestions for improvement:

    • Clarify the naming conventions of the methods to make their purpose clearer.
    • Add comments to explain the logic and purpose behind critical sections of code.
    • Consider using the const auto& syntax when iterating over containers if the objects are not modified.
    • Consider separating concerns into smaller, reusable functions to improve readability and maintainability.

Remember that the provided review is based on a brief analysis, and a deeper understanding may require more context about the codebase and its specific requirements.

Expand Down
4 changes: 2 additions & 2 deletions src/pika_server.cc
Original file line number Diff line number Diff line change
Expand Up @@ -184,9 +184,9 @@ void PikaServer::Start() {
}
}
CommandStatistics statistics;
auto cmdstat_map = *g_pika_server->GetCommandStatMap();
auto cmdstat_map = g_pika_server->GetCommandStatMap();
for (auto& iter : *g_pika_cmd_table_manager->GetCmdTable()) {
cmdstat_map.emplace(iter.first, statistics);
cmdstat_map->emplace(iter.first, statistics);
}
LOG(INFO) << "Pika Server going to start";

Expand Down
8 changes: 8 additions & 0 deletions tools/pika_exporter/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -153,6 +153,12 @@ prometheus --config.file=./grafana/prometheus.yml

### Pika Command Execution Time

| Metrics Name | Metric Type | Labels | Metrics Value | Metric Desc |
|-----------------------------------| ----------- |----------------------|-----------------------------------------------|--------------------------------------------------------------------------|
| calls | `Counter` | {addr="", alias=""} | the value of `calls` | Pika Number of times each command is invoked |
| usec | `Counter` | {addr="", alias=""} | the value of `usec` | Total time taken by each Pika command |
| usec_per_call | `Counter` | {addr="", alias=""} | the value of `usec_per_call` | Average time of each Pika command |

### Rocksdb Metrics

| Serial Number | Metric | Meaning |
Expand Down Expand Up @@ -203,3 +209,5 @@ Screenshots:
![Network](./contrib/network.png)

![RocksDB](./contrib/rocksdb.png)

![Commandstats](./contrib/command_time.png)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the code patch you provided, here are a few observations:

  1. It looks like you're adding new metrics for Pika command execution time and RocksDB metrics to your Prometheus configuration.
  2. For the Pika Command Execution Time metrics, you have defined three counters: calls, usec, and usec_per_call with corresponding labels. Make sure that the metric values being recorded align with these definitions in your codebase.
  3. The RocksDB metrics and their associated explanations look fine based on the provided code snippet.
  4. The screenshots you included at the end of the patch demonstrate the visual representation of the network, RocksDB, and commandstats metrics.

Without additional context or the actual code implementation, it's difficult to identify specific bugs or suggest improvements. However, ensure that the metrics are correctly implemented, labeled, and associated with the appropriate data sources or calculations. Additionally, consider verifying that the Prometheus configuration interacts correctly with your monitoring stack and tools.

Binary file added tools/pika_exporter/contrib/command_time.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
40 changes: 40 additions & 0 deletions tools/pika_exporter/exporter/metrics/commandstats.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,40 @@
package metrics

import "regexp"

func init() {
Register(collectCommandstatsMetrics)
}

var collectCommandstatsMetrics = map[string]MetricConfig{
"commandstats_info": {
Parser: &regexParser{
name: "commandstats_info",
reg: regexp.MustCompile(`(?P<cmd>[\w]+)[:\s]*calls=(?P<calls>[\d]+)[,\s]*usec=(?P<usec>[\d+\.\d+]+)[,\s]*usec_per_call=(?P<usec_per_call>[\d+\.\d+]+)`),
Parser: &normalParser{},
},
MetricMeta: MetaDatas{
{
Name: "calls",
Help: "Pika Number of times each command is invoked",
Type: metricTypeGauge,
Labels: []string{LabelNameAddr, LabelNameAlias, "cmd"},
ValueName: "calls",
},
{
Name: "usec",
Help: "Total time taken by each Pika command",
Type: metricTypeGauge,
Labels: []string{LabelNameAddr, LabelNameAlias, "cmd"},
ValueName: "usec",
},
{
Name: "usec_per_call",
Help: "Average time of each Pika command",
Type: metricTypeGauge,
Labels: []string{LabelNameAddr, LabelNameAlias, "cmd"},
ValueName: "usec_per_call",
},
},
},
}
Loading
Loading