chore: add time shift function and some refactoring #4445
Conversation
Walkthrough
The update introduces significant enhancements to the query service, focusing on improved metric metadata handling, query efficiency, and temporal logic adjustments. New functionalities include fetching metric metadata, converting label maps to arrays, storing metric temporality to optimize queries, and refining SQL queries for better performance. Additionally, it introduces logic for time-shifting in queries and updates to handle new table structures and time-filtering consistency.
Review Status
Actionable comments generated: 12
Configuration used: CodeRabbit UI
Files selected for processing (19)
- pkg/query-service/app/clickhouseReader/reader.go (6 hunks)
- pkg/query-service/app/formula.go (1 hunks)
- pkg/query-service/app/http_handler.go (8 hunks)
- pkg/query-service/app/metrics/v4/cumulative/table_test.go (2 hunks)
- pkg/query-service/app/metrics/v4/cumulative/timeseries.go (1 hunks)
- pkg/query-service/app/metrics/v4/cumulative/timeseries_test.go (4 hunks)
- pkg/query-service/app/metrics/v4/delta/table_test.go (2 hunks)
- pkg/query-service/app/metrics/v4/delta/time_series_test.go (4 hunks)
- pkg/query-service/app/metrics/v4/delta/timeseries.go (2 hunks)
- pkg/query-service/app/metrics/v4/helpers/clauses.go (2 hunks)
- pkg/query-service/app/metrics/v4/helpers/sub_query.go (2 hunks)
- pkg/query-service/app/metrics/v4/query_builder.go (3 hunks)
- pkg/query-service/app/metrics/v4/query_builder_test.go (14 hunks)
- pkg/query-service/app/parser.go (3 hunks)
- pkg/query-service/app/querier/v2/querier.go (1 hunks)
- pkg/query-service/app/queryBuilder/functions.go (1 hunks)
- pkg/query-service/constants/constants.go (1 hunks)
- pkg/query-service/interfaces/interface.go (1 hunks)
- pkg/query-service/model/v3/v3.go (8 hunks)
Files not reviewed due to errors (1)
- pkg/query-service/app/clickhouseReader/reader.go (Error: unable to parse review)
Additional comments: 31
pkg/query-service/app/metrics/v4/helpers/clauses.go (2)
- 40-49: Ensure the exclusion of "le" from grouping is consistent with the intended use cases across the application. This change could impact data aggregation and visualization if "le" is a significant attribute in some contexts.
- 74-94: Similar to the grouping function, verify that excluding "le" from ordering does not inadvertently affect data presentation or analysis where "le" is relevant.
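For readers unfamiliar with the change being reviewed, the idea of excluding `le` from grouping can be sketched as below. The function and column names are illustrative stand-ins, not the actual SigNoz helpers:

```go
package main

import "fmt"

// groupByColumns builds a GROUP BY column list for a histogram query,
// dropping the bucket-boundary label "le" so that per-bucket series are
// merged before the quantile is computed. Illustrative sketch only.
func groupByColumns(tags []string) []string {
	cols := []string{"ts"}
	for _, tag := range tags {
		if tag == "le" {
			continue // "le" is consumed by the quantile expression instead
		}
		cols = append(cols, tag)
	}
	return cols
}

func main() {
	fmt.Println(groupByColumns([]string{"service_name", "le", "host_name"}))
}
```

The reviewer's concern is exactly this: any context that legitimately groups by `le` would silently lose that dimension.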
pkg/query-service/app/metrics/v4/query_builder.go (2)
- 24-26: The time shift logic directly modifies the start and end times. Ensure this adjustment is correctly applied in all relevant contexts and does not introduce discrepancies in time range calculations.
- 75-82: Excluding "le" from group-by and order-by clauses could significantly impact query results. Confirm this behavior aligns with the intended query logic and does not omit necessary data in analyses that rely on "le".
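The start/end adjustment referred to above can be sketched as follows. This is a hypothetical helper capturing the idea, not the actual builder code, which mutates its query parameters in place:

```go
package main

import "fmt"

// shiftWindow moves the query window back by shiftByMs milliseconds so the
// builder fetches the shifted data; result timestamps are shifted forward
// again after the query runs. A sketch of the idea only.
func shiftWindow(startMs, endMs, shiftByMs int64) (int64, int64) {
	return startMs - shiftByMs, endMs - shiftByMs
}

func main() {
	// shift a 5-minute window back by one hour
	s, e := shiftWindow(1706000000000, 1706000300000, 3600*1000)
	fmt.Println(s, e)
}
```

The discrepancy risk the reviewer raises is that any code path computing intervals from the original, unshifted window would now disagree with the shifted one.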
pkg/query-service/app/metrics/v4/delta/table_test.go (2)
- 56-56: Confirm the updated query accurately reflects the intended test scenario, especially the changes to table names, fields, and timestamp calculations.
- 98-98: Similar to the previous comment, ensure the updated query in this test case correctly matches the new schema and accurately tests the intended functionality.
pkg/query-service/app/metrics/v4/cumulative/table_test.go (2)
- 54-54: Ensure the updated query in this test case is correct and aligns with the new schema and intended functionality, particularly the changes to table names and timestamp calculations.
- 96-96: As with the previous comment, verify that the updated query correctly reflects the intended test scenario and schema changes.
pkg/query-service/app/metrics/v4/helpers/sub_query.go (3)
- 13-16: The introduction of constants for time durations is a good practice for clarity and maintainability. Confirm these values are used consistently throughout the application where applicable.
- The `which` function's logic for selecting the appropriate table based on the time range and adjusting the start time is crucial for query accuracy. Ensure this logic is thoroughly tested, especially the boundary conditions between different time ranges.
- 3-56: > 📝 NOTE
This review was outside the diff hunks and was mapped to the diff hunk with the greatest overlap. Original lines [42-114]
The integration of the `which` function in `PrepareTimeseriesFilterQuery` to dynamically select the table name based on the time range is significant. Verify that this dynamic table selection does not introduce any inaccuracies in the generated queries, especially in edge cases near the time range boundaries.
pkg/query-service/app/metrics/v4/delta/timeseries.go (2)
- 17-17: The inclusion of start and end parameters in `prepareTimeAggregationSubQuery` is important for accurate time filtering. Confirm that this change is correctly implemented across all calls to this function.
- 22-29: The update to use `unix_milli` for time filtering and the change to `SIGNOZ_SAMPLES_V4_TABLENAME` indicate schema updates. Ensure these changes are accurately reflected in all relevant queries and tests.
pkg/query-service/interfaces/interface.go (1)
- 102-102: The addition of `GetMetricMetadata` to the `Reader` interface expands its responsibilities. Ensure that all implementations are updated accordingly to support this new method.
pkg/query-service/app/queryBuilder/functions.go (1)
- 284-298: The introduction of `funcTimeShift` to shift timestamps in result series is a significant addition. Ensure that the logic correctly handles both positive and negative shifts and that it is accurately applied to all points in a series.
pkg/query-service/app/metrics/v4/delta/time_series_test.go (2)
- 69-69: Ensure the replacement of `timestamp_ms` with `unix_milli` aligns with the updated time representation across the system.
- 110-110: Verify that the updated query, with `unix_milli` replacing `timestamp_ms`, correctly reflects the intended time intervals and filtering criteria.
pkg/query-service/app/metrics/v4/cumulative/timeseries_test.go (2)
- 69-69: Ensure the replacement of `timestamp_ms` with `unix_milli` is consistent with the system's updated time representation.
- 110-110: Confirm that the query update, replacing `timestamp_ms` with `unix_milli`, accurately reflects the intended time intervals and filtering criteria.
pkg/query-service/app/metrics/v4/cumulative/timeseries.go (1)
- 110-115: Confirm that the replacement of `timestamp_ms` with `unix_milli` in the time filtering logic is consistent with the system's updated time representation and does not introduce any inconsistencies.
pkg/query-service/constants/constants.go (1)
- 206-215: The introduction of new constants for table names supports expanded data storage structures. Ensure these constants are consistently used across the application and check for potential naming conflicts.
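As a rough sketch of how such table-name constants tend to be used together with range-based table selection: the table names below follow the v4 naming visible in this PR, but the constant names and duration cut-offs are assumptions for illustration, not the service's actual values.

```go
package main

import (
	"fmt"
	"time"
)

// Hypothetical table-name constants in the style added to constants.go.
const (
	signozTSTableNameV4     = "time_series_v4"
	signozTSTableNameV46Hrs = "time_series_v4_6hrs"
	signozTSTableNameV41Day = "time_series_v4_1day"
)

// whichTSTable picks a time-series table based on the window length,
// trading granularity for scan cost on long ranges. Thresholds assumed.
func whichTSTable(startMs, endMs int64) string {
	d := time.Duration(endMs-startMs) * time.Millisecond
	switch {
	case d <= 6*time.Hour:
		return signozTSTableNameV4
	case d <= 6*24*time.Hour:
		return signozTSTableNameV46Hrs
	default:
		return signozTSTableNameV41Day
	}
}

func main() {
	fmt.Println(whichTSTable(0, int64(30*time.Minute/time.Millisecond)))
}
```

Centralizing the names as constants makes the naming-conflict check the reviewer asks for a simple grep over the constants file.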
pkg/query-service/app/metrics/v4/query_builder_test.go (1)
- 58-64: > 📝 NOTE
This review was outside the diff hunks and was mapped to the diff hunk with the greatest overlap. Original lines [36-141]
The updates to SQL queries in test cases, including the adoption of the `time_series_v4` table and the addition of time range conditions, are correctly implemented and align with the PR objectives.
pkg/query-service/model/v3/v3.go (4)
- 466-481: Ensure the `TimeAggregation` enum includes all relevant aggregation types that the system supports. Missing or extraneous types could lead to incorrect query behavior.
- 508-524: Similar to `TimeAggregation`, verify that `SpaceAggregation` covers all necessary aggregation types for spatial data processing.
- 589-596: > 📝 NOTE
This review was outside the diff hunks and was mapped to the diff hunk with the greatest overlap. Original lines [574-593]
The addition of `FunctionNameTimeShift` extends the system's functionality. Confirm that all necessary logic for handling time shifts, including parsing and applying the shift to queries, is implemented elsewhere in the codebase.
- 992-1000: The `MetricMetadataResponse` struct is well-defined. Ensure that all fields are correctly populated from the data source and that the frontend correctly interprets this structure.
pkg/query-service/app/parser.go (5)
- 938-938: The error message in `validateExpressions` now includes the invalid expression along with the error. This change improves error clarity for debugging.
- 941-941: The addition of a check for unknown variables in expressions enhances the validation logic, preventing runtime errors due to undefined variables.
- 963-963: Improved error reporting in `ParseQueryRangeParams` by specifying that the request body cannot be parsed, which aids in troubleshooting.
- 971-971: The initialization of the `formattedVars` map in `ParseQueryRangeParams` is a preparatory step for handling different query types, which is a good practice for code clarity and maintainability.
- 1049-1066: The logic to handle time-shifting functions in query processing is introduced. This includes moving the time shift function to the beginning of the list and calculating the `ShiftBy` value. This change enables dynamic adjustment of time ranges in queries, aligning with the PR objectives.
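The reordering and `ShiftBy` extraction described in the last item might look roughly like this. The `Function` type and names are simplified stand-ins for the v3 model, not the actual parser code:

```go
package main

import "fmt"

// Function is a simplified stand-in for the query function model.
type Function struct {
	Name string
	Args []float64
}

// extractShiftBy moves any timeShift function to the front of the list and
// returns its shift (assumed here to be seconds), so the builder can adjust
// the query window before other functions run. Sketch of the described logic.
func extractShiftBy(fns []Function) ([]Function, float64) {
	var shiftBy float64
	ordered := make([]Function, 0, len(fns))
	for _, fn := range fns {
		if fn.Name == "timeShift" && len(fn.Args) > 0 {
			shiftBy = fn.Args[0]
			ordered = append([]Function{fn}, ordered...)
			continue
		}
		ordered = append(ordered, fn)
	}
	return ordered, shiftBy
}

func main() {
	fns, shiftBy := extractShiftBy([]Function{
		{Name: "absolute"},
		{Name: "timeShift", Args: []float64{3600}},
	})
	fmt.Println(fns[0].Name, shiftBy)
}
```

Running timeShift first matters because every later function should see data from the already-shifted window.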
This can also be used to suggest/restrict the aggregations that can be used for the metric name, right? So, should it be added to the API that fetches/suggests metric names too? If this is incremental, we can pick this up when needed.
Yes, I added the metric type here https://github.com/SigNoz/signoz/pull/4445/files#diff-a8c15149b48aee5261176aa62fa630bb7bde1c4362583ff95871ba24cec75bf9R3976, which helps the frontend show the appropriate options. I discussed this with @YounixM already. The unit and description are good to have but can't be added in the
So the frontend is still planned to work with the v3 APIs, and the backend will be prepared for the v4 structure?
Yes, the work I did is under the v4 endpoint, but the request/response payload is structurally v3 (with added time and space aggregation fields, and functions), and the frontend will work with that. The v4 request/response structure will be a breaking change, with each signal sharing the common parts and extending them to accommodate its signal-specific details. Since we are discussing these details, let me share the migration plan we want to take as well. We will start maintaining a version for dashboards from now on and make upgrades gracefully from the UI side when possible. The path we discussed for metrics is to use the dashboard version to decide whether to show the time and space aggregation options or the existing version. We will provide instructions and scripts to move away, but there is no instant migration for all users since we can't automatically migrate some aggregate operators without knowing the context of the metric.
We will have the delta data backfilled in a week for span metrics, and the tables used are v4. Here is the script I used to compare the delta and cumulative results side by side: https://gist.github.com/srikanthccv/882027fe4229cef8fa403f28fe8babd0. We will most likely shift to the delta + v4 tables for span metrics soon unless something strange comes up in further testing.
But using delta is not just for quantiles, right? The script compares quantiles.
Right, there are two types where delta is used: 1. Histograms 2. Counters. The most common use case for counters, as we know, is rates. The Prometheus-style quantiles we support today use the sum rate internally. If the quantiles are correct, we can assume the sum rate (by bucket) is correct as well.
Summary
Part of #4016
The number of files it touches is big, but the changes are scoped. Instead of creating one PR for each, I thought to include them all here.
- `unit`, and `description` from the instrument. This can be used to autofill the y-axis unit.
- `le`