-
Notifications
You must be signed in to change notification settings - Fork 244
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: telemetry improvements #1393
Conversation
WalkthroughThis update focuses on refining telemetry configurations in the Tailcall project to align with OpenTelemetry standards. It includes standardizing naming conventions for metrics and traces, enhancing observability, introducing request header tracking, and ensuring consistency across the telemetry implementation. Changes
Assessment against linked issues
Poem
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (invoked as PR comments)
Additionally, you can add CodeRabbit Configration File (
|
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1393 +/- ##
==========================================
- Coverage 88.52% 88.47% -0.06%
==========================================
Files 129 129
Lines 13755 13799 +44
==========================================
+ Hits 12177 12208 +31
- Misses 1578 1591 +13 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review Status
Actionable comments generated: 9
Configuration used: CodeRabbit UI
Files ignored due to path filters (2)
tests/snapshots/execution_spec__caching.md_assert_metrics_0.snap
is excluded by:!**/*.snap
tests/snapshots/execution_spec__caching.md_assert_traces_0.snap
is excluded by:!**/*.snap
Files selected for processing (31)
- .github/contributing.md (1 hunks)
- autogen/src/main.rs (1 hunks)
- aws-lambda/src/main.rs (3 hunks)
- examples/telemetry-otlp.graphql (1 hunks)
- generated/.tailcallrc.graphql (2 hunks)
- generated/.tailcallrc.schema.json (3 hunks)
- src/blueprint/blueprint.rs (1 hunks)
- src/blueprint/from_config.rs (1 hunks)
- src/blueprint/into_schema.rs (1 hunks)
- src/blueprint/telemetry.rs (3 hunks)
- src/cli/metrics.rs (1 hunks)
- src/cli/server/server.rs (1 hunks)
- src/cli/telemetry.rs (5 hunks)
- src/config/config.rs (3 hunks)
- src/config/from_document.rs (2 hunks)
- src/config/reader.rs (2 hunks)
- src/config/telemetry.rs (3 hunks)
- src/http/request_handler.rs (4 hunks)
- src/lambda/concurrent.rs (1 hunks)
- src/lambda/expression.rs (4 hunks)
- src/lambda/io.rs (1 hunks)
- src/lambda/list.rs (1 hunks)
- src/lambda/logic.rs (1 hunks)
- src/lambda/math.rs (1 hunks)
- src/lambda/relation.rs (1 hunks)
- src/main.rs (3 hunks)
- src/rest/endpoint.rs (4 hunks)
- src/rest/partial_request.rs (2 hunks)
- src/rest/path.rs (2 hunks)
- src/tracing.rs (2 hunks)
- tests/telemetry/init.rs (2 hunks)
Check Runs (1)
codecov/patch completed (15)
- src/blueprint/telemetry.rs: 68-71: Added lines #L68 - L71 were not covered by tests
- src/cli/telemetry.rs: 95-98: Added lines #L95 - L98 were not covered by tests
- src/cli/telemetry.rs: 205-205: Added line #L205 was not covered by tests
- src/cli/telemetry.rs: 219-219: Added line #L219 was not covered by tests
- src/cli/telemetry.rs: 228-231: Added lines #L228 - L231 were not covered by tests
- src/http/request_handler.rs: 26-33: Added lines #L26 - L33 were not covered by tests
- src/http/request_handler.rs: 38-46: Added lines #L38 - L46 were not covered by tests
- src/http/request_handler.rs: 48-54: Added lines #L48 - L54 were not covered by tests
- src/http/request_handler.rs: 57-57: Added line #L57 was not covered by tests
- src/http/request_handler.rs: 202-202: Added line #L202 was not covered by tests
- src/http/request_handler.rs: 209-209: Added line #L209 was not covered by tests
- src/lambda/expression.rs: 37-37: Added line #L37 was not covered by tests
- src/main.rs: 28-28: Added line #L28 was not covered by tests
- src/main.rs: 43-43: Added line #L43 was not covered by tests
- src/tracing.rs: 106-106: Added line #L106 was not covered by tests
Additional comments: 44
src/cli/metrics.rs (1)
- 9-9: The change from an observable counter to an observable gauge for "cache.hit_rate" is appropriate for representing the cache hit rate ratio over time. Observable gauges are better suited for values that can increase and decrease, like hit rates. Ensure that the gauge is correctly updated in all relevant parts of the application to maintain accurate telemetry.
examples/telemetry-otlp.graphql (2)
- 15-15: Adding the
requestHeaders
field with the value["user-id"]
to the telemetry schema is a good practice for capturing user-specific information in telemetry data. This aligns with the PR's objective to provide more detailed IO information. Ensure that the inclusion of user IDs complies with privacy and data protection regulations.- 21-21: The addition of the
@cache(maxAge: 3000)
directive to theposts
field is a positive change for improving the efficiency of data retrieval. It specifies that the posts data can be cached for up to 3000 milliseconds, reducing the need for frequent backend calls. Verify that this caching duration aligns with the application's data freshness requirements.src/rest/partial_request.rs (2)
- 5-5: The import of the
Path
module is necessary for the addition of thepath
field to thePartialRequest
struct. This change supports the PR's goal of enhancing telemetry with detailed IO information. Ensure that thePath
module provides the expected functionality and integrates well with the rest of the application.- 15-15: Adding the
path
field to thePartialRequest
struct is a valuable enhancement for telemetry, as it allows capturing the request path information. This addition aligns with the PR's objectives and can provide insights into the usage patterns of different endpoints. Ensure that thepath
information is correctly populated and used in telemetry data.aws-lambda/src/main.rs (2)
- 11-11: The inclusion of the
get_log_level
function from thetailcall::tracing
module simplifies the retrieval of the log level, enhancing code readability and maintainability. This change aligns with the PR's objectives of improving observability and telemetry within the project. Ensure that theget_log_level
function correctly handles all possible scenarios for log level configuration.- 20-25: Refactoring the logging configuration to use the
get_log_level
function is a positive change that simplifies the code and improves readability. The default log level ofTRACE
ensures that all logs are captured by default, which can be filtered later in CloudWatch. This approach aligns with best practices for observability. Verify that the log level configuration works as expected in different deployment environments.src/lambda/concurrent.rs (1)
- 9-9: Adding the
strum_macros::Display
trait to theConcurrent
enum is a beneficial change that enhances its functionality for display purposes. This makes telemetry data more user-friendly and accessible, aligning with the PR's objectives of improving observability. Ensure that the display representations of the enum variants are clear and meaningful to the end-users.src/cli/server/server.rs (1)
- 41-41: Modifying the
init_opentelemetry
function call to useblueprint.telemetry
instead ofblueprint.opentelemetry
is a positive change that reflects an effort to standardize telemetry configuration. This renaming aligns with the PR's objectives of enhancing telemetry capabilities. Ensure that all references to the oldopentelemetry
field are updated accordingly throughout the project.src/main.rs (3)
- 8-8: The renaming of tracing function calls from
default_tailcall_tracing()
todefault_tracing_tailcall()
is a part of the effort to standardize and improve telemetry and observability. This change aligns with the PR's objectives. Ensure that all references to the old function name are updated throughout the project to maintain consistency.- 28-28: The change in tracing function calls to
default_tracing_tailcall()
in theon_thread_start
closure is appropriate for setting up the default tracing configuration for CLI output. This ensures that logs are captured correctly before the global subscriber is set. Verify that this change does not impact the overall tracing behavior negatively.- 43-43: Using
default_tracing_tailcall()
to set the default tracing subscriber for the main thread is consistent with the changes made elsewhere in the project. This ensures that CLI logs are captured according to the standardized tracing configuration. Confirm that this setup works as expected across different execution environments.tests/telemetry/init.rs (2)
- 8-8: Replacing the usage of
tracing_subscriber::filter::filter_fn
withtailcall::tracing::default_tracing
andfilter_target
for log filtering is a positive change that standardizes the tracing configuration. This approach simplifies the setup and makes it more maintainable. Ensure that thefilter_target
function correctly filters logs related to "execution_spec" as intended.- 50-52: The updated subscriber configuration using
default_tracing().with_filter(filter_target("execution_spec"))
is a good practice for standardizing log filtering. This change aligns with the PR's objectives of enhancing telemetry and observability. Verify that the filtering works as expected and does not inadvertently exclude important logs.src/rest/path.rs (3)
- 24-24: Adding a
pattern
field to thePath
struct is a good enhancement for storing the original pattern string. This could be useful for debugging or logging purposes.- 29-31: The
as_str
method provides a convenient way to access thepattern
as a string slice. This is a straightforward and useful addition.- 50-50: The update to the
parse
method to initialize thepattern
field with the input string is correctly implemented. It ensures that thepattern
field is always in sync with the input used to create thePath
instance.src/lambda/list.rs (1)
- 12-12: Deriving the
strum_macros::Display
trait for theList
enum is a beneficial change. It simplifies the process of converting enum variants to their string representation, which can be particularly useful for logging or error messages.src/blueprint/telemetry.rs (1)
- 29-29: Adding the
request_headers
field to theTelemetry
struct is a significant improvement. It allows for specifying headers for telemetry exporters, addressing potential data leakage concerns and enhancing configurability.autogen/src/main.rs (1)
- 20-20: Updating the tracing configuration function call from
default_crate_tracing
todefault_tracing_for_name
in themain
function is a positive change. It aligns with the refactoring efforts to improve clarity and consistency in naming conventions.src/tracing.rs (2)
- 74-75: Refactoring
default_tracing_tailcall
to utilizedefault_tracing_for_name
with a specific name is a good practice. It simplifies the code and makes the tracing configuration more flexible.- 78-79: The introduction of
default_tracing_for_name
function provides a more granular control over tracing configurations, allowing for specific naming. This is a valuable addition for better observability.src/blueprint/from_config.rs (1)
- 38-39: Renaming references from
opentelemetry
totelemetry
within theconfig_blueprint
function is a thoughtful change. It aligns with the broader effort to standardize and clarify the configuration terminology.src/lambda/logic.rs (1)
- 9-9: Deriving the
strum_macros::Display
trait for theLogic
enum is a beneficial change. It simplifies the process of converting enum variants to their string representation, which can be particularly useful for logging or error messages.src/blueprint/into_schema.rs (1)
- 59-60: The change in span naming from "field::resolver" to "field_resolver" and the addition of the "otel.name" field align well with OpenTelemetry standards, enhancing readability and consistency in telemetry data. Ensure the integration with telemetry platforms like DataDog and NewRelic is tested to verify that these changes are correctly reflected.
.github/contributing.md (1)
- 50-62: The new Telemetry section provides valuable guidelines for implementing observability standards following the OpenTelemetry specification. It's well-structured and informative. Consider adding examples or links to further documentation on OpenTelemetry for contributors who might be new to these concepts.
src/lambda/math.rs (1)
- 13-13: Adding the
strum_macros::Display
trait to theMath
enum is a good practice for improving debuggability and logging. Ensure that all enum variants have meaningful display representations for consistency and clarity in logs.src/blueprint/blueprint.rs (1)
- 25-25: Renaming the
opentelemetry
field totelemetry
in theBlueprint
struct enhances clarity and avoids potential confusion with the OpenTelemetry project. Ensure that this renaming is consistently applied throughout the project and that the integration with the telemetry system is verified.src/config/telemetry.rs (1)
- 122-131: The modifications to the
merge_right
andrender_mustache
methods in theTelemetry
struct follow Rust's ownership and borrowing rules effectively. These changes improve the flexibility and maintainability of telemetry configuration. Verify that these method changes integrate smoothly with the rest of the telemetry system.src/lambda/io.rs (1)
- 24-24: Adding the
strum_macros::Display
trait to theIO
enum is a good practice for improving logging and error messaging. Ensure that all enum variant names are clear and descriptive, as they will directly influence the output of theto_string
method.src/cli/telemetry.rs (2)
- 29-29: Renaming
default_tailcall_tracing
todefault_tracing_tailcall
improves naming consistency and clarity. Good job on standardizing naming conventions.- 95-98: Modifications to telemetry initialization introduce additional configuration options and adjustments to tracing layers and filters. These changes are likely to offer more granular control over telemetry behavior. Please verify their impact on performance and usability.
Also applies to: 205-205, 219-219, 228-231
src/rest/endpoint.rs (2)
- 101-105: Adding the
path
field to thePartialRequest
struct is a useful enhancement for providing more detailed request matching information. Ensure that the path information is properly sanitized and validated to prevent security issues.- 141-141: Removing extra whitespace from the
MULTIPLE_TEST_QUERY
constant improves code readability and consistency. Good job on maintaining code cleanliness.src/http/request_handler.rs (3)
- 26-33: The static
REQUEST_COUNTER
is correctly initialized using theLazy
pattern, ensuring thread safety and lazy initialization. However, it's important to ensure that the metric namerequest.count
and its description align with the telemetry standards you're adopting.- 133-138: The addition of
update_request_count
in thegraphql_request
function is a good practice for telemetry. Ensure that the route/graphql
is consistently used across your telemetry to accurately represent GraphQL requests.- 183-218: > 📝 NOTE
This review was outside the diff hunks and was mapped to the diff hunk with the greatest overlap. Original lines [215-246]
The
handle_request
function correctly routes requests based on the URI path and method. Ensure that the telemetry and tracing are consistently applied across all request types, including GraphQL, REST, and Prometheus metrics endpoints.src/lambda/relation.rs (1)
- 15-15: Adding the
strum_macros::Display
trait to theRelation
enum is a good practice for improving the usability and debuggability of your code. Ensure that the display representations are meaningful and consistent with your application's logging and telemetry standards.src/config/from_document.rs (1)
- 48-57: Renaming the function from
opentelemetry
totelemetry
and updating its usage within thefrom_document
function aligns with the objective of standardizing telemetry configuration. Ensure that all references to the old function name are updated across the codebase to maintain consistency.src/config/reader.rs (1)
- 296-302: > 📝 NOTE
This review was outside the diff hunks and was mapped to the diff hunk with the greatest overlap. Original lines [299-310]
The method
update_opentelemetry
has been updated to reflect the change fromopentelemetry
totelemetry
within theConfigModule
struct. This change aligns with the PR's objective to standardize and improve telemetry capabilities. However, ensure that all references to the oldopentelemetry
field elsewhere in the codebase have been updated accordingly to prevent any inconsistencies or errors.src/config/config.rs (2)
- 61-61: The renaming of the
opentelemetry
field totelemetry
in theConfig
struct aligns with the PR's objectives for standardizing telemetry configuration. Ensure all references to this field throughout the codebase are updated to reflect this change.Verification successful
The references to
opentelemetry
found in the codebase are related to the use of the OpenTelemetry library itself and not to the specific field in theConfig
struct that was renamed fromopentelemetry
totelemetry
. These references are valid and indicate that the renaming has been correctly handled in the context of theConfig
struct without leaving outdated references. The review comment is therefore verified as correct in its context.* 177-183: > 📝 **NOTE** > This review was outside the diff hunks and was mapped to the diff hunk with the greatest overlap. Original lines [180-189]Scripts Executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Search for any remaining references to the old field name 'opentelemetry'. rg --type rust 'opentelemetry'Length of output: 6581
The update to the
merge_right
method in theConfig
struct to handle the renamedtelemetry
field is correct. Ensure theTelemetry
struct'smerge_right
method is implemented correctly to support this functionality.Verification successful
The implementation of the
merge_right
method in theTelemetry
struct has been successfully located and inspected. The method correctly merges theexport
andrequest_headers
fields of twoTelemetry
instances, handling various cases for theexport
field and extending therequest_headers
from one instance with those of the other. This implementation aligns with the expected functionality for merging telemetry configurations in the context of theConfig
struct's use of this method.Based on this information, the update to the
merge_right
method in theConfig
struct to handle the renamedtelemetry
field, as well as the implementation of theTelemetry
struct'smerge_right
method, are correct and support the intended functionality.Scripts Executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Verify the implementation of the `Telemetry` struct's `merge_right` method. rg --type rust 'fn merge_right' src/config/telemetry.rsLength of output: 332
Script:
#!/bin/bash # Inspect the public `merge_right` method implementation in the `Telemetry` struct. rg --type rust 'pub fn merge_right\(mut self, other: Self\) -> Self {' -A 10 src/config/telemetry.rsLength of output: 272
Script:
#!/bin/bash # Corrected command to inspect the public `merge_right` method implementation in the `Telemetry` struct. rg --type rust 'pub fn merge_right(mut self, other: Self) -> Self {' -A 10 src/config/telemetry.rsLength of output: 266
Script:
#!/bin/bash # Simplified command to find the `merge_right` method in the `Telemetry` struct without complex regex. rg --type rust 'pub fn merge_right' src/config/telemetry.rs -A 10Length of output: 495
generated/.tailcallrc.schema.json (2)
- 33-40: The addition of the
telemetry
section with a reference to#/definitions/Telemetry
aligns with the PR's objective to enhance telemetry capabilities. It's crucial to ensure that theTelemetry
definition (not visible in the provided snippet) properly encapsulates all necessary properties and adheres to the OpenTelemetry specification.- 1676-1681: The introduction of the
requestHeaders
field under thetelemetry
section is a significant improvement, allowing for more detailed telemetry data. However, the warning about leaking sensitive information is crucial. It's recommended to provide examples or guidelines on how to safely use this feature without exposing sensitive data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review Status
Actionable comments generated: 0
Configuration used: CodeRabbit UI
Files selected for processing (9)
- generated/.tailcallrc.graphql (2 hunks)
- generated/.tailcallrc.schema.json (3 hunks)
- src/blueprint/blueprint.rs (1 hunks)
- src/blueprint/telemetry.rs (3 hunks)
- src/cli/telemetry.rs (5 hunks)
- src/config/telemetry.rs (3 hunks)
- src/http/request_handler.rs (4 hunks)
- src/lambda/expression.rs (4 hunks)
- src/lambda/io.rs (1 hunks)
Files skipped from review as they are similar to previous changes (9)
- generated/.tailcallrc.graphql
- generated/.tailcallrc.schema.json
- src/blueprint/blueprint.rs
- src/blueprint/telemetry.rs
- src/cli/telemetry.rs
- src/config/telemetry.rs
- src/http/request_handler.rs
- src/lambda/expression.rs
- src/lambda/io.rs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review Status
Actionable comments generated: 1
Configuration used: CodeRabbit UI
Files ignored due to path filters (3)
tests/snapshots/execution_spec__caching.md_assert_metrics_0.snap
is excluded by:!**/*.snap
tests/snapshots/execution_spec__caching.md_assert_traces_0.snap
is excluded by:!**/*.snap
tests/snapshots/execution_spec__caching.md_merged.snap
is excluded by:!**/*.snap
Files selected for processing (2)
- src/http/request_handler.rs (4 hunks)
- tests/execution/caching.md (1 hunks)
Additional comments: 5
tests/execution/caching.md (2)
- 9-10: The increase in cache expiration time for
fieldCache
andfieldCacheList
queries is noted. Ensure to assess the impact of this change on data freshness, especially for frequently changing data.- 6-13: > 📝 NOTE
This review was outside the diff hunks, and no overlapping diff hunk was found. Original lines [1-1]
The YAML configurations for mocking requests and asserting caching behavior appear comprehensive. Consider adding more test cases if there are specific edge cases not covered by the current tests.
src/http/request_handler.rs (3)
- 10-13: The new imports for telemetry improvements are correctly added and necessary for the functionality being introduced.
- 26-33: The implementation of the static
REQUEST_COUNTER
usingLazy
is correctly done and essential for counting incoming requests as part of the telemetry improvements.- 213-254: > 📝 NOTE
This review was outside the diff hunks and was mapped to the diff hunk with the greatest overlap. Original lines [157-245]
The updates to request handling functions, including the integration of request count updates and tracing spans, correctly enhance observability as part of the telemetry improvements. Consider adding comments to these functions for better clarity and maintainability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review Status
Actionable comments generated: 0
Configuration used: CodeRabbit UI
Files ignored due to path filters (1)
tests/snapshots/execution_spec__caching.md_assert_metrics_0.snap
is excluded by:!**/*.snap
Files selected for processing (1)
- src/config/config.rs (2 hunks)
Files skipped from review as they are similar to previous changes (1)
- src/config/config.rs
Summary:
Issue Reference(s):
Fixes #1386
Build & Testing:
cargo test
successfully../lint.sh --mode=fix
to fix all linting issues raised by./lint.sh --mode=check
.Checklist:
<type>(<optional scope>): <title>
Summary by CodeRabbit
requestHeaders
field to GraphQL schema and configuration for specifying headers in telemetry data.