Support exporting quantile for summary metrics #17265

Merged
merged 10 commits into from
Feb 24, 2023

@khanhntd khanhntd commented Dec 27, 2022

Description:
Because awsemfexporter exports OTel summary metrics as a Statistical Set, customers lose the information for each quantile. The loss is more critical when the receiver does not set Min and Max on the summary metric, because the resulting Statistical Set then has Min and Max at 0. By turning on DetailedMetrics (`detailed_metrics: true`), customers get detailed information for each quantile. The same config option would allow this for histogram metrics, since they are exported as a Statistical Set too.
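To make the description concrete, here is a minimal sketch of the expansion it describes: one record per quantile, plus one record carrying count and sum. It is written in Python for illustration only and is not the exporter's Go code. `SummaryPoint` and `expand_quantiles` are hypothetical names, and the values for the 0.25, 0.5 and 0.75 quantiles are invented for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class SummaryPoint:
    """Hypothetical stand-in for one OTel summary data point."""
    name: str
    count: int
    total: float
    quantiles: List[Tuple[float, float]]  # (quantile, value) pairs
    labels: Dict[str, str]


def expand_quantiles(point: SummaryPoint) -> List[dict]:
    """Return one record per quantile, plus one record with count and sum."""
    records = []
    for q, value in point.quantiles:
        labels = dict(point.labels)
        # "g" formatting drops trailing zeros: 1.0 -> "1", 0.25 -> "0.25"
        labels["quantile"] = format(q, "g")
        records.append({"labels": labels, "metrics": {point.name: value}})
    # Count and sum are kept together in a single record without a quantile label.
    records.append({
        "labels": dict(point.labels),
        "metrics": {
            point.name + "_count": point.count,
            point.name + "_sum": point.total,
        },
    })
    return records


point = SummaryPoint(
    name="go_gc_duration_seconds",
    count=1,
    total=0.0000597,
    # The middle three values are made up for illustration.
    quantiles=[(0, 0.000033009), (0.25, 0.00004), (0.5, 0.00005),
               (0.75, 0.00006), (1, 0.00058318)],
    labels={"job": "MY_JOB", "key1": "value1"},
)

for record in expand_quantiles(point):
    print(record)
```

With five quantiles, the sketch yields six records, which matches the number of log events reported in the manual test.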

Testing:
I tested these changes by:

  • Running the package unit tests
ok      github.com/open-telemetry/opentelemetry-collector-contrib/exporter/awsemfexporter       1.020s
  • Manual testing with the following config
exporters:
    awsemf/prometheus:
        dimension_rollup_option: NoDimensionRollup
        eks_fargate_container_insights_enabled: false
        endpoint: ""
        local_mode: false
        log_group_name: Detailed-Prometheus-On
        log_stream_name: '{job}'
        max_retries: 2
        metric_declarations:
            - dimensions:
                - - key1
                  - quantile                  
              label_matchers:
                - label_names:
                    - job
                  regex: MY_JOB
                  separator: ;
              metric_name_selectors:
                - ^go_gc_duration_seconds$
        metric_descriptors: []
        detailed_metrics: true
        namespace: CWAgent-Internal
        no_verify_ssl: false
        num_workers: 8
        output_destination: cloudwatch
        parse_json_encoded_attr_values: []
        proxy_address: ""
        region: ""
        request_timeout_seconds: 30
        resource_arn: ""
        resource_to_telemetry_conversion:
            enabled: true
        role_arn: ""
extensions: {}
processors:
    batch/prometheus:
        send_batch_max_size: 0
        send_batch_size: 8192
        timeout: 200ms
    metricstransform/prometheus:
        transforms: []
    resource/prometheus:
        attributes:
            - action: upsert
              converted_type: ""
              from_attribute: service.name
              from_context: ""
              key: job
              pattern: ""
              value: null
            - action: delete
              converted_type: ""
              from_attribute: ""
              from_context: ""
              key: service.name
              pattern: ""
              value: null
            - action: upsert
              converted_type: ""
              from_attribute: service.instance.id
              from_context: ""
              key: instance
              pattern: ""
              value: null
            - action: delete
              converted_type: ""
              from_attribute: ""
              from_context: ""
              key: service.instance.id
              pattern: ""
              value: null
            - action: delete
              converted_type: ""
              from_attribute: ""
              from_context: ""
              key: net.host.port
              pattern: ""
              value: null
            - action: delete
              converted_type: ""
              from_attribute: ""
              from_context: ""
              key: http.scheme
              pattern: ""
              value: null
            - action: insert
              converted_type: ""
              from_attribute: ""
              from_context: ""
              key: Version
              pattern: ""
              value: 1
            - action: insert
              converted_type: ""
              from_attribute: ""
              from_context: ""
              key: receiver
              pattern: ""
              value: prometheus
receivers:
    prometheus/prometheus:
        buffer_count: 0
        buffer_period: 0s
        config:
            global:
                evaluation_interval: 1m
                scrape_interval: 1m
                scrape_timeout: 10s
            scrape_configs:
                - enable_http2: true
                  file_sd_configs:
                    - files:
                        - /home/ec2-user/prometheus_sd_1.yaml
                      refresh_interval: 5m
                  follow_redirects: true
                  honor_timestamps: true
                  job_name: MY_JOB
                  metrics_path: /metrics
                  sample_limit: 10000
                  scheme: http
                  scrape_interval: 1m
                  scrape_timeout: 10s
        start_time_metric_regex: ""
        target_allocator: null
        use_start_time_metric: false
service:
    extensions: []
    pipelines:
        metrics/prometheus:
            exporters:
                - awsemf/prometheus
            processors:
                - batch/prometheus
                - resource/prometheus
                - metricstransform/prometheus
            receivers:
                - prometheus/prometheus
    telemetry:
        logs:
            development: false
            disable_caller: false
            disable_stacktrace: false
            encoding: json
            error_output_paths: []
            initial_fields: {}
            level: info
            output_paths: []
        metrics:
            address: ""
            level: none
        resource: {}
        traces:
            propagators: []
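
The `metric_declarations` entry in the config above decides which metrics are emitted with the `[key1, quantile]` dimension set. The following Python sketch shows one plausible reading of that selection logic; it is not the exporter's implementation. It assumes that a metric must match at least one name selector, that the values of `label_names` are joined with `separator` before being tested against `regex`, that one matching label matcher is enough, and that a missing label is treated as an empty string. `matches` is a hypothetical helper name.

```python
import re
from typing import Dict

# The declaration from the config above, written as a Python dict.
declaration = {
    "dimensions": [["key1", "quantile"]],
    "label_matchers": [
        {"label_names": ["job"], "regex": "MY_JOB", "separator": ";"},
    ],
    "metric_name_selectors": ["^go_gc_duration_seconds$"],
}


def matches(decl: dict, metric_name: str, labels: Dict[str, str]) -> bool:
    """Return True if the metric name and labels satisfy the declaration."""
    # The metric name must match at least one selector.
    if not any(re.search(p, metric_name) for p in decl["metric_name_selectors"]):
        return False
    # Assumption: any single label matcher that matches is sufficient.
    for matcher in decl["label_matchers"]:
        joined = matcher["separator"].join(
            labels.get(name, "") for name in matcher["label_names"]
        )
        if re.search(matcher["regex"], joined):
            return True
    return False


labels = {"job": "MY_JOB", "key1": "value1", "quantile": "1"}
print(matches(declaration, "go_gc_duration_seconds", labels))
print(matches(declaration, "go_gc_duration_seconds", {"job": "OTHER"}))
```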

With this config, the result is 6 different log events: one for each quantile (0, 0.25, 0.5, 0.75 and 1), plus one log event carrying the count and sum.

{
    "Version": "1",
    "_aws": {
        "CloudWatchMetrics": [
            {
                "Namespace": "CWAgent-Internal",
                "Dimensions": [
                    [
                        "key1",
                        "quantile"
                    ]
                ],
                "Metrics": [
                    {
                        "Name": "go_gc_duration_seconds"
                    }
                ]
            }
        ],
        "Timestamp": 1672102208350
    },
    "go_gc_duration_seconds": 0.00058318,
    "instance": "127.0.0.1:8000",
    "job": "MY_JOB",
    "key1": "value1",
    "key2": "value2",
    "prom_metric_type": "summary",
    "quantile": "1",
    "receiver": "prometheus"
}
{
    "Version": "1",
    "go_gc_duration_seconds_count": 1,
    "go_gc_duration_seconds_sum": 0.0000597010000000231,
    "instance": "127.0.0.1:8000",
    "job": "MY_JOB",
    "key1": "value1",
    "key2": "value2",
    "prom_metric_type": "summary",
    "receiver": "prometheus"
}
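
The shape of the first log event above can be reproduced with a short sketch. This Python example is illustrative only: `build_emf_event` is a hypothetical helper, not part of awsemfexporter, and it simply assembles the same fields that appear in the sample output (labels as top-level keys, the metric value under its own name, and the `_aws` metadata block).

```python
import json
from typing import Dict, List


def build_emf_event(namespace: str, timestamp_ms: int, metric_name: str,
                    value: float, labels: Dict[str, str],
                    dimensions: List[List[str]]) -> dict:
    """Assemble one EMF-style log event for a single metric value."""
    event = dict(labels)          # labels become top-level fields
    event[metric_name] = value    # the metric value sits next to the labels
    event["_aws"] = {
        "CloudWatchMetrics": [{
            "Namespace": namespace,
            "Dimensions": dimensions,
            "Metrics": [{"Name": metric_name}],
        }],
        "Timestamp": timestamp_ms,
    }
    return event


event = build_emf_event(
    namespace="CWAgent-Internal",
    timestamp_ms=1672102208350,
    metric_name="go_gc_duration_seconds",
    value=0.00058318,
    labels={"job": "MY_JOB", "key1": "value1", "quantile": "1"},
    dimensions=[["key1", "quantile"]],
)
print(json.dumps(event, indent=4, sort_keys=True))
```

Because `quantile` is an ordinary label here, listing it in `dimensions` is what lets each quantile appear as its own metric series.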

Before applying the changes, summary metrics were exported in the following Embedded Metric Format (EMF):

{
    "Version": "1",
    "_aws": {
        "CloudWatchMetrics": [
            {
                "Namespace": "CWAgent-Internal",
                "Dimensions": [
                    [
                        "key1"
                    ]
                ],
                "Metrics": [
                    {
                        "Name": "go_gc_duration_seconds"
                    }
                ]
            }
        ],
        "Timestamp": 1672101428349
    },
    "go_gc_duration_seconds": {
        "Max": 0.00058318,
        "Min": 0.000033009,
        "Count": 0,
        "Sum": 0
    },
    "instance": "127.0.0.1:8000",
    "job": "MY_JOB",
    "key1": "value1",
    "key2": "value2",
    "prom_metric_type": "summary",
    "receiver": "prometheus"
}
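
For contrast, here is a minimal Python sketch of the collapse into a Statistical Set that the description refers to. It is an assumption-laden illustration, not the exporter's code: it assumes Min and Max are taken from the 0 and 1 quantiles when the receiver provides them and fall back to 0 otherwise, and it passes count and sum through unchanged, so it does not try to reproduce the Count and Sum values of 0 shown above. `to_statistical_set` is a hypothetical name.

```python
from typing import Dict, List, Tuple


def to_statistical_set(count: int, total: float,
                       quantiles: List[Tuple[float, float]]) -> Dict[str, float]:
    """Collapse a summary into Max/Min/Count/Sum, dropping all other quantiles."""
    values = dict(quantiles)
    return {
        "Max": values.get(1, 0),  # quantile 1, or 0 if the receiver did not set it
        "Min": values.get(0, 0),  # quantile 0, or 0 if the receiver did not set it
        "Count": count,
        "Sum": total,
    }


with_bounds = to_statistical_set(1, 0.0000597, [(0, 0.000033009), (1, 0.00058318)])
without_bounds = to_statistical_set(1, 0.0000597, [(0.5, 0.00005)])
print(with_bounds)
print(without_bounds)
```

In the second call, the median is the only quantile available and it is discarded, which is the information loss that `detailed_metrics` is meant to avoid.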

Documentation:

@khanhntd khanhntd requested review from a team and Aneurysm9 as code owners December 27, 2022 00:46
@github-actions github-actions bot added the exporter/awsemf awsemf exporter label Dec 27, 2022
@khanhntd khanhntd force-pushed the summary_type branch 5 times, most recently from 27b3d48 to a4b6ae8 Compare December 27, 2022 01:06
@github-actions

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added the Stale label Jan 10, 2023
@Aneurysm9 Aneurysm9 removed the Stale label Jan 10, 2023
@runforesight

runforesight bot commented Jan 10, 2023

Foresight Summary

    
Major Impacts

build-and-test duration(42 minutes 59 seconds) has decreased 25 minutes 44 seconds compared to main branch avg(1 hour 8 minutes 43 seconds).

✅  tracegen workflow has finished in 51 seconds (1 minute 37 seconds less than main branch avg.) and finished at 24th Jan, 2023.


Job Failed Steps Tests
build-dev -     🔗  N/A See Details
publish-latest -     🔗  N/A See Details
publish-stable -     🔗  N/A See Details

✅  telemetrygen workflow has finished in 1 minute 4 seconds (1 minute 56 seconds less than main branch avg.) and finished at 24th Feb, 2023.


Job Failed Steps Tests
build-dev -     🔗  N/A See Details
publish-latest -     🔗  N/A See Details
publish-stable -     🔗  N/A See Details

✅  check-links workflow has finished in 1 minute 33 seconds (51 seconds less than main branch avg.) and finished at 24th Feb, 2023.


Job Failed Steps Tests
changed files -     🔗  N/A See Details
check-links -     🔗  N/A See Details

✅  build-and-test workflow has finished in 42 minutes 59 seconds (25 minutes 44 seconds less than main branch avg.) and finished at 24th Feb, 2023.


Job Failed Steps Tests
correctness-metrics -     🔗  ✅ 2  ❌ 0  ⏭ 0    🔗 See Details
correctness-traces -     🔗  ✅ 17  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.20, internal) -     🔗  ✅ 561  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.19, extension) -     🔗  ✅ 537  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.19, internal) -     🔗  ✅ 561  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.19, processor) -     🔗  ✅ 1513  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.20, extension) -     🔗  ✅ 537  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.20, processor) -     🔗  ✅ 1513  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.19, receiver-0) -     🔗  ✅ 2577  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.20, receiver-0) -     🔗  ✅ 2577  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.19, exporter) -     🔗  ✅ 2456  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.20, exporter) -     🔗  ✅ 2456  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.19, receiver-1) -     🔗  ✅ 1932  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.20, receiver-1) -     🔗  ✅ 1932  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.20, other) -     🔗  ✅ 4779  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.19, other) -     🔗  ✅ 4779  ❌ 0  ⏭ 0    🔗 See Details
integration-tests -     🔗  ✅ 55  ❌ 0  ⏭ 0    🔗 See Details
setup-environment -     🔗  N/A See Details
check-collector-module-version -     🔗  N/A See Details
check-codeowners -     🔗  N/A See Details
build-examples -     🔗  N/A See Details
checks -     🔗  N/A See Details
lint-matrix (receiver-0) -     🔗  N/A See Details
lint-matrix (receiver-1) -     🔗  N/A See Details
lint-matrix (processor) -     🔗  N/A See Details
lint-matrix (exporter) -     🔗  N/A See Details
lint-matrix (extension) -     🔗  N/A See Details
lint-matrix (internal) -     🔗  N/A See Details
lint-matrix (other) -     🔗  N/A See Details
unittest (1.20) -     🔗  N/A See Details
unittest (1.19) -     🔗  N/A See Details
lint -     🔗  N/A See Details
cross-compile (darwin, amd64) -     🔗  N/A See Details
cross-compile (linux, 386) -     🔗  N/A See Details
cross-compile (linux, amd64) -     🔗  N/A See Details
cross-compile (darwin, arm64) -     🔗  N/A See Details
cross-compile (linux, arm) -     🔗  N/A See Details
cross-compile (linux, arm64) -     🔗  N/A See Details
cross-compile (linux, ppc64le) -     🔗  N/A See Details
cross-compile (windows, 386) -     🔗  N/A See Details
cross-compile (windows, amd64) -     🔗  N/A See Details
build-package (deb) -     🔗  N/A See Details
build-package (rpm) -     🔗  N/A See Details
windows-msi -     🔗  N/A See Details
publish-check -     🔗  N/A See Details
publish-stable -     🔗  N/A See Details
publish-dev -     🔗  N/A See Details

✅  prometheus-compliance-tests workflow has finished in 11 minutes 15 seconds (⚠️ 2 minutes 5 seconds more than main branch avg.) and finished at 24th Feb, 2023.


Job Failed Steps Tests
prometheus-compliance-tests -     🔗  ✅ 21  ❌ 0  ⏭ 0    🔗 See Details

✅  e2e-tests workflow has finished in 14 minutes 19 seconds and finished at 24th Feb, 2023.


Job Failed Steps Tests
kubernetes-test (v1.26.0) -     🔗  N/A See Details
kubernetes-test (v1.25.3) -     🔗  N/A See Details
kubernetes-test (v1.24.7) -     🔗  N/A See Details
kubernetes-test (v1.23.13) -     🔗  N/A See Details

✅  load-tests workflow has finished in 20 minutes 28 seconds (⚠️ 3 minutes 16 seconds more than main branch avg.) and finished at 24th Feb, 2023.


Job Failed Steps Tests
loadtest (TestTraceAttributesProcessor) -     🔗  ✅ 3  ❌ 0  ⏭ 0    🔗 See Details
loadtest (TestIdleMode) -     🔗  ✅ 1  ❌ 0  ⏭ 0    🔗 See Details
loadtest (TestMetric10kDPS|TestMetricsFromFile) -     🔗  ✅ 6  ❌ 0  ⏭ 0    🔗 See Details
loadtest (TestTraceNoBackend10kSPS|TestTrace1kSPSWithAttrs) -     🔗  ✅ 8  ❌ 0  ⏭ 0    🔗 See Details
loadtest (TestTraceBallast1kSPSWithAttrs|TestTraceBallast1kSPSAddAttrs) -     🔗  ✅ 10  ❌ 0  ⏭ 0    🔗 See Details
loadtest (TestMetricResourceProcessor|TestTrace10kSPS) -     🔗  ✅ 12  ❌ 0  ⏭ 0    🔗 See Details
setup-environment -     🔗  N/A See Details
loadtest (TestBallastMemory|TestLog10kDPS) -     🔗  ✅ 18  ❌ 0  ⏭ 0    🔗 See Details

⭕  build-and-test-windows workflow has finished in 4 seconds (41 minutes 45 seconds less than main branch avg.) and finished at 24th Feb, 2023.


Job Failed Steps Tests
windows-unittest-matrix -     🔗  N/A See Details
windows-unittest -     🔗  N/A See Details

✅  changelog workflow has finished in 1 minute 48 seconds (58 seconds less than main branch avg.) and finished at 24th Feb, 2023.


Job Failed Steps Tests
changelog -     🔗  N/A See Details


@khanhntd khanhntd force-pushed the summary_type branch 3 times, most recently from 160ed7f to f5d00ff Compare January 24, 2023 17:52
@khanhntd khanhntd force-pushed the summary_type branch 4 times, most recently from f9129ee to a039ecb Compare January 25, 2023 10:51
@khanhntd khanhntd force-pushed the summary_type branch 2 times, most recently from c809c64 to e3ec463 Compare February 2, 2023 13:11
@evan-bradley evan-bradley added the ready to merge Code review completed; ready to merge by maintainers label Feb 24, 2023
@codeboten codeboten merged commit c9b3aba into open-telemetry:main Feb 24, 2023
@khanhntd khanhntd deleted the summary_type branch February 24, 2023 20:18
newly12 pushed a commit to newly12/opentelemetry-collector-contrib that referenced this pull request Feb 28, 2023
Labels
exporter/awsemf awsemf exporter ready to merge Code review completed; ready to merge by maintainers
4 participants