
Getting python metrics to flush in the lambda #272

Closed · rberger opened this issue Aug 9, 2022 · 10 comments

rberger commented Aug 9, 2022

How does one get metrics to emit in short-lived lambdas? I never see the metrics emitted, but I do see traces.

If I don't call force_flush I never see any metrics in the logs or at my OTLP exporter destination.

The aws-otel-python-arm64-ver-1-11-1:2 layer will not let me force_flush the meter_provider; I get the following error when I try:

{
  "errorMessage": "'_ProxyMeterProvider' object has no attribute 'force_flush'",
  "errorType": "AttributeError",
  "requestId": "0ad2840b-4b2f-4fee-a8f1-0763473b5efd",
  "stackTrace": [
    "  File \"/opt/python/wrapt/wrappers.py\", line 578, in __call__\n    return self._self_wrapper(self.__wrapped__, self._self_instance,\n",
    "  File \"/opt/python/opentelemetry/instrumentation/aws_lambda/__init__.py\", line 230, in _instrumented_lambda_handler_call\n    result = call_wrapped(*args, **kwargs)\n",
    "  File \"/var/task/api_handler.py\", line 48, in handler\n    meter_provider.force_flush()\n"
  ]
}

code:

import os
import json
from random import randint
import time
from opentelemetry import trace

from opentelemetry._metrics import (
    get_meter_provider,
    set_meter_provider,
)

# Acquire a tracer
tracer = trace.get_tracer(__name__)

# Acquire a meter and create a counter
meter_provider = get_meter_provider()
meter = meter_provider.get_meter(__name__, "0.1.1")

my_counter = meter.create_counter("my_handler_counter")

# The lambda Handler
def handler(event, context):
    print("Hello from top")
    with tracer.start_as_current_span("top") as topspan:
        res = randint(1, 100)

        # Counter
        my_counter.add(50, {"app": "timescale"})

        json_region = os.environ['AWS_REGION']
        topspan.set_attribute("region", json_region)
        topspan.set_attribute("res.value", res)
        # time.sleep(10)
        result = {
            "statusCode": 200,
            "headers": {
                "Content-Type": "application/json"
            },
            "body": json.dumps({
                "Region ": json_region,
                "res ": res
            })
        }
        print("Hello from Lambda")
        print(json.dumps(result))
        print("meter ", meter)
        
        meter_provider.force_flush()
        return result

collector config file:

receivers:
  otlp:
    protocols:
      grpc:
      http:

exporters:
  logging:
    loglevel: debug
  otlp:
    endpoint: "api.honeycomb.io:443"
    headers:
      x-honeycomb-team: "redacted"
      x-honeycomb-dataset: "redacted"

# enables output for traces to xray
service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [logging, otlp]
    metrics:
      receivers: [otlp]
      exporters: [otlp, logging]

I built a version of the lambda layer with the latest v1.12.0rc2 and 0.32b0, and force_flush worked, but only if I included opentelemetry-exporter-otlp-proto-grpc in the build. That made the layer huge (about 200 MB), but force_flush did work and I got metrics end to end.

If I didn't include opentelemetry-exporter-otlp-proto-grpc I would get the error:

RuntimeError: Requested component 'otlp_proto_grpc' not found in entry points for 'opentelemetry_metrics_exporter'

And I would also get the same error:

{
  "errorMessage": "'_ProxyMeterProvider' object has no attribute 'force_flush'",
  "errorType": "AttributeError",
  "requestId": "d5c9b805-28f9-486a-ba65-f63709390e4d",
  "stackTrace": [
    "  File \"/opt/python/opentelemetry/instrumentation/aws_lambda/__init__.py\", line 234, in _instrumented_lambda_handler_call\n    result = call_wrapped(*args, **kwargs)\n",
    "  File \"/var/task/api_handler.py\", line 48, in handler\n    meter_provider.force_flush()\n"
  ]
}

I presume I'm doing something wrong, but I can't figure out how to make metrics work other than using v1.12.0rc2 with the opentelemetry-exporter-otlp-proto-grpc library, and then I can't actually use it because the lambda gets too big once I add the rest of my real app and its libraries.
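
For reference, the AttributeError above happens because get_meter_provider() returns a _ProxyMeterProvider until an SDK MeterProvider has been installed, and the proxy has no force_flush. Below is a minimal sketch of configuring the SDK provider explicitly in the handler module (assuming opentelemetry-sdk and opentelemetry-exporter-otlp-proto-grpc 1.12+ are bundled in the layer; the localhost endpoint is only an assumption about the collector extension):

# Sketch only: explicit SDK metrics setup so force_flush() exists.
from opentelemetry.metrics import set_meter_provider
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.metrics.export import PeriodicExportingMetricReader
from opentelemetry.exporter.otlp.proto.grpc.metric_exporter import OTLPMetricExporter

# Export through the collector running alongside the function (assumed endpoint).
reader = PeriodicExportingMetricReader(
    OTLPMetricExporter(endpoint="http://localhost:4317", insecure=True)
)
meter_provider = MeterProvider(metric_readers=[reader])
set_meter_provider(meter_provider)

meter = meter_provider.get_meter(__name__, "0.1.1")
my_counter = meter.create_counter("my_handler_counter")

def handler(event, context):
    my_counter.add(1, {"app": "timescale"})
    # The SDK MeterProvider implements force_flush(); the proxy returned by
    # get_meter_provider() before set_meter_provider() is called does not.
    meter_provider.force_flush()
    return {"statusCode": 200}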

@RangelReale

I'm having the same problem. In Go I used the HTTP exporter instead of gRPC, and since it doesn't keep a persistent connection, the flush worked. With gRPC, the exporter sees that the connection was broken and sleeps before reconnecting, but the lambda times out first.

But for Python the HTTP metrics exporter isn't available yet.
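
(For later readers: an OTLP/HTTP metric exporter has since shipped in opentelemetry-exporter-otlp-proto-http. A rough sketch of wiring it up, with the collector's default 4318 HTTP port as an assumption:)

# Sketch only: OTLP over HTTP instead of gRPC, so no persistent connection.
from opentelemetry.exporter.otlp.proto.http.metric_exporter import OTLPMetricExporter
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.metrics.export import PeriodicExportingMetricReader

reader = PeriodicExportingMetricReader(
    OTLPMetricExporter(endpoint="http://localhost:4318/v1/metrics")
)
meter_provider = MeterProvider(metric_readers=[reader])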

@RangelReale

I did a quick and dirty implementation of the HTTP metrics exporter, which seems to work, but it didn't solve my problem.
In Go I flush the metrics on SIGTERM, but with Python SIGTERM is sometimes not caught, and when it is, the execution ends before the metrics are flushed.
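
(A rough sketch of what flushing on SIGTERM from the function itself looks like, assuming meter_provider is an SDK MeterProvider as in the sketches above; as described, the signal may arrive too late or not at all:)

import signal

def _flush_on_sigterm(signum, frame):
    # Give the exporter a bounded window to flush before the sandbox freezes.
    meter_provider.force_flush(timeout_millis=2000)

# SIGTERM is only delivered to the runtime when an extension is registered.
signal.signal(signal.SIGTERM, _flush_on_sigterm)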

rberger commented Aug 21, 2022

@RangelReale Thanks for looking into this. I am no expert in Lambda extensions, but I see there is a Shutdown event available to extensions that could be used to trigger the flush: Lambda Extensions API - Shutdown phase. Is that different from the SIGTERM?

Though now I see that your implementation isn't an extension; it's in the application layer and can only get the SIGTERM.

I presume you saw this: Graceful shutdown with AWS Lambda

@RangelReale

@rberger Yes, only extensions receive these events, not the function itself.
SIGTERM is only sent if there's an extension registered.

I am doing some tests with StatsD/UDP and it looks to be more reliable for lambdas, as metrics are sent "synchronously" via UDP. Has anyone tried this yet?

rberger commented Aug 22, 2022

I haven't tried StatsD, but thinking about StatsD is giving me flashbacks :-)

rberger commented Oct 18, 2022

Is it expected that a force_flush or some other mechanism will be added so that metrics are usable in Python Lambdas, or has such a mechanism already been added? Am I missing something? We're blocked waiting on this before we can fully adopt OpenTelemetry. Thanks in advance for any help!

tammy-baylis-swi commented Mar 26, 2024

Old issue, but a force_flush has since been added to the Python AWS Lambda instrumentor: open-telemetry/opentelemetry-python-contrib#1613. Have you given it a try with the latest OTel Python Lambda layer?
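
With that change the instrumentor should flush the providers after each invocation, so a handler should no longer need the manual flush. An illustrative sketch (assuming the layer's auto-configuration installs the SDK MeterProvider):

from opentelemetry import metrics

# The layer is assumed to have configured the SDK; just record the metric.
meter = metrics.get_meter(__name__)
my_counter = meter.create_counter("my_handler_counter")

def handler(event, context):
    my_counter.add(1, {"app": "timescale"})
    # No meter_provider.force_flush() here: the instrumentor is expected
    # to flush when the wrapped handler returns.
    return {"statusCode": 200}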

rberger commented Mar 26, 2024

Thanks. We pretty much abandoned OTel since we had so many issues with Lambdas and OTel back then, particularly the flushing issue. Hope to eventually try it again someday.

@tammy-baylis-swi

Thanks for circling back, @rberger.

Just realized this is in a different GitHub org than my usual, so I'm not able to close this issue. 🙂

rberger commented Mar 26, 2024

Let me see if I can close it.

rberger closed this as completed Mar 26, 2024