Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[META 753] Support Partial Transaction for AWS Lambda #1734

Closed
elastic-apm-tech opened this issue Feb 1, 2023 · 13 comments · Fixed by #1784
Closed

[META 753] Support Partial Transaction for AWS Lambda #1734

elastic-apm-tech opened this issue Feb 1, 2023 · 13 comments · Fixed by #1784
Assignees
Milestone

Comments

@elastic-apm-tech
Copy link
Collaborator

See meta issue for the description and details:

@basepi
Copy link
Contributor

basepi commented Feb 1, 2023

For reference (sorry about the private link) https://elastic.slack.com/archives/C5M04AVJA/p1675266805397529

@basepi
Copy link
Contributor

basepi commented Mar 28, 2023

@lahsivjar I'm getting a 415 back from the Lambda Extension when I send the partial transaction. Any ideas? I don't see anything in the extension's debug logs -- it's just an uninterrupted sea of Received log event function for request ID:

{
    "log.level": "debug",
    "@timestamp": "2023-03-28T18:34:05.730Z",
    "log.origin": {
        "file.name": "logsapi/event.go",
        "file.line": 73
    },
    "message": "Received log event function for request ID ",
    "ecs.version": "1.6.0"
}
---
[DEBUG]	2023-03-28T18:34:05.730Z	17ed2914-2793-4ffc-9c5b-cbd7cbb3d3d2	Sent request, url=http://127.0.0.1:8200/register/transaction size=1.00kb status=415
---
[WARNING]	2023-03-28T18:34:05.730Z	17ed2914-2793-4ffc-9c5b-cbd7cbb3d3d2	Failed to send partial transaction to APM Lambda Extension
Traceback (most recent call last):
  File "/var/task/elasticapm/contrib/serverless/aws.py", line 440, in send_partial_transaction
    self.client._transport.send(
  File "/var/task/elasticapm/transport/http.py", line 116, in send
    raise TransportException(message, data, print_trace=print_trace)
elasticapm.transport.exceptions.TransportException: HTTP 415: 
---
{
    "log.level": "debug",
    "@timestamp": "2023-03-28T18:34:05.769Z",
    "log.origin": {
        "file.name": "logsapi/event.go",
        "file.line": 73
    },
    "message": "Received log event function for request ID ",
    "ecs.version": "1.6.0"
}

@axw
Copy link
Member

axw commented Mar 29, 2023

@basepi you need to set Content-Type: application/vnd.elastic.apm.transaction+json for the partial transaction.

@basepi
Copy link
Contributor

basepi commented Mar 29, 2023

Thanks @axw, that got me a little further.

Now I'm successfully getting the transaction registered.....or at least I think I am, I don't think there's any logging for that. Still just a sea of Received log event function for request ID. But I am getting a 200 return.

However, when I put a sleep in the lambda function such that it times out, I'm not getting the partial transaction in Elasticsearch. I see this error in the logs:

{
    "log.level": "error",
    "@timestamp": "2023-03-29T19:05:18.002Z",
    "log.origin": {
        "file.name": "app/run.go",
        "file.line": 167
    },
    "message": "Error finalizing invocation on shutdown: metadata is not yet available",
    "ecs.version": "1.6.0",
    "log.origin.stack_trace": "github.com/elastic/apm-aws-lambda/app.(*App).processEvent.func1\n\tgithub.com/elastic/apm-aws-lambda/app/run.go:167\ngithub.com/elastic/apm-aws-lambda/app.(*App).processEvent\n\tgithub.com/elastic/apm-aws-lambda/app/run.go:234\ngithub.com/elastic/apm-aws-lambda/app.(*App).Run\n\tgithub.com/elastic/apm-aws-lambda/app/run.go:91\nmain.main\n\tgithub.com/elastic/apm-aws-lambda/main.go:71\nruntime.main\n\truntime/proc.go:250"
}

In python, we don't use streaming connections. This means we send the metadata at the end of the invocation, when we flush everything at once. Is that what is causing this problem?

I'm going to hack in a metadata send (by itself) and see if that changes anything.

@basepi
Copy link
Contributor

basepi commented Mar 29, 2023

Alright, that seems to have worked.

@lahsivjar How difficult would it be to support sending the metadata before or after the partial transaction, to the /register/transaction endpoint? It would definitely be more efficient for the python agent to send a single synchronous HTTP request, instead of two.

If that's difficult, I can add support to our transport to force a metadata send as part of the transport thread, even though we're not ready to flush anything else with it. Thoughts? This is my "working, but hacky" current solution: #1784

@lahsivjar
Copy link

lahsivjar commented Mar 30, 2023

@basepi I think it should be quite easy and straightforward to accept partial transaction + metadata as a single payload to the register transaction endpoint. This will also help fix an existing issue. I was initially thinking it might be beneficial to have a separate flow for agents to report metadata as soon as it is available but doing it with registered transactions sounds more reasonable.

@felixbarny @AlexanderWert WDYT? If this sounds reasonable I can put a quick PR to support metadata in the register transaction endpoint.

@lahsivjar
Copy link

transport to force a metadata send as part of the transport thread, even though we're not ready to flush anything else with it

@basepi Does it mean that the metadata at this point is not complete and when the agent starts sending actual data the metadata might be different?

@basepi
Copy link
Contributor

basepi commented Mar 30, 2023

Metadata is always ready, but we don't send it until other data is available to send (and we hit either our time or size threshold).

@basepi
Copy link
Contributor

basepi commented Mar 30, 2023

Again, this might be too Python-specific. Most other agents have streaming connections. I don't know if they open a connection immediately on startup, or just when they have data available, but since they're not waiting for size/time thresholds to send (like Python does) it likely won't be an issue in other agents. But if it solves your other issue, I'm all for it, it would be easier from my end to be able to send it to register/transaction.

Anyway, I'm going to modify my solution to add the metadata directly after the transaction. Though I assume you'll make it order-agnostic, since they both start with either {"transaction": or {"metadata": so they're easy to detect.

Please keep me posted if I need to change the Content-Type since I'm now sending ndjson, not json.

@trentm
Copy link
Member

trentm commented Mar 30, 2023

The Node.js APM agent would be similar: it does not set metadata until there is other data to send. This means that if the first Lambda invocation hangs/timesout (such that there is no complete transaction to send), then the APM agent will not send metadata to the intake v2 -- it actually won't start an intake v2 request at all.

In other words, it would help the Node.js APM agent as well to support sending the metadata via the /register/transaction endpoint.

@basepi
Copy link
Contributor

basepi commented Apr 3, 2023

@lahsivjar Can you link here when you have the PR against the extension? And can you confirm my questions above about Content-Type and ordering?

@lahsivjar
Copy link

Can you link here when you have the PR against the extension?

I have created a PR here

And can you confirm my questions above about Content-Type and ordering?

Content-Type has been changed to application/vnd.elastic.apm.transaction+ndjson. Regarding the ordering, I have currently kept it as metadata should be the first line in the NDJSON body to keep it aligned with the requirements of sending metadata as the first line in the intake V2 request body. Let me know if this will be a concern for you.

@basepi
Copy link
Contributor

basepi commented Apr 10, 2023

Sounds fine 👍

I'll update my PR today and then I'll test when we get the next version of the extension released.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Development

Successfully merging a pull request may close this issue.

6 participants