Raise lambda flush timeout to 10 seconds #2855

anuraaga · 2021-04-23T04:52:49Z

@kubawach Please check this out :)

I've been playing with the wrapper recently (much more than I want...) and it generally works quite well. However, I notice that with the current 1s timeout, it takes many requests, maybe 5+, before the JVM has warmed up enough for any traces to be sent. Raising the timeout allows the first request to be traced. I don't know a great value for this timeout, but I'm wondering if you have any ideas on raising it so the first request can also be traced. In particular, I'm worried about very low QPS functions: a customer may not care about cold start time and may have a function that is always cold started. Such a function effectively can't be traced with the default timeout. It does become a tradeoff between "defaults allow as many use cases as possible" and "majority use case may prefer the safety of a lower timeout on flush since they'll be warm anyway".

Edit: I'm also worried about people thinking tracing isn't working, since needing 5+ requests before any traces show up is a lot.

anuraaga · 2021-04-23T04:53:45Z

We could also consider pausing on this sort of change and implementing dynamic timeouts in the SDK

ghost · 2021-04-23T08:24:27Z

I think that's currently the only way. I've seen it too and always recommend setting this to a higher value. A weak safeguard here is that the timeout is "up to X seconds", meaning the full wait may happen but (hopefully) won't too often. I believe our problems will be gone once we have the ability to deliver spans asynchronously (running a collector as an extension without getting frozen). I remember we discussed this, so I'm just saying it aloud for the sake of the rest of the team :)
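To illustrate the "up to X seconds" point, here is a minimal sketch using only the JDK, not the actual wrapper or SDK code. The names `BoundedFlush`, `awaitFlush`, and `simulatedFlush` are hypothetical, and the simulated flush durations are made up for the example. It shows that a bounded wait returns as soon as the flush completes, so a larger timeout only costs extra time when the flush is genuinely slow (for example on a cold JVM).

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BoundedFlush {

  /**
   * Waits for the flush to finish, but never longer than timeoutMillis.
   * Returns true if the flush completed in time, false otherwise.
   */
  public static boolean awaitFlush(CompletableFuture<Void> flush, long timeoutMillis) {
    try {
      flush.get(timeoutMillis, TimeUnit.MILLISECONDS);
      return true;
    } catch (TimeoutException e) {
      // The timeout is an upper bound: we give up and the spans are not guaranteed to be sent.
      return false;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    } catch (ExecutionException e) {
      return false;
    }
  }

  /** Hypothetical stand-in for an exporter flush that takes roughly flushMillis to finish. */
  public static CompletableFuture<Void> simulatedFlush(long flushMillis) {
    return CompletableFuture.runAsync(() -> {
      try {
        Thread.sleep(flushMillis);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });
  }

  public static void main(String[] args) {
    // Fast ("warm") flush with a generous 10s bound: returns shortly after the flush, not after 10s.
    long start = System.nanoTime();
    boolean warm = awaitFlush(simulatedFlush(50), 10_000);
    long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    System.out.println("warm flush completed=" + warm + ", waited ~" + elapsedMs + " ms");

    // Slow ("cold") flush with a tight bound: the wait gives up before the flush finishes.
    boolean cold = awaitFlush(simulatedFlush(2_000), 200);
    System.out.println("cold flush completed=" + cold);
  }
}
```

In this sketch the warm case waits only about as long as the flush itself, while the cold case hits the bound and reports failure, which mirrors why the first requests go untraced with a short timeout.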

Raise lambda flush timeout to 10 seconds

a3345bc

anuraaga requested review from iNikem, jkwatson, laurit, mateuszrzeszutek, pavolloffay, trask and tylerbenson as code owners April 23, 2021 04:52

anuraaga mentioned this pull request Apr 23, 2021

Raise lambda flush timeout for java. open-telemetry/opentelemetry-lambda#71

Merged

trask approved these changes Apr 23, 2021

mateuszrzeszutek approved these changes Apr 23, 2021

ghost approved these changes Apr 23, 2021

trask merged commit 7410744 into main Apr 23, 2021

trask deleted the anuraaga-patch-1 branch April 23, 2021 18:22
