[feature request] Add StateActivityProcessor which sends log for span start and stop #2704
Conversation
try
{
    WaitHandle.WaitAny(triggers, this.scheduledDelayMilliseconds);
    this.Heartbeat();
What is the purpose of sending heartbeat events?
If there was a "start" event but no "end" event yet, then it means that the span is in progress.
Sending heartbeat events serves the purpose of indicating that a span is still in progress. This improves observability by showing that an operation is active, even if it’s long-running. The heartbeat interval can typically be configured through the constructor.
I do not think it is necessary. An observability backend can indicate that a span is still in progress based on "started" and "ended" events. Such a feature would also decrease the performance of the instrumented application. I imagine it could be an opt-in event, but I do not think this should be the default behavior.
We had some internal discussions about this and our conclusion was more or less the same. This is the first iteration, just to get the community's feedback on this.
+1 to what @pellared said. A simple processor emitting an event (log) at the start/end would be a very good start.
OK. I can see your point.
My main concern is how to cross-reference the heartbeat events with the partial span events. The obvious thing to use would be service.instance.id, I suppose.
When I started typing this response I was worried that it would need to be truly per-process unique to catch problems like a k8s pod getting OOMKilled and restarting. I have seen this not be true with the instance id injected by the OTEL k8s operator, for example. In that case it is unique to the pod but not to a single run of the container (because it injects env vars to the pod).
Reading the conventions, though, I'm now thinking maybe that's just a problem with relying on the k8s operator. I guess if you've got enough control to add a PartialActivityProcessor then you can just call AddService() and let it generate a random instance id each time.
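For illustration, a minimal sketch of that approach using the standard OpenTelemetry .NET resource API (the service name here is hypothetical; autoGenerateServiceInstanceId already defaults to true and is shown explicitly only for clarity):

using OpenTelemetry;
using OpenTelemetry.Resources;
using OpenTelemetry.Trace;

// Let the SDK generate a fresh random service.instance.id for each process run.
var tracerProvider = Sdk.CreateTracerProviderBuilder()
    .ConfigureResource(r => r.AddService(
        serviceName: "my-batch-job",
        autoGenerateServiceInstanceId: true))
    .Build();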
Sorry if I missed some other part of this discussion, but do we need a generic heartbeat? Most spans are super-short and won't benefit from it.
But if I know that I have a long-running job, I can instrument that long-running job and add domain-specific events (progress updates).
E.g.
timestamp = X, event_name = starting_job, attributes: {"job.name": "foo", "job.id": 42, "total_items": 100500, ...}
timestamp = X + 1, event_name = progress_update, attributes: {"job.id": 42, "done_items": 10, ...}
....
Wouldn't it be much more useful than a generic span heartbeat?
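As a rough sketch of those domain-specific progress events using the standard System.Diagnostics API (the attribute values mirror the example above and are otherwise hypothetical):

using System.Diagnostics;

static void ReportProgress(Activity jobActivity, long doneItems)
{
    // Attach a progress event with domain-specific attributes to the long-running job span.
    jobActivity.AddEvent(new ActivityEvent(
        "progress_update",
        tags: new ActivityTagsCollection
        {
            { "job.id", 42 },
            { "done_items", doneItems },
        }));
}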
This isn't really for reporting progress; this is to enable downstream processing that can recover from long-running (multi minute/hour/day) spans never completing (due to problems from outside the application) by rehydrating the unfinished span from this log-event data and sending it with an error state. This allows the proper visualisation/processing of span hierarchies that contain failures of this type.
Our main use case is for long-running, complex, multi-level batch processes (think ML training, but we've got a very heterogeneous set of workloads) where individual steps suffer from random failures (viewed at scale). I've seen another use case for the same functionality for mobile apps, though, where the app can be suspended by the OS at any time.
The span heartbeats allow the processing to mark a span as failed (and send it) if it hasn't seen a heartbeat when it expects one. If we had a process/service heartbeat we could probably use that to mark all uncompleted spans we'd seen from that process as errors if the heartbeats stopped arriving. I've not been involved in the implementation of our PoC, though. I'll defer to @mladjan-gadzic
I think there are other ways to achieve the same and this is a separate use case. I created open-telemetry/semantic-conventions#2159
Note that the other advantage of per-span heartbeats is that we can pick up changes to the span attributes during its lifetime. It's a bit of an edge-case, but it's a fundamental difference in what's possible.
    [buffer, 0, sdkLimitOptions, null!, new Batch<Activity>(data)]);
var writePosition = result as int? ?? 0; // Use a default value if null

var logRecord = (LogRecord)logRecordConstructor.Invoke(null);
I think we should also set the Severity and EventId.
What about INFO for Severity?
About EventId, it gets a little complicated. The docs say it should only be set when used through ILogger, and this is not the case here. From where/what would its value be derived?
What about INFO for Severity?
Works for me.
From where/what would its value be derived?
This should be something which defines the name of the event. Maybe simply span.state.
Example: https://github.com/open-telemetry/semantic-conventions/blob/main/docs/gen-ai/gen-ai-events.md
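For illustration only, assuming the span.state name and INFO severity discussed above (this shape is a sketch, not an agreed convention), an emitted record might look roughly like:
timestamp = X, event_name = span.state, severity = INFO, body = <serialized span>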
this.shutdownTrigger = new ManualResetEvent(false);

// Access OpenTelemetry internals as soon as possible to fail fast rather than waiting for the first heartbeat
AccessOpenTelemetryInternals(
@pellared while we're at it, do you know of any good practice in OpenTelemetry where this can be rewritten without any hacks or ugly practices?
Sorry, I am not a maintainer/approver here 😉
CC @open-telemetry/dotnet-approvers
I don't think it is a good idea to rely on any internals. If your intent is to send a signal at Activity start/end, accept an ILoggerFactory/ILogger and then use that to emit the log, just like end users would.
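For illustration, a minimal sketch of that ILogger-based approach (the class name, category name, and log messages are hypothetical, not taken from this PR):

using System.Diagnostics;
using Microsoft.Extensions.Logging;
using OpenTelemetry;

internal sealed class SpanStateLogProcessor : BaseProcessor<Activity>
{
    private readonly ILogger logger;

    public SpanStateLogProcessor(ILoggerFactory loggerFactory)
    {
        // The category name is arbitrary here.
        this.logger = loggerFactory.CreateLogger("SpanState");
    }

    public override void OnStart(Activity data)
    {
        this.logger.LogInformation(
            "Span started: {TraceId} {SpanId} {Name}",
            data.TraceId, data.SpanId, data.DisplayName);
    }

    public override void OnEnd(Activity data)
    {
        this.logger.LogInformation(
            "Span ended: {TraceId} {SpanId} {Name} {Status}",
            data.TraceId, data.SpanId, data.DisplayName, data.Status);
    }
}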
The idea of accessing internals comes from serializing the span into a base64-encoded proto representation in the body of the log record, so that the partial collector (there should soon be a PR to collector-contrib) can reuse the span data. This is a broader approach.
What we really need is just a way to serialise all the Span content into a log message so that, if the Span never completes normally (e.g. due to a process crash), we can rehydrate it in the component that has been recording these partial span events (including all the original attributes, etc.), mark it as finished in an error state and send it back into our OTEL pipeline so it's not lost.
Using the serialiser from OpenTelemetry.Exporter.OpenTelemetryProtocol seems the obvious way to do that without missing something, but that code isn't part of the public API.
FWIW I agree that accessing this via reflection is far from ideal (versioning issues worry me). Does anyone know why this isn't public in the upstream exporter?
Serializing to OTLP would couple the processor to a certain exporter implementation. We should simply encode the span as a complex log attribute named span.state. Then the OTLP exporter would be responsible for serializing it accordingly.
PS. I have no idea how the OTel .NET SDK handles complex log attributes. Referring to @open-telemetry/dotnet-approvers so that they can help you out if necessary.
PTAL @lmolkova
This PR was marked stale due to lack of activity. It will be closed in 7 days.
I think that processor can be used as a prototype for:
I am not sure what the stability policy in this project is and whether it accepts "Development/Experimental" features. Deferring to @open-telemetry/dotnet-maintainers. Still, I think it would be good to clean up this PR so that it is in a "mergeable" state and that it proposes some semantic conventions.
Thanks for the feedback @pellared! I'll be off until Monday, but I'd clean it up on Monday, if not before.
@pellared cleaned it up as much as I could
Not sure why this was assigned to me. I don't hold any role in this SIG and searching doesn't show me listed in any workflow. Probably best to re-assign to one of the @open-telemetry/dotnet-contrib-approvers
@MikeGoldsmith, because you are the code owner of that package. Is that no longer valid? Should it be changed?
Ah, that makes sense. I thought the workflow was picking me as one of the contrib approvers / maintainers. I'm happy to continue as a codeowner for that package 👍🏻 Feel free to re-assign to me.
Overall the code looks good and I've left some minor suggestions.
My concern is that the current design will generate a lot of heartbeat messages, one per activity, but won't track long-running actions that don't continually create new activities.
Maybe a possible design could be to keep a synchronised dictionary of active trace IDs and generate heartbeat messages on some interval?
</ItemGroup>

<ItemGroup>
  <None Remove="nupkgs\OpenTelemetry.Extensions.1.11.0-beta.1.67.snupkg" />
Can you explain a little more about why we need this?
Probably a leftover from testing. It will be removed.
this.EndTime = signal == Signal.Heartbeat
    ? string.Empty
    : FormatTimestamp(activity.StartTimeUtc.Add(activity.Duration));
this.StatusCode = activity.Status.ToString();
Do we need a default value here if Status was Unset?
I am in the middle of transitioning serialization from https://opentelemetry.io/docs/concepts/signals/traces/#spans to https://github.com/open-telemetry/opentelemetry-proto/blob/main/opentelemetry/proto/trace/v1/trace.proto. Instead of just sending a span, I'd send everything as it is in the proto spec, but in JSON. This will be fixed.
}
#endif
this.logExporter = logExporter;
this.logProcessor = new SimpleLogRecordExportProcessor(logExporter);
I believe it's generally preferred to use a BatchLogRecordExportProcessor over a simple one for efficiency.
I used the simple log record export processor because I'd like to have logs pushed ASAP. Do you think I can achieve the same with the batch log record export processor?
Yes, the batch processor still sends fairly quickly (the default batch window is 5s but could be configured lower if needed). It means the SDK doesn't need to send each log record one at a time; instead, it waits a short period and sends a group together.
To avoid having to expose those settings in this extension, you could take the processor as an argument so someone can choose how to configure it, e.g. use a simple processor, change batch times, etc. I'd recommend the default be the batch processor if one isn't provided.
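For illustration, a sketch of that constructor shape: accept an optional log processor so callers can choose simple vs. batch behaviour, defaulting to batching. The exact signature is hypothetical, not taken from this PR, and OnStart/OnEnd are omitted.

using System.Diagnostics;
using OpenTelemetry;
using OpenTelemetry.Logs;

public class StateActivityProcessor : BaseProcessor<Activity>
{
    private readonly BaseProcessor<LogRecord> logProcessor;

    public StateActivityProcessor(
        BaseExporter<LogRecord> logExporter,
        BaseProcessor<LogRecord>? logProcessor = null)
    {
        // Default to the batch processor; pass a SimpleLogRecordExportProcessor
        // (or a batch processor with custom settings) to change the behaviour.
        this.logProcessor = logProcessor ?? new BatchLogRecordExportProcessor(logExporter);
    }
}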
this.Attributes = activity.TagObjects
    .ToDictionary(tag => tag.Key, tag => tag.Value)!;
Is TagObjects guaranteed to be set? Should we have a default here too?
I guarded this in another implementation which will be available soon.
}

/// <inheritdoc />
public override void OnEnd(Activity data)
This means we'd send a heartbeat on every activity. This will result in a lot of heartbeat logs for regular actions that don't take a long time. Also, it means a long-running action that doesn't continually create activities won't generate heartbeat messages.
I'm going to introduce a heartbeat delay which is intended for those regular actions that don't take a long time. Nice observation!
Thanks for the review! @MikeGoldsmith
There is going to be a heartbeat mechanism in the future. This is just the first step, to keep this PR review-friendly #2704 (comment).
The idea is to keep two dicts: started and ended traces. Then, on the heartbeat interval, all entries from ended that are still in started are removed, and the ended dict is emptied. For those remaining in the started dict, a heartbeat log is pushed.
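A minimal sketch of that two-dictionary bookkeeping; the type, field, and method names here are hypothetical and not taken from the PR.

using System;
using System.Collections.Concurrent;
using System.Diagnostics;

internal sealed class HeartbeatState
{
    private readonly ConcurrentDictionary<ActivitySpanId, Activity> started = new();
    private readonly ConcurrentDictionary<ActivitySpanId, Activity> ended = new();

    public void OnStart(Activity activity) => this.started[activity.SpanId] = activity;

    public void OnEnd(Activity activity) => this.ended[activity.SpanId] = activity;

    // Called on every heartbeat interval.
    public void Heartbeat(Action<Activity> emitHeartbeatLog)
    {
        // Drop everything that has already ended, then clear the ended set.
        foreach (var spanId in this.ended.Keys)
        {
            this.started.TryRemove(spanId, out _);
        }

        this.ended.Clear();

        // Whatever is still in "started" is in progress: emit a heartbeat log for it.
        foreach (var activity in this.started.Values)
        {
            emitHeartbeatLog(activity);
        }
    }
}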
Co-authored-by: Mike Goldsmith <goldsmith.mike@gmail.com>
And this can be done by a separate processor, right? One can emit span start/end logs.
I don't see any obvious blockers for that, so I'd say yes.
Would it make more sense to accept ILogger or ILoggerFactory instead of BaseExporter? That would ultimately give more flexibility to the end user.
This PR was marked stale due to lack of activity. It will be closed in 7 days.
…ogger in processor constructor
Just refactored the PR to send logs on span start and stop.
@MikeGoldsmith I've refactored the PR to be in line with your feedback. The idea about heartbeats is to have another PR which will introduce the delay and scheduled heartbeats. That way it should be more review-friendly. What do you think about it?
Was pushing d29c158 a mistake?
Thanks for checking out this PR :) After trying to introduce an initial delay for the start event, so that short-lived spans don't spam unnecessary start/stop info, I noticed that it was almost 80% of the way to the heartbeat implementation and just went with it - it makes more sense in this case.
Why do you find this unnecessary? I think there can still be cases when these are nice to have. Even though the spans are "short-lived" they can still be left unfinished, e.g. because there was an application crash.
"unnecessary" might be strong word, but the thing is that without delay, system (collector) gets quickly overwhelmed. that's why we find initial delay "must have" in our case. initial delay can be configured so that there is no initial delay and events will be sent as they happen. i agree with you on this logic, that's why i went ahead and added heartbeat because it goes so well with initial delay / or no initial delay for that matter. |
How is it being overwhelmed by logs but not by spans? Maybe you should consider adding some log processing/filtering on the collector.
We've implemented a custom collector that did some gymnastics with the log body data, and that was a choke point. The team that was stress testing this decided to move on with the idea of adding an initial heartbeat delay because of this, and it seems to be just right. Adding the structure to enable the initial start-log delay was in line with what was needed for the heartbeat anyway. If you mind having the heartbeat as part of this processor, or have concerns for that matter, a flag can be added that enables/disables the heartbeat; by default it can be disabled. What do you think about that?
It's not about the logs themselves, it's the system that handles them. The system that handles these span state events (start/stop/heartbeat) is necessarily stateful. It has to keep track of all the open partial spans, record heartbeats against them, drop them from state when they complete normally, and periodically "rehydrate" failed incomplete spans that have expired and send them (in an error state) to the normal collectors. This essentially means it needs a backing store, which then adds a lot more overhead to the handling of each span.
Note that we're working with jobs running on a large compute farm. There are many, many thousands of jobs running at any one time, generating spans at various rates, depending on the job. A very significant proportion of spans are for HTTP requests, DB calls, etc. Those are typically very quick, so the statistical chance of the system failing whilst they are incomplete is low (and often the impact of losing them is also low). The spans that inspired us to do this work are the ones that last several minutes up to several days. If one of those is lost then all the child spans lose their relationship to each other (there can be tens of thousands of them in this case).
Given all the above, the idea of the initial delay is to allow the trade-off to be made between missing short-lived spans and the resource requirements of the back-end. TBH we've not tuned that trade-off yet. All we know so far is that as soon as we rolled out our version of the code in this PR (in a fairly limited way) it more-or-less instantly overwhelmed our backend. (This is a common occurrence when we point our farm at a service for the first time! 😀)
This PR was marked stale due to lack of activity. It will be closed in 7 days.
Fixes
Design discussion issue
Changes
StateActivityProcessor is added. It helps by sending logs on span start and stop. The log body is used to JSON-serialize the span per the OpenTelemetry specification.
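For a rough illustration only (this is not the PR's actual serializer, and the field selection and helper name are hypothetical), putting a JSON-serialized view of a span into a log body could look like this:

using System;
using System.Diagnostics;
using System.Linq;
using System.Text.Json;

internal static class SpanStateSerializer
{
    public static string SerializeSpanState(Activity activity)
    {
        // Serialize a minimal subset of the span's fields; a real implementation
        // would follow the OTLP trace proto shape.
        return JsonSerializer.Serialize(new
        {
            traceId = activity.TraceId.ToString(),
            spanId = activity.SpanId.ToString(),
            parentSpanId = activity.ParentSpanId.ToString(),
            name = activity.DisplayName,
            startTimeUnixNano = (activity.StartTimeUtc - DateTime.UnixEpoch).Ticks * 100,
            attributes = activity.TagObjects.ToDictionary(tag => tag.Key, tag => tag.Value),
        });
    }
}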
Merge requirement checklist
CHANGELOG.md files updated for non-trivial changes