Storing session data on Resource #2500

Closed
martinkuba opened this issue Apr 19, 2022 · 10 comments
Labels
spec:resource Related to the specification/resource directory

Comments

@martinkuba
Contributor

The client-side SIG is working on adding the concept of sessions. Sessions correlate signals from the same client / user over a certain period of time.

We have considered two options for attaching session data:

  1. attributes on resource
  2. attributes on individual signals

We are leaning towards resource attributes because the session values are mostly the same for every signal. It is important in client-side applications to minimize the size of the payload.

The challenge with using resource attributes is that sessions can change during the lifetime of the application, e.g. a session can be started or ended while the application is running.

We are wondering what it would take to update the spec to allow updating or replacing the resource. This has been discussed at length in this issue (#1298).

@martinkuba martinkuba added the spec:resource Related to the specification/resource directory label Apr 19, 2022
@yurishkuro
Member

I disagree that session info belongs to the resource. Efficiency considerations should not override proper data modeling decisions. Resource is meant to identify the service, which doesn't change while that service handles many user sessions.

Conceptually user session belongs at the trace level, which we do not have in the OTEL span model (some systems do support trace level attributes), and it's customary to place those on the root span.

@tedsuo
Contributor

tedsuo commented Apr 20, 2022

I agree that trace level attributes would be a good addition to OTel, but to me session IDs still look like they are resources:

  • A user session applies to all telemetry, not just traces. Session IDs also apply to logs/events, metrics, etc. Not all of that telemetry will be encapsulated in a trace.
  • Since resources act as an envelope containing the attributes which are common to all telemetry emitted in a batch, it makes sense for Session ID to be a resource.

There are also some practical implications.

  • There is no "root trace" or "root context" to attach a trace-level session ID to.
  • In practice, not setting the session as a resource forces every piece of instrumentation which starts a trace or creates a log to become "session aware" which complicates all of our instrumentation.
  • Creating span and log processors which attach the session ID as an attribute to every object looks like a very poor re-implementation of resources – this information belongs in an envelope which applies to all telemetry currently being emitted.

Beyond just sessions, we've also discovered that clients may have other resources which can change over time - timezone and other location data, language preference, etc. Mobile and desktop clients are rarely rebooted; instead they are put to sleep and later re-awaken in an environment where some settings may have changed. We did not consider these client-specific issues when we originally defined how resources should work.

We've discovered all of the above issues while attempting to create a model for client instrumentation/RUM, which is why we're proposing updatable resources as a solution. The Client/RUM SIG can create an OTEP to further explain how an updatable ResourceProvider would solve these issues. But it would be good to understand what side effects this change could create, and how we can mitigate them.

BTW on a related note, I am also seeing a need to have a stable resource attribute for identifying a service/sdk instance. Currently there is no required attribute which would serve this purpose – service.instance.id is optional and telemetry.sdk.id does not exist. A stable instance ID which is always present would work better for identifying individual services than our current practice of saying all resources must be immutable for the life of a service.

@yurishkuro
Member

Fair enough. However, rather than proposing a change to the spec, I recommend starting with a sample implementation and trying out different approaches. There are different ways this can be achieved:

  • mutable resources (could mean significant implications to collection)
  • replaceable resources in the Tracer/etc
  • reinitializable Tracer/etc
  • an alternative path of exporting resources
  • ...

@tedsuo
Contributor

tedsuo commented Apr 22, 2022

Yes I totally agree we need to prototype as part of making this proposal! We have a prototype implementation in the works, taking the following approach:

  • Extract resources into a ResourceProvider
  • Tracers and other telemetry generators now call ResourceProvider.resources() when starting spans, creating logs, etc.
  • ResourceProvider.update() creates a new resource set. Future calls to ResourceProvider.resources() now return this new set. This does not mutate existing resource sets.
  • Resource sets are still immutable. Resource sets never change once they are attached to spans, logs, etc, so they can still be stored and accessed as a thread-safe pointer.
  • Exporters already have to create multiple batches of data, sorted by resource set, due to the fact that multiple providers may be sharing the same Exporter. So nothing else in the SDK architecture needs to change, beyond the addition of the ResourceProvider concept.
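
The approach in the list above can be illustrated with a minimal sketch. This is not the prototype itself: the class body, the `start_span` and `batch_by_resource` helpers, and the attribute values are all hypothetical stand-ins, and no real OpenTelemetry SDK API is used. It only shows that `update()` swaps in a new immutable set while telemetry that was already created keeps the set it started with.

```python
import threading
from types import MappingProxyType


class ResourceProvider:
    """Hypothetical holder for the current immutable resource set."""

    def __init__(self, attributes):
        self._lock = threading.Lock()
        # MappingProxyType gives a read-only view over a private dict.
        self._current = MappingProxyType(dict(attributes))

    def resources(self):
        # Return the current snapshot; callers attach it to telemetry as-is.
        with self._lock:
            return self._current

    def update(self, changes):
        # Build a NEW snapshot; the previous one is never mutated.
        with self._lock:
            merged = dict(self._current)
            merged.update(changes)
            self._current = MappingProxyType(merged)
            return self._current


def start_span(name, provider):
    # Stand-in for a tracer: captures the resource set at span start.
    return {"name": name, "resource": provider.resources()}


def batch_by_resource(items):
    # Stand-in for an exporter: one batch per distinct resource set object.
    batches = {}
    for item in items:
        batches.setdefault(id(item["resource"]), []).append(item)
    return list(batches.values())


provider = ResourceProvider({"service.name": "shop-web", "session.id": "s-1"})
first = start_span("page-load", provider)
provider.update({"session.id": "s-2"})  # session changed while the app runs
second = start_span("checkout", provider)
batches = batch_by_resource([first, second])
```

In this sketch `first` keeps the resource set with session `s-1` even after the update, and the two spans land in separate batches because they reference different resource set objects.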

Once we have a ResourceProvider, it's easy enough to create a SessionManager that updates the resources whenever the session changes. The same approach can be taken to managing other resources that need to be updated when a mobile client reawakens.

The ResourceProvider seems like an effective solution with a limited "blast radius" in terms of what parts of Otel are affected, so we are going to propose this as part of adding RUM support to OTel.

We actually plan on creating three OTEPs, all with prototypes and examples. Most of this info is not relevant to the current issue, but just as an FYI, the three OTEPs will be:

  • OTEP for describing how the RUM concept should be defined within OpenTelemetry. It will point out that two things currently missing from OTel are a well-defined place to efficiently attach session information, and a well-defined way to record events as logs (since traces are not always present when these events happen).
  • OTEP for adding the ResourceProvider concept described above.
  • OTEP describing how events should be recorded as logs. We've already talked to the Log SIG about this so it won't be an out-of-the-blue proposal.

@dyladan
Member

dyladan commented Apr 26, 2022

Maybe the newly proposed instrumentation scope attributes are a more natural place for this: open-telemetry/oteps#201

@Oberon00
Member

Oberon00 commented Apr 26, 2022

Please read the discussion at #1298: This shows that already just appending to the resource poses some interesting (but IMHO solvable) questions.
Changing/replacing values on the resource is IMHO semantically wrong.

Maybe you want to send something in OTLP at the level where currently there is only resource. But the resource concept itself is incompatible with mutability.

@martinkuba
Contributor Author

Would there be any challenges with specifying some resource attributes as identifying (immutable) and some as descriptive (mutable) (as discussed in #1298)? Are there any backends that actually use all resource attributes (e.g. by hashing) to determine identity?
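
To make the hashing question concrete, here is a rough sketch of how identity could differ depending on which attributes feed the hash. The `IDENTIFYING_KEYS` split and the `identity_hash` helper are assumptions made up for illustration; they are not defined by the spec or taken from any backend.

```python
import hashlib
import json

# Hypothetical split: only these keys would count towards identity.
IDENTIFYING_KEYS = frozenset({"service.name", "service.instance.id"})


def identity_hash(attributes, keys=None):
    # Hash all attributes, or only the selected keys when `keys` is given.
    selected = {k: v for k, v in attributes.items() if keys is None or k in keys}
    encoded = json.dumps(selected, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


before = {"service.name": "shop-web", "service.instance.id": "i-1", "session.id": "s-1"}
after = {**before, "session.id": "s-2"}  # same app instance, new session

# Hashing every attribute makes a session change look like a new resource.
changed_when_hashing_all = identity_hash(before) != identity_hash(after)

# Hashing only identifying attributes keeps identity stable across sessions.
stable_when_hashing_identifying = (
    identity_hash(before, IDENTIFYING_KEYS) == identity_hash(after, IDENTIFYING_KEYS)
)
```

A backend that behaves like the first case would treat every session change as a brand-new resource, which is the situation the question is asking about.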

@jsuereth
Contributor

So even with a split of resource attributes between identifying and descriptive, the issue here is what a Resource means and what a session means. I still think you're looking for a (currently non-existent) scope/context-based attribute, where "scope" here means lexical scope or scoped in context (not the same as InstrumentationScope).

Specifically -

When a browser session is created we can attach the session id to context and output it appropriately.

Specifically, my main concern with the notion of descriptive attributes on resource vs. "session" is that session can change for the same device, so there's an element of needing to know when an attribute was live for this to work.

@tigrannajaryan and I are working on some refinement to Resource around identifying/non-identifying attributes. I still think this won't solve the client-side instrumentation issue.

For us to make progress, would you be willing to help us identify a few things?

  1. Is the identity of a device / browser important or just the identity of the session? E.g. will metrics be generated for a specific device or just a specific session?
  2. Does ALL telemetry emitted need the session labels, or just events (spans/logs)?
    2a. If a session is a span of time w/ attached attributes, is that better modelled as a Span to which you 'link' telemetry?

@t2t2

t2t2 commented Sep 16, 2022

Note: I also ended up writing a longer comment on why this really really should be included in resources instead of scopes, but I figured it's better in Ephemeral Resource Attributes otep (oteps#208) (as this issue really morphed into that otep) than here. I recommend reading that comment first since it'd give a lot of useful domain knowledge plus I'll refer parts of it again in this comment:

open-telemetry/oteps#208 (comment)


1: Yes. No. Depends. Maybe. Sometimes. Eh dunno

Since you're here from an issue that talks about metrics, let's say, very browser-specifically, that web vitals were sent over metrics - here you'd have one value for the lifetime of a page, one that describes that one page load. So if you want, quoting Tigran, "minimal globally unique identifier of the emitting entity", using service.instance.id would likely be the best default value. But for querying, getting avg/median/percentile values, you're gonna have so many different attributes to base it on

When using a RUM...

Sometimes you'd want data from one opened page, sometimes experience during a session

Sometimes you want a specific user's experience, sometimes you want a group of users (eg. employees from one department)

Sometimes you want all users of one ISP, sometimes you want all users from a specific country

Sometimes everything

So anything usable for rum use cases would have more options for per-attribute filtering

2: Yeah sorry no escaping the longer comment for this one. tl;dr a case for yes. Especially when you have a UI that combines all 3 signals to show the experience of using a site - spans for http requests, logs for, well, logs and errors, metrics for info like web vitals score.

2a: Ignoring that session.id use isn't limited to spans, I mean I guess if your systems don't mind linking to spans that likely never exist.

Session can easily cross multiple otel sdk instances (so for web a session id span would be made in one instance, used in N more, sometimes in multiple in parallel, and then ended in another instance, see italics note). Additionally otel does not work well for really long spans like this. For one a session end is quite frequently a while into no otel SDK being active (user opens site, closes site, 15 minutes later session expires, if a tree falls in a forest session expires while nobody is observing, does it really expire?). Secondly otel spans are only sent when span ends -- if there's nothing to end the span, the span is never sent. Lastly systems would probably need to ignore hours long session spans so they don't think the slowest thing in your system is a 4 hour session

@martinkuba
Contributor Author

Decision has been made to represent sessions as an attribute on all signals. Closing for now.
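
As a rough illustration of what "an attribute on all signals" means in practice, the sketch below stamps the current `session.id` onto each span, log, or metric point as it is produced. The `SessionManager` class, the `with_session` helper, and the dict-based records are hypothetical stand-ins, not the SDK's API.

```python
class SessionManager:
    """Hypothetical tracker for the currently active session."""

    def __init__(self, session_id):
        self.session_id = session_id

    def rotate(self, new_session_id):
        # Called when a session ends and a new one starts.
        self.session_id = new_session_id


def with_session(record, sessions):
    # Return a copy of the record with session.id added to its attributes;
    # the input record is left untouched.
    stamped = dict(record)
    stamped["attributes"] = {
        **record.get("attributes", {}),
        "session.id": sessions.session_id,
    }
    return stamped


sessions = SessionManager("s-1")
span = with_session({"kind": "span", "attributes": {"http.method": "GET"}}, sessions)
log = with_session({"kind": "log", "body": "clicked checkout"}, sessions)

sessions.rotate("s-2")  # later telemetry picks up the new session
metric = with_session({"kind": "metric", "attributes": {"name": "lcp"}}, sessions)
```

Records stamped before the rotation keep `s-1`, and records stamped afterwards carry `s-2`, so each signal records which session was live when it was emitted.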

7 participants