Project Proposal: Feature Flag Semantic Conventions #2042

Draft
wants to merge 12 commits into
base: main

dyladan (Member) commented Apr 5, 2024

This proposes a new project to stabilize feature flag semantic conventions. Created as a draft for now while we fill out the rest of the project template.

TODOs:

  • list engineers committed to working on the project
  • list prototype engineers and maintainers committed to review

@dyladan changed the title from "Add draft feature flag project proposal" to "Feature flag project proposal" on Apr 5, 2024
@svrnm added the "Project Proposal" label (Submitting a filled out project template) on Apr 8, 2024
@danielgblanco changed the title from "Feature flag project proposal" to "Project Proposal: Feature Flag Semantic Conventions" on Apr 9, 2024
projects/feature-flag.md Outdated Show resolved Hide resolved
- **semantic conventions for feature flag impressions** - make any necessary additions to the semantic conventions to support feature flag impressions
- **semantic conventions for feature flag changes** - make any necessary additions to the semantic conventions to support feature flag change events
- **prototype feature flag impressions**
- **prototype feature flag change events** - feature flag change events may not be generated by SDKs or instrumentations, but by the feature flag management tools.

Just a minor comment on this particular point here.

If the management tooling raises the event, this could create a temporal inconsistency where some or all of the applications have not yet fetched the updated values.

In the .NET ecosystem (I'm speaking only about the platform I have experience with, but this might apply to others), it is incredibly common to set a new value for a key (for example, in Azure App Configuration) while many applications are still waiting on different refresh timings or cache expirations for that key, so they may still serve N requests using the previous value. Even with something like a per-request middleware, the refresh operation tends to happen in the background and will usually not affect the current request or other requests running in parallel around the same time.

If the only source of flag change events is the central tool, one could end up inspecting behavior that is still running on the old value while the metrics (or other telemetry) indicate that the new value should be present.
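
To make the timing gap concrete, here is a minimal simulation of the scenario described above. It is only a sketch: `CachedFlagClient`, the 30-second refresh interval, and the timestamps are hypothetical and do not model any particular configuration provider.

```python
from dataclasses import dataclass


@dataclass
class CachedFlagClient:
    """Hypothetical client that re-reads a flag only when its cache expires."""

    refresh_interval: float  # seconds between refreshes (assumed value)
    cached_value: str
    last_refresh: float = 0.0

    def evaluate(self, now: float, source_value: str) -> str:
        # Refresh only once the cache has expired; otherwise serve the cached value.
        if now - self.last_refresh >= self.refresh_interval:
            self.cached_value = source_value
            self.last_refresh = now
        return self.cached_value


def source_value_at(t: float, change_time: float) -> str:
    """Value held by the central configuration source at time t."""
    return "new" if t >= change_time else "old"


client = CachedFlagClient(refresh_interval=30.0, cached_value="old")
change_time = 10.0  # the management tool would emit its change event here

# Requests at t=15 and t=29 arrive after the central change but before the refresh.
served = {
    t: client.evaluate(t, source_value_at(t, change_time))
    for t in (0.0, 15.0, 29.0, 30.0)
}
```

In this sketch the requests at t=15 and t=29 are still served the old value even though the central change happened at t=10, which is the window a tool-only change event would misreport.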

I think this needs to be reconsidered from the perspective of each application/process as well. Maybe we need a change event from the configuration source, and a second change event on each application when the new value is fetched? That to me would be the most comprehensive approach, although I'm not sure exactly how those event streams would be represented.

I almost want to suggest that OpenTelemetry should define a 4th type of telemetry called "events" (or "state transitions" or something), generalizing the concept and then making span events a particular use case of that. Then, when looking at a particular trace, one could include "global events" like these to understand the complete behavior on a given flow.

Anyways... just throwing some ideas around here.

We planned to define semantics for change events from the management tool (source of truth) and the services (SDKs). That way, you could track how long it took for a change to propagate and quickly identify which services haven't been updated.
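
As a rough illustration of how the two event sources could be combined, the sketch below computes per-service propagation lag and lists services that have not yet reported the change. The `ChangeEvent` shape, the `version` field, and the `"management-tool"` source label are assumptions made for the example, not part of any agreed convention.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple

MANAGEMENT_TOOL = "management-tool"  # hypothetical label for the source of truth


@dataclass(frozen=True)
class ChangeEvent:
    flag_key: str
    version: int      # assumed identifier for a specific flag change
    source: str       # MANAGEMENT_TOOL or a service name
    timestamp: float  # seconds since an arbitrary epoch


def propagation_report(
    events: Iterable[ChangeEvent],
    flag_key: str,
    version: int,
    services: List[str],
) -> Tuple[Dict[str, float], List[str]]:
    """Return (lag per updated service, services that have not reported the change)."""
    origin = None
    applied: Dict[str, float] = {}
    for event in events:
        if event.flag_key != flag_key or event.version != version:
            continue
        if event.source == MANAGEMENT_TOOL:
            origin = event.timestamp
        else:
            applied[event.source] = event.timestamp
    if origin is None:
        raise ValueError("no change event from the management tool")
    lags = {s: applied[s] - origin for s in services if s in applied}
    pending = [s for s in services if s not in applied]
    return lags, pending


events = [
    ChangeEvent("new-checkout", 2, MANAGEMENT_TOOL, 100.0),
    ChangeEvent("new-checkout", 2, "checkout", 112.5),
    ChangeEvent("new-checkout", 2, "search", 130.0),
]
lags, pending = propagation_report(
    events, "new-checkout", 2, ["checkout", "search", "billing"]
)
```

Here `lags` maps each updated service to the seconds elapsed since the management tool's event, and `pending` names the services still on the previous value.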

OpenTelemetry already has an events concept that's basically semantics on top of logs.

> We planned to define semantics for change events from the management tool (source of truth) and the services (SDKs). That way, you could track how long it took for a change to propagate and quickly identify which services haven't been updated.

Perfect. Love to hear it.

> OpenTelemetry already has an events concept that's basically semantics on top of logs.

I remember seeing this but always assumed it was just a "synonym" to logs. Looks like I was wrong as it even has its own classes and abstractions.

Am I correct in that observability tools don't quite have this separation between logs and events yet though? Should they? We use Datadog extensively and I never saw this distinction anywhere over there (although I'm not necessarily an expert in DD either... so it could be me just missing it).

I also don't know what exactly maps to these events from the .NET APIs, since we only have ILogger (for logs), Meter and Instruments (for metrics) and Activity (for spans) there. The only place where I ever saw the concept of "events" was for "activity events", and those only make sense if you are under an activity in the first place.

Member Author

> I remember seeing this but always assumed it was just a "synonym" to logs. Looks like I was wrong as it even has its own classes and abstractions.

Events and logs are the same data type. Events are LogRecords which have a name which uniquely defines a particular class or type of event. All events with the same name have Payloads that conform to the same schema, which assists in analysis in observability platforms. Events are described in more detail in the semantic conventions.
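
A small sketch of that relationship, using plain Python types rather than the OpenTelemetry API: an event is modeled as a log record that carries a name, and the name selects the schema its payload must follow. The event name `feature_flag.change` and its payload fields are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional

# Hypothetical event name and payload schema, for illustration only.
EVENT_SCHEMAS: Dict[str, Dict[str, type]] = {
    "feature_flag.change": {"flag_key": str, "old_value": str, "new_value": str},
}


@dataclass
class LogRecord:
    body: Dict[str, Any]
    event_name: Optional[str] = None  # a plain log has no event name


def is_event(record: LogRecord) -> bool:
    """A record counts as an event when it carries a name."""
    return record.event_name is not None


def conforms(record: LogRecord) -> bool:
    """Check that an event's payload matches the schema registered for its name."""
    schema = EVENT_SCHEMAS.get(record.event_name or "")
    if schema is None:
        return False
    return record.body.keys() == schema.keys() and all(
        isinstance(record.body[key], expected) for key, expected in schema.items()
    )


plain_log = LogRecord(body={"message": "cache refreshed"})
change_event = LogRecord(
    body={"flag_key": "new-checkout", "old_value": "off", "new_value": "on"},
    event_name="feature_flag.change",
)
malformed_event = LogRecord(
    body={"flag_key": "new-checkout"},
    event_name="feature_flag.change",
)
```

Both kinds of record share one data type; only the name and the schema it implies separate `change_event` from `plain_log`.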

> Am I correct in that observability tools don't quite have this separation between logs and events yet though? Should they?

For the most part they don't. Arguments could be made either way as to whether they should.

Co-authored-by: Alexander Wert <AlexanderWert@users.noreply.github.com>

beeme1mr commented May 7, 2024

@askpt has volunteered to support us from a .NET perspective.

https://cloud-native.slack.com/archives/C06RT64NP37/p1715113701675879?thread_ts=1714057336.076709&cid=C06RT64NP37


8 participants