Link produced events to consumed events #10

Closed
afrittoli opened this issue Jan 25, 2022 · 25 comments
Labels: enhancement (New feature or request), roadmap (Items on the roadmap)

@afrittoli
Contributor

Feature Description

The CDEvents specification should allow propagating identifiers from consumed events into produced events.
When an activity is directly started as a consequence of a consumed CDEvent, an identifier of the consumed event must be included in the produced events related to the activity.
When an activity is directly started as a consequence of a consumed event (non-CDEvent), an identifier of the consumed event may be included in the produced events related to the activity.

Use Cases

Child Pipelines

When a pipeline runs, it generates events as its tasks start and finish. Child pipelines may be triggered by these events (whether in the same CD system or a different one), and they will in turn send events.
As a DevOps engineer, I want to be able to discover child pipelines from the events they generated.

Proposal

These are the spec fields that would be required for CDEvents.

| Field Name | Type | Description | CloudEvents Binding | Notes |
|---|---|---|---|---|
| id | string | Event identifier | id | Unique for a source |
| source | URI-reference | Producer of the event | source | May be globally unique? |
| subject | string | Subject of the event (with source, identifies the activity) | subject | May be globally unique? |
| cdevents_platform | string | Platform namespace for sources | <extension> | Only needed if source is not globally unique |
| cdevents_source | URI | Globally unique source | <extension> | Only needed if source is not globally unique |
| cdevents_source-event | string | Globally unique source event identifier | <extension> | Serialised source + id / cdevents_platform + source + id / cdevents_source |

We already use id and source in our PoC, but our spec does not define them yet; we should add them.
The source-event field is a globally unique identifier of another event, which was the trigger to the activity identified by "source + subject".

CloudEvents have a Source and ID attached to them. Each instance of each platform is responsible for generating unique Source + IDs for its events, but it's not required for one platform to guarantee global uniqueness.

  1. If we enforce global uniqueness, it would be the responsibility of the administrator of the overall system to ensure each source is configured with a globally unique value (e.g. a DNS name) that can be used to generate the various sources.

  2. If we do not enforce global uniqueness, we could introduce the cdevents_platform extension or the cdevents_source extension to provide a globally unique value. Again, it would be the responsibility of the overall administrator to configure a globally unique seed. Alternatively, platforms could use some strategy to pick a name that is likely to be globally unique. Examples:

{
    "specversion" : "1.0",
    "type" : "cd.events.taskrun.started",
    "source" : "/apis/tekton.dev/v1beta1/namespaces/default/taskruns/curl-run-6gplk",
    "subject" : "my-taskrun-123",
    "id" : "A234-1234-1234",
    "time" : "2018-04-05T17:31:00Z",
    "cdeventssource" : "tekton_ci_prod/apis/tekton.dev/v1beta1/namespaces/default/taskruns/curl-run-6gplk",
    "cdeventsplatform" : "tekton_node123_clusterABC",
    "cdeventssourceevent":  "A123-5678-9012@jenkins_ci_prod/pipeline/my-source-pipeline",
    "datacontenttype" : "text/json",
    "data" : "{}"
}

Open Questions

It may not be possible for a platform to implement the source-event extension in all outgoing events, for a few reasons:

  • if cdevents are implemented via an adaptation layer, the information about the source event may not be there at all
  • not all activities that require a cdevent may have a source-event associated. This happens typically with core events, where the end of a task run or pipeline run is not triggered by an event. In that case, when an activity is finished, there will be no source-event associated

This means that a consumer of events that would like to visualise the end to end workflow triggered by an initial event will have to use a combination of source-event, subject and event type to discover all relevant events.
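A minimal sketch of how a consumer could combine these fields, assuming events are plain dictionaries, that the source-event extension is named `cdeventssourceevent` and uses the `id@source` serialisation from the JSON example, and that "source + subject" identifies an activity. The function name `workflow_events` and the sample events are made up for illustration, and event type is left out for brevity.

```python
def workflow_events(events, root_ref):
    """Collect the events that belong to the workflow started by `root_ref`.

    An event joins the workflow if it is the root event, if its source-event
    points at an event already collected, or if it shares (source, subject)
    with an event already collected.
    """
    refs = {root_ref}      # "id@source" references known to be in the workflow
    activities = set()     # (source, subject) pairs known to be in the workflow
    selected = []
    remaining = list(events)
    progress = True
    while progress:        # repeat until a full pass adds nothing new
        progress = False
        for e in list(remaining):
            ref = f"{e['id']}@{e['source']}"
            activity = (e["source"], e["subject"])
            if (ref in refs
                    or e.get("cdeventssourceevent") in refs
                    or activity in activities):
                refs.add(ref)
                activities.add(activity)
                selected.append(e)
                remaining.remove(e)
                progress = True
    return selected


events = [
    {"id": "1", "source": "scm", "subject": "change-7"},
    {"id": "2", "source": "ci", "subject": "run-1", "cdeventssourceevent": "1@scm"},
    {"id": "3", "source": "ci", "subject": "run-1"},   # finished event, no source-event
    {"id": "4", "source": "ci", "subject": "run-2"},   # unrelated activity
]
found_ids = [e["id"] for e in workflow_events(events, "1@scm")]
print(found_ids)  # ['1', '2', '3']
```

Event 3 carries no source-event, so it is only found through the subject it shares with event 2.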

@afrittoli afrittoli changed the title [feature] Link produced events to consumed events Link produced events to consumed events Jan 25, 2022
@afrittoli afrittoli added the enhancement New feature or request label Jan 25, 2022
@m-linner-ericsson
Contributor

I like this initiative, as I think we need this in the protocol.

Eiffel also has links, but there the information is stored as an object; see the links object. Eiffel would call this link a CAUSE link and would define it as in this example taken from an ActivityTriggeredEvent.

CAUSE
Required: No
Legal targets: Any
Multiple allowed: Yes
Description: Identifies a cause of the event occurring. SHOULD not be used in conjunction with CONTEXT: individual events providing CAUSE within a larger context gives rise to ambiguity. It is instead recommended to let the root event of the context declare CAUSE.

and the JSON for an event would look like this:

"links": [
    {
      "type": "CAUSE",
      "target": "aaaaaaaa-bbbb-5ccc-8ddd-eeeeeeeeeee1",
      "domainId": "com.example"
    }
  ]

From Eiffel's perspective, I would also like to have the link typed if/when we want to add more links to the protocol.

Today I opened #19 to add a note we made at a vocabulary meeting about how we want to handle extensions. Do we want this as an extension or as part of the data field?

@afrittoli
Contributor Author

I like this initiative, as I think we need this in the protocol.

Eiffel also has links, but there the information is stored as an object; see the links object. Eiffel would call this link a CAUSE link and would define it as in this example taken from an ActivityTriggeredEvent.

CAUSE
Required: No
Legal targets: Any
Multiple allowed: Yes
Description: Identifies a cause of the event occurring. SHOULD not be used in conjunction with CONTEXT: individual events providing CAUSE within a larger context gives rise to ambiguity. It is instead recommended to let the root event of the context declare CAUSE.

Oh, interesting, I will go read more about cause vs. context in Eiffel.

and the JSON for an event would look like this:

"links": [
    {
      "type": "CAUSE",
      "target": "aaaaaaaa-bbbb-5ccc-8ddd-eeeeeeeeeee1",
      "domainId": "com.example"
    }
  ]

From Eiffel's perspective, I would also like to have the link typed if/when we want to add more links to the protocol.

Today I opened #19 to add a note we made at a vocabulary meeting about how we want to handle extensions. Do we want this as an extension or as part of the data field?

I would not add this to #19 for now. Hopefully we can merge #19 soon and adding more to it might slow the process down.
We can use the CloudEvents binding document to track specific extensions that we identify as we go.

What do you think about the open question on source?
Should we use source from CloudEvents to store a globally unique URI, or define our own extension?

@erkist

erkist commented Jan 26, 2022

Really nice! Being able to have links to previous events will really help with things like visualizing chains without knowing beforehand which events to expect.

The "cdevents_source-event" is a bit unclear to me.
It is defined as "The source-event field is a globally unique identifier of another event, which was the trigger to the activity identified by "source + subject"." in the text, but as "Serialised source + id / cdevents_platform + source + id / cdevents_source" in the table.
And in the JSON example it seems to be in the format "id@subject" or maybe "id@source".

@erkist

erkist commented Jan 26, 2022

not all activities that require a cdevent may have a source-event associated. This happens typically with core events, where the end of a task run or pipeline run is not triggered by an event. In that case, when an activity is finished, there will be no source-event associated

In such cases it would make sense to me to refer back to the corresponding "XStarted" event. The direct "cause" for an activity ending is that it was started in the first place.

@afrittoli
Contributor Author

afrittoli commented Jan 26, 2022

Really nice! Being able to have links to previous events will really help with things like visualizing chains without knowing beforehand which events to expect.

The "cdevents_source-event" is a bit unclear to me. It is defined as "The source-event field is a globally unique identifier of another event, which was the trigger to the activity identified by "source + subject"."

"Source + Subject" Identifies the activity
"Source + ID" Identifies an event

The "source" part is needed to make the subject / id reference globally unique.

in the text, but as "Serialised source + id / cdevents_platform + source + id / cdevents_source" in the table. And in the JSON example it seems to be in the format "id@subject" or maybe "id@source".

The id@source format is an example of how "source + id" could be serialised.
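A minimal sketch of that serialisation, assuming event ids never contain "@" (sources may). The helper names `serialise_ref` and `parse_ref` are illustrative and not part of any CDEvents or CloudEvents SDK.

```python
def serialise_ref(event_id: str, source: str) -> str:
    """Serialise "source + id" as "id@source"."""
    return f"{event_id}@{source}"


def parse_ref(ref: str) -> tuple[str, str]:
    """Split "id@source" back into (id, source).

    Splits on the first "@" only, so a source containing "@" survives.
    """
    event_id, sep, source = ref.partition("@")
    if not sep or not event_id or not source:
        raise ValueError(f"not an id@source reference: {ref!r}")
    return event_id, source


ref = serialise_ref("A123-5678-9012", "jenkins_ci_prod/pipeline/my-source-pipeline")
print(ref)             # A123-5678-9012@jenkins_ci_prod/pipeline/my-source-pipeline
print(parse_ref(ref))  # ('A123-5678-9012', 'jenkins_ci_prod/pipeline/my-source-pipeline')
```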

@afrittoli
Contributor Author

not all activities that require a cdevent may have a source-event associated. This happens typically with core events, where the end of a task run or pipeline run is not triggered by an event. In that case, when an activity is finished, there will be no source-event associated

In such cases it would make sense to me to refer back to the corresponding "XStarted" event. The direct "cause" for an activity ending is that it was started in the first place.

This was my initial thought, but ultimately I decided against it in my proposal.
I understand that it might help correlate things, but it might also create confusion, and it would impose a requirement that might be hard for applications to satisfy.

The start event is not the trigger of the stop event, not directly, so I think it should be up to the consumer to correlate start and stop events if they want to. They can do so by filtering all events based on subject.

If we added the start event as a source for the stop we might have to do the same also for all the update events, which may not have a source event either. Granted, we don't really have "update" events in "cdevents" today, but we might in future.

For pipelinerun specifically, the initial event is actually "queued" and that would be the one to include the source event.

@m-linner-ericsson
Contributor

So again from an Eiffel perspective if we have links with "types" I think we could solve the problem.

For example, Eiffel's EiffelActivityFinishedEvent links back to the EiffelActivityTriggeredEvent via the ACTIVITY_EXECUTION link.

In CDEvents we could link our finished event to our queued event.

@afrittoli Does this give any clue on the source question?

@m-linner-ericsson
Contributor

I would not add this to #19 for now. Hopefully we can merge #19 soon and adding more to it might slow the process down. We can use the CloudEvents binding document to track specific extensions that we identify as we go.

@afrittoli Sorry for my fuzzy language. I wanted to ask whether the initiative described in your issue should be an extension or should be moved to the data part.

@afrittoli
Contributor Author

I would not add this to #19 for now. Hopefully we can merge #19 soon and adding more to it might slow the process down. We can use the CloudEvents binding document to track specific extensions that we identify as we go.

@afrittoli Sorry for my fuzzy language. I wanted to ask whether the initiative described in your issue should be an extension or should be moved to the data part.

Heh, that was me not reading properly, sorry about that.
I lean towards making this an extension - or storing it into an existing extension if there is a relevant one.
The reason being that I can imagine a consumer wanting to filter events based on this information, and it would be great to be able to filter without having to dig into the body.

"Dig into the body" out of context sounds really bad 🧟

@m-linner-ericsson
Contributor

I would not add this to #19 for now. Hopefully we can merge #19 soon and adding more to it might slow the process down. We can use the CloudEvents binding document to track specific extensions that we identify as we go.

@afrittoli Sorry for my fuzzy language. I wanted to ask whether the initiative described in your issue should be an extension or should be moved to the data part.

Heh, that was me not reading properly, sorry about that. I lean towards making this an extension - or storing it into an existing extension if there is a relevant one. The reason being that I can imagine a consumer wanting to filter events based on this information, and it would be great to be able to filter without having to dig into the body.

"Dig into the body" out of context sounds really bad 🧟

I do see the benefit of having the links outside of the body, but that would limit us to strings (if I understand the CloudEvents spec correctly). If we did as Eiffel does and typed them, it would make more sense to have the link as an object, but then we are forced inside the data part.

@afrittoli
Contributor Author

I would not add this to #19 for now. Hopefully we can merge #19 soon and adding more to it might slow the process down. We can use the CloudEvents binding document to track specific extensions that we identify as we go.

@afrittoli Sorry for my fuzzy language. I wanted to ask whether the initiative described in your issue should be an extension or should be moved to the data part.

Heh, that was me not reading properly, sorry about that. I lean towards making this an extension - or storing it into an existing extension if there is a relevant one. The reason being that I can imagine a consumer wanting to filter events based on this information, and it would be great to be able to filter without having to dig into the body.
"Dig into the body" out of context sounds really bad 🧟

I do see the benefit of having the links outside of the body, but that would limit us to strings (if I understand the CloudEvents spec correctly). If we did as Eiffel does and typed them, it would make more sense to have the link as an object, but then we are forced inside the data part.

We could have both. The string is a serialised version, with limited information, used for routing.
The event payload could include the same info in object format, with more data if required.
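A minimal sketch of carrying both forms, assuming the string extension is named `cdeventssourceevent` as in the JSON example and that the payload is a dictionary with a `links` list modelled loosely on the Eiffel example. The helper `with_link` and the payload field names are assumptions for illustration only.

```python
def with_link(event: dict, link_type: str, target_id: str, target_source: str) -> dict:
    """Return a copy of `event` that references another event in two places."""
    linked = dict(event)
    # Top-level string extension: limited information, usable for routing
    # and filtering without parsing the payload.
    linked["cdeventssourceevent"] = f"{target_id}@{target_source}"
    # Payload object: the same reference, with room for more data such as a type.
    data = dict(event.get("data", {}))
    data["links"] = [{"type": link_type, "id": target_id, "source": target_source}]
    linked["data"] = data
    return linked


original = {"id": "A234-1234-1234", "source": "tekton_ci_prod", "data": {"outcome": "ok"}}
linked = with_link(original, "CAUSE", "A123-5678-9012", "jenkins_ci_prod")
print(linked["cdeventssourceevent"])  # A123-5678-9012@jenkins_ci_prod
print(linked["data"]["links"])
```

A router can then match on the extension alone, while a consumer that needs the link type reads the payload.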

@m-linner-ericsson
Contributor

I do see the benefit of having the links outside of the body, but that would limit us to strings (if I understand the CloudEvents spec correctly). If we did as Eiffel does and typed them, it would make more sense to have the link as an object, but then we are forced inside the data part.

We could have both. The string is a serialised version, with limited information, used for routing. The event payload could include the same info in object format, with more data if required.

Hmm, that could be interesting, worth discussing. Could you then have something like <type>:<id>@<source>?

I am pasting a picture from the presentation on links that I gave at SIG Events in May last year, showing how Eiffel events are linked together.

[image: how Eiffel events are linked together]

@afrittoli
Contributor Author

I do see the benefit of having the links outside of the body, but that would limit us to strings (if I understand the CloudEvents spec correctly). If we did as Eiffel does and typed them, it would make more sense to have the link as an object, but then we are forced inside the data part.

We could have both. The string is a serialised version, with limited information, used for routing. The event payload could include the same info in object format, with more data if required.

Hmm, that could be interesting, worth discussing. Could you then have something like <type>:<id>@<source>?

Why do you need to have type embedded in the source event?
The id + source combination is meant to be globally unique already.

I am pasting a picture from the presentation on links that I gave at SIG Events in May last year, showing how Eiffel events are linked together.

[image: how Eiffel events are linked together]

@m-linner-ericsson
Contributor

Hmm, that could be interesting, worth discussing. Could you then have something like <type>:<id>@<source>?

Why do you need to have type embedded in the source event? The id + source combination is meant to be globally unique already.

If we want to have more link types, it would be easier for a human to read and could also be used when validating events. It could look like base:event1@my/source;context:event2@my/source instead of event1@my/source;event2@my/source
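A minimal sketch of parsing such a typed link string, assuming link types contain no ":", ids contain no "@", and no field contains ";". The function name `parse_typed_links` is hypothetical; the format itself is only the suggestion made in this thread, not a specified one.

```python
def parse_typed_links(value: str) -> list[tuple[str, str, str]]:
    """Parse "type:id@source;type:id@source" into (type, id, source) tuples."""
    links = []
    for item in value.split(";"):
        link_type, sep1, rest = item.partition(":")
        event_id, sep2, source = rest.partition("@")
        # Reject entries that lack a type, an id or a source.
        if not (sep1 and sep2 and link_type and event_id and source):
            raise ValueError(f"malformed link: {item!r}")
        links.append((link_type, event_id, source))
    return links


parsed = parse_typed_links("base:event1@my/source;context:event2@my/source")
print(parsed)
# [('base', 'event1', 'my/source'), ('context', 'event2', 'my/source')]
```

The untyped form `event1@my/source` is rejected by this parser, which is one way the type prefix could support validation.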

@m-linner-ericsson
Contributor

Also, when querying a database for event chains, having the link type at the top level (and not just in the body) makes it possible to restrict the query results to the needed link types. For example, in the picture above you might want to follow the link chain from ArtP to SCS, following only ARTIFACT, COMPOSITION, and ELEMENT links.
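A minimal sketch of such a restricted traversal over an in-memory store, assuming each stored event exposes a list of typed links. The toy graph below is made up for illustration and only loosely mirrors the abbreviations mentioned above; `follow_links` is a hypothetical helper, not a database API.

```python
def follow_links(events_by_id, start_id, allowed_types):
    """Walk links from `start_id`, following only the allowed link types."""
    visited, stack = [], [start_id]
    while stack:
        event_id = stack.pop()
        if event_id in visited or event_id not in events_by_id:
            continue
        visited.append(event_id)
        for link in events_by_id[event_id]["links"]:
            if link["type"] in allowed_types:
                stack.append(link["target"])
    return visited


events_by_id = {
    "ArtP": {"links": [{"type": "ARTIFACT", "target": "ArtC"}]},
    "ArtC": {"links": [{"type": "COMPOSITION", "target": "CD"},
                       {"type": "CAUSE", "target": "ActT"}]},
    "CD":   {"links": [{"type": "ELEMENT", "target": "SCS"}]},
    "SCS":  {"links": []},
    "ActT": {"links": []},
}
chain = follow_links(events_by_id, "ArtP", {"ARTIFACT", "COMPOSITION", "ELEMENT"})
print(chain)  # ['ArtP', 'ArtC', 'CD', 'SCS']
```

The CAUSE link to ActT is skipped because its type is not in the allowed set.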

@afrittoli afrittoli added this to the v0.1 milestone Jun 14, 2022
@afrittoli afrittoli assigned salaboy and unassigned salaboy Jun 14, 2022
@afrittoli afrittoli removed this from the v0.1 milestone Jun 14, 2022
@nicolasff

Hello,

@e-backmark-ericsson pointed me to this GitHub issue when I asked about a DAG structure between events at the recent CDEvents Community Summit, and this page has some interesting ideas.
I’ve been working with tracing recently and there’s one thing that came to mind reading this, which is that there doesn’t seem to be a clear way for an event (a node in the graph) to know the ID of its parent (another node). Having to “carry” this ID around is not trivial and would likely break a lot of abstractions.

With tracing for example, there is a concept of an all-encompassing “Trace” that contains all the “Spans” inside it. Each span can have a parent (the “link” between nodes), but the code declaring a span does not necessarily need to know its parent ID. This is because the context attached to the trace itself includes the concept of a “current span”, under which new spans can be attached. The parent IDs are propagated in order to maintain these links between nodes.

This model fits pretty well with tracing since a trace has this waterfall-like structure where new spans are often added under one that was previously started.

But how would this work with CDEvents? If my full CI pipeline was something as simple as this:

  • SCM: merge commit
  • CI:
    • checkout new version of the code
    • build
    • test
    • publish

Then at each step the events that are generated would need to “point” to a parent event to gradually build the DAG. Which means… what? That the first event in the CI pipeline would have to be given the event ID for the commit event? And then each one after that would need to remember the last ID? Given the variety of systems involved, doing this without a standardized way of transmitting and receiving this parent information would get messy very quickly.

When you start having complex graphs with multiple branches and systems triggering subsystems, it becomes difficult to always know the parent ID that a new event should be pointing to.

How was this done with Eiffel events?

@e-backmark-ericsson
Contributor

The links between events are a very core part of the Eiffel event structure, and I wouldn't say it is particularly messy at all to use them.
In your example the SCM merge commit would emit a Change Merged CDEvent with a particular event id. That id is known to the system emitting that Change Merged event. Then when the CI pipeline is kicked off, it is either explicitly called (through a webhook for example) from that same SCM system process, and the event id would then be part of that call, or, if not called explicitly, a CI system could trigger on the Change Merged event itself, as it appears on some message bus / stream, and by that the CI system will see the event id within that event.

Then the CI system checks out the code and performs the build. Once the build is performed an Artifact Packaged CDEvent would be sent, containing the known event id of the Change Merged event as a tagged link. The tag would probably be "SOURCE" or similar. The CI system would know the Artifact Packaged event id as it sent it.

In the subsequent test step(s) the Artifact Packaged event id would then be known and could easily be linked to. If a separate test system is used for testing then the event id could easily be passed there. Or, if it is not possible to forward it that way, then the test system could instead perform a lookup of the event id through a database that is known to store all events sent in the system. The lookup would then be based on some unique information that should be well known to the system, for example an artifact identity.

The Artifact Published event should probably link both to some test result event and also to the Artifact Packaged event, and either both of them would be well known in the system, or could be looked up in a similar way to the above.

This might require more hands-on examples and clarifications, but the bottom line is that it is not really complicated.

@afrittoli
Contributor Author

The links between events are a very core part of the Eiffel event structure, and I wouldn't say it is particularly messy at all to use them.

I think Eiffel's definition of links is very neat and clear, and it provides great value when all systems involved speak Eiffel natively and there is a global database of events available.
I think the main difficulty with CDEvents is that we will have to deal with incremental adoption (in the best case), and thus the database of events may not be there, and the various systems (SCM, build, test, etc.) may not have any context about related events.
This is especially true if we consider a scenario where we may rely on adapters to translate platform native events (or even APIs + polling) into CDEvents, like we discussed with @salaboy during the community summit.

In your example the SCM merge commit would emit a Change Merged CDEvent with a particular event id. That id is known to the system emitting that Change Merged event. Then when the CI pipeline is kicked off, it is either explicitly called (through a webhook for example) from that same SCM system process, and the event id would then be part of that call, or, if not called explicitly, a CI system could trigger on the Change Merged event itself, as it appears on some message bus / stream, and by that the CI system will see the event id within that event. Then the CI system checks out the code and performs the build. Once the build is performed an Artifact Packaged CDEvent would be sent, containing the known event id of the Change Merged event as a tagged link.

This is possible, but it requires embedding the event ID deeply in the core of the CI system.
When creating a build, most CI systems will make the git repo and SHA available, but not necessarily the ID of the event that triggered the build. There will usually be some kind of build ID available, so we'll need to make sure that it's possible to correlate the build ID to the event ID.

In some cases the link between platforms may happen through interfaces that are already standardised and it might be non-trivial to introduce extra context there. When an image is pushed to an OCI registry, we may wish to send a CDEvent about it, produced by the registry. The registry however will not have context about what happened before that.

Some tools that work with container images take the approach of storing things like the image signature and attestation in a separate artifact that is linked to the original through the artifact SHA. We could design a CDEvents artifact for the purpose of providing enough context to the container registry. Alternatively, an event consumer could use the artifact SHA or the attestation (when available) to infer the relation with the initial part of the workflow that led to the image push.

The tag would probably be "SOURCE" or similar. The CI system would know the Artifact Packaged event id as it sent it. In the subsequent test step(s) the Artifact Packaged event id would then be known and could easily be linked to. If a separate test system is used for testing then the event id could easily be passed there. Or, if it is not possible to forward it that way, then the test system could instead perform a lookup of the event id through a database that is known to store all events sent in the system. The lookup would then be based on some unique information that should be well known to the system, for example an artifact identity. The Artifact Published event should probably link both to some test result event and also to the Artifact Packaged event, and either both of them would be well known in the system, or could be looked up in a similar way to the above.

This might require more hands-on examples and clarifications, but the bottom line is that it is not really complicated.

My feeling is that the further we move towards the end of the workflow (deployments, incidents, rollbacks etc), the harder it becomes for the event producer to have a source event ID in the context.

My current take on links is that we should specify them as optional and, use case by use case, provide recommendations about how a system may provide such a link. Alternatively, we could state that links may be used only when a database of links is available, and somehow build into the SDK the logic to pick the correct link.

@nicolasff

This is possible, but it requires embedding the event ID deeply in the core of the CI system.

That's a good way to put it, and what I thought made it more strongly coupled with the system being monitored than it needs to be.

When an image is pushed to an OCI registry, we may wish to send a CDEvent about it, produced by the registry. The registry however will not have context about what happened before that.

This is another good example, which shows how lightly-connected these systems can be, and how passing IDs around makes it more difficult to think of them as various building blocks that can be arranged in different ways. It's also not a given that all protocols will support the injection of this additional metadata: when you push an image with docker push, how do you add this parent ID?

The mention of the events database where IDs could be looked up would help with making these events a lighter addition to the existing process, rather than a core piece of state that is passed around. If your CI pipeline already has an identifier like a Jenkins build ID or its own "pipeline ID", then it would be possible for step N in this pipeline to look up the event ID produced by step N-1 and use it to build its parent link; this lookup would be based on the CI pipeline ID.

My current take on links is that we should specify them as optional

I tend to agree, but it seems to me that without such links the only structure left to arrange events in a coherent sequence is to use their timestamps. Putting aside the very real issue of synchronizing clocks across space, it feels like this ends up limiting their use to a linear sequence of events - that is, without branches. Even for a basic pipeline with two branches where events have monotonically increasing timestamps in each branch, it becomes impossible to distinguish the events emitted by one branch from those emitted by the other. Doing so requires additional metadata.
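A small made-up example of the two-branch case, assuming each event records a single `parent` link. The event names, timestamps and the helper `path_to_root` are invented for illustration.

```python
events = [
    {"id": "start", "time": 1, "parent": None},
    {"id": "a1", "time": 2, "parent": "start"},   # branch A
    {"id": "b1", "time": 3, "parent": "start"},   # branch B
    {"id": "a2", "time": 4, "parent": "a1"},
    {"id": "b2", "time": 5, "parent": "b1"},
]


def path_to_root(events, event_id):
    """Follow parent links from an event back to the root of the graph."""
    parents = {e["id"]: e["parent"] for e in events}
    path = []
    while event_id is not None:
        path.append(event_id)
        event_id = parents[event_id]
    return path[::-1]


# Timestamps alone give one interleaved sequence with no branch structure.
by_time = [e["id"] for e in sorted(events, key=lambda e: e["time"])]
print(by_time)                     # ['start', 'a1', 'b1', 'a2', 'b2']

# Parent links recover each branch separately.
print(path_to_root(events, "a2"))  # ['start', 'a1', 'a2']
print(path_to_root(events, "b2"))  # ['start', 'b1', 'b2']
```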

I suppose you could also tag the events with branch-continue or branch-rollback for example, but then you have some cases where a tag is used to rebuild the structure, other cases where they use links… this is starting to stray from a standard solution. The overall structure linking these events can't be left to some ad-hoc solution that varies from pipeline to pipeline if the point is to provide a standard way to attach semantic meaning to processing steps by the emission of metadata events.

@e-backmark-ericsson
Contributor

It's hard to explain this in text only so I drew a picture to exemplify how the linking could be achieved. The event abbreviations stand for "Change Merged", "Artifact Packaged", "Test Suite Finished", "Artifact Published".

[image: diagram exemplifying how the linking could be achieved]

Alt 1. Each step is controlled by a tool that is fully supporting CDEvents so it can listen to CDEvents events and aggregate data from multiple events if needed. Each step is directly triggered on an incoming event. No external event storage needed. The steps triggered by events will see the event id in the event it is triggered by and can easily include that in the event sent from the step.

Alt 2. Each step is controlled by a tool that is aware of CDEvents but cannot directly trigger on such events. Instead it is explicitly called through a REST call or other RPC-like call. The tool uses data received from the caller (e.g. SCM change/commit id, artifact identity and version, test suite identity, or similar) and uses that to perform lookups towards an event storage. The id of the event returned by that lookup can be included as a link in the new event to be sent.

Alt 3. Each step is controlled by a CI/CD orchestrator, for example Tekton, Argo Workflows, Keptn, or similar. The steps themselves have no knowledge of CDEvents, and neither does the SCM system. The orchestrator that sends all events will be aware of their ids and can easily add them as links to the events it sends, as needed.

And of course any combination of the above could be used as a CI/CD setup. With this I believe I have shown that it shouldn't be very hard to accomplish the needed linking between the events. Of course we could leave the links as optional, at least to begin with, to prove them out with PoCs and examples, but I do believe they would satisfy several needs of measurability, visibility, traceability etc. of end-to-end CI/CD pipelines. They could even span between SW projects, if there are means to propagate events between them and provide means to store events and query for them as needed.

@afrittoli
Contributor Author

It's hard to explain this in text only so I drew a picture to exemplify how the linking could be achieved. The event abbreviations stand for "Change Merged", "Artifact Packaged", "Test Suite Finished", "Artifact Published".

[image: diagram exemplifying how the linking could be achieved]

Thanks @e-backmark-ericsson for drawing the diagrams.

I think links would undoubtedly be beneficial; my only concern is that adding links to the events requires an amount of knowledge or context that may not be easily available to the sender, at least not right now, which is why I think they should be in the spec but optional for now. If not, I think we would very much inhibit adoption or, perhaps even worse, foster an adoption that is not compliant with the spec.

Alt 1. Each step is controlled by a tool that is fully supporting CDEvents so it can listen to CDEvents events and aggregate data from multiple events if needed. Each step is directly triggered on an incoming event. No external event storage needed. The steps triggered by events will see the event id in the event it is triggered by and can easily include that in the event sent from the step.

I think it's fair to assume that most tools won't receive events directly. They may produce them. When they do receive events, the receiver is likely to be a dedicated microservice, that accepts the event(s), filters them based on some logic, and runs the tool if needed. So, for the tool to produce an event which links back to the source event(s), it must have some place it its own input model where to get the source event IDs from.

Wrapping a tool with your pipeline orchestrator of choice is a common strategy to add an event receiver and producer to your tool, but even so the link may not be obvious. In Tekton, the triggers component receives the events and runs the pipeline that wraps the tool, but there is no Tekton-native place to carry source event information so that it is sent in the outgoing event. Today in existing PoCs we use k8s annotations with specific names to achieve that.

We could devise and document recommended strategies to achieve this.
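
As a minimal sketch of the annotation approach mentioned above (not the actual PoC code): the receiver stores the triggering event's id and source as annotations on the run, and the producer reads them back when building the outgoing event. The annotation keys, the `links` field name, and the simplified event dictionaries are all assumptions made for illustration; none of them are defined by Tekton or by the spec.

```python
# Hypothetical annotation keys, chosen only for this sketch.
ANNOTATION_EVENT_ID = "cdevents.example/source-event-id"
ANNOTATION_EVENT_SOURCE = "cdevents.example/source-event-source"


def annotations_from_event(incoming: dict) -> dict:
    """Receiver side: record which event triggered the run."""
    ctx = incoming["context"]
    return {
        ANNOTATION_EVENT_ID: ctx["id"],
        ANNOTATION_EVENT_SOURCE: ctx["source"],
    }


def outgoing_event(event_type: str, event_id: str, source: str,
                   run_annotations: dict) -> dict:
    """Producer side: build an event and link back if the annotations exist."""
    event = {"context": {"id": event_id, "source": source, "type": event_type}}
    trigger_id = run_annotations.get(ANNOTATION_EVENT_ID)
    trigger_source = run_annotations.get(ANNOTATION_EVENT_SOURCE)
    if trigger_id and trigger_source:
        # "links" is an illustrative field name; the link stays optional.
        event["links"] = [{"id": trigger_id, "source": trigger_source}]
    return event


incoming = {"context": {"id": "evt-1", "source": "/scm/main"}}
annotations = annotations_from_event(incoming)
linked = outgoing_event("example.pipelinerun.started", "evt-2", "/tekton/ci", annotations)
unlinked = outgoing_event("example.pipelinerun.started", "evt-3", "/tekton/ci", {})
```

Because the link is only added when both annotations are present, a run started without a triggering event still produces a valid event, which fits keeping links optional.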

Alt 2. Each step is controlled by a tool that is aware of CDEvents but cannot directly trigger on such events. Instead it is explicitly called through a REST call or other RPC-like call. The tool uses data received from the caller (e.g. SCM change/commit id, artifact identity and version, test suite identity, or similar) to perform lookups towards an event storage. The id of the event returned by that lookup can be included as a link in the new event to be sent.

The caveat here is the lookup. Events are unequivocally identified by an (event ID, source) tuple, but no other data in the event is expected to unequivocally identify the event.
We may use data from the context of the sender to discover linked messages, that's something the SDK could do, but we may only do that if the selection is unequivocal.
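
To illustrate that constraint, here is a rough sketch of a lookup that only returns a link when exactly one stored event matches. The in-memory `store`, the `subject` keys, and the `find_link` helper are hypothetical and do not correspond to any existing SDK API.

```python
from typing import Dict, List, Optional, Tuple


def find_link(store: List[Dict], **criteria) -> Optional[Tuple[str, str]]:
    """Return (id, source) of the single event matching all criteria, else None."""
    matches = [
        event for event in store
        if all(event.get("subject", {}).get(key) == value
               for key, value in criteria.items())
    ]
    if len(matches) != 1:
        # No match, or an ambiguous one: do not guess a link.
        return None
    ctx = matches[0]["context"]
    return (ctx["id"], ctx["source"])


store = [
    {"context": {"id": "a1", "source": "/ci"}, "subject": {"artifact": "pkg:1.0"}},
    {"context": {"id": "a2", "source": "/ci"}, "subject": {"artifact": "pkg:2.0"}},
    {"context": {"id": "a3", "source": "/other"}, "subject": {"artifact": "pkg:2.0"}},
]

unique = find_link(store, artifact="pkg:1.0")     # exactly one match
ambiguous = find_link(store, artifact="pkg:2.0")  # two matches, so no link
missing = find_link(store, artifact="pkg:9.9")    # no match, so no link
```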

Alt 3. Each step is controlled by a CI/CD orchestrator, for example Tekton, Argo Workflows, Keptn, or similar. The steps themselves have no knowledge of CDEvents, nor of the SCM system. The orchestrator that sends all events will be aware of their ids and can easily add them as links to the events it sends, as needed.

This approach works when events are used by an "external" observer. In this case the orchestrator can also embed a context ID of some kind (build ID, pipeline ID) in the events to give more context to the observer.
Note that in Jenkins, for instance, the component that receives events and the one that triggers them would be two separate plugins, which may not be able to share the context required to produce links. Similarly, in Tekton, triggers and pipeline are two different microservices and do not have a native way to share this context today.

And of course any combination of the above could be used as a CI/CD setup. With this I believe I have shown that it shouldn't be very hard to accomplish the needed linking between the events. Of course we could leave the links as optional, at least to begin with, to prove them out with PoCs and examples, but I do believe they would satisfy several needs of measurability, visibility, traceability, etc. of end-to-end CI/CD pipelines. They could even span SW projects, if there are means to propagate events between them and to store events and query for them as needed.

@e-backmark-ericsson
Contributor

I agree that we should have the links as optional parameters, at least to begin with.

There is, based on my experience and also on discussions in several CDEvents meetings, a strong need to relate events to each other. And the need is to relate the events end-to-end, from source change to customer deployment (and beyond?), to be able to visualize, trace and provide metrics end-to-end.

There are at least three different ways to do that. Two of those require some logic on the producer side: adding links to the events, or adding a well-known context in all "related" events. The third option would be to let the consumers of the events implement that logic themselves, without any relational info in the events. Implementing that logic on the consumer side would be problematic, as many consumers will not have the required knowledge about the producers and their domain to figure out how the events relate to each other, so I think that is a dead end.

That leaves the two options for the producer side. Including a dedicated context in each of the events that are related to each other could be an option, but it is very limited. It would be single-dimensional and would depend on event timestamps to find out the order of occurrences. Being single-dimensional makes it virtually impossible to describe parallel (fan-out/fan-in) activities in the pipelines. And the dedicated context object needs to be propagated to each event-sending service to be included in the events to be sent.
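
As a small sketch of why links handle fan-out/fan-in where a single shared context does not: if each event carries links to the events it consumed, the events form a graph that can be walked without relying on timestamps. The event ids, sources, and the pipeline topology below are made up for illustration, loosely following the "Change Merged", "Artifact Packaged", "Test Suite Finished", "Artifact Published" abbreviations used earlier in the thread.

```python
def upstream(events: dict, start: tuple) -> set:
    """Collect every event reachable by following links backwards from start."""
    seen = set()
    stack = [start]
    while stack:
        key = stack.pop()
        for parent in events.get(key, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


# Keys are (event id, source); values are the events each one links to.
events = {
    ("cm-1", "/scm"): [],
    ("ap-1", "/build"): [("cm-1", "/scm")],
    # Fan-out: two test suites run in parallel on the same artifact.
    ("tsf-1", "/test-a"): [("ap-1", "/build")],
    ("tsf-2", "/test-b"): [("ap-1", "/build")],
    # Fan-in: publishing links to both test results.
    ("apub-1", "/registry"): [("tsf-1", "/test-a"), ("tsf-2", "/test-b")],
}

causes_of_publish = upstream(events, ("apub-1", "/registry"))
causes_of_test_a = upstream(events, ("tsf-1", "/test-a"))
```

With only a shared context id, all five events would look like one flat group; the links are what show that the parallel test suite `tsf-2` is not a cause of `tsf-1`.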

To be able to use links between events, the source event ids need to be propagated in a similar way as if a dedicated context was used, and that could be an issue. Not least in the case described by @afrittoli above, where a dedicated CDEvents-aware wrapper microservice triggers on a CDEvent and then calls a tool internally, as there might not be simple ways to propagate that data to an event sender within the tool. We'd need to look at each such tool case by case. Most tools have some kind of environment/workflow parameters that could be used for such needs.

So, to answer some comments above:

I think links would undoubtedly be beneficial; my only concern is that adding links into the events requires an amount of knowledge or context that may not be easily available to the sender, at least not right now, which is why I think they should be in the spec but optional for now. If not, I think we would very much inhibit adoption, or perhaps even worse, foster an adoption that is not compliant with the spec.

I agree, the links should be optional

I think it's fair to assume that most tools won't receive events directly. They may produce them. When they do receive events, the receiver is likely to be a dedicated microservice that accepts the event(s), filters them based on some logic, and runs the tool if needed. So, for the tool to produce an event which links back to the source event(s), it must have some place in its own input model from which to get the source event IDs.

Yes, but the same goes for any other relational info needed to be included on the producer side. Without it the relations need to be created on the consumer side which will be almost impossible in many cases.

Wrapping a tool with your pipeline orchestrator of choice is a common strategy to add an event receiver and producer to your tool, but even so the link may not be obvious. In Tekton, the triggers component receives the events and runs the pipeline that wraps the tool, but there is no Tekton-native place to carry source event information so that it is sent in the outgoing event. Today in existing PoCs we use k8s annotations with specific names to achieve that.

Yes, there need to be tool-specific implementations for this kind of relational data, be it "links" or "dedicated context" data.

We could devise and document recommended strategies to achieve this.

Yes

The caveat here is the lookup. Events are unequivocally identified by an (event ID, source) tuple, but no other data in the event is expected to unequivocally identify the event. We may use data from the context of the sender to discover linked messages, that's something the SDK could do, but we may only do that if the selection is unequivocal.

Without unequivocally identifiable event data it will not be possible to relate events deterministically. References such as links (event ID, source), dedicated contexts, or some other event-specific unique data to be evaluated by the consumer need to be available to be able to relate the events.

Note that in Jenkins, for instance, the component that receives events and the one that triggers them would be two separate plugins, which may not be able to share the context required to produce links.

This is not true. We've used one single plugin for Eiffel events for years, and it is responsible both for triggering builds on incoming events and for sending events linking to those incoming events.

@afrittoli afrittoli added the roadmap Items on the roadmap label Oct 6, 2022
@e-backmark-ericsson e-backmark-ericsson added this to the v0.2 milestone Oct 21, 2022
@afrittoli afrittoli self-assigned this Dec 13, 2022
@afrittoli
Contributor Author

#104

@afrittoli afrittoli modified the milestones: v0.2, v0.3 Mar 13, 2023
@afrittoli afrittoli modified the milestones: v0.3, v0.4 May 2, 2023
@afrittoli
Contributor Author

Related PR: #139

@afrittoli afrittoli modified the milestones: v0.4, v0.5 Apr 8, 2024
@xibz
Contributor

xibz commented Apr 8, 2024

After carefully reading this issue, most of it has been solved by #139.

What's left is propagation, which is mentioned in the spec already via headers, but needs actual implementation. This issue can be closed, and I will create a follow-up issue in the appropriate place (still looking into where that is) for implementing propagation in the SDKs and documenting how services can propagate these headers, etc.

@xibz xibz closed this as completed Apr 8, 2024