Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optionally emit events during execution #2082

Closed
bobcatfish opened this issue Feb 21, 2020 · 21 comments · Fixed by #2938
Closed

Optionally emit events during execution #2082

bobcatfish opened this issue Feb 21, 2020 · 21 comments · Fixed by #2938
Assignees
Labels
area/roadmap Issues that are part of the project (or organization) roadmap (usually an epic) kind/feature Categorizes issue or PR as related to a new feature.

Comments

@bobcatfish
Copy link
Collaborator

Expected Behavior

Although a user can use the CloudEvent PipelineResource to emit events during a TaskRun or PipelineRun, some folks may want this to happen by default, for significant events such as:

  • Starting execution
  • Finishing execution
  • Condition skipped
  • ....?

Actual Behavior

The CloudEvent PipelineResource is the only option for this right now, but even that needs #1684 before it can be relied on (see #2053)

Additional info

  • One open question is where it would be emitting events to, e.g. does the Run provide an endpoint address that CloudEvents are sent to via HTTP, or do the events go to some kind of message bus, or something else?
  • Note that the keptn.sh folks are interested in creating standards around these kinds of events Investigate joint CloudEvents with Tekton keptn/keptn#1259
@dibyom
Copy link
Member

dibyom commented Feb 24, 2020

/kind feature

@tekton-robot tekton-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Feb 24, 2020
@afrittoli
Copy link
Member

@bobcatfish I've been thinking about this for a bit, and I think it would be great to have the ability to specify a sink in the Task|PipelineRun so that cloud events could be sent by a task without the need of altering the task to add the pipeline resource. This would greatly improve the re-usability of tasks, e.g. when using Tasks as CI Jobs.

It gets a bit more complicated than just a sink though, since we might need to:

  1. send the event to multiple sinks
  2. send the event to different sinks depending on the event
  3. send the event to different sinks depending on the task in a pipeline

A lot of those might be avoided when using tekton triggers by using a single EventListener with multiple triggers and CEL filters to select the right thing to select.

This is highly related to the notification work too, as I think the basic infrastructure required in the controller to make this happen is the same as that needed for notifications.

@afrittoli
Copy link
Member

/assign

@afrittoli
Copy link
Member

afrittoli commented Mar 30, 2020

As a starting point, I think it's a fair assumption to let receivers / channels filter events, and err on the side of sending too many events, in favour of a lean syntax on Tekton side.
In practice this means that a PipelineRun and a TaskRun will get a new field that allows specifying a list of sinks. If sinks are specified, a cloud event will be sent for all events to all sinks.

Options for the field name: sinks, cloudEvents, cloudEventSinks, notify

apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  name: pipeline-with-sinks
spec:
  serviceAccountName: bot
  cloudEvents:
    - http://my-event-listener
    - https://another-event-listener
    - https://some-external-service
  pipelineRef:
    name: my-pipeline
  params:
    - name: p1
      value: v1
  resources:
  - name: source
    resourceRef:
      name: git-resource

Similar syntax for TaskRuns. When set on PipelineRun level, all siniks would apply to all TaskRuns.

@skaegi
Copy link
Contributor

skaegi commented Mar 31, 2020

This might just be an implementation detail, but one option to consider is have the pipeline controller only "generate" Tekton lifecycle Kubernetes events and then leave the decision on CloudEvents or http endpoints up to a sub-project with a controller that listens for the Tekton events and distributes them.

We actually need the Kubernetes events in our service's "agent" already so this would simplify our implementation.

@vdemeester
Copy link
Member

This might just be an implementation detail, but one option to consider is have the pipeline controller only "generate" Tekton lifecycle Kubernetes events and then leave the decision on CloudEvents or http endpoints up to a sub-project with a controller that listens for the Tekton events and distributes them.

We actually need the Kubernetes events in our service's "agent" already so this would simplify our implementation.

I think we can do both, aka : by default sending k8s events, and if cloudEvents field is specified send them there too 😉

@ghost
Copy link

ghost commented Apr 1, 2020

I really like the ease-of-use of the proposed syntax, and pushing the responsibility of filtering the events to the receiver rather than Tekton.

Per discussion in the WG just now, I wanted to put forward a use case that might affect the design a little bit:

A team authors their organization's Tasks and Pipelines. That team wants to ensure that every Run emits cloudEvents to a specific set of endpoints for monitoring / data collection / regulatory requirements. With the current design each developer team, who mostly write Runs, has to remember to include the cloudEvents end points in those Runs. So I wonder if there is a way to set the "default" set of endpoints that every Run receives. Three options we talked about on the call:

  1. Pipelines and Tasks can also specify cloudEvents that get added to any Run using them.
  2. A global config location includes a cloudEvents section (like our "defaults" ConfigMap).
  3. Pipelines and Tasks can specify cloudEvents "slots" which must be bound by Run authors.

After thinking about it a little bit I like number 2 the best so far. I can also see that there are different levels of granularity here: 1 is Pipeline / Task specific, 2 is global, 3 is Run-specific but with stricter validation that endpoints were included.

Another wrinkle: Do we merge the cloudEvents lists if they're specified in both a global location and a Run? Or replace them with those specified in the Run? My hunch is we should merge.

@afrittoli
Copy link
Member

Thank you @sbwsg for your feedback and for capturing the WG discussion here.
We're going to emit k8s events for all lifecycle events, which partly covers the monitoring use case. Nonetheless, I think it would make sense to support sending cloud events for all pipeline/taskruns.

My favourite option would be (2), i.e. use sinks from a global configmap, and send cloud events for all lifecycle events for all tasks/pipelines. Other options are not necessarily orthogonal, so we could also add them as we need in future.

The way I would go about implementing this is:

  1. Ensure we have clear points / hooks in the code that allow us to identify the different lifecycle events for tasks and pipelines. I would actually start with task start and finish and add incrementally. Emit k8s events there.
  2. Add the configmap for sinks, and start sending cloudevents in the same place we add k8s events.
  3. Extend the taskrun API with the list of sinks. I would opt for merging lists from global config with taskrun specific ones.
  4. Add more lifecycle events for Task and add lifecycle events for pipelineruns (this can be multiple PRs)

How does this sound?

@ghost
Copy link

ghost commented Apr 1, 2020

Cheers @afrittoli Sounds like a really solid approach to me!

@afrittoli
Copy link
Member

I will also address #2328 as part of this work

afrittoli added a commit to afrittoli/pipeline that referenced this issue Apr 5, 2020
Start emitting events for additional TaskRun lifecyle events:
- taskrun started
- taskrun timeout

Introduce pre-run and post-run functions that are invoked
asynchronously when the taskrun starts and completes, to emit
events.

These same functions shall be used to trigger any other async
behaviour on start/stop of taskruns.

Add documentation on events.

Fixes tektoncd#2328
Work towards tektoncd#2082
afrittoli added a commit to afrittoli/pipeline that referenced this issue Apr 5, 2020
Start emitting events for additional TaskRun lifecyle events:
- taskrun started
- taskrun timeout

Introduce pre-run and post-run functions that are invoked
asynchronously when the taskrun starts and completes, to emit
events.

These same functions shall be used to trigger any other async
behaviour on start/stop of taskruns.

Add documentation on events.

Fixes tektoncd#2328
Work towards tektoncd#2082
@bobcatfish
Copy link
Collaborator Author

Note: priorities for this issue are emitting events at end of Pipeline and Task execution, #742 additionally adds after step execution

@shimmerjs
Copy link

Is this the intended replacement for CloudEvent PipelineResources in the future, since PipelineResources are more-or-less deprecated in the latest release of Tekton?

@afrittoli
Copy link
Member

This will provide more than we can achieve with CloudEventPipelineResource, since it will include k8s and CloudEvents for every life cycle event of both tasks and pipelines.

Resources are not officially deprecated, but likely they will either reworked or replaced. In either case the cloud event resource might go away since it doesn't really fit the same pattern as other pipeline resource, its functionality is not executed in injected steps but in the controller.

afrittoli added a commit to afrittoli/pipeline that referenced this issue Apr 9, 2020
Start emitting events for additional TaskRun lifecyle events:
- taskrun started
- taskrun timeout

Introduce pre-run and post-run functions that are invoked
asynchronously when the taskrun starts and completes, to emit
events.

These same functions shall be used to trigger any other async
behaviour on start/stop of taskruns.

Add documentation on events.

Fixes tektoncd#2328
Work towards tektoncd#2082
afrittoli added a commit to afrittoli/pipeline that referenced this issue Apr 16, 2020
Start emitting events for additional TaskRun lifecyle events:
- taskrun started
- taskrun timeout

Introduce pre-run and post-run functions that are invoked
asynchronously when the taskrun starts and completes, to emit
events.

These same functions shall be used to trigger any other async
behaviour on start/stop of taskruns.

Add documentation on events.

Fixes tektoncd#2328
Work towards tektoncd#2082
afrittoli added a commit to afrittoli/pipeline that referenced this issue Apr 16, 2020
Start emitting events for additional TaskRun lifecyle events:
- taskrun started
- taskrun timeout

Introduce pre-run and post-run functions that are invoked
asynchronously when the taskrun starts and completes, to emit
events.

These same functions shall be used to trigger any other async
behaviour on start/stop of taskruns.

Add documentation on events.

Fixes tektoncd#2328
Work towards tektoncd#2082
afrittoli added a commit to afrittoli/pipeline that referenced this issue Apr 26, 2020
Emit events for additional TaskRun lifecyle events:
- taskrun started
- taskrun timeout

Introduce pre-run and post-run functions that are invoked
asynchronously when the taskrun starts and completes, to emit
events.

These same functions shall be used to trigger any other async
behaviour on start/stop of taskruns.

Add documentation on events.

Fixes tektoncd#2328
Work towards tektoncd#2082
afrittoli added a commit to afrittoli/pipeline that referenced this issue May 20, 2020
Move both events.go and the cloudevents module under a new module
reconciler.events. We want to the controllers to be able to emit
both kubernetes events and cloudevents, so this change is in
preparation to creating a common event interface to be used
by controllers.

Partially addresses tektoncd#2082
tekton-robot pushed a commit that referenced this issue May 21, 2020
Move both events.go and the cloudevents module under a new module
reconciler.events. We want to the controllers to be able to emit
both kubernetes events and cloudevents, so this change is in
preparation to creating a common event interface to be used
by controllers.

Partially addresses #2082
@jlpettersson
Copy link
Member

There is also Eventrouter that can forward Kubernetes Event to a Sink. We use this today to forward "platform"-events to e.g. Kafka and Dashboards.

@afrittoli
Copy link
Member

The reason why I'd like to emit cloud events directly, either from the main controller or from a dedicated one, is that k8s events have a fixed structure and afaik do not allow storing the whole source object in the event.

afrittoli added a commit to afrittoli/pipeline that referenced this issue May 25, 2020
Change the event type string for taskruns from dev.tekton.event.task.*
to dev.tekton.event.taskrun.*. Since cloud events are only sent by
pipeline resources - which are alpha - we don't need to comply with
the beta deprecation policy for this.

Extend the cloudevents module to include more event types for TaskRuns
and event types for PipelineRuns, with unit test coverage.
This is achieved by introducing the RunsToCompletion and
RunsToCompletionStatus interface which are met by TaskRun, PipelineRun
and their respective status types.

At this stage there is no event producer that generates these new
events. This is preparing the stage for the next changes where
we will start to emit cloud events in addition to the k8s events
we emit today.

Partially addresses tektoncd#2082
Partially addresses tektoncd#2684
afrittoli added a commit to afrittoli/pipeline that referenced this issue May 25, 2020
Change the event type string for taskruns from dev.tekton.event.task.*
to dev.tekton.event.taskrun.*. Since cloud events are only sent by
pipeline resources - which are alpha - we don't need to comply with
the beta deprecation policy for this.

Extend the cloudevents module to include more event types for TaskRuns
and event types for PipelineRuns, with unit test coverage.
This is achieved by introducing the RunsToCompletion and
RunsToCompletionStatus interface which are met by TaskRun, PipelineRun
and their respective status types.

At this stage there is no event producer that generates these new
events. This is preparing the stage for the next changes where
we will start to emit cloud events in addition to the k8s events
we emit today.

Partially addresses tektoncd#2082
Partially addresses tektoncd#2684

Signed-off-by: Andrea Frittoli <andrea.frittoli@uk.ibm.com>
afrittoli added a commit to afrittoli/pipeline that referenced this issue May 29, 2020
Change the event type string for taskruns from dev.tekton.event.task.*
to dev.tekton.event.taskrun.*. Since cloud events are only sent by
pipeline resources - which are alpha - we don't need to comply with
the beta deprecation policy for this.

Extend the cloudevents module to include more event types for TaskRuns
and event types for PipelineRuns, with unit test coverage.
This is achieved by introducing the RunsToCompletion and
RunsToCompletionStatus interface which are met by TaskRun, PipelineRun
and their respective status types.

At this stage there is no event producer that generates these new
events. This is preparing the stage for the next changes where
we will start to emit cloud events in addition to the k8s events
we emit today.

Partially addresses tektoncd#2082
Partially addresses tektoncd#2684

Signed-off-by: Andrea Frittoli <andrea.frittoli@uk.ibm.com>
afrittoli added a commit to afrittoli/pipeline that referenced this issue Jun 3, 2020
Change the event type string for taskruns from dev.tekton.event.task.*
to dev.tekton.event.taskrun.*. Since cloud events are only sent by
pipeline resources - which are alpha - we don't need to comply with
the beta deprecation policy for this.

Extend the cloudevents module to include more event types for TaskRuns
and event types for PipelineRuns, with unit test coverage.
This is achieved by introducing the RunsToCompletion and
RunsToCompletionStatus interface which are met by TaskRun, PipelineRun
and their respective status types.

At this stage there is no event producer that generates these new
events. This is preparing the stage for the next changes where
we will start to emit cloud events in addition to the k8s events
we emit today.

Partially addresses tektoncd#2082
Partially addresses tektoncd#2684

Signed-off-by: Andrea Frittoli <andrea.frittoli@uk.ibm.com>
afrittoli added a commit to afrittoli/pipeline that referenced this issue Jun 6, 2020
Change the event type string for taskruns from dev.tekton.event.task.*
to dev.tekton.event.taskrun.*. Since cloud events are only sent by
pipeline resources - which are alpha - we don't need to comply with
the beta deprecation policy for this.

Extend the cloudevents module to include more event types for TaskRuns
and event types for PipelineRuns, with unit test coverage.
This is achieved by introducing the RunsToCompletion and
RunsToCompletionStatus interface which are met by TaskRun, PipelineRun
and their respective status types.

At this stage there is no event producer that generates these new
events. This is preparing the stage for the next changes where
we will start to emit cloud events in addition to the k8s events
we emit today.

Partially addresses tektoncd#2082
Partially addresses tektoncd#2684

Signed-off-by: Andrea Frittoli <andrea.frittoli@uk.ibm.com>
afrittoli added a commit to afrittoli/pipeline that referenced this issue Jun 8, 2020
Change the event type string for taskruns from dev.tekton.event.task.*
to dev.tekton.event.taskrun.*. Since cloud events are only sent by
pipeline resources - which are alpha - we don't need to comply with
the beta deprecation policy for this.

Extend the cloudevents module to include more event types for TaskRuns
and event types for PipelineRuns, with unit test coverage.
This is achieved by introducing the RunsToCompletion and
RunsToCompletionStatus interface which are met by TaskRun, PipelineRun
and their respective status types.

At this stage there is no event producer that generates these new
events. This is preparing the stage for the next changes where
we will start to emit cloud events in addition to the k8s events
we emit today.

Partially addresses tektoncd#2082
Partially addresses tektoncd#2684

Signed-off-by: Andrea Frittoli <andrea.frittoli@uk.ibm.com>
tekton-robot pushed a commit that referenced this issue Jun 8, 2020
Change the event type string for taskruns from dev.tekton.event.task.*
to dev.tekton.event.taskrun.*. Since cloud events are only sent by
pipeline resources - which are alpha - we don't need to comply with
the beta deprecation policy for this.

Extend the cloudevents module to include more event types for TaskRuns
and event types for PipelineRuns, with unit test coverage.
This is achieved by introducing the RunsToCompletion and
RunsToCompletionStatus interface which are met by TaskRun, PipelineRun
and their respective status types.

At this stage there is no event producer that generates these new
events. This is preparing the stage for the next changes where
we will start to emit cloud events in addition to the k8s events
we emit today.

Partially addresses #2082
Partially addresses #2684

Signed-off-by: Andrea Frittoli <andrea.frittoli@uk.ibm.com>
@tom24d
Copy link

tom24d commented Jun 19, 2020

Hi @afrittoli , I am genuinely staggered by your hard work. Thanks so much for doing this!

I got one question:
What is the expectation if the third party component registers to monitor every events generated by any Run? In order to monitor every events, I am imagining the situation of injecting sinkURI from third party controller into the ConfigMap, but I want to make sure that no breakage occur in Tekton ecosystem.

@afrittoli
Copy link
Member

Hi @afrittoli , I am genuinely staggered by your hard work. Thanks so much for doing this!

I got one question:
What is the expectation if the third party component registers to monitor every events generated by any Run? In order to monitor every events, I am imagining the situation of injecting sinkURI from third party controller into the ConfigMap, but I want to make sure that no breakage occur in Tekton ecosystem.

Hi @tom24d, thank you for your interest in this topic. The way I imagined this is that the administrator of a Tekton deployment would configure the target URL for cloud events.
The config map can also be edited by a third party controller as you mentioned, as long as it has the rights to edit a configmap in the namespace where Tekton is deployed.

Your question makes me wonder if we should support a list of sinkURI instead of a single one?
In #2837 I introduced the configuration option as a single string, but it should be possible to change that to a list if needed.

Once #2837 is finished I will add a similar PR for pipelineruns, and later I plan to enabled configuring a sinkURI at taskrun and pipelinerun level, to allow for better granularity.

@tom24d
Copy link

tom24d commented Jun 21, 2020

@afrittoli

In 2837 I introduced the configuration option as a single string, but it should be possible to change that to a list if needed.

If there is no problem for that idea, I think that gonna make the basic way for third party actor! 😀
The reason is, I was wondering if there is a general registration mechanism to receive every events from Tekton. As long as regarding the discussion here, it seems like injecting sinkURI to the ConfigMap is the simplest way for me.

Thanks!

@afrittoli
Copy link
Member

@afrittoli

In 2837 I introduced the configuration option as a single string, but it should be possible to change that to a list if needed.

If there is no problem for that idea, I think that gonna make the basic way for third party actor! 😀
The reason is, I was wondering if there is a general registration mechanism to receive every events from Tekton. As long as regarding the discussion here, it seems like injecting sinkURI to the ConfigMap is the simplest way for me.

Thanks!

Yes, with #2837 you will be able to set the sinkURI in the config map to receive cloud events.
Tekton also publishes kubernetes events to the cluster, so you could intercept those too, but you'd need an app running on the same cluster with access to those events.

@vdemeester
Copy link
Member

Your question makes me wonder if we should support a list of sinkURI instead of a single one?
In #2837 I introduced the configuration option as a single string, but it should be possible to change that to a list if needed.

I would think it would make sense to support more than one sink indead 🙃

@afrittoli
Copy link
Member

As per discussion during the API WG on 13/07/20 we will not support for now any option beyond the single sink in the platform wide configuration.

The following use cases:

  • send events to multiple sinks
  • send different events to different sinks
    can be supported using an event broker with event filtering capabilities, or where interested parties can subscribe to different events.

One concern with this approach is the volume of events sent. Right now it's all or nothing, and in case of tasks and pipelines running in different namespaces, they would have to report events to a central broker, which may raise concerns in terms of traffic separation.

If Tekton users will start using cloud events, they may report on missing features and those will be addressed via dedicated TEPs / PRs.

Tekton Pipelines automation moved this from Roadmap to Closed Jul 27, 2020
@bobcatfish bobcatfish added the area/roadmap Issues that are part of the project (or organization) roadmap (usually an epic) label Aug 24, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/roadmap Issues that are part of the project (or organization) roadmap (usually an epic) kind/feature Categorizes issue or PR as related to a new feature.
Projects
No open projects
Tekton Pipelines
  
Closed
Development

Successfully merging a pull request may close this issue.

9 participants