Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support trigger via CloudEvents #1439

Closed
afrittoli opened this issue Sep 1, 2022 · 19 comments
Closed

Support trigger via CloudEvents #1439

afrittoli opened this issue Sep 1, 2022 · 19 comments
Assignees
Labels
area/roadmap Issues that are part of the project (or organization) roadmap (usually an epic) kind/feature Categorizes issue or PR as related to a new feature.

Comments

@afrittoli
Copy link
Member

Feature request

Support CloudEvents compliant responses.

I see a few options:

  • When a cloudevent is received, respond with a cloudevent. The current body could be placed in the payload.
  • When a cloudevent is received, respond with no payload
  • Allow users to disable the response payload platform wise

Use case

When the project was started, it was decided to support generic HTTP requests with JSON payload rather than focus specifically on CloudEvents. The current behaviour of triggers is not compliant with the CloudEvents spec though, which makes it hard to use Triggers as a sink for CloudEvents.

When using triggers with Knative broker, the broker checks if the response is a cloud event, or empty, and reports a 502 back otherwise. See the code and the logs from a test cluster:

{"level":"debug","ts":"2022-09-01T10:59:16.299Z","logger":"mt_broker_filter","caller":"filter/filter_handler.go:173","msg":"Received message","commit":"56edb3b","triggerRef":"default/all-to-tekton-el-events"}
{"level":"debug","ts":"2022-09-01T10:59:16.299Z","logger":"mt_broker_filter","caller":"filter/filter_handler.go:353","msg":"Applying attributes filter.","commit":"56edb3b","trigger":"default/all-to-tekton-el-events","filter":{}}
{"level":"debug","ts":"2022-09-01T10:59:16.350Z","logger":"mt_broker_filter","caller":"filter/filter_handler.go:222","msg":"Successfully dispatched message","commit":"56edb3b","target":"http://el-events.default.svc.cluster.local:8080"}
{"level":"error","ts":"2022-09-01T10:59:16.350Z","logger":"mt_broker_filter","caller":"filter/filter_handler.go:227","msg":"failed to write response","commit":"56edb3b","error":"received a non-empty response not recognized as CloudEvent. The response MUST be either empty or a valid CloudEvent","stacktrace":"knative.dev/eventing/pkg/broker/filter.(*Handler).send\n\tknative.dev/eventing/pkg/broker/filter/filter_handler.go:227\nknative.dev/eventing/pkg/broker/filter.(*Handler).ServeHTTP\n\tknative.dev/eventing/pkg/broker/filter/filter_handler.go:209\ngo.opencensus.io/plugin/ochttp.(*Handler).ServeHTTP\n\tgo.opencensus.io@v0.23.0/plugin/ochttp/server.go:92\nknative.dev/pkg/network/handlers.(*Drainer).ServeHTTP\n\tknative.dev/pkg@v0.0.0-20220524202603-19adf798efb8/network/handlers/drain.go:110\nnet/http.serverHandler.ServeHTTP\n\tnet/http/server.go:2879\nnet/http.(*conn).serve\n\tnet/http/server.go:1930"}
@afrittoli afrittoli added the kind/feature Categorizes issue or PR as related to a new feature. label Sep 1, 2022
afrittoli added a commit to afrittoli/cdevents-metrics-poc that referenced this issue Sep 1, 2022
We use gitea as SCM provider, so this includes some values for
deploying gitea on the IBM cloud cluster.
Repo can be cloned anonymously but anything else requires login.

Tekton triggers and pipeline are going to be used, in combination
with a new tool, "cdeventer", to react to webhook/events and
produce cdevents. We can use vanilla Tekton here so pretty much
just install the components for now.

The Tekton dashboard is handy for the demo, and it's deployed
behind an oauth-proxy to secure it via GitHub login.

cdeventer itself is a controller plus tekton resources.
The controller for now is added on the cluster using ko, and it's
not yet included in this commit.
The Tekton resources are developed in the POC and will be copied
over to cdeventer later on to make the more reusable.

The sink for CDEvents is a CloudEvent broker provided by Knative
with its in-memory channel implementation. Knative triggers
are used to define subscriptions and get CDEvents to the
interested parties.

The flow to produce CDEvent for now goes as follows:
- demo developer merges a PR in gitea
- webhook goes to an event listener in the tekton-ci namespace
- the tekton trigger/binding extracts the data required for the
  change CDEvent from the webhook headers and payload
- the tekton trigger template creates a "Run" i.e. a custom
  task run with the data for the CDEvent
- the cdeventer custom task reconciler reconciles the "Run".
  It uses the new go-sdk to build, validate and send a CDEvent
  to the CloudEvents broker

Now, to consume / display the CDEvent, the flows for now is:
- an all catching knative trigger gets the CDEvent to a different
  Tekton event-listener, el-events, in the default namespace
- *note* the knative broker must be set to 0 retries because of
  tektoncd/triggers#1439
- A Tekton trigger matched on CDEvents runs a display pipeline
  that shows headers and payload of the CDEvent in the Tekton
  dashboard

Signed-off-by: Andrea Frittoli <andrea.frittoli@gmail.com>
@dibyom
Copy link
Member

dibyom commented Sep 1, 2022

We can probably add a cloud event response type behind a opt in flag?

@afrittoli
Copy link
Member Author

We can probably add a cloud event response type behind a opt in flag?

It sounds good, but I think it should be only for incoming CloudEvents.

@dibyom
Copy link
Member

dibyom commented Sep 1, 2022

Are you saying we return the current response if the input is not a cloud event? I was thinking we could eventually stabilize on always returning a cloud event?

@afrittoli
Copy link
Member Author

Are you saying we return the current response if the input is not a cloud event? I was thinking we could eventually stabilize on always returning a cloud event?

I guess we could... I'm not sure if there is any spec about expected replies for a webhook.
If we use CloudEvents binary format, the body would not change at all, only HTTP headers would be added... so I guess we could send that to all eventually.

@dibyom
Copy link
Member

dibyom commented Sep 14, 2022

Yeah, there are no expected replies as far as I know (other than the HTTP status code)

@Fabian-K
Copy link
Contributor

Just to add some context/motivation. This is especially an issue in combination with knatives retry function (https://knative.dev/docs/eventing/event-delivery/). Because of the unexpected response, it is treated as a failure and retried. The retry results in duplicated resources created by triggers.

@dibyom dibyom added the area/roadmap Issues that are part of the project (or organization) roadmap (usually an epic) label Sep 23, 2022
@bigkevmcd
Copy link
Member

I ran into this too, and I had a look at the code...

https://github.com/tektoncd/triggers/blob/main/pkg/sink/sink.go#L211-L223

We could detect CloudEvent requests there, and either write the response as JSON, or, as a CloudEvent message

@afrittoli
Copy link
Member Author

I think the response is JSON today, but there are no CloudEvents headers, so it's rejected by the broker

@bigkevmcd
Copy link
Member

That's why we'd write it as a CloudEvent message (using the cloudevents code) which would set the response correctly.

@bigkevmcd
Copy link
Member

I had a crack at this earlier this morning main...bigkevmcd:triggers:cloudevents-triggers it needs a bit of tidying but the test passes.

@treyhyde
Copy link

I make extensive use of triggering pipelines via cloudevents delivered from knative... so that works... but I'm forced to disable retry since knative thinks all the responses are bogus regardless of the 202 since it responds with JSON that isn't a CE.

Doing any of the things in the original topic here works for me, even if guarded by a feature flag.

My old discussion on the subject: https://knative.slack.com/archives/C017X0PFC0P/p1652901710903979

@tekton-robot
Copy link

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale with a justification.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle stale

Send feedback to tektoncd/plumbing.

@tekton-robot tekton-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 25, 2023
@daper
Copy link

daper commented Jan 25, 2023

/remove-lifecycle stale

@tekton-robot tekton-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 25, 2023
@tekton-robot
Copy link

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale with a justification.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle stale

Send feedback to tektoncd/plumbing.

@tekton-robot tekton-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 25, 2023
@daper
Copy link

daper commented Apr 25, 2023

/remove-lifecycle stale

@tekton-robot tekton-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 25, 2023
@afrittoli
Copy link
Member Author

@dibyom i think this has been implemented, right?

@savitaashture
Copy link
Contributor

There is a implementation related to slack interceptor which also supports http form data as part of this #1548 which is released in Triggers v0.24

@afrittoli does that PR helps ?

@afrittoli
Copy link
Member Author

Thanks @savitaashture - that's a nice feature to have, but I don't think it would help with CloudEvents specifically.
The problem with CloudEvents is that the specification mandates the reply to be compliant with the CloudEvents specification.

I believe that has been addressed and solved in #1469

@khrm
Copy link
Contributor

khrm commented May 30, 2023

Yes, it has been resolved by #1469 by following 1st approach. I don't see a strong need for disabling the response.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/roadmap Issues that are part of the project (or organization) roadmap (usually an epic) kind/feature Categorizes issue or PR as related to a new feature.
Projects
Status: Done
Development

No branches or pull requests

9 participants