Export/Publish notifications #5337

Open · kahowell opened this issue May 8, 2024 · 12 comments · May be fixed by #5405

kahowell commented May 8, 2024

Is your feature request related to a problem? Please describe.
As a service that integrates with Pulp, I'd like to be notified when there is any change to the availability of content.

Describe the solution you'd like

I'd like to be able to subscribe to machine-readable notifications. I expect these to come across via a messaging protocol (e.g. AMQP or Kafka). Ideally, the solution should be messaging protocol agnostic, so that different deployment options can be supported. (I'd personally recommend Kafka as a baseline).

There should be a versioned schema for the notification messages. (I'd personally recommend JSON Schema written in YAML for simplicity).

For interoperability, CloudEvents in structured mode should be used as the message format.
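
For illustration only (nothing here is specified by this request beyond the points above), a structured-mode CloudEvent for a hypothetical "publication created" notification could be assembled roughly like this; the event type, source, and data fields are assumptions:

```python
import json
import uuid
from datetime import datetime, timezone

# Hypothetical structured-mode CloudEvent for a "publication created" event.
# The type, source, and data fields are illustrative assumptions only.
event = {
    "specversion": "1.0",
    "id": str(uuid.uuid4()),
    "source": "pulp.example.com",                    # assumed source identifier
    "type": "org.pulpproject.publication.created",   # assumed event type
    "time": datetime.now(timezone.utc).isoformat(),
    "datacontenttype": "application/json",
    "data": {
        "publication_href": "/pulp/api/v3/publications/file/file/<uuid>/",
        "repository_version": "/pulp/api/v3/repositories/file/file/<uuid>/versions/1/",
    },
}

# In structured mode, the whole envelope (attributes plus data) is
# serialized into the message body.
message_body = json.dumps(event).encode("utf-8")
```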

Describe alternatives you've considered

  • Polling the APIs can be used as a stop-gap, at the expense of API load.
  • A publish/subscribe-style REST interface (see e.g. Google Cloud Pub/Sub), but this seems much more complex than simply integrating with a messaging platform.
  • Configurable callbacks to external systems to notify them of changes, but this introduces unnecessary coupling and resiliency challenges.
@daviddavis (Contributor)

We've requested a very similar feature: #4785

dkliban (Member) commented May 9, 2024

@daviddavis would kafka messages suffice?

@daviddavis (Contributor)

@dkliban I think we were kind of hoping for something similar to how Pulp handles signing services. I think such a feature would be more flexible for users who could use it to create notifications (or whatever else they want). That said, it would be more work for users who only want notifications.

If Pulp was set on notifications, I wonder if maybe redis (or I guess valkey) would be an option? Especially since it's already part of the Pulp stack. For Kafka, I do see there is a service that Azure provides that maybe we could use. I don't know anything about it but I could dig into it more if you all were thinking of only supporting it.
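
For context on the Redis/Valkey idea (the channel name and payload below are assumptions), publishing a notification over pub/sub is essentially a one-liner with redis-py, which also works against Valkey:

```python
import json

import redis  # redis-py also speaks to Valkey

# Sketch of the Redis/Valkey alternative: fire a notification on a pub/sub
# channel. Channel name and payload are assumptions for illustration.
client = redis.Redis(host="localhost", port=6379)
client.publish(
    "pulp.notifications",
    json.dumps({"event": "publication.created", "href": "/pulp/api/v3/publications/file/file/<uuid>/"}),
)
```

One caveat with plain pub/sub is that messages are not retained for subscribers that are offline; Redis Streams would be closer to Kafka semantics.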

dkliban (Member) commented May 9, 2024

@daviddavis I agree that calling into a script that is provided by the administrator would be the most flexible. We could then provide some examples of scripts that perform calls to web servers and send notifications in our docs.

sdherr (Contributor) commented May 9, 2024

Hmm, I think I disagree that calling a script in a subprocess (like we do for SigningServices) is the "right" thing to do here. Event / messaging services are the modern solution that exists to fill this gap.

On the other hand, if you do just add a script-runner, we could just write a 10-line script to send a message where we need it to go. Maybe that does make sense from a minimal-dependency perspective. But there's a not-insignificant amount of code in Pulp for registering / using SigningServices, and you probably would have to duplicate a lot of it to create something similar for a publication-notification script, so you may be making it harder on yourselves than just plugging into a standard solution. Or maybe integrating with a Kafka-esque service would make sense if you had a larger use-case for it than just publication notifications.

dkliban (Member) commented May 9, 2024

To expand on the previous comments:

A Pulp administrator would be provided with a pulpcore-manager command to create a Notifier (or some other name). The notifier would be mostly a path to a script that each worker and API process would have access to. The script would then perform any kind of action needed. The script would be provided with a list of environment variables. Some I can think of: REPOSITORY_NAME, TASK_TYPE, STATE.

To start with, the docs could provide an example script that POSTs the data to a web server.

For your use case @kahowell we could write a script that produces kafka messages. What do you think?
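
Purely as a sketch of what such a script could look like (the environment variable names follow the comment above; the topic name and the use of the confluent-kafka client are assumptions, not an existing Pulp contract):

```python
#!/usr/bin/env python3
# Hypothetical notifier script: forwards the event details passed in via
# environment variables to a Kafka topic. Variable names, topic, and client
# library are assumptions based on the discussion above.
import json
import os

from confluent_kafka import Producer

payload = {
    "repository_name": os.environ.get("REPOSITORY_NAME"),
    "task_type": os.environ.get("TASK_TYPE"),
    "state": os.environ.get("STATE"),
}

producer = Producer({"bootstrap.servers": os.environ.get("KAFKA_BOOTSTRAP_SERVERS", "localhost:9092")})
producer.produce("pulpcore.notifications", value=json.dumps(payload).encode("utf-8"))
producer.flush(10)
```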

dkliban (Member) commented May 9, 2024

@sdherr You make a good point about the cost of implementation. There is a lot of boilerplate code that would be duplicated. Let's integrate directly with Kafka. A few places where I think a notification is appropriate are: Repository Version created, Publication created, Distribution created, Distribution updated, and Distribution deleted.

daviddavis (Contributor) commented May 9, 2024

That sounds good. Is the idea to make this generic enough so that in theory, other tasks/events could be added eventually? Also, I think you left out one of the events that @kahowell requested (export create).

As for fields, I think ideally the message should have task id/href so that we can query the API if there is some data that we need that wasn't passed in the message.

dkliban (Member) commented May 9, 2024

The messages could just be integrated into the tasking system. So when a task finishes, a message is emitted. The message has a task href, task type, state, and the resources associated with it.
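
Sketching that shape (the field names are assumptions rather than a finalized schema), the body of such a task-completion message might contain:

```python
# Hypothetical body of a task-completion message; field names are assumptions
# based on the comments above, not a finalized schema.
task_finished_message = {
    "task_href": "/pulp/api/v3/tasks/<uuid>/",  # usable to query the API for anything not in the message
    "task_type": "pulp_file.app.tasks.synchronizing.synchronize",
    "state": "completed",
    "resources": [
        "/pulp/api/v3/repositories/file/file/<uuid>/",
    ],
}
```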

@mdellweg (Member)

Webhooks and Websockets come to my mind too.

decko (Member) commented May 13, 2024

So, a thing generic enough that you could plug in any sort of "connector" here: a Kafka publisher, Webhooks and Websockets, cloud notification services, and so on.

@mdellweg (Member)

Or we just feed Kafka and all the others are created as consumers of that.
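
To illustrate that approach, a downstream relay could consume Pulp's Kafka messages and forward them wherever they are needed, for example to a webhook. The topic below matches the one shown in the oci-env example further down; the group id and receiver URL are assumptions:

```python
# Sketch of a downstream consumer that relays Pulp's Kafka notifications to a
# webhook. Group id and receiver URL are assumptions; the topic matches the
# one used in the oci-env example below.
import urllib.request

from confluent_kafka import Consumer

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "pulp-webhook-relay",
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["pulpcore.tasking.status"])

try:
    while True:
        msg = consumer.poll(1.0)
        if msg is None or msg.error():
            continue
        request = urllib.request.Request(
            "https://example.com/pulp-events",  # assumed webhook receiver
            data=msg.value(),
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        urllib.request.urlopen(request, timeout=10).close()
finally:
    consumer.close()
```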

kahowell added a commit to kahowell/pulpcore that referenced this issue May 21, 2024
To try locally, you can use the oci-env kafka profile from pulp/oci_env#159.

Set up the oci-env to use the kafka profile:

```
COMPOSE_PROFILE=kafka
```

From a fresh oci-env pulp instance, try:

```shell
export REPO_NAME=$(head /dev/urandom | tr -dc a-z | head -c5)
export REMOTE_NAME=$(head /dev/urandom | tr -dc a-z | head -c5)
oci-env pulp file repository create --name $REPO_NAME
oci-env pulp file remote create --name $REMOTE_NAME \
    --url 'https://fixtures.pulpproject.org/file/PULP_MANIFEST'
oci-env pulp file repository sync --name $REPO_NAME --remote $REMOTE_NAME
```

Then inspect the kafka message that is produced via:

```shell
oci-env exec -s kafka \
  /opt/kafka/bin/kafka-console-consumer.sh \
  --bootstrap-server=localhost:9092 \
  --offset earliest \
  --partition 0 \
  --topic pulpcore.tasking.status \
  --max-messages 1
```

Closes pulp#5337
kahowell linked a pull request May 21, 2024 that will close this issue