Export/Publish notifications #5337

Open · kahowell opened this issue May 8, 2024 · 12 comments · May be fixed by #5405

kahowell commented May 8, 2024

Is your feature request related to a problem? Please describe.
As a service that integrates with Pulp, I'd like to be notified when there is any change to the availability of content.

Describe the solution you'd like

I'd like to be able to subscribe to machine-readable notifications. I expect these to come across via a messaging protocol (e.g. AMQP or Kafka). Ideally, the solution should be messaging protocol agnostic, so that different deployment options can be supported. (I'd personally recommend Kafka as a baseline).

There should be a versioned schema for the notification messages. (I'd personally recommend JSON Schema written in YAML for simplicity).

For interoperability, CloudEvents in structured mode should be used as the message format.
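
For illustration only (nothing here is specified by this request beyond the points above), a structured-mode CloudEvent for a hypothetical "publication created" notification could be assembled roughly like this; the event type, source, and data fields are assumptions:

```python
import json
import uuid
from datetime import datetime, timezone

# Hypothetical structured-mode CloudEvent for a "publication created" event.
# The type, source, and data fields are illustrative assumptions only.
event = {
    "specversion": "1.0",
    "id": str(uuid.uuid4()),
    "source": "pulp.example.com",                    # assumed source identifier
    "type": "org.pulpproject.publication.created",   # assumed event type
    "time": datetime.now(timezone.utc).isoformat(),
    "datacontenttype": "application/json",
    "data": {
        "publication_href": "/pulp/api/v3/publications/file/file/<uuid>/",
        "repository_version": "/pulp/api/v3/repositories/file/file/<uuid>/versions/1/",
    },
}

# In structured mode, the whole envelope (attributes plus data) is
# serialized into the message body.
message_body = json.dumps(event).encode("utf-8")
```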

Describe alternatives you've considered

  • Polling the APIs can be used as a stop-gap, at the expense of API load.
  • A publish/subscribe-style REST interface (see e.g. Google Cloud Pub/Sub), but this seems much more complex than simply integrating with a messaging platform.
  • Configurable callbacks to external systems to notify them of changes, but this introduces unnecessary coupling and resiliency challenges.
@daviddavis (Contributor)

We've requested a very similar feature: #4785

dkliban (Member) commented May 9, 2024

@daviddavis would kafka messages suffice?

@daviddavis (Contributor)

@dkliban I think we were kind of hoping for something similar to how Pulp handles signing services. I think such a feature would be more flexible for users who could use it to create notifications (or whatever else they want). That said, it would be more work for users who only want notifications.

If Pulp was set on notifications, I wonder if maybe redis (or I guess valkey) would be an option? Especially since it's already part of the Pulp stack. For Kafka, I do see there is a service that Azure provides that maybe we could use. I don't know anything about it but I could dig into it more if you all were thinking of only supporting it.
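
For context on the Redis/Valkey idea (the channel name and payload below are assumptions), publishing a notification over pub/sub is essentially a one-liner with redis-py, which also works against Valkey:

```python
import json

import redis  # redis-py also speaks to Valkey

# Sketch of the Redis/Valkey alternative: fire a notification on a pub/sub
# channel. Channel name and payload are assumptions for illustration.
client = redis.Redis(host="localhost", port=6379)
client.publish(
    "pulp.notifications",
    json.dumps({"event": "publication.created", "href": "/pulp/api/v3/publications/file/file/<uuid>/"}),
)
```

One caveat with plain pub/sub is that messages are not retained for subscribers that are offline; Redis Streams would be closer to Kafka semantics.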

dkliban (Member) commented May 9, 2024

@daviddavis I agree that calling into a script that is provided by the administrator would be the most flexible. We could then provide some examples of scripts that perform calls to web servers and send notifications in our docs.

sdherr (Contributor) commented May 9, 2024

Hmm, I think I disagree that calling a script in a subprocess (like we do for SigningServices) is the "right" thing to do here. Event / messaging services are the modern solution that exists to fill this gap.

On the other hand, if you do just add a script-runner, we could just write a 10-line script to send a message where we need it to go. Maybe that does make sense from a minimal-dependency perspective. But there's a not-insignificant amount of code in Pulp for registering / using SigningServices, and you probably would have to duplicate a lot of it to create something similar for a publication-notification script, so you may be making it harder on yourselves than just plugging into a standard solution. Or maybe integrating with a Kafka-esque service would make sense if you had a larger use-case for it than just publication notifications.

dkliban (Member) commented May 9, 2024

To expand on the previous comments:

A Pulp administrator would be provided with a pulpcore-manager command to create a Notifier (or some other name). The notifier would be mostly a path to a script that each worker and API process would have access to. The script would then perform any kind of action needed. The script would be provided with a list of environment variables. Some I can think of: REPOSITORY_NAME, TASK_TYPE, STATE.

To start with, the docs could provide an example script that POSTs the data to a web server.

For your use case @kahowell we could write a script that produces kafka messages. What do you think?
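
Purely as a sketch of what such a script could look like (the environment variable names follow the comment above; the topic name and the use of the confluent-kafka client are assumptions, not an existing Pulp contract):

```python
#!/usr/bin/env python3
# Hypothetical notifier script: forwards the event details passed in via
# environment variables to a Kafka topic. Variable names, topic, and client
# library are assumptions based on the discussion above.
import json
import os

from confluent_kafka import Producer

payload = {
    "repository_name": os.environ.get("REPOSITORY_NAME"),
    "task_type": os.environ.get("TASK_TYPE"),
    "state": os.environ.get("STATE"),
}

producer = Producer({"bootstrap.servers": os.environ.get("KAFKA_BOOTSTRAP_SERVERS", "localhost:9092")})
producer.produce("pulpcore.notifications", value=json.dumps(payload).encode("utf-8"))
producer.flush(10)
```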

dkliban (Member) commented May 9, 2024

@sdherr You make a good point about the cost of implementation. There is a lot of boilerplate code that would be duplicated. Let's integrate directly with Kafka. A few places where I think a notification is appropriate are: Repository Version created, Publication created, Distribution created, Distribution updated, and Distribution deleted.

daviddavis (Contributor) commented May 9, 2024

That sounds good. Is the idea to make this generic enough so that in theory, other tasks/events could be added eventually? Also, I think you left out one of the events that @kahowell requested (export create).

As for fields, I think ideally the message should have task id/href so that we can query the API if there is some data that we need that wasn't passed in the message.

dkliban (Member) commented May 9, 2024

The messages could just be integrated into the tasking system. So when a task finishes, a message is emitted. The message has a task href, task type, state, and the resources associated with it.
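
Sketching that shape (the field names are assumptions rather than a finalized schema), the body of such a task-completion message might contain:

```python
# Hypothetical body of a task-completion message; field names are assumptions
# based on the comments above, not a finalized schema.
task_finished_message = {
    "task_href": "/pulp/api/v3/tasks/<uuid>/",  # usable to query the API for anything not in the message
    "task_type": "pulp_file.app.tasks.synchronizing.synchronize",
    "state": "completed",
    "resources": [
        "/pulp/api/v3/repositories/file/file/<uuid>/",
    ],
}
```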

@mdellweg (Member)

Webhooks and Websockets come to my mind too.

decko (Member) commented May 13, 2024

So, a thing generic enough that you could plug in any sort of "connector" here: a Kafka publisher, Webhooks and Websockets, cloud notification services, and so on.

@mdellweg (Member)

Or we just feed Kafka and all the others are created as consumers of that.
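
To illustrate that approach, a downstream relay could consume Pulp's Kafka messages and forward them wherever they are needed, for example to a webhook. The topic below matches the one shown in the oci-env example further down; the group id and receiver URL are assumptions:

```python
# Sketch of a downstream consumer that relays Pulp's Kafka notifications to a
# webhook. Group id and receiver URL are assumptions; the topic matches the
# one used in the oci-env example below.
import urllib.request

from confluent_kafka import Consumer

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "pulp-webhook-relay",
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["pulpcore.tasking.status"])

try:
    while True:
        msg = consumer.poll(1.0)
        if msg is None or msg.error():
            continue
        request = urllib.request.Request(
            "https://example.com/pulp-events",  # assumed webhook receiver
            data=msg.value(),
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        urllib.request.urlopen(request, timeout=10).close()
finally:
    consumer.close()
```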

kahowell added a commit to kahowell/pulpcore that referenced this issue May 21, 2024
To try locally, you can use the oci-env kafka profile from pulp/oci_env#159.

Set up the oci-env to use the kafka profile:

```
COMPOSE_PROFILE=kafka
```

From a fresh oci-env pulp instance, try:

```shell
export REPO_NAME=$(head /dev/urandom | tr -dc a-z | head -c5)
export REMOTE_NAME=$(head /dev/urandom | tr -dc a-z | head -c5)
oci-env pulp file repository create --name $REPO_NAME
oci-env pulp file remote create --name $REMOTE_NAME \
    --url 'https://fixtures.pulpproject.org/file/PULP_MANIFEST'
oci-env pulp file repository sync --name $REPO_NAME --remote $REMOTE_NAME
```

Then inspect the kafka message that is produced via:

```shell
oci-env exec -s kafka \
  /opt/kafka/bin/kafka-console-consumer.sh \
  --bootstrap-server=localhost:9092 \
  --offset earliest \
  --partition 0 \
  --topic pulpcore.tasking.status \
  --max-messages 1
```

Closes pulp#5337
kahowell linked a pull request May 21, 2024 that will close this issue