fedmsg should provide support for schemas #404
Comments
I won't block this. I think I'm ... neutral on it. :) Some thoughts to bring up:
|
Cool. To expand upon this a little (and maybe this should be an entirely separate issue since schemas are just a part of it), the current interface is causing me pain in a few ways, but they mostly boil down to the fact that you can throw anything and everything onto the message bus (which, although nice, has a lot of downsides). The activity I was engaged in when I filed this was trying to update the
The Process Today
The current process for updating a fedmsg format is:
That's a lot of steps to go through, and it's easy to forget one (as we have all seen on a regular basis) or miss a user. It also means that, as a consumer, I have to move to the new format in lockstep with the publishing application.
Potential New Process
What if, instead of having Processors, we have a This would lead to something like this when creating a new message:
And when you need to update your message:
TL;DR
The basic problem is that we use fedmsg to let applications interface with each other, but there's nowhere to define, document, deprecate, etc. the API. It's really easy to get something out there, but it's really hard to maintain, refine, and improve your interface. I'm not really breaking new ground here; this is a problem people recognized long before me, and they made tools like protocol buffers to handle it. Maybe we could leverage these tools. I haven't done an in-depth investigation to say whether that's worthwhile or not. Anyway, those are my meandering thoughts. They're certainly in need of refinement, and quite possibly not worth acting upon.
TL;DR for the TL;DR
😢 |
Just a quick note on this: since we still have to support past messages, we may gain on processing newer messages, but we will need to keep the current code in place, so we may end up adding code for the new schemas without removing any/much. |
One thing that @abompard mentioned once, and which is totally doable and likely fairly easy, is just to add a version to the message. This way (barring mistakes or bugs) we can easily bump the version in the producer and adjust the behaviour of the consumer accordingly. |
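To make the versioning idea concrete, here is a minimal sketch of how a consumer might branch on a version field. Everything in it is an assumption for illustration: the `version` key, the `owner`/`owners` fields, and the handler names are hypothetical and not part of fedmsg today.

```python
def handle_v1(body):
    # Hypothetical original format: a single "owner" string.
    return [body["owner"]]


def handle_v2(body):
    # Hypothetical newer format: a list under "owners".
    return list(body["owners"])


# Map each known message version to the code that understands it.
HANDLERS = {1: handle_v1, 2: handle_v2}


def owners_from_message(message):
    # Messages published before versioning existed carry no "version" key,
    # so this sketch assumes they are version 1.
    version = message.get("version", 1)
    try:
        handler = HANDLERS[version]
    except KeyError:
        raise ValueError("unsupported message version: %r" % (version,))
    return handler(message["msg"])
```

With something like this, the producer bumps the version when the format changes, and the consumer keeps the old handler around for as long as old messages still need to be read.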
I'm a +1 for having schemas on our messages. The protocol buffers thing looks nice. |
I've done some investigation about how this API might get implemented. It was satisfying to see the pyzmq documentation recommend the approach I had in mind, but unfortunately there's a bit of a snag. The problem is how fedmsg is abstracting the ZMQ underpinnings. ZMQ messages are published by fedmsg. However, the subscriber code (including the bits that would let us manipulate incoming messages prior to handing them to consumers) lives in moksha. It seems (from my investigation of moksha) that we use it to support various messaging technologies besides ZMQ. However, the fedmsg documentation does not give any indication (that I can find) that this is a focus of fedmsg, and within Fedora Infrastructure we don't use it (as far as I know). This leads me to ask a few questions:
|
fedmsg now has code that allows using it with a message bus other than zmq; @ralphbean recently added some changes for this, among others in #380 and #387. |
I guess what I'm driving at is: what does fedmsg want to be? Does it aim to be a high-level messaging library very much like kombu? Does it want to focus exclusively on ZMQ and make that experience very easy and clear? Something else? I've used fedmsg quite a bit now and I've read the docs, but I don't know what fedmsg's goal is, exactly. |
I did some digging and found fedmsg used to have schema support, but it was removed early on. I'd like to propose that we work towards adding them back. Here are some of the benefits I see for supporting schemas:
We already have schemas, but they aren't explicit. Schemas are defined in the message processors by the series of try/except blocks sharded across many functions. Take, for example, the packages function. Buried in that scary set of blocks is the schema for all pkgdb messages relating to packages.
When the schema of messages is validated before publishing and after receiving, you can be confident of the message structure and, when the structure changes, it's very clear to both the publisher and receiver. This helps developers avoid situations where they accidentally change their message format and break the world when they deploy the new version.
There's a one-stop shop for all message formats and their history.
We can make message processing much cleaner. Rather than huge try/except blocks, an interface is defined (basically the BaseProcessor) that each schema declaration implements. This way you have one class for a message type that has its schema, and how to get at the information without knowing much about its schema (if you so choose).
I realize this is a large architectural change, but I think it'll make working with fedmsg much easier for developers. What does everyone think?
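As a rough illustration of the "one class per message type" idea, here is a minimal sketch. The names `MessageSchema`, `PackageNew`, `required`, and `summary`, as well as the topic string and field names, are all hypothetical; this is not an existing fedmsg API, just one shape it could take.

```python
class MessageSchema(object):
    """Hypothetical base class: one subclass per message type."""

    topic = None
    required = {}  # field name -> expected Python type

    def __init__(self, body):
        self.body = body
        # Validate on construction, so the same check can run both
        # before publishing and after receiving.
        self.validate()

    def validate(self):
        for field, expected in self.required.items():
            if field not in self.body:
                raise ValueError("%s: missing field %r" % (self.topic, field))
            if not isinstance(self.body[field], expected):
                raise ValueError("%s: field %r should be %s"
                                 % (self.topic, field, expected.__name__))

    @property
    def summary(self):
        # Each message type says how to describe itself.
        raise NotImplementedError


class PackageNew(MessageSchema):
    # Illustrative topic and fields only; real pkgdb messages may differ.
    topic = "pkgdb.package.new"
    required = {"name": str, "agent": str}

    @property
    def summary(self):
        return "%s added package %s" % (self.body["agent"], self.body["name"])
```

A consumer could then write `PackageNew(body).summary` and get either a usable object or an immediate, clearly worded error, instead of a `KeyError` deep inside a try/except block.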