Skip to content
This repository has been archived by the owner on Aug 9, 2024. It is now read-only.

fedmsgs converted to AMQP include the entire fedmsg (not just the msg) as message.body #20

Closed
AdamWill opened this issue Jun 8, 2019 · 17 comments

Comments

@AdamWill
Copy link

AdamWill commented Jun 8, 2019

When the zmq_to_amqp bridge catches a fedmsg and emits an AMQP message, what it winds up storing in the AMQP message's body is the entire fedmsg. So for instance, for this fedmsg, the corresponding AMQP message.body is this dict:

{
  "username": "apache", 
  "source_name": "datanommer", 
  "certificate": "blahblah", 
  "i": 1, 
  "timestamp": 1559956368.0, 
  "msg_id": "2019-f059828d-0969-4545-be7a-db8334f6a71f", 
  "crypto": "x509", 
  "topic": "org.fedoraproject.prod.pungi.compose.status.change", 
  "headers": {}, 
  "signature": "blahblah", 
  "source_version": "0.9.0", 
  "msg": {
    "status": "FINISHED", 
    "release_type": "updates", 
    "compose_label": "Update-20190608.0059", 
    "compose_respin": 0, 
    "compose_date": "20190608", 
    "release_version": "7", 
    "location": "https://kojipkgs.fedoraproject.org/compose/updates/Fedora-Epel-7-updates-20190608.0/compose", 
    "compose_type": "production", 
    "release_is_layered": false, 
    "release_name": "Fedora-Epel", 
    "release_short": "Fedora-Epel", 
    "compose_id": "Fedora-Epel-7-updates-20190608.0"
  }
}

So if you want, for instance, the 'location', your AMQP consumer has to do this:

location = message.body['msg']['location']

The problem I see with this is that whenever the thing that publishes this message gets converted to fedora-messaging, it's probably not going to use that structure. It would be natural to put only what is the msg dict in the fedmsg into the AMQP message.body. So any consumer that has already been converted to AMQP is likely going to break when the publisher is converted.

It seems to me that it would be better for the message.body of the translated AMQP message to be just the msg dict from the original fedmsg. The other bits of the fedmsg could be provided as other attributes of the message.

Of course, changing this now will require any fedora-messaging consumers which already consume translated fedmsgs to change, but I think right now there aren't any (I just wrote three, but I don't know if there are many/any others).

@AdamWill
Copy link
Author

AdamWill commented Jun 8, 2019

@jeremycline

@abompard
Copy link
Member

abompard commented Jun 8, 2019

Yeah I agree with Adam's analysis. There are AMQP consumers of translated fedmsgs, I'm thinking of Bodhi for example which consumes messages from Greenwave and Koji. Greenwave has been converted to AMQP but Koji's migrated version is still in staging according to the board.

The keys that aren't in msg are already available in the AMQP message headers so I don't think we need to add them to the main dict. There's only the username key which may be useful so we can add it to the message headers too (all keys except the main msg key in fedmsg were intended as headers anyway).

Bodhi has actually been "fixed" because it was receiving in staging messages without the msg subkey. This, I think, is an issue, because Koji sends AMQP messages in staging but not in prod yet, so if that Bodhi changeset is deployed to prod, it will break because Bodhi will be received translated fedmsg that do contain the msg key.

So I think we should fix the bridges as Adam suggests. What do you think @jeremycline ?

@AdamWill
Copy link
Author

AdamWill commented Jun 8, 2019

So I figured it would make sense to make my consumers 'agnostic' so far as this is concerned. Perhaps Bodhi, and anything else that already consumes bridged fedmsgs, could do something similar? That should make it more robust if we do 'fix' the bridge, or as publishers are converted.

@jeremycline
Copy link
Member

Ooof, yeah I definitely thought only the value of "msg" was the body of bridged messages. Let's fix the bridge and make sure to run through all the active consumers and make sure they're also fixed before we deploy. It'll be easy to get a list of the consumers from the active connections on the broker so we don't forget anything.

@bowlofeggs
Copy link

Yeah this seems to be the cause of fedora-infra/bodhi#3304 which is causing Bodhi to be unable to process messages from Koji tag events. In my case it's actually OK because Bodhi has a backup process to handle the message handler not working (since we had to assume we missed fedmsgs), so it's not an emergency for me.

@bowlofeggs
Copy link

To be clear: Bodhi is expecting to receive the new format, not the old format, and Bodhi works in stg but not production because of this. But again, it's not an emergency, because Bodhi has a cron job that will make sure it becomes "eventually consistent", so things are working OK anyway.

jeremycline added a commit that referenced this issue Jun 10, 2019
AMQP publishers don't wrap everything in a "msg" key and the intention
was for the ZMQ->AMQP bridge to exactly mimic those messages. However,
the entire fedmsg was used as the body. This sets the body to the value
of the fedmsg's "msg" key and uses the value of the fedmsg's "headers"
key as the headers.

Fixes #20

Signed-off-by: Jeremy Cline <jcline@redhat.com>
@jeremycline
Copy link
Member

Alright, there's a PR posted. The only consumers I see in staging are bodhi, greenwave, and the-new-hotness. I propose we deploy this to staging once it's merged and get the-new-hotness and greenwave fixed up in staging ASAP. Once everything is working in staging we can roll out the change to prod before any new the-new-hotness, greenwave, or bodhi update (although bodhi sounds fine).

jeremycline added a commit that referenced this issue Jun 10, 2019
AMQP publishers don't wrap everything in a "msg" key and the intention
was for the ZMQ->AMQP bridge to exactly mimic those messages. However,
the entire fedmsg was used as the body. This sets the body to the value
of the fedmsg's "msg" key and uses the value of the fedmsg's "headers"
key as the headers.

Fixes #20

Signed-off-by: Jeremy Cline <jcline@redhat.com>
@AdamWill
Copy link
Author

I guess it's possible there are other consumers (like mine) using the 'public' account? Can you check what queues exist on that or anything?

@jeremycline
Copy link
Member

I see two connections to the public vhost on the staging broker, both have their app property set to "Fedora openQA" which I assume is you. So fortunately, no one is currently using the public staging broker except you.

@AdamWill
Copy link
Author

yup, those are both me. :P How about the public production broker?

@jeremycline
Copy link
Member

There is a FAF queue in production using the private vhost and the public vhost has:

"Example Application"
"Example Application"
"Example Application"
"yanetis dlkoji"
"joystick"
"Example Application"
"Fedora Messaging Zuul"
"Fedora openQA"

I think @sayanchowdhury might be responsible for joystick. Not sure who to contact* about Zuul or yanetis dlkoji, and the folks who didn't set a name for their consumers are out of luck, I suppose.

*I realize now that the documentation should tell folks to add a key to their client properties with contact information.

@abompard
Copy link
Member

I checked with Michal for The New Hotness, and his code is expecting the new format so it's not working in prod currently.
We can update the bridges in staging but all subscribers of Koji messages will see nothing because Koji has already been ported to Fedora Messaging in staging. But it's definitely good to test for regressions! :-)

@jeremycline
Copy link
Member

Okay, the change is in staging. I'll see about tracking down everyone is prod (except the example applications) to make them aware of the upcoming change.

@abompard
Copy link
Member

I looked at the bindings in prod and I think it's only Bodhi and The New Hotness, both are aware of the change. I haven't checked the public broker though.

@jeremycline
Copy link
Member

Okay, I pinged the people I knew to contact for the public broker, but I guess we'll put the fix out in prod and see who complains.

@AdamWill
Copy link
Author

so to be clear, this is changed in prod now?

@jeremycline
Copy link
Member

Correct

dustymabe added a commit to coreos/fedora-coreos-releng-automation that referenced this issue Jun 19, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants