Publisher only sends a message if at least one (ign-transport) subscriber exists #225

Closed
FirefoxMetzger opened this issue Mar 11, 2021 · 3 comments
Labels
help wanted Extra attention is needed

Comments

@FirefoxMetzger

Environment

  • OS Version: Ubuntu 20.04 kernel 5.4.91-microsoft-standard-WSL2
  • Source or binary build? binary build (9.X) from Ignition Dome

Description

  • Expected behavior: Publishers should send messages irrespective of subscribers being present. Since most of the current communication uses TCP (instead of UDP/multicast), this should apply in a weakened form: publishers should send messages as long as there is at least one active connection on the socket.
  • Actual behavior: Publishers appear to only send messages if at least one subscriber is connected via ign-transport and ignore the presence of other subscribers that directly connect to the underlying zmq-socket.

Steps to reproduce

  1. Start gazebo with an empty world (ign gazebo)
  2. In a new shell run the python script shared below
  3. Notice that the script halts and prints nothing
  4. In a new shell subscribe an echo subscriber to the /clock topic (ign topic -e -t /clock)
  5. Notice that the echo subscriber floods the console with timestamp messages
  6. Also notice that the python subscriber now, too, floods the console with received messages; each line corresponds to a single message for the listed topic.
  7. Terminate the echo subscriber
  8. Notice that the python subscriber, too, stops receiving new messages (indicated by it halting and no longer printing)
import zmq
import subprocess
import socket
import os
import pwd

ctx = zmq.Context()
zmq_socket = ctx.socket(zmq.SUB)
ign_topic = "/clock"


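# Ask ign-transport where the publisher lives. This parses the CLI's
# human-readable output, which is a bad hack and should be implemented more cleanly.
result = subprocess.check_output(["ign", "topic", "-i", "-t", ign_topic])
# The second output line is expected to hold "<address>, <message type>".
address = result.decode("utf-8").split("\n")[1].split(",")[0].strip()
zmq_socket.connect(address)

# Topics are namespaced by a partition, assumed here to be "<hostname>:<username>".
# The ZMQ subscription prefix must match "@/<partition>@<topic>".
host_name = socket.gethostname()
user_name = pwd.getpwuid(os.getuid())[0]
zmq_socket.subscribe(f"@/{host_name}:{user_name}@{ign_topic}")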

while True:
    try:
        print(zmq_socket.recv_multipart()[0])
    except zmq.ZMQError:
        break
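
The address lookup and the subscription filter in the script above could be factored into two small helpers. This is only a sketch: the function names `parse_publisher_address` and `build_zmq_filter` are hypothetical, the sample CLI output is made up to match the layout the script assumes (an address followed by a comma), and the `@/<host>:<user>@<topic>` filter layout is taken from the script itself rather than from any ign-transport documentation.

```python
import re

# Matches the first ZeroMQ TCP endpoint in a blob of text.
_ENDPOINT = re.compile(r"tcp://[^\s,]+")


def parse_publisher_address(cli_output: str) -> str:
    """Return the first tcp:// endpoint found in `ign topic -i` output.

    Assumes publishers are listed as "<address>, <message type>".
    """
    match = _ENDPOINT.search(cli_output)
    if match is None:
        raise ValueError("no publisher address found; is the topic advertised?")
    return match.group(0)


def build_zmq_filter(host_name: str, user_name: str, topic: str) -> str:
    """Build the subscription prefix in the same layout the script uses."""
    return f"@/{host_name}:{user_name}@{topic}"


# Made-up sample output, for illustration only.
sample = (
    "Publishers [Address, Message Type]:\n"
    "  tcp://172.17.0.1:40000, ignition.msgs.Clock\n"
)
address = parse_publisher_address(sample)
topic_filter = build_zmq_filter("myhost", "me", "/clock")
print(address)
print(topic_filter)
```

Searching for the endpoint with a regular expression avoids depending on the exact line and whitespace layout of the CLI output, and it fails loudly when no publisher is listed.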

I realize that this usage is (currently 👼) quite the hack and does not follow the "intended" way of subscribing to topics. My use case is that I need access to sensor data from within Python. The message layer is already language-independent (protobuf+zmq), and it seems cleaner to use native zmq and protobuf libraries to subscribe to topics than to create an ign-transport wrapper in my language of choice. Potentially, this approach could scale to all ZMQ-supported languages (JS, Python, Rust, Java, MATLAB (via Java) ...).

I have a related question on gazebosim, but it seems rather unpopular. Now that I've made some progress on it via trial-and-error, it looks more like a bug (or is it a design decision?).

I have only surface-level familiarity with the codebase, so I'm guessing a lot about what's actually happening, and here I'm at a bit of a loss. Does ign-transport mediate messaging (e.g. via a central message broker)?

If somebody could help reduce my confusion, I'll be happy to help with a PR (if applicable).

@FirefoxMetzger FirefoxMetzger added the bug Something isn't working label Mar 11, 2021
@osrf-triage osrf-triage added this to Inbox in Core development Mar 11, 2021
@chapulina chapulina added help wanted Extra attention is needed and removed bug Something isn't working labels Mar 15, 2021
@chapulina chapulina removed this from Inbox in Core development Mar 15, 2021
@caguero
Contributor

caguero commented May 20, 2021

This is by design: publishing a message requires serializing it, among other things, so skipping that work saves some CPU cycles when nobody is subscribed.
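
As a rough illustration of that trade-off (not the actual ign-transport code; the `LazyPublisher` class and all of its members are hypothetical), a publisher can check its list of known subscribers before doing any serialization work:

```python
from typing import Callable, List


class LazyPublisher:
    """Toy publisher that skips serialization when nobody is listening."""

    def __init__(self) -> None:
        # Hypothetical registry of subscribers known to the transport layer.
        self.subscribers: List[Callable[[bytes], None]] = []
        self.serialize_calls = 0  # counts how often serialization was paid for

    def _serialize(self, message: str) -> bytes:
        # Stand-in for an expensive step such as protobuf serialization.
        self.serialize_calls += 1
        return message.encode("utf-8")

    def publish(self, message: str) -> bool:
        # Early exit: without known subscribers, nothing is serialized or sent.
        if not self.subscribers:
            return False
        payload = self._serialize(message)
        for deliver in self.subscribers:
            deliver(payload)
        return True


pub = LazyPublisher()
sent_without_subscriber = pub.publish("tick")

received: List[bytes] = []
pub.subscribers.append(received.append)
sent_with_subscriber = pub.publish("tock")

print(sent_without_subscriber, sent_with_subscriber, pub.serialize_calls)
```

In a design like this, a client that connects to the underlying socket directly never shows up in the registry, which would be consistent with the behavior reported in this issue.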

@FirefoxMetzger
Author

FirefoxMetzger commented May 20, 2021

@caguero Do you happen to know the section in the code where this happens?

Maybe my understanding is wrong, but I thought the zmq+protobuf solution exists precisely because we want cheap serialization. ZMQ should not copy the message internally (it's zero-copy, after all), and PUB sockets without subscribers already drop messages before they hit the wire. I don't see protobuf doing much work either, since the binary format is optimized for space and we compile the message to get language-native bindings that interface directly with the objects; is the C binding somehow inefficient?

I can see how this could be a problem if we had to construct a fresh protobuf object from a different data structure each time. In that case, my question would be why the two representations differ in the first place.

@azeey
Contributor

azeey commented Jan 26, 2024

I don't think we will change the behavior here. Depending on the message type, serialization is not cheap, so if there are no subscribers, it doesn't make sense to pay that cost. Also, we now have Python bindings for gz-transport, so you should be able to subscribe to topics.

@azeey azeey closed this as completed Jan 26, 2024