Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Current Transport Provider architecture makes it hard to synchronise bootstrapping #213

Closed
brett-smith opened this issue May 18, 2023 · 8 comments

Comments

@brett-smith
Copy link
Contributor

While using dbus-java for simple access to an existing broker and exported services is very easy, I have found that some more complex uses require additional code and considerations that ideally could be better dealt with by dbus-java itself.

This has mostly turned up when using the embedded broker on non-Linux systems, where I bundle the broker along with a service. The service will attempt to start a broker, export some services on it, and then wait for a client to connect to the bus all in one JVM. All the while, a client will be attempting to connect to the bus from another JVM. If it happens to manage to make a physical connection to the bus, before the service has finished setting up (exporting services etc), then I may hit problems. The necessitates code (Thread.sleeps, polling code, excessive error checking) to deal with this.

A similar problem exists when trying to use DirectConnection, either in the same JVM or split over two or more. This is demonstrated by TestPeer2Peer. It only works as well as it does because there are a couple of Thread.sleeps() in there.

The crux of the problem is the blocking nature of the DirectConnectionBuilder.build() on the listening side. This will not return until the first client connection is received, so it is not possible to export services and have them ready for exactly when the client is "allowed in".
While withAutoConnect(false) helps, in that at least you can get the connection instance without waiting for a client, you still cannot easily synchronise reliably.

Going back to basics, this is a pattern I have used often for client/server sockets. It makes use of the natural blocking you get with a socket, and is totally reliable with no sleeps necessary for it to work (the Thread.sleep is just simulating work).

try(var srv = new ServerSocket(0))  /* Socket is bound at this point to a random port, but not listening */ {
		
	new Thread(()->{
		/* Different thread of execution. Can be same JVM in a Thread, or separate JVM */
		try(var clnt  = new Socket(InetAddress.getLocalHost(), srv.getLocalPort()) /* Will be blocked until ServerSocket.accept() is run */) {
			System.out.println("I got a " +  clnt.getInputStream().read());			
		}
		catch(Exception e) {
			//
		}
	}).start();
			
	//
	// Do other stuff that may take some time .....
	//
	Thread.sleep(5000);
			
	try(var srvClnt = srv.accept()) /* Will be blocked until thread opens socket */ {
		srvClnt.getOutputStream().write(new Random().nextInt(256));
	}
}

As it stands, this pattern is not possible with dbus-java due to it combining the binding and accepting of the sockets when dealing with listening sockets.

So in short, I think I am asking if you think this can decoupled, so it is possible to get a DirectConnection (or DBusConnection) for a listening transport and have it's listening socket bound, but not yet accepting until instructed to do so.

@hypfvieh
Copy link
Owner

Uff... this is a really big change.
I'm afraid that I cannot change the behavior in version 4.x.

Creating the accepting connection is done by the actual transport (native/jnr/tcp). So the blocking will happen there.
To get around this limitation one have to decouple the accept and connect call in the transports.
This will require changes in AbstractTransport and and will therefore force all third party transport providers to implement the changes as well (if there is any). That's why I won't change this in 4.x.

I may add the required changes and for 5.x. Just keep an eye on the 5.x-develop branch.
(Caution 5.x will require Java 17 due to usage of newer Java features).

@brett-smith
Copy link
Contributor Author

Heh, yup understood, was expecting such a response :) It's something that's been bothering me for a while, I think I can wait.

I hadn't noticed you'd started on a new major version, will definitely keep an eye on it. Most of the projects I use dbus-java in use Java 17 anyway, if not for source then at least at runtime. I look forward to it.

@brett-smith
Copy link
Contributor Author

I've just taken a look at the latest version 5 wrt to this change, and I think we are kind of half way there now?

As I understand it, the listenImpl() and connectImpl() are now separate operations. However, listenImpl() is still responsible for the socket.accept().

@hypfvieh
Copy link
Owner

The idea was to set autoConnect to false, so you can create the transport without connection instantly. Then do all the stuff you want to do and if you are done call listen() on the created instance to actually bind the socket and start accepting connections.

Working with ServerSocketChannel here is not possible because the jnr-transport does not extend ServerSocketChannel. Only workaround here would be to work with AbstractSelectableChannel which all transports derive from somehow - but this will cause other issues (e.g. accept() is not part of that class but of the ServerSocketChannel interface).

Someone has to call the socket.accept() and in my opinion this is something the transport should take care of.
Dbus-java-core expects to get a properly configured SocketChannel when listenImpl() or connectImpl() return.
I don't want dbus-java-core to return the underlying SocketChannel at some point.

Any suggestion?

@brett-smith
Copy link
Contributor Author

brett-smith commented Oct 19, 2023

I still think it should be possible to separate the binding of the listening socket and the accepting of client socket from that listening socket, but maybe something else can be done in EmbeddedDBusDaemon,the whole reason for this ticket in the first place.

To recap, the architecture of my app is ..

  • I have a service/demon type part. This contains an EmbededDBusDaemon, and a bunch of exported interfaces. This service could potentially be restarted at any point.
  • I have client type part(s). These come and go, and they also need to cope with the broker disappearing (not just the exported service like if you had a permanent broker like on Linux). If the broker disappears, they go into a polling mode waiting for it to come back.

The setup code for the embedded daemon looks something like.

daemon = new EmbeddedDBusDaemon(listenBusAddress);
daemon.setSaslAuthMode(authMode);
daemon.startInBackground();

/* The first ugly hack - sleep some magic amount of time to guarantee there is something listening for connections */
Thread.sleep(1000);

/* Despite the above, we still sometimes saw the above taking more than one second on slow machines, so to reliably 
  get a connection we loop until it fails.  And no, we can't just do this without the above sleep as well, as strange things
occasionally happen */

long expire = TimeUnit.SECONDS.toMillis(15) + System.currentTimeMillis();
while(System.currentTimeMillis() < expire) {
	try {
		conn = configureBuilder(DBusConnectionBuilder.forAddress(busAddress)).build();
		log.info("Connected to embedded DBus {}", busAddress);
		break;
	} catch (DBusException dbe) {
		try {
			Thread.sleep(1000);
		} catch (InterruptedException e) {
	        }
        }
}

/* Now we need to loop to request a bus name too. Without this, its possible we try to request the bus name 
before the "Hello" message message has been sent. */
while(true) {
    try {
        ((DBusConnection)conn).requestBusName("com.logonbox.vpn");
        break;
     }
     catch(DBusException dbe) {
         if(!(dbe.getCause() instanceof AccessDenied))
             throw dbe;
         Thread.sleep(500);
      }
}

/* It's finally safe to actually start exporting services on this bus. But be aware, its possible that a client managed to
   connect at this point, and sees the bus name, but doesn't yet see the exported services. So the client also needs to
   retry if this happens */
exportServices();

So what I would like to see at this level, is something more like this.

daemon = new EmbeddedDBusDaemon(listenBusAddress);
daemon.setSaslAuthMode(authMode);

 // after "startInBackground()",  I want guarantees i (or other clients) can immediately create a good connection
daemon.startInBackground();

// after this "build()", i want guarantees i can immediately use this connection with no errors
conn = configureBuilder(DBusConnectionBuilder.forAddress(busAddress)).build();
		log.info("Connected to embedded DBus {}", busAddress);

// i want this bus name request to always succeed without retrying
((DBusConnection)conn).requestBusName("com.logonbox.vpn");

// And now export everything. As for other clients who happen to connect during the above,  I
// can live with them having to retry if the expected service is not there, as long as the connection
// is *good*
exportServices();

As far as I can make, this would still not be possible with version 5. As you can see, the entry point for this problem is really kind of startInBackground(). I'm wondering if a callback may be better (either when running in background or foreground).

daemon = new EmbeddedDBusDaemon(listenBusAddress);
daemon.setSaslAuthMode(authMode);
daemon.startInForeground(() -> {
     /* This callback is called when the daemon is completely ready to accept a connection */

    // do all the other setup stuff
}));

hypfvieh added a commit that referenced this issue Oct 20, 2023
…callback after bind has been called (also usable in EmbeddedDBusDaemon)
@hypfvieh
Copy link
Owner

I just pushed changes related to this.

First of all, calling accept() and bind() on the server socket are now two distinct methods in the transport.
I also added the possibility to provide a callback (afterBindCallback) which is called after bindImpl() was called on the transport (so right after bind() but before accept()).
This callback can also be setup on the EmbeddedDBusDaemon which will pass it through to the transport.

Speaking of EmbeddedDBusDaemon: I updated the startInBackgroundAndWait method and added second variant which wait indefinitely. These methods will return after bindImpl() has been called on transport but accept() may not.

This should eliminate the needs for random sleeps. I'm not sure if this fixes other problems you mentioned, but we'll see.

@brett-smith
Copy link
Contributor Author

Awesome.

It works well. I don't seem to be seeing the problem with the request of the bus name happening before "Hello" either. I'll be working with this code over the coming weeks and it will get some solid testing, so am sure I will see it if it does still happen.

Thanks as always for your work, this is a nice improvement to using the embedded daemon.

@hypfvieh
Copy link
Owner

5.0.0 is now released.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants