Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Instance without Command Handlers cannot be discovered #327

Closed
kagof opened this issue Oct 24, 2023 · 4 comments · Fixed by #330
Closed

Instance without Command Handlers cannot be discovered #327

kagof opened this issue Oct 24, 2023 · 4 comments · Fixed by #330

Comments

@kagof
Copy link

kagof commented Oct 24, 2023

Basic information

  • Axon Framework version: 4.6.8
  • JDK version: 11
  • Spring Cloud Extension version: 4.6.0
  • Complete executable reproducer if available (e.g. GitHub Repo): None, but described below

Steps to reproduce

Have an application using the SpringCloudCommandRouter with the RestCapabilityDiscoveryMode which sends commands, but does not have any command handlers.

Expected behaviour

I'd expect this class to be able to send commands to other applications registered with the SpringCloudCommandRouter.

Actual behaviour

We get a "No node known to accept command" error when attempting to send any command from the application with no handlers.

We see the following message in our logs:

o.a.e.s.c.m.RestCapabilityDiscoveryMode : Failed to receive the capabilities from ServiceInstance [<application name>] under host [<ip address>] and port [<port number>]. Will temporarily set this instance to deny all incoming messages.

for each instance the application finds through service discovery. This explains why it cannot send commands; the application has an exception when attempting to discover capabilities on any other application, and so records them as not having any capabilities.

Enabling debug logging allows us to see the stacktrace from the RestCapabilityDiscoveryMode:

RestCapabilityDiscoveryMode  : Service Instance [<instance info>] is denying all messages due to the following exception: 

java.lang.NullPointerException: null
	at org.axonframework.extensions.springcloud.commandhandling.mode.RestCapabilityDiscoveryMode.isLocalServiceInstance(RestCapabilityDiscoveryMode.java:123)
	at org.axonframework.extensions.springcloud.commandhandling.mode.RestCapabilityDiscoveryMode.requestMessageRoutingInformation(RestCapabilityDiscoveryMode.java:107)
	at org.axonframework.extensions.springcloud.commandhandling.mode.RestCapabilityDiscoveryMode.capabilities(RestCapabilityDiscoveryMode.java:91)
	at org.axonframework.extensions.springcloud.commandhandling.SpringCloudCommandRouter.updateMemberships(SpringCloudCommandRouter.java:223)
	at org.axonframework.extensions.springcloud.commandhandling.SpringCloudCommandRouter.resetLocalMembership(SpringCloudCommandRouter.java:183)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
...

According to the stacktrace the error appears to be coming from this method:

    private boolean isLocalServiceInstance(ServiceInstance serviceInstance) {
        return Objects.equals(serviceInstance, localInstance.get())
                || Objects.equals(serviceInstance.getUri(), localInstance.get().getUri());
    }

Debugging through it confirms that localInstance is an empty AtomicReference, and so the NPE comes from calling .get().getUri() on it.

Extra info

It looks like the RestCapabilityDiscoveryMode expects AbstractCapabilityDiscoveryMode#updateLocalCapabilities to be called at least once, as it has the side effect of setting the reference for the localInstance. But it seems that since the localCapabilities are already correct by default if there are no command handlers on the local instance, that method never gets called and thus the side effect never occurs.

This appears to be very similar to #1; maybe you could call it a regression, although that issue was for the old Metadata based approach, so that's maybe unfair.

Workarounds

Adding a class like this to the application:

@Service
public class FooCommandHandler {
    @CommandHandler
    public void handle(final Foo command) {}

    public static class Foo {

    }
}

causes the updateLocalCapabilities method to be called and gets rid of the NullPointerException thus allowing commands to be routed properly.

A (slightly?) more graceful workaround we've found is to just call

commandBus.updateLoadFactor(commandBus.getLoadFactor());

in the method where we configure the DistributedCommandBus bean, which also forces the updateLocalCapabilities method to be called and allows command routing.

@smcvb
Copy link
Member

smcvb commented Oct 25, 2023

Fix

Thanks for this detailed bug description, @kagof!

You have definitely found a misser on our end here.
And a scenario that should be applicable, namely services that only dispatch commands but don't handle any.

Checking the implementation, I think we would also be saved by setting a fixed NoOpServiceInstance in the constructor of the AbstractCapabilityDiscoveryMode.
As long as we ensure it never matches with the ServiceInstance passed during the CapabilityDiscoveryMode#capabilities(ServiceInstance) call, we should be good.

What's your thought on that?

Release

Although small, I do think the fix would merit a release. Especially since the traffic on our extensions is low, such an issue with a workaround is still worthy of a new release I think.
However, we are currently on Axon Framework 4.9.0 and Axon Spring Cloud Extension 4.9.0.

If I would release a fix in, let's say, 4.6.1 of the Spring Cloud Extension, we'd be required to release a 4.7.1, 4.8.1, and 4.9.1 too.
To safe us (read: the framework team) some work, it would be awesome if you'd be helped with a 4.9.1 release of this.

So, if possible, I'd like your 2-cents on that. Thanks in advance! 🙏

@smcvb smcvb self-assigned this Oct 25, 2023
@smcvb smcvb changed the title unable to send commands due to a NullPointerException in RestCapabilityDiscoveryMode when an application doesn't have any command handlers Instance without Command Handlers cannot be discovered Oct 25, 2023
@kagof
Copy link
Author

kagof commented Oct 25, 2023

Hey @smcvb , thanks for the speedy reply. Your proposed solution makes sense to me!

I do understand not wanting to manage a backport of this fix all the way back to 4.6, I'm sure that's a lot of extra effort manage 4 versions instead of just 1.

Since the workaround we mentioned above is quite simple (just calling commandBus#updateLoadFactor at startup time) and seems to work, I think it is pretty reasonable to just keep the fix in the 4.9.x stream. For now, we'll just keep the workaround in place until we're able to upgrade to 4.9 and pick up the more elegant solution.

@smcvb smcvb added this to the Release 4.9.1 milestone Oct 26, 2023
smcvb added a commit that referenced this issue Oct 26, 2023
Adjust test to not update the local capabilities, as this reflects the
behavior when a node never updates its capabilities since it has not
command handlers

#327
smcvb added a commit that referenced this issue Oct 26, 2023
Default the ServiceInstance to a fixed URI version i.o. null. Doing so,

#327
smcvb added a commit that referenced this issue Oct 26, 2023
Drop test prefix from tests, as that's the new style

#327
smcvb added a commit that referenced this issue Oct 26, 2023
Add descriptive JavaDoc to the NoopUriServiceInstance

#327
smcvb added a commit that referenced this issue Oct 26, 2023
Rename the nop service instance to FixedURIServiceInstance

#327
smcvb added a commit that referenced this issue Oct 26, 2023
…ommand-handlers

[#327] Default local `ServiceInstance` to a fixed `URI` i.o. `null`
@smcvb
Copy link
Member

smcvb commented Oct 26, 2023

Thanks for the feedback, @kagof!
I went ahead and made the fix today, and merged it already :-)
A release will follow soon.

Thanks again for bringing this to our attention!!

@kagof
Copy link
Author

kagof commented Oct 31, 2023

Much appreciated, thanks for resolving this so quickly @smcvb 🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants