Skip to content

ARTEMIS-3223 - ensure distribution uses the address from the message,…#3673

Merged
gtully merged 2 commits into
apache:mainfrom
gtully:ARTEMIS-3223
Sep 6, 2021
Merged

ARTEMIS-3223 - ensure distribution uses the address from the message,…#3673
gtully merged 2 commits into
apache:mainfrom
gtully:ARTEMIS-3223

Conversation

@gtully
Copy link
Copy Markdown
Contributor

@gtully gtully commented Jul 26, 2021

… rather than the address from the queue which may be a wildcard sub and not valid for publishng on, fix and test

@gtully gtully marked this pull request as draft July 26, 2021 21:56
@gtully
Copy link
Copy Markdown
Contributor Author

gtully commented Jul 26, 2021

I am thinking there must be some tests that depend on the old behaviour, lets see. sanity checks on the ones that appear relevant don't show a culprit so I will await a full suite run to validate.

@gtully
Copy link
Copy Markdown
Contributor Author

gtully commented Jul 27, 2021

there was one failure in redistribution with prefixes, which pointed to the cause of this problem. #1782

re working the fix to update the message address once the routing is transformed seems correct. It keeps the prefix logic confined to the acceptor/session that produces the message. I notice that there is already some mangling going on for legacy clients in relation to prefixes.
Of note in the test on redistribution and prefixes, the consumer is on a "non prefixed" queue, the old behaviour would have an address with a prefix on messages on that queue, which seems odd.

@gtully gtully marked this pull request as ready for review July 27, 2021 12:04
@Elyytscha
Copy link
Copy Markdown

if you can link the docs where its explained how to build artemis from your fork, i would be willing to test this on my local machine, we have written a docker-compose based test suite which spins up an artemis cluster and which is able to reproduce this error. so we would just need a compiled artemis.jar from your fork to test this out on our side i think.

best regards and thanks for working on this :)

@gtully
Copy link
Copy Markdown
Contributor Author

gtully commented Jul 27, 2021

clone https://github.com/gtully/activemq-artemis/tree/ARTEMIS-3223
set JAVA_HOME=..
mvn clean install -Pdev -DfailIfNoTests=false -Dtest=MqttClusterWildcardTest

in a few minutes you will have a distribution in:
artemis-distribution/target/apache-artemis-2.18.0-SNAPSHOT-bin

there are more instructions at: https://activemq.apache.org/components/artemis/documentation/2.0.0/hacking-guide/tests.html but I think you will be good to go.

Copy link
Copy Markdown
Contributor

@michaelandrepearce michaelandrepearce left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see inline comments

@gtully
Copy link
Copy Markdown
Contributor Author

gtully commented Jul 28, 2021

there may be some issue though, I see an intermittent failure in org.apache.activemq.artemis.tests.integration.amqp.AmqpBridgeClusterRedistributionTest#testSendMessageToBroker0GetFromBroker2 that I need to investigate. It fails most of the time.

@Elyytscha
Copy link
Copy Markdown

there may be some issue though, I see an intermittent failure in org.apache.activemq.artemis.tests.integration.amqp.AmqpBridgeClusterRedistributionTest#testSendMessageToBroker0GetFromBroker2 that I need to investigate. It fails most of the time.

i have a compiled artemis.jar with the latest commit from this branch, does it make sense to test already or should i wait until you had time to investigate and maybe push some changes?

@gtully
Copy link
Copy Markdown
Contributor Author

gtully commented Jul 28, 2021

@Elyytscha I think go ahead and verify on your end. From what I have been able to understand so far the intermittent failure in org.apache.activemq.artemis.tests.integration.amqp.AmqpBridgeClusterRedistributionTest#testSendMessageToBroker0GetFromBroker2 seems to be timing related. thanks for the help :-)

@Elyytscha
Copy link
Copy Markdown

Hello,

we tested already, but i'm not sure if we did everything correct.

redistribution on the same node is now okay with mqtt and clean session false,

when you subscribe to a wildcard topic with clean session false, then disconnect, then send 2 messages on node a and node b to a topic in this wildcard range with different publisher, then reconnect the client, the messages are appearing with their original topic, not with a wildcard anymore.

but we have some weird behavior with the management console and with same steps above but connecting to node b instead of node a where subscriber was connected before, it does not work, there are generally no messages arriving. the weird error we see and why we cant check the settings in the webconsole is actually:

artemis1_1   | 2021-07-29 20:07:33,496 WARN  [io.hawt.system.Authenticator] Login failed due to: No LoginModules configured for activemq-management-management-management-management-management-management-management-management
artemis1_1   | 2021-07-29 20:07:34,026 WARN  [io.hawt.system.Authenticator] Login failed due to: No LoginModules configured for activemq-management-management-management-management-management-management-management-management
artemis1_1   | 2021-07-29 20:07:35,454 WARN  [io.hawt.system.Authenticator] Login failed due to: No LoginModules configured for activemq-management-management-management-management-management-management-management-management

with this login.config:

activemq {
  org.apache.activemq.artemis.spi.core.security.jaas.GuestLoginModule sufficient
      debug=true
      org.apache.activemq.jaas.guest.user="guest"
      org.apache.activemq.jaas.guest.role="guest";
};

activemq-management {
  org.apache.activemq.artemis.spi.core.security.jaas.PropertiesLoginModule sufficient
      debug=true
      org.apache.activemq.jaas.properties.user="artemis-users.properties"
      org.apache.activemq.jaas.properties.role="artemis-roles.properties";
};

but i can't imagine how that should be related to this pullrequest, so i think we have some errors in the setup or something has changed since activemq 2.16 which we used before.. so i think we have some configuration error, we are investigating this..

best regards

@Elyytscha
Copy link
Copy Markdown

Elyytscha commented Aug 3, 2021

Hello

we can confirm that with this pullrequest our problem is solved:

we can now do the following in an artemis cluster with mqtt with clean session false in mosquitto-client:
subscribe suscriber1 to node1, disconnect subscriber1 from node1 , publish message test1 on node1 with publisher1 and test2 on node2 with publisher2, connect subscriber1 to node2, topic arrives correctly:

mosquitto_sub -i subscriber1 -h artemis-node1 -p 1883 -t 'drivers/+/services' -d -c
# CTRL+C for disconnect
mosquitto_pub -i publisher1 -h artemis-node1 -p 1883 -t drivers/user-x/services -m test1 -d
mosquitto_pub -i publisher2 -h artemis-node2 -p 1883 -t drivers/user-x/services -m test2 -d
mosquitto_sub -i subscriber1 -h artemis-node2 -p 1883 -t 'drivers/+/services' -d -c
Client subscriber1 sending CONNECT
Client subscriber1 received CONNACK
Client subscriber1 sending SUBSCRIBE (Mid: 1, Topic: drivers/+/services, QoS: 0)
Client subscriber1 received PUBLISH (d0, q0, r0, m0, 'drivers/user-x/services', ... (5 bytes))
test1
Client subscriber1 received PUBLISH (d0, q0, r0, m0, 'drivers/user-x/services', ... (5 bytes))
test2
Client subscriber1 received SUBACK
Subscribed (mid: 1): 0

without this pullrequest (plain artemis 2.18) the same test looks like this:

mosquitto_sub -i subscriber1 -h artemis-node1 -p 1883 -t 'drivers/+/services' -d -c
# CTRL+C for disconnect
mosquitto_pub -i publisher1 -h artemis-node1 -p 1883 -t drivers/user-x/services -m test1 -d
mosquitto_pub -i publisher2 -h artemis-node2 -p 1883 -t drivers/user-x/services -m test2 -d
mosquitto_sub -i subscriber1 -h artemis-node2 -p 1883 -t 'drivers/+/services' -d -c
Client subscriber1 sending CONNECT
Client subscriber1 received CONNACK
Client subscriber1 sending SUBSCRIBE (Mid: 1, Topic: drivers/+/services, QoS: 0)
Client subscriber1 received PUBLISH (d0, q0, r0, m0, 'drivers/+/services', ... (5 bytes))
test1
Client subscriber1 received PUBLISH (d0, q0, r0, m0, 'drivers/+/services', ... (5 bytes))
test2
Client subscriber1 received SUBACK
Subscribed (mid: 1): 0

@michaelandrepearce michaelandrepearce dismissed their stale review September 1, 2021 14:45

Clicked wrong button

@gtully gtully force-pushed the ARTEMIS-3223 branch 2 times, most recently from a0a947e to 0fec379 Compare September 3, 2021 13:48
@gtully
Copy link
Copy Markdown
Contributor Author

gtully commented Sep 3, 2021

finally got to the source of the failure in org.apache.activemq.artemis.tests.integration.amqp.AmqpBridgeClusterRedistributionTest#testSendMessageToBroker0GetFromBroker2
the test sometimes uses redistribution, depending if the bridge gets setup on time. When it did the forward errored out silently... fixed that... but the root case was the need to keep the setAddress on the copy.

… rather than the address from the queue which may be a wildcard sub and not valid for publishng on, fix and test
…dress such that further routing will match the lack of prefixes in broker routing, different fix for redistribution with prefixes
@gtully
Copy link
Copy Markdown
Contributor Author

gtully commented Sep 6, 2021

rebased

@gtully gtully merged commit b27aa03 into apache:main Sep 6, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants