Return addresses with RabbitMQ sitting on separate box #22

Closed
peuramaki opened this issue Feb 27, 2014 · 18 comments

Comments

@peuramaki
Contributor

I have an NSB setup using the RabbitMQ transport, and I'm running Rabbit on a separate server (it will be a RabbitMQ cluster in production).

It looks like message return addresses are set incorrectly in this scenario.

  1. Service A sends a message to Service B
  2. Saga on Service B is started
  3. Service B does its thing (long-running)
  4. Service B calls Saga.ReplyToOriginator

In this case the originator of the saga points at queue.NSBMachineA@NSBMachineA
but I think it should point at queue.NSBMachineA@RabbitMQMachine

Currently, the reply never arrives at ServiceA (it does if I run Rabbit on the same box as NSB).

@andreasohlund
Member

How are you hosting this?

Just a hunch but can you force UseSingleBrokerQueue to true?

Configure.ScaleOut(s=>s.UseSingleBrokerQueue());

@peuramaki
Contributor Author

I'm hosting both services with NServiceBus.Host. ServiceA is hosted AsA_Client and ServiceB is hosted AsA_Server. Service A is actually an integration test project. Both services are sitting on the same box; RabbitMQ is on a separate box.

Configuring services to use single broker queue has no effect on behavior.

Currently, I'm working around the issue by

  • intercepting outgoing transport messages (IMutateOutgoingTransportMessages.MutateOutgoing),
  • and overriding machine name in TransportMessage.ReplyToAddress based on configuration

This works alright for now, since I can do everything using interceptors.

@peuramaki
Contributor Author

Apparently, the timeout messages get the wrong machine name as well. I tracked the issue down to TimeoutManager.Initialize(). DispatcherAddress always gets the name of the processing machine, not the queue machine.

I couldn't figure out a workaround that doesn't involve reflection, so I'm overriding TimeoutManager.DispatcherAddress by force.

@andreasohlund
Member

Can you share your message mappings from config? (Something sounds weird; the Rabbit transport should ignore all machine names if UseSingleBrokerQueue is true.)

@peuramaki
Contributor Author

Sure. These are on service B:

    <MessageEndpointMappings>
      <add Messages="System.Storage.Messages.V1" Endpoint="Customer.Storage@server01.localdomain" />
      <add Messages="System.Config.Messages.V1" Endpoint="Customer.Storage@server01.localdomain" />
    </MessageEndpointMappings>

and this is on service A (integration test client)

    <MessageEndpointMappings>
      <add Messages="Customer.Storage.Adapter.Messages.V1" Endpoint="Customer.Storage.Adapter.IntegrationTest@server01.localdomain" />
    </MessageEndpointMappings>

Afaik, you just described the problem - ignoring machine name even if I have set it. The name of the processing machine always pops up.

@andreasohlund
Member

Now I'm confused. Machine names are ignored since all endpoints connect to the same broker which you specify in the connectionstring?

What is it you're trying to achieve by putting machine names in the message mappings?

@peuramaki
Contributor Author

Why, I'm trying to force NSB to use my RabbitMQ box instead of the NSB box, of course, by trying anything I can think of. It's not working, though: the behavior is the same with or without the specified queue server.

If you take a look at TimeoutManager.Initialize()

DispatcherAddress = Address.Parse(Configure.EndpointName).SubScope("TimeoutsDispatcher");

the result always points to the local box, not the Rabbit box defined in the transport config. I haven't been able to work around this. There's a promisingly named method

Address.OverrideDefaultMachine(rabbitMqHost);

but it doesn't seem to have any effect either.

@andreasohlund
Member

I think I'm missing something. Isn't the way to specify which broker you connect
to through the "NServiceBus/Transport" connection string?

https://github.com/Particular/NServiceBus.RabbitMQ.Samples/blob/master/VideoStore.RabbitMQ/VideoStore.Sales/App.config#L11

Is that not working for you?

@peuramaki
Contributor Author

Yes, I have defined the transport connection string

<add name="NServiceBus/Transport" connectionString="host=rabbitmq01.localdomain" />

And this is working out for me, except when

  • I use Saga.ReplyToOriginator
  • Timeouts are being used
  • SLRs are being used

I'm not using explicit timeouts in the problematic saga, but NSB seems to use them sometimes (I'm not sure under what conditions).
The problem is non-deterministic: sometimes it works, sometimes it doesn't.

@andreasohlund
Member

Any chance you can create a little sample project I can run to expose this?

@peuramaki
Contributor Author

I wrote a little project that should reproduce the issue, but it doesn't.

I've done some more debugging, though. Apparently, when using the Rabbit transport with UseSingleBrokerQueue=true, NServiceBus.Address.Machine never gets used. Instead, the address to send messages to is always defined by the Rabbit channel, which gets it from the NServiceBus/Transport configuration. Am I correct?

My current suspect for the cause of this issue is RabbitMqUnitOfWork: sometimes message-sending actions do get added to the UoW but are never run.

@andreasohlund
Member

> I've done some more debugging, though. Apparently, when using the Rabbit transport
> with UseSingleBrokerQueue=true, NServiceBus.Address.Machine never gets used.
> Instead, the address to send messages to is always defined by the Rabbit channel,
> which gets it from the NServiceBus/Transport configuration. Am I correct?

That is correct!

> My current suspect for the cause of this issue is RabbitMqUnitOfWork:
> sometimes message-sending actions do get added to the UoW but are never run.

This sounds weird, possibly a bug!

@peuramaki
Contributor Author

Right. I was able to reproduce and "fix" the issue. It turned out that queries to SQL Server in sagas somehow messed up NSB transactions: sometimes transactions were never completed.

Apparently, NServiceBus.RabbitMQ sends messages when the Transaction.TransactionCompleted event is raised. If the transaction is not committed, messages are not sent. It probably works as it should, but it would be nice to get an exception.
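
To illustrate the behavior described above, here is a minimal self-contained sketch that uses only System.Transactions. DeferredSender is a hypothetical stand-in written for this example, not the actual NServiceBus.RabbitMQ implementation; it only shows the general pattern of buffering sends until the ambient transaction completes, and it assumes a single transaction at a time.

```csharp
using System;
using System.Collections.Generic;
using System.Transactions;

// Hypothetical stand-in for a transport that buffers sends until commit.
// This is NOT NServiceBus.RabbitMQ code, only an illustration of the pattern.
class DeferredSender
{
    private readonly List<string> pending = new List<string>();
    public readonly List<string> Sent = new List<string>();

    public void Send(string message)
    {
        var tx = Transaction.Current;
        if (tx == null)
        {
            // No ambient transaction: send immediately.
            Sent.Add(message);
            return;
        }

        if (pending.Count == 0)
        {
            // First buffered message for this transaction: subscribe once.
            // The buffer is flushed only if the transaction really committed.
            tx.TransactionCompleted += (sender, e) =>
            {
                if (e.Transaction.TransactionInformation.Status == TransactionStatus.Committed)
                    Sent.AddRange(pending);
                pending.Clear();
            };
        }
        pending.Add(message);
    }
}

static class Program
{
    static void Main()
    {
        var sender = new DeferredSender();

        using (var scope = new TransactionScope())
        {
            sender.Send("reply-1");
            scope.Complete(); // commit happens on Dispose, then the buffer is flushed
        }

        using (new TransactionScope())
        {
            sender.Send("reply-2");
            // Complete() is never called: the transaction rolls back and the
            // buffered message is dropped without any exception being thrown.
        }

        Console.WriteLine(string.Join(",", sender.Sent));
    }
}
```

In this sketch the second message is discarded silently when the scope is disposed without Complete(), which matches the "no reply, no exception" symptom reported in this thread.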

"Fixing" the issue involved changing the dependency lifecycle of my SQL client component to 'single instance'; using 'instance per unit of work' leads to non-deterministic behavior. The SqlConnection is created in the component's constructor and disposed in IDisposable.Dispose().

The issue is reproduced at https://github.com/peuramaki/NServiceBus.RabbitMq.Issue22, please take a look.

To be honest, I don't know what actually causes the problem. Insight appreciated ;-)

@andreasohlund
Member

Yes, we delay the sending until the tx completes to make it easier for you to handle the lack of DTC, i.e. it avoids "ghost" messages being published in case of a DB rollback. To avoid that behaviour, you can make sure that there is no TransactionScope wrapping the handler by calling:

https://github.com/Particular/NServiceBus/blob/9a6bc4e513bb8082f46a3652b9a037f3eba83e50/src/NServiceBus.Core/Settings/TransactionSettings.cs#L116

> "Fixing" the issue involved changing the dependency lifecycle of my SQL client component to 'single instance'; using 'instance per unit of work' leads to non-deterministic behavior. The SqlConnection is created in the component's constructor and disposed in IDisposable.Dispose().

What container are you using?

@peuramaki
Contributor Author

The delaying thingy makes perfect sense. I'd just like to get an exception if something goes wrong.

I need to have transactional message handlers, so suppressing transactions is not an option.

I'm using the default Autofac container, but NHibernate for saga persistence.

@peuramaki
Contributor Author

I've investigated the issue further.

It appears that when custom SQL queries to MS SQL Server (using SqlConnection) are made during the lifetime of an NServiceBus message handler, the behavior becomes non-deterministic. Sometimes the ambient transaction is never committed, which results in NServiceBus.RabbitMQ never sending out messages that should be sent. That is because flushing of NSB's unit of work is bound to the TransactionCompleted event, which sometimes never fires.

I'm able to work around the issue in the following way: instead of letting SqlConnection automatically enlist in the ambient transaction, I'm preventing it with 'Enlist=false' in the connection string. I don't use System.Transactions at all with the SqlConnection; instead I hook up a SqlTransaction using NServiceBus's IManageUnitsOfWork interface.
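
As a rough sketch of the first half of this workaround, the connection string can be built with SqlConnectionStringBuilder and its Enlist property set to false. The server and database names below are placeholders invented for the example, and the code assumes System.Data.SqlClient is available (it ships with the .NET Framework). The IManageUnitsOfWork part that manages the SqlTransaction is not shown.

```csharp
using System;
using System.Data.SqlClient;

static class ConnectionStrings
{
    // Server and database names are placeholders, not values from this thread.
    public static string NonEnlisting()
    {
        var builder = new SqlConnectionStringBuilder
        {
            DataSource = "sqlserver01.localdomain",
            InitialCatalog = "Storage",
            IntegratedSecurity = true,
            // Do not auto-enlist in the ambient System.Transactions transaction.
            Enlist = false
        };
        return builder.ConnectionString;
    }
}

static class Demo
{
    static void Main()
    {
        // Prints the generated connection string for inspection.
        Console.WriteLine(ConnectionStrings.NonEnlisting());
    }
}
```

With enlistment disabled, commits and rollbacks on this connection have to be driven explicitly, for example through a SqlTransaction that is started and completed alongside the message handler.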

Maybe I should close this bug and open up a new one?

@andreasohlund
Member

Yes, please open another one!

Thanks!!

@peuramaki
Contributor Author

Closing this one, see #26
