Endpoint is stopped when exception message is too large for transport #2956

Closed
adamralph opened this issue Oct 6, 2015 · 6 comments

@adamralph
Member

Replaces Particular/NServiceBus.RabbitMQ#102

When we create the message to be sent to the error queue, we take the exception message and add it to the headers. In some cases, the exception message is too large for the transport, which causes an unrecoverable failure and the endpoint is stopped.

I've reproduced the error at https://github.com/Particular/NServiceBus.RabbitMQ-Issue-102. Tests demonstrate that an exception message of length 2 ^ 16 causes this failure, but one of length 2 ^ 15 does not.

The solution is to truncate the exception message before adding it to the headers. Given that we cannot predict the size of message which will make a specific transport refuse it, there is a proposal to make the maximum length configurable.

Until that feature is present, we will release a patch which sets a hard truncation limit of 2 ^ 14 characters in order to fix the immediate problem when using the RabbitMQ transport. This should be long enough not to have any effect on the vast majority of messages and, in the case of RabbitMQ, will leave at least 2 ^ 14 characters for the rest of the message.
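To make the proposal concrete, here is a minimal sketch of the truncation step with a configurable limit that defaults to 2 ^ 14 characters. It is written in Python for brevity and is not the NServiceBus implementation; the function names and the header key `Example.ExceptionMessage` are hypothetical.

```python
# Hard limit named in this issue: 2 ^ 14 = 16384 characters.
DEFAULT_MAX_EXCEPTION_MESSAGE_LENGTH = 2 ** 14


def truncate_exception_message(message, max_length=DEFAULT_MAX_EXCEPTION_MESSAGE_LENGTH):
    """Return the message unchanged if it fits, otherwise its first max_length characters."""
    if max_length < 0:
        raise ValueError("max_length must be non-negative")
    if len(message) <= max_length:
        return message
    return message[:max_length]


def add_exception_header(headers, exc, max_length=DEFAULT_MAX_EXCEPTION_MESSAGE_LENGTH):
    """Store the (possibly truncated) exception message under a hypothetical header key."""
    headers["Example.ExceptionMessage"] = truncate_exception_message(str(exc), max_length)


if __name__ == "__main__":
    headers = {}
    # An exception message of length 2 ^ 16, the size that triggered the failure in the repro.
    add_exception_header(headers, ValueError("x" * 2 ** 16))
    print(len(headers["Example.ExceptionMessage"]))
```

Note that this limits characters in one header value, not bytes on the wire, so the encoded size can still be larger than the character count suggests.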

Who's affected

  • Any app using the RabbitMQ transport (and possibly other transports) which has the potential for throwing exceptions with very large messages.

Symptoms

The message fails to be sent to the error queue, the original message is left unacked and the endpoint dies.

@adamralph
Member Author

@burkhartt this change has been released in NServiceBus 5.2.7. Can you please give it a try and see if it fixes the issue for you?

For reference, I confirmed that the fix works in my repro of the original bug - https://github.com/Particular/NServiceBus.RabbitMQ-Issue-102/commit/01b8b138e31e3d79bb766da95203bdb3f39d6556

@adamralph
Member Author

User confirmed that the patch has fixed the issue - Particular/NServiceBus.RabbitMQ#102 (comment)

@burkhartt

We actually noticed this on QA today :-/ (Thought it had fixed the issue b/c we had huge stack traces and they weren't blowing up the queue lately.)

=ERROR REPORT==== 13-Oct-2015::16:08:25 ===
Error on AMQP connection <0.11316.60> (IP1 -> IP2, vhost: '/', user: 'tim', state: running), channel 2:
{amqp_error,frame_error,
"type 2, all octets = <<>>: {frame_too_large,189514,131064}",none}

This is running with the 5.2.7 version.
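For anyone trying to build a repro, a rough way to check whether a set of headers could exceed the limit in the log above is to add up their encoded sizes. The sketch below is only an estimate under stated assumptions: it reads the tuple `{frame_too_large,189514,131064}` as (actual size, allowed size), it counts only the UTF-8 bytes of keys and values and ignores AMQP field-table overhead, and the header names in the usage example are hypothetical. It is not a diagnosis of this particular failure.

```python
# Assumed to be the maximum allowed size, taken from the second number in the log tuple.
FRAME_LIMIT_BYTES = 131064


def estimate_headers_bytes(headers):
    """Lower-bound estimate: UTF-8 bytes of every header key and value."""
    return sum(
        len(key.encode("utf-8")) + len(value.encode("utf-8"))
        for key, value in headers.items()
    )


def fits_in_frame(headers, limit=FRAME_LIMIT_BYTES):
    """True if the estimated header size does not exceed the limit."""
    return estimate_headers_bytes(headers) <= limit


if __name__ == "__main__":
    # Hypothetical headers: a message already truncated to 2 ^ 14 characters,
    # plus a second, untruncated header of 180000 characters.
    headers = {
        "Example.ExceptionMessage": "x" * 2 ** 14,
        "Example.StackTrace": "y" * 180000,
    }
    print(estimate_headers_bytes(headers), fits_in_frame(headers))
```

The point of the example is that a character limit on a single header does not bound the total size of all headers combined.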

@adamralph
Member Author

Hi @burkhartt would you be able to provide us with a simple repro? E.g. something similar to what I did in https://github.com/Particular/NServiceBus.RabbitMQ-Issue-102

@burkhartt

Retry 2 - HEADERS:
[NServiceBus.MessageId] -- c0830ac6-959d-43d4-bfc3-a53100035a15
[NServiceBus.CorrelationId] -- c0830ac6-959d-43d4-bfc3-a53100035a15
[NServiceBus.MessageIntent] -- Send
[NServiceBus.Version] -- 5.2.7
[NServiceBus.TimeSent] -- 2015-10-14 04:12:12:229934 Z
[NServiceBus.ContentType] -- text/xml
[NServiceBus.EnclosedMessageTypes] -- Engine.Messages.SalesForce.BookingPublishedForCrmCommand, Engine, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null
[NServiceBus.RelatedTo] -- 0cb50531-130c-46ba-93df-a531000358ed
[NServiceBus.ConversationId] -- cb0c7391-e683-4635-8fba-a531000358ed
[WinIdName] -- system
[Session.User.Identity.Name] -- system
[Country] --
[NServiceBus.RabbitMQ.CallbackQueue] -- GAT.Cloud.AtlasAir.USAM-GATQAWEB1
[NServiceBus.OriginatingMachine] -- USAM-GATQAWEB1
[NServiceBus.OriginatingEndpoint] -- GAT.Cloud.AtlasAir
[$.diagnostics.originating.hostid] -- 814b04713d94785409c3f40fa4328670
[NServiceBus.ReplyToAddress] -- GAT.Cloud.AtlasAir
[$.diagnostics.hostid] -- 65555f9599731e3d1131f12b2f643b83
[$.diagnostics.hostdisplayname] -- USAM-GATQAWEB1
BODY: System.Byte[]
MESSAGE: POST https://www.salesforce.com/services/apexrest/orders
With data:

@adamralph
Member Author

Hi @burkhartt I've raised a new bug for this #3023
