AMQP Link Detach #76

Open
alexg-axis opened this issue Jan 17, 2023 · 3 comments

@alexg-axis
Contributor

I have an issue where I'm occasionally unable to publish events. Unfortunately I can't pin down the circumstances any further than that. It has happened a few times, but in most cases publishing works as expected.

In essence the code works as follows:

ctx := context.Background()
message := []byte("Hello, World!")
expiry := 10 * time.Minute
deviceID := "some-device"

if err := client.SendEvent(
	ctx,
	deviceID,
	message,
	iotservice.WithSendAck(iotservice.AckType("full")),
	iotservice.WithSendExpiryTime(time.Now().Add(expiry)),
); err != nil {
	return err
}

The error is the following:

link detached, reason: *Error{Condition: amqp:link:detach-forced, Description: Server Busy. Please retry operation, Info: map[]}

The Java SDK seems to have this comment regarding the error:

/**
 * An operator intervened to detach for some reason.
 */
LINK_DETACH_FORCED("amqp:link:detach-forced"),

Same with the JS one: https://github.com/Azure/amqp-common-js/blob/master/lib/errors.ts#L171.

So to me it seems as if this error may occur from time to time. For me, it has always been solved with a restart, so I assume one way to handle it is to simply reconnect the client.

@alexg-axis
Contributor Author

It seems to happen on a weekly basis. It could mean that Azure has some sort of timeout for 7 days and that we should gracefully reconnect when it occurs.

@alexg-axis
Contributor Author

Some information from the Python library.

https://github.com/Azure/azure-sdk-for-python/blob/a7ec3bca94251b6a73de347112d4a77e6e615ccc/sdk/eventhub/azure-eventhub/TROUBLESHOOTING.md?plain=1#L32

All Event Hubs exceptions are wrapped in an [EventHubError][EventHubError]. They often have an underlying AMQP error code which specifies whether an error should be retried. For retryable errors (ie. amqp:connection:forced or amqp:link:detach-forced), the client libraries will attempt to recover from these errors based on the [retry options][AmqpRetryOptions] specified when instantiating the client. To configure retry options, follow the sample [Client Creation][ClientCreation]. If the error is non-retryable, there is some configuration issue that needs to be resolved.

@alexg-axis
Contributor Author

We believe the following code is the cause - once a link is detached, there's no retry to get a session and link going again.

https://github.com/amenzhinsky/iothub/blob/master/iotservice/client.go#L171-L189

Note how, when putting a token fails, we just return and never try again. Most likely we then become unauthorized, the server kicks us, and the link is detached.
