dcache-xroot: handle outbound errors on channel promise #6945

Merged Jan 11, 2023 (1 commit)

@alrossi alrossi commented Jan 9, 2023

Motivation:

Refer to the final comments in
GH 6909 "Proxy xrootd door keep restarting"
#6909

Occasionally, unexpected errors on the pipeline are reported, as in the example below:

```
04 Dec 2022 04:07:55 (Xrootd-dcdndoor03-externalsubnet) [] An exceptionCaught()
event was fired, and it reached at the tail of the pipeline. It usually means
the last handler in the pipeline did not handle the exception.
io.netty.channel.StacklessClosedChannelException: null
	at io.netty.channel.AbstractChannel$AbstractUnsafe.write(Object, ChannelPromise)(Unknown Source)
```

The problem arises from the asymmetrical nature of
exception propagation on the Netty pipeline. For
exceptions on inbound/read, all that is needed is
for the last handler in the pipeline (the "TOP")
to implement exceptionCaught. This is done by
the final handlers in the door, proxy and pool
pipelines.

However, for outbound exceptions, one needs to
add a listener to the channel future or channel
promise; this needs to be done at the very
beginning of the outbound pipeline ("BOTTOM").
The listener allows failures on channel write
to invoke the exceptionCaught method on the
inbound handlers.
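
To illustrate the asymmetry, here is a toy model in plain Java. It is not the Netty API and not the code in this patch. `CompletableFuture` stands in for the channel promise, and `InboundErrors` is a hypothetical stand-in for an inbound handler's exceptionCaught. All names are made up for the sketch. In Netty itself, the built-in `ChannelFutureListener.FIRE_EXCEPTION_ON_FAILURE` does this kind of forwarding; whether the patch uses that listener or a custom one is not stated here.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;

/** Toy model (not Netty): a write failure is visible only through its future. */
public class OutboundFailureDemo {

    /** Hypothetical stand-in for exceptionCaught(): records what reaches it. */
    static final class InboundErrors implements Consumer<Throwable> {
        final List<Throwable> seen = new ArrayList<>();

        @Override
        public void accept(Throwable cause) {
            seen.add(cause);
        }
    }

    /** Stand-in for a write on a closed channel: the returned future fails. */
    static CompletableFuture<Void> writeToClosedChannel(Object msg) {
        CompletableFuture<Void> promise = new CompletableFuture<>();
        promise.completeExceptionally(new IOException("channel closed"));
        return promise;
    }

    /** Stand-in for the outbound handler: attaches a listener that forwards failures. */
    static CompletableFuture<Void> writeWithListener(Object msg, Consumer<Throwable> inbound) {
        CompletableFuture<Void> promise = writeToClosedChannel(msg);
        promise.whenComplete((ok, cause) -> {
            if (cause != null) {
                inbound.accept(cause); // hand the failure to the inbound side
            }
        });
        return promise;
    }

    public static void main(String[] args) {
        InboundErrors inbound = new InboundErrors();

        writeToClosedChannel("reply");            // failure stays inside the future
        System.out.println("without listener: " + inbound.seen.size());

        writeWithListener("reply", inbound);      // listener forwards the failure
        System.out.println("with listener: " + inbound.seen.size());
    }
}
```

Without the listener, nothing on the inbound side ever learns that the write failed. With it, the failure is delivered to the same place that handles read errors.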

Modification:

Create an OutboundHandler that simply adds the
listener, and add this handler first in all pipelines.

Result:

We should no longer see errors such as the one above, which report that an exception
reached the tail of the pipeline without being handled.

Target: master
Request: 8.2
Request: 8.1 (Without proxy)
Request: 8.0 (Without proxy)
Request: 7.2 (Without proxy)
Closes: #6909
Patch: https://rb.dcache.org/r/13833/
Requires-notes: yes
Acked-by: Tigran

@svemeyer
Contributor

retest this please

@svemeyer svemeyer merged commit 5601a3d into dCache:7.2 Jan 11, 2023
@alrossi alrossi deleted the fix/7.2/xroot-outbound-errors branch January 11, 2023 13:47