HDFS-16601. DataTransfer should throw IOException to client #4369
base: trunk
Conversation
🎊 +1 overall
This message was automatically generated.
@Hexiaoqiao @MingXiangLi can you help me review this patch? Thanks~
@ZanderXu Thanks for the report and contribution. Sorry, I don't get what scenario leads to this issue. Would you like to offer more information, such as the version deployed and how to reproduce it? Thanks.
Thanks @Hexiaoqiao. So the DN should return the failed transfer exception to the client, so that the client can choose another existing DN as the source to transfer the block to the new DN.
Thanks for starting this proposal. In my practice there are still many issues with data transfer for pipeline recovery, covering both basic function and performance. IIRC, there is only a timeout exception and one exception with no explicit meaning, so the client has no helpful information (such as whether the src node or the target node hit the issue, or other exceptions) to make a decision.
Thanks @Hexiaoqiao for your suggestion. Yeah, you are right, we need more failure information for the client, like whether the transfer source or the transfer target failed. If the client has more information about the failed transfer, it can accurately and efficiently remove abnormal nodes. But that would be a big feature. Fortunately, at present, as long as the failed exception is thrown to the client, the client by default assumes that the new DN is abnormal, excludes it, and retries the transfer. During the retry, the client will choose a new source DN and a new target DN, so the source and target DNs from the previous failed round will be replaced. So I think the simple approach is to just throw the failed exception to the client, and the client can find and remove the truly abnormal datanode.
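To illustrate the exclude-and-retry behavior described in the comment above, here is a minimal sketch. All names (`ExcludeRetrySketch`, `pickTarget`, the `excluded` set, the DN labels) are hypothetical assumptions for illustration, not the actual DFSClient code:

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: when the transfer to a newly chosen DataNode
// fails, the client excludes that node and retries with a fresh
// target (and, in the real pipeline-recovery path, a fresh source).
public class ExcludeRetrySketch {
    // Pick the first DataNode that has not been excluded yet.
    static String pickTarget(List<String> cluster, Set<String> excluded) {
        for (String dn : cluster) {
            if (!excluded.contains(dn)) {
                return dn;
            }
        }
        return null; // no healthy candidate left
    }

    public static void main(String[] args) {
        List<String> cluster = List.of("dn1", "dn2", "dn3");
        Set<String> excluded = new HashSet<>();

        String first = pickTarget(cluster, excluded); // dn1
        excluded.add(first);                          // transfer failed -> exclude
        String retry = pickTarget(cluster, excluded); // dn2 on retry

        System.out.println(first + " " + retry); // prints "dn1 dn2"
    }
}
```

This only works, of course, if the failure is actually surfaced to the client, which is what this patch changes.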
Thanks for the further comment here. Agree that it will improve fault tolerance for transfer; however, we have to accept the truth that when the source datanode hits an issue and the same one is chosen on retry, we cannot avoid failing. I am not sure if there is any way to expose exceptions that differentiate a source-node exception from a target-node exception? If there is, it will be helpful for the following fault-tolerance improvements on the client side.
It will choose the next datanode as the source datanode on retry. The code looks like below, and `tried` will be incremented by 1 on each retry.
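The round-robin source selection being discussed can be sketched as follows. This is a simplified hypothetical helper, not the actual DataStreamer code; `pickSource` and the node names are assumptions:

```java
// Hypothetical sketch: on each retry, `tried` advances and the next
// existing DataNode in the pipeline is picked as the transfer source.
// With two existing nodes, the third round falls back to the original
// (possibly bad) node again.
public class SourcePickSketch {
    // Round-robin pick of the source DN from the existing replicas.
    static String pickSource(String[] existing, int tried) {
        return existing[tried % existing.length];
    }

    public static void main(String[] args) {
        String[] existing = {"dn1", "dn2"};
        for (int tried = 0; tried < 3; tried++) {
            System.out.println(pickSource(existing, tried));
        }
        // prints dn1, dn2, dn1 -- the third attempt
        // picks the original node again
    }
}
```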
Sorry for the unclear comment. I know now it's a round-robin way to pick the source node, and at the third round it will pick the original node again (no matter whether it is a bad/slow node); of course, that is a tiny probability. Actually, I mean it would be helpful for the client to make many fault-tolerance improvements later if we could differentiate the exceptions about the transfer. Once more, this is not a blocker comment. Thanks again.
Got it, thanks @Hexiaoqiao.
I will try to work on it.
The patch makes sense, and it's great that you dug so deep and fixed it. But I wish you could state the bug report in the JIRA more clearly.
Basically you reported a bug where, if the replica is corrupt, the DataNode should not attempt to transfer from that replica when recovering from a write failure, because it will always fail. HDFS-4660 (the file offset bug) plus this one together caused the client to fail to recover.
// At this condition, transferBlock that happens during
// pipeline recovery would transfer extra bytes to make up to the
// end of the chunk. And this is when the block corruption
// described in HDFS-4660 would occur.
Oh HDFS-4660 brought back my worst nightmare when I spent a month chasing this bug.
@jojochuang Thanks for your review. We encountered this bug in our prod, because the block's checksum file on the source DN was corrupted. It caused the transfer to fail, and the client tried all DNs and failed. So the client should sense the status of the transfer. But it's difficult to differentiate an exception caused by the source node from one caused by the target node. Maybe we can first throw the failed exception to the client and let the client try to use the next DN as the source to transfer the block. cc @Hexiaoqiao
LGTM. +1 from my side.
@jojochuang I have rebased this patch onto the latest trunk. I'm looking forward to getting your thoughts on this issue. BTW, the remote copy of HDFS-2139 (FastCopy) will use DataTransfer, and it also requires DataTransfer to throw IOException to the client to retry.
🎊 +1 overall
This message was automatically generated.
For detailed info, please refer to HDFS-16601.
The bug stack looks like: