Skip to content

HDDS-5952. EC: ECBlockReconstructedStripeInputStream should read from blocks in parallel#2899

Merged
sodonnel merged 6 commits intoapache:HDDS-3816-ecfrom
sodonnel:ec-HDDS-5952-parallel-read
Dec 15, 2021
Merged

HDDS-5952. EC: ECBlockReconstructedStripeInputStream should read from blocks in parallel#2899
sodonnel merged 6 commits intoapache:HDDS-3816-ecfrom
sodonnel:ec-HDDS-5952-parallel-read

Conversation

@sodonnel
Copy link
Contributor

@sodonnel sodonnel commented Dec 8, 2021

What changes were proposed in this pull request?

To make reconstruction reads faster, ECBlockReconstructedStripeInputStream could read from the block streams in parallel rather than serially.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-5952

How was this patch tested?

Existing tests

@sodonnel sodonnel force-pushed the ec-HDDS-5952-parallel-read branch from 450c1b5 to bbba145 Compare December 10, 2021 11:16
Copy link
Contributor

@umamaheswararao umamaheswararao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @sodonnel for working on this. I have dropped few comments please check. Thanks
Overall approach looks good to me though.

"block", getBlockID(), index + 1, ee.getCause());
failedDataIndexes.add(index);
exceptionOccurred = true;
} catch (InterruptedException ie) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May want to reset this Interrupted exception? SO that we will not suppress it?

} catch (IOException e) {
ImmutablePair<Integer, Future<Void>> pair = pendingReads.poll();
index = pair.getKey();
// Should this future.get() have a timeout? At the end of the call chain
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. Actual timeouts should be handled by underlying end points.

failedDataIndexes.add(index);
exceptionOccurred = true;
} catch (InterruptedException ie) {
LOG.error("Interrupted waiting for reads to complete", ie);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we are throwing the IOE on InterruptedException, but we catch the IOE in readStripe and retrying.
So, this interrupted exception would be masked. Probably, we should reset interrupted exception here and handle interruptedException in readStripe as well to throw out?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, I forgot the IOE would be retried by the higher level. I think we should let the interrupted exception propagate up from this method untouched and then handle it at the higher level.

decoderInputBuffers = new ByteBuffer[getRepConfig().getRequiredNodes()];

ThreadFactory threadFactory = new ThreadFactoryBuilder().setNameFormat(
"ec-reader-for-" + blockInfo.getBlockID() + "-%d").build();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we just explicitly say -TID-%d, just not confuse with block indexes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what you mean here?

exceptionOccurred = true;
} catch (InterruptedException ie) {
LOG.error("Interrupted waiting for reads to complete", ie);
throw new IOException("Interrupted waiting for reads to complete", ie);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may need to shutdown the pool?

Copy link
Contributor

@umamaheswararao umamaheswararao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 LGTM ( Pending CI green)

@sodonnel sodonnel merged commit c660483 into apache:HDDS-3816-ec Dec 15, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants