[SPARK-21475][Core]Revert "[SPARK-21475][CORE] Use NIO's Files API to replace FileInputStream/FileOutputStream in some critical paths" #20119

Closed

zsxwing
Member

@zsxwing zsxwing commented Dec 29, 2017

What changes were proposed in this pull request?

This reverts commit 5fd0294 because of a huge performance regression.
I manually fixed a minor conflict in OneForOneBlockFetcher.java.

Files.newInputStream returns sun.nio.ch.ChannelInputStream. ChannelInputStream doesn't override InputStream.skip, so it falls back to the default InputStream.skip, which reads and discards the skipped bytes instead of seeking past them. This causes a huge performance regression when reading shuffle files, where the reader skips to an offset before reading a block.
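To make the cost of the default skip concrete, here is a minimal sketch. `CountingStream` is a hypothetical stand-in (it is not `ChannelInputStream`); it only implements `read`, so `skip` resolves to the inherited `InputStream.skip`, and the counter shows that every skipped byte is pulled through `read`.

```java
import java.io.IOException;
import java.io.InputStream;

public class DefaultSkipDemo {
    /** Hypothetical stream that counts how many bytes are pulled through read(). */
    static final class CountingStream extends InputStream {
        private long remaining;
        long bytesRead = 0;

        CountingStream(long size) {
            this.remaining = size;
        }

        @Override
        public int read() {
            if (remaining <= 0) return -1;
            remaining--;
            bytesRead++;
            return 0;
        }

        @Override
        public int read(byte[] b, int off, int len) {
            if (len == 0) return 0;
            if (remaining <= 0) return -1;
            int n = (int) Math.min(len, remaining);
            remaining -= n;
            bytesRead += n;
            return n;
        }
        // skip(long) is deliberately NOT overridden, so InputStream.skip is used.
    }

    /** Skips `offset` bytes of a `size`-byte stream and reports how many bytes were read to do so. */
    static long bytesReadWhileSkipping(long size, long offset) throws IOException {
        CountingStream in = new CountingStream(size);
        long skipped = 0;
        while (skipped < offset) {
            long n = in.skip(offset - skipped);
            if (n <= 0) break; // end of stream
            skipped += n;
        }
        return in.bytesRead;
    }

    public static void main(String[] args) throws IOException {
        // Skipping 100,000 bytes costs 100,000 bytes of reads with the default skip.
        System.out.println(bytesReadWhileSkipping(1 << 20, 100_000));
    }
}
```

A stream that overrides `skip` with a seek, as `FileInputStream` does, pays no such per-byte cost, which is why the regression grows with the offset being skipped.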

How was this patch tested?

Jenkins

…tream/FileOutputStream in some critical paths"

This reverts commit 5fd0294.
@SparkQA

SparkQA commented Dec 30, 2017

Test build #85532 has finished for PR 20119 at commit 401d650.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile
Member

LGTM

@zsxwing
Member Author

zsxwing commented Dec 30, 2017

Thanks! Merging to master.

@asfgit asfgit closed this in 14c4a62 Dec 30, 2017
@zsxwing zsxwing deleted the revert-SPARK-21475 branch December 30, 2017 06:45
@cloud-fan
Contributor

let's also cc the author. @jerryshao do you know if there is a way to fix the regression?

@jerryshao
Contributor

Sorry, I haven't checked the details; let me take a look. The change I made was trying to fix a memory issue for shuffle (especially the external shuffle service), which occurred in our prod cluster. Let me think about whether there's a way to fix it.

  try {
-   is = Files.newInputStream(file.toPath());
+   is = new FileInputStream(file);
    ByteStreams.skipFully(is, offset);
    return new LimitedInputStream(is, length);
Contributor

I think these two lines might be the place that suffers from the skip issue. Can we revert only this place? @zsxwing @cloud-fan @gatorsmile
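For illustration only, one way to avoid `skip` on this path while staying on NIO is to position a `FileChannel` at the offset before wrapping it in a stream. This is a sketch under that assumption, not what this PR or the follow-up does; `openAt` and `readSegment` are hypothetical names.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

public class PositionedSegmentRead {
    /**
     * Returns a stream whose first byte is the one at `offset`.
     * The channel is positioned with a seek, so nothing before `offset` is read.
     * Closing the returned stream also closes the channel.
     */
    static InputStream openAt(Path path, long offset) throws IOException {
        FileChannel channel = FileChannel.open(path, StandardOpenOption.READ);
        try {
            channel.position(offset);
            return Channels.newInputStream(channel);
        } catch (IOException | RuntimeException e) {
            channel.close();
            throw e;
        }
    }

    /** Hypothetical helper: reads up to `length` bytes starting at `offset`. */
    static byte[] readSegment(Path path, long offset, int length) throws IOException {
        try (InputStream in = openAt(path, offset)) {
            byte[] buf = new byte[length];
            int total = 0;
            while (total < length) {
                int n = in.read(buf, total, length - total);
                if (n < 0) break; // file ended before `length` bytes were available
                total += n;
            }
            return Arrays.copyOf(buf, total);
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("segment", ".bin");
        try {
            Files.write(tmp, new byte[] {10, 11, 12, 13, 14, 15});
            System.out.println(Arrays.toString(readSegment(tmp, 2, 3))); // [12, 13, 14]
        } finally {
            Files.delete(tmp);
        }
    }
}
```

Whether this would perform as well as `FileInputStream.skip` for shuffle reads is not something the thread establishes; it would need measuring.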

@@ -198,7 +196,7 @@ private[spark] class IndexShuffleBlockResolver(
  // find out the consolidated file, then the offset within that from our index
  val indexFile = getIndexFile(blockId.shuffleId, blockId.mapId)

- val in = new DataInputStream(Files.newInputStream(indexFile.toPath))
+ val in = new DataInputStream(new FileInputStream(indexFile))
Member Author

@jerryshao this is another place. In addition, I'm not sure whether any compression codec uses skip.

I also noticed that sun.nio.ch.ChannelInputStream has extra synchronized blocks, because the stream returned by Files.newInputStream needs to be thread-safe. I'm not sure whether that also causes a performance regression.
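A quick way to check whether a given stream class (a codec's stream, or whatever `Files.newInputStream` returns on a particular JDK) overrides `skip` is to ask reflection which class declares the method. This is a diagnostic sketch; `skipDeclaringClass` is a hypothetical helper name, and the result for the NIO stream depends on the JDK in use.

```java
import java.io.FileInputStream;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class SkipOverrideCheck {
    /** Returns the class that declares the skip(long) method a stream of this type would run. */
    static Class<?> skipDeclaringClass(Class<?> streamType) throws NoSuchMethodException {
        return streamType.getMethod("skip", long.class).getDeclaringClass();
    }

    public static void main(String[] args) throws Exception {
        Path tmp = Files.createTempFile("skip-check", ".bin");
        try (InputStream nio = Files.newInputStream(tmp);
             InputStream io = new FileInputStream(tmp.toFile())) {
            // If the right-hand side is java.io.InputStream, skip is the read-and-discard default.
            System.out.println(nio.getClass().getName() + " -> "
                + skipDeclaringClass(nio.getClass()).getName());
            System.out.println(io.getClass().getName() + " -> "
                + skipDeclaringClass(io.getClass()).getName());
        } finally {
            Files.delete(tmp);
        }
    }
}
```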

Contributor

I see.

@@ -165,7 +165,7 @@ private void failRemainingBlocks(String[] failedBlockIds, Throwable e) {

  DownloadCallback(int chunkIndex) throws IOException {
    this.targetFile = tempFileManager.createTempFile();
-   this.channel = Channels.newChannel(Files.newOutputStream(targetFile.toPath()));
+   this.channel = Channels.newChannel(new FileOutputStream(targetFile));
Contributor

Here I think we can use FileChannel.open instead.
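A minimal sketch of that suggestion, assuming the target file should be created or truncated on open (the open options are my assumption, as are the names `openForWrite` and `writeAll`): `FileChannel` already implements `WritableByteChannel`, so no `FileOutputStream` or `Channels.newChannel` wrapper is needed.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ChannelWriteSketch {
    /** Opens `target` for writing directly as a channel, with no stream in between. */
    static WritableByteChannel openForWrite(Path target) throws IOException {
        return FileChannel.open(target,
            StandardOpenOption.CREATE,
            StandardOpenOption.WRITE,
            StandardOpenOption.TRUNCATE_EXISTING);
    }

    /** Writes all of `data` to `target`, looping because write() may be partial. */
    static void writeAll(Path target, byte[] data) throws IOException {
        try (WritableByteChannel channel = openForWrite(target)) {
            ByteBuffer buf = ByteBuffer.wrap(data);
            while (buf.hasRemaining()) {
                channel.write(buf);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("download", ".tmp");
        try {
            writeAll(tmp, "shuffle-block".getBytes(StandardCharsets.UTF_8));
            System.out.println(new String(Files.readAllBytes(tmp), StandardCharsets.UTF_8));
        } finally {
            Files.delete(tmp);
        }
    }
}
```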

@@ -133,7 +132,7 @@ public Object convertToNetty() throws IOException {
  if (conf.lazyFileDescriptor()) {
    return new DefaultFileRegion(file, offset, length);
  } else {
-   FileChannel fileChannel = FileChannel.open(file.toPath(), StandardOpenOption.READ);
+   FileChannel fileChannel = new FileInputStream(file).getChannel();
Member Author

@jerryshao I think this is the only line that may reduce the memory pressure for external shuffle service. Right?

@@ -39,7 +39,7 @@ public ShuffleIndexInformation(File indexFile) throws IOException {
  offsets = buffer.asLongBuffer();
  DataInputStream dis = null;
  try {
-   dis = new DataInputStream(Files.newInputStream(indexFile.toPath()));
+   dis = new DataInputStream(new FileInputStream(indexFile));
Contributor

@zsxwing also here I think it will affect external shuffle service.

@jerryshao
Contributor

@zsxwing maybe we only need to fix above two points related to external shuffle service, what do you think?

@zsxwing
Member Author

zsxwing commented Jan 3, 2018

> @zsxwing maybe we only need to fix above two points related to external shuffle service, what do you think?

@jerryshao sgtm. Could you submit a PR?

@jerryshao
Contributor

OK, I will do it.

asfgit pushed a commit that referenced this pull request Jan 4, 2018
…ternal shuffle service

## What changes were proposed in this pull request?

This PR is the second attempt at #18684. The `InputStream` returned by NIO's Files API doesn't override the `skip` method, so it introduces a performance issue (mentioned in #20119). But using `FileInputStream`/`FileOutputStream` also introduces a memory issue (https://dzone.com/articles/fileinputstream-fileoutputstream-considered-harmful), which is severe for the long-running external shuffle service. So this proposal only fixes the code related to the external shuffle service.

## How was this patch tested?

Existing tests.

Author: jerryshao <sshao@hortonworks.com>

Closes #20144 from jerryshao/SPARK-21475-v2.

(cherry picked from commit 93f92c0)
Signed-off-by: Shixiong Zhu <zsxwing@gmail.com>