[ALLUXIO-262] Cache incomplete blocks during file seek #3089

Merged
24 commits merged into Alluxio:master from the CacheBlock branch on May 10, 2016

Conversation

peisun1115
Contributor

@AmplabJenkins

Merged build finished. Test FAILed.

@peisun1115
Contributor Author

IntelliJ auto-formatting added some changes to this PR. If you feel they are misleading for code review, I can revert them.

@AmplabJenkins

Merged build finished. Test FAILed.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Alluxio-Pull-Request-Builder/9449/

Failed Tests: 1

org.alluxio:alluxio-core-client: 1


Test FAILed.

@AmplabJenkins

Merged build finished. Test FAILed.

@AmplabJenkins

Merged build finished. Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Alluxio-Pull-Request-Builder/9451/
Test PASSed.

@@ -78,6 +78,8 @@
  private boolean mClosed;
  /** Whether or not the current block should be cached. */
  private boolean mShouldCacheCurrentBlock;
+ /** Include incomplete blocks if Alluxio is configured to store blocks in Alluxio storage. */
+ private boolean mShouldCacheIncompleteBlock;
Contributor

should this be final?

Contributor Author

done
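
For illustration, a minimal sketch of what making the flag `final` could look like. The class name `SeekCachingStream` and its constructor parameter are hypothetical and not taken from this PR; the real `FileInStream` obtains its options differently.

```java
/** Hypothetical stand-in for the stream class; not the actual FileInStream. */
final class SeekCachingStream {
  /** Whether to cache incomplete blocks. Set once at construction, so it can be final. */
  private final boolean mShouldCacheIncompleteBlock;

  SeekCachingStream(boolean cacheIncompleteBlock) {
    mShouldCacheIncompleteBlock = cacheIncompleteBlock;
  }

  /** Returns the value fixed at construction time. */
  boolean shouldCacheIncompleteBlock() {
    return mShouldCacheIncompleteBlock;
  }
}
```

Because the field is assigned exactly once in the constructor, the compiler rejects any later reassignment.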

@gpang
Contributor

gpang commented Apr 27, 2016

Thanks @peisun1115 for this feature! I left some comments.

}

/**
* Seeks to a file position. Blocks are cached even if they are fully read. This is only called by
Contributor

If they are not fully read?

Contributor Author

done

@calvinjia
Contributor

@peisun1115 is the intended behavior consistent with the ticket? It seems that with this option a block will not be cached if the read covers only a small portion of it. Also, if the user seeks to a position inside a large file, all the blocks before that position will need to be cached, which is not the intention, right?
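
To make the second concern concrete, here is a small sketch of the block arithmetic behind a forward seek. It is illustrative only: `SeekBlockMath`, `blockIndex`, and `blocksSkipped` are hypothetical names, and nothing here is taken from the PR's implementation or says which blocks it actually caches.

```java
/** Illustrative arithmetic only; not code from this PR. */
final class SeekBlockMath {
  private SeekBlockMath() {}

  /** Index of the block that contains byte offset pos. */
  static long blockIndex(long pos, long blockSize) {
    if (pos < 0 || blockSize <= 0) {
      throw new IllegalArgumentException("pos must be >= 0 and blockSize > 0");
    }
    return pos / blockSize;
  }

  /** Number of whole blocks lying strictly between the blocks of from and to. */
  static long blocksSkipped(long from, long to, long blockSize) {
    if (to < from) {
      throw new IllegalArgumentException("only forward seeks are modeled here");
    }
    long gap = blockIndex(to, blockSize) - blockIndex(from, blockSize);
    // Adjacent blocks (gap of 1) or the same block (gap of 0) skip nothing.
    return Math.max(0L, gap - 1);
  }
}
```

With a 100-byte block size, a seek from offset 50 to offset 950 moves from block 0 to block 9 and passes over 8 blocks in between; the question above is whether those intermediate blocks would have to be cached.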

@@ -78,6 +78,8 @@
  private boolean mClosed;
  /** Whether or not the current block should be cached. */
  private boolean mShouldCacheCurrentBlock;
+ /** Include incomplete blocks if Alluxio is configured to store blocks in Alluxio storage. */
Contributor

The javadoc is a bit unclear. Do you mean:

Whether to store incomplete blocks in Alluxio?

Contributor Author

done

@peisun1115
Contributor Author

@gpang @apc999 @calvinjia

I explained this PR and compared it with the other approach (prefetch every block) in https://docs.google.com/document/d/1mR1_NPt60smz5nWMAyeUZfXNllrKKmc39CaUORb1oPg/edit

I can change this PR to do prefetching if you guys prefer. I will address the comments once we agree on the approach.

@AmplabJenkins

Merged build finished. Test PASSed.

@@ -345,7 +352,7 @@ private void checkAndAdvanceBlockInStream() throws IOException {
  try {
  WorkerNetAddress address = mLocationPolicy.getWorkerForNextBlock(
  mContext.getAlluxioBlockStore().getWorkerInfoList(), getBlockSizeAllocation(mPos));
- // Don't cache the block to somewhere that already has it.
+ // Don't cache the block to somewhere that already has it.
Contributor

I think this should be changed back?

Contributor Author

done

@gpang
Contributor

gpang commented May 6, 2016

@peisun1115 Thanks! I left a few comments.

@peisun1115
Contributor Author

@gpang I have addressed your comments.

@AmplabJenkins

Merged build finished. Test FAILed.

@AmplabJenkins

Merged build finished. Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Alluxio-Pull-Request-Builder/9652/
Test PASSed.

@gpang
Contributor

gpang commented May 9, 2016

Thanks @peisun1115 , I just had one last question.

@peisun1115
Contributor Author

@gpang fixed

@AmplabJenkins

Merged build finished. Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Alluxio-Pull-Request-Builder/9657/
Test PASSed.

if (mPos == pos) {
return;
}

Contributor

If we reach here, that means we are seeking backwards, right?
If yes, please javadoc this.

Contributor Author

Done. FYI, I am refactoring this file to make it clearer. I find FileInStream a little hard to follow, and it is easy to make mistakes when changing it.
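
As a rough sketch of the case analysis discussed in this thread, the following classifies a seek by comparing the current and target positions. `SeekDirectionSketch`, `Direction`, and `classify` are hypothetical names introduced here; the actual branch order and handling inside `FileInStream` are not shown in this conversation.

```java
/** Hypothetical helper that names the three seek cases; not code from this PR. */
final class SeekDirectionSketch {
  enum Direction { NONE, FORWARD, BACKWARD }

  private SeekDirectionSketch() {}

  /** Classifies a seek from currentPos to targetPos. */
  static Direction classify(long currentPos, long targetPos) {
    if (targetPos == currentPos) {
      return Direction.NONE; // nothing to do, mirrors the early return above
    }
    // Any remaining target below the current position is a backward seek.
    return targetPos > currentPos ? Direction.FORWARD : Direction.BACKWARD;
  }
}
```

Naming the cases explicitly is one way to make the "we only get here on a backward seek" assumption visible to readers instead of leaving it implicit in the control flow.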

@apc999
Contributor

apc999 commented May 9, 2016

I left two post-LGTM comments. Once they are fixed or answered, LGTM from me.

@peisun1115
Contributor Author

@apc999 Fixed. Thank you.

@AmplabJenkins

Merged build finished. Test FAILed.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Alluxio-Pull-Request-Builder/9675/

Failed Tests: 1

org.alluxio:alluxio-tests: 1


Test FAILed.

@peisun1115
Contributor Author

Jenkins, test this please

@AmplabJenkins

Merged build finished. Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Alluxio-Pull-Request-Builder/9678/
Test PASSed.

@gpang
Contributor

gpang commented May 10, 2016

LGTM

@calvinjia
Contributor

LGTM

@calvinjia calvinjia merged commit 2451a6c into Alluxio:master May 10, 2016
@peisun1115 peisun1115 deleted the CacheBlock branch May 11, 2016 01:06

5 participants