DRILL-5733: Unable to SELECT from parquet file with Hadoop 2.7.4 #1969

Closed

vvysotskyi
Member

DRILL-5733: Unable to SELECT from parquet file with Hadoop 2.7.4

Description

When a single parquet file was selected, Drill built the path for the metadata cache files as if the selection were a directory, which caused errors on Hadoop. This change adds a check that the parent path is a directory before constructing paths for the metadata cache files.
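As an illustration only, the guard described above can be sketched as follows. This is not the code from the PR: it uses `java.nio.file` instead of Hadoop's `FileSystem` so that it runs without a cluster, and the class name, method name, and cache file names are hypothetical.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

class MetadataPathSketch {
  // Hypothetical cache file names, used for illustration only.
  static final List<String> CACHE_FILENAMES = Arrays.asList(".cache_a", ".cache_b");

  // Builds candidate metadata cache paths only when the selection is a directory.
  // For a single selected file, no cache paths are constructed at all.
  static List<Path> cachePathsFor(Path selection) {
    if (!Files.isDirectory(selection)) {
      return Collections.emptyList();
    }
    return CACHE_FILENAMES.stream()
        .map(selection::resolve)
        .collect(Collectors.toList());
  }
}
```

With this guard, a selection that points at a single file never produces a path of the form `file/cache-name`, so no such path is ever handed to the file system.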

Documentation

NA

Testing

Checked manually on an Apache Hadoop cluster and verified that the fix doesn't break existing unit tests. No test is provided because the error does not reproduce even with MiniDFSCluster.

Member

@arina-ielchiieva arina-ielchiieva left a comment

Did you check with Hadoop 3 as well?

Thanks for the fix, please address some minor code review comments.

.map(filename -> new Path(p, filename))
.collect(Collectors.toList());
for (String filename : Metadata.OLD_METADATA_FILENAMES) {
// Read the older version of metadata file if the current version of metadata cache files do not exist.
Member

does not exist

Member Author

Thanks, fixed.

return metaFilepaths;
}
metaFilepaths.clear();
metaFilepaths.add(new Path(p, filename));
Member

Does this mean that on the last iteration we'll add the file name to metaFilepaths but never check whether it exists, and return an empty collection?

Member Author

No. After the loop, we check whether the metadata files from the last iteration exist, and return either the list of files or an empty collection if they do not exist.

Member

Ok, thanks. I missed the check at the end.
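For readers following this thread, the control flow under discussion can be sketched roughly as below. It is reconstructed from the diff fragments and the replies above, not copied from the PR: it uses `java.nio.file` instead of Hadoop's `FileSystem`, and the class name, the `allExist` helper, and all file names are hypothetical.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class MetadataFallbackSketch {
  // Hypothetical file names, used for illustration only.
  static final List<String> CURRENT = Arrays.asList(".current_a", ".current_b");
  static final List<String> OLD = Arrays.asList(".old_v2", ".old_v1");

  static List<Path> findMetadataFiles(Path dir) {
    // Start with the current-version metadata cache files.
    List<Path> candidates = new ArrayList<>();
    for (String name : CURRENT) {
      candidates.add(dir.resolve(name));
    }
    for (String oldName : OLD) {
      // Each iteration checks the candidates set up by the previous step.
      if (allExist(candidates)) {
        return candidates;
      }
      // Fall back to the next older metadata file.
      candidates.clear();
      candidates.add(dir.resolve(oldName));
    }
    // The candidate added by the final iteration is checked here, after the loop.
    if (allExist(candidates)) {
      return candidates;
    }
    return Collections.emptyList();
  }

  // Hypothetical helper: true only if every path in the list exists.
  static boolean allExist(List<Path> paths) {
    return paths.stream().allMatch(Files::exists);
  }
}
```

Because each iteration checks the candidates from the previous step, the check after the loop is what covers the last old file name. Without it, that final candidate would be added but never tested.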

Member Author

@vvysotskyi vvysotskyi left a comment

@arina-ielchiieva, thanks for the review, I have addressed CR comments.

@arina-ielchiieva
Member

+1, LGTM.

@vvysotskyi
Member Author

Sorry, I missed the question about Hadoop 3. I have also checked with Hadoop 3.2.0 (previously it was checked with Hadoop 2.8), and it works fine, even without this fix.

@asfgit asfgit closed this in 806760b Feb 12, 2020