
Add check for Hive 2.0 ACID versioned tables #2989

Closed

Conversation

harmandeeps (Member):

Fixes: #2790

findepi (Member) left a comment:

nice!

@@ -419,6 +419,18 @@ private void invokeNoMoreSplitsIfNecessary()
false,
true));

if (AcidUtils.isFullAcidTable(table.getParameters())) {
/**
findepi (Member):

It's an inline comment, not javadoc. Make it a one-line comment with // ...
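
For illustration, the change being asked for looks like this (the comment text itself is made up for the example):

// Before: a javadoc-style block used as an inline comment
/**
 * Reject tables written with ORC ACID version < 2.
 */

// After: a plain one-line comment
// Reject tables written with ORC ACID version < 2.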

harmandeeps (Member Author):

fixed.

directory.getCurrentDirectories().size() > 0 ? directory.getCurrentDirectories().get(0).getPath() : null;

if (baseOrDeltaPath != null && AcidUtils.OrcAcidVersion.getAcidVersionFromMetaFile(baseOrDeltaPath, fs) < 2) {
throw new PrestoException(NOT_SUPPORTED, String.format("Hive 2.0 versioned transactional tables are not supported, Please run major compaction in Hive 3.0"));
findepi (Member):

  • String.format is unnecessary.
  • we don't know whether this is Hive 2 or Hive 1. Maybe:
    "Hive transactional tables are supported with Hive 3.0 and only after a major compaction has been run"
    

harmandeeps (Member Author):

Rephrased a little, please take a look.

findepi (Member):

My proposal was:

Hive transactional tables are supported with Hive 3.0 and only after a major compaction has been run

Currently it's:

Hive version <= 2.0 transactional tables are not supported, Please run major compaction in Hive 3.0

The problem with the current wording is that an exception message should be "declarative", stating the problem.

@electrum can you suggest wording here?

Path baseOrDeltaPath = directory.getBaseDirectory() != null ? directory.getBaseDirectory() :
directory.getCurrentDirectories().size() > 0 ? directory.getCurrentDirectories().get(0).getPath() : null;

if (baseOrDeltaPath != null && AcidUtils.OrcAcidVersion.getAcidVersionFromMetaFile(baseOrDeltaPath, fs) < 2) {
findepi (Member):

The getAcidVersionFromMetaFile call is not wrapped in hdfsEnvironment.doAs, so I'd expect this to fail in a kerberized env. Yet the product tests passed. Can you check whether this gets run with a kerberized HDFS environment?

harmandeeps (Member Author):

getAcidVersionFromMetaFile takes an fs (FileSystem) object, which gets initialized via hdfsAuthentication.doAs() in hdfsEnvironment.getFileSystem(). Therefore, it is not failing.
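
A minimal sketch of the pattern being described: the FileSystem handle is created inside a doAs, so later calls that only use that handle (such as getAcidVersionFromMetaFile) run against an already-authenticated client. The class and method below are simplified assumptions, not the actual Presto types:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.UserGroupInformation;

import java.io.IOException;
import java.security.PrivilegedExceptionAction;

class HdfsEnvironmentSketch
{
    // Authentication happens once, when the FileSystem handle is created...
    static FileSystem getFileSystem(UserGroupInformation ugi, Path path, Configuration conf)
            throws IOException, InterruptedException
    {
        return ugi.doAs((PrivilegedExceptionAction<FileSystem>) () -> path.getFileSystem(conf));
    }

    // ...so a later call like getAcidVersionFromMetaFile(baseOrDeltaPath, fs)
    // reuses the authenticated handle and needs no doAs of its own.
}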

Files.write(file.toPath(), "2".getBytes(UTF_8));
}
else {
Files.write(file.toPath(), "test".getBytes(UTF_8));
findepi (Member):

This isn't valid ORC content, so it's better to just use an empty file.
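
A sketch of the suggested fix, assuming _orc_acid_version is the marker file that getAcidVersionFromMetaFile reads (UTF_8 and assertTrue are static imports in the original test):

if (file.getName().equals("_orc_acid_version")) {
    // the version marker must still contain the ACID version number
    Files.write(file.toPath(), "2".getBytes(UTF_8));
}
else {
    // "test" isn't valid ORC anyway, so an empty file is enough for this check
    assertTrue(file.createNewFile(), "Failed to create file " + file);
}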

harmandeeps (Member Author):

fixed.

assertTrue(file.createNewFile(), "Failed to create file");
writeDeltaFileData(file);
findepi (Member):

createNewFile on the line above is now redundant

harmandeeps (Member Author):

fixed.

@@ -496,7 +495,7 @@ public void testHive2VersionedFullAcidTableFails()
for (String path : filePaths) {
File file = new File(path);
assertTrue(file.getParentFile().exists() || file.getParentFile().mkdirs(), "Failed creating directory " + file.getParentFile());
writeDeltaFileData(file);
assertTrue(writeFile(file), "Failed to write file");
findepi (Member):

I liked your previous API better:

writeDeltaFileData(file);

make the method void again and add the assert inside, just for the file.createNewFile() case
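
A sketch of the requested API shape (the marker-file branch is an assumption based on the snippets earlier in this thread):

private static void writeDeltaFileData(File file)
        throws IOException
{
    if (file.getName().equals("_orc_acid_version")) {
        Files.write(file.toPath(), "2".getBytes(UTF_8));
    }
    else {
        // the assert lives inside, just for the createNewFile() case
        assertTrue(file.createNewFile(), "Failed to create file " + file);
    }
}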

harmandeeps (Member Author):

Saw that you already fixed and merged it, thank you.

Files.write(file.toPath(), "2".getBytes(UTF_8));
}
else if (file.getName().startsWith("000000_")) {
Files.write(file.toPath(), "test".getBytes(UTF_8));
findepi (Member):

I think I don't understand this. Was my comment #2989 (comment) unsound?

I don't understand why "test" is valid content for 000000_ files but not for 000001_ files. Aren't most of these files ORC?

harmandeeps (Member Author) commented Mar 6, 2020:

The test testFullAcidTableWithOriginalFilesFails started failing when we don't write anything to the original file (000000_). In the test, the only original file is '000000_0', which is why I hard-coded the prefix. But even "test" is not correct ORC format for original files.
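
Piecing the thread together, the final helper plausibly ends up along these lines (a reconstruction for illustration, not the merged code; the method name is hypothetical):

private static void writeTestFileContent(File file)
        throws IOException
{
    if (file.getName().equals("_orc_acid_version")) {
        // version marker read by getAcidVersionFromMetaFile
        Files.write(file.toPath(), "2".getBytes(UTF_8));
    }
    else if (file.getName().startsWith("000000_")) {
        // "original" (pre-ACID) files need some content, or
        // testFullAcidTableWithOriginalFilesFails starts failing
        Files.write(file.toPath(), "test".getBytes(UTF_8));
    }
    else {
        // delta files can stay empty; this check never parses them as ORC
        assertTrue(file.createNewFile(), "Failed to create file " + file);
    }
}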

findepi (Member) commented Mar 6, 2020:

Merged, thanks!

findepi closed this on Mar 6, 2020
findepi added this to the 331 milestone on Mar 6, 2020
dain mentioned this pull request on Mar 6, 2020
Development

Successfully merging this pull request may close these issues:

Fail queries explicitly on ORC ACID tables before Hive 3