ORC-477: BloomFilter for ACID table does not get created #373

deniskuzZ · 2019-03-08T13:57:44Z

No description provided.

t3rmin4t0r · 2019-03-08T20:53:42Z

java/core/src/java/org/apache/orc/OrcUtils.java

      for (String column: selectedColumns.split((","))) {
-        TypeDescription col = findColumn(column, fieldNames, fields);
+        TypeDescription col = findColumn((isAcid ? "row." + column : column), fieldNames, fields);


This code should ideally be in the Hive codebase and ORC should interpret the selectedColumns there.

The ACID schema generation code lives in ORC's SchemaEvolution class:

public static TypeDescription createEventSchema(TypeDescription typeDescr) {
  TypeDescription result = TypeDescription.createStruct()
      .addField("operation", TypeDescription.createInt())
      .addField("originalTransaction", TypeDescription.createLong())
      .addField("bucket", TypeDescription.createInt())
      .addField("rowId", TypeDescription.createLong())
      .addField("currentTransaction", TypeDescription.createLong())
      .addField("row", typeDescr.clone());
  return result;
}

I am thinking of simplifying this a bit by introducing an extra parameter (isAcid) in the findColumn method.

if (resultTypeDesc.getCategory() == TypeDescription.Category.STRUCT && isAcid) {
  // unfold
  resultTypeDesc = findColumn(columnName, resultTypeDesc.getFieldNames(), resultTypeDesc.getChildren(), false);
}
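
For readers following the thread, the isAcid idea can be sketched in isolation. This is only an illustration under stated assumptions: SchemaNode and AcidColumnLookup are hypothetical stand-ins rather than ORC's TypeDescription or OrcUtils, and returning null for a missing column is a simplification that may not match the real code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a schema node; this is not ORC's TypeDescription. */
final class SchemaNode {
  final String name;                                   // field name within the parent struct
  final List<SchemaNode> children = new ArrayList<>(); // empty for primitive columns

  SchemaNode(String name) { this.name = name; }

  SchemaNode add(SchemaNode child) { children.add(child); return this; }

  boolean isStruct() { return !children.isEmpty(); }
}

final class AcidColumnLookup {
  /** Mirrors the shape built by createEventSchema: metadata fields plus a "row" struct. */
  static SchemaNode wrapAsAcidEvent(SchemaNode... userColumns) {
    SchemaNode row = new SchemaNode("row");
    for (SchemaNode column : userColumns) {
      row.add(column);
    }
    return new SchemaNode("event")
        .add(new SchemaNode("operation"))
        .add(new SchemaNode("originalTransaction"))
        .add(new SchemaNode("bucket"))
        .add(new SchemaNode("rowId"))
        .add(new SchemaNode("currentTransaction"))
        .add(row);
  }

  /**
   * Looks up columnName among fields. With isAcid set, the search first
   * locates the "row" struct, then repeats inside it with isAcid cleared.
   */
  static SchemaNode findColumn(String columnName, List<SchemaNode> fields, boolean isAcid) {
    String target = isAcid ? "row" : columnName;
    for (SchemaNode field : fields) {
      if (field.name.equalsIgnoreCase(target)) {
        if (!isAcid) {
          return field;
        }
        // unfold: the user's columns are nested under "row"
        return field.isStruct() ? findColumn(columnName, field.children, false) : null;
      }
    }
    return null; // assumption: the real method may signal a missing column differently
  }

  public static void main(String[] args) {
    SchemaNode event = wrapAsAcidEvent(new SchemaNode("id"), new SchemaNode("name"));
    System.out.println(findColumn("name", event.children, false) != null); // false
    System.out.println(findColumn("name", event.children, true) != null);  // true
  }
}
```

Without the flag, the lookup only sees the top-level event fields, which is consistent with the bloom filter not being created for user columns of an ACID table.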

t3rmin4t0r · 2019-03-08T20:55:04Z

java/core/src/java/org/apache/orc/OrcUtils.java

      } else {
+        if (resultTypeDesc.getCategory() == TypeDescription.Category.STRUCT
+                && fieldName.equalsIgnoreCase(StringUtils.substringBefore(columnName, DOT))) {


This should already work if Hive sends "row." in selected columns?

Not completely: the column descriptions are embedded under the row struct, so we need to unfold it.
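
To illustrate the reply above: a name such as "row.name" never equals any top-level field name, so a plain name comparison fails even when the prefix is sent. A minimal sketch of resolving the path one segment at a time follows. StructField and DottedColumnResolver are hypothetical names, and String.indexOf stands in for the StringUtils.substringBefore call in the diff.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical schema field; an empty children map means a primitive column. */
final class StructField {
  final Map<String, StructField> children = new LinkedHashMap<>();

  StructField with(String name, StructField child) {
    children.put(name, child);
    return this;
  }
}

final class DottedColumnResolver {
  /** Resolves a path such as "row.name" segment by segment; null when a segment is missing. */
  static StructField resolve(String path, StructField struct) {
    int dot = path.indexOf('.');
    String head = dot < 0 ? path : path.substring(0, dot); // text before the first dot
    for (Map.Entry<String, StructField> entry : struct.children.entrySet()) {
      if (entry.getKey().equalsIgnoreCase(head)) {
        // last segment: this is the column; otherwise descend into the nested struct
        return dot < 0
            ? entry.getValue()
            : resolve(path.substring(dot + 1), entry.getValue());
      }
    }
    return null;
  }

  public static void main(String[] args) {
    StructField row = new StructField()
        .with("id", new StructField())
        .with("name", new StructField());
    StructField event = new StructField()
        .with("operation", new StructField())
        .with("row", row);
    System.out.println(resolve("row.name", event) != null); // true
    System.out.println(resolve("name", event) != null);     // false
  }
}
```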

deniskuzZ · 2019-03-09T18:52:32Z

@t3rmin4t0r do you know how to configure the AppVeyor build?

The build phase is set to "MSBuild" mode (default), but no Visual Studio project or solution files were found in the root directory. If you are not building Visual Studio project switch build mode to "Script" and provide your custom build command.

deniskuzZ force-pushed the branch-1.5 branch 3 times, most recently from bd0d862 to 5e19bdb on March 8, 2019 20:40

t3rmin4t0r reviewed Mar 8, 2019


deniskuzZ force-pushed the branch-1.5 branch 3 times, most recently from 6fbd9b0 to 325d582 on March 9, 2019 18:25

deniskuzZ force-pushed the branch-1.5 branch 3 times, most recently from caa49c2 to e2db246 on March 11, 2019 22:10

ORC-477: BloomFilter for ACID table does not get created

c0447cf

deniskuzZ force-pushed the branch-1.5 branch from e2db246 to c0447cf on March 11, 2019 23:24

prasanthj merged commit f5071f2 into apache:branch-1.5 Mar 22, 2019
