PHOENIX-5932 View Index rebuild results in surplus rows from other vi… by abhishek-chouhan · Pull Request #797 · apache/phoenix

abhishek-chouhan · 2020-06-04T23:14:47Z

…ew indexes

gjacoby126 · 2020-06-04T23:52:43Z

phoenix-core/src/it/java/org/apache/phoenix/end2end/IndexToolIT.java

+                int numDeletes = 0;
+                for (Result result = scanner.next(); result != null; result = scanner.next()) {
+                    for (Cell cell : result.rawCells()) {
+                        if (KeyValue.Type.codeToType(cell.getTypeByte()) == KeyValue.Type.Put) {


Note for when you port to master: in HBase 2.x Cell has a getType() method that returns the Cell.Type enum. We should avoid using KeyValue wherever possible because it's IA.Private.

gjacoby126 · 2020-06-04T23:53:47Z

phoenix-core/src/it/java/org/apache/phoenix/end2end/IndexToolIT.java

+    }
+
+    @Test
+    public void testUpdatableViewIndex2() throws Exception {


More descriptive name, please.

Tried a more descriptive name in the latest commit :)

gjacoby126 · 2020-06-04T23:57:08Z

phoenix-core/src/it/java/org/apache/phoenix/end2end/IndexToolIT.java

+        Properties props = PropertiesUtil.deepCopy(TEST_PROPERTIES);
+
+        try (Connection conn = DriverManager.getConnection(getUrl(), props)) {
+            // Create Table and Views


Good to have a comment explaining what about the schemas is important to the test. (I assume that they're filtering on a non-PK, non-indexed column?)

Done. Yes, the point of having 2 tests is to exercise the two different filters that end up being used. One test has a view on a non-leading part of the PK; the other has a view on a non-PK column.

gjacoby126 · 2020-06-05T00:00:38Z

phoenix-core/src/it/java/org/apache/phoenix/end2end/IndexToolIT.java

+            rs.next();
+            assertEquals(2, rs.getInt(1));
+            try (org.apache.hadoop.hbase.client.Connection hcon =
+                    ConnectionFactory.createConnection(config)) {


You can factor out the cell counting into its own helper function to avoid duplication between the 2 tests. (TestUtil.getRawCellCount may also be useful if you can extend it to also keep track of what Cell types are scanned.)
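A minimal sketch of what such a shared helper could look like, written against stand-in types so it runs without an HBase cluster. `CellKind`, `CellTypeCounter`, and `countByKind` are hypothetical names, not part of Phoenix or HBase; in the real test the inner loop would walk `result.rawCells()` and classify each cell by its type.

```java
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

/** Simplified stand-in for HBase cell types; NOT the real HBase enum. */
enum CellKind { PUT, DELETE, DELETE_COLUMN, DELETE_FAMILY }

/** Hypothetical helper that both tests could share instead of duplicating the loop. */
final class CellTypeCounter {
    private CellTypeCounter() {}

    /** Tallies how many cells of each kind appear across all scanned rows. */
    static Map<CellKind, Integer> countByKind(List<List<CellKind>> rows) {
        Map<CellKind, Integer> counts = new EnumMap<>(CellKind.class);
        for (List<CellKind> row : rows) {
            for (CellKind kind : row) {
                // Kinds that never occur are simply absent from the map.
                counts.merge(kind, 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

Each test would then assert on the returned map (for example, the expected number of puts and deletes) rather than maintaining its own counters.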

gjacoby126 · 2020-06-05T00:04:16Z

phoenix-core/src/main/java/org/apache/phoenix/coprocessor/UngroupedAggregateRegionObserver.java

            rawScan.setMaxVersions();
            rawScan.getFamilyMap().clear();
-            rawScan.setFilter(null);
+            if (scan.getFilter() instanceof FirstKeyOnlyFilter) {


Please add a comment to explain why FirstKeyOnlyFilter is a special case. If the rebuild index scan is explicitly asking for only the first keyvalue, why do we avoid using the AllVersions filter which also only gives the first keyvalue?

And if there is a reason not to use the FirstKeyOnlyFilter, are we still OK with using the AllVersionsIndexRebuildFilter if the Scan's filter it will delegate to is a composite filter which contains a FirstKeyOnlyFilter?

The AllVersions filter does not give only the first key value. Its purpose is to make sure that all versions of a column are returned (when matched by the underlying supplied filter), instead of just one. The filters used in normal queries (which also end up being used for rebuild, since we use select count(*)) usually return only 1 version of a column. In a rebuild we want to return all versions, hence this filter.

gjacoby126 · 2020-06-05T00:05:20Z

phoenix-core/src/main/java/org/apache/phoenix/coprocessor/UngroupedAggregateRegionObserver.java

+                rawScan.setFilter(null);
+            } else if (scan.getFilter() != null) {
+                rawScan.setFilter(new AllVersionsIndexRebuildFilter(scan.getFilter()));
+            }


Do we need to do any special filter logic down at line 1099 in the else block if the Scan was raw in the first place?

AFAIK we get a raw scan here only in the case of the old design and partial rebuild (correct me if I'm wrong here, @kadirozde). I didn't want to mess with the old design and hence only made the changes for the new one.

Yes, we get raw scan only for the old design partial rebuilds (i.e., auto-rebuilds).

gjacoby126 · 2020-06-05T00:11:22Z

phoenix-core/src/main/java/org/apache/phoenix/filter/AllVersionsIndexRebuildFilter.java

+    public ReturnCode filterKeyValue(Cell v) throws IOException {
+        ReturnCode delegateCode = super.filterKeyValue(v);
+        if (delegateCode == ReturnCode.INCLUDE_AND_NEXT_COL) {
+            return ReturnCode.INCLUDE;


Is this simulating the effects of a FirstKeyOnlyFilter? A comment would be good.

swaroopak · 2020-06-05T00:24:35Z

phoenix-core/src/it/java/org/apache/phoenix/end2end/IndexToolIT.java

    }

+    @Test
+    public void testUpdatableViewIndex() throws Exception {


Please move this and other test to IndexToolForNonTxGlobalIndexIT

gokceni · 2020-06-05T18:50:17Z

phoenix-core/src/main/java/org/apache/phoenix/filter/AllVersionsIndexRebuildFilter.java

+    @Override
+    public ReturnCode filterKeyValue(Cell v) throws IOException {
+        ReturnCode delegateCode = super.filterKeyValue(v);
+        if (delegateCode == ReturnCode.INCLUDE_AND_NEXT_COL) {


Could you add a comment here why we convert it to INCLUDE? Why are we not happy with NEXT_COL?

@gokceni NEXT_COL skips this column and goes to the next one. What we want is this: when the underlying filter says yes to a column, we say yes too, but instead of jumping to the next column because we already got a value, we keep going so that we get all versions.

Yes, I agree, comment in the code would be helpful here. @abhishek-chouhan
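The return-code conversion discussed in this thread can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions: `Code`, `CellFilter`, `AllVersionsModel`, and the toy `scan` loop are hypothetical stand-ins and not the HBase `Filter` API or the actual `AllVersionsIndexRebuildFilter`. It shows that a delegate answering INCLUDE_AND_NEXT_COL yields one version per column, while the wrapped filter yields every version.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified stand-in for a filter return code; only the codes needed here. */
enum Code { INCLUDE, INCLUDE_AND_NEXT_COL, NEXT_COL, SKIP }

/** Hypothetical single-method filter, modelled loosely on filterKeyValue. */
interface CellFilter {
    Code filter(String column, int version);
}

final class AllVersionsModel {
    private AllVersionsModel() {}

    /** Wraps a delegate so "include, then jump to next column" becomes plain "include". */
    static CellFilter allVersions(CellFilter delegate) {
        return (column, version) -> {
            Code code = delegate.filter(column, version);
            return code == Code.INCLUDE_AND_NEXT_COL ? Code.INCLUDE : code;
        };
    }

    /** Toy scan over one row: cells are ordered by column, newest version first. */
    static List<String> scan(String[] columns, int versionsPerColumn, CellFilter filter) {
        List<String> out = new ArrayList<>();
        for (String column : columns) {
            for (int v = versionsPerColumn; v >= 1; v--) {
                Code code = filter.filter(column, v);
                if (code == Code.INCLUDE || code == Code.INCLUDE_AND_NEXT_COL) {
                    out.add(column + "@v" + v);
                }
                if (code == Code.INCLUDE_AND_NEXT_COL || code == Code.NEXT_COL) {
                    break; // remaining versions of this column are skipped
                }
            }
        }
        return out;
    }
}
```

With a delegate that accepts only column "A" via INCLUDE_AND_NEXT_COL, the unwrapped scan stops after the newest version of "A", whereas the wrapped scan continues through the older versions. Columns the delegate rejects stay rejected in both cases, because every other return code is passed through unchanged.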

gjacoby126

+1

kadirozde

+1, Thanks

abhishek-chouhan requested review from gjacoby126 and kadirozde June 4, 2020 23:14

abhishek-chouhan force-pushed the PHOENIX-5932 branch from 1c8bb6a to f781099 Compare June 4, 2020 23:31

gjacoby126 requested changes Jun 5, 2020

swaroopak reviewed Jun 5, 2020

PHOENIX-5932 View Index rebuild results in surplus rows from other vi…

c238805

…ew indexes

abhishek-chouhan force-pushed the PHOENIX-5932 branch from f781099 to c238805 Compare June 5, 2020 05:39

gokceni reviewed Jun 5, 2020

gjacoby126 approved these changes Jun 8, 2020

kadirozde approved these changes Jun 8, 2020

abhishek-chouhan closed this Jun 9, 2020

5 participants