-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PHOENIX-6247 Separating logical and physical table names #1170
Conversation
💔 -1 overall
This message was automatically generated. |
6af705e
to
ff6b4a1
Compare
💔 -1 overall
This message was automatically generated. |
ff6b4a1
to
d0da90d
Compare
💔 -1 overall
This message was automatically generated. |
d0da90d
to
6599570
Compare
💔 -1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Initial comments -- reviewed about 50% so far.
assertEquals(0, exitCode); | ||
|
||
ResultSet rs = stmt.executeQuery("SELECT * FROM " + tableName); | ||
assertFalse(rs.next()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand this check -- how can SELECT * FROM FOO return no rows but SELECT * FROM FOO WHERE return a row?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The import tool is actually importing on the index table without touching the data table, that is why data table is empty but index table is not empty. I think it will make more sense if I remove --index-table param and import it on the data table and check that the data table whose physical name is changed got populated. And then to check index, change the physical index table name too and then use --index-table param.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This also assume that the Phoenix query planner chooses a specific plan. Better to hint the query with /*+ NO_INDEX */ if you do not want the index to be used.
phoenix-core/src/it/java/org/apache/phoenix/end2end/LogicalTableNameIT.java
Outdated
Show resolved
Hide resolved
props.put(QueryServices.MUTATE_BATCH_SIZE_ATTRIB, Integer.toString(3000)); | ||
//When we run all tests together we are using global cluster(driver) | ||
//so to make drop work we need to re register driver with DROP_METADATA_ATTRIB property | ||
destroyDriver(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than doing this manually, should this be a NeedsOwnCluster test?
phoenix-core/src/it/java/org/apache/phoenix/end2end/LogicalTableNameIT.java
Show resolved
Hide resolved
createIndexOnTable(conn, fullTableName, indexName); | ||
} | ||
|
||
SingleCellIndexIT.dumpTable(fullNewTableName); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider using TestUtil.dumpTable. Also, do we need to output to stdout in the final version of this? Can probably be cut.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, we don't need it in the end. This one is simpler to call since it just takes the string tablename. The other one requires HTableInterface.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before merging could you please remove?
phoenix-core/src/main/java/org/apache/phoenix/util/PhoenixRuntime.java
Outdated
Show resolved
Hide resolved
phoenix-core/src/main/java/org/apache/phoenix/schema/PMetaDataImpl.java
Outdated
Show resolved
Hide resolved
phoenix-core/src/main/java/org/apache/phoenix/query/ConnectionQueryServicesImpl.java
Outdated
Show resolved
Hide resolved
phoenix-core/src/main/java/org/apache/phoenix/mapreduce/index/IndexScrutinyMapper.java
Outdated
Show resolved
Hide resolved
phoenix-core/src/main/java/org/apache/phoenix/coprocessor/MetaDataEndpointImpl.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Initial review. Since started some time back, ignore the duplicates with Geoffrey's if any :)
phoenix-core/src/main/java/org/apache/phoenix/util/PhoenixRuntime.java
Outdated
Show resolved
Hide resolved
boolean isNamespaceMapped) { | ||
// Create global sequence of the form: <prefixed base table name>. | ||
// We can't use a tenant-owned or escaped sequence because of collisions, | ||
// with other view indexes that may be global or owned by other tenants that | ||
// also use this same physical view index table. It's also much easier | ||
// to cleanup when the physical table is dropped, as we can delete | ||
// all global sequences leading with <prefix> + physical name. | ||
String schemaName = getViewIndexSequenceSchemaName(physicalName, isNamespaceMapped); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the comment above won't be relevant anymore?
phoenix-core/src/main/java/org/apache/phoenix/iterate/BaseResultIterators.java
Outdated
Show resolved
Hide resolved
phoenix-core/src/main/java/org/apache/phoenix/mapreduce/index/IndexScrutinyMapper.java
Outdated
Show resolved
Hide resolved
phoenix-core/src/main/java/org/apache/phoenix/schema/PTableImpl.java
Outdated
Show resolved
Hide resolved
@@ -328,6 +328,10 @@ private static boolean isExistingTableMappedToPhoenixName(String name) { | |||
SEPARATOR_BYTE_ARRAY, Bytes.toBytes(familyName)); | |||
} | |||
|
|||
public static PName getTableName(PName schemaName, PName tableName) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: getTablePName?
phoenix-core/src/main/java/org/apache/phoenix/schema/PMetaDataImpl.java
Outdated
Show resolved
Hide resolved
phoenix-core/src/it/java/org/apache/phoenix/end2end/LogicalTableNameIT.java
Outdated
Show resolved
Hide resolved
💔 -1 overall
This message was automatically generated. |
8466cbd
to
7e48c23
Compare
💔 -1 overall
This message was automatically generated. |
f49ab4c
to
32312f3
Compare
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
32312f3
to
9577961
Compare
💔 -1 overall
This message was automatically generated. |
9577961
to
3c32444
Compare
💔 -1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor comments, rest LGTM.
phoenix-core/src/it/java/org/apache/phoenix/end2end/CsvBulkLoadToolIT.java
Outdated
Show resolved
Hide resolved
assertEquals(cnt, expected.size()); | ||
} | ||
|
||
public static void renameAndDropPhysicalTable(Connection conn, String tenantId, String schema, String tableName, String physicalName) throws Exception { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this be broken into 2 and call them "assignNewPhysicalTable" and "dropOldPhysicalTable"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a case right now that requires separate methods. It is a good test to do the switch and drop so that we are sure the old table is not used. In the future, I will consider separating.
phoenix-core/src/it/java/org/apache/phoenix/end2end/LogicalTableNameIT.java
Outdated
Show resolved
Hide resolved
if (parentTable == null) { | ||
physicalTables.add(famName); | ||
// If this is a view index, then one of the link is IDX_VW -> _IDX_ PhysicalTable link. Since famName is _IDX_ and we can't get this table hence it is null, we need to use actual view name | ||
parentLogicalName = (tableType == INDEX ? SchemaUtil.getTableName(parentSchemaName, parentTableName) : famName); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
somthing like INDEX.equalsIgnoreCase(tableType) instad of ==
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
INDEX is PTableType not string
3c32444
to
022f222
Compare
@ChinmaySKulkarni - this is a significant revision of how Phoenix approaches metadata naming by @gokceni , would appreciate your opinion as well. |
SequenceKey key = MetaDataUtil.getViewIndexSequenceKey(tenantIdStr, physicalName, | ||
// parentTable is parent of the view index which is the view. | ||
// Since parent is the view, the parentTable.getParentLogicalName() returns the logical full name of the base table | ||
PName parentName = parentTable.getParentLogicalName(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit confused here. In the original logic, we passed in a physical name, but now we pass in a logical name if parentTable (the view) has one defined and a physical name if it doesn't. From the design doc you shared with me, sounds like it should usually be a constant logical name?
It's actually really important that we use the same table name here in all cases -- that all indexes on all views on a particular physical base table use the same sequence to generate view index ids. Otherwise you can get collisions between view index ids.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was only one place calling this function and it was using parentTable.getPhysicalName(); as physicalName. That mapped to full name of base table like SC1.TBL_1. getParentLogicalName returns the same now. But parentTable.getPhysicalName returns SC1.NEW_PHYSICALNAME_TBL1 which we don't.
I run all view related IT tests. Is there any other tests you would like me to run to check for collusions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Basically if you change the physical name of a table, this function is going to use the logical name instead of physical name of the base table. So the sequence will not go back to the beginning. In the next pr, I will create a view index, rename the table and create another view index to see that it is not colliding. This should be enough right?
@@ -583,6 +583,7 @@ private static int getReservedQualifier(byte[] bytes, int offset, int length) { | |||
PName getName(); | |||
PName getSchemaName(); | |||
PName getTableName(); | |||
PName getPhysicalTableName(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Distinction between getPhysicalTableName and getPhysicalName can be confusing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed getPhysicalTableName as getPhysicalTableNameColumnInSyscat. getPhysicalName is inferred and I could have renamed it but that would touch a lot of files.
@@ -1559,7 +1586,29 @@ public PName getParentTableName() { | |||
@Override | |||
public PName getParentName() { | |||
// a view on a table will not have a parent name but will have a physical table name (which is the parent) | |||
return (type!=PTableType.VIEW || parentName!=null) ? parentName : getPhysicalName(); | |||
// Update to above comment: we will return logical name of view parent base table |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the parent base table and not the immediate parent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the existing logic. If this is a child view of another view, the parent is the immediate parent view. If it is a view directly on top of a table, the parent is the logical name of the table. Removed that comment.
@@ -1559,7 +1586,29 @@ public PName getParentTableName() { | |||
@Override | |||
public PName getParentName() { | |||
// a view on a table will not have a parent name but will have a physical table name (which is the parent) | |||
return (type!=PTableType.VIEW || parentName!=null) ? parentName : getPhysicalName(); | |||
// Update to above comment: we will return logical name of view parent base table |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: remove "update to above comment"
@@ -1008,6 +1030,11 @@ public final PName getTableName() { | |||
return tableName; | |||
} | |||
|
|||
@Override | |||
public final PName getPhysicalTableName() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As mentioned elsewhere, why both the existing getPhysicalName and a new getPhysicalTableName? Can they be consolidated? Or at least named something more clear?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is a bit complicated. getPhysicalName is mostly used for views. getPhysicalTableName is mostly for non-views and maps to a column on syscat. Let me see what I can do.
@@ -1586,7 +1635,12 @@ public synchronized boolean getIndexMaintainers(ImmutableBytesWritable ptr, Phoe | |||
|
|||
@Override | |||
public PName getPhysicalName() { | |||
// For views, physicalName is base table name. There might be a case where the Phoenix table is pointing to another physical table. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For views, physicalName is the base table physical table name or logical table name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
physical. Will update the comment
@@ -678,13 +691,13 @@ public static SequenceKey getOldViewIndexSequenceKey(String tenantId, PName phys | |||
return new SequenceKey(isNamespaceMapped ? tenantId : null, schemaName, tableName, nSaltBuckets); | |||
} | |||
|
|||
public static String getViewIndexSequenceSchemaName(PName physicalName, boolean isNamespaceMapped) { | |||
public static String getViewIndexSequenceSchemaName(PName logicalName, boolean isNamespaceMapped) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be logicalParentName or logicalBaseTableName to make it clear that this is not the logical name of the view, but the suffix of the IDX table so that all view indexes of the same base table get the same view index sequence.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will change to logicalBaseTableName
@@ -111,6 +111,9 @@ message PTable { | |||
optional bool viewModifiedPhoenixTTL = 44; | |||
optional int64 lastDDLTimestamp = 45; | |||
optional bool changeDetectionEnabled = 46; | |||
optional bytes physicalTableNameBytes = 47; | |||
optional bool isModifiable = 48; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where is isModifiable set?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will be set later as part of another change. Since I was changing this part, I added this one as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we generate proto code on-demand now, I don't think we gain anything by clumping unrelated protobuf changes into this PR. We can save isModifiable for next time when the change can be considered as a whole
💔 -1 overall
This message was automatically generated. |
022f222
to
b672e3e
Compare
💔 -1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the new methods here, and what the metadata methods on PTable around naming should be going forward, needs more thought. The current PR proposes two new PTable properties that are almost-but-not-quite the same as existing properties.
It's really important to get the interfaces right because they'll be hard to change later. Maybe a design doc on the JIRA is a better place for that conversation than a PR?
@ChinmaySKulkarni fyi.
@@ -111,6 +111,9 @@ message PTable { | |||
optional bool viewModifiedPhoenixTTL = 44; | |||
optional int64 lastDDLTimestamp = 45; | |||
optional bool changeDetectionEnabled = 46; | |||
optional bytes physicalTableNameBytes = 47; | |||
optional bool isModifiable = 48; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we generate proto code on-demand now, I don't think we gain anything by clumping unrelated protobuf changes into this PR. We can save isModifiable for next time when the change can be considered as a whole
@@ -583,6 +583,7 @@ private static int getReservedQualifier(byte[] bytes, int offset, int length) { | |||
PName getName(); | |||
PName getSchemaName(); | |||
PName getTableName(); | |||
PName getPhysicalTableNameColumnInSyscat(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking about this more, I don't think just changing the name solves the problem of having two methods that are almost-but-not-quite the same thing. I'd still be confused about which to call.
Can we not use the existing getPhysicalNames() to return the physical table name column from syscat in the situations where that's appropriate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getPhysicalName already returns physical table name column from syscat when appropriate.
We still need 2 methods. One represents the actual column in Syscat. The other is inferred (like views or view indexes).
@@ -728,6 +729,13 @@ private static int getReservedQualifier(byte[] bytes, int offset, int length) { | |||
* (use @getPhysicalTableName for this case) | |||
*/ | |||
PName getParentTableName(); | |||
|
|||
/** | |||
* @return the logical name of the parent. In case of the view index, it is the _IDX_+logical name of base table |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why IDX + logical name of base table for view index? That's the physical table the view index is stored in, not the logical name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure I understand your comment. IDX prefix + logical name of the table is where the view index is stored in.
* @return the logical name of the parent. In case of the view index, it is the _IDX_+logical name of base table | ||
* Ex: For hierarchical views like tableLogicalName --> view1 --> view2, for view2, returns _IDX_+tableLogicalName | ||
*/ | ||
PName getParentLogicalName(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Likewise with getParentName vs getParentLogicalName. If I'm reading the current logic right, getParentName also returns only logical names. They're almost-but-not-quite the same thing. Can we consolidate? Otherwise we're going to create lots of subtle, really-hard-to-spot bugs going forward when people use the wrong one.
@@ -583,6 +583,7 @@ private static int getReservedQualifier(byte[] bytes, int offset, int length) { | |||
PName getName(); | |||
PName getSchemaName(); | |||
PName getTableName(); | |||
PName getPhysicalTableNameColumnInSyscat(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getPhysicalNames makes a point of returning a list of PNames to leave the interface open for views that span multiple tables (not currently supported but long on the wishlist). Does anything in this PR, such as getPhysicalTableNameColumnInSyscat returning a single String, prevent us from having multi-table views later?
b672e3e
to
756c1d4
Compare
💔 -1 overall
This message was automatically generated. |
756c1d4
to
a13a0fc
Compare
💔 -1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some nits and then I think we're good to merge
phoenix-core/src/main/java/org/apache/phoenix/coprocessor/MetaDataEndpointImpl.java
Outdated
Show resolved
Hide resolved
@@ -896,6 +898,9 @@ private boolean addColumnsAndIndexesFromAncestors(MetaDataMutationResult result, | |||
MetaDataMutationResult parentResult = updateCache(connection.getTenantId(), parentSchemaName, parentTableName, | |||
false, resolvedTimestamp); | |||
PTable parentTable = parentResult.getTable(); | |||
if (LOGGER.isDebugEnabled()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this logging? Wondering if it should be at TRACE level to avoid having a bunch of logs of it, since I think this is a pretty frequently used function.
phoenix-core/src/main/java/org/apache/phoenix/schema/PTable.java
Outdated
Show resolved
Hide resolved
phoenix-core/src/it/java/org/apache/phoenix/end2end/LogicalTableNameIT.java
Outdated
Show resolved
Hide resolved
createIndexOnTable(conn, fullTableName, indexName); | ||
} | ||
|
||
SingleCellIndexIT.dumpTable(fullNewTableName); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before merging could you please remove?
phoenix-core/src/it/java/org/apache/phoenix/end2end/LogicalTableNameIT.java
Outdated
Show resolved
Hide resolved
phoenix-core/src/it/java/org/apache/phoenix/end2end/LogicalTableNameIT.java
Outdated
Show resolved
Hide resolved
phoenix-core/src/it/java/org/apache/phoenix/end2end/LogicalTableNameIT.java
Outdated
Show resolved
Hide resolved
a13a0fc
to
88228c4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, thanks @gokceni
88228c4
to
18b4736
Compare
Merged to 4.x-PHOENIX-6247 branch. Thanks for the review! |
💔 -1 overall
This message was automatically generated. |
No description provided.