Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PHOENIX-4799 Write cells using checkAndMutate to prevent conflicting … #313

Closed
wants to merge 1 commit into from

Conversation

twdsilva
Copy link
Contributor

…changes

@vincentpoon @karanmehta93 @ChinmaySKulkarni Can you please review?
With this patch, when we add a column we write a cell to SYSTEM.CHILD_LINK with row key (tenantId, schemaName, physicalTableName, columnName) to prevent conflicting concurrent modifications.
While dropping a table or creating a view we also write a cell to SYSTEM.CHILD_LINK with row key (tenantId, schemaName, physicalTableName).
This is done in MetadataClient before we make an rpc to create view / drop table / add column.

@karanmehta93
Copy link
Member

Just curious, why not use SYSTEM.MUTEX table for this? Its a table specially added for use-cases that need time based expiration and distributed locks.

@twdsilva
Copy link
Contributor Author

Thanks for the suggestion, I will update the patch to use SYSTEM.MUTEX

@twdsilva twdsilva force-pushed the PHOENIX-4799 branch 3 times, most recently from e7dbe0e to fb916c4 Compare July 24, 2018 23:50
@twdsilva
Copy link
Contributor Author

@karanmehta93 I have updated the PR to use SYSTEM.MUTEX, please review.

@twdsilva twdsilva force-pushed the PHOENIX-4799 branch 2 times, most recently from f4bbaf9 to 39f39e1 Compare July 25, 2018 01:50

// add a new column to the base table to ensure that the cell used
// to prevent concurrent modifications was removed
tableDdl = "ALTER TABLE " + fullTableName + " ADD v4 INTEGER";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You didn't execute this ddl statement

latch2.countDown();

Exception e = future.get();
assertTrue(e == null);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: assertNull

// multiple times
throw new DoNotRetryIOException();
} else if (shouldSlowDown(c, miniBatchOp.getOperation(0))) {
// simulate a slow write to SYSTEM.CATALOG
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

}

/**
* Remove the cell that was written to to the SYSTEM.CHILD_LINK table with
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update the comment since you're using SYSMUTEX now

@@ -3034,6 +3086,10 @@ MutationState dropTable(String schemaName, String tableName, String parentTableN
return new MutationState(0, 0, connection);
} finally {
connection.setAutoCommit(wasAutoCommit);
// lock the table to prevent concurrent table modifications
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update comment

@ChinmaySKulkarni
Copy link
Contributor

A few comments and nits. Do we have to do anything special to handle indexes on views? I guess a view on a view should be fine with this implementation since the physical table name would resolve to the final base table. @twdsilva

@twdsilva
Copy link
Contributor Author

We can't add/drop columns directly to an index on a view. Views with their own child views will write a mutex using the physical table.

@ChinmaySKulkarni and @karanmehta93 thanks for the feedback. I have updated the PR, please take a look.

Copy link
Contributor

@ankitsinghal ankitsinghal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@twdsilva , general comment
how we are protecting such mutation in case we are using the old client?

SchemaUtil.getSchemaNameFromFullName(physicalName.getString());
String physicalTableName =
SchemaUtil.getTableNameFromFullName(physicalName.getString());
deleteCell(null, physicalSchemaName, physicalTableName,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@twdsilva , shouldn't you check whether you have acquired a lock(or inserted the cell in Mutex) before deleting the lock cell, because here you might be deleting the locks acquired by some other threads. And, You may need to do this per column I think.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the code to only delete the cells if we were able to successfully do the checkAndPut.
When we add a column we write a cell per column that we are creating.
When we drop a base table or create a view, we write a single cell with the rowkey of the physical table (to prevent a view being created while we are dropping the base table).

// dropping the base table
if (tableType == PTableType.TABLE) {
deleteCell(null, schemaName, tableName, null);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Please update the comment here that you are releasing a mutex.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@twdsilva
Copy link
Contributor Author

@ankitsinghal I filed PHOENIX-4765 to not allow metadata changes on a base table that has child views when the request is from an older client. This will also allow us to rollback the upgraded server side jar if required.

@twdsilva
Copy link
Contributor Author

@ankitsinghal Thanks for the review. I have updated the PR to based on the feedback. Can you please take a look?

@@ -3604,6 +3681,18 @@ public MutationState addColumn(PTable table, List<ColumnDef> origColumnDefs,
}
} finally {
connection.setAutoCommit(wasAutoCommit);
if (acquiredMutex && !columns.isEmpty()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where is acquiredMutex set to true?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ankitsinghal Sorry, I didn't update the PR with all my latest changes, I fixed this. Can you please review?

@twdsilva twdsilva force-pushed the PHOENIX-4799 branch 2 times, most recently from b49a625 to 7412c39 Compare July 31, 2018 06:51
static final String SYSTEM_MUTEX_IDENTIFIER =
QueryConstants.SYSTEM_SCHEMA_NAME + "." + "\""
+ PhoenixDatabaseMetaData.SYSTEM_MUTEX_TABLE_NAME + "\"";

static final Set<String> PHOENIX_NAMESPACE_MAPPED_SYSTEM_TABLES = new HashSet<>(Arrays.asList(
Copy link
Member

@karanmehta93 karanmehta93 Jul 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you need to add SYSTEM:MUTEX over here as well since now its visible from Phoenix level as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -2723,6 +2712,9 @@ private void createOtherSystemTables(PhoenixConnection metaConnection, HBaseAdmi
try {
metaConnection.createStatement().executeUpdate(getChildLinkDDL());
} catch (TableAlreadyExistsException e) {}
try {
metaConnection.createStatement().executeUpdate(getMutexDDL());
} catch (TableAlreadyExistsException e) {}
// Catch the IOException to log the error message and then bubble it up for the client to retry.
try {
createSysMutexTableIfNotExists(hbaseAdmin);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still need this part of code?

@@ -3410,17 +3405,10 @@ void ensureSystemTablesMigratedToSystemNamespace()
// No tables exist matching "SYSTEM\..*", they are all already in "SYSTEM:.*"
if (tableNames.size() == 0) { return; }
// Try to move any remaining tables matching "SYSTEM\..*" into "SYSTEM:"
if (tableNames.size() > 5) {
if (tableNames.size() > 7) {
logger.warn("Expected 5 system tables but found " + tableNames.size() + ":" + tableNames);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: change log message appropriately. Extract the constant to a variable

@@ -1913,6 +1940,21 @@ private PTable createTableInternal(CreateTableStatement statement, byte[][] spli
boolean isLocalIndex = indexType == IndexType.LOCAL;
QualifierEncodingScheme encodingScheme = NON_ENCODED_QUALIFIERS;
ImmutableStorageScheme immutableStorageScheme = ONE_CELL_PER_COLUMN;

if (tableType == PTableType.VIEW) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Would be good to have some logs here for debugging purposes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

if (tableType == PTableType.TABLE) {
// acquire a mutex on the table to prevent creating views while concurrently
// dropping the base table
acquiredMutex = writeCell(null, schemaName, tableName, null);
Copy link
Member

@karanmehta93 karanmehta93 Jul 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing is prevented even if acquiredMutex is false here. I think you should throw ConcurrentTableMutationException here as well? You might might wanna push down that logic to writeCell() method?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I missed that. I originally had writeCell throw the ConcurrentTableMutationException, but we need to know we need writeCell to return false if it wasn't able to do the checkAndPut so that we can set the acquiredMutex boolean which is later used to determine if we have to delete the cell in the finally block.

@@ -16,8 +16,29 @@
*/
package org.apache.phoenix.end2end;

import com.google.common.base.Joiner;
import com.google.common.base.Throwables;
import static org.junit.Assert.assertEquals;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Diff generated due to change in order of imports

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used the dev/phoenix.importorder format, so this should be fine.

@twdsilva
Copy link
Contributor Author

@karanmehta93 or @ankitsinghal Can you please review the updated PR, thanks!

Copy link
Contributor

@ankitsinghal ankitsinghal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @twdsilva for taking care of review comments, have left some more review comments on the recent changes.

+ TABLE_SCHEM + "," + TABLE_NAME + "," + COLUMN_NAME + "," + COLUMN_FAMILY + "))\n" +
HConstants.VERSIONS + "=%s,\n" +
HColumnDescriptor.KEEP_DELETED_CELLS + "=%s,\n" +
PhoenixDatabaseMetaData.TRANSACTIONAL + "=" + Boolean.FALSE;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why there is a need of creating a Phoenix managed table for mutex?
And also API in QueryServices.writeMutexCell(byte[] rowKey) and deleteMutexCell(byte[] rowKey) don't enforce the schema of the table will be followed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For clusters that use namespace mapping and that map the phoenix system tables to the SYSTEM namespace, we want to be able to use the GRANT/REVOKE statements to grant RW access to the SYSTEM:MUTEX table.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you suggesting writeMutexCell/deleteMutexCell should take as arguments (tenantid, schema, tablename, column name, column family) instead of a byte[]? I will make that change.

Copy link
Contributor

@ankitsinghal ankitsinghal Aug 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok make sense, so can you just update QueryServices.writeMutexCell(byte[] rowKey) and deleteMutexCell(byte[] rowKey) to accept all the arguments which form the right primary key for the table just for consistency.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, probably our comments have crossed.

logger.error("Failed to created SYSMUTEX table. Upgrade or migration is not possible without it. Please retry.");
throw exception;
}
metaConnection.createStatement().executeUpdate(getMutexDDL());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't the mutex table available and have acquired a mutex already for the upgrade before you call createOtherSystemTables

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The mutex table would have been created by createSysMutexTableIfNotExists(), we call execute the CREATE TABLE statement so that it exists in SYSTEM.CATALOG, so that we can use GRANT/REVOKE to grant permission on the SYSTEM.MUTEX table.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, then we are good.

@twdsilva
Copy link
Contributor Author

twdsilva commented Aug 2, 2018

Testing issues@phoenix.apache.org.

@twdsilva
Copy link
Contributor Author

twdsilva commented Aug 3, 2018

@ankitsinghal @karanmehta93 I updated the PR with the latest feedback, can you please review?

@twdsilva twdsilva force-pushed the PHOENIX-4799 branch 2 times, most recently from 16db8aa to 8a93ada Compare August 3, 2018 17:57
// dropping the base table
acquiredMutex = writeCell(null, schemaName, tableName, null);
if (!acquiredMutex) {
logger.info("Failed to acquire mutex on physical table " + physicalTableName);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: You can push mutex related logs lines to the specific methods.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, can I get a +1?

@twdsilva twdsilva closed this Aug 25, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants