Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HIVE-19488 : enable CM root based on db parameter, identifying db as a source of replication #345

Closed
wants to merge 3 commits into from

Conversation

maheshk114
Copy link
Contributor

@maheshk114 maheshk114 changed the title HIVE-19488 : enable CM root based on db parameter, identifying a db as source of replication HIVE-19488 : enable CM root based on db parameter, identifying db as a source of replication May 10, 2018
@maheshk114 maheshk114 force-pushed the BUG-102289 branch 9 times, most recently from 04549ce to 1bae630 Compare May 14, 2018 05:19
@@ -90,11 +92,9 @@ private void export_meta_data(PreDropTableEvent tableEvent) throws MetaException
EximUtil.createExportDump(fs, outFile, mTbl, null, null,
new HiveConf(conf, MetaDataExportListener.class));
if (moveMetadataToTrash == true) {
wh.deleteDir(metaPath, true);
wh.deleteDir(metaPath, true, Hive.get().getDatabase(tbl.getDbName()));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we are deleting the metaPath which is the export dir not warehouse location. So, last argument should be false always.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

} catch (IOException e) {
throw new MetaException(e.getMessage());
} catch (SemanticException e) {
} catch (IOException | MetaException | HiveException e) {
Copy link
Contributor

@sankarh sankarh May 15, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why wrapping MetaException over another MetaException? Shall remove it from catch block.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -349,7 +351,9 @@ private void removeFiles(String location, ValidWriteIdList writeIdList, Compacti

for (Path dead : filesToDelete) {
LOG.debug("Going to delete path " + dead.toString());
replChangeManager.recycle(dead, ReplChangeManager.RecycleType.MOVE, true);
if (ReplChangeManager.isReplPolicySet(Hive.get().getDatabase(ci.dbname))) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we get the database object outside the loop and store it in local variable?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

LOG.info("Partition directory rename from " + srcPath + " to " + destPath + " done.");
dataWasMoved = true;
}
} catch (IOException e) {
LOG.error("Cannot rename partition directory from " + srcPath + " to " + destPath, e);
throw new InvalidOperationException("Unable to access src or dest location for partition "
+ tbl.getDbName() + "." + tbl.getTableName() + " " + new_part.getValues());
} catch (MetaException me) {
} catch (MetaException | NoSuchObjectException me) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wrapped MetaException with MetaException.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

LOG.info("Testing " + name);
String dbName = name + "_" + tid;
run("CREATE DATABASE " + dbName + " WITH DBPROPERTIES ( '" +
SOURCE_OF_REPLICATION + "' = '1,2,3')", myDriver);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this value '1,2,3' have any meaning? If not, can we set it to "true" or "false"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as per the design ..it should have the replication ids

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

@@ -1031,7 +1031,7 @@ public void create_catalog(CreateCatalogRequest rqst)
if (!success) {
ms.rollbackTransaction();
if (madeDir) {
wh.deleteDir(catPath, true);
wh.deleteDir(catPath, true, false, true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No data in catalog path.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -1381,6 +1381,10 @@ private void drop_database_core(RawStore ms, String catName,
ms.openTransaction();
db = ms.getDatabase(catName, name);

if (!isInTest && ReplChangeManager.isReplPolicySet(db)) {
throw new InvalidOperationException("can not drop a database which is a source of replication");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this limitation?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if a database is source of replication then can not drop it. thats the requirement.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

@@ -1161,7 +1161,7 @@ private void dropCatalogCore(String catName)
success = ms.commitTransaction();
} finally {
if (success) {
wh.deleteDir(wh.getDnsPath(new Path(cat.getLocationUri())), false);
wh.deleteDir(wh.getDnsPath(new Path(cat.getLocationUri())), false, false, true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Catalog path won't have any data and hence need not recycle.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

return replPolicyIds != null && !replPolicyIds.isEmpty();
}

public static String getReplPolicyIdString(Database db) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need not be public method as it is called within this class.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as its storing ids ..its better to have a public function to get the ids ..if its required in future

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move the repl id extract logic to database object

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

database is an auto generated file so it is not moved there

@@ -467,4 +470,27 @@ static void scheduleCMClearer(Configuration conf) {
0, MetastoreConf.getTimeVar(conf, ConfVars.REPLCMINTERVAL, TimeUnit.SECONDS), TimeUnit.SECONDS);
}
}

public static boolean isReplPolicySet(Database db) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can have uniform naming like isSourceOfRepl?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -467,4 +470,27 @@ static void scheduleCMClearer(Configuration conf) {
0, MetastoreConf.getTimeVar(conf, ConfVars.REPLCMINTERVAL, TimeUnit.SECONDS), TimeUnit.SECONDS);
}
}

public static boolean isReplPolicySet(Database db) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we move this isReplPolicySet method as public method in Database object itself? That may be clean.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think ..its related to repl change manager ..so better to keep it here ..it will be easier to track if any changes are done

@@ -4024,6 +4022,7 @@ private boolean drop_partition_common(RawStore ms, String catName, String db_nam
new DropPartitionEvent(tbl, part, true, deleteData, this),
envContext);
}
isReplEnabled = ReplChangeManager.isReplPolicySet(ms.getDatabase(catName, db_name));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isReplEnabled name is misleading. Can use isNeedCmRecycle itself.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the function is giving if repl is enabled or not ...or if its a source of replication ..recycle or not ..thats is different functionality

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

public void testDumpNonReplDatabase() throws IOException {
String dbName = createDBNonRepl(testName.getMethodName(), driver);
verifyFail("REPL DUMP " + dbName, driver);
verifyFail("REPL DUMP " + dbName + " from 1 ", driver);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall Alter database to add source_of_replication DB property and see if Dump succeed afterwards.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

try {
initReplDump(ast);
} catch (HiveException e) {
throw new SemanticException("repl dump failed " + e.getMessage());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why need to convert HiveException to SemanticException. Instead initReplDump shall throw SemanticException itself right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getDatabase will throw hiveExcpetion ..either need to convert it at initreplDump or here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Converting in initReplDump seems cleaner. Also, check if it harms if we don't convert as SemanticException is derived from HiveException only.

int numChildren = ast.getChildCount();
dbNameOrPattern = PlanUtils.stripQuotes(ast.getChild(0).getText());

for (String dbName : Utils.matchesDb(db, dbNameOrPattern)) {
if (dbName.equalsIgnoreCase(Warehouse.DEFAULT_DATABASE_NAME)) {
Copy link
Contributor

@sankarh sankarh May 21, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why default database exempted from source_of_replication check? Also, CM never recycle any dropped files from default database and so allowing it would be problem.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

replChangeManager.recycle(dead, ReplChangeManager.RecycleType.MOVE, true);
if (ReplChangeManager.isSourceOfReplication(db)) {
replChangeManager.recycle(dead, ReplChangeManager.RecycleType.MOVE, true);
}
fs.delete(dead, true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If CM move the file, then why do we need to delete here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cm may not remove the directory so keeping it as it is

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

@@ -633,7 +634,8 @@ public Partition alterPartition(final RawStore msdb, Warehouse wh, final String
}

//rename the data directory
wh.renameDir(srcPath, destPath, true);
wh.renameDir(srcPath, destPath,
ReplChangeManager.isSourceOfReplication(msdb.getDatabase(catName, dbname)));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

msdb.getDatabase can be stored earlier as it is used somewhere as well and shall get rid if NoSuchObjectException handling too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


public static boolean isSourceOfReplication(Database db) {
// Can not judge, so assuming replication is not enabled.
if (db == null) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall just assert db != null.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

return false;
}
String replPolicyIds = getReplPolicyIdString(db);
return replPolicyIds != null && !replPolicyIds.isEmpty();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall use !StringUtils.isBlank() which checks for both null and empty.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done ..string util isEmpty checks for null also

public static String getReplPolicyIdString(Database db) {
if (db != null) {
Map<String, String> m = db.getParameters();
if ((m != null) && (m.containsKey(SOURCE_OF_REPLICATION))) {
Copy link
Contributor

@sankarh sankarh May 21, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall check if it works with different case for the key. I mean, set the property name in upper case during create Db or alter db and try if we able to read it here.

LOG.debug("repl policy for database {} is {}", db.getName(), replPolicyId);
return replPolicyId;
}
LOG.info("Repl policy is not set for database ", db.getName());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also can be debug log.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -1761,10 +1767,10 @@ private void create_table_core(final RawStore ms, final Table tbl,

ms.openTransaction();

Database db = ms.getDatabase(tbl.getCatName(), tbl.getDbName());
db = ms.getDatabase(tbl.getCatName(), tbl.getDbName());
if (db == null) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't return null.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -467,4 +470,24 @@ static void scheduleCMClearer(Configuration conf) {
0, MetastoreConf.getTimeVar(conf, ConfVars.REPLCMINTERVAL, TimeUnit.SECONDS), TimeUnit.SECONDS);
}
}

public static boolean isSourceOfReplication(Database db) {
// Can not judge, so assuming replication is not enabled.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is inappropriate here.

}

public static String getReplPolicyIdString(Database db) {
if (db != null) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

db is always != null here.

@github-actions
Copy link

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
Feel free to reach out on the dev@hive.apache.org list if the patch is in need of reviews.

@github-actions github-actions bot added the stale label Jun 16, 2020
@github-actions github-actions bot closed this Jun 24, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
2 participants