HIVE-28121: Use direct SQL for transactional altering table parameter #5129
@pvary Could you please have a look? I'm unable to reproduce the failed test, and I suppose the no-lock feature is disabled by default.
@Override
public int updateParameterWithExpectedValue(Table table, String key, String expectedValue, String newValue)
    throws MetaException {
  String dml = String.format(
Could we do something like this:
https://www.datanucleus.org/products/accessplatform_6_0/jdo/query.html#jdoql_bulkupdate
I would like to avoid:
- Writing strings to the queries based on table parameters
- Using native connection
If using JDO queries doesn't work, then I would create a method in MetaStoreDirectSql for this, and throw an exception if directSql is not turned on.
Setting datanucleus.query.jdoql.allowAll=true allows us to run UPDATE with a JDO query. I tried something like this, but it doesn't work.
Map<String, String> newParams = new HashMap<>(table.getParameters());
newParams.put(key, newValue);
openTransaction();
Query query = pm.newQuery("UPDATE org.apache.hadoop.hive.metastore.model.MTable " +
"SET parameters=:newparams WHERE database.name == :dbname && tableName == :tblname && " +
"parameters.containsEntry(:key, :expval)");
int affectedRows = (int) query.executeWithMap(ImmutableMap.of(
"newparams", newParams,
"dbname", table.getDbName(),
"tblname", table.getTableName(),
"key", key,
"expval", expectedValue
));
The error is
Caused by: org.datanucleus.store.rdbms.sql.expression.IllegalExpressionOperationException: Cannot perform operation "==" on org.datanucleus.store.rdbms.sql.expression.MapExpression@6f9c5048 and org.datanucleus.store.rdbms.sql.expression.MapLiteral@5114b7c7
at org.datanucleus.store.rdbms.sql.expression.SQLExpression.eq(SQLExpression.java:381)
at org.datanucleus.store.rdbms.sql.expression.MapExpression.eq(MapExpression.java:80)
at org.datanucleus.store.rdbms.query.QueryToSQLMapper.compileUpdate(QueryToSQLMapper.java:1134)
I believe it's because MapExpression only supports eq with null literals in DN.
That's sad 😢
Then we need to fall back to directsql.
Something like this:
String queryText = "UPDATE \"TABLE_PARAMS\" SET \"PARAM_VALUE\" = ? " +
"WHERE \"TBL_ID\" = ? AND \"PARAM_KEY\" = ? AND \"PARAM_VALUE\" = ?";
try (QueryWrapper query = new QueryWrapper(pm.newQuery("javax.jdo.query.SQL", queryText))) {
Object[] params = new Object[4];
params[0] = ...;
long res = query.executeWithArray(params);
...
}
Is it OK if I change the code as below?
String queryText = "UPDATE \"TABLE_PARAMS\" SET \"PARAM_VALUE\" = ? " +
    "WHERE \"TBL_ID\" in (select TBL_ID from TBLS join DBS ON DBS.DB_ID=TBLS.DB_ID WHERE NAME=? and TBL_NAME=? ) " +
    "AND \"PARAM_KEY\" = ? AND \"PARAM_VALUE\" = ?";
try (QueryWrapper query = new QueryWrapper(pm.newQuery("javax.jdo.query.SQL", queryText))) {
  Object[] params = new Object[5];
  params[0] = ...;
  long res = query.executeWithArray(params);
  ...
}
I haven't cherry-picked HIVE-22234, so I can't get the table ID.
Yes, for old versions where the Table object doesn't have the ID, I think we need to retrieve it from TBLS. I'm planning to do this in PRs for the release branches.
LGTM
Let's see if there are any further comments
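To summarize the two direct SQL variants discussed in this thread, here is a minimal sketch in Java. It only builds the statement text and the positional parameter arrays; it does not touch a database. The class `TableParamUpdateSql` and its method names are hypothetical and not part of Hive, and quoting the identifiers inside the subquery is an assumption (the comment above left them unquoted).

```java
/** Hypothetical helper, not part of Hive: the two UPDATE variants from this thread. */
final class TableParamUpdateSql {
  // Variant for branches where the Table object carries its ID.
  static final String BY_ID =
      "UPDATE \"TABLE_PARAMS\" SET \"PARAM_VALUE\" = ? "
          + "WHERE \"TBL_ID\" = ? AND \"PARAM_KEY\" = ? AND \"PARAM_VALUE\" = ?";

  // Variant for older branches: resolve TBL_ID from DBS and TBLS by name.
  // Quoting of the subquery identifiers is an assumption of this sketch.
  static final String BY_NAME =
      "UPDATE \"TABLE_PARAMS\" SET \"PARAM_VALUE\" = ? "
          + "WHERE \"TBL_ID\" IN (SELECT \"TBL_ID\" FROM \"TBLS\" JOIN \"DBS\" "
          + "ON \"DBS\".\"DB_ID\" = \"TBLS\".\"DB_ID\" WHERE \"NAME\" = ? AND \"TBL_NAME\" = ?) "
          + "AND \"PARAM_KEY\" = ? AND \"PARAM_VALUE\" = ?";

  // Parameters in placeholder order: new value, table ID, key, expected value.
  static Object[] paramsById(long tblId, String key, String expected, String newValue) {
    return new Object[] {newValue, tblId, key, expected};
  }

  // Parameters in placeholder order: new value, db name, table name, key, expected value.
  static Object[] paramsByName(
      String dbName, String tblName, String key, String expected, String newValue) {
    return new Object[] {newValue, dbName, tblName, key, expected};
  }

  // Counts '?' placeholders so callers can check the array length before executing.
  static int placeholders(String sql) {
    return (int) sql.chars().filter(c -> c == '?').count();
  }
}
```

The arrays would be passed to `executeWithArray` on a `javax.jdo.query.SQL` query, as in the snippets above. Counting placeholders is a cheap guard against passing an array whose length does not match the statement.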
@pvary Thanks for the reviews. One thing to mention is that this change adds a new exception message for commit conflict:
By the way, the change has been verified with Postgres, MySQL and MS SQL Server. I didn't manage to get a working Oracle instance.
That’s a good point. If possible, we should keep the old message, so we do not create even more confusion across versions. The check there is like this:
Either we can issue a query to check the new value, or change it to "the parameter value for key ... is different".
@pvary I have made the exception messages consistent. Let me know if you have any further comments.
@lirui-apache: We need to fix the CI errors.
Unrelated test failure; restarted the build.
Tried to merge using the GitHub UI. On the UI the PR is still not merged. @deniskuzZ: Could you please take a look, if you have some time, to confirm that everything is OK?
Thanks @pvary for reviewing and merging the PR. I'm closing it manually.
@lirui-apache: Thanks for the report and the fix! @deniskuzZ: Which branches are likely to get new releases? If we want to fix all of the places where the previous PR was released, we might want to add this to:
Hey @pvary, we are planning to release 4.0.1 in a month; not sure about other branches. Let me add the proper label to the ticket.
The fix versions of HIVE-26882 are 2.3.10 and 4.0.0-beta-1. Does that mean we don't need this in 3.x?
HIVE-26882 was merged to 3.x and 3.1, so whoever maintains these branches has to cherry-pick.
@deniskuzZ OK, then I'll create PRs for these branches.
When I use this patch I get an error. I use MySQL, and my code is:
String queryText = "UPDATE \"TABLE_PARAMS\" SET \"PARAM_VALUE\" = ? " + "WHERE \"TBL_ID\" in " +
    "(select TBL_ID from TBLS join DBS ON DBS.DB_ID=TBLS.DB_ID WHERE NAME=? and TBL_NAME=? ) " +
    "AND \"PARAM_KEY\" = ? AND \"PARAM_VALUE\" = ?";
List<String> pms = new ArrayList<>();
pms.add(newValue);
pms.add(table.getDbName());
pms.add(table.getTableName());
pms.add(key);
pms.add(expectedValue);
Query queryParams = pm.newQuery("javax.jdo.query.SQL", queryText);
return (long) executeWithArray(queryParams, pms.toArray(), queryText);
@chenwyi2: Could you please share the exception from the metastore side, and your metastore version? Thanks, Peter
@chenwyi2 Could you check whether
2024-04-19 12:03:06,849 ERROR [pool-9-thread-1]: metastore.RetryingHMSHandler (RetryingHMSHandler.java:invokeInternal(201)) - MetaException(message:JDOQL Single-String query should always start with SELECT)
I use 3.1.2 in my metastore.
Sorry, it's my fault. I didn't add the parameter in the metastore conf as in 7bca1b3; now it is fine.
@pan3793: It takes some time to do all the backports and tests. If you create the PR, I would be happy to review.
@pan3793 @pvary I already created a git tag for 2.3.10 RC0: https://github.com/apache/hive/releases/tag/release-2.3.10-rc0. Let me know how important it is to backport this PR. I can wait if you decide to do so.
@pvary I opened #5204 to backport it to branch-2.3.
OK, sounds good. I just created 2.3.10 RC0 and started a vote thread. Please check out the candidate and see if there is any issue in Spark integration. cc @dongjoon-hyun too.
Thank you, @sunchao and all.
…e parameter (apache#5129) (cherry picked from commit 7378962)
…from 2.3.10 into ODP hive-2.3.101 and merge into 2.3.102 branch (#28)
* ODP-1794 | HIVE-7145: Remove dependence on apache commons-lang (David Lavati via László Bodor, Zoltan Haindrich) Signed-off-by: Zoltan Haindrich <kirk@rxd.hu> (cherry picked from commit 50296ef)
* HIVE-20016: Investigate TestJdbcWithMiniHS2.testParallelCompilation3 random failure (Yongzhi Chen, reviewed by Aihua Xu) (cherry picked from commit 3b6d4e2)
* ODP-1794 | HIVE-28121: Use direct SQL for transactional altering table parameter (apache#5129) (cherry picked from commit 7378962)
* ODP-1794 Changed libthrift version to 0.16.0 Signed-off-by: Zoltan Haindrich <kirk@rxd.hu> (cherry picked from commit 50296ef)
* Updated Hive pom from 2.3.101 to 2.3.102
* ODP-1794 Changed hive version from 2.3.101 to 2.3.102 and ivy version to 2.5.2
Co-authored-by: David Lavati <david.lavati@gmail.com> Co-authored-by: Yongzhi Chen <ychena@apache.org> Co-authored-by: Rui Li <lirui@apache.org> Co-authored-by: Shubham Sharma <shubh.luck@yahoo.in>
What changes were proposed in this pull request?
Use direct SQL for transactional update table parameter and check the number of affected rows to detect concurrent writes.
Why are the changes needed?
Maintain consistency in case of concurrent writes.
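As a rough illustration of why the affected-row count detects concurrent writes, the following Java sketch models the conditional UPDATE with an in-memory map instead of a database. The class `ParamCasSketch`, its `commit` method, and the exception text are hypothetical and are not the code or messages in this PR; only the method name `updateParameterWithExpectedValue` is borrowed from the diff.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical in-memory stand-in for TABLE_PARAMS of a single table. */
final class ParamCasSketch {
  private final ConcurrentMap<String, String> tableParams = new ConcurrentHashMap<>();

  ParamCasSketch(String key, String initialValue) {
    tableParams.put(key, initialValue);
  }

  /**
   * Models "UPDATE ... SET PARAM_VALUE = new WHERE PARAM_KEY = key AND PARAM_VALUE = expected".
   * Returns the number of affected rows: 1 if the expected value was still present, else 0.
   */
  int updateParameterWithExpectedValue(String key, String expected, String newValue) {
    return tableParams.replace(key, expected, newValue) ? 1 : 0;
  }

  /** A caller treats anything other than one affected row as a lost race. */
  void commit(String key, String expected, String newValue) {
    int affected = updateParameterWithExpectedValue(key, expected, newValue);
    if (affected != 1) {
      // Placeholder message for this sketch, not the message used by Hive.
      throw new IllegalStateException("parameter value for key " + key + " is different");
    }
  }

  String get(String key) {
    return tableParams.get(key);
  }
}
```

If two writers both read the value `v1` and each try to replace it, the first update affects one row and the second affects none, so the second writer sees the conflict instead of silently overwriting.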
Does this PR introduce any user-facing change?
No
Is the change a dependency upgrade?
No
How was this patch tested?
Covered by existing tests