
[SPARK-24360][SQL] Support Hive 3.1 metastore #23694

Closed
wants to merge 2 commits into apache:master from dongjoon-hyun:SPARK-24360-3.1

Conversation

@dongjoon-hyun
Member

commented Jan 30, 2019

What changes were proposed in this pull request?

Hive 3.1.1 is released. This PR aims to support Hive 3.1.x metastore.
Please note that Hive 3.0.0 Metastore is skipped intentionally.

How was this patch tested?

Pass the Jenkins with the updated test cases including 3.1.

@@ -1179,3 +1180,128 @@ private[client] class Shim_v2_1 extends Shim_v2_0 {
private[client] class Shim_v2_2 extends Shim_v2_1

private[client] class Shim_v2_3 extends Shim_v2_1

private[client] class Shim_v3_1 extends Shim_v2_3 {

@HyukjinKwon

HyukjinKwon Jan 30, 2019

Member

I think it's fine if the signatures match and the tests pass. Let me double-check them by tomorrow.

@dongjoon-hyun

dongjoon-hyun Jan 30, 2019

Author Member

Thanks!

fix
@SparkQA


commented Jan 30, 2019

Test build #101874 has finished for PR 23694 at commit a6a0810.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.
@SparkQA


commented Jan 30, 2019

Test build #101876 has finished for PR 23694 at commit 9f621e7.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.
@dongjoon-hyun

Member Author

commented Jan 30, 2019

Retest this please.

@@ -100,6 +100,7 @@ private[hive] object IsolatedClientLoader extends Logging {
case "2.1" | "2.1.0" | "2.1.1" => hive.v2_1
case "2.2" | "2.2.0" => hive.v2_2
case "2.3" | "2.3.0" | "2.3.1" | "2.3.2" | "2.3.3" | "2.3.4" => hive.v2_3
case "3.1" | "3.1.0" | "3.1.1" => hive.v3_1

@gatorsmile

gatorsmile Jan 30, 2019

Member

What is the reason we do not support 3.0?

@dongjoon-hyun

dongjoon-hyun Jan 30, 2019

Author Member

It's not stable, just like Hadoop 3.0 wasn't. Spark skipped Hadoop 3.0 and went with Hadoop 3.1.

@gatorsmile

gatorsmile Jan 30, 2019

Member

I think we faced the same issue with 1.0 and 2.0, but we still support them. Could we simply support it and let the end users decide for themselves?

@dongjoon-hyun

dongjoon-hyun Jan 30, 2019

Author Member

Then, hopefully, I can proceed with that in another PR.
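For context, with the case arm added above in IsolatedClientLoader, a user selects the 3.1 client through the existing metastore configuration keys; a minimal, illustrative Scala sketch (the application name is a placeholder, and "maven" simply asks Spark to download the matching Hive client jars):

import org.apache.spark.sql.SparkSession

// Sketch only: point Spark at an external Hive 3.1 metastore.
val spark = SparkSession.builder()
  .appName("hive-3.1-metastore-example")             // placeholder name
  .config("spark.sql.hive.metastore.version", "3.1") // resolves to hive.v3_1 above
  .config("spark.sql.hive.metastore.jars", "maven")  // download matching client jars
  .enableHiveSupport()
  .getOrCreate()

spark.sql("SHOW DATABASES").show()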

@SparkQA


commented Jan 30, 2019

Test build #101877 has finished for PR 23694 at commit 9f621e7.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.
@dongjoon-hyun

Member Author

commented Jan 30, 2019

Also, cc @wangyum since he is working on the Hive upgrade.

numDP: JInteger, listBucketingLevel, isAcid, writeIdInLoadTableOrPartition,
stmtIdInLoadTableOrPartition, hasFollowingStatsTask, AcidUtils.Operation.NOT_ACID,
replace: JBoolean)
}

@HyukjinKwon

HyukjinKwon Jan 31, 2019

Member

@dongjoon-hyun, not a big deal, but how about adding a dropIndex override that throws, say, an unsupported-operation exception?

@dongjoon-hyun

dongjoon-hyun Jan 31, 2019

Author Member

It's possible, but it doesn't help much. As we see here, Hive.getIndexes raises NoSuchMethodError before we call shim.dropIndex.

@HyukjinKwon

HyukjinKwon Jan 31, 2019

Member

Yeah, I noticed it. I am leaving this comment separately, just as a sanity check.
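As a rough sketch of the suggestion above (not what this PR does, since Hive.getIndexes already fails with NoSuchMethodError before shim.dropIndex is reached), a fail-fast override inside Shim_v3_1 could look like the following; the method signature is assumed to mirror the existing shim API:

// Hypothetical sketch only: surface a clear error instead of a reflective
// NoSuchMethodError later. Signature assumed to match the existing Shim method.
override def dropIndex(hive: Hive, dbName: String, tableName: String, indexName: String): Unit = {
  throw new UnsupportedOperationException(
    s"DROP INDEX $dbName.$tableName.$indexName is not supported by the Hive 3.1 client")
}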

@HyukjinKwon
Member

left a comment

LGTM except one question

classOf[AcidUtils.Operation],
JBoolean.TYPE)

override def loadPartition(

@HyukjinKwon
assert(session.nonEmpty)
val database = session.get.sessionState.catalog.getCurrentDatabase
val table = hive.getTable(database, tableName)
val loadFileType = if (replace) {

@HyukjinKwon
clazzLoadFileType.getEnumConstants.find(_.toString.equalsIgnoreCase("KEEP_EXISTING"))
}
assert(loadFileType.isDefined)
loadDynamicPartitionsMethod.invoke(hive, loadPath, tableName, partSpec, loadFileType.get,
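The KEEP_EXISTING lookup above goes through plain reflection because the LoadFileType enum only exists in Hive 3.x, so the shim must not reference it at compile time. A self-contained sketch of that pattern (the class name and the REPLACE_ALL constant are assumptions based on Hive 3.x sources):

// Sketch only: resolve a Hive 3.x-only enum constant at runtime so the code
// still compiles against older Hive client jars.
def findLoadFileType(replace: Boolean): Option[Any] = {
  val clazz = Class.forName("org.apache.hadoop.hive.ql.plan.LoadTableDesc$LoadFileType")
  val wanted = if (replace) "REPLACE_ALL" else "KEEP_EXISTING"
  clazz.getEnumConstants.find(_.toString.equalsIgnoreCase(wanted))
}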

@HyukjinKwon
writeIdInLoadTableOrPartition, stmtIdInLoadTableOrPartition: JInteger, replace: JBoolean)
}

override def loadDynamicPartitions(

@HyukjinKwon

private[client] class Shim_v3_1 extends Shim_v2_3 {
// Spark supports only non-ACID operations
protected lazy val isAcidIUDoperation = JBoolean.FALSE

@HyukjinKwon

HyukjinKwon Jan 31, 2019

Member

@dongjoon-hyun, isn't isAcid already defined in Shim_v0_14? I was wondering why this is a separate variable again. Do you see any possibility that it will differ specifically for 3.1? If so, it's fine.

@dongjoon-hyun

dongjoon-hyun Jan 31, 2019

Author Member

Historically, isAcidIUDoperation evolved from isAcid.

In the Hive code, isAcid meant a general ACID operation, while isAcidIUDoperation is now used specifically for ACID INSERT/UPDATE/DELETE operations. They also check isFullAcidTable and use the two together, like this:

else if (!isAcidIUDoperation && isFullAcidTable) {
    destPath = fixFullAcidPathForLoadData(loadFileType, destPath, txnId, stmtId, tbl);
}

And yes to your last question. We don't know the future of Hive, so since the parameter name is different, we had better handle it separately. That was my logic.
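To make the distinction concrete, here is a rough sketch of how the 3.1 shim pins these flags for Spark's non-ACID loads before passing them to Hive.loadTable / Hive.loadDynamicPartitions via reflection. Only isAcidIUDoperation is taken verbatim from the diff; the other values and the JLong alias are assumptions for illustration:

// Sketch only: Spark issues non-ACID loads, so the ACID-specific arguments
// introduced by Hive 3.x are pinned to "no ACID" defaults.
protected lazy val isAcidIUDoperation = JBoolean.FALSE         // not an ACID INSERT/UPDATE/DELETE
protected lazy val writeIdInLoadTableOrPartition: JLong = 0L   // assumed placeholder write id
protected lazy val stmtIdInLoadTableOrPartition: JInteger = 0  // assumed placeholder statement id
protected lazy val listBucketingLevel: JInteger = 0            // assumed: no list bucketing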

writeIdInLoadTableOrPartition, stmtIdInLoadTableOrPartition, replace: JBoolean)
}

override def loadTable(

@HyukjinKwon
clazzLoadFileType.getEnumConstants.find(_.toString.equalsIgnoreCase("KEEP_EXISTING"))
}
assert(loadFileType.isDefined)
loadTableMethod.invoke(hive, loadPath, tableName, loadFileType.get, isSrcLocal: JBoolean,

@dongjoon-hyun

Member Author

commented Jan 31, 2019

Thank you so much for the review, @HyukjinKwon and @gatorsmile.
Merged to master.

stczwd added a commit to stczwd/spark that referenced this pull request Feb 18, 2019
[SPARK-24360][SQL] Support Hive 3.1 metastore
## What changes were proposed in this pull request?

Hive 3.1.1 is released. This PR aims to support Hive 3.1.x metastore.
Please note that Hive 3.0.0 Metastore is skipped intentionally.

## How was this patch tested?

Pass the Jenkins with the updated test cases including 3.1.

Closes apache#23694 from dongjoon-hyun/SPARK-24360-3.1.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
wangyum added a commit to wangyum/spark that referenced this pull request Mar 14, 2019
[SPARK-24360][SQL] Support Hive 3.1 metastore
SirOibaf added a commit to SirOibaf/spark that referenced this pull request May 26, 2019
[SPARK-24360][SQL] Support Hive 3.1 metastore
tkakantousis added a commit to logicalclocks/spark that referenced this pull request Jun 3, 2019
[HOPSWORKS-385] Add support to Hive for Hops TLS model (#8)
* Compiles with Hops Hive

* [SPARK-24360][SQL] Support Hive 3.1 metastore

tkakantousis added a commit to logicalclocks/spark that referenced this pull request Jul 16, 2019
[HOPSWORKS-385] Add support to Hive for Hops TLS model (#8)
* Compiles with Hops Hive

* [SPARK-24360][SQL] Support Hive 3.1 metastore


[HOPSWORKS-385] append - Bump Hive dependency

[HOPSWORKS-385] append - Exclude log4j dependency  (#10)

* [HOPSWORKS-1105] Bump Hops version to 2.8.2.8

* [HOPSWORKS-385] append - Exclude log4j dependency
wangyum pushed a commit to wangyum/spark that referenced this pull request Aug 1, 2019
[CARMEL-419] cherry pick for Hive3.1 support [SPARK-23510] [SPARK-24312] [SPARK-26091] [SPARK-24360] (apache#161)

* [SPARK-23510][SQL] Support Hive 2.2 and Hive 2.3 metastore

## What changes were proposed in this pull request?
This is based on apache#20668 for supporting Hive 2.2 and Hive 2.3 metastore.

When we merge the PR, we should give the major credit to wangyum

## How was this patch tested?
Added the test cases

Author: Yuming Wang <yumwang@ebay.com>
Author: gatorsmile <gatorsmile@gmail.com>

Closes apache#20671 from gatorsmile/pr-20668.

(cherry picked from commit ff14801)

* [SPARK-24312][SQL] Upgrade to 2.3.3 for Hive Metastore Client 2.3

## What changes were proposed in this pull request?

Hive 2.3.3 was [released on April 3rd](https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12342162&styleName=Text&projectId=12310843). This PR aims to upgrade Hive Metastore Client 2.3 from 2.3.2 to 2.3.3.

## How was this patch tested?

Pass the Jenkins with the existing tests.

Author: Dongjoon Hyun <dongjoon@apache.org>

Closes apache#21359 from dongjoon-hyun/SPARK-24312.

(cherry picked from commit 7f82c4a)

* [SPARK-26091][SQL] Upgrade to 2.3.4 for Hive Metastore Client 2.3

## What changes were proposed in this pull request?

[Hive 2.3.4 is released on Nov. 7th](https://hive.apache.org/downloads.html#7-november-2018-release-234-available). This PR aims to support that version.

## How was this patch tested?

Pass the Jenkins with the updated version

Closes apache#23059 from dongjoon-hyun/SPARK-26091.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>

(cherry picked from commit ed46ac9)

* [SPARK-24360][SQL] Support Hive 3.1 metastore


(cherry picked from commit aeff69b)

* add missed import

* fix after cherry-pick