Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HIVE-26869 : Backport of HIVE-19104: Fix TestHS2ImpersonationWithRemoteMS.testImpersonation #3873

Merged
merged 1 commit into from Jan 8, 2023

Conversation

amanraj2520
Copy link
Contributor

@amanraj2520 amanraj2520 commented Dec 18, 2022

HIVE-26869 : Backport of HIVE-19104: When test MetaStore is started with retry the instances should be independent

Change-Id: I4955b9dd2c5b82da9510ae3a2342b8d96ab86781

What changes were proposed in this pull request?

Why are the changes needed?

Does this PR introduce any user-facing change?

How was this patch tested?

@amanraj2520
Copy link
Contributor Author

@abstractdog @zabetak Can you please review this too and merge

@abstractdog
Copy link
Contributor

the original commit has a minor change in HiveConnection too:
dc8891e
would it make sense to backport that also?

@amanraj2520
Copy link
Contributor Author

amanraj2520 commented Dec 18, 2022

@abstractdog If you see in the code also, the maxRetries is set to 1 which was changed from 5 to 1 in the original commit. So I think should be okay.

@abstractdog
Copy link
Contributor

abstractdog commented Dec 19, 2022

right, thanks for clarifying

also, I can see TestHS2ImpersonationWithRemoteMS failing in the precommit:
http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-3873/1/tests

shouldn't it be fixed with this PR?

@amanraj2520
Copy link
Contributor Author

Hi @abstractdog I had triggered it late at night. Just saw it today that the tests are failing. Let me reproduce this in my local. I thought it was a simple cherry pick to fix the test. Will update you asap.

@amanraj2520
Copy link
Contributor Author

amanraj2520 commented Dec 19, 2022

Hi @abstractdog @ayushtkn @zabetak @sankarh @ashish-kumar-sharma
I was able to produce this issue in my local. Need some suggestion about the fix.

This is the error :

[ERROR] Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 703.357 s <<< FAILURE! - in org.apache.hive.service.TestHS2ImpersonationWithRemoteMS
[ERROR] testImpersonation(org.apache.hive.service.TestHS2ImpersonationWithRemoteMS)  Time elapsed: 668.923 s  <<< FAILURE!
java.lang.AssertionError: Unexpected table directory '34015' in warehouse
        at org.junit.Assert.fail(Assert.java:88)
        at org.apache.hive.service.TestHS2ImpersonationWithRemoteMS.testImpersonation(TestHS2ImpersonationWithRemoteMS.java:115)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

The issue is that in the test the directory structure where this table is getting created is like this hdfs://localhost:43235/base/warehouse/44157 :
image

When it tries to get the directory name it returns 44157 instead of foo_table or bar_table. This is why this is throwing the error. I see a bunch of other test cases failing because of this. Any idea which commit that I need to look into. I also compared this code with oss/master and oss/branch-3.1. There are no differences related to this. Also a point to be noted is that this test fails in oss/branch-3.1.

@amanraj2520
Copy link
Contributor Author

I see a similar issue in another test. (org.apache.hadoop.hive.ql.TestWarehouseExternalDir) Seems like Hadoop is creating a folder structure by adding a number at the end. Adding the snippet for the same:
image

Copy link
Contributor

@cnauroth cnauroth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@amanraj2520 , I think the reason for the failure in TestWarehouseExternalDir is because MetaStoreTestUtils#startMetaStoreWithRetry configures a testing metastore with the warehouse directory parameterized by the metastore's server port number. For example, when I ran the test locally, I saw this in the output:

2022-12-22T15:44:41,372 ERROR [main] metastore.MetaStoreTestUtils: MetaStore Thrift Server started on port: 42103 with warehouse dir: hdfs://localhost:41827/base/warehouse/42103

The reason for parameterizing the warehouse directory by port is that it can support running multiple instances concurrently and tests can be isolated from each other's data changes.

In the master branch, this test and several other were marked to skip with @org.junit.Ignore. This was done in commit ad925ea. I don't know if there was a JIRA issue associated with that. There is also HIVE-25266 to track fixing TestWarehouseExternalDir, but it isn't resolved.

My opinion (non-binding of course) is that any tests skipped as known failures on master can also be skipped on branch-3. Then, when the permanent fixes come in, they should be committed to both master and branch-3.

Copy link
Contributor

@cnauroth cnauroth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The question remains, why does this specific test pass on master but fail on branch-3? I've been investigating for any differences in configuration handling in the tests between the 2 branches, but so far, I haven't found an explanation.

@amanraj2520
Copy link
Contributor Author

Hi @cnauroth I have also debugged it. But did not find anything. My hunch is that this can be due to the hadoop version issue. Can you once check this if this can be the issue - https://issues.apache.org/jira/browse/HDFS-13408. Can this be the case

@amanraj2520
Copy link
Contributor Author

This did not go into 3.1.0 but went into 3.1.1 I think. Not sure though

…hould be independent

Change-Id: I4955b9dd2c5b82da9510ae3a2342b8d96ab86781
@cnauroth
Copy link
Contributor

cnauroth commented Jan 4, 2023

@amanraj2520 , I don't think HDFS-13408 would be related, because I think that was a problem specific to Windows, and this test seems to be having a problem handling the parameterized warehouse directory, not the base testing directory.

I think I figured it out. MetaStoreTestUtils#startMetaStoreWithRetry sets the warehouse directory as the new metastore.warehouse.dir property. AbstractHiveService#get/setWareHouseDir later works with the deprecated hive.metastore.warehouse.dir property. MetastoreConf will take care of resolving requests for the new property to values under the old property, but not vice versa.

I found that on master, HIVE-19104 included an additional line in MiniHs2 to make sure these 2 properties would stay in sync for test runs. I can get TestHS2ImpersonationWithRemoteMS to pass after applying a slightly modified version of that patch.

I'll let you know when I have a pull request ready.

@amanraj2520
Copy link
Contributor Author

This is great @cnauroth . Please let me know if you have the PR, I will also test it from my side.

@amanraj2520
Copy link
Contributor Author

@cnauroth Cherry picked the patch and tested locally. It works as expected. Thanks for your suggestion updated this PR.

@amanraj2520 amanraj2520 changed the title HIVE-26869 : Backport of HIVE-25250: Fix TestHS2ImpersonationWithRemoteMS.testImpersonation HIVE-26869 : Backport of HIVE-19104: Fix TestHS2ImpersonationWithRemoteMS.testImpersonation Jan 4, 2023
@amanraj2520
Copy link
Contributor Author

@cnauroth You suggestion worked. The concerned tests have passed

But now new tests are failing :

  1. org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs
    ERROR :
    Error Could not open client transport with JDBC Uri: jdbc:hive2://localhost:42959/default;auth=delegationToken: Peer indicated failure: DIGEST-MD5: IO error acquiring password Stacktrace java.sql.SQLException: Could not open client transport with JDBC Uri: jdbc:hive2://localhost:42959/default;auth=delegationToken: Peer indicated failure: DIGEST-MD5: IO error acquiring password at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:269) at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:107) at java.sql.DriverManager.getConnection(DriverManager.java:664) at java.sql.DriverManager.getConnection(DriverManager.java:270) at org.apache.hive.minikdc.TestJdbcWithMiniKdc.testTokenAuth(TestJdbcWithMiniKdc.java:172) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
  2. udaf_context_ngrams.q (org.apache.hadoop.hive.cli.TestCliDriver)
    ERROR :
    `
    Client Execution succeeded but contained differences (error code = 1) after executing udaf_context_ngrams.q
    25c25
    < [9.0,8.0,7.0,6.0,5.0,5.0,4.0,4.0,4.0,4.0,4.0,3.0,3.0,3.0,3.0,2.0,2.0,2.0,2.0,2.0,2.0,2.0,2.0,2.0,2.0,2.0,2.0,2.0,2.0,2.0,2.0,2.0,2.0,2.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0]

[267.0,171.0,164.0,119.0,108.0,106.0,106.0,82.0,79.0,67.0,67.0,46.0,45.0,42.0,42.0,40.0,39.0,37.0,37.0,36.0,34.0,32.0,32.0,30.0,30.0,29.0,28.0,28.0,28.0,28.0,26.0,25.0,24.0,23.0,23.0,22.0,22.0,21.0,20.0,19.0,18.0,18.0,18.0,17.0,17.0,17.0,16.0,16.0,16.0,16.0,15.0,15.0,14.0,14.0,14.0,13.0,13.0,13.0,13.0,13.0,13.0,13.0,12.0,12.0,12.0,12.0,12.0,12.0,12.0,11.0,11.0,11.0,11.0,11.0,11.0,11.0,10.0,10.0,10.0,10.0,10.0,10.0,10.0,10.0,10.0,10.0,10.0,9.0,9.0,9.0,9.0,9.0,9.0,9.0,9.0,9.0,9.0,9.0,9.0,8.0]
52c52
`

@amanraj2520
Copy link
Contributor Author

I found a fix for the first test in HIVE-19313 (https://issues.apache.org/jira/browse/HIVE-19313). Will cherry pick the same

@amanraj2520
Copy link
Contributor Author

Remaining tests which failed were disabled in https://issues.apache.org/jira/browse/HIVE-20715 and https://issues.apache.org/jira/browse/HIVE-20741 so cherry picked the same

@cnauroth
Copy link
Contributor

cnauroth commented Jan 4, 2023

Thanks, @amanraj2520 . It looks like you've picked up the cherry-pick I had in mind as well as a few others. I don't know if committers would prefer to see them split out individually for cleaner revision history. If so, I'll post my PR for just the HIVE-19104 backport.

@amanraj2520
Copy link
Contributor Author

@cnauroth I agree with you. I was going to revert the additional commits just was making sure that these test work fine when combined together. This PR tests that. This PR will now going to be for HIVE-19104. Will raise new tickets.

@amanraj2520
Copy link
Contributor Author

@cnauroth @zabetak @abstractdog I have done the changes from my side and tested it in my local as well. Can you please approve and merge this PR.

@amanraj2520
Copy link
Contributor Author

@abstractdog Can you please review this PR and merge. We have tested it locally as well as on the pipelines

@cnauroth
Copy link
Contributor

cnauroth commented Jan 5, 2023

I created HIVE-26910 / #3918 for separate tracking of the warehouse directory configuration fix.

@cnauroth
Copy link
Contributor

cnauroth commented Jan 5, 2023

I created HIVE-26910 / #3918 for separate tracking of the warehouse directory configuration fix.

Oops, sorry! I misunderstood. I see now that the changes are still included in this PR. I'll abandon my PR if this one gets committed.

Copy link
Contributor

@cnauroth cnauroth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 (non-binding)

I confirmed that this is the same thing as the pull request I just created by mistake:

> diff <(curl -sS https://patch-diff.githubusercontent.com/raw/apache/hive/pull/3873.diff) <(curl -sS https://patch-diff.githubusercontent.com/raw/apache/hive/pull/3918.diff)

> echo $?
0

I've been running with this locally. TestHS2ImpersonationWithRemoteMS is passing as well as all other test suites that were modified.

Thank you for the contribution, @amanraj2520 !

@amanraj2520
Copy link
Contributor Author

@abstractdog @zabetak A gentle reminder. Can you please approve and merge this.

@abstractdog abstractdog self-requested a review January 8, 2023 15:19
Copy link
Contributor

@abstractdog abstractdog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants