Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-46577][SQL] HiveMetastoreLazyInitializationSuite leaks hive's SessionState #44578

Closed
wants to merge 1 commit into from

Conversation

yaooqinn
Copy link
Member

@yaooqinn yaooqinn commented Jan 3, 2024

What changes were proposed in this pull request?

The upcoming tests with the new hive configurations will have no effect due to the leaked SessionState.

06:21:12.848 pool-1-thread-1 INFO ThriftServerWithSparkContextInHttpSuite: Trying to start HiveThriftServer2: mode=http, attempt=0
....
06:21:12.851 pool-1-thread-1 INFO AbstractService: Service:OperationManager is inited.
06:21:12.851 pool-1-thread-1 INFO AbstractService: Service:SessionManager is inited.
06:21:12.851 pool-1-thread-1 INFO AbstractService: Service: CLIService is inited.
06:21:12.851 pool-1-thread-1 INFO AbstractService: Service:ThriftBinaryCLIService is inited.
06:21:12.851 pool-1-thread-1 INFO AbstractService: Service: HiveServer2 is inited.
06:21:12.851 pool-1-thread-1 INFO AbstractService: Service:OperationManager is started.
06:21:12.851 pool-1-thread-1 INFO AbstractService: Service:SessionManager is started.
06:21:12.851 pool-1-thread-1 INFO AbstractService: Service: CLIService is started.
06:21:12.852 pool-1-thread-1 INFO AbstractService: Service:ThriftBinaryCLIService is started.
06:21:12.852 pool-1-thread-1 INFO ThriftCLIService: Starting ThriftBinaryCLIService on port 10000 with 5...500 worker threads
06:21:12.852 pool-1-thread-1 INFO AbstractService: Service:HiveServer2 is started.

As the logs above revealed, ThriftServerWithSparkContextInHttpSuite started the ThriftBinaryCLIService instead of the ThriftHttpCLIService. This is because in HiveClientImpl, the new configurations are only applied to hive conf during initializing but not for existing ones.

This cause ThriftServerWithSparkContextInHttpSuite retrying or even aborting.

Why are the changes needed?

Fix flakiness in tests

Does this PR introduce any user-facing change?

no

How was this patch tested?

ran tests locally with the hive-thriftserver module locally,

Was this patch authored or co-authored using generative AI tooling?

no

@github-actions github-actions bot added the SQL label Jan 3, 2024
Copy link
Contributor

@LuciferYang LuciferYang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM.

dongjoon-hyun pushed a commit that referenced this pull request Jan 3, 2024
…SessionState

### What changes were proposed in this pull request?

The upcoming tests with the new hive configurations will have no effect due to the leaked SessionState.

```
06:21:12.848 pool-1-thread-1 INFO ThriftServerWithSparkContextInHttpSuite: Trying to start HiveThriftServer2: mode=http, attempt=0
....
06:21:12.851 pool-1-thread-1 INFO AbstractService: Service:OperationManager is inited.
06:21:12.851 pool-1-thread-1 INFO AbstractService: Service:SessionManager is inited.
06:21:12.851 pool-1-thread-1 INFO AbstractService: Service: CLIService is inited.
06:21:12.851 pool-1-thread-1 INFO AbstractService: Service:ThriftBinaryCLIService is inited.
06:21:12.851 pool-1-thread-1 INFO AbstractService: Service: HiveServer2 is inited.
06:21:12.851 pool-1-thread-1 INFO AbstractService: Service:OperationManager is started.
06:21:12.851 pool-1-thread-1 INFO AbstractService: Service:SessionManager is started.
06:21:12.851 pool-1-thread-1 INFO AbstractService: Service: CLIService is started.
06:21:12.852 pool-1-thread-1 INFO AbstractService: Service:ThriftBinaryCLIService is started.
06:21:12.852 pool-1-thread-1 INFO ThriftCLIService: Starting ThriftBinaryCLIService on port 10000 with 5...500 worker threads
06:21:12.852 pool-1-thread-1 INFO AbstractService: Service:HiveServer2 is started.
```

As the logs above revealed, ThriftServerWithSparkContextInHttpSuite started the ThriftBinaryCLIService instead of the ThriftHttpCLIService. This is because in HiveClientImpl, the new configurations are only applied to hive conf during initializing but not for existing ones.

This cause ThriftServerWithSparkContextInHttpSuite retrying or even aborting.

### Why are the changes needed?

Fix flakiness in tests

### Does this PR introduce _any_ user-facing change?

no
### How was this patch tested?

ran tests locally with the hive-thriftserver module locally,
### Was this patch authored or co-authored using generative AI tooling?
no

Closes #44578 from yaooqinn/SPARK-46577.

Authored-by: Kent Yao <yao@apache.org>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
(cherry picked from commit 605fecd)
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
dongjoon-hyun pushed a commit that referenced this pull request Jan 3, 2024
…SessionState

### What changes were proposed in this pull request?

The upcoming tests with the new hive configurations will have no effect due to the leaked SessionState.

```
06:21:12.848 pool-1-thread-1 INFO ThriftServerWithSparkContextInHttpSuite: Trying to start HiveThriftServer2: mode=http, attempt=0
....
06:21:12.851 pool-1-thread-1 INFO AbstractService: Service:OperationManager is inited.
06:21:12.851 pool-1-thread-1 INFO AbstractService: Service:SessionManager is inited.
06:21:12.851 pool-1-thread-1 INFO AbstractService: Service: CLIService is inited.
06:21:12.851 pool-1-thread-1 INFO AbstractService: Service:ThriftBinaryCLIService is inited.
06:21:12.851 pool-1-thread-1 INFO AbstractService: Service: HiveServer2 is inited.
06:21:12.851 pool-1-thread-1 INFO AbstractService: Service:OperationManager is started.
06:21:12.851 pool-1-thread-1 INFO AbstractService: Service:SessionManager is started.
06:21:12.851 pool-1-thread-1 INFO AbstractService: Service: CLIService is started.
06:21:12.852 pool-1-thread-1 INFO AbstractService: Service:ThriftBinaryCLIService is started.
06:21:12.852 pool-1-thread-1 INFO ThriftCLIService: Starting ThriftBinaryCLIService on port 10000 with 5...500 worker threads
06:21:12.852 pool-1-thread-1 INFO AbstractService: Service:HiveServer2 is started.
```

As the logs above revealed, ThriftServerWithSparkContextInHttpSuite started the ThriftBinaryCLIService instead of the ThriftHttpCLIService. This is because in HiveClientImpl, the new configurations are only applied to hive conf during initializing but not for existing ones.

This cause ThriftServerWithSparkContextInHttpSuite retrying or even aborting.

### Why are the changes needed?

Fix flakiness in tests

### Does this PR introduce _any_ user-facing change?

no
### How was this patch tested?

ran tests locally with the hive-thriftserver module locally,
### Was this patch authored or co-authored using generative AI tooling?
no

Closes #44578 from yaooqinn/SPARK-46577.

Authored-by: Kent Yao <yao@apache.org>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
(cherry picked from commit 605fecd)
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
@dongjoon-hyun
Copy link
Member

Merged to master/3.5/3.4.

Thank you, @yaooqinn and @LuciferYang .

@yaooqinn
Copy link
Member Author

yaooqinn commented Jan 3, 2024

Thanks @dongjoon-hyun @LuciferYang

@yaooqinn yaooqinn deleted the SPARK-46577 branch January 3, 2024 15:17
szehon-ho pushed a commit to szehon-ho/spark that referenced this pull request Feb 7, 2024
…SessionState

### What changes were proposed in this pull request?

The upcoming tests with the new hive configurations will have no effect due to the leaked SessionState.

```
06:21:12.848 pool-1-thread-1 INFO ThriftServerWithSparkContextInHttpSuite: Trying to start HiveThriftServer2: mode=http, attempt=0
....
06:21:12.851 pool-1-thread-1 INFO AbstractService: Service:OperationManager is inited.
06:21:12.851 pool-1-thread-1 INFO AbstractService: Service:SessionManager is inited.
06:21:12.851 pool-1-thread-1 INFO AbstractService: Service: CLIService is inited.
06:21:12.851 pool-1-thread-1 INFO AbstractService: Service:ThriftBinaryCLIService is inited.
06:21:12.851 pool-1-thread-1 INFO AbstractService: Service: HiveServer2 is inited.
06:21:12.851 pool-1-thread-1 INFO AbstractService: Service:OperationManager is started.
06:21:12.851 pool-1-thread-1 INFO AbstractService: Service:SessionManager is started.
06:21:12.851 pool-1-thread-1 INFO AbstractService: Service: CLIService is started.
06:21:12.852 pool-1-thread-1 INFO AbstractService: Service:ThriftBinaryCLIService is started.
06:21:12.852 pool-1-thread-1 INFO ThriftCLIService: Starting ThriftBinaryCLIService on port 10000 with 5...500 worker threads
06:21:12.852 pool-1-thread-1 INFO AbstractService: Service:HiveServer2 is started.
```

As the logs above revealed, ThriftServerWithSparkContextInHttpSuite started the ThriftBinaryCLIService instead of the ThriftHttpCLIService. This is because in HiveClientImpl, the new configurations are only applied to hive conf during initializing but not for existing ones.

This cause ThriftServerWithSparkContextInHttpSuite retrying or even aborting.

### Why are the changes needed?

Fix flakiness in tests

### Does this PR introduce _any_ user-facing change?

no
### How was this patch tested?

ran tests locally with the hive-thriftserver module locally,
### Was this patch authored or co-authored using generative AI tooling?
no

Closes apache#44578 from yaooqinn/SPARK-46577.

Authored-by: Kent Yao <yao@apache.org>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
(cherry picked from commit 605fecd)
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
3 participants