[SPARK-29550][SQL] - enhance session catalog locking #26213

choojoyq · 2019-10-22T15:11:23Z

What changes were proposed in this pull request?

In my streaming application, spark.streaming.concurrentJobs is set to 50, and this value is used as the size of the underlying thread pool. I perform several SQL operations on DataFrames and automatically create or alter tables and views at runtime. In order to do that, I invoke create ... if not exists operations on the driver on each batch invocation. I noticed that most of the batch time is spent on the driver rather than on the executors. I took a thread dump and found that most of the threads are blocked in SessionCatalog operations, waiting for a lock.

The existing implementation of SessionCatalog uses a single lock, taken by almost all of its methods, to guard the currentDb and tempViews variables. I propose to improve the locking behaviour of SessionCatalog by:

Employing a ReadWriteLock, which allows read operations to execute concurrently.
Replacing synchronized with the corresponding read or write lock.
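
To illustrate the idea, here is a minimal sketch of the read/write lock pattern. It is not the code from this patch: CatalogSketch, withReadLock, withWriteLock and the String-typed view plans are hypothetical stand-ins for the real SessionCatalog state, used only to show how synchronized blocks could map onto read and write locks.

```scala
import java.util.concurrent.locks.ReentrantReadWriteLock
import scala.collection.mutable

// Hypothetical stand-in for SessionCatalog; all names here are illustrative.
class CatalogSketch(initialDb: String) {
  private val lock = new ReentrantReadWriteLock()

  // State that the single `synchronized` lock guards today.
  private var currentDb: String = initialDb
  private val tempViews = mutable.HashMap.empty[String, String]

  // Many threads may hold the read lock at the same time.
  private def withReadLock[T](body: => T): T = {
    val l = lock.readLock()
    l.lock()
    try body finally l.unlock()
  }

  // The write lock is exclusive: it waits for all readers and writers.
  private def withWriteLock[T](body: => T): T = {
    val l = lock.writeLock()
    l.lock()
    try body finally l.unlock()
  }

  // Read-only operations no longer block each other.
  def getCurrentDatabase: String = withReadLock(currentDb)
  def getTempView(name: String): Option[String] =
    withReadLock(tempViews.get(name))

  // Mutating operations still run one at a time.
  def setCurrentDatabase(db: String): Unit =
    withWriteLock { currentDb = db }
  def createTempView(name: String, plan: String): Unit =
    withWriteLock { tempViews(name) = plan }
}
```

One caveat with this conversion: ReentrantReadWriteLock does not support upgrading from a read lock to the write lock, so a method that holds the read lock must not call a method that takes the write lock. With plain synchronized such nested calls are harmless, so each call path needs to be checked.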

It would also be possible to go further and remove the locks on currentDb and tempViews entirely, but I'm not sure whether that is feasible from the implementation point of view.
Perhaps someone can help me with this?
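
As a rough sketch of what removing the locks could look like, assuming that no operation needs to observe currentDb and tempViews together atomically (which is exactly the open question above), the two fields could be replaced by a volatile reference and a concurrent map. LockFreeCatalogSketch and its methods are hypothetical names, not part of Spark.

```scala
import java.util.concurrent.ConcurrentHashMap

// Hypothetical lock-free variant; only valid if operations never need a
// consistent view of both fields at the same time.
class LockFreeCatalogSketch(initialDb: String) {
  // @volatile makes a write by one thread visible to later reads by others.
  @volatile private var currentDb: String = initialDb

  // ConcurrentHashMap handles its own synchronization per operation.
  private val tempViews = new ConcurrentHashMap[String, String]()

  def getCurrentDatabase: String = currentDb
  def setCurrentDatabase(db: String): Unit = { currentDb = db }

  // get returns null for a missing key; Option(...) turns that into None.
  def getTempView(name: String): Option[String] =
    Option(tempViews.get(name))

  // putIfAbsent is atomic, so "create if not exists" needs no external lock.
  // Returns true only if this call actually created the view.
  def createTempViewIfNotExists(name: String, plan: String): Boolean =
    tempViews.putIfAbsent(name, plan) == null
}
```

Any method that performs a check followed by an update across both fields would still need some form of coordination, so this approach may not apply to every SessionCatalog method.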

How was this patch tested?

Only via the existing test suites.

AmplabJenkins · 2019-10-22T15:16:33Z

Can one of the admins verify this patch?

choojoyq · 2019-10-25T13:30:00Z

@andrewor14 could you please take a look?

github-actions · 2020-02-03T00:05:57Z

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!

choojoyq force-pushed the SPARK-29550-enhance-session-catalog-locking branch from 2d3be32 to acef06a Compare October 22, 2019 15:13

choojoyq force-pushed the SPARK-29550-enhance-session-catalog-locking branch from acef06a to d6ba3e5 Compare October 22, 2019 15:16

dongjoon-hyun added the SQL label Oct 22, 2019

[SPARK-29550][SQL] - enhance session catalog locking

b50d8f7

choojoyq force-pushed the SPARK-29550-enhance-session-catalog-locking branch from c3232c5 to b50d8f7 Compare October 23, 2019 09:03

github-actions bot added the Stale label Feb 3, 2020

github-actions bot closed this Feb 4, 2020
