[ONLY TEST][SQL] Improve TPCDSCollationQueryTestSuite #47369

Closed
panbingkun wants to merge 1 commit into apache:master from panbingkun:improve_TPCDSCollationQueryTestSuite

@panbingkun
Contributor

@panbingkun panbingkun commented Jul 16, 2024

What changes were proposed in this pull request?

Why are the changes needed?

https://github.com/panbingkun/spark/actions/runs/9953478770/job/27497045622

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Only test.

Was this patch authored or co-authored using generative AI tooling?

No.

@github-actions github-actions bot added the SQL label Jul 16, 2024
@LuciferYang
Contributor

From the log, we can see:

Warning: [1888.608s][warning][gc,alloc] broadcast-exchange-907: Retried waiting for GCLocker too often allocating 33554434 words

If this is the reason for the test failures, I think we should first investigate the root cause of the GCLocker's activity to avoid covering up any unknown performance bottlenecks.

@panbingkun
Contributor Author

> from the log we can see
>
> Warning: [1888.608s][warning][gc,alloc] broadcast-exchange-907: Retried waiting for GCLocker too often allocating 33554434 words
>
> If this is the reason for the test failures, I think we should first investigate the root cause of the GCLocker's activity to avoid covering up any unknown performance bottlenecks.

I have only encountered this once so far, so this PR is just a record.
Additionally, I believe this suite originated from TPCDSQueryTestSuite. I briefly compared the two and found two major differences between them: joinConfs and System.gc() (a workaround for the GitHub Actions memory limitation; see also SPARK-37368):

joinConfs.foreach { conf =>
System.gc() // SPARK-37368

@panbingkun
Contributor Author

I will keep observing and will consider aligning the two suites if I encounter this again.

@LuciferYang
Contributor

> > from the log we can see
> >
> > Warning: [1888.608s][warning][gc,alloc] broadcast-exchange-907: Retried waiting for GCLocker too often allocating 33554434 words
> >
> > If this is the reason for the test failures, I think we should first investigate the root cause of the GCLocker's activity to avoid covering up any unknown performance bottlenecks.
>
> I have only encountered this once so far, and this is just a record. Additionally, I believe it originated from TPCDSQueryTestSuite. I compared it slightly with TPCDSQueryTestSuite and found that there are two major differences between it and TPCDSQueryTestSuite: joinConfs and System.gc() // Workaround for GitHub Actions memory limitation, see also SPARK-37368
>
> joinConfs.foreach { conf =>
> System.gc() // SPARK-37368

SPARK-37368 did not add System.gc() because of GCLocker activity. IIRC, it was added to address excessive memory usage that led to the test process being killed.

@github-actions

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!

@github-actions github-actions bot added the Stale label Oct 27, 2024
@github-actions github-actions bot closed this Oct 28, 2024