@pan3793
Member

@pan3793 pan3793 commented Oct 20, 2025

What changes were proposed in this pull request?

Currently, the GHA job streaming, ... takes ~90min, while yarn takes ~25min. This change balances them by moving connect (which takes ~20min) from the former to the latter.

Why are the changes needed?

Balance GHA jobs.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Before

  • streaming, ... connect takes ~90min
  • yarn takes ~25min

After https://github.com/pan3793/spark/actions/runs/18645254985/job/53151226275

  • streaming, ... takes ~75min
  • connect, yarn takes ~50min
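
The rebalancing arithmetic can be sketched as follows. This is a rough illustration only: the durations are the approximate figures quoted in this description, and `move_module` is a hypothetical helper, not part of the Spark CI scripts. The naive estimate (70/45) comes out lower than the observed run (~75/~50), so per-job overhead is evidently not captured by simple subtraction.

```python
# Approximate durations in minutes, as quoted in the PR description.
before = {"streaming, ... connect": 90, "yarn": 25}
connect_minutes = 20  # approximate cost of the connect tests

def move_module(jobs, src, dst, minutes):
    """Return a new dict with `minutes` of work shifted from src to dst."""
    result = dict(jobs)
    result[src] -= minutes
    result[dst] += minutes
    return result

estimated = move_module(before, "streaming, ... connect", "yarn", connect_minutes)

# Jobs run in parallel, so the slowest job bounds the wall-clock time.
critical_path_before = max(before.values())
critical_path_after = max(estimated.values())

print(estimated, critical_path_before, critical_path_after)
```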

Was this patch authored or co-authored using generative AI tooling?

No.

@github-actions github-actions bot added the INFRA label Oct 20, 2025
@pan3793 pan3793 marked this pull request as ready for review October 20, 2025 08:40
@pan3793
Member Author

pan3793 commented Oct 20, 2025

cc @HyukjinKwon @LuciferYang

@pan3793 pan3793 changed the title from "[SPARK-53952][INFRA] Balance GHA CI jobs" to "[SPARK-53952][INFRA] Balance GHA CI jobs by shifting connect tests" Oct 20, 2025
@pan3793
Member Author

pan3793 commented Oct 20, 2025

The Docker K8s IT failures should be unrelated; they are caused by a transient Docker Hub 503.

Member

@dongjoon-hyun dongjoon-hyun left a comment

Sorry, but this PR doesn't work as you proposed because it simply merges two flaky tests (connect and yarn) into a single bucket.

FYI, I already did this in SPARK-47051 and split it back, @pan3793 .
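
To illustrate the concern about putting two flaky suites in one bucket: a merged job fails whenever either suite fails, so its failure rate is higher than that of either suite alone. The sketch below assumes independent failures, and the rates are made-up placeholders, not measurements from Spark CI.

```python
def combined_failure_rate(*rates):
    """Failure rate of a job that fails if any of its suites fails.

    Assumes the suites fail independently of each other.
    """
    all_pass = 1.0
    for rate in rates:
        all_pass *= 1.0 - rate
    return 1.0 - all_pass

# Hypothetical per-run failure rates, for illustration only.
p_connect = 0.10
p_yarn = 0.05

combined = combined_failure_rate(p_connect, p_yarn)
print(f"combined failure rate: {combined:.3f}")
```

With these placeholder numbers the merged bucket fails more often than either suite does separately, and every such failure reruns both suites.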

@dongjoon-hyun
Member

Technically, I'm -1 on merging the connect and yarn test pipelines until we significantly reduce the flakiness of the YARN and Connect modules.

@pan3793
Member Author

pan3793 commented Oct 21, 2025

@dongjoon-hyun, from my observation, the most flaky test is streaming* jobs, while yarn is relatively stable. Additionally, the community has been actively working on streaming and connect modules, this also increases the test cases in these modules.

@dongjoon-hyun
Member

dongjoon-hyun commented Oct 21, 2025

from my observation, the most flaky test is streaming* jobs, while yarn is relatively stable.

YARN failed on your PR today again. I can say that I have a longer observation experience than you, @pan3793 .

So, what do you want to say by the following? That doesn't mean anything about the stability.

Additionally, the community has been actively working on streaming and connect modules, this also increases the test cases in these modules.

Although I appreciate your community contributions, we had better discuss this later, once the YARN or Connect module becomes more deterministic again.

@pan3793
Member Author

pan3793 commented Oct 21, 2025

@dongjoon-hyun I see, let me close this then.

@pan3793 pan3793 closed this Oct 21, 2025

4 participants