[SPARK-51239][INFRA] Upgrade Github Action image for TPCDSQueryBenchmark from 20.04 to latest #49980

Closed
wayneguow wants to merge 3 commits into apache:master from wayneguow:tpcds_inf

@wayneguow wayneguow commented Feb 17, 2025

What changes were proposed in this pull request?

This PR upgrades the GitHub Actions image for TPCDSQueryBenchmark from Ubuntu 20.04 to the latest image and updates the databricks/tpcds-kit dependency to the latest code.

Previously, tpcds-kit failed to compile on newer Ubuntu images because of g++ version incompatibilities; that problem was fixed by databricks/tpcds-kit#7.
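
For readers unfamiliar with how the benchmark job consumes tpcds-kit, the fragment below is a rough sketch of a checkout-and-build step pair. It is not copied from the Spark workflow: the step names and the `ref` value are assumptions for illustration, and the real workflow may pin a specific commit instead of a branch. The `make OS=LINUX` invocation follows the build instructions in the tpcds-kit README.

```yaml
# Sketch only: step names and the `ref` value are assumptions for illustration.
steps:
  - name: Checkout tpcds-kit repository
    uses: actions/checkout@v4
    with:
      repository: databricks/tpcds-kit
      # Assumption: track the latest code; a real workflow may pin a commit SHA.
      ref: master
      path: ./tpcds-kit
  - name: Build tpcds-kit
    # Compiles the data and query generators with the runner's default toolchain,
    # which is where the g++ compatibility problem used to surface.
    run: cd tpcds-kit/tools && make OS=LINUX
```

Because the build uses whatever compiler the runner image ships, the tpcds-kit revision and the runner image have to be upgraded together.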

Why are the changes needed?

Refer to: actions/runner-images#11101

The Ubuntu 20.04 Actions runner image will begin deprecation on 2025-02-01 and will be fully unsupported by 2025-04-01

image

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Manually checked on Ubuntu 24.04; GitHub Actions (GA) passes.

Was this patch authored or co-authored using generative AI tooling?

No.

@github-actions github-actions bot added the INFRA label Feb 17, 2025
@wayneguow
Contributor Author

If this PR works, it needs to be applied to all active branches.

  name: "Generate an input dataset for TPCDSQueryBenchmark with SF=1"
  if: contains(inputs.class, 'TPCDSQueryBenchmark') || contains(inputs.class, '*')
- runs-on: ubuntu-20.04
+ runs-on: ubuntu-24.04
Contributor

Can you manually run org.apache.spark.sql.execution.benchmark.TPCDSQueryBenchmark to confirm that this change works?

Contributor Author

  name: "Generate an input dataset for TPCDSQueryBenchmark with SF=1"
  if: contains(inputs.class, 'TPCDSQueryBenchmark') || contains(inputs.class, '*')
- runs-on: ubuntu-20.04
+ runs-on: ubuntu-24.04
Contributor Author

In order to avoid uncertain compatibility in the future, I am using 24.04 here instead of latest.

- # Pin to 'Ubuntu 20.04' due to 'databricks/tpcds-kit' compilation
- runs-on: ubuntu-20.04
+ # Pin to 'Ubuntu 24.04' due to 'databricks/tpcds-kit' compilation
+ runs-on: ubuntu-24.04
Contributor

Can we just use ubuntu-latest, as the other jobs do?

Contributor Author

Updated it.
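
For context, the two options discussed in this thread differ only in the runner label. The fragment below is a minimal sketch of that difference; the job names and the `lsb_release` step are hypothetical and do not come from the Spark workflow.

```yaml
# Sketch only: job names and steps are hypothetical, not copied from the Spark workflow.
jobs:
  pinned-runner:
    # Fixed label: the job stays on Ubuntu 24.04 until someone edits the workflow.
    runs-on: ubuntu-24.04
    steps:
      - run: lsb_release -a
  floating-runner:
    # Floating label: GitHub repoints it to a newer image over time,
    # so the toolchain can change without a commit in this repository.
    runs-on: ubuntu-latest
    steps:
      - run: lsb_release -a
```

Pinning trades automatic upgrades for predictability; the floating label keeps the job consistent with the other jobs in the workflow, which is the option this PR ended up with.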

@wayneguow
Contributor Author

image
Seems ok.

LuciferYang previously approved these changes Feb 17, 2025
Contributor

@LuciferYang LuciferYang left a comment

+1, LGTM

@LuciferYang LuciferYang dismissed their stale review February 17, 2025 08:43

Use ubuntu-latest ?

@wayneguow wayneguow changed the title from "[SPARK-51239][INFRA] Upgrade Github Action image for TPCDSQueryBenchmark from 20.04 to 24.04" to "[SPARK-51239][INFRA] Upgrade Github Action image for TPCDSQueryBenchmark from 20.04 to latest" on Feb 17, 2025
LuciferYang pushed a commit that referenced this pull request Feb 19, 2025
…mark` from 20.04 to latest

### What changes were proposed in this pull request?

This PR upgrades the GitHub Actions image for `TPCDSQueryBenchmark` from Ubuntu 20.04 to the latest image and updates the `databricks/tpcds-kit` dependency to the latest code.

Previously, tpcds-kit failed to compile on newer Ubuntu images because of g++ version incompatibilities; that problem was fixed by databricks/tpcds-kit#7.

### Why are the changes needed?

Refer to: actions/runner-images#11101

> The Ubuntu 20.04 Actions runner image will begin deprecation on 2025-02-01 and will be fully unsupported by 2025-04-01

![image](https://github.com/user-attachments/assets/db68ec55-f3ca-4a24-aa81-5347c85ec0ed)

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Manually checked on Ubuntu 24.04; GitHub Actions (GA) passes.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #49980 from wayneguow/tpcds_inf.

Authored-by: Wei Guo <guow93@gmail.com>
Signed-off-by: yangjie01 <yangjie01@baidu.com>
(cherry picked from commit 0af25b8)
Signed-off-by: yangjie01 <yangjie01@baidu.com>
@LuciferYang
Contributor

Merged into master and branch-4.0. Thanks @wayneguow @HyukjinKwon and @beliefer

dongjoon-hyun pushed a commit that referenced this pull request Feb 21, 2025
…tu-20.04` to `ubuntu-22.04` and solved the `TPCDSQueryBenchmark` compatibility issue

### What changes were proposed in this pull request?

This PR upgrades the remaining GitHub Actions images from `ubuntu-20.04` to `ubuntu-22.04` and resolves the `TPCDSQueryBenchmark` compatibility issue.

### Why are the changes needed?

Same as #49980: Ubuntu 20.04 will be removed, and according to `Maintenance releases and EOL` at https://spark.apache.org/versioning-policy.html, Spark 3.5 will remain in service until April 12th 2026, so this change is also needed in branch-3.5.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass GA.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #49988 from wayneguow/branch-3.5.

Authored-by: Wei Guo <guow93@gmail.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
@wayneguow wayneguow deleted the tpcds_inf branch February 25, 2025 03:19
zifeif2 pushed a commit to zifeif2/spark that referenced this pull request Nov 14, 2025
…mark` from 20.04 to latest

Closes apache#49980 from wayneguow/tpcds_inf.

Authored-by: Wei Guo <guow93@gmail.com>
Signed-off-by: yangjie01 <yangjie01@baidu.com>
(cherry picked from commit dbaa8d7)
Signed-off-by: yangjie01 <yangjie01@baidu.com>
4 participants