
@dongjoon-hyun dongjoon-hyun commented Nov 19, 2025

What changes were proposed in this pull request?

This PR aims to fix release-build.sh to detect REPO_ID correctly.

Why are the changes needed?

Previously, we used grep -A 5 to find the description tag.

grep -A 5 "<repositoryId>orgapachespark-" | \
awk '/<repositoryId>/ { id = $0 } /<description>/ && $0 ~ /Apache Spark '"$RELEASE_VERSION"'/ { print id }' | \

However, this is no longer sufficient. According to today's result, the description tag appears 13 lines after the repositoryId tag, so we need grep -A 13, as shown in the following comparison.

$ curl --retry 10 --retry-all-errors -s -u "$ASF_USERNAME:$ASF_PASSWORD" https://repository.apache.org/service/local/staging/profile_repositories | grep -A 5 "<repositoryId>orgapachespark-"
      <repositoryId>orgapachespark-1505</repositoryId>
      <type>closed</type>
      <policy>release</policy>
      <userId>dongjoon</userId>
      <userAgent>curl/7.81.0</userAgent>
      <ipAddress>35.94.112.49</ipAddress>
$ curl --retry 10 --retry-all-errors -s -u "$ASF_USERNAME:$ASF_PASSWORD" https://repository.apache.org/service/local/staging/profile_repositories | grep -A 13 "<repositoryId>orgapachespark-"
      <repositoryId>orgapachespark-1505</repositoryId>
      <type>closed</type>
      <policy>release</policy>
      <userId>dongjoon</userId>
      <userAgent>curl/7.81.0</userAgent>
      <ipAddress>35.94.112.49</ipAddress>
      <repositoryURI>https://repository.apache.org/content/repositories/orgapachespark-1505</repositoryURI>
      <created>2025-11-16T20:23:35.413Z</created>
      <createdDate>Sun Nov 16 20:23:35 UTC 2025</createdDate>
      <createdTimestamp>1763324615413</createdTimestamp>
      <updated>2025-11-16T21:02:45.041Z</updated>
      <updatedDate>Sun Nov 16 21:02:45 UTC 2025</updatedDate>
      <updatedTimestamp>1763326965041</updatedTimestamp>
      <description>Apache Spark 4.1.0-preview4 (commit c125aea395b)</description>
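
A fixed -A window has to be revisited whenever the number of fields between repositoryId and description changes. As a rough sketch only (not what this PR changes), the lookup could avoid counting lines by letting awk remember the most recent repositoryId and print it when a matching description appears. The function name `find_repo_id` and the inline `sample` variable are hypothetical; the sample is abbreviated from the output above.

```shell
# Hypothetical helper: read the staging XML on stdin and print the ID of the
# repository whose <description> mentions "Apache Spark <version>".
find_repo_id() {
  version="$1"
  awk -v ver="Apache Spark $version" '
    # Remember the most recent repository ID, stripped of its XML tags.
    /<repositoryId>orgapachespark-/ {
      id = $0
      sub(/.*<repositoryId>/, "", id)
      sub(/<\/repositoryId>.*/, "", id)
    }
    # index() does a plain substring match, so dots in the version are literal.
    /<description>/ && index($0, ver) { print id }
  '
}

# Abbreviated sample taken from the output above (not a live response).
sample='      <repositoryId>orgapachespark-1505</repositoryId>
      <type>closed</type>
      <policy>release</policy>
      <updatedTimestamp>1763326965041</updatedTimestamp>
      <description>Apache Spark 4.1.0-preview4 (commit c125aea)</description>'

printf '%s\n' "$sample" | find_repo_id "4.1.0-preview4"
# prints: orgapachespark-1505
```

Because the match is driven by the tags rather than by a line count, the sketch does not depend on how many fields sit between the two elements. It still assumes each element is on its own line, as in the output above.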

Does this PR introduce any user-facing change?

No behavior change.

How was this patch tested?

Manually tested.

$ curl --retry 10 --retry-all-errors -s -u "$ASF_USERNAME:$ASF_PASSWORD" https://repository.apache.org/service/local/staging/profile_repositories | grep -A 13 "<repositoryId>orgapachespark-" | awk '/<repositoryId>/ { id = $0 } /<description>/ && $0 ~ /Apache Spark '"$RELEASE_VERSION"'/ { print id }'
      <repositoryId>orgapachespark-1505</repositoryId>
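
With the old -A 5 window, the description line never reaches the awk filter in this pipeline, so it prints nothing. A hypothetical guard such as `require_single_repo_id` below could make that case fail loudly instead of continuing with an empty value. The function name and error message are assumptions for illustration and are not part of release-build.sh.

```shell
# Hypothetical guard: succeed only when exactly one staging repository line
# was found; otherwise report the count on stderr and return non-zero.
require_single_repo_id() {
  ids="$1"
  # grep -c prints 0 and exits non-zero when nothing matches, hence "|| true".
  count=$(printf '%s\n' "$ids" | grep -c 'orgapachespark-' || true)
  if [ "$count" -ne 1 ]; then
    echo "expected exactly one staging repository, found $count" >&2
    return 1
  fi
  printf '%s\n' "$ids"
}

# Usage with the line printed by the manual test above.
require_single_repo_id '      <repositoryId>orgapachespark-1505</repositoryId>'
```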

After merging this, I'm going to test with Apache Spark 4.1.0-preview4.

Was this patch authored or co-authored using generative AI tooling?

No.

@dongjoon-hyun
Member Author

Could you review this PR, @HyukjinKwon ?

@HyukjinKwon
Member

Thanks!

@dongjoon-hyun
Member Author

Thank you, @HyukjinKwon . Merged to master/4.1.

dongjoon-hyun added a commit that referenced this pull request Nov 19, 2025
…ctly

### What changes were proposed in this pull request?

This PR aims to fix `release-build.sh` to detect `REPO_ID` correctly.

### Why are the changes needed?

Previously, we used `grep -A 5` to find the `description` tag.

https://github.com/apache/spark/blob/f328b5ef14c9ef4e2d04ab69c0578ab461388715/dev/create-release/release-build.sh#L501-L502

However, this is no longer sufficient. According to today's result, the `description` tag appears 13 lines after the `repositoryId` tag, so we need `grep -A 13`, as shown in the following comparison.

```
$ curl --retry 10 --retry-all-errors -s -u "$ASF_USERNAME:$ASF_PASSWORD" https://repository.apache.org/service/local/staging/profile_repositories | grep -A 5 "<repositoryId>orgapachespark-"
      <repositoryId>orgapachespark-1505</repositoryId>
      <type>closed</type>
      <policy>release</policy>
      <userId>dongjoon</userId>
      <userAgent>curl/7.81.0</userAgent>
      <ipAddress>35.94.112.49</ipAddress>
```

```
$ curl --retry 10 --retry-all-errors -s -u "$ASF_USERNAME:$ASF_PASSWORD" https://repository.apache.org/service/local/staging/profile_repositories | grep -A 13 "<repositoryId>orgapachespark-"
      <repositoryId>orgapachespark-1505</repositoryId>
      <type>closed</type>
      <policy>release</policy>
      <userId>dongjoon</userId>
      <userAgent>curl/7.81.0</userAgent>
      <ipAddress>35.94.112.49</ipAddress>
      <repositoryURI>https://repository.apache.org/content/repositories/orgapachespark-1505</repositoryURI>
      <created>2025-11-16T20:23:35.413Z</created>
      <createdDate>Sun Nov 16 20:23:35 UTC 2025</createdDate>
      <createdTimestamp>1763324615413</createdTimestamp>
      <updated>2025-11-16T21:02:45.041Z</updated>
      <updatedDate>Sun Nov 16 21:02:45 UTC 2025</updatedDate>
      <updatedTimestamp>1763326965041</updatedTimestamp>
      <description>Apache Spark 4.1.0-preview4 (commit c125aea)</description>
```

### Does this PR introduce _any_ user-facing change?

No behavior change.

### How was this patch tested?

Manually tested.

```
$ curl --retry 10 --retry-all-errors -s -u "$ASF_USERNAME:$ASF_PASSWORD" https://repository.apache.org/service/local/staging/profile_repositories | grep -A 13 "<repositoryId>orgapachespark-" | awk '/<repositoryId>/ { id = $0 } /<description>/ && $0 ~ /Apache Spark '"$RELEASE_VERSION"'/ { print id }'
      <repositoryId>orgapachespark-1505</repositoryId>
```

After merging this, I'm going to test with `Apache Spark 4.1.0-preview4`.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #53136 from dongjoon-hyun/SPARK-54426.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit be281fb)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
@dongjoon-hyun dongjoon-hyun deleted the SPARK-54426 branch November 19, 2025 23:49
@HyukjinKwon
Member

Let me cherry-pick this to branch-4 and branch-3.5 too. It should work in the same way.

HyukjinKwon pushed a commit that referenced this pull request Nov 19, 2025
…ctly

Closes #53136 from dongjoon-hyun/SPARK-54426.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
HyukjinKwon pushed a commit that referenced this pull request Nov 19, 2025
…ctly

Closes #53136 from dongjoon-hyun/SPARK-54426.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit be281fb)
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
@dongjoon-hyun
Member Author

Oh, thanks, @HyukjinKwon !

@dongjoon-hyun
Member Author

dongjoon-hyun commented Nov 20, 2025

I cancelled the release job because of other issues. Initially, the release GitHub Actions job didn't support making a preview from branch-4.1, so I released it manually. The same problem occurs in the finalize step, as shown below. I specified RELEASE_VERSION: 4.1.0-preview4 explicitly, but during processing it was replaced with v4.1.0-rc1.

env:
    GIT_BRANCH: branch-4.1
    RELEASE_VERSION: 4.1.0-preview4
    SPARK_RC_COUNT: 1
    IS_FINALIZE: true
    GIT_NAME: ***-hyun
    ASF_USERNAME: ***
    ASF_PASSWORD: ***
    GPG_PRIVATE_KEY: ***
  
    GPG_PASSPHRASE: ***
    PYPI_API_TOKEN: ***
    DEBUG_MODE: 1
    ANSWER: y

...

================
Release details:
BRANCH:     branch-4.1
VERSION:    4.1.0
TAG:        v4.1.0-rc1
NEXT:       4.1.1-SNAPSHOT
ASF USER:   ***
GPG KEY:    ***@apache.org
FULL NAME:  ***-hyun
E-MAIL:     ***@apache.org
================

I'll do the actual testing next week with the Apache Spark 4.1.0 RC and its finalization, @HyukjinKwon.

@dongjoon-hyun
Member Author

For 4.1.0-preview4, I'll proceed to finalize manually.

@HyukjinKwon
Member

thanks!
