[SPARK-41377][BUILD] Fix spark-version-info.properties not found on Windows #38903
Conversation
@rxin could you please review this PR?
Can one of the admins verify this patch?
Hi, @GauthamBanasandra.
We need to register `build/spark-build-info.ps1` in `appveyor.yml` like the following. That will help you verify.

appveyor.yml, line 30 in 89b2ee2:
```yaml
- dev/appveyor-install-dependencies.ps1
```
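As a sketch of that registration, assuming line 30 sits in the file list that triggers AppVeyor builds (the exact section and ordering in `appveyor.yml` may differ):

```yaml
# Hypothetical fragment of appveyor.yml; section name and ordering are assumptions.
only_commits:
  files:
    - appveyor.yml
    - dev/appveyor-install-dependencies.ps1
    - build/spark-build-info.ps1   # newly registered so CI runs when this script changes
```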
+1, LGTM. I verified this PR on Windows manually.

```
C:\Users\dongj\spark>type spark-version-info.properties
version=3.4.0-SNAPSHOT
user=dongj
revision=b5fc6ed1a8924cebd6632312ad7bcbd956d2171b
branch=spark-version-info-ps
date=2022-12-09T17:53:44Z
url=https://github.com/apache/spark.git
```
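For reference, a minimal sketch of how such a build-info script could emit those fields on a Unix-like shell. This is illustrative only, not Spark's actual `build/spark-build-info` script; the output directory, version argument, and git fallbacks are assumptions:

```shell
#!/bin/sh
# Illustrative sketch only -- not Spark's actual build-info script.
# Writes a spark-version-info.properties file with the fields shown above.
out_dir="${1:-target/extra-resources}"   # hypothetical output-directory argument
version="${2:-3.4.0-SNAPSHOT}"           # hypothetical version argument
mkdir -p "$out_dir"
{
  echo "version=$version"
  echo "user=$(whoami)"
  echo "revision=$(git rev-parse HEAD 2>/dev/null || echo unknown)"
  echo "branch=$(git rev-parse --abbrev-ref HEAD 2>/dev/null || echo unknown)"
  echo "date=$(date -u +%Y-%m-%dT%H:%M:%SZ)"
  echo "url=$(git config --get remote.origin.url 2>/dev/null || echo unknown)"
} > "$out_dir/spark-version-info.properties"
```

A PowerShell counterpart (the `build/spark-build-info.ps1` added in this PR) would emit the same `key=value` pairs, so the Maven build can consume the file regardless of which shell produced it.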
Merged to master for Apache Spark 3.4.0.
I added you to the Apache Spark JIRA contributor group and assigned SPARK-41377 to you.
@dongjoon-hyun Thanks for the help and review. 😊
### What changes were proposed in this pull request?

This PR enhances the Maven build configuration to automatically detect the platform and switch between PowerShell on Windows and Bash on non-Windows operating systems to generate the `spark-version-info.properties` file.

### Why are the changes needed?

While building Spark, the `spark-version-info.properties` file [is generated using bash](https://github.com/apache/spark/blob/d62c18b7497997188ec587e1eb62e75c979c1c93/core/pom.xml#L560-L564). On Windows, if Windows Subsystem for Linux (WSL) is installed, its bash can shadow the other bash executables on the PATH, as noted in SPARK-40739. The bash in WSL has a different mount configuration, so [the target location specified for spark-version-info.properties](https://github.com/apache/spark/blob/d62c18b7497997188ec587e1eb62e75c979c1c93/core/pom.xml#L561-L562) does not resolve to the expected location. Ultimately, this causes `spark-version-info.properties` to be excluded from the spark-core jar, which makes SparkContext initialization fail with the error message depicted above. This PR fixes the issue by directing the build system to use the right shell for the platform.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

I tested this by building on a Windows 10 PC.

```psh
mvn -Pyarn '-Dhadoop.version=3.3.0' -DskipTests clean package
```

Once the build finished, I verified that the `spark-version-info.properties` file was included in the spark-core jar.

![image](https://user-images.githubusercontent.com/10280768/205497898-80e53617-c991-460e-b04a-a3bdd4f298ae.png)

I also ran the SparkPi application and verified that it ran successfully without any errors.

![image](https://user-images.githubusercontent.com/10280768/205499567-f6e8e10a-dcbb-45fb-b282-fc29ba58adee.png)

Closes apache#38903 from GauthamBanasandra/spark-version-info-ps.

Authored-by: Gautham Banasandra <gautham.bangalore@gmail.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
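The platform-dependent shell selection described above can be sketched with Maven's OS-based profile activation. This is a minimal illustration, not the actual `core/pom.xml` change from the PR; the profile id, plugin choice, and script arguments are assumptions:

```xml
<!-- Hypothetical sketch of OS-based profile activation; not the PR's actual pom.xml. -->
<profiles>
  <profile>
    <id>build-info-windows</id>
    <activation>
      <os><family>windows</family></os>
    </activation>
    <build>
      <plugins>
        <plugin>
          <groupId>org.codehaus.mojo</groupId>
          <artifactId>exec-maven-plugin</artifactId>
          <executions>
            <execution>
              <phase>generate-resources</phase>
              <goals><goal>exec</goal></goals>
              <configuration>
                <executable>powershell</executable>
                <arguments>
                  <argument>-File</argument>
                  <argument>${project.basedir}/../build/spark-build-info.ps1</argument>
                  <argument>${project.build.directory}/extra-resources</argument>
                  <argument>${project.version}</argument>
                </arguments>
              </configuration>
            </execution>
          </executions>
        </plugin>
      </plugins>
    </build>
  </profile>
  <!-- A matching profile activated with <os><family>!windows</family></os>
       would invoke the existing bash script instead. -->
</profiles>
```

Maven evaluates `<activation><os>` at build time, so the correct shell is chosen automatically without the user passing a profile flag.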