Skip to content

Commit

Permalink
[SPARK-37121][HIVE][TEST] Fix Python version detection bug in TestUti…
Browse files Browse the repository at this point in the history
…ls used by HiveExternalCatalogVersionsSuite

### What changes were proposed in this pull request?
Fix a bug in `TestUtils.isPythonVersionAtLeast38` to allow for `HiveExternalCatalogVersionsSuite` to test against Spark 2.x releases in environments with Python <= 3.7.

### Why are the changes needed?
The logic in `TestUtils.isPythonVersionAtLeast38` was added in #30044 to prevent Spark 2.4 from being run in an environment where the Python3 version installed was >= Python 3.8, which is not compatible with Spark 2.4. However, this method always returns true, so only Spark 3.x versions will ever be included in the version set for `HiveExternalCatalogVersionsSuite`, regardless of the system-installed version of Python.

The problem is here:
https://github.com/apache/spark/blob/951efb80856e2a92ba3690886c95643567dae9d0/core/src/main/scala/org/apache/spark/TestUtils.scala#L280-L291
It's trying to evaluate the version of Python using a `ProcessLogger`, but the logger accepts a `String => Unit` function, i.e., it does not make use of the return value in any way (since it's meant for logging). So the result of the `startsWith` checks are thrown away, and `attempt.isSuccess && attempt.get == 0` will always be true as long as your system has a `python3` binary (of any version).

### Does this PR introduce _any_ user-facing change?
No, test changes only.

### How was this patch tested?
Confirmed by checking that `HiveExternalCatalogVersionsSuite` downloads binary distros for Spark 2.x lines as well as 3.x when I symlink my `python3` to Python 3.7, and only downloads distros for the 3.x lines when I symlink my `python3` to Python 3.9.

```bash
brew link --force python3.7
# run HiveExternalCatalogVersionsSuite and validate that 2.x and 3.x tests get executed
brew unlink python3.7
brew link --force python3.9
# run HiveExternalCatalogVersionsSuite and validate that only 3.x tests get executed
```

Closes #34395 from xkrogen/xkrogen-SPARK-37121-testutils-python38-fix.

Authored-by: Erik Krogen <xkrogen@apache.org>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
  • Loading branch information
xkrogen authored and HyukjinKwon committed Oct 28, 2021
1 parent 3319361 commit 30e1261
Showing 1 changed file with 3 additions and 10 deletions.
13 changes: 3 additions & 10 deletions core/src/main/scala/org/apache/spark/TestUtils.scala
Expand Up @@ -278,16 +278,9 @@ private[spark] object TestUtils {
}

def isPythonVersionAtLeast38(): Boolean = {
val attempt = if (Utils.isWindows) {
Try(Process(Seq("cmd.exe", "/C", "python3 --version"))
.run(ProcessLogger(s => s.startsWith("Python 3.8") || s.startsWith("Python 3.9")))
.exitValue())
} else {
Try(Process(Seq("sh", "-c", "python3 --version"))
.run(ProcessLogger(s => s.startsWith("Python 3.8") || s.startsWith("Python 3.9")))
.exitValue())
}
attempt.isSuccess && attempt.get == 0
val cmdSeq = if (Utils.isWindows) Seq("cmd.exe", "/C") else Seq("sh", "-c")
val pythonSnippet = "import sys; sys.exit(sys.version_info < (3, 8, 0))"
Try(Process(cmdSeq :+ s"python3 -c '$pythonSnippet'").! == 0).getOrElse(false)
}

/**
Expand Down

0 comments on commit 30e1261

Please sign in to comment.