Kubernetes 1.17 #68

akirillov · 2020-08-19T20:37:43Z

What changes were proposed in this pull request?

This PR cherry-picks two commits from the upstream related to fabric8 client version and required for Spark to support Kubernetes version 1.17.

Why are the changes needed?

To provide Kubernetes version 1.17 support for Spark 2.4.5.

Does this PR introduce any user-facing change?

No

How was this patch tested?

unit tests from this repo
e2e tests from KUDO Spark

Bump fabric8 kubernetes-client to 4.7.1 New fabric8 version brings support for Kubernetes 1.17 clusters. Full release notes: - https://github.com/fabric8io/kubernetes-client/releases/tag/v4.7.0 - https://github.com/fabric8io/kubernetes-client/releases/tag/v4.7.1 No Existing unit and integration tests cover creation of K8S objects. Adjusted them to work with the new fabric8 version Closes apache#27443 from onursatici/os/bump-fabric8. Authored-by: Onur Satici <onursatici@gmail.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>

### What changes were proposed in this pull request? This PR aims to upgrade `kubernetes-client` library to bring the JDK8 related fixes. Please note that JDK11 works fine without any problem. - https://github.com/fabric8io/kubernetes-client/releases/tag/v4.9.2 - JDK8 always uses http/1.1 protocol (Prevent OkHttp from wrongly enabling http/2) ### Why are the changes needed? OkHttp "wrongly" detects the Platform as Jdk9Platform on JDK 8u251. - fabric8io/kubernetes-client#2212 - https://stackoverflow.com/questions/61565751/why-am-i-not-able-to-run-sparkpi-example-on-a-kubernetes-k8s-cluster Although there is a workaround `export HTTP2_DISABLE=true` and `Downgrade JDK or K8s`, we had better avoid this problematic situation. ### Does this PR introduce _any_ user-facing change? No. This will recover the failures on JDK 8u252. ### How was this patch tested? - [x] Pass the Jenkins UT (apache#28601 (comment)) - [x] Pass the Jenkins K8S IT with the K8s 1.13 (apache#28601 (comment)) - [x] Manual testing with K8s 1.17.3. (Below) **v1.17.6 result (on Minikube)** ``` KubernetesSuite: - Run SparkPi with no resources - Run SparkPi with a very long application name. - Use SparkLauncher.NO_RESOURCE - Run SparkPi with a master URL without a scheme. - Run SparkPi with an argument. - Run SparkPi with custom labels, annotations, and environment variables. - All pods have the same service account by default - Run extraJVMOptions check on driver - Run SparkRemoteFileTest using a remote data file - Run SparkPi with env and mount secrets. - Run PySpark on simple pi.py example - Run PySpark with Python2 to test a pyfiles example - Run PySpark with Python3 to test a pyfiles example - Run PySpark with memory customization - Run in client mode. - Start pod creation from template - PVs with local storage - Launcher client dependencies - Test basic decommissioning Run completed in 8 minutes, 27 seconds. Total number of tests run: 19 Suites: completed 2, aborted 0 Tests: succeeded 19, failed 0, canceled 0, ignored 0, pending 0 All tests passed. ``` Closes apache#28601 from dongjoon-hyun/SPARK-K8S-CLIENT. Authored-by: Dongjoon Hyun <dongjoon@apache.org> Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>

… coerce to nullable type ### What changes were proposed in this pull request? This PR targets for non-nullable null type not to coerce to nullable type in complex types. Non-nullable fields in struct, elements in an array and entries in map can mean empty array, struct and map. They are empty so it does not need to force the nullability when we find common types. This PR also reverts and supersedes apache@d7b97a1 ### Why are the changes needed? To make type coercion coherent and consistent. Currently, we correctly keep the nullability even between non-nullable fields: ```scala import org.apache.spark.sql.types._ import org.apache.spark.sql.functions._ spark.range(1).select(array(lit(1)).cast(ArrayType(IntegerType, false))).printSchema() spark.range(1).select(array(lit(1)).cast(ArrayType(DoubleType, false))).printSchema() ``` ```scala spark.range(1).selectExpr("concat(array(1), array(1)) as arr").printSchema() ``` ### Does this PR introduce any user-facing change? Yes. ```scala import org.apache.spark.sql.types._ import org.apache.spark.sql.functions._ spark.range(1).select(array().cast(ArrayType(IntegerType, false))).printSchema() ``` ```scala spark.range(1).selectExpr("concat(array(), array(1)) as arr").printSchema() ``` **Before:** ``` org.apache.spark.sql.AnalysisException: cannot resolve 'array()' due to data type mismatch: cannot cast array<null> to array<int>;; 'Project [cast(array() as array<int>) AS array()#68] +- Range (0, 1, step=1, splits=Some(12)) at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42) at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$$nestedInanonfun$checkAnalysis$1$2.applyOrElse(CheckAnalysis.scala:149) at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$$nestedInanonfun$checkAnalysis$1$2.applyOrElse(CheckAnalysis.scala:140) at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformUp$2(TreeNode.scala:333) at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:72) at org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:333) at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformUp$1(TreeNode.scala:330) at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$mapChildren$1(TreeNode.scala:399) at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:237) ``` ``` root |-- arr: array (nullable = false) | |-- element: integer (containsNull = true) ``` **After:** ``` root |-- array(): array (nullable = false) | |-- element: integer (containsNull = false) ``` ``` root |-- arr: array (nullable = false) | |-- element: integer (containsNull = false) ``` ### How was this patch tested? Unittests were added and manually tested. Closes apache#27991 from HyukjinKwon/SPARK-31227. Authored-by: HyukjinKwon <gurwls223@apache.org> Signed-off-by: Wenchen Fan <wenchen@databricks.com> (cherry picked from commit 3bd10ce) Signed-off-by: Wenchen Fan <wenchen@databricks.com>

onursatici and others added 2 commits August 19, 2020 13:30

akirillov requested a review from alembiewski August 19, 2020 20:37

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Kubernetes 1.17 #68

Kubernetes 1.17 #68

akirillov commented Aug 19, 2020

Kubernetes 1.17 #68

Are you sure you want to change the base?

Kubernetes 1.17 #68

Conversation

akirillov commented Aug 19, 2020

What changes were proposed in this pull request?

Why are the changes needed?

Does this PR introduce any user-facing change?

How was this patch tested?