SPARK-1324: SparkUI Should Not Bind to SPARK_PUBLIC_DNS #231

pwendell · 2014-03-25T23:43:03Z

/cc @aarondav and @andrewor14

AmplabJenkins · 2014-03-26T00:13:54Z

Merged build triggered.

AmplabJenkins · 2014-03-26T00:13:54Z

Merged build started.

CodingCat · 2014-03-26T00:16:20Z

Will this cause some problem in EC2, where the bindHost is usually the private IP?

if the JettyServer is started with the private IP address, the user cannot access through Internet....

(I didn't test)

pwendell · 2014-03-26T00:20:09Z

I don't think this would cause any problems. The user can override both the internal and external ip's using SPARK_LOCAL_IP and SPARK_PUBLIC_DNS. This patch makes sure the right value is used in the right circumstance.

AmplabJenkins · 2014-03-26T01:11:29Z

Merged build finished.

AmplabJenkins · 2014-03-26T01:11:29Z

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13449/

aarondav · 2014-03-26T22:09:19Z

LGTM. This is already how other UIs work, like WorkerWebUI, so it should not break anything.

pwendell · 2014-03-27T01:23:11Z

Merged, thanks

Bumping version numbers for 0.8.1 release This bumps various version numbers for the release. Note that we don't bump any of the pom.xml files because they get automatically updated as part of the maven release plug-ins.

@aarondav

/cc @aarondav and @andrewor14 Author: Patrick Wendell <pwendell@gmail.com> Closes apache#231 from pwendell/ui-binding and squashes the following commits: e8025f8 [Patrick Wendell] SPARK-1324: SparkUI Should Not Bind to SPARK_PUBLIC_DNS

…ss garbage ## What changes were proposed in this pull request? Commit markers are expensive for readers to encounter, and we keep them for 72 hours. However it should be safe to delete them after a shorter period of time; e.g. 1 hr, after which assume they are ok to delete. ## How was this patch tested? Unit tests. Author: Eric Liang <ekl@databricks.com> Closes apache#231 from ericl/es-2485.

* Allow spark driver find shuffle pods in specified namespace The conf property spark.kubernetes.shuffle.namespace is used to specify the namesapce of shuffle pods. In normal cases, only one "shuffle daemonset" is deployed and shared by all spark pods. The spark driver should be able to list and watch shuffle pods in the namespace specified by user. Note: by default, spark driver pod doesn't have authority to list and watch shuffle pods in another namespace. Some action is needed to grant it the authority. For example, below ABAC policy works. ``` {"apiVersion": "abac.authorization.kubernetes.io/v1beta1", "kind": "Policy", "spec": {"group": "system:serviceaccounts", "namespace": "SHUFFLE_NAMESPACE", "resource": "pods", "readonly": true}} ``` (cherry picked from commit a6291c6) * Bypass init-containers when possible (cherry picked from commit 08fe944) * Config for hard cpu limit on pods; default unlimited (cherry picked from commit 8b3248f) * Allow number of executor cores to have fractional values This commit tries to solve issue apache#359 by allowing the `spark.executor.cores` configuration key to take fractional values, e.g., 0.5 or 1.5. The value is used to specify the cpu request when creating the executor pods, which is allowed to be fractional by Kubernetes. When the value is passed to the executor process through the environment variable `SPARK_EXECUTOR_CORES`, the value is rounded up to the closest integer as required by the `CoarseGrainedExecutorBackend`. Signed-off-by: Yinan Li <ynli@google.com>(cherry picked from commit 6f6cfd6) * Python Bindings for launching PySpark Jobs from the JVM * Adding PySpark Submit functionality. Launching Python from JVM * Addressing scala idioms related to PR351 * Removing extends Logging which was necessary for LogInfo * Refactored code to leverage the ContainerLocalizedFileResolver * Modified Unit tests so that they would pass * Modified Unit Test input to pass Unit Tests * Setup working environent for integration tests for PySpark * Comment out Python thread logic until Jenkins has python in Python * Modifying PythonExec to pass on Jenkins * Modifying python exec * Added unit tests to ClientV2 and refactored to include pyspark submission resources * Modified unit test check * Scalastyle * PR 348 file conflicts * Refactored unit tests and styles * further scala stylzing and logic * Modified unit tests to be more specific towards Class in question * Removed space delimiting for methods * Submission client redesign to use a step-based builder pattern. This change overhauls the underlying architecture of the submission client, but it is intended to entirely preserve existing behavior of Spark applications. Therefore users will find this to be an invisible change. The philosophy behind this design is to reconsider the breakdown of the submission process. It operates off the abstraction of "submission steps", which are transformation functions that take the previous state of the driver and return the new state of the driver. The driver's state includes its Spark configurations and the Kubernetes resources that will be used to deploy it. Such a refactor moves away from a features-first API design, which considers different containers to serve a set of features. The previous design, for example, had a container files resolver API object that returned different resolutions of the dependencies added by the user. However, it was up to the main Client to know how to intelligently invoke all of those APIs. Therefore the API surface area of the file resolver became untenably large and it was not intuitive of how it was to be used or extended. This design changes the encapsulation layout; every module is now responsible for changing the driver specification directly. An orchestrator builds the correct chain of steps and hands it to the client, which then calls it verbatim. The main client then makes any final modifications that put the different pieces of the driver together, particularly to attach the driver container itself to the pod and to apply the Spark configuration as command-line arguments. * Don't add the init-container step if all URIs are local. * Python arguments patch + tests + docs * Revert "Python arguments patch + tests + docs" This reverts commit 4533df2. * Revert "Don't add the init-container step if all URIs are local." This reverts commit e103225. * Revert "Submission client redesign to use a step-based builder pattern." This reverts commit 5499f6d. * style changes * space for styling (cherry picked from commit befcf0a) Conflicts: README.md core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala * Submission client redesign to use a step-based builder pattern * Submission client redesign to use a step-based builder pattern. This change overhauls the underlying architecture of the submission client, but it is intended to entirely preserve existing behavior of Spark applications. Therefore users will find this to be an invisible change. The philosophy behind this design is to reconsider the breakdown of the submission process. It operates off the abstraction of "submission steps", which are transformation functions that take the previous state of the driver and return the new state of the driver. The driver's state includes its Spark configurations and the Kubernetes resources that will be used to deploy it. Such a refactor moves away from a features-first API design, which considers different containers to serve a set of features. The previous design, for example, had a container files resolver API object that returned different resolutions of the dependencies added by the user. However, it was up to the main Client to know how to intelligently invoke all of those APIs. Therefore the API surface area of the file resolver became untenably large and it was not intuitive of how it was to be used or extended. This design changes the encapsulation layout; every module is now responsible for changing the driver specification directly. An orchestrator builds the correct chain of steps and hands it to the client, which then calls it verbatim. The main client then makes any final modifications that put the different pieces of the driver together, particularly to attach the driver container itself to the pod and to apply the Spark configuration as command-line arguments. * Add a unit test for BaseSubmissionStep. * Add unit test for kubernetes credentials mounting. * Add unit test for InitContainerBootstrapStep. * unit tests for initContainer * Add a unit test for DependencyResolutionStep. * further modifications to InitContainer unit tests * Use of resolver in PythonStep and unit tests for PythonStep * refactoring of init unit tests and pythonstep resolver logic * Add unit test for KubernetesSubmissionStepsOrchestrator. * refactoring and addition of secret trustStore+Cert checks in a SubmissionStepSuite * added SparkPodInitContainerBootstrapSuite * Added InitContainerResourceStagingServerSecretPluginSuite * style in Unit tests * extremely minor style fix in variable naming * Address comments. * Rename class for consistency. * Attempt to make spacing consistent. Multi-line methods should have four-space indentation for arguments that aren't on the same line as the method call itself... but this is difficult to do consistently given how IDEs handle Scala multi-line indentation in most cases. (cherry picked from commit 0f4368f) Conflicts: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/kubernetes/KubernetesClusterSchedulerBackend.scala * Add implicit conversions to imports. Otherwise we can get a Scalastyle error when building from SBT. (cherry picked from commit 7deaaa3) * Fix import order and scalastyle Test with ./dev/scalastyle * fix submit job errors (cherry picked from commit 8751a9a) * Add node selectors for driver and executor pods (cherry picked from commit 6dbd32e) * Retry binding server to random port in the resource staging server test. * Retry binding server to random port in the resource staging server test. * Break if successful start * Start server in try block. * FIx scalastyle * More rigorous cleanup logic. Increment port numbers. * Move around more exception logic. * More exception refactoring. * Remove whitespace * Fix test * Rename variable * Scalastyle fix

* MapR [SPARK-164] Update Kafka version to 1.0.1-mapr in Spark Kafka Producer module

* try new JCE * fix envvar in marathon.json.mustache

Update K8S conformance test log secret

* MapR [SPARK-164] Update Kafka version to 1.0.1-mapr in Spark Kafka Producer module

…subquery reuse ### What changes were proposed in this pull request? This PR: 1. Fixes an issue in `ReuseExchange` rule that can result a `ReusedExchange` node pointing to an invalid exchange. This can happen due to the 2 separate traversals in `ReuseExchange` when the 2nd traversal modifies an exchange that has already been referenced (reused) in the 1st traversal. Consider the following query: ``` WITH t AS ( SELECT df1.id, df2.k FROM df1 JOIN df2 ON df1.k = df2.k WHERE df2.id < 2 ) SELECT * FROM t AS a JOIN t AS b ON a.id = b.id ``` Before this PR the plan of the query was (note the `<== this reuse node points to a non-existing node` marker): ``` == Physical Plan == *(7) SortMergeJoin [id#14L], [id#18L], Inner :- *(3) Sort [id#14L ASC NULLS FIRST], false, 0 : +- Exchange hashpartitioning(id#14L, 5), true, [id=#298] : +- *(2) Project [id#14L, k#17L] : +- *(2) BroadcastHashJoin [k#15L], [k#17L], Inner, BuildRight : :- *(2) Project [id#14L, k#15L] : : +- *(2) Filter isnotnull(id#14L) : : +- *(2) ColumnarToRow : : +- FileScan parquet default.df1[id#14L,k#15L] Batched: true, DataFilters: [isnotnull(id#14L)], Format: Parquet, Location: InMemoryFileIndex[file:/Users/petertoth/git/apache/spark/sql/core/spark-warehouse/org.apache.spar..., PartitionFilters: [isnotnull(k#15L), dynamicpruningexpression(k#15L IN dynamicpruning#26)], PushedFilters: [IsNotNull(id)], ReadSchema: struct<id:bigint> : : +- SubqueryBroadcast dynamicpruning#26, 0, [k#17L], [id=#289] : : +- ReusedExchange [k#17L], BroadcastExchange HashedRelationBroadcastMode(List(input[0, bigint, true])), [id=#179] : +- BroadcastExchange HashedRelationBroadcastMode(List(input[0, bigint, true])), [id=#179] : +- *(1) Project [k#17L] : +- *(1) Filter ((isnotnull(id#16L) AND (id#16L < 2)) AND isnotnull(k#17L)) : +- *(1) ColumnarToRow : +- FileScan parquet default.df2[id#16L,k#17L] Batched: true, DataFilters: [isnotnull(id#16L), (id#16L < 2), isnotnull(k#17L)], Format: Parquet, Location: InMemoryFileIndex[file:/Users/petertoth/git/apache/spark/sql/core/spark-warehouse/org.apache.spar..., PartitionFilters: [], PushedFilters: [IsNotNull(id), LessThan(id,2), IsNotNull(k)], ReadSchema: struct<id:bigint,k:bigint> +- *(6) Sort [id#18L ASC NULLS FIRST], false, 0 +- ReusedExchange [id#18L, k#21L], Exchange hashpartitioning(id#14L, 5), true, [id=#184] <== this reuse node points to a non-existing node ``` After this PR: ``` == Physical Plan == *(7) SortMergeJoin [id#14L], [id#18L], Inner :- *(3) Sort [id#14L ASC NULLS FIRST], false, 0 : +- Exchange hashpartitioning(id#14L, 5), true, [id=#231] : +- *(2) Project [id#14L, k#17L] : +- *(2) BroadcastHashJoin [k#15L], [k#17L], Inner, BuildRight : :- *(2) Project [id#14L, k#15L] : : +- *(2) Filter isnotnull(id#14L) : : +- *(2) ColumnarToRow : : +- FileScan parquet default.df1[id#14L,k#15L] Batched: true, DataFilters: [isnotnull(id#14L)], Format: Parquet, Location: InMemoryFileIndex[file:/Users/petertoth/git/apache/spark/sql/core/spark-warehouse/org.apache.spar..., PartitionFilters: [isnotnull(k#15L), dynamicpruningexpression(k#15L IN dynamicpruning#26)], PushedFilters: [IsNotNull(id)], ReadSchema: struct<id:bigint> : : +- SubqueryBroadcast dynamicpruning#26, 0, [k#17L], [id=#103] : : +- BroadcastExchange HashedRelationBroadcastMode(List(input[0, bigint, true])), [id=#102] : : +- *(1) Project [k#17L] : : +- *(1) Filter ((isnotnull(id#16L) AND (id#16L < 2)) AND isnotnull(k#17L)) : : +- *(1) ColumnarToRow : : +- FileScan parquet default.df2[id#16L,k#17L] Batched: true, DataFilters: [isnotnull(id#16L), (id#16L < 2), isnotnull(k#17L)], Format: Parquet, Location: InMemoryFileIndex[file:/Users/petertoth/git/apache/spark/sql/core/spark-warehouse/org.apache.spar..., PartitionFilters: [], PushedFilters: [IsNotNull(id), LessThan(id,2), IsNotNull(k)], ReadSchema: struct<id:bigint,k:bigint> : +- ReusedExchange [k#17L], BroadcastExchange HashedRelationBroadcastMode(List(input[0, bigint, true])), [id=#102] +- *(6) Sort [id#18L ASC NULLS FIRST], false, 0 +- ReusedExchange [id#18L, k#21L], Exchange hashpartitioning(id#14L, 5), true, [id=#231] ``` 2. Fixes an issue with separate consecutive `ReuseExchange` and `ReuseSubquery` rules that can result a `ReusedExchange` node pointing to an invalid exchange. This can happen due to the 2 separate rules when `ReuseSubquery` rule modifies an exchange that has already been referenced (reused) in `ReuseExchange` rule. Consider the following query: ``` WITH t AS ( SELECT df1.id, df2.k FROM df1 JOIN df2 ON df1.k = df2.k WHERE df2.id < 2 ), t2 AS ( SELECT * FROM t UNION SELECT * FROM t ) SELECT * FROM t2 AS a JOIN t2 AS b ON a.id = b.id ``` Before this PR the plan of the query was (note the `<== this reuse node points to a non-existing node` marker): ``` == Physical Plan == *(15) SortMergeJoin [id#46L], [id#58L], Inner :- *(7) Sort [id#46L ASC NULLS FIRST], false, 0 : +- Exchange hashpartitioning(id#46L, 5), true, [id=#979] : +- *(6) HashAggregate(keys=[id#46L, k#49L], functions=[]) : +- Exchange hashpartitioning(id#46L, k#49L, 5), true, [id=#975] : +- *(5) HashAggregate(keys=[id#46L, k#49L], functions=[]) : +- Union : :- *(2) Project [id#46L, k#49L] : : +- *(2) BroadcastHashJoin [k#47L], [k#49L], Inner, BuildRight : : :- *(2) Project [id#46L, k#47L] : : : +- *(2) Filter isnotnull(id#46L) : : : +- *(2) ColumnarToRow : : : +- FileScan parquet default.df1[id#46L,k#47L] Batched: true, DataFilters: [isnotnull(id#46L)], Format: Parquet, Location: InMemoryFileIndex[file:/Users/petertoth/git/apache/spark/sql/core/spark-warehouse/org.apache.spar..., PartitionFilters: [isnotnull(k#47L), dynamicpruningexpression(k#47L IN dynamicpruning#66)], PushedFilters: [IsNotNull(id)], ReadSchema: struct<id:bigint> : : : +- SubqueryBroadcast dynamicpruning#66, 0, [k#49L], [id=#926] : : : +- ReusedExchange [k#49L], BroadcastExchange HashedRelationBroadcastMode(List(input[0, bigint, true])), [id=#656] : : +- BroadcastExchange HashedRelationBroadcastMode(List(input[0, bigint, true])), [id=#656] : : +- *(1) Project [k#49L] : : +- *(1) Filter ((isnotnull(id#48L) AND (id#48L < 2)) AND isnotnull(k#49L)) : : +- *(1) ColumnarToRow : : +- FileScan parquet default.df2[id#48L,k#49L] Batched: true, DataFilters: [isnotnull(id#48L), (id#48L < 2), isnotnull(k#49L)], Format: Parquet, Location: InMemoryFileIndex[file:/Users/petertoth/git/apache/spark/sql/core/spark-warehouse/org.apache.spar..., PartitionFilters: [], PushedFilters: [IsNotNull(id), LessThan(id,2), IsNotNull(k)], ReadSchema: struct<id:bigint,k:bigint> : +- *(4) Project [id#46L, k#49L] : +- *(4) BroadcastHashJoin [k#47L], [k#49L], Inner, BuildRight : :- *(4) Project [id#46L, k#47L] : : +- *(4) Filter isnotnull(id#46L) : : +- *(4) ColumnarToRow : : +- FileScan parquet default.df1[id#46L,k#47L] Batched: true, DataFilters: [isnotnull(id#46L)], Format: Parquet, Location: InMemoryFileIndex[file:/Users/petertoth/git/apache/spark/sql/core/spark-warehouse/org.apache.spar..., PartitionFilters: [isnotnull(k#47L), dynamicpruningexpression(k#47L IN dynamicpruning#66)], PushedFilters: [IsNotNull(id)], ReadSchema: struct<id:bigint> : : +- ReusedSubquery SubqueryBroadcast dynamicpruning#66, 0, [k#49L], [id=#926] : +- ReusedExchange [k#49L], BroadcastExchange HashedRelationBroadcastMode(List(input[0, bigint, true])), [id=#656] +- *(14) Sort [id#58L ASC NULLS FIRST], false, 0 +- ReusedExchange [id#58L, k#61L], Exchange hashpartitioning(id#46L, 5), true, [id=#761] <== this reuse node points to a non-existing node ``` After this PR: ``` == Physical Plan == *(15) SortMergeJoin [id#46L], [id#58L], Inner :- *(7) Sort [id#46L ASC NULLS FIRST], false, 0 : +- Exchange hashpartitioning(id#46L, 5), true, [id=#793] : +- *(6) HashAggregate(keys=[id#46L, k#49L], functions=[]) : +- Exchange hashpartitioning(id#46L, k#49L, 5), true, [id=#789] : +- *(5) HashAggregate(keys=[id#46L, k#49L], functions=[]) : +- Union : :- *(2) Project [id#46L, k#49L] : : +- *(2) BroadcastHashJoin [k#47L], [k#49L], Inner, BuildRight : : :- *(2) Project [id#46L, k#47L] : : : +- *(2) Filter isnotnull(id#46L) : : : +- *(2) ColumnarToRow : : : +- FileScan parquet default.df1[id#46L,k#47L] Batched: true, DataFilters: [isnotnull(id#46L)], Format: Parquet, Location: InMemoryFileIndex[file:/Users/petertoth/git/apache/spark/sql/core/spark-warehouse/org.apache.spar..., PartitionFilters: [isnotnull(k#47L), dynamicpruningexpression(k#47L IN dynamicpruning#66)], PushedFilters: [IsNotNull(id)], ReadSchema: struct<id:bigint> : : : +- SubqueryBroadcast dynamicpruning#66, 0, [k#49L], [id=#485] : : : +- BroadcastExchange HashedRelationBroadcastMode(List(input[0, bigint, true])), [id=#484] : : : +- *(1) Project [k#49L] : : : +- *(1) Filter ((isnotnull(id#48L) AND (id#48L < 2)) AND isnotnull(k#49L)) : : : +- *(1) ColumnarToRow : : : +- FileScan parquet default.df2[id#48L,k#49L] Batched: true, DataFilters: [isnotnull(id#48L), (id#48L < 2), isnotnull(k#49L)], Format: Parquet, Location: InMemoryFileIndex[file:/Users/petertoth/git/apache/spark/sql/core/spark-warehouse/org.apache.spar..., PartitionFilters: [], PushedFilters: [IsNotNull(id), LessThan(id,2), IsNotNull(k)], ReadSchema: struct<id:bigint,k:bigint> : : +- ReusedExchange [k#49L], BroadcastExchange HashedRelationBroadcastMode(List(input[0, bigint, true])), [id=#484] : +- *(4) Project [id#46L, k#49L] : +- *(4) BroadcastHashJoin [k#47L], [k#49L], Inner, BuildRight : :- *(4) Project [id#46L, k#47L] : : +- *(4) Filter isnotnull(id#46L) : : +- *(4) ColumnarToRow : : +- FileScan parquet default.df1[id#46L,k#47L] Batched: true, DataFilters: [isnotnull(id#46L)], Format: Parquet, Location: InMemoryFileIndex[file:/Users/petertoth/git/apache/spark/sql/core/spark-warehouse/org.apache.spar..., PartitionFilters: [isnotnull(k#47L), dynamicpruningexpression(k#47L IN dynamicpruning#66)], PushedFilters: [IsNotNull(id)], ReadSchema: struct<id:bigint> : : +- ReusedSubquery SubqueryBroadcast dynamicpruning#66, 0, [k#49L], [id=#485] : +- ReusedExchange [k#49L], BroadcastExchange HashedRelationBroadcastMode(List(input[0, bigint, true])), [id=#484] +- *(14) Sort [id#58L ASC NULLS FIRST], false, 0 +- ReusedExchange [id#58L, k#61L], Exchange hashpartitioning(id#46L, 5), true, [id=#793] ``` (This example contains issue 1 as well.) 3. Improves the reuse of exchanges and subqueries by enabling reuse across the whole plan. This means that the new combined rule utilizes the reuse opportunities between parent and subqueries by traversing the whole plan. The traversal is started on the top level query only. 4. Due to the order of traversal this PR does while adding reuse nodes, the reuse nodes appear in parent queries if reuse is possible between different levels of queries (typical for DPP). This is not an issue from execution perspective, but this also means "forward references" in explain formatted output where parent queries come first. The changes I made to `ExplainUtils` are to handle these references properly. This PR fixes the above 3 issues by unifying the separate rules into a `ReuseExchangeAndSubquery` rule that does a 1 pass, whole-plan, bottom-up traversal. ### Why are the changes needed? Performance improvement. ### How was this patch tested? - New UTs in `ReuseExchangeAndSubquerySuite` to cover 1. and 2. - New UTs in `DynamicPartitionPruningSuite`, `SubquerySuite` and `ExchangeSuite` to cover 3. - New `ReuseMapSuite` to test `ReuseMap`. - Checked new golden files of `PlanStabilitySuite`s for invalid reuse references. - TPCDS benchmarks. Closes #28885 from peter-toth/SPARK-29375-SPARK-28940-whole-plan-reuse. Authored-by: Peter Toth <peter.toth@gmail.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>

…cript K8S-1077 (apache#598) * K8S-1077 - use single k8s secret with user info MapR [SPARK-651] Replacing joda-time-*.jar with joda-time-2.10.3.jar. MapR [SPARK-638] Wrong permissions when creating files under directory with GID bit set. MapR [SPARK-627] SparkHistoryServer-2.4 is getting 403 Unauthorized home page for users(spark.ui.view.acls) via spark-submit MapR [SPARK-639] Default headers are adding two times MapR [SPARK-629] Spark UI for job lose CSS styles MapR [MS-925] After upgrade to MEP 6.2 (Spark 2.4.0) can no longer consume Kafka / MapR Streams. MapR [SPARK-626] Update kafka dependencies for Spark 2.4.4.0 in release MEP-6.3.0 MapR [SPARK-340] Jetty web server version at Spark should be updated tp v9.4.X MapR [SPARK-617] an't use ssl via spark beeline MapR [SPARK-617] Can't use ssl via spark beeline MapR [SPARK-620] Replace core dependency in Spark-2.4.4 MapR [SPARK-621] Fix multiple XML configuration initialization for (apache#575) custom headers. Use X-XSS-Protection, X-Content-Type-Options Content-Security-Policy and Strict-Transport-Security configuration only in case: cluster security is enabled OR spark.ui.security.headers.enabled set to true. MapR [SPARK-595] Spark cannot access hs2 through zookeeper Revert "MapR [SPARK-595] Spark cannot access hs2 through zookeeper (apache#577)" MapR [SPARK-595] Spark cannot access hs2 through zookeeper MapR [SPARK-620] Replace core dependency in Spark-2.4. MapR [SPARK-619] Move absent commits from 2.4.3 branch to 2.4.4 (apache#574) * Adding SQL API to write to kafka from Spark (apache#567) * Branch 2.4.3 extended kafka and examples (apache#569) * The v2 API is in its own package - the v2 api is in a different package - the old functionality is available in a separated package * v2 API examples - All the examples are using the newest API. - I have removed the old examples since they are not relevant any more and the same functionality is shown in the new examples usin the new API. * MapR [SPARK-619] Move absent commits from 2.4.3 branch to 2.4.4 CORE-321. Add custom http header support for jetty. MapR [SPARK-609] Port Apache Spark-2.4.4 changes to the MapR Spark-2.4.4 branch Adding multi table loader (apache#560) * Adding multi table loader - This allows us to load multiple matching tables into one Union DataFrame. If we have the fallowing MFS structure: ``` /clients/client_1/data.table /clients/client_2/data.table ``` we can load a union dataframe by doing `loadFromMapRDB("/clients/*/*.table")` * Fixing the path to the reader MapR [SPARK-588] Spark thriftserver fails when work with hive-maprdb json table MapR [SPARK-598] Spark can't add needed properties to hive-site.xml MAPR-SPARK-596: Change HBase compatible version for Spark 2.4.3 MapR [SPARK-592] Add possibility to use start-thriftserver.sh script with 2304 port MapR [SPARK-584] MaprDB connector's setHintUsingIndex method doesn't work as expected MapR [SPARK-583] MaprDB connector's loadFromMaprDB function for Java API doesn't work as expected SPARK-579 info about ssl_trustore is added for metrics MapR [SPARK-552] Failed to get broadcast_11_piece0 of broadcast_11 SPARK-569 Generation of SSL ceritificates for spark UI MapR [SPARK-575] Warning messages in spark workspace after the second attempt to login to job's UI Update zookeeper version Adding `joinWithMapRDBTable` function (apache#529) The related documentation of this function is here https://github.com/anicolaspp/MapRDBConnector#joinwithmaprdbtable. The main idea is that having a dataframe (no matter how was it constructed) we can join it with a MapR-DB table. This functions looks at the join query and load only those records from MapR-DB that will join instead of loading the full table and then join in memory. In other words, we only load what we know will be joint. Adding DataSource Reader Support (apache#525) * Adding DataSource Reader Support * Update SparkSessionExt.scala * creating a package object * Update MapRDBSpark.scala * fully path to avoid name collition * refactorings MapR [SPARK-451] Spark hadoop/core dependency updates MapR [SPARK-566] Move absent commits from 2.4.0 branch MapR [SPARK-561] Spark 2.4.3 porting to MapR MapR [SPARK-561] Spark 2.4.3 porting to MapR MapR [SPARK-558] Render application UI init page if driver is not up MapR [SPARK-541] Avoid duplication of the first unexpired record MapR [COLD-150][K8S] Fix metrics copy MapR [K8S-893] Hide plain text password from logs MapR [SPARK-540] Include 'avro' artifacts MapR [SPARK-536] PySpark streaming package for kafka-0-10 added K8S-853: Enable spark metrics for external tenant MapR [SPARK-531] Remove duplicating entries from classpath in ClasspathFilter MapR [SPARK-516] Spark jobs failure using yarn mode on kerberos fixed MapR [SPARK-462] Spark and SparkHistoryServer allow week ciphers, which can allow man in the middle attack [SPARK-508] MapR-DB OJAI Connector for Spark isNull condition returns incorrect result MapR [SPARK-510] nonmapr "admin" users not able to view other user logs in SHS SPARK-460: Spark Metrics for CollectD Configuration for collecting Spark metrics SPARK-463 MAPR_MAVEN_REPO variable for specifying mapR repository MapR [SPARK-492] Spark 2.4.0.0 configure.sh has error messages MapR [SPARK-515][K8S] Remove configure.sh call for k8s MapR [SPARK-515] Move configuring spark-env.sh back to the private-pkg MapR [SPARK-515] Move configuring spark-env.sh back to the private-pkg MapR [SPARK-514] Recovery from checkpoint is broken MapR [SPARK-445] Messages loss fixed by reverting [MAPR-32290] changes from kafka09 package (apache#460) * MapR [SPARK-445] Revert "[MAPR-32290] Spark processing offsets when messages are already TTL in the first batch (apache#376)" This reverts commit e8d59b9. * MapR [SPARK-445] Revert "[MAPR-32290] Spark processing offsets when messages are already ttl in first batch (apache#368)" This reverts commit b282a8b. MapR [SPARK-445] Messages loss fixed by reverting [MAPR-32290] changes from kafka10 package MapR [SPARK-469] Fix NPE in generated classes by reverting "[SPARK-23466][SQL] Remove redundant null checks in generated Java code by GenerateUnsafeProjection" (apache#455) This reverts commit c5583fd. MapR [SPARK-482] Spark streaming app fails to start by UnknownTopicOrPartitionException with checkpoint MapR [SPARK-496] Spark HS UI doesn't work MapR [SPARK-416] CVE-2018-1320 vulnerability in Apache Thrift MapR [SPARK-486][K8S] Fix sasl encryption error on Kubernetes MapR [SPARK-481] Cannot run spark configure.sh on Client node MapR [K8S-637][K8S] Add configure.sh configuration in spark-defaults.conf for job runtime MapR [SPARK-465] Error messages after update of spark 2.4 MapR [SPARK-465] Error messages after update of spark 2.4 MapR [SPARK-464] Can't submit spark 2.4 jobs from mapr-client [SPARK-466] SparkR errors fixed MapR [SPARK-456] Spark shell can't be started SPARK-417 impersonation fixes for spark executor. Impersonation is mo… (apache#433) * SPARK-417 impersonation fixes for spark executor. Impersonation is moved from HadoopRDD.compute() method to org.apache.spark.executor.Executor.run() method * SPARK-363 Hive version changed to '1.2.0-mapr-spark-MEP-6.0.0' [SPARK-449] Kafka offset commit issue fixed MapR [SPARK-287] Move logic of creating /apps/spark folder from installer's scripts to the configure.sh MapR [SPARK-221] Investigate possibility to move creating of the spark-env.sh from private-pkg to configure.sh MapR [SPARK-430] PID files should be under /opt/mapr/pid MapR [SPARK-446] Spark configure.sh doesn't start/stop Spark services MapR [SPARK-434] Move absent commits from 2.3.2 branch (apache#425) * MapR [SPARK-352] Spark shell fails with "NoClassDefFoundError: org/apache/hadoop/fs/FSDataInputStream" if java is not available in PATH * MapR [SPARK-350] Deprecate Spark Kafka-09 package * MapR [SPARK-326] Investigate possibility of writing Java example for the MapRDB OJAI connector * [SPARK-356] Merge mapr changes from kafka-09 package into the kafka-10 * SPARK-319 Fix for sparkR version check * MapR [SPARK-349] Update OJAI client to v3 for Spark MapR-DB JSON connector * MapR [SPARK-367] Move absent commits from 2.3.1 branch * MapR [SPARK-137] Analyze the warning during compilation of OJAI connector * MapR [SPARK-369] Spark 2.3.2 fails with error related to zookeeper * [MAPR-26258] hbasecontext.HBaseDistributedScanExample fails * [SPARK-24355] Spark external shuffle server improvement to better handle block fetch requests * MapR [SPARK-374] Spark Hive example fails when we submit job from another(simple) cluster user * MapR [SPARK-434] Move absent commits from 2.3.2 branch * MapR [SPARK-434] Move absent commits from 2.3.2 branch * MapR [SPARK-373] Unexpected behavior during job running in standalone cluster mode * MapR [SPARK-419] Update hive-maprdb-json-handler jar for spark 2.3.2.0 and spark 2.2.1 * MapR [SPARK-396] Interface change of sendToKafka * MapR [SPARK-357] consumer groups are prepeneded with a "service_" prefix * MapR [SPARK-429] Changes in maprdb connector are the cause of broken backward compatibility * MapR [SPARK-427] Update kafka in Spark-2.4.0 to the 1.1.1-mapr * MapR [SPARK-434] Move absent commits from 2.3.2 branch * Move absent commits from 2.3.2 branch * MapR [SPARK-434] Move absent commits from 2.3.2 branch * Move absent commits from 2.3.2 branch * Move absent commits from 2.3.2 branch MapR [SPARK-427] Update kafka in Spark-2.4.0 to the 1.1.1-mapr MapR [SPARK-379] Spark 2.4 4-gidit version MapR [PIC-48][K8S] Port k8s changes to 2.4.0 [PIC-48] Create user for k8s driver and executor if required [PIC-48] Create user for k8s driver and executor if required Revert "Remove spark.ui.filters property" This reverts commit d8941ba36c3451cdce15d18d6c1a52991de3b971. [SPARK-351] Copy kubernetes start scripts anyway PIC-34: Rename default configmap name to be consistent with mapr-kubernetes [SPARK-23668][K8S] Add config option for passing through k8s Pod.spec.imagePullSecrets (apache#355) Pass through the `imagePullSecrets` option to the k8s pod in order to allow user to access private image registries. See https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/ Unit tests + manual testing. Manual testing procedure: 1. Have private image registry. 2. Spark-submit application with no `spark.kubernetes.imagePullSecret` set. Do `kubectl describe pod ...`. See the error message: ``` Error syncing pod, skipping: failed to "StartContainer" for "spark-kubernetes-driver" with ErrImagePull: "rpc error: code = 2 desc = Error: Status 400 trying to pull repository ...: \"{\\n \\\"errors\\\" : [ {\\n \\\"status\\\" : 400,\\n \\\"message\\\" : \\\"Unsupported docker v1 repository request for '...'\\\"\\n } ]\\n}\"" ``` 3. Create secret `kubectl create secret docker-registry ...` 4. Spark-submit with `spark.kubernetes.imagePullSecret` set to the new secret. See that deployment was successful. Author: Andrew Korzhuev <andrew.korzhuev@klarna.com> Author: Andrew Korzhuev <korzhuev@andrusha.me> Closes apache#20811 from andrusha/spark-23668-image-pull-secrets. [SPARK-321] Change default value of spark.mapr.ssl.secret.prefix property [PIC-32] Spark on k8s with MapR secure cluster Update entrypoint.sh with correct spark version (apache#340) This PR has minor fix to correct the spark version string [SPARK-274] Create home directory for user who submitted job [MAPR-SPARK-230] Implement security for Spark on Kubernetes Run Spark job with specify the username for driver and executor Read cluster configs from configMap Run configure.sh script form entrypoint.sh Remove spark.kubernetes.driver.pod.commands property Add Spark properties for executor and driver environment variable MapR [SPARK-296] Structured Streaming memory leak Revert "[MAPR-SPARK-210] Rename sprk-defaults.conf to spark-defaults.conf.tem…" (apache#252) * Revert "[MAPR-SPARK-176] Fix Spark Project Catalyst unit tests (apache#251)" This reverts commit 5de05075cd14abf8ac65046a57a5d76617818fbe. * Revert "[MAPR-SPARK-210] Rename sprk-defaults.conf to spark-defaults.conf.template (apache#249)" This reverts commit 1baa677d727e89db7c605ffbae9a9eba00337ad0. [MAPR-SPARK-210] Rename sprk-defaults.conf to spark-defaults.conf.template MapR [SPARK-379] Port Spark to 2.4.0 MapR [SPARK-341] Spark 2.3.2 porting [MAPR-32290] Spark processing offsets when messages are already TTL in the first batch * Bug 32263 - Seek called on unsubscribed partitions [MSPARK-331] Remove snapshot versions of mapr dependencies from Spark-2.3.1 [MAPR-32290] Spark processing offsets when messages are already ttl in first batch MapR [SPARK-325] Add examples for work with the MapRDB JSON connector into the Spark project [ATS-449] Unit test for EBF 32013 created. MAPR-SPARK-311: Spark beeline uses default ssl truststore instead of mapr ssl truststore Bug 32355 - Executor tab empty on Spark UI [SPARK-318] Submitting Spark jobs from Oozie fails due to ClassNotFoundException Bug 32014 - Spark Consumer fails with java.lang.AssertionError Revert "[SPARK-306] Kafka clients 1.0.1 present in jars directory for Spark 2.3.1" (apache#341) * Revert "[SPARK-306] Kafka clients 1.0.1 present in jars directory for Spark 2.3.1 (apache#335)" This reverts commit 832411e. Bug 32014 - Spark Consumer fails with java.lang.AssertionError (apache#326) (apache#336) * MapR [32014] Spark Consumer fails with java.lang.AssertionError [SPARK-306] Kafka clients 1.0.1 present in jars directory for Spark 2.3.1 DEVOPS-2768 temporarily removed curl for file downloading [SPARK-302] Local privilege escalation MapR [SPARK-297] Added unit test for empty value conversion MapR [SPARK-297] Empty values are loaded as non-null MapR [SPARK-296] Structured Streaming memory leak 2.3.1 spark 289 (apache#318) * MapR [SPARK-289] Fix unit test for Spark-2.3.1 [SPARK-130] MapRDB connector - NPE while saving Pair RDD with 'null' values MapR [SPARK-283] Unit tests fail during initialization SSL properties. [SPARK-212] SparkHiveExample fails when we run it twice MapR [SPARK-282] Remove maprfs and hadoop jars from mapr spark package MapR [SPARK-278] Spark submit fails for jobs with python MapR [SPARK-279] Can't connect to spark thrift server with new spark and hive packages MapR [SPARK-276] Update zookeeper dependency to v.3.4.11 for spark 2.3.1 MapR [SPARK-272] Use only client passwords from ssl-client.xml MapR [SPARK-266] Spark jobs can't finish correctly, when there is an error during job running MapR [SPARK-263] Add possibility to use keyPassword which is different from keyStorePassword [MSPARK-31632] RM UI showing broken page for Spark jobs MapR [SPARK-261] Use mapr-security-web for getting passwords. MapR [SPARK-259] Spark application doesn't finish correctly MapR [SPARK-268] Update Spark version for Warden change project version to 2.3.1-mapr-SNAPSHOT MapR [SPARK-256] Spark doesn't work on yarn mode MapR [SPARK-255] Installer fresh install 610/600 secure fails to start "mapr-spark-thriftserver", "mapr-spark-historyserver" Mapr [SPARK-248] MapRDBTableScanRDD fails to convert to Scala Dataframe when using where clause MapR [SPARK-225] Hadoop credentials provider usage for hiding passwords at spark-defaults MapR [SPARK-214] Hive-2.1 poperties can't be read from a hive-site.xml as Spark uses Hive-1.2 MapR [SPARK-216] Spark thriftserver fails when work with hive-maprdb json table SPARK-244 (apache#278) Provide ability to use MapR-Negotiation authentication for Spark HistoryServer MapR [SPARK-226] Spark - pySpark Security Vulnerability MapR [SPARK-220] SparkR fails with UDF functions bug fixed MapR [SPARK-227] KafkaUtils.createDirectStream fails with kafka-09 MapR [SPARK-183] Spark Integration for Kafka 0.10 unit tests disabled MapR [SPARK-182] Spark Project External Kafka Producer v09 unit tests fixed MapR [SPARK-179] Spark Integration for Kafka 0.9 unit tests fixed MapR [SPARK-181] Kafka 0.10 Structured Streaming unit tests fixed [MSPARK-31305] Spark History server NOT loading applications submitted by users other than 'mapr' MapR [SPARK-175] Fix Spark Project Streaming unit tests [MAPR-SPARK-176] Fix Spark Project Catalyst unit tests [MAPR-SPARK-178] Fix Spark Project Hive unit tests MapR [SPARK-174] Spark Core unit tests fixed Changed version for spark-kafka connector. MapR [SPARK-202] Update MapR Spark to 2.3.0 Fixed compile time errors in tests Change project version [SPARK-198] Update hadoop dependency version to 2.7.0-mapr-1803 for Spark 2.2.1 MapR [SPARK-188] Couldn't connect to thrift server via spark beeline on kerberos cluster MapR [SPARK-143] Spark History Server does not require login for secured-by-default clusters MapR [SPARK-186] Update OJAI versions to the latest for Spark-2.2.1 OJAI Connector MapR [SPARK-191] Incorrect work of MapR-DB Sink 'complete' and 'update' modes fixed MapR [SPARK-170] StackOverflowException in equals method in DBMapValue 2.2.1 build fixed (apache#231) * MapR [SPARK-164] Update Kafka version to 1.0.1-mapr in Spark Kafka Producer module MapR [SPARK-161] Include Kafka Structured streaming jar to Spark package. MapR [SPARK-155] Change Spark Master port from 8080 MapR [SPARK-153] Exception in spark job with configured labels on yarn-client mode MapR [SPARK-152] Incorrect date string parsing fixed MapR [SPARK-21] Structured Streaming MapR-DB Sink created MapR [SPARK-135] Spark 2.2 with MapR Streams ( Kafka 1.0) (apache#218) * MapR [SPARK-135] Spark 2.2 with MapR Streams (Kafka 1.0) Added functionality of MapR-Streams specific EOF handling. MapR [SPARK-143] Spark History Server does not require login for secured-by-default clusters Disable build failing if scalastyle checking is fall. MapR [SPARK-16] Change Spark version in Warden files and configure.sh MapR [SPARK-144] Add insertToMapRDB method for rdd for Java API [MAPR-30536] Spark SQL queries on Map column fails after upgrade MapR [SPARK-139] Remove "update" related APIs from connector MapR [SPARK-140] Change the option name "tableName" to "tablePath" in the Spark/MapR-DB connectors. MapR [SPARK-121] Spark OJAI JAVA: update functionality removed MapR [SPARK-118] Spark OJAI Python: missed DataFrame import while moving imports in order to fix MapR [ZEP-101] interpreter issue MapR [SPARK-118] Spark OJAI Python: move MapR DB Connector class importing in order to fix MapR [ZEP-101] interpreter issue MapR [SPARK-117] Spark OJAI Python: Save functionality implementation MapR [SPARK-131] Exception when try to save JSON table with Binary _id field Spark OJAI JAVA: load to RDD, save from RDD implementation (apache#195) * MapR [SPARK-124] Loading to JavaRDD implemented * MapR [SPARK-124] MapRDBJavaSparkContext constructor changed * MapR [SPARK-124] implemented RDD[Row] saving MapR [SPARK-118] Spark OJAI Python: Read implementation MapR [SPARK-128] MapRDB connector - wrong handle of null fields when nullable is false * MapR [SPARK-121] Spark OJAI JAVA: Read to Dataset functionality implementation * Minor refactoring MapR [SPARK-125] Default value of idFieldPath parameter is not handle MapR [SPARK-113] Hit java.lang.UnsupportedOperationException: empty.reduceLeft during loadFromMapRDB Spark Mapr-DB connector was refactored according to Scala style Removed code duplication [MSPARK-107]idField information is lost in MapRDBDataFrameWriterFunctions.saveToMapRDB configure.sh takes options to change ports Kafka client excluded from package because correct version is located in "mapr classpath" Changed Kafka version in Kafka producer module. Branch spark 69 (apache#170) * Fixing the wrong type casting of TimeStamp to OTimeStamp when read from spark dataFrame. * SPARK-69: Problem with license when we try to read from json and write to maprdb remove creatin /usr/local/spark link from configure.sh. This link will be creates by private-pkg remove include-maprdb from default profiles added profiles in maprdb pom file instead of two pom files Fixed maprdb connector dependencies. Fixing the wrong type casting of TimeStamp to OTimeStamp when read from spark dataFrame. changed port for spark-thriftserver as it conflicts with hive server changed port for spark-thriftserver as it conflicts with hive server remove .not_configured_yet file after success Ojai connector fixed required java version [MSPARK-45] Move Spark-OJAI connector code to Spark github repo (apache#132) * SPARK-45 Move Spark-OJAI connector code to Spark github repo * Fixing pom versions for maprdb spark connector. * Changes made to the connector code to be compatible with 5.2.* and 6.0 clients. Spark 2.1.0 mapr 29106 (apache#150) * [SPARK-20922][CORE] Add whitelist of classes that can be deserialized by the launcher. Blindly deserializing classes using Java serialization opens the code up to issues in other libraries, since just deserializing data from a stream may end up execution code (think readObject()). Since the launcher protocol is pretty self-contained, there's just a handful of classes it legitimately needs to deserialize, and they're in just two packages, so add a filter that throws errors if classes from any other package show up in the stream. This also maintains backwards compatibility (the updated launcher code can still communicate with the backend code in older Spark releases). Tested with new and existing unit tests. Author: Marcelo Vanzin <vanzin@cloudera.com> Closes apache#18166 from vanzin/SPARK-20922. (cherry picked from commit 8efc6e9) Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com> (cherry picked from commit 772a9b9) * [SPARK-20922][CORE][HOTFIX] Don't use Java 8 lambdas in older branches. Author: Marcelo Vanzin <vanzin@cloudera.com> Closes apache#18178 from vanzin/SPARK-20922-hotfix. Added security by default for historyserver use waitForConsumerAssignment() instead of consumer.poll(0) for spark-29052 change MAPR_HADOOP_CLASSPATH in configure.sh for creating it by mapr-classpath.sh change MAPR_HADOOP_CLASSPATH in configure.sh for creating it by mapr-classpath.sh changes for mapr-classpath.sh changes for mapr-classpath.sh configure.sh changes [SPARK-39] Classpath filter was added Fixed impersonation when data read from MapR-DB via Spark-Hive. added configure.sh and warden.spark-thriftserver.conf hive-hbase-handler added to Spark jars Fixed "Single message comes late" 28339 bug fixed Spark streaming skipped message with zero offset from Kafka 0.9 [MSPARK-9] Initial fix for Spark unit tests Bump dependencies after ECO-1703 release [SPARK-33] Streaming example fixed [MAPR-26060] Fixed case when mapr-streams make gaps in offsets ported features from kafka 10 to kafka 9 [MAPR-26289][SPARK-2.1] Streaming general improvements (apache#93) * Added include-kafka-09 profile to Assembly * Set default poll timeout to 120s Set default HBase verison to 1.1.8 Changes from Kafka10 package were ported to Kafka09 package. [MAPR-26053] Include MapR Classes to the default value of spark.sql.hive.metastore.sharedPrefixes [MAPR-25807] Spark-Warehouse path computes incorrectly Add MapR-SASL support for Thrift Server Adding scala library. [MAPR-25713] Spark might try to load MapR Class Loader multiple times and fail [MAPR-25311] Bump Spark dependencies after ECO-1611 release [MINOR] Fix spark-jars.sh script [MAPR-24603] Could not launch beeline shell after starting spark thrift server fixed syntax error in V09DirectKafkaWordCount example Spark 2.0.1 MAPR-streams Python API [MAPR-24415] SPARK_JAVA_OPTS is deprecated Kafka streaming producer added. Minor fix for previous commit Added script for MAPR-24374 Some minor changes to spark-defaults.conf Changed default HBase version to 1.1.1 in compatibility.version Streaming example was refactored [MAPR-24470] HiveFromSpark test fails in yarn-cluster mode Added MapR Repo [MAPR-22940] Failed to connect spark beeline (after spark thrift server is started) on Kerberos cluster [MAPR-18865] Unable to submit spark apps from Windows client Skip maven clean task on the parent module New: Issue with running Hive commands in Spark This is fixed in SPARK-7819 Isolated Hive Client Loader appears to cause Native Library libMapRClient.4.0.2-mapr.so already loaded in another classloader error Spark warden.services.conf should have dependency on cldb Remove DFS shuffle settings. These settings are not used right now. Copy every file in the conf directory into the distribution package. Create spark-defaults.conf for MapR Settings to enable DFS shuffle on MapR. Support hbase classpath computation in util script. Adding external conf and scripts. Enable SPARK_HIVE mode while building. This is needed to bundle datanucleus jars needed for hive table creation. Build Spark on MapR. - make-distribution.sh takes an environment variable to enable profiles - MVN_PROFILE_ARG - Added warden conf files under ext-conf. - Updated pom.xml to use right set of jars and version. Spark Master failed to start in HA mode Updated Apache Curator version Added spark streaming integration with kafka 0.9 and mapr-streams Added MapR Repo

SPARK-1324: SparkUI Should Not Bind to SPARK_PUBLIC_DNS

e8025f8

asfgit closed this in be6d96c Mar 27, 2014

lins05 pushed a commit to lins05/spark that referenced this pull request Apr 23, 2017

Stop executors cleanly before deleting their pods (apache#231)

c6a5c6e

erikerlandson pushed a commit to erikerlandson/spark that referenced this pull request Jul 28, 2017

Stop executors cleanly before deleting their pods (apache#231)

d0e27b1

jamesrgrinter pushed a commit to jamesrgrinter/spark that referenced this pull request Apr 22, 2018

2.2.1 build fixed (apache#231)

77ed36f

* MapR [SPARK-164] Update Kafka version to 1.0.1-mapr in Spark Kafka Producer module

Igosuki pushed a commit to Adikteev/spark that referenced this pull request Jul 31, 2018

[SPARK-578] use 8u152 JCE (apache#231)

4f418c0

* try new JCE * fix envvar in marathon.json.mustache

bzhaoopenstack pushed a commit to bzhaoopenstack/spark that referenced this pull request Sep 11, 2019

Merge pull request apache#231 from theopenlab/update-gcp-sceret

5185b5e

Update K8S conformance test log secret

AngersZhuuuu mentioned this pull request May 10, 2020

[SPARK-31670][SQL] Trim unnecessary Struct field alias in Aggregate/GroupingSets #28490

Closed

wangyum mentioned this pull request Jun 7, 2020

[WIP][SPARK-31919][SQL] Push down more predicates through Join #28741

Closed

peter-toth mentioned this pull request Jul 17, 2020

[SPARK-29375][SPARK-28940][SPARK-32041][SQL] Whole plan exchange and subquery reuse #28885

Closed

arjunshroff pushed a commit to arjunshroff/spark that referenced this pull request Nov 24, 2020

2.2.1 build fixed (apache#231)

5ee588f

* MapR [SPARK-164] Update Kafka version to 1.0.1-mapr in Spark Kafka Producer module

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

SPARK-1324: SparkUI Should Not Bind to SPARK_PUBLIC_DNS #231

SPARK-1324: SparkUI Should Not Bind to SPARK_PUBLIC_DNS #231

pwendell commented Mar 25, 2014

AmplabJenkins commented Mar 26, 2014

AmplabJenkins commented Mar 26, 2014

CodingCat commented Mar 26, 2014

pwendell commented Mar 26, 2014

AmplabJenkins commented Mar 26, 2014

AmplabJenkins commented Mar 26, 2014

aarondav commented Mar 26, 2014

pwendell commented Mar 27, 2014

SPARK-1324: SparkUI Should Not Bind to SPARK_PUBLIC_DNS #231

SPARK-1324: SparkUI Should Not Bind to SPARK_PUBLIC_DNS #231

Conversation

pwendell commented Mar 25, 2014

AmplabJenkins commented Mar 26, 2014

AmplabJenkins commented Mar 26, 2014

CodingCat commented Mar 26, 2014

pwendell commented Mar 26, 2014

AmplabJenkins commented Mar 26, 2014

AmplabJenkins commented Mar 26, 2014

aarondav commented Mar 26, 2014

pwendell commented Mar 27, 2014