Merge branch 'master' of github.com:apache/spark into SPARK-23866
mgaido91 committed Jul 20, 2019
2 parents 146aa32 + 36d7d81 commit 25533a0
Showing 3,328 changed files with 267,573 additions and 75,769 deletions.
The diff you're trying to view is too large. We only load the first 3000 changed files.
2 changes: 1 addition & 1 deletion .github/PULL_REQUEST_TEMPLATE
@@ -7,4 +7,4 @@
(Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests)
(If this patch involves UI changes, please attach a screenshot; otherwise, remove this)

Please review http://spark.apache.org/contributing.html before opening a pull request.
Please review https://spark.apache.org/contributing.html before opening a pull request.
4 changes: 3 additions & 1 deletion .gitignore
@@ -77,7 +77,6 @@ target/
unit-tests.log
work/
docs/.jekyll-metadata
*.crc

# For Hive
TempStatsStore/
@@ -95,3 +94,6 @@ spark-warehouse/
*.Rproj.*

.Rproj.user

# For SBT
.jvmopts
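
`.jvmopts` is the file from which sbt reads per-project JVM options, which is why it is now ignored; a sketch of typical contents (these flags are illustrative assumptions, not part of this commit):

```
-Xmx4g
-XX:ReservedCodeCacheSize=512m
```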
4 changes: 2 additions & 2 deletions CONTRIBUTING.md
@@ -1,12 +1,12 @@
## Contributing to Spark

*Before opening a pull request*, review the
[Contributing to Spark guide](http://spark.apache.org/contributing.html).
[Contributing to Spark guide](https://spark.apache.org/contributing.html).
It lists steps that are required before creating a PR. In particular, consider:

- Is the change important and ready enough to ask the community to spend time reviewing?
- Have you searched for existing, related JIRAs and pull requests?
- Is this a new feature that can stand alone as a [third party project](http://spark.apache.org/third-party-projects.html) ?
- Is this a new feature that can stand alone as a [third party project](https://spark.apache.org/third-party-projects.html) ?
- Is the change being proposed clearly explained and motivated?

When you contribute code, you affirm that the contribution is your original work and that you
4 changes: 2 additions & 2 deletions LICENSE
@@ -222,7 +222,7 @@ Python Software Foundation License
----------------------------------

pyspark/heapq3.py

python/docs/_static/copybutton.js

BSD 3-Clause
------------
@@ -258,4 +258,4 @@ data/mllib/images/kittens/29.5.a_b_EGDP022204.jpg
data/mllib/images/kittens/54893.jpg
data/mllib/images/kittens/DP153539.jpg
data/mllib/images/kittens/DP802813.jpg
data/mllib/images/multi-channel/chr30.4.184.jpg
data/mllib/images/multi-channel/chr30.4.184.jpg
63 changes: 34 additions & 29 deletions LICENSE-binary
@@ -209,34 +209,34 @@ org.apache.zookeeper:zookeeper
oro:oro
commons-configuration:commons-configuration
commons-digester:commons-digester
com.chuusai:shapeless_2.11
com.chuusai:shapeless_2.12
com.googlecode.javaewah:JavaEWAH
com.twitter:chill-java
com.twitter:chill_2.11
com.twitter:chill_2.12
com.univocity:univocity-parsers
javax.jdo:jdo-api
joda-time:joda-time
net.sf.opencsv:opencsv
org.apache.derby:derby
org.objenesis:objenesis
org.roaringbitmap:RoaringBitmap
org.scalanlp:breeze-macros_2.11
org.scalanlp:breeze_2.11
org.typelevel:macro-compat_2.11
org.scalanlp:breeze-macros_2.12
org.scalanlp:breeze_2.12
org.typelevel:macro-compat_2.12
org.yaml:snakeyaml
org.apache.xbean:xbean-asm5-shaded
com.squareup.okhttp3:logging-interceptor
com.squareup.okhttp3:okhttp
com.squareup.okio:okio
org.apache.spark:spark-catalyst_2.11
org.apache.spark:spark-kvstore_2.11
org.apache.spark:spark-launcher_2.11
org.apache.spark:spark-mllib-local_2.11
org.apache.spark:spark-network-common_2.11
org.apache.spark:spark-network-shuffle_2.11
org.apache.spark:spark-sketch_2.11
org.apache.spark:spark-tags_2.11
org.apache.spark:spark-unsafe_2.11
org.apache.spark:spark-catalyst_2.12
org.apache.spark:spark-kvstore_2.12
org.apache.spark:spark-launcher_2.12
org.apache.spark:spark-mllib-local_2.12
org.apache.spark:spark-network-common_2.12
org.apache.spark:spark-network-shuffle_2.12
org.apache.spark:spark-sketch_2.12
org.apache.spark:spark-tags_2.12
org.apache.spark:spark-unsafe_2.12
commons-httpclient:commons-httpclient
com.vlkan:flatbuffers
com.ning:compress-lzf
@@ -260,9 +260,6 @@ net.sf.supercsv:super-csv
org.apache.arrow:arrow-format
org.apache.arrow:arrow-memory
org.apache.arrow:arrow-vector
org.apache.calcite:calcite-avatica
org.apache.calcite:calcite-core
org.apache.calcite:calcite-linq4j
org.apache.commons:commons-crypto
org.apache.commons:commons-lang3
org.apache.hadoop:hadoop-annotations
@@ -287,25 +284,24 @@ org.apache.orc:orc-mapreduce
org.mortbay.jetty:jetty
org.mortbay.jetty:jetty-util
com.jolbox:bonecp
org.json4s:json4s-ast_2.11
org.json4s:json4s-core_2.11
org.json4s:json4s-jackson_2.11
org.json4s:json4s-scalap_2.11
org.json4s:json4s-ast_2.12
org.json4s:json4s-core_2.12
org.json4s:json4s-jackson_2.12
org.json4s:json4s-scalap_2.12
com.carrotsearch:hppc
com.fasterxml.jackson.core:jackson-annotations
com.fasterxml.jackson.core:jackson-core
com.fasterxml.jackson.core:jackson-databind
com.fasterxml.jackson.dataformat:jackson-dataformat-yaml
com.fasterxml.jackson.module:jackson-module-jaxb-annotations
com.fasterxml.jackson.module:jackson-module-paranamer
com.fasterxml.jackson.module:jackson-module-scala_2.11
com.fasterxml.jackson.module:jackson-module-scala_2.12
com.github.mifmif:generex
com.google.code.findbugs:jsr305
com.google.code.gson:gson
com.google.inject:guice
com.google.inject.extensions:guice-servlet
com.twitter:parquet-hadoop-bundle
commons-beanutils:commons-beanutils-core
commons-cli:commons-cli
commons-dbcp:commons-dbcp
commons-io:commons-io
@@ -372,6 +368,8 @@ org.eclipse.jetty:jetty-servlets
org.eclipse.jetty:jetty-util
org.eclipse.jetty:jetty-webapp
org.eclipse.jetty:jetty-xml
org.scala-lang.modules:scala-xml_2.12
org.opencypher:okapi-shade

core/src/main/java/org/apache/spark/util/collection/TimSort.java
core/src/main/resources/org/apache/spark/ui/static/bootstrap*
@@ -415,8 +413,7 @@ com.thoughtworks.paranamer:paranamer
org.scala-lang:scala-compiler
org.scala-lang:scala-library
org.scala-lang:scala-reflect
org.scala-lang.modules:scala-parser-combinators_2.11
org.scala-lang.modules:scala-xml_2.11
org.scala-lang.modules:scala-parser-combinators_2.12
org.fusesource.leveldbjni:leveldbjni-all
net.sourceforge.f2j:arpack_combined_all
xmlenc:xmlenc
@@ -437,15 +434,15 @@ is distributed under the 3-Clause BSD license.
MIT License
-----------

org.spire-math:spire-macros_2.11
org.spire-math:spire_2.11
org.typelevel:machinist_2.11
org.spire-math:spire-macros_2.12
org.spire-math:spire_2.12
org.typelevel:machinist_2.12
net.razorvine:pyrolite
org.slf4j:jcl-over-slf4j
org.slf4j:jul-to-slf4j
org.slf4j:slf4j-api
org.slf4j:slf4j-log4j12
com.github.scopt:scopt_2.11
com.github.scopt:scopt_2.12

core/src/main/resources/org/apache/spark/ui/static/dagre-d3.min.js
core/src/main/resources/org/apache/spark/ui/static/*dataTables*
@@ -487,6 +484,14 @@ org.glassfish.jersey.core:jersey-server
org.glassfish.jersey.media:jersey-media-jaxb


Eclipse Distribution License (EDL) 1.0
--------------------------------------

org.glassfish.jaxb:jaxb-runtime
jakarta.xml.bind:jakarta.xml.bind-api
com.sun.istack:istack-commons-runtime


Mozilla Public License (MPL) 1.1
--------------------------------

24 changes: 15 additions & 9 deletions NOTICE-binary
@@ -792,15 +792,6 @@ Copyright 2005-2006 The Apache Software Foundation
Apache Jakarta HttpClient
Copyright 1999-2007 The Apache Software Foundation

Calcite Avatica
Copyright 2012-2015 The Apache Software Foundation

Calcite Core
Copyright 2012-2015 The Apache Software Foundation

Calcite Linq4j
Copyright 2012-2015 The Apache Software Foundation

Apache HttpClient
Copyright 1999-2017 The Apache Software Foundation

@@ -1172,3 +1163,18 @@ Copyright 2014 The Apache Software Foundation

Apache Mahout (http://mahout.apache.org/)
Copyright 2014 The Apache Software Foundation

scala-xml
Copyright (c) 2002-2019 EPFL
Copyright (c) 2011-2019 Lightbend, Inc.

scala-xml includes software developed at
LAMP/EPFL (https://lamp.epfl.ch/) and
Lightbend, Inc. (https://www.lightbend.com/).

Licensed under the Apache License, Version 2.0 (the "License").
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
18 changes: 18 additions & 0 deletions R/CRAN_RELEASE.md
@@ -1,3 +1,21 @@
---
license: |
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
---

# SparkR CRAN Release

To release SparkR as a package to CRAN, we would use the `devtools` package. Please work with the
18 changes: 18 additions & 0 deletions R/DOCUMENTATION.md
@@ -1,3 +1,21 @@
---
license: |
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
---

# SparkR Documentation

SparkR documentation is generated by using in-source comments and annotated by using
18 changes: 5 additions & 13 deletions R/README.md
@@ -17,7 +17,7 @@ export R_HOME=/home/username/R

#### Build Spark

Build Spark with [Maven](http://spark.apache.org/docs/latest/building-spark.html#buildmvn) and include the `-Psparkr` profile to build the R package. For example to use the default Hadoop versions you can run
Build Spark with [Maven](https://spark.apache.org/docs/latest/building-spark.html#buildmvn) and include the `-Psparkr` profile to build the R package. For example to use the default Hadoop versions you can run

```bash
build/mvn -DskipTests -Psparkr package
@@ -35,23 +35,15 @@ SparkContext, you can run

./bin/sparkR --master "local[2]"

To set other options like driver memory, executor memory etc. you can pass in the [spark-submit](http://spark.apache.org/docs/latest/submitting-applications.html) arguments to `./bin/sparkR`
To set other options like driver memory, executor memory etc. you can pass in the [spark-submit](https://spark.apache.org/docs/latest/submitting-applications.html) arguments to `./bin/sparkR`
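
For example, a sketch of such an invocation (the option values below are illustrative assumptions, not from the original file):

```bash
# Start SparkR on 2 local cores with 4g of driver memory
./bin/sparkR --master "local[2]" --driver-memory 4g
```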

#### Using SparkR from RStudio

If you wish to use SparkR from RStudio or other R frontends you will need to set some environment variables which point SparkR to your Spark installation. For example
```R
# Set this to where Spark is installed
Sys.setenv(SPARK_HOME="/Users/username/spark")
# This line loads SparkR from the installed directory
.libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths()))
library(SparkR)
sparkR.session()
```
If you wish to use SparkR from RStudio, please refer [SparkR documentation](https://spark.apache.org/docs/latest/sparkr.html#starting-up-from-rstudio).

#### Making changes to SparkR

The [instructions](http://spark.apache.org/contributing.html) for making contributions to Spark also apply to SparkR.
The [instructions](https://spark.apache.org/contributing.html) for making contributions to Spark also apply to SparkR.
If you only make R file changes (i.e. no Scala changes) then you can just re-install the R package using `R/install-dev.sh` and test your changes.
Once you have made your changes, please include unit tests for them and run existing unit tests using the `R/run-tests.sh` script as described below.
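
A minimal sketch of that edit-test loop, using the two scripts named above (run from the root of the Spark source tree):

```bash
# Re-install the SparkR package after making R-only changes
R/install-dev.sh
# Run the existing SparkR unit tests
R/run-tests.sh
```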

@@ -66,7 +58,7 @@ To run one of them, use `./bin/spark-submit <filename> <args>`. For example:
```bash
./bin/spark-submit examples/src/main/r/dataframe.R
```
You can run R unit tests by following the instructions under [Running R Tests](http://spark.apache.org/docs/latest/building-spark.html#running-r-tests).
You can run R unit tests by following the instructions under [Running R Tests](https://spark.apache.org/docs/latest/building-spark.html#running-r-tests).

### Running on YARN

32 changes: 25 additions & 7 deletions R/WINDOWS.md
@@ -1,20 +1,38 @@
---
license: |
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
---

## Building SparkR on Windows

To build SparkR on Windows, the following steps are required

1. Install R (>= 3.1) and [Rtools](http://cran.r-project.org/bin/windows/Rtools/). Make sure to
include Rtools and R in `PATH`.
1. Install R (>= 3.1) and [Rtools](https://cloud.r-project.org/bin/windows/Rtools/). Make sure to
include Rtools and R in `PATH`. Note that support for R prior to version 3.4 is deprecated as of Spark 3.0.0.

2. Install
[JDK8](http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html) and set
[JDK8](https://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html) and set
`JAVA_HOME` in the system environment variables.

3. Download and install [Maven](http://maven.apache.org/download.html). Also include the `bin`
3. Download and install [Maven](https://maven.apache.org/download.html). Also include the `bin`
directory in Maven in `PATH`.

4. Set `MAVEN_OPTS` as described in [Building Spark](http://spark.apache.org/docs/latest/building-spark.html).
4. Set `MAVEN_OPTS` as described in [Building Spark](https://spark.apache.org/docs/latest/building-spark.html).

5. Open a command shell (`cmd`) in the Spark directory and build Spark with [Maven](http://spark.apache.org/docs/latest/building-spark.html#buildmvn) and include the `-Psparkr` profile to build the R package. For example to use the default Hadoop versions you can run
5. Open a command shell (`cmd`) in the Spark directory and build Spark with [Maven](https://spark.apache.org/docs/latest/building-spark.html#buildmvn) and include the `-Psparkr` profile to build the R package. For example to use the default Hadoop versions you can run

```bash
mvn.cmd -DskipTests -Psparkr package
@@ -34,7 +52,7 @@ To run the SparkR unit tests on Windows, the following steps are required —ass

4. Set the environment variable `HADOOP_HOME` to the full path to the newly created `hadoop` directory.

5. Run unit tests for SparkR by running the command below. You need to install the needed packages following the instructions under [Running R Tests](http://spark.apache.org/docs/latest/building-spark.html#running-r-tests) first:
5. Run unit tests for SparkR by running the command below. You need to install the needed packages following the instructions under [Running R Tests](https://spark.apache.org/docs/latest/building-spark.html#running-r-tests) first:

```
.\bin\spark-submit2.cmd --conf spark.hadoop.fs.defaultFS="file:///" R\pkg\tests\run-all.R