
The Sedona Scala/Java code is a project with multiple modules. Each module is a mixed Scala/Java project managed by Apache Maven 3.

* Make sure your Linux/Mac machine has Java 11, Apache Maven 3.3.1+, and Python 3.8+. The compilation of Sedona is not tested on Windows machines.
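
The prerequisites above can be checked quickly from a terminal. A minimal sketch, assuming a POSIX shell and that the tools, if installed, are on PATH:

```shell
# Report which of the required build tools are available; versions still need a manual check
for tool in java mvn python3; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool: $(command -v "$tool")"
  else
    echo "$tool: NOT FOUND"
  fi
done
```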

To compile all modules, make sure you are in the root folder of the repository. Then enter the following command in the terminal:

Sedona uses GitHub Actions to automatically generate jars per commit.

## Run Python tests

1) Set up the environment variables SPARK_HOME and PYTHONPATH

For example,

```
export SPARK_VERSION=3.4.0
export SPARK_HOME=$PWD/spark-${SPARK_VERSION}-bin-hadoop3
export PYTHONPATH=$SPARK_HOME/python
```
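
To sanity-check the exports before moving on, the following sketch (assuming the same hadoop3 directory layout) prints both variables and reports whether the Spark Python sources are in place yet:

```shell
# Echo the two variables and check that SPARK_HOME contains Spark's python/ folder
echo "SPARK_HOME=${SPARK_HOME:-<unset>}"
echo "PYTHONPATH=${PYTHONPATH:-<unset>}"
if [ -d "${SPARK_HOME:-}/python" ]; then
  echo "Spark Python sources found"
else
  echo "Spark Python sources not found yet (install Spark in the next step)"
fi
```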

2) Install Spark if you haven't already

```
wget https://archive.apache.org/dist/spark/spark-${SPARK_VERSION}/spark-${SPARK_VERSION}-bin-hadoop3.tgz
tar -xvzf spark-${SPARK_VERSION}-bin-hadoop3.tgz
rm spark-${SPARK_VERSION}-bin-hadoop3.tgz
```

3) Put the JAI jars in the ==SPARK_HOME/jars/== folder.

```
export JAI_CORE_VERSION="1.1.3"
export JAI_IMAGEIO_VERSION="1.1"
wget -P $SPARK_HOME/jars/ https://repo.osgeo.org/repository/release/javax/media/jai_core/${JAI_CORE_VERSION}/jai_core-${JAI_CORE_VERSION}.jar
wget -P $SPARK_HOME/jars/ https://repo.osgeo.org/repository/release/javax/media/jai_imageio/${JAI_IMAGEIO_VERSION}/jai_imageio-${JAI_IMAGEIO_VERSION}.jar
```
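
After the downloads finish, it is worth confirming that both jars actually landed in the jars folder. A small check (an illustrative helper loop, not part of the official docs):

```shell
# Look for each JAI jar under $SPARK_HOME/jars and report its status
for jar in jai_core jai_imageio; do
  if ls "${SPARK_HOME:-}/jars/${jar}"-*.jar >/dev/null 2>&1; then
    echo "$jar: present"
  else
    echo "$jar: missing from ${SPARK_HOME:-<unset>}/jars"
  fi
done
```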

4) Compile the Sedona Scala and Java code with `-Dgeotools` and then copy the ==sedona-spark-shaded-{{ sedona.current_version }}.jar== to the ==SPARK_HOME/jars/== folder.

```
cp spark-shaded/target/sedona-spark-shaded-*.jar $SPARK_HOME/jars/
```

5) Install the following libraries

```
sudo apt-get -y install python3-pip python-dev libgeos-dev
sudo pip3 install -U virtualenvwrapper
sudo pip3 install -U pipenv
```

Homebrew can be used to install libgeos-dev on macOS:

```
brew install geos
```

6) Set up pipenv with the desired Python version: 3.8, 3.9, or 3.10

```
cd python
pipenv --python 3.8
```

7) Install the matching PySpark version and the other dependencies

```
cd python
pipenv install pyspark==${SPARK_VERSION}
pipenv install --dev
```
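
To confirm the environment is consistent, you can print the PySpark version from inside the pipenv environment. A sketch, assuming `pipenv` is on PATH and the install above succeeded; it should print the same value as SPARK_VERSION from step 1:

```shell
# Print the installed PySpark version from inside the pipenv environment, if available
if command -v pipenv >/dev/null 2>&1; then
  pipenv run python -c "import pyspark; print(pyspark.__version__)" || echo "PySpark not installed in this pipenv"
else
  echo "pipenv not found; skipping version check"
fi
```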

8) Run the Python tests

```
cd python
If you just want to run one hook, for example the `markdownlint` hook, run:
`pre-commit run markdownlint --all-files`

We have a [Makefile](https://github.com/apache/sedona/blob/master/Makefile) in the repository root which has three pre-commit convenience commands.

For example, you can run the following to set up pre-commit to run before each commit:

```
make checkinstall
```