[PYDF] Prepare release of 0.3.0
Also update install process and instructions

PiperOrigin-RevId: 616216549
rstz authored and Copybara-Service committed Mar 15, 2024
1 parent 89dae4f commit 2814d79
Showing 15 changed files with 290 additions and 99 deletions.
3 changes: 0 additions & 3 deletions yggdrasil_decision_forests/port/python/.bazelrc
@@ -1,8 +1,5 @@
# Bazel configuration for Yggdrasil Decision Forests

# Common flags.
common --experimental_repo_remote_exec

# On Windows, uncomment the next line to solve long path issues:
# startup --output_user_root=C:/tmpbld

2 changes: 1 addition & 1 deletion yggdrasil_decision_forests/port/python/.bazelversion
@@ -1 +1 @@
5.3.0
6.5.0
28 changes: 25 additions & 3 deletions yggdrasil_decision_forests/port/python/CHANGELOG.md
@@ -1,17 +1,39 @@
# Changelog

## HEAD
## 0.3.0 - 2024-03-15

## Breaking
### Breaking

- Custom losses must now provide the gradient instead of the negative of the
gradient.
- Clarified that YDF may modify numpy arrays containing the custom loss.
- Clarified that YDF may modify numpy arrays returned by a custom loss
function.
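
To illustrate the sign convention in the first breaking change, here is a minimal sketch for squared error. It uses plain Python lists rather than YDF's custom-loss API, and both function names are hypothetical; only the sign of the returned values is the point.

```python
# Illustrative only: this is not YDF's custom-loss API.
# Both function names are hypothetical.


def squared_error_gradient(labels, predictions):
  """Gradient of 0.5 * (prediction - label)**2 w.r.t. the prediction."""
  return [p - y for y, p in zip(labels, predictions)]


def squared_error_negative_gradient(labels, predictions):
  """Negative of the gradient above (the convention before 0.3.0)."""
  return [y - p for y, p in zip(labels, predictions)]


labels = [1.0, 2.0]
predictions = [1.5, 1.0]
print(squared_error_gradient(labels, predictions))
print(squared_error_negative_gradient(labels, predictions))
```

A loss written for an earlier version would presumably need its gradient sign flipped, as in the first function.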

### Features

- Allow using Jax for custom loss definitions.
- Allow setting `may_trigger_gc` on custom losses.
- Add support for MHLD oblique decision trees.
- Expose hyperparameter `sparse_oblique_max_num_projections`.
- HTML plots for trees with `model.plot_tree()`.
- Pin protobuf to version 4.24.3 to fix some incompatibilities when using
conda.
- Allow listing compatible engines with `model.list_compatible_engines()`.
- Allow choosing a fast engine with `model.force_engine(...)`.

### Fix

- Fix slow engine creation for some combinations of oblique splits.
- Improve error message when feeding multi-dimensional labels.

### Documentation

- Clarified documentation of hyperparameters for oblique splits.
- Fix plots, typos.

#### Release music

Doctor Gradus ad Parnassum from "Children's Corner" (L. 113). Claude Debussy

## 0.2.0 - 2024-02-22

96 changes: 96 additions & 0 deletions yggdrasil_decision_forests/port/python/INSTALLATION.md
@@ -0,0 +1,96 @@
# Building and installing YDF

## Install from PyPI

To install YDF, run:

```
pip install ydf --upgrade
```
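
To confirm which version ended up installed, a small check with the standard library's `importlib.metadata` can help. This is a sketch; the helper name `installed_version` is hypothetical.

```python
from importlib import metadata


def installed_version(package_name):
  """Returns the installed version of a package, or None if it is missing."""
  try:
    return metadata.version(package_name)
  except metadata.PackageNotFoundError:
    return None


# Prints the installed YDF version, or None if YDF is not installed.
print(installed_version("ydf"))
```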

## Building

### Pre-work

Use `tools/update_version.sh` to update the version number (if needed) and
remember to update `CHANGELOG.md`.

### Linux

#### Docker

For building manylinux2014-compatible packages, you can use an appropriate
Docker image. The pre-configured build script at
`tools/build_linux_release_in_docker.sh` starts a container and builds the
wheels end-to-end. You can find the wheels in the `dist/` subdirectory.
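
To check which Python versions and platforms the built wheels target, the tags can be read from the filenames. The sketch below assumes the standard wheel naming layout, where the last three dash-separated fields are the Python tag, the ABI tag and the platform tag; the helper name `wheel_tags` is hypothetical.

```python
from pathlib import Path


def wheel_tags(filename):
  """Splits a wheel filename into (python tag, ABI tag, platform tag)."""
  name = Path(filename).name
  if not name.endswith(".whl"):
    raise ValueError(f"Not a wheel: {filename}")
  # The last three dash-separated fields are the compatibility tags.
  python, abi, plat = name[: -len(".whl")].split("-")[-3:]
  return python, abi, plat


# Lists the wheels found in dist/ (prints nothing if there are none).
for wheel in sorted(Path("dist").glob("*.whl")):
  print(wheel.name, wheel_tags(wheel.name))
```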

#### Manual build

Note that we may not be able to help with issues during manual builds.

**Requirements**

* Bazel - version as specified in `.bazelversion`,
[Bazelisk](https://github.com/bazelbuild/bazelisk) recommended
* GCC >= 9 or Clang >= 14
* rsync
* Python headers (e.g. `python-dev` package on Ubuntu)
* Python virtualenv
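
Before starting a manual build, it can be convenient to check that the command line tools from the list above are on `PATH`. The following is a rough sketch: the executable names in `REQUIRED_TOOLS` are assumptions, and it checks neither tool versions nor the Python headers.

```python
import shutil

# Each requirement maps to executables that would satisfy it (assumed names).
REQUIRED_TOOLS = {
    "Bazel": ["bazel", "bazelisk"],
    "C++ compiler": ["gcc", "clang"],
    "rsync": ["rsync"],
}


def missing_tools(required=REQUIRED_TOOLS, which=shutil.which):
  """Returns the requirements for which no candidate executable is on PATH."""
  return [
      name
      for name, candidates in required.items()
      if not any(which(candidate) for candidate in candidates)
  ]


# Prints the names of the requirements that could not be found.
print(missing_tools())
```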

**Steps**

1. Compile and test the code with

```shell
# Create a virtual environment where Python dependencies will be installed.
python -m venv myvenv
source myvenv/bin/activate
RUN_TESTS=1 ./tools/test_pydf.sh
deactivate
```

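To use a different compiler, set the `CC` environment variable to your compiler name / version (for example, `CC="clang"`) when running the script.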

1. Build the Pip package

```shell
PYTHON_BIN=python
./tools/build_pydf.sh $PYTHON_BIN
```

If you want to build with [Pyenv](https://github.com/pyenv/pyenv) for all supported Python versions, run

```shell
./tools/build_pydf.sh ALL_VERSIONS
```

### MacOS

**Requirements**

* Bazel (version as specified in `.bazelversion`,
[Bazelisk](https://github.com/bazelbuild/bazelisk) recommended)
* Xcode command line tools
* [Pyenv](https://github.com/pyenv/pyenv)

**Building for all supported Python versions**

Simply run

```shell
./tools/build_macos_release.sh
```
This will build a MacOS wheel for every supported Python version on the current
architecture. See the contents of this script for details about the build.

### MacOS cross-compilation

We have not tested MacOS cross-compilation (Intel <-> ARM) for YDF yet, though
it is on our roadmap.

### AArch64

We have not tested AArch64 compilation for YDF yet.

### Windows

TODO, see `tools/build.bat`.
15 changes: 2 additions & 13 deletions yggdrasil_decision_forests/port/python/README.md
@@ -18,6 +18,8 @@ To install YDF, in Python, simply grab the package from pip:
pip install ydf
```

For build instructions, see INSTALLATION.md.

## Usage Example

```python
@@ -37,19 +39,6 @@ model.save("my_model")
loaded_model = ydf.load_model("my_model")
```

## Compiling & Building

To build the Python port of YDF, install Bazel, GCC 9 and run the following
command from the root of the port/python directory in the YDF repository

```sh
PYTHON_BIN=python3.9
./tools/test_pydf.sh
./tools/build_pydf.sh $PYTHON_BIN
```

Browse the `tools/` directory for more build helpers.

## Frequently Asked Questions

* **Is it PYDF or YDF?** The name of the library is simply ydf, and so is the
17 changes: 10 additions & 7 deletions yggdrasil_decision_forests/port/python/config/setup.py
@@ -21,7 +21,7 @@
from setuptools.command.install import install
from setuptools.dist import Distribution

_VERSION = "0.2.0"
_VERSION = "0.3.0"

with open("README.md", "r", encoding="utf-8") as fh:
  long_description = fh.read()
@@ -34,6 +34,8 @@

OPTIONAL_PACKAGES = {"pandas": ["pandas"]}

MAC_CROSS_COMPILED = False # Change if cross-compiled


class InstallPlatlib(install):

@@ -63,12 +65,13 @@ def finalize_options(self):

def get_tag(self):
python, abi, plat = _bdist_wheel.get_tag(self)
if platform.system() == "Darwin":
# Uncomment on of the lines below to adapt the platform string when
# cross-compiling.
# plat = "macosx_12_0_arm64"
# plat = "macosx_10_15_x86_64"
pass
if platform.system() == "Darwin" and MAC_CROSS_COMPILED:
if platform.processor() == "arm":
plat = "macosx_10_15_x86_64"
elif platform.processor() == "i386":
plat = "macosx_12_0_arm64"
else:
raise ValueError(f"Unknown processor {platform.processor()}")
return python, abi, plat

except ImportError:
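
The new `get_tag` logic picks the platform tag of the opposite architecture when `MAC_CROSS_COMPILED` is set. Pulled out as a standalone function, the mapping can be sketched as follows. The tags and the error message are copied from the change above; the function and dictionary names are hypothetical.

```python
# Maps the build machine's processor (as reported by platform.processor())
# to the platform tag of the cross-compilation target.
_CROSS_COMPILED_TAGS = {
    "arm": "macosx_10_15_x86_64",  # Built on ARM, targeting Intel.
    "i386": "macosx_12_0_arm64",  # Built on Intel, targeting ARM.
}


def cross_compiled_platform_tag(processor):
  """Returns the wheel platform tag for a cross-compiled MacOS build."""
  try:
    return _CROSS_COMPILED_TAGS[processor]
  except KeyError:
    raise ValueError(f"Unknown processor {processor}") from None


print(cross_compiled_platform_tag("arm"))
```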
66 changes: 66 additions & 0 deletions yggdrasil_decision_forests/port/python/examples/minimal.py
@@ -0,0 +1,66 @@
# Copyright 2022 Google LLC.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

r"""Minimal usage example of YDF.
This example trains, displays, evaluates and exports a Gradient Boosted Tree
model.
Usage example:
pip install ydf pandas -U
python minimal.py
"""

from absl import app
import pandas as pd
import ydf


def main(argv):
  if len(argv) > 1:
    raise app.UsageError("Too many command-line arguments.")

  # Download the Adult dataset, load in a Pandas dataframe.
  train_path = "https://raw.githubusercontent.com/google/yggdrasil-decision-forests/main/yggdrasil_decision_forests/test_data/dataset/adult_train.csv"
  test_path = "https://raw.githubusercontent.com/google/yggdrasil-decision-forests/main/yggdrasil_decision_forests/test_data/dataset/adult_test.csv"
  train_df = pd.read_csv(train_path)
  test_df = pd.read_csv(test_path)

  # Display full logs
  ydf.verbose(2)

  # Trains the model.
  model = ydf.GradientBoostedTreesLearner(label="income").train(train_df)

  # Some information about the model.
  print(model.describe())

  # Evaluates the model on the test dataset.
  evaluation = model.evaluate(test_df)
  print(evaluation)

  # Exports the model to disk.
  model.save("/tmp/ydf_model")

  # Reload the model from disk
  loaded_model = ydf.load_model("/tmp/ydf_model")

  # Make predictions with the model from disk.
  predictions = loaded_model.predict(test_df)
  print(predictions)


if __name__ == "__main__":
  app.run(main)
@@ -24,7 +24,8 @@ function build_py() {
$PYTHON -m venv /tmp/venv_$PYTHON
source /tmp/venv_$PYTHON/bin/activate
bazel clean --expunge
COMPILERS="gcc" ./tools/test_pydf.sh
export CC="gcc"
./tools/test_pydf.sh
./tools/build_pydf.sh python
}

Expand Down
@@ -14,17 +14,18 @@
# limitations under the License.


DOCKER=gcr.io/tfx-oss-public/manylinux2014-bazel:bazel-5.3.0
DOCKER=quay.io/pypa/manylinux2014_x86_64@sha256:2e37241d9c9fbbccea009e59505a1384f9501a7bfea77b21fdcbf332c7036e70

# Current directory
# Useful if Yggdrasil Decision Forests is available locally in a neighbor
# directory.
BAZELISK_VERSION="v1.19.0"
YDF_PATH=$(realpath $PWD/../../..)
YDF_DIRNAME=${YDF_PATH##*/}

# Download docker
sudo docker pull ${DOCKER}
docker pull $DOCKER

# Start docker
sudo docker run -it -v ${PWD}/../../../../:/working_dir -w /working_dir/${YDF_DIRNAME}/yggdrasil_decision_forests/port/python ${DOCKER} \
/bin/bash -c "./tools/build_linux_release.sh"
# Start the container
docker run -it -v $YDF_PATH:/working_dir -w /working_dir/yggdrasil_decision_forests/port/python \
$DOCKER /bin/bash -c " \
yum update && yum install -y rsync && \
curl -L -o /usr/local/bin/bazel https://github.com/bazelbuild/bazelisk/releases/download/${BAZELISK_VERSION}/bazelisk-linux-amd64 && \
chmod +x /usr/local/bin/bazel && \
./tools/build_linux_release.sh "
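
The Bazelisk download URL used in the container command follows a fixed pattern built from `BAZELISK_VERSION`. As a sketch of that pattern only (the helper name and its `asset` parameter are hypothetical):

```python
def bazelisk_download_url(version, asset="bazelisk-linux-amd64"):
  """Builds the release download URL used by the Docker build script."""
  return (
      "https://github.com/bazelbuild/bazelisk/releases/download/"
      f"{version}/{asset}"
  )


print(bazelisk_download_url("v1.19.0"))
```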
@@ -0,0 +1,36 @@
#!/bin/bash
# Copyright 2022 Google LLC.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.


set -vex

declare -a python_versions=("3.8" "3.9" "3.10" "3.11")

for pyver in "${python_versions[@]}"
do
pyenv install -s $pyver
export PYENV_VERSION=$pyver
rm -rf ${TMPDIR}venv
python -m venv ${TMPDIR}venv
source ${TMPDIR}venv/bin/activate
pip install --upgrade pip

echo "Building with $(python3 -V 2>&1)"

bazel clean --expunge
RUN_TESTS=0 CC="clang" ./tools/test_pydf.sh
./tools/build_pydf.sh python
deactivate
done