# HypergraphPercol Colab build

This notebook reproduces the multi-stage Docker build pipeline inside a Google Colab runtime so that the `HypergraphPercol` package and its CGAL helpers are available directly from a Colab session.

> **Execution order**
> Run the cells sequentially from top to bottom in a fresh Colab runtime. The build installs system packages and writes its sources and binaries under `/content`, so if the runtime is reset or disconnected midway, rerun the earlier cells before continuing.

## 1. Install system dependencies

The Dockerfile installs a set of Ubuntu packages that provide CGAL, Boost, Eigen, TBB and a C++ build toolchain (compiler, CMake, Git). We replicate the same setup here.

In [None]:
%%bash
set -euo pipefail
apt-get update
DEBIAN_FRONTEND=noninteractive apt-get install -y build-essential cmake git libtbb-dev libcgal-dev libboost-all-dev libeigen3-dev

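Before moving on, it can be worth confirming that the toolchain actually landed on the `PATH`. The following is a minimal sketch and not part of the Docker workflow; the helper name `missing_tools` and the list of commands it checks are assumptions chosen for illustration.

```python
import shutil


def missing_tools(tools):
    """Return the commands from `tools` that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


# Hypothetical selection: commands the later build cells rely on.
required = ["cmake", "git", "g++", "make"]
missing = missing_tools(required)
if missing:
    print("Missing tools:", ", ".join(missing))
else:
    print("All build tools found.")
```

An empty result only shows that the commands exist; it says nothing about whether the CGAL, Boost or Eigen headers are installed.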

## 2. Upgrade `pip` and core Python build tooling

The compiled extensions require an up-to-date Python build stack along with Cython 3.0 or newer and NumPy 1.26.x (force-reinstalled to avoid mixed-ABI issues on Colab). We also pre-install `cmake` through `pip` to match the Docker workflow.

In [None]:
%%bash
set -euo pipefail
python3 -m pip install --upgrade pip setuptools wheel "Cython>=3.0" cmake
python3 -m pip install --force-reinstall "numpy==1.26.4"

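To check that the pin took effect, one option is to read the installed version from the package metadata. This is a sketch under stated assumptions: the helper names `installed_version` and `matches_pin` are hypothetical, and the `1.26.` prefix simply mirrors the version installed above.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches_pin(version, prefix="1.26."):
    """True when `version` is set and starts with the expected release prefix."""
    return version is not None and version.startswith(prefix)


numpy_version = installed_version("numpy")
print("NumPy:", numpy_version, "| pinned series:", matches_pin(numpy_version))
```

Reading the metadata reports what is installed on disk. A kernel that imported NumPy before the reinstall keeps the old module in memory until it is restarted.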

## 3. Clone the required repositories

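We fetch both the main HypergraphPercol sources and the `cyminiball` dependency. The cell clones the default branch of each repository, or fast-forwards an existing checkout, so it matches the Docker build only as long as that build also tracks the latest commits.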

In [None]:
%%bash
set -euo pipefail
cd /content
if [ -d HypergraphPercol ]; then
    git -C HypergraphPercol pull --ff-only
else
    git clone https://github.com/Ludwig-H/HypergraphPercol.git
fi
if [ -d cyminiball ]; then
    git -C cyminiball pull --ff-only
else
    git clone https://github.com/Ludwig-H/cyminiball.git
fi

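To compare a checkout against the revisions used by the Docker build, it can help to print the commit each clone currently points to. The sketch below shells out to `git rev-parse HEAD`; the helper name `head_commit` is hypothetical, and the paths are the ones used in the cell above.

```python
import subprocess


def head_commit(repo_dir):
    """Return the HEAD commit hash of `repo_dir`, or None if unavailable."""
    try:
        result = subprocess.run(
            ["git", "-C", repo_dir, "rev-parse", "HEAD"],
            capture_output=True,
            text=True,
            check=False,
        )
    except FileNotFoundError:  # git itself is not installed
        return None
    return result.stdout.strip() if result.returncode == 0 else None


for repo in ("/content/HypergraphPercol", "/content/cyminiball"):
    print(repo, "->", head_commit(repo))
```

Recording these hashes alongside any results makes it possible to check out the same commits later with `git checkout <hash>`.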

## 4. Build and install `cyminiball`

The Docker image creates a wheel from source and installs it without build isolation. We mirror that approach so the same binary is present inside Colab.

In [None]:
%%bash
set -euo pipefail
mkdir -p /content/wheels
cd /content/cyminiball
python3 -m pip wheel --no-build-isolation --wheel-dir=/content/wheels .
python3 -m pip install --force-reinstall --no-index --find-links=/content/wheels cyminiball

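A quick way to confirm that the wheel was produced is to list the matching files in the wheel directory. This is a minimal sketch; the helper name `find_wheels` is hypothetical, and it assumes the wheel file name starts with the distribution name `cyminiball`.

```python
from pathlib import Path


def find_wheels(wheel_dir, name):
    """Return sorted wheel file names in `wheel_dir` that start with `name-`."""
    directory = Path(wheel_dir)
    if not directory.is_dir():
        return []
    return sorted(p.name for p in directory.glob(f"{name}-*.whl"))


print(find_wheels("/content/wheels", "cyminiball"))
```

An empty list means the build step did not leave a wheel behind. The directory may also hold wheels for dependencies, since `pip wheel` collects those as well unless told otherwise.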

## 5. Download the CGAL helper projects

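The helper script `scripts/setup_cgal.py` clones the source projects for the six CGAL-based executables required by HypergraphPercol. The next step expects to find them under `CGALDelaunay/`.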

In [None]:
%%bash
set -euo pipefail
cd /content/HypergraphPercol
python3 scripts/setup_cgal.py

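Before compiling, it can be useful to check that every project directory is present. The sketch below is an assumption-laden convenience, not part of the helper script: `missing_projects` is a hypothetical name, and the project list and the `CMakeLists.txt` location are taken from the build step that follows.

```python
from pathlib import Path

PROJECTS = [
    "EdgesCGALDelaunay2D",
    "EdgesCGALDelaunay3D",
    "EdgesCGALDelaunayND",
    "EdgesCGALWeightedDelaunay2D",
    "EdgesCGALWeightedDelaunay3D",
    "EdgesCGALWeightedDelaunayND",
]


def missing_projects(root, names=PROJECTS):
    """Return the project names that lack a CMakeLists.txt under `root`."""
    base = Path(root)
    return [n for n in names if not (base / n / "CMakeLists.txt").is_file()]


print("Missing projects:", missing_projects("/content/HypergraphPercol/CGALDelaunay"))
```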

## 6. Patch (if necessary) and compile the CGAL executables

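The two 3D helpers (`EdgesCGALDelaunay3D` and `EdgesCGALWeightedDelaunay3D`) need an explicit pthread linkage, as in the Dockerfile, so the cell adds a `target_link_libraries` line to their `CMakeLists.txt` unless it is already present. Each project is then configured and built in Release mode.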

In [None]:
%%bash
set -euo pipefail
cd /content/HypergraphPercol/CGALDelaunay

projects=(
    EdgesCGALDelaunay2D
    EdgesCGALDelaunay3D
    EdgesCGALDelaunayND
    EdgesCGALWeightedDelaunay2D
    EdgesCGALWeightedDelaunay3D
    EdgesCGALWeightedDelaunayND
)

for project in "${projects[@]}"; do
    cmakelists="$project/CMakeLists.txt"
    # The two 3D helpers need an explicit pthread link; patch each file only once.
    case "$project" in
        EdgesCGALDelaunay3D|EdgesCGALWeightedDelaunay3D)
            link_line="target_link_libraries($project PRIVATE pthread)"
            if ! grep -qF "$link_line" "$cmakelists"; then
                sed -i "/^add_executable($project/a $link_line" "$cmakelists"
            fi
            ;;
    esac
    cmake -S "$project" -B "$project/build" -DCMAKE_BUILD_TYPE=Release
    cmake --build "$project/build" -j"$(nproc)"
done

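The patch in the cell above is idempotent: it inserts the link line after the `add_executable` line only when the file does not already contain it. The same logic can be sketched in Python on a string. The function name `add_pthread_link` and the sample CMake text are made up for illustration and do not come from the repository.

```python
def add_pthread_link(cmake_text, target):
    """Insert a pthread link line after `add_executable(target` if absent."""
    link_line = f"target_link_libraries({target} PRIVATE pthread)"
    if link_line in cmake_text:
        return cmake_text  # already patched, nothing to do
    patched = []
    for line in cmake_text.splitlines():
        patched.append(line)
        if line.startswith(f"add_executable({target}"):
            patched.append(link_line)
    return "\n".join(patched) + "\n"


# Hypothetical file contents, used only to demonstrate the behaviour.
sample = "project(Demo)\nadd_executable(EdgesCGALDelaunay3D main.cpp)\n"
once = add_pthread_link(sample, "EdgesCGALDelaunay3D")
twice = add_pthread_link(once, "EdgesCGALDelaunay3D")
print(once)
print("Idempotent:", once == twice)
```

Running the patch twice leaves the text unchanged, which is why the build cell can be rerun safely on an existing checkout.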

## 7. Install Python runtime dependencies

HypergraphPercol depends on scientific Python packages such as scikit-learn, HDBSCAN and GUDHI. We install them up front because the next step installs the package with `--no-deps`, which keeps the locally built `cyminiball` wheel in place but also means `pip` will not pull in these dependencies itself. Some of these packages may upgrade NumPy as a side effect, so we immediately reinstall 1.26.4 afterwards to keep the ABI aligned with the compiled extensions.

In [None]:
%%bash
set -euo pipefail
python3 -m pip install --upgrade scikit-learn hdbscan gudhi joblib threadpoolctl
python3 -m pip install --force-reinstall "numpy==1.26.4"

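A lightweight check that the runtime dependencies can be found, without importing them, might look like the sketch below. The helper name `missing_modules` is hypothetical; the module list mirrors the packages installed above, using their import names.

```python
from importlib import util


def missing_modules(names):
    """Return the module names from `names` that cannot be located."""
    return [name for name in names if util.find_spec(name) is None]


# scikit-learn is imported as `sklearn`; the other names match their pip names.
runtime_modules = ["sklearn", "hdbscan", "gudhi", "joblib", "threadpoolctl", "numpy"]
print("Missing modules:", missing_modules(runtime_modules))
```

`find_spec` only locates a top-level module, so this does not catch ABI mismatches that would surface at import time.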
## 8. Install HypergraphPercol from source

Finally, install the package so that it becomes importable inside the notebook runtime. Using `--no-deps` keeps the `cyminiball` wheel we built earlier instead of asking `pip` to recompile it.

In [None]:
%%bash
set -euo pipefail
cd /content/HypergraphPercol
python3 -m pip install --no-deps --force-reinstall .


## 9. Configure the runtime (optional) and validate the installation

The package defaults to `/content/HypergraphPercol/CGALDelaunay` when `cgal_root` is not provided. The following cell nevertheless sets the `CGALDELAUNAY_ROOT` environment variable explicitly, passes the same path as `cgal_root`, and runs a small clustering job on two synthetic Gaussian blobs in 3D to check that everything works.

In [None]:

import os
import numpy as np

os.environ["CGALDELAUNAY_ROOT"] = "/content/HypergraphPercol/CGALDelaunay"

from hypergraphpercol import HypergraphPercol

rng = np.random.default_rng(0)
data = np.vstack([
    rng.normal(loc=-2.0, scale=0.4, size=(40, 3)),
    rng.normal(loc=2.0, scale=0.4, size=(40, 3)),
])
labels = HypergraphPercol(
    data,
    K=2,
    min_cluster_size=20,
    min_samples=10,
    metric="euclidean",
    complex_chosen="auto",
    expZ=2,
    precision="safe",
    verbeux=True,
    cgal_root=os.environ["CGALDELAUNAY_ROOT"],
)
print("Unique labels:", np.unique(labels))

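Beyond printing the unique labels, a per-cluster count gives a clearer picture of whether the two blobs were separated. The sketch below assumes that the result is a one-dimensional sequence of integer labels and that `-1` marks noise points, as in HDBSCAN; neither assumption is confirmed by this notebook, and `summarize_labels` is a hypothetical helper.

```python
from collections import Counter


def summarize_labels(labels, noise_label=-1):
    """Count points per cluster; `noise_label` (assumed -1) marks noise."""
    counts = Counter(int(label) for label in labels)
    noise = counts.pop(noise_label, 0)
    return dict(sorted(counts.items())), noise


# Stand-in labels for illustration; in the notebook, pass the `labels` array.
example = [0, 0, 1, -1, 1, 1]
clusters, noise = summarize_labels(example)
print("Cluster sizes:", clusters, "| noise points:", noise)
```

For the synthetic data above, two clusters of roughly 40 points each would be the expected shape of a successful run.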