Merged
Commits
31 commits
c18aabb
Downgrade Ray to 2.43.0, update vec-inf version, use base image NCCL …
XkunW Sep 23, 2025
cc850d8
Revert to use vllm nccl 2.12.1 for now
XkunW Sep 24, 2025
0fc7168
Add models
XkunW Oct 3, 2025
d3fd292
Merge branch 'main' into bugfix/mpi-client-error
XkunW Oct 6, 2025
c9f89d2
Install vllm directly instead of using project dependencies
XkunW Oct 7, 2025
dbc38ec
Add missing RUN arg
XkunW Oct 7, 2025
f8dd12a
Fix bindpath env var
XkunW Oct 7, 2025
796a6c3
Simplify use_container logic
XkunW Oct 7, 2025
116801e
Print venv in launch
XkunW Oct 7, 2025
2104b1b
Add --system flag to install into system environment
XkunW Oct 7, 2025
2f4055e
Revert back to use project dependencies
XkunW Oct 7, 2025
e355251
Fix tests
XkunW Oct 7, 2025
1ba798d
[pre-commit.ci] Add auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 7, 2025
b04f90e
Add --allow-change-held-packages flag
XkunW Oct 7, 2025
3b6c375
Ban click 0.8.3 to resolve ray sentinel error (ban addressed in ray m…
XkunW Oct 8, 2025
4cc71a3
Downgrade ray to 2.48.0 to see if this resolves the MutableObjectMana…
XkunW Oct 8, 2025
9087b94
Downgrade ray to 2.45.0, which is the previous working version for ve…
XkunW Oct 14, 2025
2a6b980
Update ray to 2.50.0 to for prepackaged click version ban, install ad…
XkunW Oct 21, 2025
4018e6d
Add flash infer, add ray[default] dependencies for ray better ray deb…
XkunW Oct 23, 2025
9387907
Add allow pre-release flag
XkunW Oct 23, 2025
b99bd5b
Add allow prerelease flag for flash infer
XkunW Oct 24, 2025
a9abb7b
Add sglang to dependencies, remove PIP INDEX URL
XkunW Oct 30, 2025
4eeb00b
Add --prerelease=allow, add missing RUN
XkunW Oct 30, 2025
029bb20
Fix docs workflow
XkunW Oct 30, 2025
814bb7b
Fix unit tests workflow
XkunW Oct 30, 2025
7eda3e2
Update slurm template for RDMA setup and binding to resolve Ray issues
XkunW Oct 31, 2025
b1af69e
[pre-commit.ci] Add auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 31, 2025
59f4bdd
Remove extra line in test
XkunW Nov 4, 2025
7e02bc9
Update lock
XkunW Nov 4, 2025
8487dae
ruff fix
XkunW Nov 4, 2025
9ace730
mypy fix
XkunW Nov 4, 2025
2 changes: 1 addition & 1 deletion .github/workflows/code_checks.yml
@@ -40,7 +40,7 @@ jobs:
with:
python-version-file: ".python-version"
- name: Install the project
- run: uv sync --dev
+ run: uv sync --dev --prerelease=allow
- name: Install dependencies and check code
run: |
source .venv/bin/activate
6 changes: 3 additions & 3 deletions .github/workflows/docs.yml
@@ -67,10 +67,10 @@ jobs:
python-version-file: ".python-version"

- name: Install the project
- run: uv sync --all-extras --group docs
+ run: uv sync --all-extras --group docs --prerelease=allow

- name: Build docs
- run: uv run mkdocs build
+ run: uv run --frozen mkdocs build

- name: Create .nojekyll file
run: touch site/.nojekyll
@@ -104,7 +104,7 @@ jobs:
python-version-file: ".python-version"

- name: Install the project
- run: uv sync --all-extras --group docs
+ run: uv sync --all-extras --group docs --frozen

- name: Configure Git Credentials
run: |
8 changes: 4 additions & 4 deletions .github/workflows/unit_tests.yml
@@ -58,18 +58,18 @@ jobs:
python-version: ${{ matrix.python-version }}

- name: Install the project
- run: uv sync --dev
+ run: uv sync --dev --prerelease=allow

- name: Install dependencies and check code
run: |
- uv run pytest -m "not integration_test" --cov vec_inf --cov-report=xml tests
+ uv run --frozen pytest -m "not integration_test" --cov vec_inf --cov-report=xml tests

- name: Install the core package only
run: uv sync --no-dev

- name: Run package import tests
run: |
- uv run pytest tests/test_imports.py
+ uv run --frozen pytest tests/test_imports.py

- name: Import Codecov GPG public key
run: |
@@ -79,7 +79,7 @@ jobs:
uses: codecov/codecov-action@v5.5.1
with:
token: ${{ secrets.CODECOV_TOKEN }}
- file: ./coverage.xml
+ files: ./coverage.xml
name: codecov-umbrella
fail_ci_if_error: true
verbose: true
20 changes: 12 additions & 8 deletions Dockerfile
@@ -35,29 +35,33 @@ RUN wget https://bootstrap.pypa.io/get-pip.py && \
rm get-pip.py && \
python3.10 -m pip install --upgrade pip setuptools wheel uv

- # Install Infiniband/RDMA support
+ # Install RDMA support
RUN apt-get update && apt-get install -y \
libibverbs1 libibverbs-dev ibverbs-utils \
librdmacm1 librdmacm-dev rdmacm-utils \
+ rdma-core ibverbs-providers infiniband-diags perftest \
&& rm -rf /var/lib/apt/lists/*

# Set up RDMA environment (these will persist in the final container)
ENV LD_LIBRARY_PATH="/usr/lib/x86_64-linux-gnu:$LD_LIBRARY_PATH"
ENV UCX_NET_DEVICES=all
ENV NCCL_IB_DISABLE=0
+ ENV NCCL_SOCKET_IFNAME="^lo,docker0"
+ ENV NCCL_NET_GDR_LEVEL=PHB
+ ENV NCCL_IB_TIMEOUT=22
+ ENV NCCL_IB_RETRY_CNT=7
+ ENV NCCL_DEBUG=INFO

# Set up project
WORKDIR /vec-inf
COPY . /vec-inf

# Install project dependencies with build requirements
- RUN PIP_INDEX_URL="https://download.pytorch.org/whl/cu128" uv pip install --system -e .[dev]
+ RUN uv pip install --system -e .[dev] --prerelease=allow

- # Final configuration
- RUN mkdir -p /vec-inf/nccl && \
- mv /root/.config/vllm/nccl/cu12/libnccl.so.2.18.1 /vec-inf/nccl/libnccl.so.2.18.1
- ENV VLLM_NCCL_SO_PATH=/vec-inf/nccl/libnccl.so.2.18.1
- ENV NCCL_DEBUG=INFO
+ # Install a single, system NCCL (from NVIDIA CUDA repo in base image)
+ RUN apt-get update && apt-get install -y --allow-change-held-packages\
+ libnccl2 libnccl-dev \
+ && rm -rf /var/lib/apt/lists/*

# Set the default command to start an interactive shell
CMD ["bash"]
1 change: 1 addition & 0 deletions MODEL_TRACKING.md
@@ -40,6 +40,7 @@ This document tracks all model weights available in the `/model-weights` directo
| `gemma-2b-it` | ❌ |
| `gemma-7b` | ❌ |
| `gemma-7b-it` | ❌ |
+ | `gemma-2-2b-it` | ✅ |
| `gemma-2-9b` | ✅ |
| `gemma-2-9b-it` | ✅ |
| `gemma-2-27b` | ✅ |
7 changes: 4 additions & 3 deletions pyproject.toml
@@ -42,9 +42,10 @@ dev = [
"xgrammar>=0.1.11",
"torch>=2.7.0",
"vllm>=0.10.0",
- "vllm-nccl-cu12>=2.18,<2.19",
- "ray>=2.40.0",
- "cupy-cuda12x==12.1.0"
+ "ray[default]>=2.50.0",
+ "cupy-cuda12x==12.1.0",
+ "flashinfer-python>=0.4.0",
+ "sglang>=0.5.0",
]

[project.scripts]
1 change: 1 addition & 0 deletions tests/vec_inf/cli/test_cli.py
@@ -39,6 +39,7 @@ def test_launch_command_success(runner):
"mem_per_node": "32G",
"model_weights_parent_dir": "/model-weights",
"vocab_size": "128000",
+ "venv": "/path/to/venv",
"vllm_args": {"max_model_len": 8192},
"env": {"CACHE": "/cache"},
}
2 changes: 2 additions & 0 deletions tests/vec_inf/cli/test_helper.py
@@ -35,6 +35,7 @@ def test_format_table_output(self):
"mem_per_node": "32G",
"model_weights_parent_dir": "/model-weights",
"log_dir": "/tmp/logs",
+ "venv": "/path/to/venv",
"vllm_args": {"max_model_len": 8192, "enable_prefix_caching": True},
"env": {"CACHE": "/cache"},
}
@@ -63,6 +64,7 @@ def test_format_table_output_with_minimal_params(self):
"mem_per_node": "16G",
"model_weights_parent_dir": "/weights",
"log_dir": "/logs",
+ "venv": "/path/to/venv",
"vllm_args": {},
"env": {},
}
15 changes: 7 additions & 8 deletions tests/vec_inf/client/test_slurm_script_generator.py
@@ -53,7 +53,7 @@ def singularity_params(self, basic_params):
singularity = basic_params.copy()
singularity.update(
{
- "venv": "singularity",
+ "venv": "apptainer",
"bind": "/scratch:/scratch,/data:/data",
"env": {
"CACHE_DIR": "/cache",
@@ -109,7 +109,7 @@ def test_init_singularity(self, singularity_params):
def test_init_singularity_no_bind(self, basic_params):
"""Test Singularity initialization without additional binds."""
params = basic_params.copy()
- params["venv"] = "singularity"
+ params["venv"] = "apptainer"
generator = SlurmScriptGenerator(params)

assert generator.params == params
@@ -173,7 +173,6 @@ def test_generate_launch_cmd_venv(self, basic_params):
generator = SlurmScriptGenerator(basic_params)
launch_cmd = generator._generate_launch_cmd()

- assert "source /path/to/venv/bin/activate" in launch_cmd
assert "vllm serve /path/to/model_weights/test-model" in launch_cmd
assert "--served-model-name test-model" in launch_cmd
assert "--tensor-parallel-size 4" in launch_cmd
@@ -185,7 +184,7 @@ def test_generate_launch_cmd_singularity(self, singularity_params):
generator = SlurmScriptGenerator(singularity_params)
launch_cmd = generator._generate_launch_cmd()

- assert "exec --nv" in launch_cmd
+ assert "apptainer exec --nv" in launch_cmd
assert "--bind /path/to/model_weights/test-model" in launch_cmd
assert "--bind /scratch:/scratch,/data:/data" in launch_cmd
assert "source" not in launch_cmd
@@ -306,9 +305,9 @@ def batch_params(self):
def batch_singularity_params(self, batch_params):
"""Generate batch SLURM configuration parameters with Singularity."""
singularity_params = batch_params.copy()
- singularity_params["venv"] = "singularity" # Set top-level venv to singularity
+ singularity_params["venv"] = "apptainer" # Set top-level venv to apptainer
for model_name in singularity_params["models"]:
- singularity_params["models"][model_name]["venv"] = "singularity"
+ singularity_params["models"][model_name]["venv"] = "apptainer"
singularity_params["models"][model_name]["bind"] = (
"/scratch:/scratch,/data:/data"
)
@@ -341,9 +340,9 @@ def test_init_singularity(self, batch_singularity_params):
def test_init_singularity_no_bind(self, batch_params):
"""Test Singularity initialization without additional binds."""
params = batch_params.copy()
- params["venv"] = "singularity" # Set top-level venv to singularity
+ params["venv"] = "apptainer" # Set top-level venv to apptainer
for model_name in params["models"]:
- params["models"][model_name]["venv"] = "singularity"
+ params["models"][model_name]["venv"] = "apptainer"

generator = BatchSlurmScriptGenerator(params)
