Merged
66 commits
e5089bd
Add python API for programmatic access
rohan-uiuc Mar 2, 2025
d578cc8
[pre-commit.ci] Add auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Mar 2, 2025
f311f85
Fix linting issues, initial clean-up
amrit110 Mar 11, 2025
05fb3ad
[pre-commit.ci] Add auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Mar 11, 2025
c7fd760
Merge branch 'develop' into programmatic_access
amrit110 Mar 11, 2025
8e168e8
Merge branch 'develop' into programmatic_access
amrit110 Mar 12, 2025
1190427
Merge branch 'develop' into programmatic_access
amrit110 Mar 13, 2025
ec5e356
Fix mypy errors, test
amrit110 Mar 14, 2025
f3c7fab
Merge branch 'develop' into programmatic_access
amrit110 Mar 14, 2025
8d28d7f
Move common stuff to shared package
amrit110 Mar 15, 2025
f874bcf
Improve naming of usage files
amrit110 Mar 15, 2025
09edcf4
Fix readme of examples
amrit110 Mar 15, 2025
b396b1c
Remove ModelConfig from api package
amrit110 Mar 15, 2025
bb11cdc
Small fixes, use ModelStatus, ModelType
amrit110 Mar 16, 2025
6656a3f
Refactor shared launch stuff to shared module
amrit110 Mar 16, 2025
a65035b
Add python API docs, update user guide
amrit110 Mar 16, 2025
55ee883
Merge branch 'develop' into programmatic_access
amrit110 Mar 17, 2025
9320092
[pre-commit.ci] pre-commit autoupdate
pre-commit-ci[bot] Mar 24, 2025
027e3e6
Merge pull request #79 from VectorInstitute/pre-commit-ci-update-config
XkunW Mar 24, 2025
39bc328
Bump actions/setup-python from 5.4.0 to 5.5.0
dependabot[bot] Mar 27, 2025
047f367
Merge branch 'develop' into dependabot/github_actions/actions/setup-p…
amrit110 Mar 28, 2025
38a4fa7
Merge pull request #80 from VectorInstitute/dependabot/github_actions…
amrit110 Mar 28, 2025
18d2c21
Merge branch 'programmatic_access' of https://github.com/rohan-uiuc/l…
XkunW Apr 1, 2025
967e3d0
Merge LaunchHelper and ModelLauncher, move LaunchHelper into shared h…
XkunW Apr 1, 2025
8b8ff56
Remove LaunchHelper from cli/_helper.py, ruff format and mypy fixes
XkunW Apr 1, 2025
33c7509
Mark shared files as private, move CLI helpers to shared and create c…
XkunW Apr 1, 2025
85d82c5
Decouple shared helper classes from click dependency
XkunW Apr 1, 2025
98b9c76
Move custom exceptions and global vars to dedicated files, add try ca…
XkunW Apr 2, 2025
479d031
Move json mode out of parent ListHelper class
XkunW Apr 2, 2025
26eabbb
Rename cli_kwargs to be more generic
XkunW Apr 2, 2025
2e94d4a
Move SlurmJobException to shared exceptions, ruff check/format and my…
XkunW Apr 2, 2025
56c4d40
Remove model name field for ListHelper, add additional check for bool…
XkunW Apr 2, 2025
60f4a68
Add API helpers inherited from shared helpers
XkunW Apr 2, 2025
2c2b1c7
Use ModelStatus data class for util functions
XkunW Apr 2, 2025
362b9c7
Rename helper for CLI metrics command
XkunW Apr 2, 2025
73f1196
Refactored API code to use shared helpers, moved exceptions to shared…
XkunW Apr 2, 2025
19d14b5
Fix CLI and shared utils tests
XkunW Apr 2, 2025
df256cf
Fix API tests, use pathlib instead of os.path
XkunW Apr 3, 2025
930c999
Fix relative path in test examples
XkunW Apr 3, 2025
4ca3b10
Fix import tests
XkunW Apr 3, 2025
b28eba1
[pre-commit.ci] Add auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Apr 3, 2025
c88084c
ruff fix
XkunW Apr 3, 2025
b8ad625
mypy fix for shared helper
XkunW Apr 3, 2025
1a755ba
mypy fixes for API
XkunW Apr 3, 2025
a1e3d89
mypy fix for advanced usage example
XkunW Apr 3, 2025
3e4b0f4
Merge branch 'develop' into programmatic_access
XkunW Apr 3, 2025
3e0c1e1
Use built-in type for hints
XkunW Apr 3, 2025
6bbc367
Removing the advanced usage example as it creates a new CLI by wrappi…
XkunW Apr 3, 2025
af7de98
Restructure code base, merge api and shared folder and rename to client
XkunW Apr 7, 2025
2af196a
Move post launch processing and launch logic into a single launch fun…
XkunW Apr 8, 2025
dcf8abf
Add slurm ID as class param for model launcher
XkunW Apr 8, 2025
ebd6898
Integrate status retrival code into ModelStatusMonitor class
XkunW Apr 8, 2025
988040c
Add slurm ID to model launcher params to be dumped into json
XkunW Apr 8, 2025
7722344
Refactor CLI to use client
XkunW Apr 8, 2025
cfca24e
Update tests
XkunW Apr 8, 2025
e39b995
[pre-commit.ci] Add auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Apr 8, 2025
72af8b9
Remove redundant casts, seems like my local mypy was acting up
XkunW Apr 8, 2025
b392bb0
Merge branch 'programmatic_access' of https://github.com/rohan-uiuc/l…
XkunW Apr 8, 2025
8e2f8e6
Refactoring client for CLI use
XkunW Apr 9, 2025
3e5e5ad
Fix wrong var names and data access for client, removed unnecessary t…
XkunW Apr 9, 2025
ed0a5dd
Refactor CLI logic to use client instead of inheriting client helper …
XkunW Apr 9, 2025
35b96dc
[pre-commit.ci] Add auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Apr 9, 2025
5ef4e1f
mypy fixes
XkunW Apr 9, 2025
d48afb1
Remove private imports from client, move util function only used by C…
XkunW Apr 10, 2025
1e9f8d7
[pre-commit.ci] Add auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Apr 10, 2025
9c3a166
Merge pull request #54 from rohan-uiuc/programmatic_access
jwilles Apr 10, 2025
2 changes: 1 addition & 1 deletion .github/workflows/code_checks.yml
@@ -36,7 +36,7 @@ jobs:
version: "0.5.21"
enable-cache: true
- name: "Set up Python"
uses: actions/setup-python@v5.4.0
uses: actions/setup-python@v5.5.0
with:
python-version-file: ".python-version"
- name: Install the project
2 changes: 1 addition & 1 deletion .github/workflows/docs_build.yml
@@ -33,7 +33,7 @@ jobs:
enable-cache: true

- name: "Set up Python"
uses: actions/setup-python@8039c45ed9a312fba91f3399cd0605ba2ebfe93c
uses: actions/setup-python@8d9ed9ac5c53483de85588cdf95a591a75ab9f55
with:
python-version-file: ".python-version"

2 changes: 1 addition & 1 deletion .github/workflows/docs_deploy.yml
@@ -38,7 +38,7 @@ jobs:
enable-cache: true

- name: "Set up Python"
uses: actions/setup-python@8039c45ed9a312fba91f3399cd0605ba2ebfe93c
uses: actions/setup-python@8d9ed9ac5c53483de85588cdf95a591a75ab9f55
with:
python-version-file: ".python-version"

2 changes: 1 addition & 1 deletion .github/workflows/publish.yml
@@ -21,7 +21,7 @@ jobs:
version: "0.6.6"
enable-cache: true

- uses: actions/setup-python@v5.4.0
- uses: actions/setup-python@v5.5.0
with:
python-version: '3.10'

2 changes: 1 addition & 1 deletion .github/workflows/unit_tests.yml
@@ -53,7 +53,7 @@ jobs:
enable-cache: true

- name: "Set up Python ${{ matrix.python-version }}"
uses: actions/setup-python@v5.4.0
uses: actions/setup-python@v5.5.0
with:
python-version: ${{ matrix.python-version }}

2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
@@ -16,7 +16,7 @@ repos:
- id: check-toml

- repo: https://github.com/astral-sh/ruff-pre-commit
rev: 'v0.11.0'
rev: 'v0.11.2'
hooks:
- id: ruff
args: [--fix, --exit-non-zero-on-fix]
2 changes: 1 addition & 1 deletion README.md
@@ -8,7 +8,7 @@
[![codecov](https://codecov.io/github/VectorInstitute/vector-inference/branch/develop/graph/badge.svg?token=NI88QSIGAC)](https://app.codecov.io/github/VectorInstitute/vector-inference/tree/develop)
![GitHub License](https://img.shields.io/github/license/VectorInstitute/vector-inference)

This repository provides an easy-to-use solution to run inference servers on [Slurm](https://slurm.schedmd.com/overview.html)-managed computing clusters using [vLLM](https://docs.vllm.ai/en/latest/). **All scripts in this repository runs natively on the Vector Institute cluster environment**. To adapt to other environments, update the environment variables in [`cli/_helper.py`](vec_inf/cli/_helper.py), [`cli/_config.py`](vec_inf/cli/_config.py), [`vllm.slurm`](vec_inf/vllm.slurm), [`multinode_vllm.slurm`](vec_inf/multinode_vllm.slurm) and [`models.yaml`](vec_inf/config/models.yaml) accordingly.
This repository provides an easy-to-use solution to run inference servers on [Slurm](https://slurm.schedmd.com/overview.html)-managed computing clusters using [vLLM](https://docs.vllm.ai/en/latest/). **All scripts in this repository run natively on the Vector Institute cluster environment**. To adapt to other environments, update the environment variables in [`shared/utils.py`](vec_inf/shared/utils.py), [`shared/config.py`](vec_inf/shared/config.py), [`vllm.slurm`](vec_inf/vllm.slurm), [`multinode_vllm.slurm`](vec_inf/multinode_vllm.slurm) and [`models.yaml`](vec_inf/config/models.yaml) accordingly.

## Installation
If you are using the Vector cluster environment, and you don't need any customization to the inference server environment, run the following to install package:
13 changes: 10 additions & 3 deletions docs/source/conf.py
@@ -7,7 +7,6 @@

import os
import sys
from typing import List


sys.path.insert(0, os.path.abspath("../../vec_inf"))
@@ -51,8 +50,16 @@
copybutton_prompt_text = r">>> |\.\.\. "
copybutton_prompt_is_regexp = True

apidoc_module_dir = "../../vec_inf"
apidoc_excluded_paths = ["tests", "cli", "shared"]
exclude_patterns = ["reference/api/vec_inf.rst"]
apidoc_output_dir = "reference/api"
apidoc_separate_modules = True
apidoc_extra_args = ["-f", "-M", "-T", "--implicit-namespaces"]
suppress_warnings = ["ref.python"]

intersphinx_mapping = {
"python": ("https://docs.python.org/3.9/", None),
"python": ("https://docs.python.org/3.10/", None),
}

# Add any paths that contain templates here, relative to this directory.
@@ -61,7 +68,7 @@
# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# This pattern also affects html_static_path and html_extra_path.
exclude_patterns: List[str] = []
exclude_patterns = ["reference/api/vec_inf.rst"]

# -- Options for Markdown files ----------------------------------------------
#
3 changes: 2 additions & 1 deletion docs/source/index.md
@@ -8,10 +8,11 @@ hide-toc: true
:hidden:
user_guide
reference/api/index
```

This repository provides an easy-to-use solution to run inference servers on [Slurm](https://slurm.schedmd.com/overview.html)-managed computing clusters using [vLLM](https://docs.vllm.ai/en/latest/). **All scripts in this repository runs natively on the Vector Institute cluster environment**. To adapt to other environments, update the environment variables in [`cli/_helper.py`](https://github.com/VectorInstitute/vector-inference/blob/main/vec_inf/cli/_helper.py), [`cli/_config.py`](https://github.com/VectorInstitute/vector-inference/blob/main/vec_inf/cli/_config_.py), [`vllm.slurm`](https://github.com/VectorInstitute/vector-inference/blob/main/vec_inf/vllm.slurm), [`multinode_vllm.slurm`](https://github.com/VectorInstitute/vector-inference/blob/main/vec_inf/multinode_vllm.slurm), and model configurations in [`models.yaml`](https://github.com/VectorInstitute/vector-inference/blob/main/vec_inf/config/models.yaml) accordingly.
This repository provides an easy-to-use solution to run inference servers on [Slurm](https://slurm.schedmd.com/overview.html)-managed computing clusters using [vLLM](https://docs.vllm.ai/en/latest/). **All scripts in this repository run natively on the Vector Institute cluster environment**. To adapt to other environments, update the environment variables in [`shared/utils.py`](https://github.com/VectorInstitute/vector-inference/blob/main/vec_inf/shared/utils.py), [`shared/config.py`](https://github.com/VectorInstitute/vector-inference/blob/main/vec_inf/shared/config.py), [`vllm.slurm`](https://github.com/VectorInstitute/vector-inference/blob/main/vec_inf/vllm.slurm), [`multinode_vllm.slurm`](https://github.com/VectorInstitute/vector-inference/blob/main/vec_inf/multinode_vllm.slurm), and model configurations in [`models.yaml`](https://github.com/VectorInstitute/vector-inference/blob/main/vec_inf/config/models.yaml) accordingly.

## Installation

9 changes: 9 additions & 0 deletions docs/source/reference/api/index.rst
@@ -0,0 +1,9 @@
Python API
==========

This section documents the Python API for the `vec_inf` package.

.. toctree::
   :maxdepth: 4

   vec_inf.api
7 changes: 7 additions & 0 deletions docs/source/reference/api/vec_inf.api.client.rst
@@ -0,0 +1,7 @@
vec\_inf.api.client module
==========================

.. automodule:: vec_inf.api.client
   :members:
   :undoc-members:
   :show-inheritance:
7 changes: 7 additions & 0 deletions docs/source/reference/api/vec_inf.api.models.rst
@@ -0,0 +1,7 @@
vec\_inf.api.models module
==========================

.. automodule:: vec_inf.api.models
   :members:
   :undoc-members:
   :show-inheritance:
17 changes: 17 additions & 0 deletions docs/source/reference/api/vec_inf.api.rst
@@ -0,0 +1,17 @@
vec\_inf.api package
====================

.. automodule:: vec_inf.api
   :members:
   :undoc-members:
   :show-inheritance:

Submodules
----------

.. toctree::
   :maxdepth: 4

   vec_inf.api.client
   vec_inf.api.models
   vec_inf.api.utils
7 changes: 7 additions & 0 deletions docs/source/reference/api/vec_inf.api.utils.rst
@@ -0,0 +1,7 @@
vec\_inf.api.utils module
=========================

.. automodule:: vec_inf.api.utils
   :members:
   :undoc-members:
   :show-inheritance:
15 changes: 15 additions & 0 deletions docs/source/reference/api/vec_inf.rst
@@ -0,0 +1,15 @@
vec\_inf package
================

.. automodule:: vec_inf
   :members:
   :undoc-members:
   :show-inheritance:

Subpackages
-----------

.. toctree::
   :maxdepth: 4

   vec_inf.api
13 changes: 10 additions & 3 deletions docs/source/user_guide.md
@@ -1,6 +1,6 @@
# User Guide

## Usage
## CLI Usage

### `launch` command

@@ -17,7 +17,7 @@ You should see an output like the following:

#### Overrides

Models that are already supported by `vec-inf` would be launched using the [default parameters](vec_inf/config/models.yaml). You can override these values by providing additional parameters. Use `vec-inf launch --help` to see the full list of parameters that can be overriden. For example, if `qos` is to be overriden:
Models that are already supported by `vec-inf` are launched using the [default parameters](https://github.com/VectorInstitute/vector-inference/blob/main/vec_inf/config/models.yaml). You can override these values by providing additional parameters. Use `vec-inf launch --help` to see the full list of parameters that can be overridden. For example, to override `qos`:

```bash
vec-inf launch Meta-Llama-3.1-8B-Instruct --qos <new_qos>
@@ -29,7 +29,7 @@ You can also launch your own custom model as long as the model architecture is [
* Your model weights directory naming convention should follow `$MODEL_FAMILY-$MODEL_VARIANT` ($MODEL_VARIANT is OPTIONAL).
* Your model weights directory should contain HuggingFace format weights.
* You should specify your model configuration by:
* Creating a custom configuration file for your model and specify its path via setting the environment variable `VEC_INF_CONFIG`. Check the [default parameters](vec_inf/config/models.yaml) file for the format of the config file. All the parameters for the model should be specified in that config file.
* Creating a custom configuration file for your model and specifying its path by setting the environment variable `VEC_INF_CONFIG`. Check the [default parameters](https://github.com/VectorInstitute/vector-inference/blob/main/vec_inf/config/models.yaml) file for the format of the config file. All the parameters for the model should be specified in that config file.
* Using launch command options to specify your model setup.
* For other model launch parameters you can reference the default values for similar models using the [`list` command](#list-command).

@@ -179,3 +179,10 @@ If you want to run inference from your local device, you can open a SSH tunnel t
ssh -L 8081:172.17.8.29:8081 username@v.vectorinstitute.ai -N
```
Here the last number in the IP address is the GPU node number (gpu029 in this case). The example above is for the Vector cluster; change the values accordingly for your environment.
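
The tunnel command above follows a fixed pattern, so it can be assembled from its parts. The sketch below is illustrative only and is not part of `vec_inf`: `tunnel_command` is a hypothetical helper, and it assumes the same local and remote port, as in the example above.

```python
import ipaddress


def tunnel_command(node_ip: str, port: int, user: str, login_host: str) -> list[str]:
    """Build the argument list for an SSH tunnel to a compute node.

    Hypothetical helper: forwards local ``port`` to ``node_ip:port`` through
    ``login_host``, mirroring the command shown in the user guide.
    """
    # Fail early with ValueError if the node address is malformed.
    ipaddress.ip_address(node_ip)
    return ["ssh", "-L", f"{port}:{node_ip}:{port}", f"{user}@{login_host}", "-N"]


if __name__ == "__main__":
    print(" ".join(tunnel_command("172.17.8.29", 8081, "username", "v.vectorinstitute.ai")))
```

The list form can be passed directly to `subprocess.run` without going through a shell.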

## Python API Usage

You can also use the `vec_inf` Python API to launch and manage inference servers.

Check out the [Python API documentation](reference/api/index) for more details. There
are also Python API usage examples in the [`examples`](https://github.com/VectorInstitute/vector-inference/tree/develop/examples/api) folder.
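
As a rough illustration of how the API might be used in a script, the sketch below wraps the launch, wait, and shutdown calls so that the Slurm job is shut down even if the work in between fails. It is not part of the package: `serve_and_run` is a hypothetical helper, and it assumes only the client methods and attributes that appear in `examples/api/basic_usage.py` (`launch_model`, `wait_until_ready`, `shutdown_model`, `slurm_job_id`, `base_url`).

```python
from typing import Any, Callable


def serve_and_run(client: Any, model_name: str, work: Callable[[str], Any]) -> Any:
    """Launch a model, run ``work(base_url)``, and always shut the job down.

    Hypothetical helper. ``client`` is assumed to expose the methods used in
    examples/api/basic_usage.py.
    """
    response = client.launch_model(model_name)
    job_id = response.slurm_job_id
    try:
        status = client.wait_until_ready(job_id)
        return work(status.base_url)
    finally:
        # Runs on success and on failure, so the Slurm job is not left behind.
        client.shutdown_model(job_id)
```

With the real client this would be called as `serve_and_run(VecInfClient(), "Meta-Llama-3.1-8B-Instruct", my_function)`, where `my_function` is your own callable that receives the server's base URL.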
3 changes: 3 additions & 0 deletions examples/README.md
@@ -7,3 +7,6 @@
  - [`vlm/vision_completions.py`](inference/vlm/vision_completions.py): Python example of sending chat completion requests with image attached to prompt to OpenAI compatible server for vision language models
- [`logits`](logits): Example for logits generation
  - [`logits.py`](logits/logits.py): Python example of getting logits from hosted model.
- [`api`](api): Examples for using the Python API
  - [`basic_usage.py`](api/basic_usage.py): Basic Python example demonstrating the Vector Inference API
  - [`advanced_usage.py`](api/advanced_usage.py): Advanced Python example with rich UI for the Vector Inference API
43 changes: 43 additions & 0 deletions examples/api/basic_usage.py
@@ -0,0 +1,43 @@
#!/usr/bin/env python
"""Basic example of Vector Inference API usage.
This script demonstrates the core features of the Vector Inference API
for launching and interacting with models.
"""

from vec_inf.client import VecInfClient


# Create the API client
client = VecInfClient()

# List available models
print("Listing available models...")
models = client.list_models()
print(f"Found {len(models)} models")
for model in models[:3]:  # Show just the first few
    print(f"- {model.name} ({model.type})")

# Launch a model (replace with an actual model name from your environment)
model_name = "Meta-Llama-3.1-8B-Instruct" # Use an available model from your list
print(f"\nLaunching {model_name}...")
response = client.launch_model(model_name)
job_id = response.slurm_job_id
print(f"Launched with job ID: {job_id}")

# Wait for the model to be ready
print("Waiting for model to be ready...")
status = client.wait_until_ready(job_id)
print(f"Model is ready at: {status.base_url}")

# Get metrics
print("\nRetrieving metrics...")
metrics = client.get_metrics(job_id)
if isinstance(metrics.metrics, dict):
    for key, value in metrics.metrics.items():
        print(f"- {key}: {value}")

# Shutdown when done
print("\nShutting down model...")
client.shutdown_model(job_id)
print("Model shutdown complete")
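
The `isinstance` check in the metrics step suggests that `metrics.metrics` is not always a dictionary. A minimal sketch of handling both cases in one place follows; `format_metrics` is a hypothetical helper, and treating a non-dict value as a status message is an assumption rather than documented behaviour.

```python
from typing import Any


def format_metrics(metrics: Any) -> list[str]:
    """Return printable lines for a metrics payload.

    Hypothetical helper: a dict (assumed to have string keys) is rendered
    one "- key: value" line per entry, sorted by key; anything else is
    assumed to be a message explaining why metrics are unavailable.
    """
    if isinstance(metrics, dict):
        return [f"- {key}: {value}" for key, value in sorted(metrics.items())]
    return [f"metrics unavailable: {metrics}"]
```

In the example script this would replace the inner loop with `print("\n".join(format_metrics(metrics.metrics)))`.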
4 changes: 4 additions & 0 deletions pyproject.toml
@@ -19,6 +19,7 @@ dev = [
"codecov>=2.1.13",
"mypy>=1.15.0",
"nbqa>=1.9.1",
"openai>=1.65.1",
"pip-audit>=2.8.0",
"pre-commit>=4.1.0",
"pytest>=8.3.4",
@@ -59,6 +60,9 @@ vec-inf = "vec_inf.cli._cli:cli"
requires = ["hatchling"]
build-backend = "hatchling.build"

[tool.hatch.build.targets.wheel]
packages = ["vec_inf"]

[tool.mypy]
ignore_missing_imports = true
install_types = true
22 changes: 17 additions & 5 deletions tests/test_imports.py
@@ -2,16 +2,28 @@

import unittest

import pytest


class TestVecInfImports(unittest.TestCase):
"""Test the imports of the vec_inf package."""

def test_import_cli_modules(self):
"""Test the imports of the vec_inf.cli modules."""
def test_imports(self):
"""Test that all modules can be imported."""
try:
# CLI imports
import vec_inf.cli
import vec_inf.cli._cli
import vec_inf.cli._config
import vec_inf.cli._helper
import vec_inf.cli._utils # noqa: F401

# Client imports
import vec_inf.client
import vec_inf.client._config
import vec_inf.client._exceptions
import vec_inf.client._helper
import vec_inf.client._models
import vec_inf.client._utils
import vec_inf.client._vars # noqa: F401

except ImportError as e:
self.fail(f"Import failed: {e}")
pytest.fail(f"Import failed: {e}")
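
A single `try` block stops at the first failing import, so a later broken module stays hidden until the first one is fixed. One possible alternative, sketched here under the assumption that reporting every failure is desirable, is to import each module by name and collect the errors. `find_import_failures` is a hypothetical helper and is not part of this PR.

```python
import importlib


def find_import_failures(module_names: list[str]) -> dict[str, str]:
    """Try to import each module and return {module_name: error} for failures."""
    failures: dict[str, str] = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except ImportError as exc:  # also covers ModuleNotFoundError
            failures[name] = str(exc)
    return failures
```

A test could then assert that `find_import_failures([...])` returns an empty dict for the list of `vec_inf.cli` and `vec_inf.client` modules, and the failure message would name every module that did not import.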