feat: add BaseInfer ABC, input size limits, and payload size middleware by alez007 · Pull Request #10 · alez007/modelship

alez007 · 2026-04-07T15:58:17Z

Introduce BaseInfer abstract base class that all inference backends (vLLM, Transformers, Diffusers, Custom) now inherit from. This unifies the interface, eliminates duplicated "not supported" boilerplate, and adds per-model max_context_length detection in every loader.
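The idea of a shared base class that centralizes the "not supported" boilerplate can be sketched as follows. This is a minimal illustration, not the PR's actual code: the method names `load`, `chat`, and `embed`, the `UnsupportedOperationError` type, and the `EchoInfer` toy backend are all assumptions made for the example.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from typing import Optional


class UnsupportedOperationError(NotImplementedError):
    """Hypothetical error raised when a backend lacks an optional capability."""


class BaseInfer(ABC):
    """Hypothetical shape of a shared backend interface."""

    def __init__(self, model_name: str) -> None:
        self.model_name = model_name
        self._max_context_length: Optional[int] = None

    @abstractmethod
    def load(self) -> None:
        """Load the model; loaders are expected to record the context limit."""

    # Optional capabilities share one default implementation, so a backend
    # only overrides the operations it really supports.
    def chat(self, prompt: str) -> str:
        raise UnsupportedOperationError(f"{type(self).__name__} does not support chat")

    def embed(self, text: str) -> list[float]:
        raise UnsupportedOperationError(f"{type(self).__name__} does not support embed")

    def max_context_length(self) -> Optional[int]:
        # None means the limit has not been detected (yet).
        return self._max_context_length


class EchoInfer(BaseInfer):
    """Toy backend used only to exercise the base class."""

    def load(self) -> None:
        self._max_context_length = 4096  # assumed value for illustration

    def chat(self, prompt: str) -> str:
        return prompt.upper()
```

With this layout, a backend that supports only chat needs no code for the embedding case; calling `embed` on it raises the shared error.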

Add PayloadSizeLimitMiddleware to the gateway (YASHA_MAX_REQUEST_BODY_BYTES, default 50 MB) as a coarse safety net against oversized requests.
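A coarse body-size guard of this kind could look roughly like the sketch below, written as plain ASGI middleware using only the standard library. It is an assumption about the approach, not the gateway's real implementation: it only inspects the declared `Content-Length` header, so a chunked upload without that header would pass through. The `ok_app` and `status_for` helpers exist purely to demonstrate the behavior.

```python
import asyncio
import os

DEFAULT_MAX_BODY_BYTES = 50 * 1024 * 1024  # 50 MB, the default named in the description


class PayloadSizeLimitMiddleware:
    """Sketch: reject HTTP requests whose declared body size exceeds a limit."""

    def __init__(self, app, max_body_bytes=None):
        self.app = app
        if max_body_bytes is None:
            # Assumed behavior: the env var overrides the built-in default.
            max_body_bytes = int(
                os.environ.get("YASHA_MAX_REQUEST_BODY_BYTES", DEFAULT_MAX_BODY_BYTES)
            )
        self.max_body_bytes = max_body_bytes

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return
        headers = dict(scope.get("headers", []))
        declared = headers.get(b"content-length")
        if declared is not None and declared.isdigit() and int(declared) > self.max_body_bytes:
            await self._reject(send)
            return
        await self.app(scope, receive, send)

    async def _reject(self, send):
        body = b'{"error": "request body too large"}'
        await send({
            "type": "http.response.start",
            "status": 413,
            "headers": [
                (b"content-type", b"application/json"),
                (b"content-length", str(len(body)).encode()),
            ],
        })
        await send({"type": "http.response.body", "body": body})


async def ok_app(scope, receive, send):
    """Stand-in downstream app that always answers 200."""
    await send({"type": "http.response.start", "status": 200, "headers": []})
    await send({"type": "http.response.body", "body": b"ok"})


def status_for(content_length, limit):
    """Run one fake request through the middleware and return the status code."""
    sent = []

    async def send(message):
        sent.append(message)

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    scope = {"type": "http", "headers": [(b"content-length", str(content_length).encode())]}
    asyncio.run(PayloadSizeLimitMiddleware(ok_app, max_body_bytes=limit)(scope, receive, send))
    return sent[0]["status"]
```

Checking only the header keeps the guard cheap, which fits its role as a safety net; the per-model context limits remain the finer-grained check.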

Add max_context_length() to all plugin base classes so plugins can report their model's context limit.
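One way a caller might consume the reported limit is sketched below. The `BasePlugin` and `FixedWindowPlugin` classes and the `fits_context` helper are hypothetical names for illustration; the description does not show the real plugin base classes or say how a missing limit is represented, so `None` meaning "no known limit" is an assumption.

```python
from typing import Optional


class BasePlugin:
    """Hypothetical plugin base class."""

    def max_context_length(self) -> Optional[int]:
        # Assumed default: the plugin reports no known limit.
        return None


class FixedWindowPlugin(BasePlugin):
    """Toy plugin whose model accepts at most 512 input units."""

    def max_context_length(self) -> Optional[int]:
        return 512


def fits_context(plugin: BasePlugin, input_length: int) -> bool:
    """Return True when the input is within the plugin's reported limit."""
    limit = plugin.max_context_length()
    return limit is None or input_length <= limit
```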


…ion and context length tracking

- Remove use_gpu (int/str) config and CUDA_VISIBLE_DEVICES pinning; Ray handles GPU scheduling via num_gpus fractions
- Add BaseInfer._get_memory_fraction(), used by vllm, diffusers, transformers
- Add BaseInfer._set_max_context_length() for per-model context tracking
- Remove vllm speech serving (TTS requires loader=custom with plugin)
- Add CUDA_DEVICE_ORDER=PCI_BUS_ID to Dockerfile
- Add lint-fix Makefile target
- Update docs to reflect simplified GPU allocation
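To illustrate how a fractional `num_gpus` reservation could translate into a per-process memory budget, here is a small sketch. The mapping is an assumption, not the logic of `BaseInfer._get_memory_fraction()`: it simply treats a share below one GPU as the memory fraction and caps whole-GPU reservations at 1.0. The function name `memory_fraction_for` is hypothetical.

```python
def memory_fraction_for(num_gpus: float) -> float:
    """Map a Ray-style num_gpus reservation to a memory fraction in (0, 1].

    Assumption for illustration: a reservation of 0.25 GPUs means the process
    may use a quarter of one device's memory; one or more GPUs means all of it.
    """
    if num_gpus <= 0:
        raise ValueError("num_gpus must be positive to run on a GPU")
    return min(float(num_gpus), 1.0)
```

A backend could then pass the result to whatever knob it exposes for limiting device memory, for example vLLM's `gpu_memory_utilization` engine argument.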

Alex M added 2 commits April 7, 2026 15:57

alez007 merged commit d89d166 into main Apr 9, 2026
2 checks passed

alez007 merged 2 commits into main from feat/input-size-limits
