
Queueing fixes #15

Merged

jwstanwick merged 6 commits into staging from queueing-fixes on Aug 5, 2025

Conversation

@jwstanwick
Collaborator

Fixed job handling so that jobs with no available active worker are cached in the queue until a worker frees up, instead of being thrown out.
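The behavior described above can be sketched roughly as follows. This is a minimal illustration, not the actual GridLLM code: the names `JobScheduler`, `dispatch`, and `onWorkerFree` are hypothetical, and the real scheduler presumably tracks models and worker capabilities as well.

```typescript
interface Job {
  id: string;
  model: string;
}

interface Worker {
  id: string;
  busy: boolean;
}

class JobScheduler {
  // Jobs that arrived while every worker was busy.
  private queue: Job[] = [];

  constructor(private workers: Worker[]) {}

  // Dispatch a job to a free worker, or cache it in the queue.
  // Returns the worker it went to, or null if the job was queued.
  dispatch(job: Job): Worker | null {
    const worker = this.workers.find((w) => !w.busy);
    if (!worker) {
      // Before the fix, the job would have been dropped here;
      // now it waits in the queue for the next free worker.
      this.queue.push(job);
      return null;
    }
    worker.busy = true;
    return worker;
  }

  // When a worker finishes, hand it the oldest queued job, if any.
  onWorkerFree(worker: Worker): Job | undefined {
    const next = this.queue.shift();
    worker.busy = next !== undefined;
    return next;
  }

  get queuedCount(): number {
    return this.queue.length;
  }
}
```

The key design point is that `dispatch` never discards a job: the queue acts as a cache between job submission and worker availability, drained in FIFO order as workers report in via `onWorkerFree`.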

@jwstanwick jwstanwick requested a review from Copilot August 5, 2025 17:27


@jwstanwick jwstanwick merged commit 37ca8c9 into staging Aug 5, 2025
1 of 2 checks passed
jwstanwick added a commit that referenced this pull request Aug 22, 2025
* changed deployment process to be exclusively for docker

* added bundle commands

* E2E CI / CD (#2)

* feat: Initialize GridLLM project with package.json, TypeScript configuration, and integration tests

- Added package.json with scripts for server and client installation, building, and Docker commands.
- Created integration tests for GridLLM, including health checks and job processing flow.
- Implemented Jest setup for integration tests with increased timeout.
- Configured TypeScript with strict settings and included necessary directories for compilation.

* refactor: Update integration tests workflow and remove legacy test files

* refactor: Update integration tests workflow to use 'docker compose' syntax

* changed launch pattern

* test

* reduced the timeout and added ollama to network

* refactor: Simplify Ollama container setup and connection to Docker Compose network

* change testing flow

* fix: update endpoint URL for Ollama API generation test (#3)

* Local dev environment fixes (#4)

* module alias fails with `npm run dev` - ilearnio/module-alias#103

* remove unused dep

* fix dev on client also

* bring back `npm start` from makefile if you want to run prod natively for whatever reason

* feat: add GitHub Action for automatic code formatting on PR approval (#5)

* Add hot swapping support for worker disconnections (#6)

* added support for hot swapping in if a client disconnects

* Apply suggestions from code review

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* undo copilot

* update action

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* added support for streaming (#7)

* added support for streaming

* Apply suggestions from code review

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Fix port conflicts in docker compose and enable /ollama/api/tags (#9)

* fix: correct port mappings and healthcheck URLs in docker-compose.yml

* fixed docker compose

* fix: update server port mapping and enhance error handling in API tags endpoint

* Added support for /api/embed and easier client connection configuration (#10)

* updated package

* changed `npm run client` to only launch in a dockerized container

* added an ollama reference to compare responses to

* embeddings is live

* removed integration tests

* updated prettier script

* added workflow_dispatch

* Update documentation (#11)

* added license. rewrote readme

* update license, readme

* updated license to MIT

* update license

* Update README.md

* Docker build image tweaks (#12)

* bump node version - CVE for 18, also EOL

* two stage build file, image size 512MB -> 210MB

* Docker changes complete

* Update .github/workflows/docker-build.yml

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update .github/workflows/docker-build.yml

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update format.yml

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: John Stanwick <48192612+jwstanwick@users.noreply.github.com>

* cleaned up the mocked endpoints (#13)

* cleaned up the mocked endpoints

* fixed compilation

* Update README.md

* Potential caching strategy improvement: local testing showed a ~40s-1m improvement on the second build; there is some debate in GitHub issues over whether this cache strategy breaks with multi-stage builds

* Queueing fixes (#15)

* Enhance job scheduling logic to handle busy workers and improve logging for job queue management

* Improve job scheduling logging for model worker status and queue management

* update format script. resolve logger errors

* update formatting

* formatting

* Added openai api support (#17)

* added /v1/completions

* added /v1/chat/completions

* integration test

* added equivalency

* update integration tests

* update integration test

* remove logprobs

* update to use /v1/chat/completions for ollama streaming chat

* update ci

* Auto-format code with prettier [skip ci]

---------

Co-authored-by: GitHub Action <action@github.com>

---------

Co-authored-by: Camp Steiner <joefakocamp@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: GitHub Action <action@github.com>