[AINode] Decoupling inference manager into request_manager, pool_manager #16131

Merged
CRZbulabula merged 6 commits into apache:master from yunbow30944:decoupling_inference_manager on Aug 19, 2025

@yunbow30944
Contributor

Description

  • Add pool states and support state transitions
  • Decouple the inference manager into request_manager, pool_manager, and pool_scheduler
  • Add the PoolGroup structure
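
For readers unfamiliar with the pattern, pool state with guarded state transitions might look roughly like the sketch below. This is an illustration only: the state names (INITIALIZING, RUNNING, STOPPING), the transition table, and the `Pool` class are assumptions, not the definitions in this PR.

```python
from enum import Enum


class PoolState(Enum):
    # Hypothetical state names; the PR text does not list the real ones.
    INITIALIZING = "initializing"
    RUNNING = "running"
    STOPPING = "stopping"


# Hypothetical transition table: the states each state may move to.
ALLOWED_TRANSITIONS = {
    PoolState.INITIALIZING: {PoolState.RUNNING, PoolState.STOPPING},
    PoolState.RUNNING: {PoolState.STOPPING},
    PoolState.STOPPING: set(),
}


class Pool:
    """Illustrative stand-in for an inference request pool."""

    def __init__(self, pool_id: int):
        self.pool_id = pool_id
        self.state = PoolState.INITIALIZING

    def transition_to(self, new_state: PoolState) -> None:
        # Reject any transition that the table does not allow.
        if new_state not in ALLOWED_TRANSITIONS[self.state]:
            raise ValueError(
                f"pool {self.pool_id}: {self.state.name} -> {new_state.name} not allowed"
            )
        self.state = new_state
```

Keeping the allowed transitions in one table makes invalid state changes fail loudly instead of silently corrupting pool bookkeeping.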

@CRZbulabula CRZbulabula requested a review from Copilot August 9, 2025 02:30
Contributor

Copilot AI left a comment

Pull Request Overview

This PR implements a major architectural refactoring that decouples the monolithic inference manager into separate, specialized components. The refactoring aims to improve maintainability and separation of concerns by creating distinct managers for different aspects of inference processing.

Key changes:

  • Introduces a new RequestManager to handle request lifecycle and result processing
  • Adds a PoolManager to manage inference request pools and their states
  • Creates a PoolScheduler to handle pool initialization and expansion logic
  • Adds pool state tracking with a new PoolState enum and PoolGroup structure
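
The division of responsibilities listed above could be sketched as follows. Every class body, method name, and field here is a hypothetical simplification (the real pools run in separate workers; plain lists stand in for them), so treat it as a picture of the delegation pattern rather than the PR's code.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class InferenceRequest:
    req_id: str
    model_id: str  # used to route the request to the right pool
    inputs: List[float]


class PoolManager:
    """Hypothetical: registers pools per model and dispatches requests."""

    def __init__(self) -> None:
        self._queues: Dict[str, List[InferenceRequest]] = {}

    def has_pool(self, model_id: str) -> bool:
        return model_id in self._queues

    def register_pool(self, model_id: str) -> None:
        self._queues.setdefault(model_id, [])

    def dispatch(self, request: InferenceRequest) -> None:
        self._queues[request.model_id].append(request)

    def pending(self, model_id: str) -> int:
        return len(self._queues.get(model_id, []))


class PoolScheduler:
    """Hypothetical: decides when a pool must be created."""

    def __init__(self, pool_manager: PoolManager) -> None:
        self._pool_manager = pool_manager

    def ensure_pool(self, model_id: str) -> None:
        if not self._pool_manager.has_pool(model_id):
            self._pool_manager.register_pool(model_id)


class RequestManager:
    """Hypothetical: tracks request lifecycle and stores results."""

    def __init__(self) -> None:
        self._results: Dict[str, Any] = {}

    def track(self, request: InferenceRequest) -> None:
        self._results[request.req_id] = None  # None means "not finished yet"

    def complete(self, req_id: str, result: Any) -> None:
        self._results[req_id] = result

    def result(self, req_id: str) -> Any:
        return self._results[req_id]


class InferenceManager:
    """Thin facade that delegates to the three components."""

    def __init__(self) -> None:
        self.pool_manager = PoolManager()
        self.pool_scheduler = PoolScheduler(self.pool_manager)
        self.request_manager = RequestManager()

    def submit(self, request: InferenceRequest) -> None:
        self.pool_scheduler.ensure_pool(request.model_id)
        self.request_manager.track(request)
        self.pool_manager.dispatch(request)
```

With this split, the facade only orchestrates: pool creation, request bookkeeping, and dispatch can each be tested and changed independently.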

Reviewed Changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| inference_manager.py | Refactored to delegate pool and request management to specialized components |
| request_manager.py | New component managing request lifecycle and result handling |
| pool_scheduler.py | New component handling pool initialization and expansion logic |
| pool_manager.py | New component managing pool registration, state tracking, and request dispatch |
| inference_request_pool_group.py | New data structure grouping pools by model ID |
| inference_request_pool.py | Added PoolState enum for state management |
| inference_request.py | Added model_id field to support request routing |
| decorator.py | Added synchronized decorator for thread safety |
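
As a rough illustration of two entries in this summary, a lock-based `synchronized` decorator and a per-model pool group might be combined as below. The decorator signature, the shared module-level lock, and the `PoolGroup` fields are all assumptions made for this sketch; the versions in decorator.py and inference_request_pool_group.py may differ.

```python
import threading
from functools import wraps
from typing import List


def synchronized(lock):
    """Run the decorated function while holding the given lock (assumed design)."""

    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            with lock:
                return func(*args, **kwargs)

        return wrapper

    return decorator


# Hypothetical: a single lock shared by all groups, chosen for simplicity.
_group_lock = threading.RLock()


class PoolGroup:
    """Hypothetical container for all pools serving one model ID."""

    def __init__(self, model_id: str) -> None:
        self.model_id = model_id
        self.pool_ids: List[int] = []

    @synchronized(_group_lock)
    def add_pool(self, pool_id: int) -> None:
        # The check and the append happen atomically under the lock,
        # so concurrent callers cannot register the same pool twice.
        if pool_id not in self.pool_ids:
            self.pool_ids.append(pool_id)
```

Without the lock, two threads could both pass the membership check before either appends, leaving a duplicate pool ID in the group.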

Contributor

@CRZbulabula CRZbulabula left a comment

PTAL

@yunbow30944 yunbow30944 force-pushed the decoupling_inference_manager branch from 44cee1d to 07292e4 Compare August 11, 2025 04:31
@yunbow30944 yunbow30944 force-pushed the decoupling_inference_manager branch from 97f6bc7 to 9a4b10b Compare August 18, 2025 04:00
Contributor

@CRZbulabula CRZbulabula left a comment

LGTM

@CRZbulabula CRZbulabula merged commit e271df7 into apache:master Aug 19, 2025
27 of 28 checks passed
@yunbow30944 yunbow30944 deleted the decoupling_inference_manager branch August 20, 2025 01:48