[refactor][CGS][entrance] move smart queue selection logic to entrance layer via rpc #990
Merged
casionone merged 21 commits into dev-1.18.2-webank from Apr 17, 2026
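…eption (#964)
* #AI commit# Development phase: fix bug where retrying an sr task caused an exception when loading init_sql
* #AI commit# Development phase: fix bug where retrying an sr task caused an exception when loading init_sql
* #AI commit# Development phase: fix bug where retrying an sr task caused an exception when loading init_sql
* #AI commit# Fix:
* Expand the coverage of the task retry switch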
…t queue selection
- Translate all Chinese log messages to English for consistency
- Update comments and documentation to English
- No functional changes, only log message translation
… in smart queue selection" This reverts commit 47fb4e6.
Add permission validation before using secondary queue to prevent task submission failures:
Features:
- Add configuration SECONDARY_QUEUE_PERMISSION_CHECK_ENABLED to enable/disable permission check
- Add configuration SECONDARY_QUEUE_ALLOWED_USERS to configure user whitelist
- Modify performSmartQueueSelection method to accept user parameter
- Add checkQueuePermission method to validate user access to secondary queue
- If user has no permission, log warning and fallback to primary queue
- Prevents task submission failures due to insufficient queue permissions
Configuration:
- wds.linkis.rm.secondary.yarnqueue.permission.check.enable (default: false)
- wds.linkis.rm.secondary.yarnqueue.allowed.users (default: empty)
…econdary queue
Replace configuration-based whitelist with actual Yarn permission verification:
Changes:
- Remove configuration items SECONDARY_QUEUE_PERMISSION_CHECK_ENABLED and SECONDARY_QUEUE_ALLOWED_USERS
- Rewrite checkQueuePermission method to use Yarn API for real permission validation
- Query Yarn app info via externalResourceService.getAppInfo to verify user access
- Detect permission errors (403/404/forbidden/unauthorized) and fallback to primary queue
- Handle transient errors gracefully to avoid blocking legitimate users
Permission Check Logic:
1. Try to get app info from target queue using Yarn REST API
2. If successful (even with empty app list) → user has permission
3. If permission error (403/404) → log warning and return false
4. If other error (network/timeout) → assume OK to avoid blocking
…ck for secondary queue" This reverts commit f91be62.
…lection" This reverts commit 08dad25.
# Conflicts: # linkis-computation-governance/linkis-manager/linkis-application-manager/src/main/scala/org/apache/linkis/manager/am/service/engine/DefaultEngineCreateService.scala
….0-secondary-queue # Conflicts: # linkis-computation-governance/linkis-manager/linkis-application-manager/src/main/scala/org/apache/linkis/manager/am/service/engine/DefaultEngineCreateService.scala
What is the purpose of the change
Background/Problem:
The smart queue selection logic is currently embedded in the LinkisManager engine creation flow. This tightly couples queue selection to engine creation, making both harder to maintain and extend. In addition, the feature toggle, engine type filtering, and creator filtering are implemented at the LinkisManager layer, which is not the appropriate architectural layer for these concerns.
Purpose of Change:
To address this architectural issue, this PR refactors the smart queue selection feature by moving the logic from LinkisManager to the Entrance layer through an RPC-based approach. The Entrance layer now handles the feature toggle, engine type filtering, and creator filtering via a new interceptor, while LinkisManager exposes the queue selection decision as an RPC service.
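The split described above can be illustrated with a minimal Java sketch. It is not the code in this PR: `SmartQueueSketch`, `QueueDecisionClient`, `SmartQueueInterceptor`, and the `"queue"` parameter key are hypothetical names, the RPC call is reduced to a plain interface, and falling back to the original queue when the RPC fails is an assumption.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch; none of these names are taken from the Linkis codebase. */
public final class SmartQueueSketch {

    /** Hypothetical key under which a job carries its Yarn queue. */
    public static final String QUEUE_KEY = "queue";

    /** Stand-in for the RPC call that asks LinkisManager which queue to use. */
    public interface QueueDecisionClient {
        String selectQueue(String user, String primaryQueue);
    }

    /** Entrance-side interceptor: feature toggle and filters first, RPC second. */
    public static final class SmartQueueInterceptor {
        private final boolean enabled;
        private final Set<String> engineTypes;
        private final Set<String> creators;
        private final QueueDecisionClient client;

        public SmartQueueInterceptor(boolean enabled, Set<String> engineTypes,
                                     Set<String> creators, QueueDecisionClient client) {
            this.enabled = enabled;
            this.engineTypes = engineTypes;
            this.creators = creators;
            this.client = client;
        }

        /** Returns the job params, with the queue replaced only when every gate passes. */
        public Map<String, String> apply(String user, String engineType, String creator,
                                         Map<String, String> params) {
            // Traffic control stays in the Entrance layer: ineligible jobs never trigger the RPC.
            if (!enabled || !engineTypes.contains(engineType) || !creators.contains(creator)) {
                return params;
            }
            String primary = params.get(QUEUE_KEY);
            if (primary == null) {
                return params;
            }
            String chosen;
            try {
                // The resource-based decision is delegated to the manager side.
                chosen = client.selectQueue(user, primary);
            } catch (RuntimeException e) {
                // Assumption: an RPC failure must not block submission, so keep the original queue.
                return params;
            }
            if (chosen == null || chosen.equals(primary)) {
                return params;
            }
            Map<String, String> updated = new HashMap<>(params);
            updated.put(QUEUE_KEY, chosen);
            return updated;
        }
    }
}
```

In this shape the manager-side service only answers "which queue", so a new selection strategy would change the `QueueDecisionClient` implementation without touching the gating logic.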
Value/Impact:
This refactoring improves code maintainability and separation of concerns. The Entrance layer is now responsible for traffic control (feature toggle, filtering), while LinkisManager focuses on resource-based decision making. This makes the system more modular and easier to extend with additional queue selection strategies in the future.
Related issues/PRs
Related issues: close apache#5415
Related PR: none
Brief change log
Checklist