Member

@fregataa fregataa commented Mar 11, 2025

resolves #3936 (BA-942)

How to test

  1. Run this command to inject data before testing the API.
    backend.ai mgr etcd put "config/plugins/accelerator/cuda/quantum_size" 1.0

  2. Run the new GQL query.

query ScalingGroup {
    accessible_scaling_groups(
        project_id: "PROJECT-ID"
    ) {
        name
        is_active
        accelerator_quantum_size
    }
}
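
For reference, a minimal Python sketch of how a client might wrap the query above in a standard GraphQL JSON body and read `accelerator_quantum_size` from the result. The helper names (`build_payload`, `quantum_sizes`) and the sample response are hypothetical; the endpoint, authentication, and real response values are not part of this PR description and are deliberately left out.

```python
import json
from typing import Any, Dict, Optional


def build_payload(project_id: str) -> str:
    """Return a JSON body carrying the query from this PR description."""
    query = (
        "query ScalingGroup {\n"
        "    accessible_scaling_groups(\n"
        f"        project_id: {json.dumps(project_id)}\n"
        "    ) {\n"
        "        name\n"
        "        is_active\n"
        "        accelerator_quantum_size\n"
        "    }\n"
        "}\n"
    )
    return json.dumps({"query": query})


def quantum_sizes(response: Dict[str, Any]) -> Dict[str, Optional[float]]:
    """Map each scaling group name to its quantum size (None if absent)."""
    groups = (response.get("data") or {}).get("accessible_scaling_groups") or []
    return {g["name"]: g.get("accelerator_quantum_size") for g in groups}


# Hypothetical response shape, following the usual GraphQL "data" envelope.
sample = {
    "data": {
        "accessible_scaling_groups": [
            {"name": "default", "is_active": True, "accelerator_quantum_size": 1.0}
        ]
    }
}
print(quantum_sizes(sample))
```

`quantum_sizes` returns an empty mapping when `data` is null, which is what a GraphQL server typically sends alongside an `errors` array.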

Checklist: (if applicable)

  • Milestone metadata specifying the target backport version
  • Mention to the original issue
  • API server-client counterparts (e.g., manager API -> client SDK)

📚 Documentation preview 📚: https://sorna--3940.org.readthedocs.build/en/3940/


📚 Documentation preview 📚: https://sorna-ko--3940.org.readthedocs.build/ko/3940/

@fregataa fregataa added this to the 25Q1 milestone Mar 11, 2025
@fregataa fregataa self-assigned this Mar 11, 2025
@github-actions github-actions bot added size:S 10~30 LoC comp:manager Related to Manager component labels Mar 11, 2025
fregataa and others added 2 commits March 11, 2025 15:32
Co-authored-by: octodog <mu001@lablup.com>
@github-actions github-actions bot added the area:docs Documentations label Mar 11, 2025
fregataa and others added 3 commits March 11, 2025 15:44
…or-config-field' into feat/scaling-group-add-accelerator-config-field
Co-authored-by: octodog <mu001@lablup.com>
Collaborator

@HyeockJinKim HyeockJinKim left a comment

Is this PR done? @fregataa

Member

@yomybaby yomybaby left a comment

Users cannot use scaling_group and scaling_groups. I got this error when I tried them with a user account.

{
    "data": {
        "scaling_group": null
    },
    "errors": [
        {
            "message": "Forbidden operation. (superadmin privilege required)",
            "locations": [
                {
                    "line": 2,
                    "column": 3
                }
            ],
            "path": [
                "scaling_group"
            ]
        }
    ]
}

@fregataa fregataa marked this pull request as ready for review March 18, 2025 10:59
@fregataa fregataa requested a review from yomybaby March 18, 2025 10:59
@github-actions github-actions bot added size:L 100~500 LoC and removed size:S 10~30 LoC labels Mar 18, 2025
Co-authored-by: octodog <mu001@lablup.com>
Member

@yomybaby yomybaby left a comment

LGTM

graphite-app bot pushed a commit to lablup/backend.ai-webui that referenced this pull request Mar 24, 2025
…oup setting (#3383)

Resolves #3324 (FR-639)

> [!WARNING]
> This PR works on lablup/backend.ai#3940

# Add support for accelerator quantum size in resource allocation

This PR enhances the resource allocation form to support accelerator quantum sizes, allowing for more precise control over accelerator allocation based on scaling group settings.

Rule for accelerator UI step size:
- For shares type and single cluster:
  - If `accelerator_quantum_size` exists, use `accelerator_quantum_size`.
  - If `accelerator_quantum_size` does not exist, use `0.1`.
- Otherwise, use `1`.
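
The rule above can be sketched as a small function. This is an illustration in Python, not the WebUI's actual code; the function name and the boolean parameters are hypothetical, and "does not exist" is modelled as `None`.

```python
from typing import Optional

# Fallback step for shares-type accelerators when no quantum size is set.
DEFAULT_SHARES_STEP = 0.1


def accelerator_step(
    is_shares_type: bool,
    is_single_cluster: bool,
    quantum_size: Optional[float],
) -> float:
    """Return the UI step size for the accelerator input."""
    if is_shares_type and is_single_cluster:
        # Prefer the scaling group's quantum size when it is available.
        if quantum_size is not None:
            return quantum_size
        return DEFAULT_SHARES_STEP
    # Any other accelerator type or cluster mode steps in whole units.
    return 1.0
```

For example, `accelerator_step(True, True, 0.3)` yields `0.3`, while the same call with `quantum_size=None` falls back to `0.1`.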

Key changes:
- Added `accelerator_quantum_size` field to the `ScalingGroup` type in GraphQL schema
- Added validation to ensure accelerator values are multiples of the quantum size
- Implemented step size adjustment based on the accelerator type and quantum size
- Added new translation strings for quantum size validation messages

![image.png](https://graphite-user-uploaded-assets-prod.s3.amazonaws.com/XqC2uNFuj0wg8I60sMUh/5e498d7f-cd1c-4639-8491-a71a8a805320.png)

- Added support for the `custom-accelerator-quantum-size` feature flag for version 25.5.0+
- Added new GraphQL query `accessible_scaling_groups` to fetch scaling groups with quantum size information
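
The multiple-of-quantum validation listed in the key changes could look roughly like the following. This is a Python sketch under assumptions, not the WebUI implementation: the function name is hypothetical, and `Decimal` is used here only to avoid binary floating-point remainders (for instance, `0.9 % 0.3` is not exactly zero with plain floats).

```python
from decimal import Decimal
from typing import Union

Number = Union[int, float, str]


def is_multiple_of_quantum(value: Number, quantum: Number) -> bool:
    """Check whether value is an exact multiple of quantum."""
    # Convert through str() so 0.3 becomes Decimal("0.3"), not its binary expansion.
    v = Decimal(str(value))
    q = Decimal(str(quantum))
    if q <= 0:
        raise ValueError("quantum must be positive")
    return v % q == 0
```

With a quantum size of `0.3`, values such as `0.6` and `0.9` pass, while `1.0` is rejected. Values should be passed as entered by the user; a float produced by arithmetic (such as `0.1 + 0.2`) may carry rounding error and fail the check.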

### How to test using local backend environment
#### Manager
- Check out the branch of lablup/backend.ai#3940
- Update the `VERSION` file to 25.5.0 (to pass the version compatibility check for testing)
- Run `./backend.ai mgr etcd put "config/plugins/accelerator/cuda/quantum_size" 0.3`
- Run `./backend.ai mgr etcd put "config/plugins/accelerator/mock/allocation_mode" fractional`
- Restart the manager and agent

#### WebUI
- Now you can select `fgpu` in Session launcher
- The slider and input for the accelerator should only allow multiples of 0.3.
@HyeockJinKim HyeockJinKim added this pull request to the merge queue Mar 25, 2025
Merged via the queue into main with commit 855ffa8 Mar 25, 2025
23 checks passed
@HyeockJinKim HyeockJinKim deleted the feat/scaling-group-add-accelerator-config-field branch March 25, 2025 05:55

Labels

area:docs Documentations comp:manager Related to Manager component size:L 100~500 LoC

Development

Successfully merging this pull request may close these issues.

Expand GET resource group API to return quantum size

4 participants