[Bug]: DetachedInstanceError when provisioning instances with placement groups reused #3904

@r4victor

Description

Steps to reproduce

  1. Create a cluster fleet with a placement group that fails to provision, e.g. an AWS H100 fleet in a zone with no capacity:
type: fleet
name: cloud-fleet
nodes:
  min: 1
placement: cluster
availability_zones:
  - us-east-1a
resources:
  gpu: H100
  2. On the first provisioning attempt, all offers fail and the placement group remains.

  3. Wait for the second provisioning attempt, and you'll get:

raise orm_exc.DetachedInstanceError(
sqlalchemy.orm.exc.DetachedInstanceError: Parent instance
<PlacementGroupModel at 0x154a01520> is not bound to a
Session; lazy load operation of attribute 'project' cannot
proceed (Background on this error at:

Actual behaviour

Caused by the project and fleet attributes of PlacementGroupModel not being loaded before the model is detached from its session:

placement_group_models = await get_fleet_placement_group_models(

Expected behaviour

No response

dstack version

master

Server logs

[14:08:19] ERROR    dstack._internal.server.background.pipeline_tasks.base:361
                    Unexpected exception when processing item
                    Traceback (most recent call last):
                      File
                    "/Users/r4victor/Projects/dstack/dstack/src/dstack/_interna
                    l/server/background/pipeline_tasks/base.py", line 359, in
                    start
                        await self.process(item)
                      File
                    "/Users/r4victor/Projects/dstack/dstack/src/dstack/_interna
                    l/server/utils/sentry_utils.py", line 28, in wrapper
                        return await f(*args, **kwargs)
                               ^^^^^^^^^^^^^^^^^^^^^^^^
                      File
                    "/Users/r4victor/Projects/dstack/dstack/src/dstack/_interna
                    l/server/background/pipeline_tasks/instances/__init__.py",
                    line 284, in process
                        process_context = await _process_pending_item(item)
                                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                      File
                    "/Users/r4victor/Projects/dstack/dstack/src/dstack/_interna
                    l/server/background/pipeline_tasks/instances/__init__.py",
                    line 324, in _process_pending_item
                        result = await create_cloud_instance(instance_model)
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                      File
                    "/Users/r4victor/Projects/dstack/dstack/src/dstack/_interna
                    l/server/background/pipeline_tasks/instances/cloud_provisio
                    ning.py", line 151, in create_cloud_instance
                        ) = await
                    _find_or_create_suitable_placement_group_model(
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                    ^^
                      File
                    "/Users/r4victor/Projects/dstack/dstack/src/dstack/_interna
                    l/server/background/pipeline_tasks/instances/cloud_provisio
                    ning.py", line 381, in
                    _find_or_create_suitable_placement_group_model
                        placement_group_model_to_placement_group(placement_grou
                    p_model),
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                    ^^^^^^^^
                      File
                    "/Users/r4victor/Projects/dstack/dstack/src/dstack/_interna
                    l/server/services/placement.py", line 36, in
                    placement_group_model_to_placement_group
                        project_name=placement_group_model.project.name,
                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                      File
                    "/Users/r4victor/Projects/dstack/dstack/.venv/lib/python3.1
                    2/site-packages/sqlalchemy/orm/attributes.py", line 569, in
                    __get__
                        return self.impl.get(state, dict_)  # type: ignore
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^
                      File
                    "/Users/r4victor/Projects/dstack/dstack/.venv/lib/python3.1
                    2/site-packages/sqlalchemy/orm/attributes.py", line 1096,
                    in get
                        value = self._fire_loader_callables(state, key,
                    passive)
                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                    ^
                      File
                    "/Users/r4victor/Projects/dstack/dstack/.venv/lib/python3.1
                    2/site-packages/sqlalchemy/orm/attributes.py", line 1131,
                    in _fire_loader_callables
                        return self.callable_(state, passive)
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                      File
                    "/Users/r4victor/Projects/dstack/dstack/.venv/lib/python3.1
                    2/site-packages/sqlalchemy/orm/strategies.py", line 922, in
                    _load_for_state
                        raise orm_exc.DetachedInstanceError(
                    sqlalchemy.orm.exc.DetachedInstanceError: Parent instance
                    <PlacementGroupModel at 0x154a01520> is not bound to a
                    Session; lazy load operation of attribute 'project' cannot
                    proceed (Background on this error at:
                    https://sqlalche.me/e/20/bhk3)

Additional information

No response
