fix: use fleet blocks config for string SSH hosts in create_fleet_ssh_instance_model #3695

Closed
mattate wants to merge 1 commit into dstackai:master from mattate:fix/ssh-fleet-blocks-config

mattate commented Mar 25, 2026

Summary

Fixes #3475

When SSH fleet hosts are specified as plain strings (e.g., hosts: ["192.168.0.1"]), the fleet-level blocks configuration is ignored and hardcoded to 1. This prevents instances from being divided into multiple GPU blocks, even when blocks: auto or a specific number is set in the fleet YAML.

Root Cause

In create_fleet_ssh_instance_model(), the if isinstance(host, str) branch hardcodes blocks = 1, while the else branch (for SSHHostParams objects) correctly reads blocks from host.blocks. The fleet-level spec.configuration.blocks is never used for string hosts.

Fix

Replace blocks = 1 with blocks = spec.configuration.blocks in the string host branch, consistent with how fleet-level configuration is propagated for other SSH parameters (ssh_user, ssh_key, port) in the same branch.
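
The change can be illustrated with a minimal, self-contained sketch. This is not the actual dstack code: `HostParams` and `FleetConfig` are hypothetical stand-ins for `SSHHostParams` and `spec.configuration`, and the two functions only model the branch described above.

```python
from dataclasses import dataclass
from typing import Union

Blocks = Union[int, str]  # an integer or "auto"


@dataclass
class HostParams:
    """Hypothetical stand-in for SSHHostParams (object-form host)."""
    hostname: str
    blocks: Blocks = 1


@dataclass
class FleetConfig:
    """Hypothetical stand-in for spec.configuration."""
    blocks: Blocks = 1


def blocks_before_fix(host: Union[str, HostParams], config: FleetConfig) -> Blocks:
    if isinstance(host, str):
        return 1  # hardcoded: the fleet-level value is ignored
    return host.blocks


def blocks_after_fix(host: Union[str, HostParams], config: FleetConfig) -> Blocks:
    if isinstance(host, str):
        return config.blocks  # fleet-level value is propagated
    return host.blocks


config = FleetConfig(blocks=3)
print(blocks_before_fix("192.168.0.1", config))  # 1
print(blocks_after_fix("192.168.0.1", config))   # 3
```

Only the string-host branch changes in this sketch; the object-form branch still reads the per-host value.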

Steps to reproduce

  1. Create an SSH fleet with blocks: 3 (or blocks: auto):

     type: fleet
     name: my-fleet
     blocks: 3
     ssh_config:
       user: myuser
       identity_file: ~/.ssh/id_rsa
       hosts:
         - 192.168.0.1

  2. Apply: dstack apply -f fleet.dstack.yml
  3. Check blocks: dstack fleet get my-fleet --json | jq ".instances[0].total_blocks"
  4. Observe total_blocks: 1 instead of 3

After this fix, the instance correctly reports total_blocks: 3.

AI Assistance Disclosure

This PR was written with AI assistance (code agent).

…_instance_model

When SSH fleet hosts are specified as plain strings (e.g. hosts: ["192.168.0.1"]),
the fleet-level blocks configuration is ignored and hardcoded to 1. This prevents
instances from being divided into multiple GPU blocks.

The fix uses spec.configuration.blocks instead of the hardcoded value, consistent
with how the else branch reads host.blocks for SSHHostParams objects.

Fixes dstackai#3475

Collaborator

un-def commented Mar 26, 2026

Thanks for the contribution. Actually, this fix is not related to #3475 but addresses another known issue – #3278

The fix is partial, though – it works when a host is specified as a string, but the object form still defaults to blocks: 1 if ssh_config.hosts[*].blocks is not specified. I think this makes the situation worse:

Now:

blocks: 4
ssh_config:
  hosts:
    - 192.168.100.100  # blocks = 1, top-level blocks is ignored
    - hostname: 192.168.100.200  # blocks = 1, top-level blocks is ignored

With the fix:

blocks: 4
ssh_config:
  hosts:
    - 192.168.100.100  # blocks = 4, top-level blocks value is used as a fallback
    - hostname: 192.168.100.200  # blocks = 1, top-level blocks is still ignored

@peterschmidt85 I think we should reopen #3278 and make a proper fix with the following fallback chain: ssh_config.hosts[n].blocks → top-level blocks → 1
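
A rough sketch of that fallback chain, for illustration only. It assumes an unset value is represented as `None`; `HostParams` and `resolve_blocks` are hypothetical names, not the dstack API, and the actual fix in #3700 may differ.

```python
from dataclasses import dataclass
from typing import Optional, Union

Blocks = Union[int, str]  # an integer or "auto"


@dataclass
class HostParams:
    """Hypothetical object-form host; blocks is None when not specified."""
    hostname: str
    blocks: Optional[Blocks] = None


def resolve_blocks(host: Union[str, HostParams], fleet_blocks: Optional[Blocks]) -> Blocks:
    # 1. Per-host value (only the object form can carry one)
    host_blocks = None if isinstance(host, str) else host.blocks
    if host_blocks is not None:
        return host_blocks
    # 2. Top-level (fleet) value
    if fleet_blocks is not None:
        return fleet_blocks
    # 3. Final default
    return 1


hosts = ["192.168.100.100", HostParams("192.168.100.200")]
print([resolve_blocks(h, 4) for h in hosts])  # [4, 4]
```

With this ordering, both the string form and the object form fall back to the top-level value, so the two hosts in the example above would be treated the same way.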

Collaborator

un-def commented Mar 26, 2026

Closed in favor of #3700

un-def closed this Mar 26, 2026

Development

Successfully merging this pull request may close these issues.

[Bug]: Instances created for jobs are not divided into blocks