dstack pool add-ssh fails #1245

Closed
peterschmidt85 opened this issue May 18, 2024 · 2 comments · Fixed by #1253
Labels
bug Something isn't working

Comments

@peterschmidt85
Contributor

peterschmidt85 commented May 18, 2024

No certain steps to reproduce.

We ran dstack pool add-ssh and dstack pool remove from different machines against the same instance.

At a certain point, dstack stopped adding it properly.

Now, if I try to add it, I see the following server log:

[10:03:18] DEBUG    dstack._internal.core.backends.remote.provisioning:218 Try to connect to
                    ubuntu@141.147.60.138:22 with key SHA256:nJWeUJk16WnGvROaxSm/0l2uEoJvxtEpPYkjgiDtd9c
[10:03:19] INFO     dstack._internal.server.background.tasks.process_instances:125 Connected to ubuntu
                    141.147.60.138
[10:03:20] DEBUG    dstack._internal.server.background.tasks.process_instances:136 The script for installing
                    dstack has been executed
[10:03:21] DEBUG    dstack._internal.server.background.tasks.process_instances:143 The dstack-shim
                    environemnt variables has been installed
[10:03:22] DEBUG    dstack._internal.core.backends.remote.provisioning:148 Cannot read `host_info.json`: cat:
                    /root/.dstack/host_info.json: No such file or directory
[10:03:25] DEBUG    dstack._internal.server.app:177 Processed request POST
                    http://127.0.0.1:3000/api/project/main/pool/show in 0.011149s
[10:03:26] DEBUG    dstack._internal.core.backends.remote.provisioning:148 Cannot read `host_info.json`: cat:
                    /root/.dstack/host_info.json: No such file or directory
[10:03:29] DEBUG    dstack._internal.core.backends.remote.provisioning:148 Cannot read `host_info.json`: cat:
                    /root/.dstack/host_info.json: No such file or directory
[10:03:32] DEBUG    dstack._internal.core.backends.remote.provisioning:148 Cannot read `host_info.json`: cat:
                    /root/.dstack/host_info.json: No such file or directory
[10:03:35] DEBUG    dstack._internal.server.app:177 Processed request POST
                    http://127.0.0.1:3000/api/project/main/pool/show in 0.023496s
           DEBUG    dstack._internal.core.backends.remote.provisioning:148 Cannot read `host_info.json`: cat:
                    /root/.dstack/host_info.json: No such file or directory
[10:03:38] DEBUG    dstack._internal.core.backends.remote.provisioning:148 Cannot read `host_info.json`: cat:
                    /root/.dstack/host_info.json: No such file or directory
[10:03:41] DEBUG    dstack._internal.core.backends.remote.provisioning:148 Cannot read `host_info.json`: cat:
                    /root/.dstack/host_info.json: No such file or directory
[10:03:45] DEBUG    dstack._internal.core.backends.remote.provisioning:148 Cannot read `host_info.json`: cat:
                    /root/.dstack/host_info.json: No such file or directory
           DEBUG    dstack._internal.server.app:177 Processed request POST
                    http://127.0.0.1:3000/api/project/main/pool/show in 0.009631s
[10:03:48] DEBUG    dstack._internal.core.backends.remote.provisioning:148 Cannot read `host_info.json`: cat:
                    /root/.dstack/host_info.json: No such file or directory

The instance status is stuck at provisioning.
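
To check whether a saved server log shows the same retry loop, the failed reads can be counted. This is only a minimal sketch: the helper name `count_host_info_failures` is hypothetical, and the pattern is copied from the message wording in the excerpt above, which may differ in other dstack versions.

```python
import re

# Pattern taken from the log excerpt in this issue; the wording is an
# assumption for any other dstack version.
RETRY_PATTERN = re.compile(r"Cannot read `host_info\.json`")


def count_host_info_failures(log_text: str) -> int:
    """Return how many times the log reports a failed host_info.json read."""
    return len(RETRY_PATTERN.findall(log_text))
```

A count that keeps growing while the instance stays at provisioning suggests the server is still waiting for the file.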

peterschmidt85 added the bug label on May 18, 2024
@peterschmidt85
Contributor Author

Plus, the instance doesn't react to dstack pool remove. After that, it gets stuck at terminating.

@TheBits
Contributor

TheBits commented May 21, 2024

No certain steps to reproduce.

I found definite steps to reproduce:

  1. Add an instance using dstack pool add-ssh.
  2. Remove the entire contents of the /root/.dstack directory on the instance. The dstack-shim.service must keep running.
  3. Use dstack pool remove to remove the instance from the dstack server.
  4. Add the instance again with dstack pool add-ssh.

After that, dstack-shim does not recreate host_info.json, so the server keeps waiting for the file until it times out. The instance stays stuck at provisioning, with logs like those in the first message.
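
The waiting behaviour described above can be sketched roughly as follows. This is not the dstack implementation: `wait_for_host_info` is a hypothetical helper, it reads a local path instead of running `cat` over SSH, and the 3-second default interval is only a guess based on the log timestamps.

```python
import json
import time
from pathlib import Path
from typing import Optional


def wait_for_host_info(
    path: Path, timeout: float, interval: float = 3.0
) -> Optional[dict]:
    """Poll for a JSON file until it can be read or the timeout expires.

    Hypothetical illustration of the behaviour in this issue, not dstack code.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            return json.loads(path.read_text())
        except (FileNotFoundError, json.JSONDecodeError):
            # File is missing or only partially written; retry below.
            pass
        if time.monotonic() >= deadline:
            # Nothing recreated the file, so the caller stays in "provisioning".
            return None
        time.sleep(interval)
```

If the process responsible for the file writes it only once at startup, deleting the file while that process keeps running means every poll fails until the deadline, which matches the repeated "Cannot read" lines.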
