QUESTION: No available slot found for the model #888

Closed
yxybyq opened this issue Jan 11, 2024 · 12 comments
Labels
question Further information is requested stale
Milestone

Comments

@yxybyq

yxybyq commented Jan 11, 2024

Server error: 503 - [address=0.0.0.0:36042, pid=294316] No available slot found for the model

@yxybyq yxybyq added the question Further information is requested label Jan 11, 2024
@XprobeBot XprobeBot modified the milestones: v0.8.0, v0.8.1 Jan 11, 2024
@aresnow1 aresnow1 changed the title QUESTION QUESTION: No available slot found for the model Jan 12, 2024
@aresnow1
Contributor

It means the GPU is already in use by other launched models, so there is no slot left.

@XprobeBot XprobeBot modified the milestones: v0.8.1, v0.8.2 Jan 19, 2024
@SwarmKit

The problem still occurs when running it as follows:
docker run -dit -p 8080:9997 -v /data/ModelFiles:/workspace --gpus all xprobe/xinference:v0.8.2 xinference-local -H 0.0.0.0


@aresnow1
Contributor

aresnow1 commented Feb 1, 2024

> The problem still occurs when running it as follows: docker run -dit -p 8080:9997 -v /data/ModelFiles:/workspace --gpus all xprobe/xinference:v0.8.2 xinference-local -H 0.0.0.0

Have you launched any models before?

@SwarmKit

SwarmKit commented Feb 1, 2024

Yes, I had already loaded one model, ChatGLM3-6B-32k. This error appears when I then try to load another model!!!

@XprobeBot XprobeBot modified the milestones: v0.8.2, v0.8.4, v0.8.5, v0.9.0 Feb 2, 2024
@xiandan-erizo

So this means one GPU can only run one model? That seems like a waste of resources.

@SwarmKit

That is how it works at the moment...
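
To illustrate the behaviour described above (one launched model occupying a GPU, so a second launch finds no slot), here is a minimal toy sketch. It is not Xinference's allocator: `GpuSlotPool`, `NoAvailableSlot`, and the model UIDs are hypothetical names made up for illustration, and the one-model-per-GPU rule is simply taken from the replies in this thread.

```python
from typing import List, Optional


class NoAvailableSlot(RuntimeError):
    """Raised when every GPU slot is already taken (hypothetical name)."""


class GpuSlotPool:
    """Toy model: each GPU index is a slot that holds at most one model."""

    def __init__(self, gpu_count: int) -> None:
        # None means the slot is free; otherwise it stores the owning model UID.
        self._slots: List[Optional[str]] = [None] * gpu_count

    def launch(self, model_uid: str) -> int:
        """Claim the first free slot and return its GPU index."""
        for idx, owner in enumerate(self._slots):
            if owner is None:
                self._slots[idx] = model_uid
                return idx
        raise NoAvailableSlot("No available slot found for the model")

    def terminate(self, model_uid: str) -> None:
        """Free every slot owned by the given model."""
        for idx, owner in enumerate(self._slots):
            if owner == model_uid:
                self._slots[idx] = None


if __name__ == "__main__":
    pool = GpuSlotPool(gpu_count=1)
    pool.launch("chatglm3-6b-32k")
    try:
        pool.launch("second-model")  # the only slot is taken
    except NoAvailableSlot as exc:
        print(f"launch failed: {exc}")
    pool.terminate("chatglm3-6b-32k")
    pool.launch("second-model")  # succeeds once the slot is released
```

In this toy model, terminating the first model is what frees the slot, which matches the advice implied above: a second model can only be launched after the first one is stopped or when another GPU is free.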

@XprobeBot XprobeBot modified the milestones: v0.9.0, v0.9.1 Feb 22, 2024
@XprobeBot XprobeBot modified the milestones: v0.9.1, v0.9.2, v0.9.3 Mar 1, 2024
@XprobeBot XprobeBot modified the milestones: v0.9.3, v0.9.4, v0.9.5 Mar 15, 2024
@XprobeBot XprobeBot removed this from the v0.10.0 milestone Mar 29, 2024
@XprobeBot XprobeBot modified the milestones: v0.10.3, v0.11.0 Apr 24, 2024
@XprobeBot XprobeBot modified the milestones: v0.11.0, v0.11.1, v0.11.2 May 11, 2024
@XprobeBot XprobeBot modified the milestones: v0.11.2, v0.11.3 May 24, 2024
@XprobeBot XprobeBot modified the milestones: v0.11.3, v0.11.4, v0.12.0, v0.12.1 May 31, 2024
@XprobeBot XprobeBot modified the milestones: v0.12.1, v0.12.2 Jun 14, 2024
@Valdanitooooo
Contributor

Following this issue. I did not expect it to work this way; I will go back to FastChat for now.

@XprobeBot XprobeBot modified the milestones: v0.12.2, v0.12.4 Jun 28, 2024
@jony4

jony4 commented Jun 30, 2024

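Steps to reproduce on my side:

  1. Create a model that may fail to start, or one that takes a long time to load. Alternatively, run step 2 while a model that normally starts fine is still launching.
  2. Delete that model through the API: DELETE /v1/models/chatglm3-32k
  3. Create a model instance again. "No available slot found for the model" appears.
  4. The /v1/models/instances endpoint shows that the model from step 1 is still in the creating state. From reading the code, it looks like an entry in the mapping in the following snippet is not cleaned up when step 2 runs: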
class StatusGuardActor(xo.StatelessActor):
    def __init__(self):
        super().__init__()
        self._model_uid_to_info: Dict[str, InstanceInfo] = {}  # type: ignore
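
The suspected failure mode in the steps above can be sketched with a small toy model. This is only a guess at the mechanism described in this comment, not Xinference's actual implementation: `ToyStatusGuard`, `delete_buggy`, `delete_fixed`, and the status strings are hypothetical names, and the assumption that a leftover "creating" entry keeps the slot occupied is taken from the reproduction steps rather than from the source code.

```python
from typing import Dict


class ToyStatusGuard:
    """Hypothetical stand-in for a status registry keyed by model UID."""

    def __init__(self, slot_count: int) -> None:
        self._slot_count = slot_count
        self._model_uid_to_status: Dict[str, str] = {}

    def begin_create(self, model_uid: str) -> None:
        # Assumption: every tracked entry counts as an occupied slot.
        if len(self._model_uid_to_status) >= self._slot_count:
            raise RuntimeError("No available slot found for the model")
        self._model_uid_to_status[model_uid] = "creating"

    def finish_create(self, model_uid: str) -> None:
        self._model_uid_to_status[model_uid] = "ready"

    def delete_buggy(self, model_uid: str) -> None:
        # Only removes models that finished loading, so an entry that is
        # still "creating" is left behind and keeps its slot.
        if self._model_uid_to_status.get(model_uid) == "ready":
            del self._model_uid_to_status[model_uid]

    def delete_fixed(self, model_uid: str) -> None:
        # Removes the entry regardless of its state.
        self._model_uid_to_status.pop(model_uid, None)

    def instances(self) -> Dict[str, str]:
        return dict(self._model_uid_to_status)


if __name__ == "__main__":
    guard = ToyStatusGuard(slot_count=1)
    guard.begin_create("chatglm3-32k")   # step 1: model is still loading
    guard.delete_buggy("chatglm3-32k")   # step 2: delete during creation
    print(guard.instances())             # step 4: stale entry is still listed
    try:
        guard.begin_create("another-model")  # step 3
    except RuntimeError as exc:
        print(f"launch failed: {exc}")
```

If the real cause is similar, a fix would need the delete path to clear the tracked entry even when the model never reached a ready state, as `delete_fixed` does in the sketch.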


github-actions bot commented Aug 7, 2024

This issue is stale because it has been open for 7 days with no activity.

@github-actions github-actions bot added the stale label Aug 7, 2024

This issue was closed because it has been inactive for 5 days since being marked as stale.

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Aug 12, 2024
8 participants