
Move model filelocks from /tmp/ to ~/.cache/vllm/locks/ dir #3241

Merged (2 commits) on Mar 8, 2024

Conversation

@mgoin (Collaborator) commented Mar 6, 2024

Creating and accessing filelocks in the global /tmp/ directory based on model names becomes problematic on multi-user systems. If one user runs LLM("facebook/opt-125m"), a filelock is created at /tmp/facebook-opt-125m.lock and left there. This is fine for a single user, but as soon as another user runs LLM("facebook/opt-125m"), vLLM will try to access the filelock that the first user created, which the second user does not have permission for, triggering:

PermissionError: [Errno 13] Permission denied: '/tmp/facebook-opt-125m.lock'

This PR resolves the issue by creating and accessing locks in the user's local ~/.cache/vllm/locks directory, preventing file permission conflicts between users.
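The approach can be sketched as follows. This is a minimal, hypothetical helper, not the actual vLLM implementation: the function name `get_lock_path` and the exact path-sanitization rule are assumptions made for illustration. It resolves a per-user lock directory (honoring `XDG_CACHE_HOME` when set, analogous to how the Usage Stats PR uses `XDG_CONFIG_HOME`) and derives a lock filename from the model name:

```python
# Hypothetical sketch of per-user lock placement; not the exact vLLM code.
import os
from pathlib import Path


def get_lock_path(model_name: str) -> Path:
    """Return a per-user lock file path for the given model name.

    Uses $XDG_CACHE_HOME/vllm/locks if XDG_CACHE_HOME is set,
    otherwise ~/.cache/vllm/locks, so each user gets their own
    writable lock directory instead of sharing /tmp/.
    """
    cache_root = os.environ.get("XDG_CACHE_HOME") or os.path.expanduser("~/.cache")
    lock_dir = Path(cache_root) / "vllm" / "locks"
    lock_dir.mkdir(parents=True, exist_ok=True)
    # Flatten the model name (e.g. "facebook/opt-125m") into a single
    # filename, mirroring the "facebook-opt-125m.lock" naming seen above.
    return lock_dir / (model_name.replace("/", "-") + ".lock")
```

The returned path would then be handed to a lock primitive (e.g. `filelock.FileLock(str(get_lock_path(name)))`) in place of the old `/tmp/<model>.lock` path.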

Addresses: #2179 #2232 #2675

@zhuohan123 (Collaborator) left a comment

LGTM! Left a small comment.

vllm/model_executor/weight_utils.py (comment resolved)
@mgoin (Collaborator, Author) commented Mar 7, 2024

Thanks for the review @zhuohan123! Regarding the location and XDG_CACHE_HOME, I followed the precedent from the Usage Stats PR #2852, which uses XDG_CONFIG_HOME to control ~/.config.

I think the failing lora test is flaky, but I don't seem to be able to rerun the job.

@zhuohan123 zhuohan123 merged commit c2c5e09 into vllm-project:main Mar 8, 2024
23 checks passed
dtransposed pushed a commit to afeldman-nm/vllm that referenced this pull request Mar 26, 2024
3 participants