memory leak when running mistralai/Mistral-7B-Instruct-v0.1 #1321

Closed
captify-sivakhno opened this issue Oct 11, 2023 · 7 comments
Assignees
Labels
bug Something isn't working

Comments

@captify-sivakhno

I get an error suggesting a memory leak when running:

sampling_params = SamplingParams(best_of=3, temperature=0.8, top_p=0.95, max_tokens=450, presence_penalty=1.0, frequency_penalty=1.0)
outputs = llm.generate(prompts, sampling_params)

ValueError: Double free! PhysicalTokenBlock(device=Device.GPU, block_number=415, ref_count=0) is already freed.

Any suggestions on what to look into would be most appreciated!

@WoosukKwon WoosukKwon added the bug Something isn't working label Oct 11, 2023
@WoosukKwon
Collaborator

Hi @captify-sivakhno, thanks for reporting the bug. Could you share your prompts so that we can reproduce the bug?

@captify-sivakhno
Author

@WoosukKwon please find the prompt attached - it's a simple summary prompt
prompt.txt

@frankiedrake

I have the same issue when running inference on multiple GPUs.

@chadlzx

chadlzx commented Nov 16, 2023

The issue seems to occur when the prompt is too long; I estimate that the threshold for the input_id length is above 2048.
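
One way to test this estimate is to split the prompts by token count before calling generate. The sketch below is a minimal, hypothetical helper: `split_by_token_length` is not part of vLLM, `encode` stands for any function that turns text into a token sequence (for example a tokenizer's encode method), and the 2048 default is only the estimate above, not a confirmed limit.

```python
from typing import Callable, Iterable, List, Sized, Tuple


def split_by_token_length(
    prompts: Iterable[str],
    encode: Callable[[str], Sized],
    max_tokens: int = 2048,  # assumption: the estimated threshold, not a confirmed limit
) -> Tuple[List[str], List[str]]:
    """Return (within_limit, too_long) according to each prompt's encoded length."""
    within_limit: List[str] = []
    too_long: List[str] = []
    for prompt in prompts:
        n_tokens = len(encode(prompt))
        if n_tokens <= max_tokens:
            within_limit.append(prompt)
        else:
            too_long.append(prompt)
    return within_limit, too_long
```

Running generation separately on each of the two returned lists would show whether the error only appears for the long prompts.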

@WoosukKwon WoosukKwon self-assigned this Nov 17, 2023
@chujiezheng
Contributor

@WoosukKwon Hi, is there any progress on this issue?

@hmellor
Collaborator

hmellor commented Mar 9, 2024

Closing in favor of #1556

@hmellor hmellor closed this as completed Mar 9, 2024
@hmellor hmellor closed this as not planned Mar 9, 2024
@manzke

manzke commented Mar 11, 2024

Watch out - Mistral v0.1 has a sliding window of 4096. If your text is longer than that, it will run into #1556 - either patch the config.json (set "sliding_window": null) or make sure you stay below 4k tokens.
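
The config.json patch can be sketched in a few lines of Python. This is a minimal sketch, assuming you have a local copy of the model directory; `disable_sliding_window` is a hypothetical helper name, and the path you pass in depends on where the model files live on your machine.

```python
import json
from pathlib import Path


def disable_sliding_window(config_path):
    """Set "sliding_window" to null in a local config.json; return the old value."""
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    previous = config.get("sliding_window")
    config["sliding_window"] = None  # json.dumps writes None as null
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return previous
```

The function returns the previous value so the change can be reverted later. It rewrites the file in place, so keep a backup of the original config.json if you need one.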


7 participants