Why does memory usage not change with different inputs when using the GGML format? #566
Unanswered
SiraHaruethaipree
asked this question in
Q&A
I don't know much about the GGML format. What I do know is that GPU VRAM usage normally depends on the input sequence: a longer input increases memory usage. I saw this when testing Hugging Face models loaded in 8-bit or 4-bit (`load_in_8bit` / `load_in_4bit`). With the GGML format on GPU, however, memory usage always stays at the same value no matter how long the input is. Could someone please explain why?
![image](https://private-user-images.githubusercontent.com/82432680/273835931-b73cc551-f9d4-4534-be30-737da94a12d3.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjA0NjkwNjksIm5iZiI6MTcyMDQ2ODc2OSwicGF0aCI6Ii84MjQzMjY4MC8yNzM4MzU5MzEtYjczY2M1NTEtZjlkNC00NTM0LWJlMzAtNzM3ZGE5NGExMmQzLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA3MDglMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNzA4VDE5NTkyOVomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTJlNWI1NTE1MmRjOGViZTQxNWE4YTUzODVhZDFjMWJhYjNjNTA1MjAxOWVkZjlhOWUwODFlODFlOWNkNDNjYjgmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.RyGhCvMRCy5uEAKV8E9tbEQMGB04qYeU4OKs2fgnEFY)
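One possible explanation, stated here as an assumption rather than a confirmed answer: llama.cpp-style GGML runtimes generally reserve the key/value cache for the full configured context length when the model is loaded, while the Hugging Face path grows its cache with the actual sequence. Below is a minimal sketch of that arithmetic. The helper names (`kv_cache_bytes`, `preallocated_usage`, `dynamic_usage`), the context length, and the model dimensions (roughly a 7B LLaMA-style model with an fp16 cache) are all illustrative and not taken from any library.

```python
# Rough key/value cache arithmetic. All dimensions are assumed values
# resembling a 7B LLaMA-style model; they are not read from a real model file.

def kv_cache_bytes(n_tokens, n_layers=32, n_kv_heads=32, head_dim=128,
                   bytes_per_elem=2):
    """Approximate bytes needed to cache keys and values for n_tokens positions."""
    # Factor 2: each layer stores one key tensor and one value tensor.
    return 2 * n_layers * n_tokens * n_kv_heads * head_dim * bytes_per_elem


N_CTX = 2048  # assumed context length chosen when the model is loaded


def preallocated_usage(prompt_len):
    """Cache reserved up front for the whole context; prompt length is ignored."""
    return kv_cache_bytes(N_CTX)


def dynamic_usage(prompt_len):
    """Cache that grows with the number of tokens actually processed."""
    return kv_cache_bytes(prompt_len)


if __name__ == "__main__":
    mib = 1024 ** 2
    for prompt_len in (128, 512, 2048):
        print(f"prompt={prompt_len:5d}  "
              f"preallocated={preallocated_usage(prompt_len) / mib:7.1f} MiB  "
              f"dynamic={dynamic_usage(prompt_len) / mib:7.1f} MiB")
```

Under these assumptions the preallocated figure is identical for every prompt length, which would match a flat VRAM reading, while the dynamic figure rises with the input. Actual numbers depend on the model, the cache data type, and how many layers are offloaded to the GPU.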
Thanks