When downloading the model, the downloader seems to load the whole file into RAM before writing it to disk. Could the download method be changed so that it downloads and saves the file incrementally, without holding it entirely in memory?
This is what it looks like on my machine:
While downloading the model:
Memory consumption grew gradually during the download until the download stopped working because memory was full.
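To make the request concrete, here is a minimal sketch of the kind of streaming download I mean, using only the Python standard library. It is not the code of the download script discussed here (which I have not inspected); `copy_in_chunks`, `download_to_file`, and the 1 MiB `CHUNK_SIZE` are hypothetical names and values chosen for illustration.

```python
import urllib.request

CHUNK_SIZE = 1024 * 1024  # 1 MiB per read; an arbitrary illustrative value


def copy_in_chunks(src, dst, chunk_size=CHUNK_SIZE):
    """Copy a readable binary stream to a writable one, one chunk at a time.

    Only one chunk is held in memory at any moment.
    Returns the total number of bytes written.
    """
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:  # empty bytes object means end of stream
            break
        dst.write(chunk)
        total += len(chunk)
    return total


def download_to_file(url, path, chunk_size=CHUNK_SIZE):
    """Stream the response body for `url` into the file at `path`."""
    with urllib.request.urlopen(url) as response, open(path, "wb") as out:
        return copy_in_chunks(response, out, chunk_size)
```

With this approach, peak memory use should stay close to the chunk size rather than growing with the file size. A call would look like `download_to_file(url, "model.bin")`, where both arguments are placeholders.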
Files appear to be downloaded in chunks of 15.2 GB at a time.
Memory usage increases while a chunk downloads, then drops back down once it finishes.
You'll need enough spare memory to hold a 15 GB chunk at once (or else source the weights via torrent, but I'm not sure which torrent or how to add this to llama.cpp).
Here's a screenshot showing my memory usage dropping after one chunk finished downloading: