While testing, I loaded a 2B model and then wanted to change the system prompt. I changed it and hit Start, and the next generation was much slower. When I checked, I was able to verify the issue: it seems the app does not free the GPU memory before loading the new model, so the new model gets loaded into shared memory instead of the GPU's native (dedicated) memory.
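The expected behavior could be sketched roughly as follows. This is only an illustration under assumptions, not the app's actual code: `FakeModel`, `ModelHolder`, and `reload` are hypothetical names, and the "model" here is a plain Python object standing in for weights held in GPU memory. The point is the order of operations: drop the old model first, then load the new one.

```python
import gc
import weakref


class FakeModel:
    """Hypothetical stand-in for a model whose weights occupy GPU memory."""

    def __init__(self, name: str) -> None:
        self.name = name


class ModelHolder:
    """Hypothetical holder that keeps at most one model alive at a time."""

    def __init__(self) -> None:
        self.model = None

    def reload(self, name: str) -> FakeModel:
        # Drop the old model first so its memory can be reclaimed
        # before the new weights are allocated.
        self.model = None
        gc.collect()
        # With a real backend, a cache-release call would go here
        # (for example torch.cuda.empty_cache() if the app uses PyTorch;
        # that is an assumption, the backend is not stated in this report).
        self.model = FakeModel(name)
        return self.model


holder = ModelHolder()
first = weakref.ref(holder.reload("2B-model"))
holder.reload("2B-model")  # e.g. after changing the system prompt and hitting Start
old_model_freed = first() is None  # the first instance should no longer be alive
```

If the old model were still referenced somewhere when the new one is created, both copies would be alive at once, which would match the symptom of the second load spilling into shared memory.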