Describe the Issue
I have a system with three 5090s and I'm trying to use the last two for running an AI model while leaving my top GPU free for gaming, etc. I have to select "All", but then no matter what I pick for my tensor split (0,50,50, for example), the first GPU is still selected for cache/main processing.
I've tried various .bat files setting `CUDA_VISIBLE_DEVICES` to hide the first GPU from kobold.cpp, but with no luck. I know that in llama.cpp you can use `--devices CUDA1,CUDA2` to fix this. It would be nice if kobold.cpp added something like this to the GUI, rather than only offering a single GPU or all GPUs in the dropdown, or provided an "extra flags" field where I could pass such options myself, similar to what Oobabooga's text-generation-webui has for llama.cpp.
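For reference, the workaround I attempted looks roughly like the sketch below (the koboldcpp launch line is a placeholder; flags vary by setup). `CUDA_VISIBLE_DEVICES` renumbers the visible GPUs, so after hiding device 0, the remaining two appear to the app as devices 0 and 1:

```shell
# Hide the first GPU (index 0) so only GPUs 1 and 2 are visible to CUDA apps.
export CUDA_VISIBLE_DEVICES=1,2
echo "CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES"
# Illustrative launch line (placeholder flags, not an exact invocation):
# ./koboldcpp --usecublas --tensorsplit 50 50
```

On Windows the equivalent in a .bat file would be `set CUDA_VISIBLE_DEVICES=1,2` before the launch line; in my testing, kobold.cpp still grabbed the first GPU regardless.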