use lowvram flag for offload qkv
LostRuins committed Dec 8, 2023
1 parent ec21fa7 commit 7469f20
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion gpttype_adapter.cpp
```diff
@@ -895,7 +895,7 @@ ModelLoadResult gpttype_load_model(const load_model_inputs inputs, FileFormat in
     //llama_ctx_paran_parts = -1;
     llama_ctx_params.seed = -1;
     //llama_ctx_params.f16_kv = true;
     //llama_ctx_params.low_vram = inputs.low_vram;
+    llama_ctx_params.offload_kqv = !inputs.low_vram;
     llama_ctx_params.mul_mat_q = inputs.use_mmq;
     llama_ctx_params.logits_all = false;
     model_params.use_mmap = inputs.use_mmap;
```
