Activity
Increase GGML_MAX_SRC from 6 to 8
Print requested context size before asserting and crashing
Add ggml_total_size_for_tensor_data
Change eps in ggml_compute_forward_group_norm_f32 from 1e-6f to 1e-5f…
Increase GGML_MAX_NODES from 4096 to 80000
Fix build when used as a submodule in rwkv.cpp
Add Q4_1_O data type; fix build when used as a submodule in rwkv.cpp
Force push
Add Q4_1_O data type; fix build when used as a submodule in rwkv.cpp
Force push
Add Q4_1_O data type; fix build when used as a submodule in rwkv.cpp
Force push
Make Q4_1_O more similar to Q4_1
Move memcpy out of the row loop, move outlier multiplication into the…
Optimize Q4_1_O by moving outlier multiplication out of the dequantiz…
Fix build when used as a submodule in rwkv.cpp