rpc : send hash when tensor data is above some fixed threshold #12496
b7bda76 to 3b5f524
I added support for caching tensors from GGUF files (via multiple

The problem with caching GGUF files is that they need to stay in RAM the whole time. I tried to keep a map of

The good news is that caching large tensors in a local dir and loading them from there works fine. I am inclined to drop support for caching GGUFs and leave only the cache dir support. Keeping GGUFs in memory also makes the reported available memory inaccurate with the CPU backend.
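The cache-dir approach described above can be sketched roughly as follows. This is only an illustration, not the code from this PR: the FNV-1a hash, the helper names (`hash_bytes`, `cache_path`, `cache_store`, `cache_load`) and the one-file-per-hash layout are all assumptions made for the example.

```cpp
#include <cstdint>
#include <cstdio>
#include <filesystem>
#include <fstream>
#include <iterator>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper: 64-bit FNV-1a over a byte buffer.
// The hash function is an assumption of this sketch, not taken from the PR.
static uint64_t hash_bytes(const uint8_t * data, size_t len) {
    uint64_t h = 0xcbf29ce484222325ULL;   // FNV-1a offset basis
    for (size_t i = 0; i < len; ++i) {
        h ^= data[i];
        h *= 0x100000001b3ULL;            // FNV-1a prime
    }
    return h;
}

// Hypothetical helper: one cache file per hash, named by 16 hex digits.
static fs::path cache_path(const fs::path & dir, uint64_t hash) {
    char name[17];
    std::snprintf(name, sizeof(name), "%016llx", (unsigned long long) hash);
    return dir / name;
}

// Write the tensor bytes into the cache directory under their hash.
static bool cache_store(const fs::path & dir, const std::vector<uint8_t> & data) {
    fs::create_directories(dir);
    std::ofstream out(cache_path(dir, hash_bytes(data.data(), data.size())), std::ios::binary);
    out.write(reinterpret_cast<const char *>(data.data()), (std::streamsize) data.size());
    out.flush();
    return out.good();
}

// Look up tensor bytes by hash; returns false on a cache miss.
static bool cache_load(const fs::path & dir, uint64_t hash, std::vector<uint8_t> & out) {
    std::ifstream in(cache_path(dir, hash), std::ios::binary);
    if (!in) {
        return false;
    }
    out.assign(std::istreambuf_iterator<char>(in), std::istreambuf_iterator<char>());
    return true;
}
```

With a layout like this, nothing has to stay resident in RAM between requests: a hit is a plain file read, and a miss simply means the client has to send the full tensor data.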
I'm OK with that - it will make the change even simpler.
OK, I have left only the Should we try to reuse
Sounds good. Maybe instead of linking
I changed the server to accept a
Yes, that's better - the cache path can be controlled with the env variable.
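As a rough illustration of controlling the cache path through the environment, here is a minimal sketch. The function `resolve_cache_dir` is hypothetical, and the variable name is passed in by the caller; it does not reproduce the actual variable name or lookup logic used by llama.cpp.

```cpp
#include <cstdlib>
#include <filesystem>

namespace fs = std::filesystem;

// Hypothetical helper: return the directory named by the environment variable
// `env_name` if it is set and non-empty, otherwise return `fallback`.
static fs::path resolve_cache_dir(const char * env_name, const fs::path & fallback) {
    const char * value = std::getenv(env_name);
    if (value != nullptr && value[0] != '\0') {
        return fs::path(value);
    }
    return fallback;
}
```

A server could call this once at startup and pass the resulting path to whatever code reads and writes the cache, so the location can be changed without touching command-line arguments.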
Yes.
examples/rpc/rpc-server.cpp (outdated)

namespace fs = std::filesystem;

// NOTE: this is copied from common.cpp to avoid linking with libllama
-// NOTE: this is copied from common.cpp to avoid linking with libllama
+// NOTE: this is copied from common.cpp to avoid linking with libcommon
OK, but I don't understand why putting `common` in `target_link_libraries` pulls in `libllama` as a dependency. I have a static `libcommon.a` built, so why is it not possible to link against it?
`libcommon` already links to `libllama`:
llama.cpp/common/CMakeLists.txt, line 141 in f125b8d:
target_link_libraries (${TARGET} PRIVATE ${LLAMA_COMMON_EXTRA_LIBS} PUBLIC llama Threads::Threads)
So anything that links to `libcommon` will indirectly link to `libllama`. In theory, we can separate all "common" functionality that does not depend on `libllama` into a standalone common library that would be suitable to link in this case.
Btw, on `master`, the `rpc-server` example links to `libllama`, which is not necessary. You can simply remove this dependency:
diff --git a/examples/rpc/CMakeLists.txt b/examples/rpc/CMakeLists.txt
index ae48fb98d..892db89ea 100644
--- a/examples/rpc/CMakeLists.txt
+++ b/examples/rpc/CMakeLists.txt
@@ -1,2 +1,2 @@
add_executable(rpc-server rpc-server.cpp)
-target_link_libraries(rpc-server PRIVATE ggml llama)
+target_link_libraries(rpc-server PRIVATE ggml)
ok, fixed
ref #10095