Include server in releases + other build system cleanups #1610
afaik @SlyEcho was doing something with the server binaries too?
I looked at the pending pulls and didn't see anything obvious that overlapped but I might have missed it.
we should merge #1606 before this, to fix ci
can you take a very quick look and merge? :)
otherwise looks good
It looks like a simple change but I don't have a way to test it (I'm not on Windows). Do you still want me to just go ahead and merge it? (Or you can?)
yea i just merged it. pull and let ci run :)
e175150 to 5aed385
Looks like the Windows clBLAST stuff doesn't work: https://github.com/ggerganov/llama.cpp/actions/runs/5098706551 What do you want to do at this point?
this is very weird. let's wait for @SlyEcho to fix it :)
also the cublas build failing is gg's doing 😄
I didn't manage to make a PR before hitting the other issues, but you can compare what I did here: master...SlyEcho:llama.cpp:enable_server I also added it to the Docker image and command. |
I might be biased, but I like my approach to the If you applied your pull after mine, you could just remove the |
I didn't pay much attention to the Makefile but I have some justification: the HTTP and JSON single file libraries are external, so it is not expected that they change at all between builds, but I guess this assumption is not technically correct. I think the |
I'm against renaming dependencies. Also, technically, C and C++ headers can be different and I prefer using |
I doubt they'll change frequently, but it would be weird and confusing if someone copied in a newer version of those libraries and running |
I don't mean to suggest rewriting everything, but there is actually a better solution to this. Change:

    server: examples/server/server.cpp examples/server/httplib.h examples/server/json.hpp build-info.h ggml.o llama.o common.o $(OBJS)
    	$(CXX) $(CXXFLAGS) -Iexamples/server $(filter-out %.h,$(filter-out %.hpp,$^)) -o $@ $(LDFLAGS)

to:

    %.o: %.cpp
    	$(CXX) $(CPPFLAGS) $(CXXFLAGS) -c -o $@ $<
    %: %.o
    	$(CXX) $(LDFLAGS) -o $@ $^ $(LDLIBS)
    VPATH+= examples/server
    server.o: build-info.h httplib.h json.hpp

something like that
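For readers following the Makefile discussion, here is a slightly fuller sketch of the pattern-rule approach suggested above. It is only an illustration, not what was merged: the explicit `server` link rule and the `-Iexamples/server` addition to `CPPFLAGS` are assumptions, and the fragment expects the surrounding Makefile to already define `OBJS` and rules for `ggml.o`, `llama.o` and `common.o`.

```makefile
# Sketch only; assumes OBJS and rules for ggml.o, llama.o and common.o
# are defined elsewhere in the Makefile.

# Let make find server.cpp, httplib.h and json.hpp in the example directory.
VPATH += examples/server
# Assumption: the compiler also needs the directory to resolve the #includes.
CPPFLAGS += -Iexamples/server

# Compile a .cpp found in the current directory or on VPATH.
# $< is the prerequisite supplied by this pattern rule (the .cpp file),
# so the headers listed below are never passed to the compiler.
%.o: %.cpp
	$(CXX) $(CPPFLAGS) $(CXXFLAGS) -c -o $@ $<

# Prerequisite-only rule: editing any of these headers rebuilds server.o.
server.o: build-info.h httplib.h json.hpp

# Every prerequisite here is an object file, so no filter-out is needed.
server: server.o ggml.o llama.o common.o $(OBJS)
	$(CXX) $(LDFLAGS) -o $@ $^ $(LDLIBS)
```

The point of the split is that headers only affect when `server.o` is rebuilt, while the link step sees nothing but objects, which removes the need for the nested `$(filter-out ...)` call.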
I agree that approach is better, but I wasn't really looking to refactor the I also am not really sure about making changes to the Do you feel strongly that refactoring the |
Add server to Makefile (still uses LLAMA_BUILD_SERVER define)
Remove vdot binary when running make clean
Fix compile warnings in server example
Add .hpp files to workflow (the server example has one)
5aed385 to 81996ea
Absolutely not.
I'd be fine if this rolls in any of the changes from my PR (#1570) as well, and I can close mine. Currently the server assumes a way of handling the context that doesn't really fit with most frontends, which do heavy context editing. It's also missing a number of generation settings that could just be passed directly through the server but aren't implemented. Welp, merged while I was typing this so nevermind. Haha.
- Set LLAMA_BUILD_SERVER in the workflow so the server example gets built. I only added this to the Windows builds, because it seems that only Windows binary artifacts are included in releases.
- Add the server example to the Makefile (still uses the LLAMA_BUILD_SERVER define and does not build by default).
- Remove the vdot binary when running make clean.
- Fix compile warnings in the server example.
- Add .hpp files to the workflow (the server example has one).

I don't really have a way to test the workflow changes. They seem pretty straightforward, but I'm far from an expert on GitHub workflows.

I also don't actually know C++, so my changes to server.cpp were done without really understanding how it works. All I can say is that it appears to run okay and compiles without warnings on clang 15, gcc 13 and gcc 11.

Closes #1578
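As a rough illustration of what "uses the LLAMA_BUILD_SERVER define and does not build by default" can look like in a Makefile, here is a minimal sketch. The `BUILD_TARGETS` variable name and the stand-in recipes are assumptions made for the example, not a copy of the project's Makefile.

```makefile
# Hypothetical sketch; BUILD_TARGETS is an assumed variable name.
BUILD_TARGETS = main

# The server target is opt-in: it is added only when
# LLAMA_BUILD_SERVER is set to a non-empty value.
ifdef LLAMA_BUILD_SERVER
BUILD_TARGETS += server
endif

default: $(BUILD_TARGETS)

# Stand-in recipes so the fragment runs on its own;
# the real rules would compile and link the binaries.
main server:
	@echo "building $@"

# vdot is listed so that a stale binary is removed as well.
clean:
	rm -vf *.o main server vdot

.PHONY: default main server clean
```

With a fragment like this, a plain `make` builds only the default targets, while `make LLAMA_BUILD_SERVER=1` also builds `server`.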