
Conversation

@ngxson
Collaborator

@ngxson ngxson commented Nov 29, 2025

Extracted part of the changes in #17554 into this dedicated PR, so that if something goes wrong, it is easier to trace back.

Compared to the approach proposed in the mentioned PR, which simply moves everything to a .h file, this PR does a few extra things:

  • Moves code via a dedicated commit using git mv, so that auto-merge is more likely to work (I hope so, will need to test)
  • Exposes only a subset of the infrastructure via server-context.h; for example, server_slot is now a private implementation detail
  • Simplifies the public API of server_context, consolidating everything into 4 main functions: init(), load_model(), start_loop(), terminate()

This should make it easier to integrate the server into the CLI, while also allowing downstream projects to incorporate the server as a library (cc @bandoti, probably a precursor to llamax)
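For readers who want a picture of how a caller might drive the four-function API, here is a minimal sketch using stand-in types. Everything named `demo_*`, the `run_server` helper, the signatures, and the call order are assumptions for illustration only; the actual declarations live in server-context.h and may differ.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins; the real signatures in server-context.h may differ.
struct demo_params {
    std::string model_path;
};

struct demo_server_context {
    std::vector<std::string> log; // records the call order, for illustration only

    bool load_model(const demo_params & p) {
        log.push_back("load_model");
        return !p.model_path.empty(); // pretend an empty path fails to load
    }
    void init()       { log.push_back("init"); }
    void start_loop() { log.push_back("start_loop"); }
    void terminate()  { log.push_back("terminate"); }
};

// Drives the lifecycle once; returns false if the model fails to load.
// The order (load_model, init, start_loop, terminate) is an assumption.
inline bool run_server(demo_server_context & ctx, const demo_params & params) {
    if (!ctx.load_model(params)) {
        ctx.terminate();
        return false;
    }
    ctx.init();
    ctx.start_loop(); // a real loop would presumably block until terminate() is requested
    ctx.terminate();
    return true;
}
```

With only these four entry points, a CLI or a downstream library user does not need to know about slots, queues, or any other internal type.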

Comment on lines 54 to 71
struct server_slots_t {
    ~server_slots_t();
    std::vector<server_slot*> data;
    size_t size() const { return data.size(); }
    server_slot & operator[](size_t idx) { return *(data[idx]); }
    server_slot & operator[](size_t idx) const { return *(data[idx]); }
    void clear();
    server_slot & create();
    struct iterator {
        typename std::vector<server_slot*>::iterator it;
        iterator(typename std::vector<server_slot*>::iterator i) : it(i) {}
        server_slot & operator*() { return **it; }
        iterator & operator++() { ++it; return *this; }
        bool operator!=(const iterator& other) const { return it != other.it; }
    };
    iterator begin() { return iterator(data.begin()); }
    iterator end() { return iterator(data.end()); }
};

This comment was marked as outdated.

Collaborator Author

Never mind, I think pimpl will be the cleaner way (and also easier for us, in case we need to add a new utility function to server_context).

Will implement this in my next commit

Collaborator Author

Implemented in 9a7b4f3:

  • server_context is renamed to server_context_impl (only the struct is renamed; the code is left untouched)
  • the new server_context only exposes: init(), load_model(), start_loop(), terminate()
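To illustrate the pimpl idea in general terms, a minimal sketch follows. It is not the code from the commit: `demo_context`, `demo_context_impl`, `demo_slot`, and the `n_slots()` accessor are hypothetical names, and the use of `std::unique_ptr` is an assumption about one common way to hold the implementation pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// --- what a public header might contain (hypothetical names) ---
struct demo_context_impl; // forward declaration only; the layout stays private

struct demo_context {
    demo_context();
    ~demo_context(); // must be defined where demo_context_impl is a complete type

    void        init(int n_slots);
    std::size_t n_slots() const;

private:
    std::unique_ptr<demo_context_impl> impl;
};

// --- what the .cpp file might contain ---
struct demo_slot {
    int id;
};

struct demo_context_impl {
    std::vector<demo_slot> slots; // internal type, never visible to header users
};

demo_context::demo_context() : impl(std::make_unique<demo_context_impl>()) {}
demo_context::~demo_context() = default;

void demo_context::init(int n_slots) {
    impl->slots.clear();
    for (int i = 0; i < n_slots; ++i) {
        impl->slots.push_back({i});
    }
}

std::size_t demo_context::n_slots() const {
    return impl->slots.size();
}
```

Because callers only ever see the forward declaration, new helper functions or members can be added to the implementation struct without changing the public header.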

@ggerganov It would be nice if you could review this soon (ideally before #17470, as some conflicts are inevitable). Thanks!

@ngxson ngxson force-pushed the xsn/create_server_context branch from 22039aa to 9a7b4f3 November 29, 2025 18:04
@ngxson
Collaborator Author

ngxson commented Nov 29, 2025

Nice, thanks for the speedy review!

@ngxson ngxson force-pushed the xsn/create_server_context branch from 5f7b502 to c5f8cdf November 29, 2025 19:21
@ngxson ngxson mentioned this pull request Nov 29, 2025
@ngxson ngxson merged commit ab49f09 into ggml-org:master Nov 29, 2025
72 of 74 checks passed