server/public_simplechat update - builtin client side tool calls with zero setup - reasoning - vision - uncompressed 300kb - no external deps #17451
Conversation
Update the Me class: update show settings, update show props info, update the readme.
Moved into Me.chatProps
Also remove more inner/detailed stuff from show info when not in bAll mode, given that much of the previously differentiated content has been moved into chatProps and is in turn shown for now.
Don't allow tool names to be changed in the settings page.
The config entries should be named the same as their equivalent command-line argument entries, but without the -- prefix.
Allow fetching only from the specified allowed.domains.
Had confused js and python semantics for accessing dictionary contents and the consequence of a non-existent key. Fixed it. Use different error ids to distinguish failures in the common urlreq helper from failures in the specific urltext and urlraw helpers.
Ship with allowed.domains set to a few sites in general to show its use; this includes some sites which allow search to be carried out through them as well as providing news aggregation.
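A domain allow-list check of this kind can be sketched as below. This is a hedged illustration, not the actual simpleproxy.py code; the helper name `is_allowed` and the exact matching rule (exact host or subdomain) are assumptions.

```python
from urllib.parse import urlparse

def is_allowed(url, allowed_domains):
    """Illustrative sketch: accept a url only if its hostname exactly
    matches an allowed domain, or is a subdomain of one."""
    host = urlparse(url).hostname or ""
    return any(host == d or host.endswith("." + d) for d in allowed_domains)

allowed = ["duckduckgo.com", "arxiv.org"]
print(is_allowed("https://html.duckduckgo.com/html/?q=llama", allowed))  # True
print(is_allowed("https://evil.example.com/", allowed))                  # False
```

Matching on the parsed hostname (rather than substring matching on the raw url) avoids trivially bypassing the check with urls like `https://evil.com/?x=duckduckgo.com`.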
i.e. include User-Agent, Accept-Language and Accept in the generated request, using the equivalent values from the request being proxied.
Tag messages regarding ValidateUrl and UrlReq. Also dump the request. Move the check for --allowed.domains into ValidateUrl. NOTE: With the mimicking of User-Agent et al from the incoming request in the generated request, yahoo search/news now returns results instead of the bland error seen before.
Mimicking the incoming request in the generated request helps with duckduckgo also, not just yahoo. Also update allowed.domains to allow a url generated by the ai when trying to access bing's news aggregation url.
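The header-mimicking step above can be sketched as follows. This is a minimal illustration, assuming a dict of incoming headers; the helper name `build_proxied_request` and the specific header set are illustrative, not the actual simpleproxy.py implementation.

```python
import urllib.request

# Headers copied from the browser's request into the proxied request,
# so the target site sees a normal-looking client.
MIMIC_HEADERS = ("User-Agent", "Accept-Language", "Accept")

def build_proxied_request(url, incoming_headers):
    """Illustrative sketch: carry selected headers of the request being
    proxied over into the request generated towards the target site."""
    req = urllib.request.Request(url)
    for name in MIMIC_HEADERS:
        value = incoming_headers.get(name)
        if value:
            req.add_header(name, value)
    return req

req = build_proxied_request(
    "https://news.search.yahoo.com/search?p=llama.cpp",
    {"User-Agent": "Mozilla/5.0", "Accept": "text/html"})
print(req.get_header("User-agent"))  # "Mozilla/5.0"
```

Note that urllib normalizes stored header names (hence `"User-agent"` when reading back); sites that challenge obviously scripted clients often behave differently once these headers look like a regular browser's.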
Use DOMParser parseFromString in text/html mode rather than text/xml, as it is more relaxed and avoids worrying about xml special characters like & etc.
i.e. during proxying.
Instead of simply concatenating the tool call id, name and result, now use the browser's DOM logic to create the xml structure currently used to store these within the content field. This takes care of transforming/escaping any xml special characters in the result, so that extracting them later into different fields in the server handshake doesn't cause any problem.
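The client does this with the browser DOM; the same idea can be sketched in Python with a DOM-style builder, which likewise escapes special characters automatically on serialization. The element names (`tool_response`, `id`, `name`, `result`) are assumptions for illustration, not the actual structure used by the client.

```python
import xml.etree.ElementTree as ET

def wrap_tool_result(call_id, name, result):
    """Illustrative sketch: build the xml structure via a DOM-style API,
    so xml special characters in the result get escaped for us."""
    root = ET.Element("tool_response")
    ET.SubElement(root, "id").text = call_id
    ET.SubElement(root, "name").text = name
    ET.SubElement(root, "result").text = result
    return ET.tostring(root, encoding="unicode")

xml = wrap_tool_result("c1", "fetch_urltext", "a < b & c > d")
print(xml)  # the result text comes out as: a &lt; b &amp; c &gt; d
```

String concatenation would leave a raw `&` or `<` in the content, breaking later extraction; the DOM/serializer path handles the escaping and unescaping symmetrically.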
bing raised a challenge for chrome-triggered search requests after a few requests spread a few minutes apart, while seemingly still allowing wget-based search to continue (again spread a few minutes apart). Added a simple helper to trace this; use --debug True to enable it.
avoid logically duplicate debug log
Instead of always enforcing explicit user-triggered tool calling, the user is now given the option to either use explicit user-triggered tool calling or auto-triggering after showing the tool details for a user-specified number of seconds. NOTE: The current logic doesn't account for the user clicking the buttons before the autoclick triggers; the auto clicks need to be cancelled if the user triggers first, i.e. in future.
Also clean up the existing toolResponseTimeout timer to use the same structure and a similar flow convention.
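The cancellation noted above as future work is essentially a cancellable countdown. A minimal sketch of that pattern, in Python rather than the client's js (the class name `AutoTrigger` is illustrative):

```python
import threading
import time

class AutoTrigger:
    """Illustrative sketch: fire `action` after `secs`, unless cancel()
    is called first (i.e. the user clicked before the countdown ran out)."""
    def __init__(self, secs, action):
        self.timer = threading.Timer(secs, action)
        self.timer.start()
    def cancel(self):
        self.timer.cancel()

fired = []
t = AutoTrigger(0.2, lambda: fired.append(True))
t.cancel()        # user clicked before the auto-trigger countdown expired
time.sleep(0.05)
print(fired)      # [] -- the auto-click never ran
```

In the browser the same shape falls out of `setTimeout`/`clearTimeout`; the key point is keeping a handle on the pending trigger so a user click can cancel it.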
Fix issues identified by the llama.cpp editorconfig check: convert tabs to spaces in the json config file; remove an extra space at end of line.
Add missing newline to ending bracket line of json config file
Include info about the auto option within tools. Use non-wrapped text for certain sections, so that the markdown readme renders the structure of its content properly.
Split the browser js webworker based tool calls from the web related tool calls.
Remove the unneeded stuff (belonging to the other file) from the tooljs and toolweb files. Update the tools manager to make use of the new toolweb module.
Initial go at implementing a web search tool call, which uses the existing UrlText support of the bundled simpleproxy.py. It allows the user to control which search engine to use, by letting them set the search engine url template. The logic comes with search engine url template strings for duckduckgo, brave, bing and google, with duckduckgo set as the default.
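The url-template mechanism can be sketched as below. The template strings here are plausible placeholders, not necessarily the exact defaults shipped in the toolweb module, and the helper name `search_url` is illustrative.

```python
from urllib.parse import quote_plus

# Hypothetical templates; the actual defaults live in the client's toolweb module.
SEARCH_TEMPLATES = {
    "duckduckgo": "https://html.duckduckgo.com/html/?q={query}",
    "brave": "https://search.brave.com/search?q={query}",
    "bing": "https://www.bing.com/search?q={query}",
    "google": "https://www.google.com/search?q={query}",
}

def search_url(engine, query):
    """Illustrative sketch: expand the engine's url template with the
    url-encoded query, for fetching through the proxy's UrlText support."""
    return SEARCH_TEMPLATES[engine].format(query=quote_plus(query))

print(search_url("duckduckgo", "llama.cpp tool calls"))
# https://html.duckduckgo.com/html/?q=llama.cpp+tool+calls
```

Keeping the engine choice as a user-editable template (rather than hardcoding one engine) means any engine whose results page is fetchable as text can be swapped in without code changes.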
Avoid code duplication by creating helpers for setup and toolcall. Also indicate which path will be used, when checking at runtime setup whether the simpleproxy.py server is running.
Missing DivStream caught. Logic ready for a distant future where one may allow the chat session to be switched even while an ai server / tool call response is pending. Avoids the unneeded removeChild before calling appendChild. Also found the reason why the DivStream was sometimes missing from DivChat: when I switched from the transient <p> element to the persistent session-specific <div> elements, I had forgotten to fully convert the flow, i.e. to replace elP.remove() with elP.replaceChildren. Retaining the debug log for the missing DivStream path for now, just in case, to cross check later that I haven't missed any other path.
Make DivStream hold a Role element and a Data element, and have the live received data go into the data element. Set some of the relevant classes on these, so that it is themed to match the other chat blocks, to some extent. Add a clear helper function to clean up as and when needed. NOTE: Remember to use this to get hold of the DivStream instance being worked on. NOTE: Don't forget that setting a parent on an HTMLElement won't automatically add it to the corresponding DOM with a parent-child relation; the new html element will just remain in memory, ignored by everyone else.
Add logic for hiding and showing and use them as needed.
Include the DivStream of the ExternalAi toolcall in the other chat session UIs, so that the user can see what the external_ai toolcall is doing without having to switch to the external ai session tab. Update the name of the external ai tool call session. Ensure the previous chat session is cleared for external ai tool calls, as the ai may sometimes call the external ai toolcall with the same system prompt, which won't trigger the autoclear logic for the corresponding chat session. TODO: In future, maybe provide an option to continue a previous chat session if the system prompt is unchanged for the external ai toolcall.
Update the external ai tool call description to indicate that it doesn't have access to the internet or to tool calls. Update the sys_date_time description to avoid the confusion some ai models had about whether the template string argument is optional or required, which wasted reasoning time; now simply state that it is a required argument, and suggest the internal default template as a useful one. Update the messages returned by the data store tool calls to be less verbose (i.e. avoid duplicating the key list or fetched key data) while also more informative (i.e. number of keys, data length). Update the readme.
TODO: Rather than maintaining a reference to Me within SimpleChat, maintain a duplicated instance of the Config class there, because that is all SimpleChat needs from Me. This should also avoid the separate data object which was created yesterday to accommodate the external_ai tool call.
Update SimpleChat to have a duplicated instance of Config instead of a reference to the global Me instance. Most logic has been updated to work with the config entries corresponding to the current chat session. ShowSettings, ShowInfo and getting the current sliding window size in user-friendly textual form have been moved from Me to Config. ALERT: With this, a user clicking settings will be able to modify only the settings of the chat session currently in focus. TODO: ToolsMgr related use within SimpleChat and beyond needs to be thought through and updated as needed. Should tools.proxyUrl be unique to each SimpleChat or ...
Given that settings now relate only to the current chat session, indicate the name/chatId of the current chat session in the Settings heading. This in turn makes the id dynamic, so change the css rule for the settings block from using an id to a classname, and to help with that also set a class name on the top level settings block. As part of this, and to help ensure sanity in general, add a helper to clean up a string into a form usable as a chatId.
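Such a chatId-cleanup helper can be sketched as below; the function name `to_chat_id` and the exact allowed character set are assumptions for illustration, not the actual helper in the client.

```python
import re

def to_chat_id(name):
    """Illustrative sketch: reduce an arbitrary session name to a form
    safe for use as a chatId / css class name (alphanumerics and '-')."""
    cleaned = re.sub(r"[^A-Za-z0-9-]+", "-", name).strip("-")
    return cleaned or "chat"

print(to_chat_id("My Session #2!"))  # "My-Session-2"
print(to_chat_id("!!!"))             # "chat" fallback for all-invalid names
```

Restricting to css-identifier-safe characters matters precisely because, as noted above, the id becomes dynamic and ends up in class names used by styling rules.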
Split init from setup for Tools. Init is called when the program is started, while setup is called whenever a session is created and/or whenever any tools-setup-related config is changed (like proxyUrl getting modified or so). This allows one to modify tool related config and in turn update the tool related state to match (e.g. if the proxyUrl changes, the simpleproxy server changes, so one needs to cross check which web services are available with the new simpleproxy server, and so on). To help with the above, ToolsManager now maintains multiple tool call switches, with a tc_switch per chat session. All tool related code, where needed, has been updated to work with the chat session for which it was called, instead of any global state. In future, this can enable letting the user enable or disable the tool calls exposed in each of the chat sessions. This immediately needed toolweb and toolai to account for the chat session as needed. In future, if we want to keep the data store isolated across chat sessions, this can also be useful, in that it can create unique store names for each chat session instead of sharing a common store name across all sessions. The chat session creation as well as tool init/setup related flows have been updated to chain through or account for the promises as needed.
Given that the ValidatedToolCallTrigger UI setup also has to set up the auto trigger logic, which depends on the autosecs value in the Config associated with the chat session involved, pass chatId to ShowMessage and in turn to the ValidatedToolCallTriggerUI setup logic. Rename the function to better match the semantics.
When switching to the settings ui, hide the User input and ValidateTC areas. When switching back to any chat session, unhide User input, while the ValidateTC ui will be handled by the corresponding helper logic called through ShowMessage.
Ensure the toolNames array is reset each time setup is called, so that it doesn't end up with duplicate entries, nor with entries for tool calls which are no longer available, maybe because some config changed etc. Ensure the ChatId is logged for the toolweb related setup actions. Ensure that the ExternalAi tool call's chat session has its tools config disabled at creation itself, so that the end user doesn't get confused, given that the external_ai toolcall explicitly forces tools support to disabled. Update some of the notes and the readme.
Rename the default 2 chat session names to make them neutral. Update the internal name to include AnveshikaSallap, better matching the semantics of the logic, given support for tool calling, vision, reasoning, ... Also update the usage note for simpleproxy.py regarding web access related tool calls, and a few other minor tweaks.
Ensure we are working with the chat messages in the chat session which are within the currently active sliding window. Remove any chat message blocks from the chat session ui which are no longer in the sliding window of context. This brings the uirefresh semantics in sync with the chat_show logic.
The ui will show only the messages which are within the client side sliding window of messages to be sent to the ai server, along with the last/latest ai handshake. NOTE: This was the original behaviour with the full-fledged ui show logic; it had changed with the newer optimized uirefresh flow, which only touched new messages added to the chat session. Now that flow also takes care of dropping older, outside-the-sliding-window messages from the ui.
Make a note on how the user can always view the full chat history if they want to.
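The sliding-window trimming described above can be sketched as follows. This is a hedged illustration in Python of the idea, not the client's actual logic; the treatment of the system prompt and the helper name `sliding_window` are assumptions.

```python
def sliding_window(messages, window):
    """Illustrative sketch: keep the system prompt (if any) plus the last
    `window` messages; older messages stay in the stored history but are
    neither sent to the server nor shown in the refreshed ui."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    return system + (rest[-window:] if window > 0 else rest)

msgs = [{"role": "system", "content": "be brief"}] + [
    {"role": "user", "content": f"q{i}"} for i in range(5)]
print(len(sliding_window(msgs, 2)))  # 3: system prompt + last 2 messages
```

Applying the same window to both the server handshake and the ui refresh is what keeps the two views consistent, as the commit above describes.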
Move existing readme.md into docs/details.md Similarly move the screenshot image into docs/
Gives a quick overview of the features, given that the original readme (now docs/details.md++) was created over a period of disjointed time as features got added.
Remove the unneeded ',' after the last entry. The import map isn't actually used currently, but certain entries are kept there for the future, more as a reminder.
By default, ensure the external_ai tool call's special chat session starts with tool calls disabled and a client side sliding window of 1. Add a helper in the SimpleChat class to set these along with clearing any chat history. In turn, the user now has the flexibility to change these from within the program if they need to, for whatever reason, until the program restarts.
With this PR and others in this series, the alternate tools/server/public_simplechat web client ui has been updated to support tool calls (including a bunch of builtin ones ready to use, without needing any additional setup), as well as reasoning and vision for ai models that support them. Also, each chat session has its own settings.
Using this client ui along with llama-server, one can get the local ai to fetch and summarise the latest news, get the latest research papers / details from arxiv / ... for a topic of interest and summarise them, generate javascript code snippets and test them out, use it to validate mathematical statements the ai might make and/or answer queries around these, or ... it's up to you and the ai model. The ai model can even call into system-prompt-based self-modified variants of itself, if it deems that necessary ...
Remember to cross-check tool calls before allowing their execution, and their responses before submitting them to the ai model, just to be on the safe side.
You always have access to the reasoning from ai models that support it. And for ai models that support vision, one can send images to explore them.
Look in the included readme.md for additional info. The immediately previous PR in the series is #17415.
One could get going with
build/bin/llama-server -m ../llama.cpp.models/gpt-oss-20b-mxfp4.gguf --jinja --path tools/server/public_simplechat/ --ctx-size 64000 --n-gpu-layers 12 -fa on
NOTE: Even the default context size should be good enough for simple stuff. Explicitly set --ctx-size as needed if working with many web site / pdf contents or so.
If one needs the additional power/flexibility afforded by the web search, web fetch and pdf related tool calls, then also run
cd tools/server/public_simplechat/local.tools; python3 ./simpleproxy.py --config simpleproxy.json
NOTE: Remember to edit simpleproxy.json with the list of sites you want to allow access to, as well as to disable local file access, if needed.
Hi @ggerganov
I hadn't noticed your response a few weeks back to my older PR in this series, as I was in the middle of exploring some of the features which I have added in this series of PRs.
If you look at this PR, you will notice that this alternate client ui continues to use a pure html + css + js based flow (also avoiding dependence on external libraries in general) and now supports reasoning, vision and tool calling (with a bunch of useful builtin client side tool calls needing no additional setup, ++). All of this within an uncompressed source code size of 300KB (including the python simpleproxy.py for web access and related tool calls). Also, the logical ui elements have their own unique id/class, if one wants to theme them.
Meanwhile the default web ui is around 1.2 MB or so compressed, requires one to understand the svelte framework (in addition to html/css/js), and requires tracking the different bundled modules. It also currently doesn't support tool calling, and the plan there leans more towards server side / back end MCP based tool call support, if I understand correctly.
Given the above significant differences, I feel it makes more sense to continue this updated lightweight alternate ui option within llama.cpp itself, parallel to the default webui. My embedded background also biases me towards simple yet flexible and functional options. Either way, the final decision on whether to bring these into llama.cpp itself is up to you and the team of open source developers who work on this proactively, rather than once-in-a-blue-moon me. Do let me know your thoughts.
NOTE: Also, when I revisited ai after almost a year++ wanting to explore some of the recent ai developments, I couldn't find any sensible zero-or-minimal-setup tool-calling-capable open source ai clients out there, including the default bundled web ui, so I started on this series of patches/PRs.