Local API or Gradio Client Support focus. #3

Closed
waefrebeorn opened this issue Mar 13, 2024 · 6 comments

Comments

@waefrebeorn

Gradio clients that run local language models, such as “OobaBooga”, and expose API support should be a major consideration in the roadmap process. Usable model swapping with cache functionality is feasible. I made an example chart months ago when I saw the potential of the Min-P greedy sampling that Kalomaze worked on: its token accuracy makes it helpful for memory-driven task recall.
[image: example chart]
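
For illustration, here is a minimal sketch of Min-P sampling as I understand Kalomaze's work (the threshold, temperature, and toy logits below are placeholder values, not settings from this project):

```python
import numpy as np

def min_p_sample(logits, min_p=0.05, temperature=1.0, rng=None):
    """Sample a token id using Min-P filtering.

    Keep only tokens whose probability is at least min_p times the
    probability of the most likely token, then renormalize and sample.
    """
    rng = rng or np.random.default_rng()
    logits = np.asarray(logits, dtype=np.float64) / temperature
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    keep = probs >= min_p * probs.max()   # threshold relative to the top token
    filtered = np.where(keep, probs, 0.0)
    filtered /= filtered.sum()
    return int(rng.choice(len(filtered), p=filtered))

# Example: a toy 5-token vocabulary
token_id = min_p_sample([2.0, 1.5, 0.2, -1.0, -3.0], min_p=0.1)
print(token_id)
```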

Please note that current projects like MemoryGPT allow API usage, but no widespread application allows for effective model swapping or multi-system offloading. It’s also important to note that a side-server “chain” of cheaper machines, or a GGML-focused network solution, could enable more garage labs.
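
As a rough sketch of what a side-server “chain” could look like, the snippet below round-robins completion requests across a few cheap machines; the worker addresses and the /completion request shape are assumptions (modeled on a llama.cpp-style GGML server), not an existing setup:

```python
import itertools
import requests

# Hypothetical pool of cheap "side server" machines, each running a
# GGML/llama.cpp-style HTTP server; adjust the endpoint and JSON body
# to whatever backend you actually run.
WORKERS = itertools.cycle([
    "http://192.168.1.10:8080",
    "http://192.168.1.11:8080",
    "http://192.168.1.12:8080",
])

def generate(prompt: str, n_predict: int = 128) -> str:
    """Send a completion request to the next worker in the chain."""
    worker = next(WORKERS)
    resp = requests.post(
        f"{worker}/completion",
        json={"prompt": prompt, "n_predict": n_predict},
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json().get("content", "")

print(generate("Summarize the last call transcript:"))
```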

Current roadblocks are memory management, non-useful hallucinations (effective hallucinations could generate better idea tokens in an agent context), and the lack of inter-model conversation solutions that are genuinely open source for system-prompting-style implementations.

The most feasible multi-model solution is to let most elements be CPU-offloaded, while features like live training, with one model doing RLHF on another, remain a “drop-in” option that requires a GPU with enough VRAM for training, unless a traditional RAM-based training solution becomes usable with a current base model such as Mistral.
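
For example, a hedged sketch of CPU/disk offloading with Hugging Face transformers plus accelerate, where only what fits stays on the GPU (the model ID and offload folder are just illustrative):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Layers that don't fit on the GPU are placed on CPU (and disk) automatically;
# requires the accelerate package to be installed.
model_id = "mistralai/Mistral-7B-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",          # split weights across GPU, CPU, and disk
    offload_folder="offload",   # spill what doesn't fit in RAM to disk
    torch_dtype=torch.float16,
)

inputs = tokenizer("Hello from a garage lab:", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=32)[0]))
```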

To summarize, a focus on API solutions such as ChatGPT or Claude will stagnate research on local language model feasibility. Creating a feasible framework for agent structures, plus LoRA-based live tuning for memory-retention elements on a version-based task list, will most likely be the best course.
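
A minimal sketch of what LoRA-based tuning for memory retention could look like with the peft library; the rank, target modules, and adapter path are illustrative choices, not a proposal for this repo:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Attach a small LoRA adapter to a frozen base model so only the adapter trains.
base = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")

lora_cfg = LoraConfig(
    r=8,                                 # low-rank adapter size
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"], # attention projections to adapt
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()       # only the adapter weights are trainable

# The adapter can be saved and swapped per task / memory snapshot:
model.save_pretrained("adapters/task-v1")
```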

@waefrebeorn
Author

Please note that my picture example is of a call-center agent system I designed in October 2023 that ended up not being used. The designed structure is a feasible alternative: a decision-management system managed by a central query system for each “console”, or emulated agent. Counting the cluster agents in the loop is my proposed measure of scale for the complexity of the task, with the central query system acting as the “database model” that is consistently improved upon; base model usage can be swapped out, and a LoRA imprint system creates the “readiness” for being in the system with minimal overhead.
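
To make the layout concrete, here is a hypothetical sketch of the console / central-query-system structure; all class and method names are mine, not taken from the original chart:

```python
from dataclasses import dataclass, field

@dataclass
class ConsoleAgent:
    """One emulated agent ("console") backed by a swappable base model + LoRA imprint."""
    name: str
    base_model: str
    lora_adapter: str | None = None

    def answer(self, query: str) -> str:
        return f"[{self.name}/{self.base_model}] draft answer to: {query}"

@dataclass
class CentralQuerySystem:
    """The "database model" that routes queries and measures task complexity
    by the number of cluster agents currently in the loop."""
    agents: list[ConsoleAgent] = field(default_factory=list)

    def complexity(self) -> int:
        return len(self.agents)

    def dispatch(self, query: str) -> list[str]:
        return [agent.answer(query) for agent in self.agents]

cqs = CentralQuerySystem([ConsoleAgent("billing", "mistral-7b", "adapters/billing-v1")])
print(cqs.complexity(), cqs.dispatch("Why was I charged twice?"))
```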

@braveokafor

Hi @emangamer,

How does a project like Ollama hold up for this use case?

@waefrebeorn
Author

> Hi @emangamer,
>
> How does a project like Ollama hold up for this use case?

Ollama has a REST API for running and managing models.
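 
For example, a minimal (non-streaming) call to that API, assuming a model such as mistral has already been pulled locally:

```python
import requests

# Ollama listens on port 11434 by default; run `ollama pull mistral` first.
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "mistral", "prompt": "Why is the sky blue?", "stream": False},
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["response"])
```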

You'd need a different project for training models; this looks to be a simple chat interface with prompt commands.

Gradio-based projects have set a marked standard in the AI space, and the versatile nature of the web environment allows things like Docker-based Google Colab use, greatly increasing availability for phone users as well, as seen with RVC voice synthesis.
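
As a small illustration, a bare-bones Gradio front end; the generate() function here is a stub standing in for whatever local backend (OobaBooga, llama.cpp, Ollama, ...) actually serves the model:

```python
import gradio as gr

def generate(prompt: str) -> str:
    # Placeholder: call your local model backend here.
    return f"(model output for: {prompt})"

demo = gr.Interface(fn=generate, inputs="text", outputs="text", title="Local LLM demo")
demo.launch()  # launch(share=True) gives a temporary public URL, handy on Colab/phones
```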

@braveokafor

Got it, I'll look into Gradio.

@waefrebeorn
Author

> Got it, I'll look into Gradio.

If you're looking for a user client that uses Gradio, I suggest OobaBooga; Gradio is an open-source web UI front end, not an AI model service. OpenDevin should have its interface in Gradio.

@huybery
Member

huybery commented Mar 27, 2024

@emangamer We're currently aiming for rapid prototyping (and won't consider using a complex framework for now), so feel free to discuss future architectural options with us on Slack.
