KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization
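KVQuant compresses the key/value cache that grows linearly with context length during LLM inference. As a rough illustration of the general idea only, here is a minimal round-to-nearest sketch of low-bit KV quantization with per-channel scales; it is not KVQuant's actual algorithm (which, per the paper, uses techniques such as per-channel key quantization and non-uniform datatypes), and the function names are hypothetical.

```python
# Generic sketch of low-bit KV cache quantization (round-to-nearest,
# per-channel scales). Illustrative only -- NOT KVQuant's method.
import torch

def quantize_kv(x: torch.Tensor, bits: int = 4):
    """Simulate uniform quantization of a KV tensor, one scale per channel."""
    qmax = 2 ** (bits - 1) - 1
    # Per-channel scale over the sequence dimension (dim=-2).
    scale = x.abs().amax(dim=-2, keepdim=True).clamp(min=1e-8) / qmax
    q = torch.clamp(torch.round(x / scale), -qmax - 1, qmax)
    return q.to(torch.int8), scale

def dequantize_kv(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    return q.float() * scale

# Example: quantize a fake key cache of shape (seq_len, head_dim).
keys = torch.randn(128, 64)
q, s = quantize_kv(keys, bits=4)
print("max abs error:", (dequantize_kv(q, s) - keys).abs().max().item())
```

Storing 4-bit integers plus a small per-channel scale tensor in place of fp16 activations is what makes multi-million-token contexts fit in memory.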
Use your open-source local model from the terminal.
A local AI search assistant (web or CLI) for ollama and llama.cpp. Lightweight and easy to run, providing a Perplexity-like experience.
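Tools like the two entries above typically talk to a locally running ollama server over its HTTP API. As a minimal sketch, the snippet below calls ollama's documented /api/generate endpoint on the default port 11434; the model name "llama3" is an assumption, so substitute whatever model you have pulled.

```python
# Minimal sketch: query a local ollama server via its HTTP API.
# Assumes ollama is running on the default port and that the model
# named below has already been pulled ("ollama pull llama3").
import json
import urllib.request

def ask_local_model(prompt: str, model: str = "llama3") -> str:
    payload = json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode("utf-8")
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # Non-streaming responses return the full text in "response".
        return json.loads(resp.read())["response"]

print(ask_local_model("Summarize what a KV cache is in one sentence."))
```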
A Thunderbird mail client extension that summarizes received emails via a locally run LLM. In early development.