KATI-LLAMA is an interface for chatting with Large Language Models locally on a private PC. A language model can be downloaded automatically from the settings and then used offline. The KATI application allows the user to communicate with an AI in a human-like manner: responses can be read aloud with a natural voice, and the AI's avatar image changes its appearance depending on the chatbot's mood. Below is a summary of KATI-LLAMA's features.
Key features of KATI:
- Talk to the AI without an internet connection
- Optional voice output with a voice pre-installed in the operating system or a natural-sounding TikTok voice (the TikTok voice requires an internet connection)
- Voice input (System Speech or Whisper)
- Dynamic avatar images that reflect the AI's emotions
- Chat history with filter and read-aloud functions
- Rating function for AI responses as an aid to the filter function
- Reduced wait times by streaming responses as they are generated (if the read-aloud function is active, output is held back until a complete sentence is available; see the sketch after the feature list)
- Text and code are formatted for better readability
- Multilingual user interface (DE, EN, FR, ES, PT, JA, KO)
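The combination of streaming and the read-aloud function can be pictured as a small buffering step between the model's token stream and the TTS engine. The following is a minimal illustrative sketch, not KATI's actual implementation; the token source and the sentence-end characters are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

class SentenceBufferingSketch
{
    // Collects streamed tokens and yields complete sentences, so a TTS engine
    // can start reading the first sentence while the rest is still streaming.
    static IEnumerable<string> BufferSentences(IEnumerable<string> tokenStream)
    {
        var buffer = new StringBuilder();
        var sentenceEnds = new[] { '.', '!', '?' }; // assumed sentence delimiters

        foreach (var token in tokenStream)
        {
            buffer.Append(token);
            int end;
            while ((end = buffer.ToString().IndexOfAny(sentenceEnds)) >= 0)
            {
                yield return buffer.ToString(0, end + 1).Trim();
                buffer.Remove(0, end + 1);
            }
        }

        if (buffer.Length > 0) // flush any trailing text without a terminator
            yield return buffer.ToString().Trim();
    }

    static void Main()
    {
        // Hypothetical tokens, roughly as they might arrive from the model.
        var tokens = new[] { "Hello", "!", " How", " can", " I", " help", " you", " today", "?" };

        foreach (var sentence in BufferSentences(tokens))
            Console.WriteLine(sentence); // here the sentence would be handed to TTS
    }
}
```

Without the read-aloud function, each token can be shown as soon as it arrives; with it, the buffering above is what causes the slight pauses between sentences.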
Demo videos:
- Demonstration of how to install and use KATI-LLAMA (German)
- Demonstration of how to use STT (speech-to-text) in KATI-LLAMA (German)
Third-party libraries and licenses:
- LLamaSharp (MIT License)
- ElectronNET.API (MIT License)
- Esprima (BSD 3-Clause License)
- LiteDB (MIT License)
- Microsoft.AspNetCore.SignalR.Client (MIT License)
- NAudio (License Info)
- Newtonsoft.Json (MIT License)
- System.Data.SQLite (Public Domain)
- System.Linq.Async (MIT License)
- System.Speech (MIT License)
- SoundTouch (License Info)
- WhisperNet (MPL-2.0 License)
Planned features:
- Add more language models for download in the settings

Known bugs:
- Can't find any bugs yet :)
Tips:
- Depending on the configured model, more or less RAM and processing power is required, which can affect how quickly the AI responds. Try a smaller model to see if the AI answers faster, but keep in mind that the smaller the model, the lower the quality of the responses.
- Slow output may also be caused by the configured processor setting. AVX is slower but is supported by most older processors; AVX2 has significantly lower latency, but not every processor supports it. Try chatting with AVX2 to see whether it works on your machine (the sketch after this list shows one way to check AVX2 support).
- If the read-aloud function is enabled, the program waits until a complete sentence is available before producing output. To minimize response time, disable the audio output; the response text will then be streamed without interruption.
- In general, the AI can take longer to answer when it finds little information related to a question. In that case, you can cancel the chat session and rephrase the question.
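Whether AVX2 is available can be checked before changing the setting. Below is a small, self-contained C# sketch (not part of KATI) that uses the .NET hardware intrinsics API to report what the local CPU supports; it requires .NET Core 3.0 or later.

```csharp
using System;
using System.Runtime.Intrinsics.X86;

class CpuFeatureCheck
{
    static void Main()
    {
        // Reports which x86 SIMD instruction sets the current CPU and runtime expose.
        // On non-x86 platforms both properties simply return false.
        Console.WriteLine($"AVX  supported: {Avx.IsSupported}");
        Console.WriteLine($"AVX2 supported: {Avx2.IsSupported}");
    }
}
```

If AVX2 is reported as supported, the AVX2 setting in KATI should generally be the faster choice.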