KATI-LLAMA is an interface for chatting with Large Language Models locally on a private PC. A language model can be downloaded automatically from the settings and then used offline. The KATI application allows the user to communicate with an AI in a human-like manner: responses can be read aloud with a natural voice, and the AI's avatar image changes its appearance depending on the chatbot's mood. Below is a summary of KATI-LLAMA's features.
Key features of KATI:
- Talk to the AI without an internet connection
- Optional voice output with a voice pre-installed in the operating system or a natural-sounding TikTok voice (the TikTok voice requires an internet connection)
- Voice input (System Speech or Whisper)
- Dynamic avatar images that reflect the AI's emotions
- Chat history with filter and read-aloud functions
- Rating function for AI responses as an aid to the filter function
- Reduced wait times by streaming responses as they are generated (if the read-aloud function is active, output is held back until a complete sentence is available; see the sketch after the feature list)
- Text and code are formatted for better readability
- Multilingual user interface (DE, EN, FR, ES, PT, JA, KO)
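The combination of streaming and the read-aloud function can be pictured as a small buffering step between the model's token stream and the TTS engine. The following is a minimal illustrative sketch, not KATI's actual implementation; the token source and the sentence-end characters are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

class SentenceBufferingSketch
{
    // Collects streamed tokens and yields complete sentences, so a TTS engine
    // can start reading the first sentence while the rest is still streaming.
    static IEnumerable<string> BufferSentences(IEnumerable<string> tokenStream)
    {
        var buffer = new StringBuilder();
        var sentenceEnds = new[] { '.', '!', '?' }; // assumed sentence delimiters

        foreach (var token in tokenStream)
        {
            buffer.Append(token);
            int end;
            while ((end = buffer.ToString().IndexOfAny(sentenceEnds)) >= 0)
            {
                yield return buffer.ToString(0, end + 1).Trim();
                buffer.Remove(0, end + 1);
            }
        }

        if (buffer.Length > 0) // flush any trailing text without a terminator
            yield return buffer.ToString().Trim();
    }

    static void Main()
    {
        // Hypothetical tokens, roughly as they might arrive from the model.
        var tokens = new[] { "Hello", "!", " How", " can", " I", " help", " you", " today", "?" };

        foreach (var sentence in BufferSentences(tokens))
            Console.WriteLine(sentence); // here the sentence would be handed to TTS
    }
}
```

Without the read-aloud function, each token can be shown as soon as it arrives; with it, the buffering above is what causes the slight pauses between sentences.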
Demo videos:
- Demonstration of how to install and use KATI-LLAMA (German)
- Demonstration of how to use STT (speech-to-text) in KATI-LLAMA (German)
Third-party libraries and licenses:
- LLamaSharp (MIT License)
- ElectronNET.API (MIT License)
- Esprima (BSD 3-Clause License)
- LiteDB (MIT License)
- Microsoft.AspNetCore.SignalR.Client (MIT License)
- NAudio (License Info)
- Newtonsoft.Json (MIT License)
- System.Data.SQLite (Public Domain)
- System.Linq.Async (MIT License)
- System.Speech (MIT License)
- SoundTouch (License Info)
- WhisperNet (MPL-2.0 License)
Planned features:
- Add more language models for download in the settings

Known bugs:
- Can't find any bugs yet :)
Tips:
- Depending on the configured model, more or less RAM and processing power is required, which can affect how quickly the AI responds. Try a smaller model to see if the AI answers faster, but keep in mind that the smaller the model, the lower the quality of the responses.
- Slow output may also be caused by the configured processor setting. AVX is slower but is supported by most older processors; AVX2 has significantly lower latency, but not every processor supports it. Try chatting with AVX2 to see whether it works on your machine (the sketch after this list shows one way to check AVX2 support).
- If the read-aloud function is enabled, the program waits until a complete sentence is available before producing output. To minimize response time, disable the audio output; the response text will then be streamed without interruption.
- In general, the AI can take longer to answer when it finds little information related to a question. In that case, you can cancel the chat session and rephrase the question.
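Whether AVX2 is available can be checked before changing the setting. Below is a small, self-contained C# sketch (not part of KATI) that uses the .NET hardware intrinsics API to report what the local CPU supports; it requires .NET Core 3.0 or later.

```csharp
using System;
using System.Runtime.Intrinsics.X86;

class CpuFeatureCheck
{
    static void Main()
    {
        // Reports which x86 SIMD instruction sets the current CPU and runtime expose.
        // On non-x86 platforms both properties simply return false.
        Console.WriteLine($"AVX  supported: {Avx.IsSupported}");
        Console.WriteLine($"AVX2 supported: {Avx2.IsSupported}");
    }
}
```

If AVX2 is reported as supported, the AVX2 setting in KATI should generally be the faster choice.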