Coding assistant is a lightweight llama.cpp wrapper for quantized local SLM deployment

License

Notifications You must be signed in to change notification settings

amrhas82/coding-assistant

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

25 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Local Coding Assistant

Run coding LLMs locally with llama.cpp. No GPU required.

Quick Start

./setup.sh                              # One-time: build llama.cpp
./download-model.sh qwen2.5-coder-7b    # Download model (~4GB)
./chat.sh                               # Chat in terminal

For OpenCode

./server.sh                             # Start API server

Then configure OpenCode to use http://127.0.0.1:8080/v1
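Many OpenAI-compatible clients can also be pointed at the server through the standard OpenAI environment variables. Whether OpenCode reads these is an assumption; check its documentation. A minimal sketch:

```shell
# Point an OpenAI-compatible client at the local llama.cpp server.
# The key is a placeholder; the server does not validate it unless
# started with an explicit --api-key.
export OPENAI_BASE_URL="http://127.0.0.1:8080/v1"
export OPENAI_API_KEY="sk-local-placeholder"
```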

See CLI_TOOLS_SETUP.md for setup details.

Switch Models

./download-model.sh                     # List available models
./download-model.sh qwen2.5-coder-3b    # Download another
nano config.sh                          # Change ACTIVE_MODEL
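The `nano` step can also be done non-interactively with `sed`, assuming `config.sh` sets the model via a line of the form `ACTIVE_MODEL="..."`. Shown here on a throwaway copy so nothing real is overwritten; in practice, run the `sed` line against `config.sh` itself:

```shell
# Demo: switch ACTIVE_MODEL without opening an editor.
# (GNU sed syntax; on macOS use `sed -i ''` instead of `sed -i`.)
echo 'ACTIVE_MODEL="qwen2.5-coder-7b"' > /tmp/demo-config.sh
sed -i 's/^ACTIVE_MODEL=.*/ACTIVE_MODEL="qwen2.5-coder-3b"/' /tmp/demo-config.sh
cat /tmp/demo-config.sh   # ACTIVE_MODEL="qwen2.5-coder-3b"
```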

Configuration

Edit config.sh:

| Setting      | Default          | Description      |
|--------------|------------------|------------------|
| ACTIVE_MODEL | qwen2.5-coder-7b | Model to use     |
| N_THREADS    | 6                | CPU threads      |
| CONTEXT_SIZE | 4096             | Context window   |
| TEMPERATURE  | 0.5              | Creativity (0-1) |
| SERVER_PORT  | 8080             | API port         |
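As a concrete sketch, a `config.sh` with the defaults above might look like this. The variable names come from the table; the exact file layout is an assumption:

```shell
# config.sh: runtime settings (defaults shown)
ACTIVE_MODEL="qwen2.5-coder-7b"  # Model to use
N_THREADS=6                      # CPU threads
CONTEXT_SIZE=4096                # Context window
TEMPERATURE=0.5                  # Creativity (0-1)
SERVER_PORT=8080                 # API port
```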

Files

| File              | Purpose          |
|-------------------|------------------|
| config.sh         | Runtime settings |
| models.conf       | Available models |
| setup.sh          | Build llama.cpp  |
| download-model.sh | Get models       |
| server.sh         | API server       |
| chat.sh           | Terminal chat    |

