A simple guide to get the llama.cpp chat API up and running using Python.
First, clone the llama.cpp repository from GitHub:
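git clone https://github.com/ggerganov/llama.cpp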
After cloning, navigate into the llama.cpp folder:
cd llama.cpp
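One step worth calling out: the server binary has to be built before it can be run. Depending on the llama.cpp version you cloned, running make (or make server) in the repository root should produce the ./server executable used in the next step.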
In the llama.cpp folder, start the server with the command below. Make sure to edit the path so it points to your ggml model file:
./server -m models/13b-chat/ggml-model-q4_0.bin -c 2048
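Before moving on to the Python wrapper, you can sanity-check the server with a few lines of Python. This is a minimal sketch, assuming the server is listening on llama.cpp's default address (127.0.0.1:8080) and exposing its /completion endpoint; the prompt and token count are just placeholders:

import requests  # third-party package: pip install requests

# Assumption: the server started above is on its default host and port.
response = requests.post(
    "http://127.0.0.1:8080/completion",
    json={"prompt": "Hello! How are you?", "n_predict": 64},
)
response.raise_for_status()
print(response.json()["content"])  # the generated text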
With the server running, open a new terminal and clone the Llama.cpp API Python repository:
git clone https://github.com/avinrique/Llama.cpp-api-python-
cd Llama.cpp-api-python-
# copy the helper script one level up, into the directory you cloned from
cp fetch_chatapi.py ./../
# start the Python app
python app.py
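For a sense of what a chat client on top of this looks like, here is a minimal sketch of a chat loop in plain Python. It is not the code from the repository above, only an illustration of the idea, and it assumes the same default /completion endpoint; real chat models usually expect a model-specific prompt template:

import requests

SERVER_URL = "http://127.0.0.1:8080/completion"  # assumed default server address

history = ""
while True:
    user_input = input("You: ")
    if user_input.strip().lower() in ("exit", "quit"):
        break
    # Build a simple transcript-style prompt; adapt the template to your model.
    history += f"User: {user_input}\nAssistant:"
    reply = requests.post(
        SERVER_URL,
        json={"prompt": history, "n_predict": 128, "stop": ["User:"]},
    ).json()["content"]
    history += reply + "\n"
    print("Assistant:" + reply)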