jittagornp/llm-setup

LLM Setup

Article

Install APISIX

cd apisix
docker compose up -d

Run LLM Model

docker run -d --runtime nvidia --gpus all \
-p 4001:30000 \
-v /data/models/.cache/huggingface:/root/.cache/huggingface \
-e "HF_TOKEN=<YOUR_HUGGING_FACE_TOKEN>" \
--ipc=host \
--network ai \
--restart always \
--name sglang-neuralmagic-Meta-Llama-3.1-70B-Instruct-FP8 \
lmsysorg/sglang:latest \
python3 -m sglang.launch_server \
--model-path neuralmagic/Meta-Llama-3.1-70B-Instruct-FP8 \
--host 0.0.0.0 --port 30000 \
--mem-fraction-static 0.5 \
--tp 2
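
Loading a 70B model can take several minutes, so requests sent right after `docker run` may fail. Below is a minimal sketch of a readiness check. It assumes the container publishes host port 4001 as above and that the SGLang server answers on a `/health` path (an assumption; adjust if your version differs). The helper name `wait_for_server` and the retry defaults are hypothetical.

```shell
# Hypothetical helper: poll a URL until it returns a successful HTTP status
# or the attempts run out. Not part of SGLang or APISIX.
#   $1: URL to poll
#   $2: maximum number of attempts (default 60)
#   $3: seconds to sleep between attempts (default 10)
wait_for_server() {
  wfs_url="$1"
  wfs_attempts="${2:-60}"
  wfs_delay="${3:-10}"
  wfs_i=1
  while [ "$wfs_i" -le "$wfs_attempts" ]; do
    # -f: treat HTTP errors as failure, -s: silent, -o /dev/null: drop the body
    if curl -fs -o /dev/null "$wfs_url"; then
      echo "ready after $wfs_i attempt(s)"
      return 0
    fi
    sleep "$wfs_delay"
    wfs_i=$((wfs_i + 1))
  done
  echo "not ready after $wfs_attempts attempt(s)" >&2
  return 1
}
```

For example, `wait_for_server http://localhost:4001/health` polls the published port directly, bypassing APISIX.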

Call the LLM API via APISIX with curl

curl http://localhost/llm/neuralmagic/Meta-Llama-3.1-70B-Instruct-FP8/v1/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer 1234" \
-d '{
     "model": "neuralmagic/Meta-Llama-3.1-70B-Instruct-FP8",
     "prompt": "What is an LLM?",
     "temperature": 0.7
   }'
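
The request above can be wrapped in a small function so the prompt and API key are not hard-coded. This is only a sketch: `llm_complete`, `LLM_BASE_URL`, and `LLM_API_KEY` are hypothetical names, the defaults mirror the curl command above, and `max_tokens` is the standard OpenAI-style completions parameter, which the server is assumed to accept.

```shell
# Illustrative defaults taken from the curl command above.
MODEL="neuralmagic/Meta-Llama-3.1-70B-Instruct-FP8"
LLM_BASE_URL="${LLM_BASE_URL:-http://localhost/llm/$MODEL}"
LLM_API_KEY="${LLM_API_KEY:-1234}"

# Hypothetical helper: send one prompt to the completions endpoint behind APISIX.
#   $1: prompt text (must not contain double quotes or backslashes, because
#       it is pasted into the JSON body without escaping)
#   $2: max_tokens (optional, defaults to 128)
llm_complete() {
  curl -s "$LLM_BASE_URL/v1/completions" \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $LLM_API_KEY" \
    -d "{\"model\": \"$MODEL\", \"prompt\": \"$1\", \"temperature\": 0.7, \"max_tokens\": ${2:-128}}"
}
```

For example, `llm_complete "What is an LLM?" 64` prints the raw JSON response. Set `LLM_API_KEY` in the environment before sourcing the snippet to use a different key.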

Custom SGLang Docker image

Dockerfile

FROM lmsysorg/sglang:latest

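# Install python-multipart, the package FastAPI/Starlette uses to parse
# form and file-upload (multipart) requests
RUN pip install --no-cache-dir python-multipart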

Build image

docker build -t lotuss/sglang .
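
To serve the model from the custom image, the earlier `docker run` command can be reused with the image name swapped. The wrapper below is a sketch: `run_custom_sglang` and the default container name `sglang-custom` are hypothetical, and `HF_TOKEN` is expected to be set in the environment.

```shell
# Hypothetical wrapper: start the model server from the custom image.
#   $1: container name (default: sglang-custom, an illustrative name)
#   $2: image (default: lotuss/sglang, the tag built above)
run_custom_sglang() {
  docker run -d --runtime nvidia --gpus all \
    -p 4001:30000 \
    -v /data/models/.cache/huggingface:/root/.cache/huggingface \
    -e "HF_TOKEN=${HF_TOKEN:?set HF_TOKEN first}" \
    --ipc=host \
    --network ai \
    --restart always \
    --name "${1:-sglang-custom}" \
    "${2:-lotuss/sglang}" \
    python3 -m sglang.launch_server \
    --model-path neuralmagic/Meta-Llama-3.1-70B-Instruct-FP8 \
    --host 0.0.0.0 --port 30000 \
    --mem-fraction-static 0.5 \
    --tp 2
}
```

Because this publishes the same host port 4001, stop and remove the earlier container first, for example with `docker rm -f sglang-neuralmagic-Meta-Llama-3.1-70B-Instruct-FP8`. If the APISIX upstream refers to the container by name, pass the original name as the first argument so the route keeps working.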
