This repository was archived by the owner on Jul 4, 2025. It is now read-only.

Description
Problem
- I am trying to integrate the Jan RAG feature using Nitro embedding at POST http://localhost:3928/v1/embeddings
- It supports input in the payload as a string by default. However, the application requires batch embedding, and for that Nitro returns a 500.
Success Criteria
- Load model with embedding true
curl --location 'http://127.0.0.1:3928/inferences/llamacpp/loadmodel' \
  --header 'Content-Type: application/json' \
  --data '{
    "llama_model_path": "/Users/hiro/Downloads/ggml-model-q4_k.gguf",
    "ctx_len": 2048,
    "ngl": 100,
    "cont_batching": false,
    "embedding": true,
    "system_prompt": "",
    "user_prompt": "\n### Instruction:\n",
    "ai_prompt": "\n### Response:\n"
  }'
- Try to call the embedding endpoint with input as an array
curl --location 'http://localhost:3928/v1/embeddings' \
  --header 'Content-Type: application/json' \
  --header 'Accept: text/event-stream' \
  --header 'Access-Control-Allow-Origin: *' \
  --data '{
    "input": ["Hello", "Nam", "here"],
    "model": "embedding",
    "encoding_format": "float"
  }'
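Until array input is accepted, a possible client-side workaround is to send one request per string, since the report above says a plain string input works. The following is only a rough sketch under that assumption: build_payloads and embed_one_by_one are hypothetical helper names, not part of Nitro or Jan, and the response body is returned as parsed JSON without assuming its shape.

```python
import json
import urllib.request

# Endpoint taken from the report above.
EMBEDDINGS_URL = "http://localhost:3928/v1/embeddings"


def build_payloads(inputs, model="embedding", encoding_format="float"):
    """Return one request body per input string (hypothetical helper).

    Accepts either a single string or a list of strings, so callers can
    pass the same value they would have put in the batch request.
    """
    if isinstance(inputs, str):
        inputs = [inputs]
    return [
        {"input": text, "model": model, "encoding_format": encoding_format}
        for text in inputs
    ]


def embed_one_by_one(inputs, url=EMBEDDINGS_URL):
    """POST each single-string payload separately and collect the parsed
    JSON responses in input order. Assumes the server is running and the
    model was loaded with embedding enabled."""
    responses = []
    for payload in build_payloads(inputs):
        request = urllib.request.Request(
            url,
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        with urllib.request.urlopen(request) as response:
            responses.append(json.loads(response.read().decode("utf-8")))
    return responses
```

Calling embed_one_by_one(["Hello", "Nam", "here"]) would issue three sequential requests. This costs one round trip per string, so it is a stopgap for small batches rather than a replacement for real array support on the server.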
Additional context