This repository was archived by the owner on Jul 4, 2025. It is now read-only.

feat: Support for batch embedding #371

@hiro-v

Description
Problem

  • I am trying to integrate Jan's RAG feature using Nitro's embedding endpoint at POST http://localhost:3928/v1/embeddings
  • The endpoint accepts "input" as a single string by default, but the application requires batch embedding; when "input" is an array, Nitro returns a 500 error.
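Until array input is supported server-side, a client can split a batch payload into one single-string request per item. A minimal sketch of that workaround; the `split_batch` helper is hypothetical and not part of Nitro or Jan:

```python
import json

def split_batch(payload):
    """Split an embeddings payload whose "input" may be a string or a
    list of strings into one single-string payload per item.
    Client-side workaround while array input yields a 500."""
    inputs = payload["input"]
    if isinstance(inputs, str):
        inputs = [inputs]
    return [dict(payload, input=text) for text in inputs]

batch = {"input": ["Hello", "Nam", "here"],
         "model": "embedding",
         "encoding_format": "float"}

requests_to_send = split_batch(batch)
# Each element is a single-string payload of the shape Nitro accepts today:
print(json.dumps(requests_to_send[0]))
```

Each resulting payload can then be POSTed to /v1/embeddings individually, at the cost of one round trip per input string.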

Success Criteria

  • Load the model with "embedding" set to true:
curl --location 'http://127.0.0.1:3928/inferences/llamacpp/loadmodel' \
--header 'Content-Type: application/json' \
--data '{
   "llama_model_path": "/Users/hiro/Downloads/ggml-model-q4_k.gguf",
   "ctx_len": 2048,
   "ngl": 100,
   "cont_batching": false,
   "embedding": true,
   "system_prompt": "",
   "user_prompt": "\n### Instruction:\n",
   "ai_prompt": "\n### Response:\n"
 }'
  • Call the embedding endpoint with "input" as an array:
curl --location 'http://localhost:3928/v1/embeddings' \
--header 'Content-Type: application/json' \
--header 'Accept: text/event-stream' \
--header 'Access-Control-Allow-Origin: *' \
--data '{
    "input": ["Hello", "Nam", "here"],
    "model": "embedding",
    "encoding_format": "float"
}'
  • The server currently returns 500; the request above should instead succeed and return one embedding per input string.
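For reference, an OpenAI-compatible /v1/embeddings response to array input would be expected to carry one "data" entry per input string, indexed in request order, roughly as below (all values are illustrative, and the embedding vectors are elided):

```json
{
  "object": "list",
  "data": [
    { "object": "embedding", "index": 0, "embedding": ["..."] },
    { "object": "embedding", "index": 1, "embedding": ["..."] },
    { "object": "embedding", "index": 2, "embedding": ["..."] }
  ],
  "model": "embedding",
  "usage": { "prompt_tokens": 3, "total_tokens": 3 }
}
```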
