Speed up Text Embedding Service with Optimized quantized minilm Model #101

tybalex · 2024-02-24T03:34:00Z

What this PR includes:

Use an optimized and quantized embedding model, which should make the embedding service roughly 3x faster.
Enable logging in the vector_db service. @sangee2004
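For reviewers who want to see what "enabling logging" could look like, here is a minimal sketch using only Python's standard `logging` module. It is not the code from this PR: the logger name `"vector_db"`, the helper name `configure_logging`, and the format string are all assumptions chosen for illustration.

```python
import logging
import sys


def configure_logging(level: int = logging.INFO) -> logging.Logger:
    """Attach a single stdout handler to a named service logger.

    The logger name "vector_db" is a hypothetical choice for this sketch.
    Calling the function more than once does not add duplicate handlers.
    """
    logger = logging.getLogger("vector_db")
    logger.setLevel(level)
    if not logger.handlers:
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
        )
        logger.addHandler(handler)
    # Avoid double output if the root logger also has handlers configured.
    logger.propagate = False
    return logger


if __name__ == "__main__":
    log = configure_logging()
    log.info("vector_db service started")
```

Guarding on `logger.handlers` keeps the setup idempotent, which matters when a service module is imported from several entry points.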

This fix will help PR #54.
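One way to check the roughly 3x claim locally is to time the old and new embedders on the same inputs. The harness below is a sketch under stated assumptions: `median_latency`, `speedup`, and the stand-in embedders in the demo are hypothetical names, and an embedder is assumed to be any callable that maps a list of strings to a list of vectors. It does not load the actual models from this PR.

```python
import statistics
import time
from typing import Callable, List, Sequence

# Assumed interface: an embedder takes texts and returns one vector per text.
Embedder = Callable[[Sequence[str]], List[List[float]]]


def median_latency(
    embed: Embedder,
    texts: Sequence[str],
    repeats: int = 5,
    timer: Callable[[], float] = time.perf_counter,
) -> float:
    """Return the median time, in timer units, of one embed(texts) call."""
    embed(texts)  # warm-up call, excluded from the measurements
    timings = []
    for _ in range(repeats):
        start = timer()
        embed(texts)
        timings.append(timer() - start)
    return statistics.median(timings)


def speedup(
    baseline: Embedder,
    candidate: Embedder,
    texts: Sequence[str],
    repeats: int = 5,
    timer: Callable[[], float] = time.perf_counter,
) -> float:
    """Return baseline latency divided by candidate latency."""
    base = median_latency(baseline, texts, repeats, timer)
    cand = median_latency(candidate, texts, repeats, timer)
    return base / cand


if __name__ == "__main__":
    # Stand-in embedders that only sleep; replace with real model calls.
    def slow_embed(texts: Sequence[str]) -> List[List[float]]:
        time.sleep(0.03)
        return [[0.0] for _ in texts]

    def fast_embed(texts: Sequence[str]) -> List[List[float]]:
        time.sleep(0.01)
        return [[0.0] for _ in texts]

    ratio = speedup(slow_embed, fast_embed, ["hello world"] * 8)
    print(f"measured speedup: {ratio:.2f}x")
```

The median is used instead of the mean so a single slow run (for example, a cold cache) does not skew the result, and the `timer` parameter lets the harness be driven by a fake clock in tests.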


cloudflare-workers-and-pages · 2024-02-24T03:34:46Z

Deploying with Cloudflare Pages

Latest commit:	`b97aa3e`
Status:	✅ Deploy successful!
Preview URL:	https://e34b5c37.rubra.pages.dev
Branch Preview URL:	https://optimized-quantized-minilm.rubra.pages.dev

tybalex added 4 commits February 23, 2024 18:24

update embedding service to use optimum optimized and quantized minilm embedding model, which is on average 3x faster.

459a31a

download model from huggingface

a5b5ad0

Merge branch 'main' into optimized-quantized-minilm

1fb5d43

config logger properly

b91b500

tybalex requested review from sanjay920 and sangee2004 February 24, 2024 03:34

tybalex self-assigned this Feb 24, 2024

tybalex mentioned this pull request Feb 24, 2024

knowledge retrieval embedding model is taking long to index. #54

Closed

better naming

b97aa3e

sanjay920 closed this Jun 27, 2024

sanjay920 deleted the optimized-quantized-minilm branch June 27, 2024 04:36