Let me first congratulate everyone working on this for:

- Python bindings for llama.cpp
- Making them compatible with OpenAI's API
- Superb documentation!
I was wondering if anyone can help me get this working with BLAS. Right now, when the model loads, I see `BLAS = 0` in the output.
I've been using kobold.cpp, which has a compile-time flag that enables BLAS. It cuts prompt processing time by 3-4x, which is a major factor when handling longer prompts and chat-style messages.
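For context, this is the kind of source reinstall I tried; the `CMAKE_ARGS` flag names here are an assumption based on llama.cpp's build options and may differ by version, so please correct me if the current ones are different:

```shell
# Rebuild llama-cpp-python from source with an OpenBLAS backend.
# NOTE: flag names (LLAMA_BLAS, LLAMA_BLAS_VENDOR) are assumed from
# llama.cpp's CMake options and may vary across releases.
CMAKE_ARGS="-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS" \
  pip install --force-reinstall --no-cache-dir llama-cpp-python
```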
P.S. I was also wondering: what is the difference between `create_embedding(input)` and `embed(input)`?
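My guess (please correct me) is that `create_embedding()` returns an OpenAI-style response dict while `embed()` returns the raw vectors. Here is a small illustration of that assumed difference, using a hypothetical response dict rather than the library itself, with the shape taken from OpenAI's `/v1/embeddings` format:

```python
# Hypothetical OpenAI-style embedding response (shape assumed from
# OpenAI's /v1/embeddings API; not produced by the library here).
openai_style_response = {
    "object": "list",
    "data": [
        {"object": "embedding", "index": 0, "embedding": [0.1, 0.2, 0.3]},
    ],
    "model": "example-model",
    "usage": {"prompt_tokens": 3, "total_tokens": 3},
}

def vectors_from_response(resp: dict) -> list[list[float]]:
    """Extract the raw vectors from an OpenAI-style embedding response."""
    return [item["embedding"] for item in resp["data"]]

# If the guess is right, embed() would hand you this list directly,
# skipping the surrounding response envelope.
print(vectors_from_response(openai_style_response))  # [[0.1, 0.2, 0.3]]
```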