# Embedding-Models
Embedding models are a variety of model, in some sense more old-fashioned, where the goal is not to generate a higher-level end product but merely to output a vector representation (an embedding) of the input.

Though this may seem like only half a model, it is an extremely powerful kind of model and tool.

Fortunately, though much more attention is paid to chatbot-oriented contexts, tools are available that make using GGUF-format embedding models highly practical.

## To run in a local Jupyter notebook:

```bash
python3 -m venv env; source env/bin/activate
python3 -m pip install --upgrade pip
python3 -m pip install jupyter
python3 -m pip install llama-cpp-python
jupyter notebook
```
Or, if the packages are already in a requirements.txt:
```bash
python3 -m venv env; source env/bin/activate
python3 -m pip install --upgrade pip
python3 -m pip install -r requirements.txt
jupyter notebook
```

- Model page: https://huggingface.co/CompendiumLabs/bge-large-en-v1.5-gguf
- Model files: https://huggingface.co/CompendiumLabs/bge-large-en-v1.5-gguf/tree/main
- llama-cpp-python on PyPI: https://pypi.org/project/llama-cpp-python/

# Download a model from Hugging Face
Download a model from Hugging Face into a clearly named directory (not hidden away somewhere on your system).
This is just one file; you can download most Hugging Face files in a similar way, or use the download link on the page.

From the model page, you can usually find a link to 'Files'.

In [None]:
!wget https://huggingface.co/CompendiumLabs/bge-large-en-v1.5-gguf/resolve/main/bge-large-en-v1.5-q4_k_m.gguf?download=true
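
As an alternative to `wget`, the same download can be sketched in plain Python using only the standard library. The helper names `resolve_url` and `download_model` are hypothetical (they are not part of any library), and the URL pattern is simply the one used in the `wget` command above; saving under an explicit file name avoids the `?download=true` suffix.

```python
from pathlib import Path
from urllib.request import urlretrieve


def resolve_url(repo_id, filename, revision="main"):
    """Build a direct-download URL following the pattern of the wget command above."""
    return f"https://huggingface.co/{repo_id}/resolve/{revision}/{filename}"


def download_model(repo_id, filename, dest_dir="."):
    """Download `filename` into `dest_dir` under a clean name; skip if already present."""
    dest = Path(dest_dir) / filename
    if not dest.exists():
        dest.parent.mkdir(parents=True, exist_ok=True)
        urlretrieve(resolve_url(repo_id, filename), str(dest))
    return dest


# Example call, left commented out because it needs network access:
# model_file = download_model(
#     "CompendiumLabs/bge-large-en-v1.5-gguf",
#     "bge-large-en-v1.5-q4_k_m.gguf",
# )
```

Because the destination name is chosen explicitly, the file arrives as a plain `.gguf` file and needs no renaming afterwards.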

### Note
You may need to remove a suffix from the downloaded file so that it is a .gguf file.

E.g.
```
bge-large-en-v1.5-q4_k_m.gguf   vs.   'bge-large-en-v1.5-q4_k_m.gguf?download=true'
```
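
If the file was saved with the suffix, the rename can be done from the notebook. This is a minimal sketch; `strip_query_suffix` is a hypothetical helper name, and it assumes the unwanted part is everything from the first `?` in the file name onward.

```python
from pathlib import Path


def strip_query_suffix(path):
    """Rename 'name.gguf?download=true' to 'name.gguf' and return the final path."""
    path = Path(path)
    target = path.with_name(path.name.split("?", 1)[0])
    if target == path:
        return path  # no '?' in the name, nothing to do
    if target.exists():
        return target  # a clean copy already exists; leave both files alone
    path.rename(target)
    return target


# Clean up any suffixed .gguf downloads in the current directory.
for p in Path(".").iterdir():
    if ".gguf?" in p.name:
        print(strip_query_suffix(p))
```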

#### Look at your files

In [None]:
!ls

 bge-large-en-v1.5-q4_k_m.gguf	'bge-large-en-v1.5-q4_k_m.gguf?download=true'   sample_data


## Select your model
GGUF files of various sizes, from 32-bit float down to 2-bit quantized,
may be available, or you can fairly easily make your own with llama.cpp:

- https://github.com/ggerganov/llama.cpp

In some cases, many such files will already be available on Hugging Face:

- https://huggingface.co/CompendiumLabs/bge-large-en-v1.5-gguf/tree/main

In [None]:
model_path = "bge-large-en-v1.5-f32.gguf"

In [None]:
model_path = "bge-large-en-v1.5-q4_k_m.gguf"
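
Before loading, it can help to confirm which `.gguf` files are actually present, since a file that still carries the `?download=true` suffix will not match. The sketch below lists them smallest first; `find_gguf_models` is a hypothetical helper name, and sorting by file size is just one convenient way to compare quantization levels.

```python
from pathlib import Path


def find_gguf_models(directory="."):
    """Return the .gguf files in `directory`, sorted from smallest to largest."""
    files = [
        p for p in Path(directory).iterdir()
        if p.is_file() and p.suffix == ".gguf"
    ]
    return sorted(files, key=lambda p: p.stat().st_size)


# Print each candidate model with its size in bytes.
for candidate in find_gguf_models("."):
    print(candidate.name, candidate.stat().st_size)
```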

In [None]:
!pip install -q llama-cpp-python

## Select your text or document

In [None]:
text_to_embed = "hello world"

In [None]:
from llama_cpp import Llama

# Load the GGUF file with embedding output enabled
model = Llama(model_path, embedding=True)
# Compute the embedding of the input text
embed = model.embed(text_to_embed)

llama_model_loader: loaded meta data with 24 key-value pairs and 389 tensors from bge-large-en-v1.5-q4_k_m.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = bert
llama_model_loader: - kv   1:                               general.name str              = bge-large-en-v1.5
llama_model_loader: - kv   2:                           bert.block_count u32              = 24
llama_model_loader: - kv   3:                        bert.context_length u32              = 512
llama_model_loader: - kv   4:                      bert.embedding_length u32              = 1024
llama_model_loader: - kv   5:                   bert.feed_forward_length u32              = 4096
llama_model_loader: - kv   6:                  bert.attention.head_count u32              = 16
llama_model_loader: - kv   7:          bert.attention.layer_norm_epsilon f32

In [None]:
print( len(embed) )

1024


In [None]:
print( type(embed) )

<class 'list'>
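
Because the embedding is a plain Python list of floats, two embeddings can be compared without any extra libraries. The sketch below uses cosine similarity, which is a common choice for comparing embeddings but is an assumption here, not something this notebook prescribes. The short toy vectors stand in for real 1024-dimensional outputs of `model.embed`.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors of floats."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for embeddings (hypothetical values).
same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
print(same_direction, orthogonal)
```

With real data, the two arguments would be the lists returned by separate `model.embed(...)` calls on the texts being compared.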


In [None]:
print(embed)

[0.03165217461323325, 0.033889619719268134, 0.01912820186619531, -0.032772243958794404, -0.01931842989562261, -0.008072671940074674, 0.025432156143579653, 0.04622867630988473, 0.047650769445096525, 0.02189269463983589, 0.00614942865728535, 0.01783864372734062, -0.014499344345420929, 0.01249645163180539, -0.034944146301901395, 0.023229510308671104, -0.021632539498691625, -0.02178636476955489, -0.05115432617837057, 0.035607083112478774, -0.017893531940869328, 0.008966290184070289, -0.08713871655900453, -0.03908880708937072, -0.024282553194817195, 0.0432239051363899, 0.03500900112251344, 0.00015309711301403058, 0.04670999292647939, 0.044311901310598714, -0.022041661532005888, 0.005089308265087585, -0.006032462634072559, -0.04189450966128981, -0.011056442048643887, -0.031939642091096294, 0.03824428773339507, -0.019390034080646868, -0.02618023009399158, -0.03659285995730876, 0.032068804116546584, 0.006792712478955299, 0.030363039165303963, -0.046252158758785425, -0.059326611994650245, -0.01