### Convert the model to GGUF format

In this notebook we save the model in the GGUF format. GGUF is a file format for storing models for inference with GGML, a tensor library developed for machine learning.

You can learn more about the format in the [GGUF specification](https://github.com/ggerganov/ggml/blob/master/docs/gguf.md).
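To make the file format a little more concrete, here is a minimal sketch that reads the fixed-size header at the start of a GGUF file. It assumes a little-endian file of GGUF version 2 or later, where the `GGUF` magic bytes are followed by a 32-bit version and two 64-bit counts. The helper name `read_gguf_header` is hypothetical and not part of any library.

```python
import struct

GGUF_MAGIC = b"GGUF"  # first four bytes of every GGUF file


def read_gguf_header(path):
    """Return (version, tensor_count, metadata_kv_count) from a GGUF file.

    Assumption: little-endian file, GGUF version 2 or later
    (earlier versions used 32-bit counts).
    """
    with open(path, "rb") as f:
        header = f.read(24)
    if len(header) < 24 or header[:4] != GGUF_MAGIC:
        raise ValueError(f"{path} does not look like a GGUF file")
    # uint32 version, uint64 tensor count, uint64 metadata key/value count
    version, tensor_count, kv_count = struct.unpack("<IQQ", header[4:24])
    return version, tensor_count, kv_count
```

Once the conversion at the end of this notebook has run, calling `read_gguf_header(gguf_16_bits_path)` is a quick sanity check that the output is a GGUF file.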

In [None]:
from huggingface_hub import snapshot_download

In [None]:
model_name = 'Lajavaness/bilingual-embedding-large'

In [None]:
from pathlib import Path

In [None]:
model_repository = Path.cwd().joinpath("models")

In [None]:
model_repository.exists()

In [None]:
model_path = model_repository.joinpath(model_name)

### Download the model 

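Run the cell below to download the model. Because `force_download=True` fetches the files again on every run, comment the cell out once the model has been downloaded.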

In [None]:
snapshot_download(repo_id=model_name, local_dir=model_path,
                  force_download=True, revision="main")
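Before converting, it can help to confirm that the download produced the files the conversion step reads. The sketch below only checks for `config.json`, which the Hugging Face model layout places at the top of the model directory. The helper name `missing_files` and the default list of required files are assumptions for illustration, so extend the list to match your model.

```python
from pathlib import Path


def missing_files(model_dir, required=("config.json",)):
    """Return the names in `required` that are not present in `model_dir`.

    Assumption: `required` is an illustrative list, not an official
    specification of what the conversion script needs.
    """
    model_dir = Path(model_dir)
    return [name for name in required if not model_dir.joinpath(name).is_file()]
```

For example, `missing_files(model_path)` should return an empty list after a successful download.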

After downloading the model, we need to convert it to a GGUF file, which is the file format used by llama.cpp.

In [None]:
gguf_32_bits_path = model_path.parent.joinpath(f"{model_name.split('/')[0]}_32.gguf")
gguf_16_bits_path = model_path.parent.joinpath(f"{model_name.split('/')[0]}_16.gguf")
assert gguf_32_bits_path.parent.exists()
assert gguf_16_bits_path.parent.exists()

In [None]:
llama_cpp_path = Path.cwd().parent.joinpath("llama.cpp")
convert_script_path = str(llama_cpp_path.joinpath("convert_hf_to_gguf.py"))

In [None]:
!python $convert_script_path $model_path --outfile $gguf_16_bits_path --outtype f16
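The shell command above can also be run from plain Python, which avoids quoting problems when a path contains spaces and makes it easy to produce the 32-bit file as well. This is a minimal sketch: the helper name `build_convert_command` is hypothetical, and it only uses the `--outfile` and `--outtype` flags already shown above, with `f32` assumed as the value for the 32-bit output.

```python
import subprocess
import sys


def build_convert_command(script, model_dir, outfile, outtype="f16"):
    """Build the argument list for the llama.cpp conversion script.

    Passing a list to subprocess.run bypasses the shell, so paths with
    spaces need no extra quoting.
    """
    return [
        sys.executable,          # the Python interpreter running this notebook
        str(script),
        str(model_dir),
        "--outfile", str(outfile),
        "--outtype", outtype,
    ]


# Usage with the variables defined earlier in this notebook:
# subprocess.run(
#     build_convert_command(convert_script_path, model_path, gguf_16_bits_path),
#     check=True,  # raise CalledProcessError if the conversion fails
# )
# subprocess.run(
#     build_convert_command(convert_script_path, model_path, gguf_32_bits_path, "f32"),
#     check=True,
# )
```

Using `check=True` makes a failed conversion raise an exception instead of passing silently, which the `!python` form does not do.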