## AI disclosure

This notebook was written after extensive troubleshooting while converting our merged model to GGUF format. ChatGPT helped us identify and resolve several of the issues we ran into along the way. The notebook documents a complete, working workflow: download our merged FP16 model from Hugging Face, convert it to GGUF, and quantize it for CPU inference with tools such as llama.cpp or GPT4All.

In [None]:
%%capture
!pip install unsloth
# Also get the latest nightly Unsloth!
!pip uninstall unsloth -y && pip install --upgrade --no-cache-dir --no-deps git+https://github.com/unslothai/unsloth.git@nightly git+https://github.com/unslothai/unsloth-zoo.git

In [None]:
from huggingface_hub import snapshot_download

local_model = snapshot_download(
    repo_id="Jeppcode/ScalableLab2",
    allow_patterns=["merged-model-fp16/*"]
)

print(local_model)
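
`snapshot_download` returns the snapshot folder, and the model files sit in the `merged-model-fp16` subfolder inside it. A minimal sketch of resolving that subfolder and checking that it looks like a Hugging Face model folder is shown below. The helper name `find_model_dir` is hypothetical, and the `config.json` check is an assumption about what a complete folder contains.

```python
from pathlib import Path


def find_model_dir(snapshot_root, subdir="merged-model-fp16"):
    """Return the model folder inside a downloaded snapshot.

    Hypothetical helper: raises if the folder has no config.json,
    which we assume every complete model folder contains.
    """
    model_dir = Path(snapshot_root) / subdir
    if not (model_dir / "config.json").is_file():
        raise FileNotFoundError(f"no config.json under {model_dir}")
    return model_dir
```

In this notebook it could be called as `find_model_dir(local_model)`, which avoids copying the snapshot hash by hand.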


In [None]:
import os

for root, dirs, files in os.walk(local_model, topdown=True):
    for name in files:
        print(os.path.join(root, name))
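
Listing every path works, but a per-extension size summary can make it easier to see whether the weight files actually arrived. The following is a sketch only; `size_by_extension` is a hypothetical helper, not part of `huggingface_hub`.

```python
import os
from collections import defaultdict


def size_by_extension(root):
    """Sum file sizes in bytes under root, grouped by file extension."""
    totals = defaultdict(int)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            # Files without an extension are grouped under "(none)".
            ext = os.path.splitext(name)[1] or "(none)"
            # getsize follows symlinks, so cached blobs are counted correctly.
            totals[ext] += os.path.getsize(os.path.join(dirpath, name))
    return dict(totals)
```

Calling `size_by_extension(local_model)` should show the bulk of the bytes under the weight file extension if the download is complete.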


In [None]:
!pip install sentencepiece
!pip install huggingface_hub


In [None]:
!git clone https://github.com/ggerganov/llama.cpp


In [None]:
%cd llama.cpp


In [None]:
import os

# Derive the model folder from the snapshot downloaded earlier instead of
# hard-coding the revision hash, which changes whenever the repo is updated.
HF_MODEL_PATH = os.path.join(local_model, "merged-model-fp16")

!python convert_hf_to_gguf.py \
    --outtype f16 \
    --outfile /content/model-f16.gguf \
    $HF_MODEL_PATH
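
After the conversion finishes, a quick sanity check is to read the start of the output file. The sketch below assumes the GGUF version 2 or later header layout: the 4-byte magic `GGUF`, a little-endian `uint32` version, then `uint64` tensor and metadata key/value counts. `read_gguf_header` is a hypothetical helper, not a llama.cpp API.

```python
import struct


def read_gguf_header(path):
    """Read the fixed-size GGUF header (assumes GGUF version 2 or later)."""
    with open(path, "rb") as f:
        header = f.read(24)
    if len(header) < 24 or header[:4] != b"GGUF":
        raise ValueError(f"{path} does not start with a GGUF header")
    # Little-endian: uint32 version, uint64 tensor count, uint64 metadata count.
    version, tensor_count, kv_count = struct.unpack("<IQQ", header[4:24])
    return {"version": version, "tensors": tensor_count, "metadata_kv": kv_count}
```

For example, `read_gguf_header("/content/model-f16.gguf")` should return a small dictionary rather than raise if the conversion produced a valid file.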


In [None]:
%cd /content/llama.cpp


In [None]:
!mkdir -p build
%cd build

In [None]:
!cmake ..

In [None]:
!cmake --build . --config Release


In [None]:
!find /content/llama.cpp -type f -name "quantize*"


In [None]:
!cmake --build . --config Release -t llama-quantize

In [None]:
!find /content/llama.cpp/build -type f -name "llama-quantize"

In [None]:
!"/content/llama.cpp/build/bin/llama-quantize" \
    /content/model-f16.gguf \
    /content/model-q4_k_m.gguf \
    Q4_K_M

In [None]:
!ls -lh /content | grep gguf
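
The same check can be done in Python, which also makes it easy to compute how much smaller the quantized file is. This is a sketch with hypothetical helper names (`gguf_sizes`, `compression_ratio`); it makes no claim about what ratio to expect.

```python
import os


def gguf_sizes(directory):
    """Map each .gguf file name in directory to its size in bytes."""
    return {
        name: os.path.getsize(os.path.join(directory, name))
        for name in sorted(os.listdir(directory))
        if name.endswith(".gguf")
    }


def compression_ratio(original_path, quantized_path):
    """Return how many times smaller the quantized file is than the original."""
    return os.path.getsize(original_path) / os.path.getsize(quantized_path)
```

In this notebook the calls would be `gguf_sizes("/content")` and `compression_ratio("/content/model-f16.gguf", "/content/model-q4_k_m.gguf")`.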

In [None]:

from huggingface_hub import login

# Prompts for a Hugging Face access token; it needs write access for the upload.
login()

In [None]:
from huggingface_hub import HfApi

api = HfApi()

api.upload_file(
    path_or_fileobj="/content/model-q4_k_m.gguf",
    path_in_repo="model-q4_k_m.gguf",
    repo_id="Jeppcode/ScalableLab2",
)

api.upload_file(
    path_or_fileobj="/content/model-f16.gguf",
    path_in_repo="model-f16.gguf",
    repo_id="Jeppcode/ScalableLab2",
)

print("Upload complete!")
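
The two upload calls differ only in the file name, so they could be folded into a loop that first checks that every file exists. The sketch below takes the upload function as a parameter so it can be tried without network access; `upload_ggufs` is a hypothetical helper, and passing `api.upload_file` for `upload_fn` is an assumption about how it would be wired up here.

```python
import os


def upload_ggufs(paths, repo_id, upload_fn):
    """Upload each file to repo_id under its base name, using upload_fn.

    upload_fn is expected to accept the same keyword arguments as
    HfApi.upload_file: path_or_fileobj, path_in_repo, repo_id.
    """
    # Fail before uploading anything if a file is missing.
    missing = [p for p in paths if not os.path.isfile(p)]
    if missing:
        raise FileNotFoundError(f"missing files: {missing}")
    uploaded = []
    for path in paths:
        name = os.path.basename(path)
        upload_fn(path_or_fileobj=path, path_in_repo=name, repo_id=repo_id)
        uploaded.append(name)
    return uploaded
```

With the `api` object created earlier, this might be called as `upload_ggufs(["/content/model-q4_k_m.gguf", "/content/model-f16.gguf"], "Jeppcode/ScalableLab2", api.upload_file)`.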