## Demo running the Llama-3.2-3B-ARChitects-ReArc-bnb-4bit model 

### Prerequisites 

1. The `Llama-3.2-3B-ARChitects-ReArc-bnb-4bit` folder should be at the top level of this project. It is imported as a git submodule when you run `git submodule update --init`.
2. `Llama-3.2-3B-ARChitects-ReArc-bnb-4bit/model.safetensors` is a large file and is pulled from huggingface.co, not github.com. This means you need to set up either an [ssh key](https://huggingface.co/docs/hub/security-git-ssh) or a personal access token for your Hugging Face account. Once this is done, you can pull the `model.safetensors` file with `cd Llama-3.2-3B-ARChitects-ReArc-bnb-4bit` followed by `git lfs pull`. Note that Git LFS should be installed and initialised as per the *Getting Started* section in the main README.
3. All the required packages (except for the `bitsandbytes` non-CUDA backend) are managed by uv. Run `uv sync` to make sure they are installed.
4. Manually install the `bitsandbytes` non-CUDA backend by following this [guide](https://huggingface.co/docs/bitsandbytes/main/en/installation?backend=Intel+CPU+%2B+GPU#multi-backend) from Hugging Face. Availability is hardware dependent. I suspect the Mac users among us do not have their hardware supported; in that case we can move this repository onto a hosted platform with cloud compute.
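
One way to confirm that step 2 worked is to check that `model.safetensors` is the real weights file rather than an un-pulled Git LFS pointer stub. The following is a minimal sketch, assuming the standard Git LFS pointer format (a small text file whose first line starts with `version https://git-lfs.github.com/spec/v1`). The helper name `is_lfs_pointer` is hypothetical, and the path assumes you run it from the project root.

```python
from pathlib import Path

# First line of a Git LFS pointer file (assumption: standard v1 pointer format).
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path: Path) -> bool:
    """Return True if `path` looks like an un-pulled Git LFS pointer stub."""
    with open(path, "rb") as f:
        head = f.read(len(LFS_POINTER_PREFIX))
    return head == LFS_POINTER_PREFIX


if __name__ == "__main__":
    # Hypothetical location: relative to the project root, adjust to your checkout.
    weights = Path("Llama-3.2-3B-ARChitects-ReArc-bnb-4bit/model.safetensors")
    if not weights.exists():
        print(f"{weights} not found: run `git submodule update --init` first")
    elif is_lfs_pointer(weights):
        print(f"{weights} is still an LFS pointer: run `git lfs pull` in the submodule")
    else:
        print(f"{weights} looks like a real file ({weights.stat().st_size} bytes)")
```

If the file is reported as a pointer, repeat the `git lfs pull` from step 2 inside the submodule folder.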

*Before starting this notebook always make sure you have done:* `uv pip install -e "../bitsandbytes/"`. *Otherwise do this now and restart the kernel.*

##### Notes/recommendations:

If you are compiling the `bitsandbytes` package with a non-CUDA backend from source, clone the repo adjacent to this one and follow the build instructions. You can then install the package into the venv associated with this repo by running `uv pip install -e "../bitsandbytes/"` in your terminal.

In general, you can prefix any `pip install` command with `uv` so that the package is installed into the virtual environment that uv manages for this repo.
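
To check from inside the notebook whether the kernel can see the `bitsandbytes` install before loading the model, something like the following could be used. This is a minimal sketch using only the standard library; the helper name `package_available` is hypothetical, and it only confirms that the package can be found, not that the non-CUDA backend was built correctly.

```python
import importlib.util


def package_available(name: str) -> bool:
    """Return True if the top-level package `name` is importable by this interpreter."""
    return importlib.util.find_spec(name) is not None


if not package_available("bitsandbytes"):
    # Same install command as in the note above.
    print('bitsandbytes not found: run `uv pip install -e "../bitsandbytes/"` and restart the kernel')
```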

In [None]:
from pathlib import Path
from transformers import AutoModelForCausalLM, AutoTokenizer


### Load the model

In [None]:
# path to repository cloned as a submodule
model_repo_path = Path('../../../Llama-3.2-3B-ARChitects-ReArc-bnb-4bit')

# Check that the directory resolves as expected
print(model_repo_path.resolve())  

In [None]:
# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_repo_path)
model = AutoModelForCausalLM.from_pretrained(model_repo_path)

# Check if the tokenizer and model are loaded correctly
assert tokenizer is not None, "Tokenizer not loaded correctly."
assert model is not None, "Model not loaded correctly."

### Load the Data 

In [None]:
from mol_arc_agi.io_helpers import load_all_json_files_concurrently