


<span style="font-size: 35px;">Setting Up and Running BitNet in a Notebook: A Step-by-Step Guide</span>











<span style="font-size: 24px;color:brown">Go to the terminal and follow these steps:</span>



1. **Clone the Repository**  
   ```sh
   git clone --recursive https://github.com/microsoft/BitNet.git
   cd BitNet
   ```

2. **Create a new conda environment**
   ```sh
   conda create -n bitnet-cpp python=3.9 -y
   conda activate bitnet-cpp
   ```

3. **Install the dependencies**
   ```sh
   pip install -r requirements.txt
   ```

4. **Additionally, install ipykernel for Jupyter Notebook**
   ```sh
   pip install ipykernel 
   ```

5. **Download the model from Hugging Face**
   ```sh
   huggingface-cli download HF1BitLLM/Llama3-8B-1.58-100B-tokens --local-dir models/Llama3-8B-1.58-100B-tokens
   ```

6. **Quantize and prepare the model**
   ```sh
   python setup_env.py -md models/Llama3-8B-1.58-100B-tokens -q i2_s
   ```


7. **Install the build tools if they are missing** (run only the lines that match your platform)
   ```sh
   # Any platform, inside the conda environment:
   conda install -n bitnet-cpp -c conda-forge cmake
   # Linux (Debian/Ubuntu):
   sudo apt install cmake build-essential clang
   # macOS:
   brew install cmake
   ```

8. **Check that the Clang versions are compatible** (the BitNet README asks for clang 18 or newer)
   ```sh
   clang --version
   clang++ --version
   ```

9. **Point the build at Clang by setting `CC` and `CXX`** (adjust the paths if Clang is installed elsewhere)
   ```sh
   export CC=/usr/bin/clang
   export CXX=/usr/bin/clang++
   ```

10. **Install libstdc++-12-dev if missing**
   ```sh
   sudo apt-get install libstdc++-12-dev
   ```
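
Before moving on to the notebook cells, it can help to confirm that the tools used in the steps above are actually on the `PATH`. The following is a minimal sketch: the helper name `missing_tools` and the tool list are illustrative assumptions drawn from the commands above, not part of BitNet.

```python
import shutil

def missing_tools(tools):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]

# Hypothetical list, based on the commands used in the setup steps.
required = ["git", "conda", "cmake", "clang", "clang++"]
missing = missing_tools(required)
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("All required tools were found.")
```

If anything is reported as missing, revisit the installation steps for your platform before building.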







In [2]:
# Import Necessary Libraries
import os
import subprocess
import platform


In [3]:
# Set Up Paths
# Make sure the paths to the model and the executable are correct.
# In the notebook, they are defined as variables:

model_path = "models/Llama3-8B-1.58-100B-tokens/ggml-model-i2_s.gguf"

# Determine the executable path
if platform.system() == "Windows":
    main_path = os.path.join("build", "bin", "Release", "llama-cli.exe")
    if not os.path.exists(main_path):
        main_path = os.path.join("build", "bin", "llama-cli")
else:
    main_path = os.path.join("build", "bin", "llama-cli")
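
A quick existence check can catch a wrong working directory or an unfinished build before inference is attempted. This is only a sketch: `missing_paths` is a hypothetical helper, and the example paths repeat the non-Windows values from the cell above, assuming the notebook runs from the BitNet repository root.

```python
import os

def missing_paths(*paths):
    """Return the paths that do not exist on disk."""
    return [p for p in paths if not os.path.exists(p)]

# Example values; in the notebook, pass model_path and main_path instead.
paths_to_check = [
    "models/Llama3-8B-1.58-100B-tokens/ggml-model-i2_s.gguf",
    os.path.join("build", "bin", "llama-cli"),
]

for path in missing_paths(*paths_to_check):
    print(f"Missing: {path} (current directory: {os.getcwd()})")
```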


In [4]:
# Create a Function to Run Inference
# This function replicates what run_inference.py does, but can be executed directly in the notebook.

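def run_inference(model, prompt, n_predict=6, threads=2, ctx_size=2048, temperature=0.8):
    """Run llama-cli with the given arguments and print its output."""
    command = [
        main_path,
        '-m', model,
        '-n', str(n_predict),   # number of tokens to generate
        '-t', str(threads),     # CPU threads
        '-p', prompt,
        '-ngl', '0',            # no layers offloaded to the GPU
        '-c', str(ctx_size),    # context size
        '--temp', str(temperature),
        '-b', '1'               # batch size
    ]
    try:
        result = subprocess.run(command, check=True, text=True, capture_output=True)
        print(result.stdout)
    except FileNotFoundError:
        print(f"Executable not found: {main_path}")
    except subprocess.CalledProcessError as e:
        # Show the captured stderr as well, since it usually explains the failure
        print(f"Error occurred: {e}")
        print(e.stderr)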
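
When a run fails, it is often useful to see the exact command line that would be executed. The sketch below separates command construction from execution. `build_command` is a hypothetical helper that uses the same flags as `run_inference` above, and the executable path is passed in explicitly so the snippet stands alone.

```python
import shlex

def build_command(executable, model, prompt, n_predict=6, threads=2,
                  ctx_size=2048, temperature=0.8):
    """Build the llama-cli argument list without running it."""
    return [
        executable,
        "-m", model,
        "-n", str(n_predict),
        "-t", str(threads),
        "-p", prompt,
        "-ngl", "0",
        "-c", str(ctx_size),
        "--temp", str(temperature),
        "-b", "1",
    ]

# Example values; in the notebook, pass main_path and model_path instead.
cmd = build_command(
    "build/bin/llama-cli",
    "models/Llama3-8B-1.58-100B-tokens/ggml-model-i2_s.gguf",
    "Where is Mary?\nAnswer:",
)
# shlex.join quotes each argument so the line can be pasted into a POSIX shell.
print(shlex.join(cmd))
```

Running the printed command in a terminal makes it easier to tell whether a problem comes from the notebook or from the executable itself.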


In [5]:
# Run Inference
prompt_text = "Daniel went back to the garden. Mary travelled to the kitchen. Sandra journeyed to the hallway. John went to the bedroom. Mary went back to the garden. Where is Mary?\nAnswer:"
run_inference(model_path, prompt_text, n_predict=6, temperature=0.6)


Daniel went back to the garden. Mary travelled to the kitchen. Sandra journeyed to the hallway. John went to the bedroom. Mary went back to the garden. Where is Mary?
Answer: Mary is in the garden.
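
The output above repeats the prompt before the generated answer. If only the answer is needed, the echoed prompt can be stripped. This is a minimal sketch: `extract_completion` is a hypothetical helper, it assumes the output begins with the prompt exactly as in the run shown above, and it would require `run_inference` to return `result.stdout` instead of printing it.

```python
def extract_completion(output, prompt):
    """Return the generated text with the echoed prompt removed."""
    text = output.strip()
    echoed = prompt.strip()
    if text.startswith(echoed):
        text = text[len(echoed):]
    return text.strip()

# Shortened stand-in for the prompt and output shown above.
prompt = "Mary went back to the garden. Where is Mary?\nAnswer:"
output = prompt + " Mary is in the garden.\n"
print(extract_completion(output, prompt))  # prints "Mary is in the garden."
```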



