# Selective lesioning
The aim here is to degrade a model in a **controlled manner**: weights are progressively knocked out while model performance is tracked by benchmarking on a question set (generated with an assist from GPT-4).

To do: 
+ Load base model + replace mps code x
+ Pull in question set x 
+ Establish basic eval framework - need to (1) feed questions (async?); (2) randomly shut off weights from non-embed layers; (3) track perf. changes as culling increases 
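
Steps (2) and (3) can be sketched on a toy model. This is only an illustration under stated assumptions: `ToyLM`, `lesion_`, and the output-drift metric are hypothetical stand-ins, embedding parameters are assumed to have `embed` in their name, and the weights are plain float tensors. The 4-bit quantized model loaded below stores packed weights, so this would likely need adapting before it applies there.

```python
import torch
from torch import nn

class ToyLM(nn.Module):
    """Hypothetical stand-in for a language model: embedding -> MLP -> head."""
    def __init__(self):
        super().__init__()
        self.embed_tokens = nn.Embedding(50, 16)
        self.mlp = nn.Linear(16, 16)
        self.lm_head = nn.Linear(16, 50)

    def forward(self, ids):
        return self.lm_head(torch.relu(self.mlp(self.embed_tokens(ids))))

def lesion_(model, frac, generator):
    """Zero a random `frac` of entries in every non-embedding parameter, in place."""
    n_zeroed = 0
    with torch.no_grad():
        for name, param in model.named_parameters():
            if 'embed' in name:  # assumption: embedding params are named '*embed*'
                continue
            # Bernoulli(frac) mask: True marks a weight to cull
            mask = torch.rand(param.shape, generator = generator) < frac
            param.masked_fill_(mask.to(param.device), 0.0)
            n_zeroed += int(mask.sum())
    return n_zeroed

torch.manual_seed(0)
model = ToyLM()
ids = torch.arange(10)
baseline = model(ids).detach()
clean_state = {k: v.clone() for k, v in model.state_dict().items()}
gen = torch.Generator().manual_seed(0)

# Sweep the culled fraction; restore the clean weights before each level
drift = {}
for frac in (0.0, 0.25, 0.5, 1.0):
    model.load_state_dict(clean_state)
    lesion_(model, frac, gen)
    drift[frac] = (model(ids).detach() - baseline).abs().mean().item()

print(drift)
```

Output drift is used here only because the toy model has no benchmark; in the real setup the tracked quantity would be accuracy on the question set.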

Notes: 
+ Keep an eye on mem. use - disk space can be monitored via `du -hs $HOME /workspace/*` - we have 100GB avail. 
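
The same check can be run from inside the notebook. A minimal sketch, assuming the standard library's `shutil.disk_usage` is enough; `disk_report` is a hypothetical helper, and unlike `du` it reports on the whole filesystem containing the path rather than a single directory.

```python
import os
import shutil

def disk_report(path):
    """Return total/used/free space in GB for the filesystem holding `path`."""
    total, used, free = shutil.disk_usage(path)
    gb = 1024 ** 3
    return {'total_gb': total / gb, 'used_gb': used / gb, 'free_gb': free / gb}

report = disk_report(os.path.expanduser('~'))
print("Free: %.1fGB of %.1fGB" % (report['free_gb'], report['total_gb']))
```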

In [4]:
# Load libraries
# import flash_attn
from dotenv import load_dotenv
load_dotenv() # read HF_TOKEN (and any other vars) from a local .env file, if present
import torch
import json
import jinja2
import os
import sys
import re
import pandas as pd
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig # for quantization
from torch.nn import Softmax
import plotly
from transformers import pipeline, set_seed

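# auth for gated repos (like llama) - gen token here: https://huggingface.co/settings/tokens
# notebook_login() does not take a token argument, so pass the token to login() instead
from huggingface_hub import login
hf_token = os.getenv('HF_TOKEN')
if hf_token:
    login(token = hf_token)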

# model ids
model_id = ["microsoft/Phi-3-mini-4k-instruct"]

# Set seed for reproducibility 
torch.random.manual_seed(0)

# Increase max width of pd df columns 
pd.set_option('max_colwidth', 300)

# Instantiate jinja environment - used later for icl prompting 
environment = jinja2.Environment()

# requirements.txt
# !pip3 freeze > requirements.txt

User is already logged in.


In [35]:
# Define utility functions 
# mem. monitoring! 
def check_memory():
    print("Allocated: %fGB"%(torch.cuda.memory_allocated(0)/1024/1024/1024))
    print("Reserved: %fGB"%(torch.cuda.memory_reserved(0)/1024/1024/1024))
    print("Total: %fGB"%(torch.cuda.get_device_properties(0).total_memory/1024/1024/1024))

# notification/text-to-speech
import shlex

def text_to_speech(text):
    # Quote the text so that quotes or shell metacharacters in it cannot break the command
    if sys.platform == 'darwin':
        os.system(f'say {shlex.quote(text)}')
    elif sys.platform.startswith('linux'):
        os.system(f'espeak {shlex.quote(text)}')
    else:
        print("Text-to-speech is not supported on this platform.")

In [55]:
# Load bnb config, base model, and tokenizer
bnb_config = BitsAndBytesConfig(
    load_in_4bit = True,
    bnb_4bit_use_double_quant = True,
    bnb_4bit_quant_type = 'nf4',
    bnb_4bit_compute_dtype = torch.bfloat16
)

base_model = AutoModelForCausalLM.from_pretrained(
    model_id[0],
    # device_map = 'auto', # not sure what's up with device_map, but this is what causes errors
    quantization_config = bnb_config,
    trust_remote_code = True
)

# Load tokenizer - disable the automatic BOS/EOS tokens since my function already prepends BOS
tokenizer = AutoTokenizer.from_pretrained(model_id[0],
                                         add_eos_token = False,
                                         add_bos_token = False,
                                         padding_side = 'left')

`low_cpu_mem_usage` was None, now set to True since model is quantized.


Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.


# Eval. setup
Here, we generate a set of prompts with checkable answers by appending one multiple-choice question at a time to a shared few-shot base prompt.
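
A minimal sketch of that appending step, assuming each question is a dict with `question` and `options` keys (as in the question file loaded later). `format_question`, `build_messages`, and the shortened `demo_base_prompt` are hypothetical names used only for illustration.

```python
def format_question(q):
    """Render a question dict as the question text followed by lettered options."""
    options = '\n'.join(f"{o['code']}. {o['text']}" for o in q['options'])
    return q['question'] + '\n' + options

def build_messages(base_prompt, q):
    """Return a new message list: the base prompt plus the question as a user turn."""
    return base_prompt + [{'role': 'user', 'content': format_question(q)}]

# Shortened stand-in for the real base prompt
demo_base_prompt = [
    {'role': 'system', 'content': 'Respond with a single JSON object.'}
]

demo_question = {
    'question': "What element is represented by the symbol 'Na' on the periodic table?",
    'options': [
        {'code': 'A', 'text': 'Nitrogen'},
        {'code': 'B', 'text': 'Nickel'},
        {'code': 'C', 'text': 'Neon'},
        {'code': 'D', 'text': 'Sodium'},
    ],
    'solution': 'D',
}

messages = build_messages(demo_base_prompt, demo_question)
print(messages[-1]['content'])
```

The resulting message list still has to be rendered into the model's chat format; the tokenizer's `apply_chat_template` method is one way to do that, though the notebook may use its own formatter.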

In [6]:
# set base prompt 
base_prompt = [
    {
        "role": "system",
        "content": "You are a helpful, honest, and intelligent AI assistant who can only respond with a single JSON object. Solve each of the following questions. Return a JSON object containing two keys, `rationale` and `answer`."
    },
    {
        "role": "user",
        "content": "What's the integer ceiling of 5/3?\nA. 3\nB. 4.25\nC. dog\nD. 2"
    },
    {
        "role": "assistant",
        "content": '{"rationale": "5/3 is between 1 (3/3) and 2 (6/3), so the integer ceiling is 2.", "answer": "D"}'
    }, 
    {
        "role": "user",
        "content": "What's the capital of the U.S. state of Georgia?\nA. Tbilisi\nB. Atlanta\nC. Nashville\nD. Toronto"
    },
    {
        "role": "assistant",
        "content": '{"rationale": "The capital of the U.S. state of Georgia is Atlanta, located in the Northwest of the state.", "answer": "B"}'
    }
]
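
Since the base prompt asks for a JSON object with `rationale` and `answer` keys, replies can be scored by pulling out `answer` and comparing it to the known solution. A minimal sketch, assuming replies arrive as raw strings that may carry extra text around the JSON; `parse_answer` and `accuracy` are hypothetical helpers, and unparseable replies are simply counted as wrong.

```python
import json
import re

def parse_answer(reply):
    """Return the upper-cased `answer` from a JSON reply, or None if unparseable."""
    match = re.search(r'\{.*\}', reply, flags = re.DOTALL)
    if match is None:
        return None
    try:
        parsed = json.loads(match.group(0))
    except json.JSONDecodeError:
        return None
    answer = parsed.get('answer')
    return answer.strip().upper() if isinstance(answer, str) else None

def accuracy(replies, solutions):
    """Fraction of replies whose parsed answer matches the solution letter."""
    n_correct = sum(parse_answer(r) == s for r, s in zip(replies, solutions))
    return n_correct / len(solutions)

demo_replies = [
    '{"rationale": "5/3 is between 1 and 2, so the ceiling is 2.", "answer": "D"}',
    'Sure! {"rationale": "Atlanta is the capital.", "answer": "b"} Hope that helps.',
    'I think it is B',  # no JSON object, so this counts as a miss
]
demo_solutions = ['D', 'B', 'B']

acc = accuracy(demo_replies, demo_solutions)
print(acc)
```

Counting malformed replies as wrong is a design choice: as more weights are culled, a loss of JSON formatting is itself a sign of degradation, though it could also be tracked separately from answer correctness.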

In [34]:
# Questions df
# GPT-4 generation prompt
# I am benchmarking an LLM. I want you to create 100 MMLU-style questions. Return them in a JSON array of the format specified below. The questions should be a mix of easy/medium/hard difficulty. 
# The types should be "math", "extraction", "reasoning", "facts". 
# - "Math" questions should be related to arithmetic, calculus, or statistics. 
# - "Extraction" questions should focus on NLP-style NER tasks.
# - "Reasoning" should focus on logic. 
# - "Facts" should be focused on facts related to science or nature.
# Here is an example of a question (do not use this question).
# ```
# [
# {"question": "Suppose you have a data source that generates binary messages. Each message can either be 0 or 1. If both outcomes are equally likely, what is the entropy of this data source?", "options": [{"code": "A", "text": "0 bits"}, {"code": "B", "text": "0.5 bits"}, {"code": "C", "text": "1 bit"}, {"code": "D", "text": "2 bits"}], "solution": "C", "difficulty": "hard", "type": "math"},
# {"question": "What element is represented by the symbol 'Na' on the periodic table?", "options": [{"code": "A", "text": "Nitrogen"}, {"code": "B", "text": "Nickel"}, {"code": "C", "text": "Neon"}, {"code": "D", "text": "Sodium"}], "solution": "D", "difficulty": "easy", "type": "facts"},
# ]
# ```

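# Load the question set from data/question.json
q_file_path = os.path.join(os.getcwd(), 'data', 'question.json')
with open(q_file_path) as q_file: # context manager closes the file handle
    q_list = json.load(q_file) # yields list of dicts

print(q_list)
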
# eval_df =\
#     pd.DataFrame(q_list)\
#     .assign(
#         full_question = lambda df: df.apply(lambda row: row['question'] + '\n' + '\n'.join([o['code'] + '. ' + o['text'] for o in row['options']]),  axis = 1),
#         llm_input = lambda df: df.apply(lambda row: parse_phi(base_prompt + [{'role': 'user', 'content': row['full_question']}]), axis = 1)
#     )

# print(len(eval_df))
# print(eval_df['llm_input'][0])

# eval_df.groupby('difficulty').count()

[{'question': 'Suppose you have a data source that generates binary messages. Each message can either be 0 or 1. If both outcomes are equally likely, what is the entropy of this data source?', 'options': [{'code': 'A', 'text': '0 bits'}, {'code': 'B', 'text': '0.5 bits'}, {'code': 'C', 'text': '1 bit'}, {'code': 'D', 'text': '2 bits'}], 'solution': 'C', 'difficulty': 'hard', 'type': 'math'}, {'question': "What element is represented by the symbol 'Na' on the periodic table?", 'options': [{'code': 'A', 'text': 'Nitrogen'}, {'code': 'B', 'text': 'Nickel'}, {'code': 'C', 'text': 'Neon'}, {'code': 'D', 'text': 'Sodium'}], 'solution': 'D', 'difficulty': 'easy', 'type': 'facts'}, {'question': 'A rectangle has a length of 10 meters and a width of 5 meters. What is its area?', 'options': [{'code': 'A', 'text': '15 square meters'}, {'code': 'B', 'text': '50 square meters'}, {'code': 'C', 'text': '25 square meters'}, {'code': 'D', 'text': '100 square meters'}], 'solution': 'B', 'difficulty': 'ea