# Contrastive Knowledge Assessment (CKA) Notebook Demo
This notebook enables interactive experimentation with CKA for models including `Flan-ul2`, `Flan-t5s`, `OPTs`, `GPT-Neos`, `Robertas`, `Berts`, and `GPT2s`.
The goal is to probe whether a model assigns a higher probability to a factual statement than to a given counterfactual.

<a target="_blank" href="https://colab.research.google.com/github/daniel-furman/Capstone/blob/main/notebooks/cka_run_main_demo.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>
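
To make the comparison concrete, here is a minimal sketch of the idea in plain Python. `contrastive_check` is a hypothetical helper and the log-probabilities are made up for illustration. The ratio of the true probability to the mean counterfactual probability is one plausible way to summarize the contrast, not necessarily the score that `run_cka` computes.

```python
import math


def contrastive_check(logp_true, logps_false):
    """Compare a true completion's log-probability against counterfactuals.

    Hypothetical helper for illustration only; not part of run_cka.
    """
    p_true = math.exp(logp_true)
    p_false = [math.exp(lp) for lp in logps_false]
    mean_false = sum(p_false) / len(p_false)
    return {
        "p_true": p_true,
        "mean_p_false": mean_false,
        # assumed summary: how many times more likely the fact is than
        # the average counterfactual
        "ratio": p_true / mean_false,
        # strict check: the fact beats every single counterfactual
        "true_beats_all": all(p_true > p for p in p_false),
    }


# Made-up log-probabilities for one stem with three counterfactuals
result = contrastive_check(
    math.log(0.6), [math.log(0.1), math.log(0.2), math.log(0.05)]
)
print(result["true_beats_all"], round(result["ratio"], 2))
```

A ratio above 1 means the model prefers the factual completion on average, while `true_beats_all` is the stricter per-counterfactual check.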


## Dependencies

In [None]:
!git clone https://github.com/daniel-furman/Capstone.git
!pip install -r /content/Capstone/requirements.txt

## Imports

In [None]:
import os

In [None]:
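# run from the CKA scripts directory so that `run_cka` can be imported
os.chdir('/content/Capstone/src/cka_scripts')
# suppress TensorFlow INFO and WARNING log messages
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'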

## Notebook usage

In [None]:
# import the main wrapper function for running cka

from run_cka import main
config = {}

Here, you can specify one or more large language models in ```config["models"]```. See the [README](https://github.com/daniel-furman/Capstone#models-tested) for the full list of supported model families, and Hugging Face for specific model tags.

Some example VRAM usages:
* `gpt2-xl` runs comfortably on Y
* `EleutherAI/gpt-j-6B` requires slightly more than 7GB VRAM
* `google/flan-ul2` requires X
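
For models not listed above, a back-of-the-envelope estimate can help decide whether a model is likely to fit. The sketch below multiplies a parameter count by the bytes per parameter for a few common precisions. `weight_memory_gib` is a hypothetical helper and the 6-billion parameter count is a nominal assumption. The result covers the weights only, so actual usage is higher once activations and framework overhead are included, and it says nothing about which precision `run_cka` loads a model in.

```python
# Bytes needed to store one parameter at common precisions
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def weight_memory_gib(n_params, dtype="float16"):
    """Memory needed for the model weights alone, in GiB."""
    return n_params * BYTES_PER_PARAM[dtype] / 1024**3


n_params = 6_000_000_000  # nominal count for a "6B" model (assumption)
for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: {weight_memory_gib(n_params, dtype):.1f} GiB")
```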

For example, to run a couple of smaller models, consider `roberta-base` and/or `google/flan-t5-base`.




In [None]:
config["models"] = ["EleutherAI/gpt-j-6B",
                    "gpt2-xl",
                   ]

Next, add new facts in ```config["input_information"]```. Each entry must follow the same format as the given examples: a `stem`, a `true` completion, and a list of `false` completions.
  * For instance, to input "Lebron James is famous for playing the sport of {true: basketball; false: football}", see the last example.

In [None]:
config["input_information"] = {
    
        "0": {
            "stem": "The 2020 Olympics were held in",
            "true": "Tokyo",
            "false": ["London", "Berlin", "Chicago"],
        },
        "1": {
            "stem": "Operation Overlord took place in",
            "true": "Normandy",
            "false": ["Manila", "Santiago", "Baghdad"],
        },
        "2": {
            "stem": "Steve Jobs is the founder of",
            "true": "Apple",
            "false": ["Microsoft", "Google", "Facebook"],
        },

        # Example addition(s)
        "3": {
            "stem": "Lebron James is famous for playing the sport of",
            "true": "basketball",
            "false": ["football"],
        },   
    }
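
Because new entries must match the format of the examples, it can help to check them before starting a run. The sketch below is a hypothetical validator, not part of the repo. Its rules (a non-empty `stem` string, a non-empty `true` string, a non-empty `false` list of strings that does not contain the true answer) are inferred from the examples above and are assumptions about what `run_cka` expects.

```python
def validate_input_information(input_information):
    """Return a list of problems found; an empty list means no issues.

    Hypothetical helper: the rules are inferred from the example entries.
    """
    problems = []
    for key, entry in input_information.items():
        if not isinstance(entry, dict):
            problems.append(f"{key}: entry must be a dict")
            continue
        for field in ("stem", "true"):
            value = entry.get(field)
            if not isinstance(value, str) or not value.strip():
                problems.append(f"{key}: '{field}' must be a non-empty string")
        false = entry.get("false")
        if (
            not isinstance(false, list)
            or not false
            or not all(isinstance(item, str) for item in false)
        ):
            problems.append(f"{key}: 'false' must be a non-empty list of strings")
        elif entry.get("true") in false:
            problems.append(f"{key}: 'true' also appears in 'false'")
    return problems


example = {
    "0": {
        "stem": "The 2020 Olympics were held in",
        "true": "Tokyo",
        "false": ["London", "Berlin", "Chicago"],
    },
    # deliberately malformed: 'false' is a string, not a list
    "1": {
        "stem": "Steve Jobs is the founder of",
        "true": "Apple",
        "false": "Microsoft",
    },
}
print(validate_input_information(example))
```

Calling `validate_input_information(config["input_information"])` before `main(config)` would surface a malformed entry early instead of partway through a model run.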

Lastly, ```config["verbosity"]``` controls how much information is printed during the run.

In [None]:
config["verbosity"] = False

In [None]:
config

In [None]:
score_dicts = main(config)

print(score_dicts[0])
print(score_dicts[1])
print(score_dicts[2])

## CLI usage 
* The CLI is compatible with the cached configs included in the repo.
* You can also create a custom config for an experiment.
    * See: ```/content/Capstone/src/cka_scripts/configs```

In [None]:
# full benchmark dataset

!python run_cka.py configs.rome_full.distilgpt2_rome_full