# Demo related to the paper "AGentV: Access Control Policy Generation and Verification Framework with Language Models"


> Estimated time: 30 minutes



To start, add a shortcut to this [folder](https://drive.google.com/drive/folders/1LfI9UaoIX_KStJgpnm-AkS1AKkVz34UO?usp=sharing) in your Google Drive. It contains all the checkpoints needed to run this notebook, created by fine-tuning the respective models on the *Overall dataset* described in the paper. It also contains the privacy policy extracted from [HotCRP](https://hotcrp.com/privacy) and converted into a markdown file. *To use your own high-level requirement specifications or privacy policies, copy the content into a markdown file formatted like the ```privacy_hotcrp.md``` file in the ```/demo``` directory*.

---

<details>
  <summary>HotCRP Privacy Policy</summary>

  This page explains how <u>[HotCRP.com](https://hotcrp.com/)</u> uses the data you provide and how you can control that use. The open-source HotCRP software also runs on sites we don’t manage; this policy applies only to sites with domains ending in “.hotcrp.com”.  
(Updated 2020-08-24)  

# What we store and why

The HotCRP.com service handles the submission and review of *artifacts*, such as scientific and engineering publications. In most cases, HotCRP.com does not own the data you provide. Rather, we broker that data on behalf of scholarly societies, such as ACM and USENIX; on behalf of conference review boards; and on behalf of other site managers.
- We store **Submission artifacts**, including submissions (such as PDFs), metadata (such as titles and author lists), and supplementary information (such as topics and other uploaded files).
- We store **Review artifacts**, including reviews, discussions, and associated uploaded files.
- We store **Configuration settings** used by site managers to configure a site.
- We store **Profile data**, such as user names, affiliations, email addresses, topic interests, and review preferences.
- We store **Demographic data**, such as user gender identity and approximate age.
- We store **Browsing data**, such as logs of which sites a computer has accessed.

# Controlling your information

**Submission and review artifacts**: When you submit artifacts to a HotCRP.com site, you give that site’s managers (e.g., the program chairs) permission to store and view those artifacts indefinitely. You give the site managers permission to distribute the artifacts at their discretion, for review or other purposes. Finally, you give HotCRP.com permission to process the artifacts and to store them indefinitely. Site managers control who can access artifacts, and HotCRP.com doesn’t share artifacts except as site managers allow. However, we may publicize aggregate information that cannot be traced to any site or user, such as total submission counts across all sites. If an artifact was submitted in error, you can request its permanent deletion. Such requests should be directed to the relevant site managers (e.g., program chairs).

**Configuration settings** are stored by HotCRP.com indefinitely. Site managers may request the deletion of a HotCRP.com site, if allowed by their site agreements with HotCRP.com. This will delete all associated site data, including submission and review artifacts.

**Profile data** is stored in several ways.
- Every HotCRP.com user has a **global profile**, which includes name, email, and affiliation, and other information.
- Upon submitting an artifact to a HotCRP.com site, a user gains a **site profile** for that site. This generally contains the same information as the global profile, but it can differ. (For example, changes to a global user affiliation only affect future site profiles.)

Users can update their profiles at any time. A site manager can also create a profile for a user, for example by inviting them to join a conference program committee. Contact us if you want your global profile deleted and your site profiles disabled.  

**Demographic data** is stored in user global profiles, and can only be modified by users themselves (never by site managers). Users control what demographic data is stored and how demographic data is shared using the Profile \> Demographics tab on any HotCRP.com site. Information shared only with scholarly societies is provided directly to those societies and cannot be accessed by site managers. Information that is explicitly shared with site managers may also be analyzed by HotCRP.com, but will be published only in aggregate, such as in terms of percentages of active HotCRP.com users.  

**Browsing data** is stored for up to a month and per-user browsing data is never shared. We may store and share aggregate information such as total page loads. Misbehavior, such as denial-of-service attacks and attempts to crack a user’s password, may be publicized and may be preserved indefinitely.


</details>

---

Finally, before we begin: since Llama 3 8B is a gated model, you have to visit [Hugging Face](https://huggingface.co/meta-llama/Meta-Llama-3-8B) and request access to the base model.


## Install and import necessary libraries

In [None]:
!pip install -U spacy-experimental
!pip install https://github.com/explosion/spacy-experimental/releases/download/v0.6.1/en_coreference_web_trf-3.4.0a2-py3-none-any.whl
!pip install transformers==4.30.2
!pip install nltk
!pip install gdown==5.1.0

In [4]:
!python -m spacy download en_core_web_sm


In [5]:
import os
os.environ['CUDA_VISIBLE_DEVICES'] = "0"

import spacy
import json
import nltk
nltk.download('punkt')
from nltk.tokenize import sent_tokenize

# Change the following path appropriately
PATH_DEMO_DATA = '/content/drive/MyDrive/demo/privacy_hotcrp.md'


[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.


## Download checkpoints

In [None]:
!gdown --folder https://drive.google.com/drive/folders/1LfI9UaoIX_KStJgpnm-AkS1AKkVz34UO

## Pre-processing the input privacy policy document

This involves separating the privacy policy document into paragraphs, resolving co-references in each paragraph, and segmenting each paragraph into individual sentences.
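The paragraph and sentence splitting can be sketched without any model, as a rough illustration only. This sketch assumes a plain regular-expression splitter in place of NLTK's `sent_tokenize` and skips co-reference resolution entirely; `split_paragraphs` and `naive_sentences` are hypothetical helper names, not part of the notebook.

```python
import re

def split_paragraphs(text: str) -> list[str]:
    """Split a document into paragraphs on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]

def naive_sentences(paragraph: str) -> list[str]:
    """Rough splitter: break after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    return [s for s in parts if s]

sample = (
    "Users can update their profiles at any time. "
    "A site manager can also create a profile for a user.\n\n"
    "Browsing data is stored for up to a month."
)

# Paragraphs first, then sentences within each paragraph
sentences = [s for p in split_paragraphs(sample) for s in naive_sentences(p)]
for s in sentences:
    print(s)
```

A regex splitter like this breaks on abbreviations such as "e.g.", which is one reason the notebook relies on NLTK's trained tokenizer instead.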

In [6]:
nlp = spacy.load('en_coreference_web_trf')

def resolve_references(sent):
    """Resolve coreferences in a piece of text using the coref pipeline.
    sent (str): The text to process with the coref pipeline
    RETURNS (str): The text with each later mention replaced by the first mention of its cluster
    """

    doc = nlp(sent)
    token_mention_mapper = {}
    output_string = ""
    clusters = [
        val for key, val in doc.spans.items() if key.startswith("coref_cluster")
    ]

    for cluster in clusters:
        first_mention = cluster[0]
        for mention_span in list(cluster)[1:]:
            token_mention_mapper[mention_span[0].idx] = first_mention.text + mention_span[0].whitespace_

            for token in mention_span[1:]:
                token_mention_mapper[token.idx] = ""

    for token in doc:
        if token.idx in token_mention_mapper:
            output_string += token_mention_mapper[token.idx]
        else:
            output_string += token.text + token.whitespace_

    return output_string

In [7]:
with open(PATH_DEMO_DATA, 'r') as f:
    content = f.read().replace('*','').replace('#','').replace('-','').replace('\xa0',' ')
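To see what this chain of replacements does to a single line, here is a small standalone sketch. The sample line is made up for illustration, and `strip_markdown_markers` is a hypothetical wrapper around the same four `replace` calls.

```python
def strip_markdown_markers(text: str) -> str:
    # Same chain of replacements as in the cell above
    return text.replace('*', '').replace('#', '').replace('-', '').replace('\xa0', ' ')

# Made-up sample line in the style of the policy file
line = "- We store **Browsing data**, gathered by the open-source software."
cleaned = strip_markdown_markers(line)
print(repr(cleaned))
```

Note that removing every `-` also merges hyphenated words ("open-source" becomes "opensource"), so policies that depend on hyphenated terms may need a narrower rule that only strips list markers at the start of a line.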

In [8]:
paragraphs = content.split('\n\n')
coref_resolved = [resolve_references(k) for k in paragraphs]


In [9]:
preprocessed_lines = []
for p in coref_resolved:
    preprocessed_lines.extend(p.split('\n'))
sents = []
for p in preprocessed_lines:
    sents.extend(sent_tokenize(p))

with open('high_level_requirements.json', 'w') as f:
  json.dump(sents, f)

## Access control policy identification and generation

You might need to restart the kernel after installing the following libraries.

In [None]:
!pip install -U transformers
!pip install -U peft
!pip install bitsandbytes
!pip install einops
!pip install captum

In [2]:
import os
os.environ['CUDA_VISIBLE_DEVICES'] = "0"

import torch
from peft import PeftModel, PeftConfig
from transformers import (
    AutoModelForCausalLM,
    BitsAndBytesConfig,
    AutoTokenizer,
    AutoModelForSequenceClassification,
    set_seed,
    BertForSequenceClassification,
    BertTokenizerFast
)
from captum.attr import LayerIntegratedGradients, TokenReferenceBase, visualization
import json
import re
import pandas as pd
from tqdm import tqdm
from utils import create_out_string, prepare_inputs_bart, process_label

# Change the following path appropriately
PATH_CKPT = 'demo/'

device = 'cuda:0'

### Loading the NLACP identification model

In [None]:
ID_MODEL_NAME = PATH_CKPT + 'checkpoints/checkpoint_identification/'
NUM_CLASSES = 2

id_model = BertForSequenceClassification.from_pretrained(ID_MODEL_NAME, num_labels=NUM_CLASSES).to(device)
id_tokenizer = BertTokenizerFast.from_pretrained(ID_MODEL_NAME)

id_model.eval()

### Loading the access control policy generation model

In [None]:
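EX_MODEL_NAME = "meta-llama/Meta-Llama-3-8B"
peft_model_id = PATH_CKPT + "checkpoints/checkpoint_generation/"

# Hugging Face access token for the gated base model.
# Assumption: it is stored in the HF_TOKEN environment variable;
# adjust this to however you manage your token.
token = os.environ.get("HF_TOKEN")

base_model = AutoModelForCausalLM.from_pretrained(
        EX_MODEL_NAME,
        trust_remote_code=True,
        low_cpu_mem_usage=True,
        token=token
    )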

base_model.config.pretraining_tp = 1
base_model.config.use_cache = True

gen_model = PeftModel.from_pretrained(base_model, peft_model_id).to(device)
gen_tokenizer = AutoTokenizer.from_pretrained(EX_MODEL_NAME, trust_remote_code=True, padding_side='left')
gen_tokenizer.pad_token = gen_tokenizer.eos_token
gen_model.eval()

In [None]:
def generate_policy(input):

    tokens = id_tokenizer(input, return_token_type_ids = False, return_tensors="pt").to(device)

    out = id_model(**tokens)
    res = torch.argmax(out.logits, dim=1)

    if res.item() == 1:

        input_text = "Policy: " + input + "\nEntities: {decision: "
        batch = gen_tokenizer(input_text, return_tensors="pt", return_token_type_ids=False).to(device)
        translated = gen_model.generate(**batch,
                                    pad_token_id=gen_tokenizer.eos_token_id,
                                    do_sample=False,
                                    max_length=512,
                                    num_return_sequences=1,
                                    early_stopping = False,)
        result = gen_tokenizer.decode(translated[0], skip_special_tokens=True)
        pattern = r'\{.*?\}'
        sanitized = re.findall(pattern, result)
        return sanitized[0]
    else:
        return "Not an ACP"
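The post-processing at the end of `generate_policy` can be tried in isolation. The sketch below applies the same regular expression to a hypothetical decoded string; the field names inside the braces are assumptions for illustration, and the real model output may look different.

```python
import re

pattern = r'\{.*?\}'

# Hypothetical decoded output (prompt followed by the generated continuation)
decoded = (
    "Policy: Users can update their profiles at any time.\n"
    "Entities: {decision: allow; subject: user; action: update; resource: profile} "
    "{decision: allow; subject: user"
)

# The non-greedy pattern returns each complete {...} group; the first one is kept
matches = re.findall(pattern, decoded)
first = matches[0] if matches else None
print(first)

# If generation stops before a closing brace, there is nothing to extract
truncated = "Entities: {decision: allow; subject: user"
print(re.findall(pattern, truncated))
```

Because an output without a closing brace yields an empty list, indexing the result with `[0]` raises an `IndexError` in that case; a guard like the one above is one possible way to handle it.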

### Access control policy generation and identification

Results will be saved to your working directory.


In [None]:
with open('high_level_requirements.json','r') as f:
    sents = json.load(f)


inputs, outputs = [], []

for s in tqdm(sents):
    inputs.append(s)
    output = generate_policy(s)
    outputs.append(output)

df = pd.DataFrame({
    'inputs': inputs,
    'outputs': outputs
})

df.to_csv('results.csv')

## Access control policy verification and visualization

In the actual pipeline, verification happens right after each policy is generated. To save time and computing resources, this demo runs it as a separate step after generation has finished. Results will be saved to your working directory.

When the verifier finds that an access control policy is incorrect, it performs feature attribution using integrated gradients via the Captum library.
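For intuition about integrated gradients, here is a toy sketch in plain Python. It is not the verifier: it assumes a two-input function with a hand-written gradient and approximates the path integral with a midpoint Riemann sum, whereas the notebook uses Captum's `LayerIntegratedGradients` on the model's embedding layer.

```python
def f(x):
    # Toy differentiable "model": f(x) = x0^2 + 3*x0*x1
    return x[0] ** 2 + 3 * x[0] * x[1]

def grad_f(x):
    # Analytic gradient of f
    return [2 * x[0] + 3 * x[1], 3 * x[0]]

def integrated_gradients(x, baseline, n_steps=1000):
    # Average the gradient along the straight line from baseline to x
    avg_grad = [0.0] * len(x)
    for k in range(n_steps):
        alpha = (k + 0.5) / n_steps  # midpoint of the k-th sub-interval
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        g = grad_f(point)
        for i in range(len(x)):
            avg_grad[i] += g[i] / n_steps
    # Scale the averaged gradient by the input difference
    return [(xi - b) * a for xi, b, a in zip(x, baseline, avg_grad)]

x = [2.0, 1.0]
baseline = [0.0, 0.0]
attr = integrated_gradients(x, baseline)
print(attr, sum(attr), f(x) - f(baseline))
```

The attributions sum to `f(x) - f(baseline)` (the completeness property). The convergence delta that Captum can return measures how far a numerical approximation is from satisfying this.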

### Loading the access control policy verification model

In [None]:
torch.cuda.empty_cache()

In [None]:
VER_MODEL_NAME = "facebook/bart-large"
verification_ckpt = PATH_CKPT + "checkpoints/checkpoint_verification/"

ID2AUGS = {0: 'allow_deny',
                 1: 'csub',
                 2: 'cact',
                 3: 'cres',
                 4: 'ccond',
                 5: 'cpur',
                 6: 'msub',
                 7: 'mres',
                 8: 'mcond',
                 9: 'mpur',
                 10: 'mrules',
                 11: 'correct'}


ver_tokenizer = AutoTokenizer.from_pretrained(VER_MODEL_NAME)
ver_model = AutoModelForSequenceClassification.from_pretrained(verification_ckpt).to(device)
ver_model.eval()

In [None]:
def verify_policy(nlacp, policy):

    pred_pol,_ = process_label([policy])
    pp = create_out_string(pred_pol)
    ver_inp = prepare_inputs_bart(nlacp, pp, ver_tokenizer, device)
    with torch.no_grad():
      preds = ver_model(**ver_inp).logits
    probs = torch.softmax(preds, dim=1)

    mprob = probs.max().item()
    pred = probs.argmax(axis=-1).item()
    highest_prob_cat = ID2AUGS[pred]

    return highest_prob_cat, pred, mprob
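The decision step in `verify_policy` is a softmax over the twelve logits followed by an argmax. A minimal standalone sketch, assuming made-up logits rather than real model output:

```python
import math

ID2AUGS = {0: 'allow_deny', 1: 'csub', 2: 'cact', 3: 'cres', 4: 'ccond',
           5: 'cpur', 6: 'msub', 7: 'mres', 8: 'mcond', 9: 'mpur',
           10: 'mrules', 11: 'correct'}

def softmax(logits):
    # Subtract the maximum for numerical stability
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up logits for a single (sentence, policy) pair
logits = [0.1, 2.5, 0.3, 0.0, -1.0, 0.2, 0.0, 0.1, -0.5, 0.0, 0.4, 1.0]
probs = softmax(logits)

pred = max(range(len(probs)), key=probs.__getitem__)  # index of the largest probability
label = ID2AUGS[pred]
print(label, round(probs[pred], 3))
```

Any prediction other than index 11 (`correct`) is treated as an error category later in the notebook.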

The cell below defines the functions required for feature attribution, which identifies the location of the error in the generated access control policy.

In [None]:
def fwd_function(input_ids, attention_mask):

    inp = {'input_ids': input_ids.to(device), 'attention_mask': attention_mask.to(device)}

    # Gradients must stay enabled here: integrated gradients differentiates through this call.
    pred = ver_model(**inp).logits

    return pred


def construct_input_ref(s,l, ref_token_id, sep_token_id, bos_token_id):
    toks = ver_tokenizer.encode(s)
    tokl = ver_tokenizer.encode(l)

    input_ids = ver_tokenizer.encode(s,l)

    # construct reference token ids
    ref_input_ids = [bos_token_id] + [ref_token_id] * (len(toks)-2) + [sep_token_id] + [sep_token_id] + (len(tokl)-2)*[ref_token_id] + [sep_token_id]

    return torch.tensor([input_ids], device=device), torch.tensor([ref_input_ids], device=device)

def construct_attention_mask(input_ids):
    return torch.ones_like(input_ids)


def clean(tokens):
    l = []
    for t in tokens:
        if t.startswith('Ġ'):
            l.append(t[1:])
        else:
            l.append(t)
    return l
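The layout of the reference (baseline) sequence built by `construct_input_ref` can be checked with plain lists. This sketch assumes the usual BART special token ids (`<s>` = 0, `<pad>` = 1, `</s>` = 2) and made-up segment lengths; `build_reference_ids` is a hypothetical helper that mirrors only the list arithmetic.

```python
def build_reference_ids(len_s, len_l, ref_id, sep_id, bos_id):
    # len_s / len_l: length of each segment when encoded on its own,
    # including its own <s> and </s> tokens
    return ([bos_id] + [ref_id] * (len_s - 2) + [sep_id]
            + [sep_id] + [ref_id] * (len_l - 2) + [sep_id])

def clean(tokens):
    # Drop the leading 'Ġ' that byte-level BPE uses to mark a preceding space
    return [t[1:] if t.startswith('Ġ') else t for t in tokens]

BOS, PAD, SEP = 0, 1, 2  # assumed BART special token ids
ref = build_reference_ids(len_s=5, len_l=4, ref_id=PAD, sep_id=SEP, bos_id=BOS)
print(ref)

print(clean(['<s>', 'Users', 'Ġcan', 'Ġupdate', '</s>']))
```

Every content token is replaced by the padding id while the special tokens stay in place, so the baseline has the same length and structure as the encoded sentence and policy pair.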

The cell below creates the layer integrated gradients instance using the verifier's forward method and its embeddings.

In [None]:
lig = LayerIntegratedGradients(fwd_function, ver_model.base_model.encoder.embed_tokens)

The cell below defines the functions needed to store and visualize the feature attributions with the Captum library.

In [None]:
vis_data_records_ig = []

def add_attributions_to_visualizer(attributions, text, pred, pred_ind, label, delta, vis_data_records):
    attributions = attributions.sum(dim=2).squeeze(0)
    attributions = attributions / torch.norm(attributions)
    attributions = attributions.cpu().detach().numpy()

    # storing couple samples in an array for visualization purposes
    vis_data_records.append(visualization.VisualizationDataRecord(
                            attributions,
                            pred,
                            ID2AUGS[pred_ind],
                            ID2AUGS[label],
                            ID2AUGS[pred_ind],
                            attributions.sum(),
                            text,
                            delta))

def interpret_sentence(sentence, policy, tokenizer, pred_ind, pred, label):

    input_ids, input_ref = construct_input_ref(sentence, policy, tokenizer.pad_token_id, tokenizer.sep_token_id, tokenizer.bos_token_id)

    indices = input_ids[0].detach().tolist()
    text = clean(tokenizer.convert_ids_to_tokens(indices))

    attention_mask = construct_attention_mask(input_ids)

    attributions_ig, delta = lig.attribute(inputs=input_ids,
                                           baselines=input_ref,
                                           additional_forward_args=(attention_mask),
                                           n_steps=500,
                                           target=pred_ind,
                                           internal_batch_size = 4,
                                           return_convergence_delta=True)

    print('pred: ', ID2AUGS[pred_ind], '(', '%.2f'%pred, ')', ', delta: ', abs(delta))
    add_attributions_to_visualizer(attributions_ig, text, pred, pred_ind, label, delta, vis_data_records_ig)
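The reduction in `add_attributions_to_visualizer` turns one attribution vector per token into a single normalized score per token. A plain-Python sketch of the same arithmetic, assuming made-up numbers and a hypothetical helper name `summarize_attributions`:

```python
import math

def summarize_attributions(per_dim):
    # per_dim: for each token, a list of attributions over embedding dimensions
    token_scores = [sum(dims) for dims in per_dim]         # sum over the embedding axis
    norm = math.sqrt(sum(s * s for s in token_scores))     # L2 norm over tokens
    return [s / norm for s in token_scores]

# Made-up attributions for three tokens with three embedding dimensions each
per_dim = [[0.5, 0.5, 1.0], [-1.0, 0.5, 0.5], [2.0, 1.0, 1.0]]
scores = summarize_attributions(per_dim)
print(scores)
```

After normalization the scores form a unit vector, which makes the token highlights comparable across different sentences in the visualization.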

### Access control policy verification and feature attribution

In [None]:
results = pd.read_csv('results.csv')
ins, outs, vers = [],[],[]

for i,o in zip(results['inputs'], results['outputs']):
    ins.append(i)
    outs.append(o)
    if o == "Not an ACP":
        vers.append('None')
    else:
        ver_res, pred_ind, mprob = verify_policy(i,o)
        if pred_ind!=11:
          interpret_sentence(i, o, ver_tokenizer, pred_ind, mprob, pred_ind)
        vers.append(ver_res)


df = pd.DataFrame({
    'inputs': ins,
    'outputs': outs,
    'verification': vers
})

df.to_csv('results_verification.csv')
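Once the file is written, a quick tally of the verification categories gives an overview of the run. The sketch below uses only the standard library and an in-memory stand-in for `results_verification.csv` with made-up rows; to use it on the real file, replace the `StringIO` object with `open('results_verification.csv', newline='')`.

```python
import csv
import io
from collections import Counter

# Made-up stand-in with the same columns as the saved file
sample_csv = io.StringIO(
    ",inputs,outputs,verification\n"
    "0,Sentence A,Not an ACP,None\n"
    "1,Sentence B,{policy B},correct\n"
    "2,Sentence C,{policy C},csub\n"
)

# Count how often each verification label occurs
counts = Counter(row["verification"] for row in csv.DictReader(sample_csv))
for label, n in counts.most_common():
    print(label, n)
```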


In [None]:
_ = visualization.visualize_text(vis_data_records_ig)