## `$LOW_PURCHASE` example

In [9]:
import safetensors
import ctrlg
from transformers import AutoModelForCausalLM, AutoTokenizer, LogitsProcessorList


BASE_MODEL_PATH = 'ctrlg/gpt2-large_common-gen'  # a gpt2-large checkpoint domain adapted to the common-gen corpus

# for these simple examples we only really need the tokenizer
tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_PATH)

### Information about this notebook

The cell below uses CtrlG's DFA implementation to run the following four example scenarios, in which an agent faces a final purchase decision. The agent is limited by a budget of 100 USD set by the user:

```
("I want to buy this laptop", 75.0),           # succeed
("Let's buy this car", 25000.0),               # block
("I need to purchase groceries", 45.0),        # succeed
("Just looking around the store today", 0.0)   # ignore
```

These sentences are a mock example of an agent's task.

In this example the agent has received a prompt from the user, browsed the internet, found a webpage, navigated it successfully, and is now about to make the final purchase. The aforementioned `<|PURCHASE REQUEST|>` token is identified by the keyword variants `[' buy', ' purchase', 'buy', 'purchase', 'buy ', 'purchase ']`, each encoded with the tokenizer. Matching any of them is what triggers the DFA, moving it out of its initial state.
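To make the trigger mechanism concrete, here is a minimal pure-Python sketch of a keyword-detecting DFA with a budget gate. It is not CtrlG's implementation: `KeywordDFA` and `scan` are hypothetical names, and whitespace-split words stand in for GPT-2 token IDs. The blocked variant mirrors the idea of removing every accept state so that no input can ever be accepted.

```python
from typing import List, Optional, Sequence, Tuple

State = Tuple[str, ...]
ACCEPT: State = ("<ACCEPT>",)  # absorbing state: a keyword has been seen


class KeywordDFA:
    """Accepts once any pattern has appeared as a contiguous run of tokens."""

    def __init__(self, patterns: Sequence[Sequence[str]], accepting: bool = True):
        self.patterns = {tuple(p) for p in patterns}
        # every prefix of every pattern (including the empty one) is a state
        self.prefixes = {p[:k] for p in self.patterns for k in range(len(p) + 1)}
        self.initial_state: State = ()
        # blocked variant: same transitions, but no accept states at all
        self.accept_states = {ACCEPT} if accepting else set()

    def next_state(self, state: State, token: str) -> State:
        if state == ACCEPT:
            return ACCEPT
        history = state + (token,)
        # a complete pattern ending at this token moves us to ACCEPT
        if any(history[i:] in self.patterns for i in range(len(history))):
            return ACCEPT
        # otherwise keep the longest suffix that is still a pattern prefix
        for i in range(len(history) + 1):
            if history[i:] in self.prefixes:
                return history[i:]
        return ()

    def is_accept(self, state: State) -> bool:
        return state in self.accept_states


def scan(tokens: List[str], dfa: KeywordDFA) -> Optional[int]:
    """Return the index of the token at which the DFA first accepts, or None."""
    state = dfa.initial_state
    for i, token in enumerate(tokens):
        state = dfa.next_state(state, token)
        if dfa.is_accept(state):
            return i
    return None


patterns = [("buy",), ("purchase",)]
allowed = KeywordDFA(patterns, accepting=True)   # price within budget
blocked = KeywordDFA(patterns, accepting=False)  # price over budget

print(scan("I want to buy this laptop".split(), allowed))            # 3
print(scan("Let's buy this car".split(), blocked))                   # None
print(scan("Just looking around the store today".split(), allowed))  # None
```

Note that the blocked DFA and a text with no keyword at all both yield `None` here, so a caller that needs to tell "blocked" apart from "no purchase intent" has to track that separately.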

The last cell of this notebook implements a simpler version of this example that uses plain Python instead of DFAs. In this specific, simple example the two approaches seem equivalent to me, meaning one wouldn't have to rely on DFAs to make a secure transaction.

We might still want to use DFAs when dealing with real users, to make sure the LLM outputs exactly the information we expect. Alternatively, we might check this with code.
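As a rough illustration of the "check this with code" option, the sketch below validates an already generated output against a strict pattern before applying the budget. The `PURCHASE <item> FOR <price> USD` format, `parse_purchase`, and `authorize` are all assumptions made up for this example; nothing in the notebook defines such a format. Unlike a DFA applied during decoding, this only rejects malformed output after the fact.

```python
import re
from typing import Optional, Tuple

# Hypothetical output format (an assumption for illustration only):
#   PURCHASE <item> FOR <price> USD
PURCHASE_RE = re.compile(
    r"PURCHASE (?P<item>[\w ]+) FOR (?P<price>\d+(?:\.\d{1,2})?) USD"
)


def parse_purchase(output: str) -> Optional[Tuple[str, float]]:
    """Return (item, price) if the output matches the format exactly, else None."""
    match = PURCHASE_RE.fullmatch(output.strip())
    if match is None:
        return None
    return match.group("item"), float(match.group("price"))


def authorize(output: str, budget: float = 100.0) -> bool:
    """Authorize only well-formed purchase requests that fit the budget."""
    parsed = parse_purchase(output)
    return parsed is not None and parsed[1] <= budget


print(authorize("PURCHASE laptop FOR 75.00 USD"))      # True
print(authorize("PURCHASE car FOR 25000 USD"))         # False
print(authorize("Sure! I'll buy the laptop for $75"))  # False
```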

I expect that not every case will need to be handled with DFAs, and that they will be most useful for contextual use of the agent on the web rather than for very specific interactions like this one.

In [22]:
# for running the version using ctrlg's tokenizer one needs to load the HMM
# only to get the vocab_size, I'm sure there are other ways to do it though!
device = 'cuda'
HMM_MODEL_PATH = 'ctrlg/hmm_gpt2-large_common-gen_4096'  # alternatively 'ctrlg/hmm_gpt2-large_common-gen_32768' for better quality
hmm_model = ctrlg.HMM.from_pretrained(HMM_MODEL_PATH).to(device)
vocab_size = hmm_model.vocab_size


def build_purchase_dfa(tokenizer, vocab_size, price, price_threshold):
    # use AhoCorasickBuilder to detect purchase intent keywords
    ac_builder = ctrlg.AhoCorasickBuilder(vocab_size)

    # define the patterns that signal a purchase intent
    purchase_patterns = [tokenizer.encode(p) for p in [' buy', ' purchase', 'buy', 'purchase', 'buy ', 'purchase ']]
    dfa_purchase_intent = ac_builder.build(purchase_patterns)

    if price <= price_threshold:
        print(f"Price ${price} is within threshold ${price_threshold}. Building an authorized workflow DFA.")
        final_dfa = ctrlg.DFA_minimize(dfa_purchase_intent)
    else:
        print(f"Price ${price} exceeds threshold ${price_threshold}. Building a blocked workflow DFA.")
        # remove all accept states, making it impossible to accept
        final_dfa = {
            'edges': dfa_purchase_intent['edges'],
            'initial_state': dfa_purchase_intent['initial_state'],
            'accept_states': set()
        }

    return final_dfa

def run_low_purchase_example(text, dfa_graph, tokenizer, vocab_size):
    print(f"Scanning text: '{text}'")
    dfa_model = ctrlg.DFAModel(dfa_graph, vocab_size)

    text_token_ids = tokenizer.encode(text)
    current_state = dfa_model.initial_state
    purchase_intent_detected = False

    # process each token through the DFA to detect purchase intent
    for i, token_id in enumerate(text_token_ids):
        current_state = dfa_model.next_state(current_state, token_id)

        # check if accept state reached
        if not purchase_intent_detected and dfa_model.is_accept(current_state):
            print(f"\nPURCHASE INTENT DETECTED at position {i}")
            print(f"DFA STATE TRANSITION: INITIAL_STATE → PRICE_CHECK")
            purchase_intent_detected = True
            break  # purchase intent found, now apply price logic

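    # price-based authorization logic
    # NOTE: the price decision is already baked into the DFA by build_purchase_dfa.
    # A blocked DFA has no accept states, so the loop above never sets
    # purchase_intent_detected and a blocked purchase falls through to the final
    # else branch ("No purchase tokens detected."); the inner else below is never reached.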
    if purchase_intent_detected:
        if dfa_model.is_accept(current_state):
            print("DFA STATE TRANSITION: PRICE_CHECK → PURCHASE_AUTHORIZED → PURCHASE_COMPLETE")
            print("Final state: PURCHASE_COMPLETE (Transaction Completed)")
            return True
        else:
            print("PURCHASE BLOCKED by DFA security constraint")
            print("Final state: INITIAL_STATE (different from purchase intent)")
            return False
    else:
        print("No purchase tokens detected.")
        return False

# =================================================
# examples
# =================================================

USER_PRICE_THRESHOLD = 100.0

test_scenarios = [
    ("I want to buy this laptop", 75.0),           # succeed
    ("Let's buy this car", 25000.0),               # block
    ("I need to purchase groceries", 45.0),        # succeed
    ("Just looking around the store today", 0.0)   # ignore
]

for i, (scenario, price) in enumerate(test_scenarios):
    print(f"\n{'='*60}")
    print(f"SCENARIO {i+1}: {scenario} for ${price}")
    print(f"{'='*60}")

    # Build the appropriate DFA for the scenario's price.
    purchase_dfa_graph = build_purchase_dfa(tokenizer, vocab_size, price, USER_PRICE_THRESHOLD)

    # Process the scenario through the DFA.
    run_low_purchase_example(scenario, purchase_dfa_graph, tokenizer, vocab_size)

print(f"\n{'='*60}")



SCENARIO 1: I want to buy this laptop for $75.0
Price $75.0 is within threshold $100.0. Building an authorized workflow DFA.
Scanning text: 'I want to buy this laptop'

PURCHASE INTENT DETECTED at position 3
DFA STATE TRANSITION: INITIAL_STATE → PRICE_CHECK
DFA STATE TRANSITION: PRICE_CHECK → PURCHASE_AUTHORIZED → PURCHASE_COMPLETE
Final state: PURCHASE_COMPLETE (Transaction Completed)

SCENARIO 2: Let's buy this car for $25000.0
Price $25000.0 exceeds threshold $100.0. Building a blocked workflow DFA.
Scanning text: 'Let's buy this car'
No purchase tokens detected.

SCENARIO 3: I need to purchase groceries for $45.0
Price $45.0 is within threshold $100.0. Building an authorized workflow DFA.
Scanning text: 'I need to purchase groceries'

PURCHASE INTENT DETECTED at position 3
DFA STATE TRANSITION: INITIAL_STATE → PRICE_CHECK
DFA STATE TRANSITION: PRICE_CHECK → PURCHASE_AUTHORIZED → PURCHASE_COMPLETE
Final state: PURCHASE_COMPLETE (Transaction Completed)

SCENARIO 4: Just looking arou

In [20]:
def run_low_purchase_example(text, price_threshold, actual_price):
    print(f"User-defined price threshold: ${price_threshold}")
    print(f"Scanning text: '{text}'")

    # check for purchase intent tokens
    target_variations = ["buy", " buy", "buy ", "purchase", " purchase"]

    # Look for purchase keywords in the text
    purchase_detected = False
    detected_token = None

    for variation in target_variations:
        if variation.lower() in text.lower():
            purchase_detected = True
            detected_token = variation.strip()
            break

    if not purchase_detected:
        print("No purchase tokens detected")
        return False

    # Purchase intent found - now validate price
    print(f"\nPURCHASE TOKEN DETECTED: '{detected_token}'")
    print(f"PRICE VALIDATION:")
    print(f"Detected price: ${actual_price}")
    print(f"User threshold: ${price_threshold}")

    if actual_price <= price_threshold:
        print(f"Price ${actual_price} is within threshold ${price_threshold}")
        print(f"EXECUTING SECURE PURCHASE:")
        print(f"Sending user's private information to the server...")
        print(f"PURCHASE COMPLETED SUCCESSFULLY")
        return True
    else:
        print(f"Price ${actual_price} exceeds threshold ${price_threshold}")
        print("PURCHASE BLOCKED by security constraint")
        return False

# =================================================
# examples
# =================================================

USER_PRICE_THRESHOLD = 100.0

test_scenarios = [
    ("I want to buy this laptop", 75.0),           # succeed
    ("Let's buy this car", 25000.0),               # block
    ("I need to purchase groceries", 45.0),        # succeed
    ("Just looking around the store today", 0.0)   # ignore
]

for i, (scenario, price) in enumerate(test_scenarios):
    print(f"\n{'='*60}")
    print(f"SCENARIO {i+1}: {scenario} for ${price}")
    print(f"{'='*60}")

    success = run_low_purchase_example(scenario, USER_PRICE_THRESHOLD, price)
    status = "SUCCESS" if success else "BLOCKED/IGNORED"
    print(f"Result: {status}")

print(f"\n{'='*60}")


SCENARIO 1: I want to buy this laptop for $75.0
User-defined price threshold: $100.0
Scanning text: 'I want to buy this laptop'

PURCHASE TOKEN DETECTED: 'buy'
PRICE VALIDATION:
Detected price: $75.0
User threshold: $100.0
Price $75.0 is within threshold $100.0
EXECUTING SECURE PURCHASE:
Sending user's private information to the server...
PURCHASE COMPLETED SUCCESSFULLY
Result: SUCCESS

SCENARIO 2: Let's buy this car for $25000.0
User-defined price threshold: $100.0
Scanning text: 'Let's buy this car'

PURCHASE TOKEN DETECTED: 'buy'
PRICE VALIDATION:
Detected price: $25000.0
User threshold: $100.0
Price $25000.0 exceeds threshold $100.0
PURCHASE BLOCKED by security constraint
Result: BLOCKED/IGNORED

SCENARIO 3: I need to purchase groceries for $45.0
User-defined price threshold: $100.0
Scanning text: 'I need to purchase groceries'

PURCHASE TOKEN DETECTED: 'purchase'
PRICE VALIDATION:
Detected price: $45.0
User threshold: $100.0
Price $45.0 is within threshold $100.0
EXECUTING SECURE