## Ad Auctions for LLMs via Retrieval Augmented Generation

Here is the outline of the experiments we plan to run in this notebook:

1. Given a query $x$, we generate fictional advertisers $\text{ad}_1, \text{ad}_2, ..., \text{ad}_n$ with their ads.
2. We sample bids $b_1, b_2, ..., b_n$ from a lognormal distribution (its mean and standard deviation should be adjusted with respect to the values of $q_i$).
3. We run different mechanisms on this LLM query/ads configuration, *i.e.*, $(x, \{\text{ad}_i\}_{i=1}^{n}, \{b_i\}_{i=1}^{n})$:

    + single-allocation segment auction with replacement
    + single-allocation segment auction without replacement
    + multi-allocation greedy mechanism
    + naive (i) mechanism where $y = y_{\text{orig}} + \text{ad}_1 + \text{ad}_2 + ... + \text{ad}_k$
    + naive (ii) mechanism where we run 2nd price auction in each round

4. Because the mechanisms are randomized, we run each of them $N$ times and report the expectation of the following metrics:
    + social welfare (sum of $v_{i_t} q_{i_t}$)
    + revenue (sum of $p_{i_t}$)
    + relevance (sum of $q_{i_t}$)
    + output quality; we consider the following metrics:
        + distance between the output including ads and the original output, which measures whether the output including ads still answers the query.
        + an off-the-shelf LLM's rating of whether the output including ads contains too much advertisement, probably on a scale from $1$ to $10$.
        + an off-the-shelf LLM's rating of whether the paragraphs are coherent, probably on a scale from $1$ to $10$.
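
The distance-based quality metric above could be sketched as follows. This is only an illustration under assumptions: cosine distance is one possible choice of distance, `output_distance` and `toy_embed` are hypothetical names, and `embed` stands for any callable that maps a string to a vector (in this notebook it could be `model.encode`).

```python
import numpy as np

def cosine_distance(u, v):
    """Return 1 - cosine similarity between two 1-D vectors."""
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    return 1.0 - float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def output_distance(y_orig, y_with_ads, embed):
    """Distance between the ad-free answer and the answer containing ads.

    `embed` is a hypothetical callable mapping a string to a 1-D vector.
    """
    return cosine_distance(embed(y_orig), embed(y_with_ads))

def toy_embed(text):
    """Toy embedding for illustration only: letter counts over a-z."""
    counts = np.zeros(26)
    for ch in text.lower():
        if 'a' <= ch <= 'z':
            counts[ord(ch) - ord('a')] += 1
    return counts
```

A distance close to 0 suggests the output with ads stays close to the original answer; larger values suggest the ads have pulled the text away from it.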


There are several design choices that we consider:
+ How long should each segment be? We first run with one paragraph per segment.
+ Dependent or independent segment auction.
+ The distribution that the allocation vector $x_i$ comes from. Since we use a RAG-based aggregation function on bids to get the allocation vector, we need to control what this vector looks like, *e.g.*, by fixing some ratio $\frac{x_{\max}}{x_{\min}}$.
+ How many segments should we consider? We use $k=3$ in our experiments, which also gives us the results for $k = 1, 2$.
+ Design choices for the multi-allocation greedy mechanism.
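
To make the $\frac{x_{\max}}{x_{\min}}$ design choice concrete, here is a minimal sketch. It assumes the allocation vector is the normalized relevance-weighted bid, $x_i = q_i b_i / \sum_j q_j b_j$, which mirrors `adjusted_bids` in `randomized_selection`; the helper names and the toy numbers are hypothetical.

```python
import numpy as np

def allocation_vector(q, b):
    """Normalized relevance-weighted bids: x_i = q_i * b_i / sum_j q_j * b_j."""
    weights = np.asarray(q, dtype=float) * np.asarray(b, dtype=float)
    return weights / weights.sum()

def max_min_ratio(x):
    """Ratio between the largest and smallest allocation probability."""
    x = np.asarray(x, dtype=float)
    return float(x.max() / x.min())

# Toy example (made-up numbers): equal bids, so the ratio is driven by relevance alone.
q = np.array([0.6, 0.3, 0.1])
b = np.array([1.0, 1.0, 1.0])
x = allocation_vector(q, b)
ratio = max_min_ratio(x)
```

A large ratio means the most relevant advertiser wins almost every segment, while a ratio near 1 makes the selection close to uniform.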

### GPT-4 API Access

In [1]:
from openai import OpenAI

# The client reads the API key from the OPENAI_API_KEY environment variable.
client = OpenAI()

def query_to_chatgpt(prompt):
    chat_completion_det = client.chat.completions.create(
        messages = [
            {
                "role": "user",
                "content": prompt
            },
            {
                "role": "assistant",
                "content": ""
            }
        ],
        model = "gpt-4-turbo",
        logprobs = False,
        temperature = 1,
        max_tokens = 200
    )
    return chat_completion_det.choices[0].message.content

### Libraries

In [2]:
from sentence_transformers import SentenceTransformer, util
from tqdm import tqdm
import time
import numpy as np
import json
from itertools import combinations_with_replacement

### Models

In [3]:
model = SentenceTransformer('multi-qa-MiniLM-L6-cos-v1')

### Initiating query and advertisers
I borrow samples from the previous experiments we did -- we may need to generate fictional advertisers (ask Sebastien).

In [4]:
prompt = 'How to activate Internationl Roaming?'
advertisers = [
    'AT&T', 'T-Mobile', 'Vodafone', 'Huawei',
    'Apple', 'Samsung', 'LG', 'Sony',
    'BMW', 'Costco', 'Starbucks', 'ALDI', 'Lidl',
]
prompt_for_adv = 'Give me a paragraph about {} so that I can use it to advertise this brand.'

### Obtaining $\text{ad}_i$ For All Advertisers

Here, we use the `prompt_for_adv` above to get a paragraph describing each of the advertisers.
The responses form the set of advertisements (documents) that RAG uses to include advertisements in the output. As in the paper, we denote this set by:
$$\{\text{ad}_1, \text{ad}_2, ..., \text{ad}_n\}.$$

In [5]:
%%time
ads = []
for c in tqdm(advertisers):
    ads.append(query_to_chatgpt(prompt_for_adv.format(c)))

100%|██████████| 13/13 [01:43<00:00,  7.97s/it]

CPU times: user 128 ms, sys: 15.1 ms, total: 143 ms
Wall time: 1min 43s





In [8]:
ads

["Experience the power of connection with AT&T, America's reliable network. Whether you're calling your loved ones, streaming your favorite series, or managing your business remotely, AT&T provides seamless service and cutting-edge technology to keep you connected in more places. With our fast 5G speeds, comprehensive coverage, and a range of flexible plans tailored to fit your needs, AT&T ensures that every call is clear and every connection is strong. Join millions who trust AT&T to stay connected to the world around them — because with AT&T, it's not just about communication; it's about connecting with confidence.",
 "Experience the power of connectivity with T-Mobile, the leader in 5G technology. At T-Mobile, we're committed to providing the best mobile experience, with lightning-fast speeds, unparalleled reliability, and nationwide coverage that keeps you connected wherever you go. Whether you're streaming, browsing, gaming, or just staying in touch with loved ones, T-Mobile ensur

### Segment-wise RAG-based Output Modification

In [50]:
prompt_to_init_answer = '''
{} please respond to this question for only 1 paragraph while also advertise {} with this context:  "{}" 
Make sure to connect the answer and the advertisement very naturally, not something like appending the ads after just answering the question.
Focus on answering the question, there shouldn't be too much advertisement in the output.
Your output must be 1 paragraph, 4 sentences please.
'''

prompt_to_continue_answer = '''
please answer the question "{}" by continuing your response "{}". please write only 1 paragraph while also advertise {} with this context:  "{}" 
Make sure to connect the answer and the advertisement very naturally, not something like appending the ads after just answering the question.
Focus on answering the question, there shouldn't be too much advertisement in the output. Ads should be minimal.
Your output must be 1 paragraph, 4 sentences please.
'''

prompt_to_init_multi_ad_k3 = '''
{} please respond to this question for only 1 paragraph while minimally advertise {}, {}, and {} with these three contexts:  1) "{}" \n 2) "{}" \n 3) "{}".
Make sure that you advertise all of them and connect the answer and the advertisement very naturally, not something like appending the ads after just answering the question.
Focus on answering the question, there shouldn't be too much advertisement in the output.
Your output must be 1 paragraph.
'''

prompt_to_continue_multi_ad_k3 = '''
please answer the question "{}" by continuing this response "{}". write only 1 paragraph while minimally advertise {}, {}, and {} with these three contexts:  1) "{}" \n 2) "{}" \n 3) "{}".
Make sure that you advertise all of them and connect the answer and the advertisement very naturally, not something like appending the ads after just answering the question.
Focus on answering the question, there shouldn't be too much advertisement in the output.
Your output must be 1 paragraph.
'''

prompt_to_init_multi_ad_k2 = '''
{} please respond to this question for only 1 paragraph while minimally advertise {} and {} with these two contexts:  1) "{}" \n 2) "{}".
Make sure that you advertise both of them and connect the answer and the advertisement very naturally, not something like appending the ads after just answering the question.
Focus on answering the question, there shouldn't be too much advertisement in the output.
Your output must be 1 paragraph.
'''

prompt_to_continue_multi_ad_k2 = '''
please answer the question "{}" by continuing this response "{}". write only 1 paragraph while minimally advertise {} and {} with these two contexts:  1) "{}" \n 2) "{}".
Make sure that you advertise both of them and connect the answer and the advertisement very naturally, not something like appending the ads after just answering the question.
Focus on answering the question, there shouldn't be too much advertisement in the output.
Your output must be 1 paragraph.
'''

prompt_to_init_multi_ad_k1 = '''
{} please respond to this question for only 1 paragraph while also advertise {} with this context:  "{}" 
Make sure to connect the answer and the advertisement very naturally, not something like appending the ads after just answering the question.
Focus on answering the question, there shouldn't be too much advertisement in the output.
Your output must be 1 paragraph.
'''

prompt_to_continue_multi_ad_k1 = '''
please answer the question "{}" by continuing this response "{}". write only 1 paragraph while also advertise {} with this context:  "{}" 
Make sure to connect the answer and the advertisement very naturally, not something like appending the ads after just answering the question.
Focus on answering the question, there shouldn't be too much advertisement in the output. Ads should be minimal.
Your output must be 1 paragraph.
'''

### Segment-wise Generation
At each segment, we ask the LLM to continue its response to prompt $x$ while including the ad of the advertiser who won this round of the auction.

In [10]:
def segment_based_RAG_generation(prompt: str, advertiser: str, ad: str, curr_y=None):
    if not curr_y:
        # First segment (curr_y is None or ''): start a fresh answer.
        curr_y = query_to_chatgpt(prompt_to_init_answer.format(prompt, advertiser, ad))
    else:
        # Later segments: continue the existing answer with a new paragraph.
        curr_y += '\n' + query_to_chatgpt(prompt_to_continue_answer.format(prompt, curr_y, advertiser, ad))
    return curr_y

In [11]:
def segment_based_multi_ad_generation(prompt: str, advertisers: list, ads: list, curr_y=None):
    # One (initial, continuation) template pair per number of ads in the segment.
    templates = {
        1: (prompt_to_init_multi_ad_k1, prompt_to_continue_multi_ad_k1),
        2: (prompt_to_init_multi_ad_k2, prompt_to_continue_multi_ad_k2),
        3: (prompt_to_init_multi_ad_k3, prompt_to_continue_multi_ad_k3),
    }
    if len(advertisers) not in templates:
        raise ValueError('only 1, 2, or 3 ads per segment are supported')
    init_template, continue_template = templates[len(advertisers)]

    if not curr_y:
        # First segment (curr_y is None or ''): nothing to continue from.
        return query_to_chatgpt(init_template.format(prompt, *advertisers, *ads))

    # Later segments: pass the response so far and append the new paragraph.
    return curr_y + '\n' + query_to_chatgpt(continue_template.format(prompt, curr_y, *advertisers, *ads))

### Relevance Measure

The code below estimates $\text{P}_{\eta}\left[\text{ad}_i\ |\ x\right]$ using the embedding space of `SentenceTransformer`.

In [52]:
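def rag_based_relevance(x: str, ads: list):
    x_embedding = model.encode(x)
    ads_embedding = model.encode(ads)
    # Map the dot-product similarity to [0, 1] (assuming unit-norm embeddings, so the dot product lies in [-1, 1]).
    bf = (1 + util.dot_score(x_embedding, ads_embedding)[0].numpy()) / 2
    # Affine rescaling of the scores (offset 0.4, width 0.3); the result is not clipped to [0, 1].
    return (bf - 0.4) / 0.3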
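
To see what the rescaling in `rag_based_relevance` does without loading the model, here is a small sketch that applies the same two steps to hand-picked cosine similarities. The function name `rescale_similarity` and the input values are hypothetical; the constants 0.4 and 0.3 are the ones used above.

```python
import numpy as np

def rescale_similarity(cos_sim, low=0.4, width=0.3):
    """Map similarity in [-1, 1] to [0, 1], then apply (value - low) / width."""
    base = (1.0 + np.asarray(cos_sim, dtype=float)) / 2.0
    return (base - low) / width

# Made-up similarities: -0.2 maps to the bottom of the window, 0.4 to the top.
scores = rescale_similarity([-0.2, 0.0, 0.4, 0.8])
```

Similarities below $-0.2$ give negative scores and similarities above $0.4$ give scores larger than 1, so whether to clip the result is a design choice that this sketch leaves open.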

### Metrics

We compare the mechanisms with respect to the metrics below. For an auction with $k$ segments, they are defined as follows:
$$\begin{align*}
    \texttt{Revenue}:& \sum_{t=1}^{k} p_{i_t} \\
    \texttt{Social Welfare}:& \sum_{t=1}^{k} v_{i_t} q_{i_t} \\
    \texttt{Relevance}:& \sum_{t=1}^{k} q_{i_t} \\
\end{align*}$$


In [13]:
SOCIAL_WELFARE = 'social_welfare'
REVENUE = 'revenue'
RELEVANCE = 'relevance'
OUTPUT = 'output'

In [14]:
def init_metrics():
    return {
        SOCIAL_WELFARE: [],
        REVENUE: [],
        RELEVANCE: [],
        OUTPUT: [],
    }

def update_metrics(metrics: dict, payment: float, value: float, rel: float, output: str):
    metrics[REVENUE].append(payment)
    metrics[SOCIAL_WELFARE].append(value * rel)
    metrics[RELEVANCE].append(rel)
    metrics[OUTPUT].append(output)
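
Since every mechanism is randomized, the reported numbers are averages over repeated runs. A minimal sketch of that aggregation, assuming `run_mechanism` is a hypothetical zero-argument callable that returns a metrics dict shaped like the one built by `init_metrics`:

```python
import numpy as np

def expected_metrics(run_mechanism, num_runs,
                     keys=('social_welfare', 'revenue', 'relevance')):
    """Average the per-run totals of each metric over `num_runs` runs."""
    totals = {key: [] for key in keys}
    for _ in range(num_runs):
        metrics = run_mechanism()
        for key in keys:
            # Sum over the k segments of a single run.
            totals[key].append(float(np.sum(metrics[key])))
    return {key: float(np.mean(values)) for key, values in totals.items()}

def fake_mechanism():
    """Stand-in for a real mechanism, returning fixed made-up numbers."""
    return {
        'social_welfare': [1.0, 2.0],
        'revenue': [0.5, 0.5],
        'relevance': [0.2, 0.3],
        'output': ['segment 1', 'segment 2'],
    }

averages = expected_metrics(fake_mechanism, num_runs=5)
```

In the notebook, `run_mechanism` could wrap one of the mechanism functions with a freshly sampled or fixed bid vector, depending on which source of randomness is being averaged over.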

### Running Auction

We adjust the bids $\{b_i\}_{i=1}^{n}$ according to the scores coming from RAG, *i.e.*, $q_i$. The implementation follows pages 4 and 5 of the draft.

In [53]:
def randomized_selection(q: np.ndarray, b: np.ndarray):
    # Relevance-adjusted bids, normalized to sum to 1 (selection probabilities).
    adjusted_bids = (q * b) / np.dot(q, b)
    n = b.shape[0] # number of ads
    
    # Draw u ~ U[0, 1] and select the ad whose cumulative-probability interval contains u.
    k , u = None, np.random.uniform()
    
    for i in range(n):
        W_i = (np.sum(adjusted_bids[: i]), np.sum(adjusted_bids[: i+1]))
        if W_i[0] <= u <= W_i[1]:
            k = i
    
    # Payment of the winner k, following the rule in the draft.
    all_other_than_k = np.sum(adjusted_bids) - adjusted_bids[k]
    aux_value = (np.sum(adjusted_bids[:k])) / all_other_than_k
    
    if aux_value >= u:
        payment = (np.sum(adjusted_bids[:k]) - u * all_other_than_k) / (u * q[k])
    else:
        payment = (np.sum(adjusted_bids[:k]) - u * all_other_than_k) / ((u - 1) * q[k])
    
    return k, payment
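
The selection step of `randomized_selection` picks ad $i$ with probability proportional to $q_i b_i$. As a sanity check, the same draw can be sketched with a cumulative sum and `np.searchsorted`. This covers only the winner selection, not the payment rule, and `sample_winner` is a hypothetical helper name.

```python
import numpy as np

def sample_winner(q, b, rng):
    """Draw index i with probability q_i * b_i / sum_j q_j * b_j."""
    weights = np.asarray(q, dtype=float) * np.asarray(b, dtype=float)
    cdf = np.cumsum(weights / weights.sum())
    u = rng.uniform()
    # First index whose cumulative probability exceeds u; clamp against rounding in cdf[-1].
    return min(int(np.searchsorted(cdf, u, side='right')), len(cdf) - 1)

# Empirical check with made-up numbers: frequencies should approach the weights.
rng = np.random.default_rng(0)
q = np.array([0.5, 0.25, 0.0, 0.25])
b = np.ones(4)
draws = np.array([sample_winner(q, b, rng) for _ in range(20000)])
freq = np.bincount(draws, minlength=4) / draws.size
```

An ad with zero adjusted bid (index 2 here) is never selected, which is the property the without-replacement variant relies on when it zeroes out the bids of past winners.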
        

### Mechanisms
+ single-allocation segment auction with replacement
+ single-allocation segment auction without replacement
+ multi-allocation greedy mechanism
+ naive (i) mechanism where $y = y_{\text{orig}} + \text{ad}_1 + \text{ad}_2 + ... + \text{ad}_k$
+ naive (ii) mechanism where we run 2nd price auction in each round

For each of these mechanisms, we need a set of bids $\{b_i\}_{i=1}^{n}$ from all advertisers, as well as the RAG-based relevance $q_i^{(t)}$. We compute the RAG-based relevance by measuring the similarity between the response generated so far (or the query $x$) and the ads (documents). Each mechanism runs for $k$ iterations, and we store the metrics along the way so that they can later be reported as properties of the mechanisms.

In [58]:
def single_allocation_with_replacement(
    prompt: str,
    advertisers: list,
    ads: list, 
    bids: np.ndarray,
    num_of_segments: int,
    dependent: bool = False):
    
    curr, v = '', bids
    metrics = init_metrics()
    
    if not dependent:
        q = rag_based_relevance(prompt, ads, )
        for x in list(zip(q, bids, advertisers)):
            print(x)
    
    for t in range(num_of_segments):
        if dependent:
            q = rag_based_relevance(prompt + '\n' + curr, ads, )
        
        k, payment = randomized_selection(q, bids)
        print(f'time: {t}, k: {k}, advertiser: {advertisers[k]}')
        curr = segment_based_RAG_generation(prompt=prompt, advertiser=advertisers[k], ad=ads[k], curr_y=curr)
        # NOTE: the call above can be commented out when only the output-independent metrics are needed.
        update_metrics(metrics, payment, v[k], q[k], curr)
    
    return metrics, curr
            
    

In [18]:
def single_allocation_without_replacement(
    prompt: str,
    advertisers: list,
    ads: list, 
    bids: np.ndarray,
    num_of_segments: int,
    dependent: bool = False):
    
    # v keeps the original bids; the working copy of bids is zeroed out for past winners,
    # so the caller's array is left untouched and v[k] still holds the winner's value.
    curr, v = '', bids.copy()
    bids = bids.copy()
    metrics = init_metrics()
    selected_ads = np.zeros(len(ads)) # keep track of ads that are selected in previous rounds of auction.
    
    for t in range(num_of_segments):
        q = rag_based_relevance(prompt + '\n' + curr if dependent else prompt, ads)
        k, payment = randomized_selection(q, bids)
        assert selected_ads[k] == 0 # shouldn't have been selected before.
        # curr = curr + '\n' + segment_based_RAG_generation(prompt=prompt, advertiser=advertisers[k], ad=ads[k], curr_y=curr)
        # TODO: We ignore the above line for now, as we only care about metrics that do not depend on the output.
        selected_ads[k] = 1 # k is winner
        bids[k] = 0 # never gonna be winner again
        update_metrics(metrics, payment, v[k], q[k], curr)        
    
    return metrics, curr
            
    

In [19]:
def single_allocation_naive_i(
    prompt: str,
    advertisers: list,
    ads: list, 
    bids: np.ndarray,
    num_of_segments: int,
    dependent: bool = False):
    
    curr, v = '', bids
    metrics = init_metrics()
    
    for t in range(num_of_segments):
        q = rag_based_relevance(prompt + '\n' + curr if dependent else prompt, ads, )
        k, payment = randomized_selection(q, bids)
        curr = curr + ads[k]
        update_metrics(metrics, payment, v[k], q[k], curr)
        
    return metrics, curr
    

In [21]:
def single_allocation_naive_ii(
    prompt: str,
    advertisers: list,
    ads: list, 
    bids: np.ndarray,
    num_of_segments: int,
    dependent: bool = False):
    
    curr, v = '', bids
    metrics = init_metrics()

    for t in range(num_of_segments):
        q = rag_based_relevance(prompt + '\n' + curr if dependent else prompt, ads, )
        # Uniform relevance removes the effect of RAG; this is the stand-in for the second-price baseline.
        k, payment = randomized_selection(np.ones(bids.shape[0]), bids)
        # curr = curr + '\n' + segment_based_RAG_generation(prompt=prompt, advertiser=advertisers[k], ad=ads[k], curr_y=curr)
        # TODO: We ignore the above line for now, as we only care about metrics that do not depend on the output.
        update_metrics(metrics, payment, v[k], q[k], curr)
        
    return metrics, curr
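
The function above still draws the winner through `randomized_selection`, only with uniform relevance. For comparison, a textbook sealed-bid second-price round is deterministic: the highest bidder wins and pays the second-highest bid. The sketch below shows that rule; it is an assumption about what the naive (ii) baseline could look like, not the implementation used in this notebook, and `second_price_round` is a hypothetical name.

```python
import numpy as np

def second_price_round(bids):
    """Return (winner index, payment) for one sealed-bid second-price round."""
    bids = np.asarray(bids, dtype=float)
    if bids.shape[0] < 2:
        raise ValueError('a second-price auction needs at least two bidders')
    order = np.argsort(bids)[::-1]   # indices sorted by descending bid
    winner = int(order[0])
    payment = float(bids[order[1]])  # winner pays the runner-up's bid
    return winner, payment

# Made-up bids for illustration.
winner, payment = second_price_round([2.5, 3.1, 2.9])
```

Running this rule $k$ times on the same bids always selects the same advertiser unless the winner's bid is removed between rounds, which is one more design choice to settle for this baseline.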
    

In [22]:
def get_relevency(y: str, A: list, ads: list):
    y_embedding = model.encode(y)
    ads_embedding = model.encode([ads[ad] for ad in A])
    return (1 + util.dot_score(y_embedding, ads_embedding)[0].numpy()) / 2
    

In [23]:
def multi_allocation_greedy(
    prompt: str,
    advertisers: list,
    ads: list, 
    bids: np.ndarray,
    num_of_segments: int,
    num_of_ads_in_each_segment: int):
    
    curr, v, n = '', bids, bids.shape[0]
    metrics = init_metrics()
    
    
    
    for t in range(num_of_segments):
        A = []
        
        while len(A) < num_of_ads_in_each_segment:
            adjusted_bids = np.zeros(n)
            for i in tqdm(range(n)):
                if i in A:
                    continue
            
                A_i = A + [i]
                y_A_i = segment_based_multi_ad_generation(prompt=prompt, advertisers=[advertisers[j] for j in A_i], ads=[ads[j] for j in A_i], curr_y=curr)
                
                q_A_i = get_relevency(y_A_i, A_i, ads)
                adjusted_bids[i] = np.dot(q_A_i, bids[np.array(A_i)])
                
            
            i_star = np.argmax(adjusted_bids)
            A = A + [i_star]
        
        y_A = segment_based_multi_ad_generation(prompt=prompt, advertisers=[advertisers[j] for j in A], ads=[ads[j] for j in A], curr_y=curr)
        q_A = get_relevency(y_A, A, ads)
        
        metrics[OUTPUT].append(y_A)
        metrics[SOCIAL_WELFARE].append(np.dot(q_A, bids[np.array(A)]))
        metrics[RELEVANCE].append(np.sum(q_A))
        curr = y_A  # y_A already contains the previous segments followed by the new paragraph
    
    return metrics, curr

## Sampling Bids and Running Auctions
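We sample bids from a lognormal distribution, *i.e.*, $b \sim \text{Lognormal}(\mu, \sigma)$, where $\mu$ and $\sigma$ are the mean and standard deviation of the underlying normal distribution of $\log b$. We set $\mu = 1$ and $\sigma = 0.03$.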

In [24]:
def sample_bids(num_of_advertisers, mu=1, sigma=0.03):
    return np.random.lognormal(mu, sigma, (num_of_advertisers,))
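
Because `mu` and `sigma` parameterize $\log b$ rather than $b$ itself, adjusting the mean and standard deviation of the bids requires a conversion. For a lognormal variable with mean $m$ and standard deviation $s$, the standard identities are $\sigma^2 = \ln(1 + s^2/m^2)$ and $\mu = \ln m - \sigma^2/2$. A minimal sketch follows; `lognormal_params` is a hypothetical helper and the target values are made up.

```python
import numpy as np

def lognormal_params(target_mean, target_std):
    """Return (mu, sigma) of log(b) so that b has the given mean and std."""
    sigma2 = np.log(1.0 + (target_std / target_mean) ** 2)
    mu = np.log(target_mean) - sigma2 / 2.0
    return float(mu), float(np.sqrt(sigma2))

# Made-up targets: bids with mean 2.0 and standard deviation 0.5.
mu, sigma = lognormal_params(target_mean=2.0, target_std=0.5)
rng = np.random.default_rng(0)
samples = rng.lognormal(mu, sigma, size=200_000)
```

The resulting `mu` and `sigma` can be passed directly to `sample_bids`, which makes it easier to calibrate the bids against the range of the relevance scores $q_i$.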

In [25]:
bids = sample_bids(len(ads))

In [59]:
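# Note: the function returns (metrics, output), so here y1 holds the metrics dict and m1 the generated text.
y1, m1 = single_allocation_with_replacement(prompt=prompt, advertisers=advertisers, ads=ads, bids=bids, num_of_segments=3, dependent=False)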

(0.69082683, 2.5848856954575905, 'AT&T')
(0.5200673, 2.5223198469809267, 'T-Mobile')
(0.62053806, 2.6660367930562083, 'Vodafone')
(0.52880275, 2.710690883169751, 'Huawei')
(0.18322447, 2.6714417746953365, 'Apple')
(0.37401527, 2.684990009230503, 'Samsung')
(0.5497335, 2.750882249676141, 'LG')
(0.16305208, 2.6275275071856763, 'Sony')
(0.33878297, 2.629014736623947, 'BMW')
(0.31042683, 2.5832462503897284, 'Costco')
(0.3547878, 2.7008032224985565, 'Starbucks')
(0.33118257, 2.793917580311574, 'ALDI')
(0.33572862, 2.782846278415154, 'Lidl')
time: 0, k: 0, advertiser: AT&T
time: 1, k: 3, advertiser: Huawei
time: 2, k: 6, advertiser: LG


In [56]:
m1

"\nTo activate International Roaming, typically you would need to contact your mobile service provider either through their website, customer service hotline, or by visiting a local store to ensure seamless connectivity on your travels. Just as you would choose a BMW for its unmatched quality and reliability on the roads, ensuring your mobile services are prepared with International Roaming is vital for a stress-free journey. BMW, renowned for transforming every drive into a luxurious experience with its dynamic capabilities and elegant designs, parallels the need for having reliable communication while abroad. So, before you set off on your next adventure, make sure your phone is as ready to roam internationally as your BMW is to conquer new roads.\nTo activate International Roaming, typically you would need to contact your mobile service provider either through their website, customer service hotline, or visiting a local store to ensure seamless connectivity on your travels. Just as 

In [40]:
m1

"\nJust like Lidl ensures you experience quality and convenience with their array of exceptional goods and services, activating international roaming is conveniently straightforward – designed to keep you connected effortlessly when you're abroad, just as Lidl connects you to the best in groceries and essentials. To activate international roaming, simply contact your mobile service provider before you travel. Ensure your phone is compatible with international networks, and choose a plan that suits your travel needs. This way, you can enjoy uninterrupted connectivity while exploring new destinations, much like how Lidl provides a seamless shopping experience, ensuring you find everything you need under one roof to prepare for your journey. So, before setting off, make a quick trip to Lidl for all your travel essentials, where quality meets affordability.\nJust like Lidl ensures you experience quality and convenience with their array of exceptional goods and services, activating internat