In [86]:
import pandas as pd
import numpy as np
import scipy
from scipy.optimize import curve_fit

## Part A: 

### Estimate the alpha and beta parameters for each of these four keywords for this firm. Hand-in: The eight numbers. No additional writeup required. Hint on checking your answers: For kw8322228, alpha should be between 70 and 76, beta should be between 0.03 and 0.06, with an RSS of about 230.

### To estimate the alpha and beta for a keyword, you need to run a nonlinear regression of n.clicks as a function of bid.value, using the appropriate functional form. Nonlinear regression can be run in R using nls() and in Python using scipy.optimize.curve_fit.

### As you may recall from previous HWs, nonlinear regression requires initial values for the parameters. For the initial value of alpha for a particular keyword, use the number of clicks at the highest bid in the dataset provided for that keyword. For the initial value of beta for each keyword, use the reciprocal of the average bid in the dataset provided for that keyword.

In [8]:
k1 = pd.read_csv("clicksdata.kw8322228.csv")
k1 = k1.iloc[:,1:]

k2 = pd.read_csv("clicksdata.kw8322392.csv")
k2 = k2.iloc[:,1:]

k3 = pd.read_csv("clicksdata.kw8322393.csv")
k3 = k3.iloc[:,1:]

k4 = pd.read_csv("clicksdata.kw8322445.csv")
k4 = k4.iloc[:,1:]

In [28]:
def find_alpha_beta(bid_value, alpha, beta):
    return alpha * (1 - np.exp(-beta * bid_value))

def est_params(k):
    x = k['bid.value']
    y = k['n.clicks']

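    # Initial alpha: clicks observed at the highest bid, regardless of row order.
    init_alpha = k.loc[x.idxmax(), 'n.clicks']
    # Initial beta: reciprocal of the average bid.
    init_beta = 1 / x.mean()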

    params, params_covariance = curve_fit(find_alpha_beta, x, y, p0 = [init_alpha, init_beta])

    return params[0], params[1]
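
The hint in Part A quotes a residual sum of squares (RSS), which `est_params` does not report. Below is a minimal sketch of how the RSS and rough standard errors could be read off a `curve_fit` result. It uses synthetic data rather than the homework files; `clicks_model`, the "true" parameter values, and the noise level are all illustrative assumptions.

```python
import numpy as np
from scipy.optimize import curve_fit

def clicks_model(bid, alpha, beta):
    # Same functional form as in Part A: clicks saturate at alpha as the bid grows.
    return alpha * (1 - np.exp(-beta * bid))

# ASSUMPTION: synthetic data standing in for one keyword's (bid.value, n.clicks) pairs.
rng = np.random.default_rng(0)
bids = np.linspace(1, 40, 30)
true_alpha, true_beta = 74.0, 0.04
clicks = clicks_model(bids, true_alpha, true_beta) + rng.normal(0, 2.0, bids.size)

# Initial values as the assignment prescribes:
# clicks at the highest bid, and the reciprocal of the average bid.
p0 = [clicks[np.argmax(bids)], 1 / bids.mean()]
params, cov = curve_fit(clicks_model, bids, clicks, p0=p0)

# RSS is the sum of squared differences between observed and fitted clicks.
residuals = clicks - clicks_model(bids, *params)
rss = float(np.sum(residuals ** 2))

# Square roots of the covariance diagonal give approximate standard errors.
std_err = np.sqrt(np.diag(cov))

print("alpha, beta:", params)
print("RSS:", rss)
print("approx. standard errors:", std_err)
```

Applied to the real keyword data, the same residual calculation would give a number to compare against the RSS in the hint.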

In [29]:
print('kw8322228: alpha = ', est_params(k1)[0], ', beta = ', est_params(k1)[1])
print('kw8322392: alpha = ', est_params(k2)[0], ', beta = ', est_params(k2)[1])
print('kw8322393: alpha = ', est_params(k3)[0], ', beta = ', est_params(k3)[1])
print('kw8322445: alpha = ', est_params(k4)[0], ', beta = ', est_params(k4)[1])

kw8322228: alpha =  74.09086171091433 , beta =  0.03944902761835571
kw8322392: alpha =  156.4398020518242 , beta =  0.15008283809726447
kw8322393: alpha =  104.79929301898942 , beta =  0.07971659393008784
kw8322445: alpha =  188.1112794219063 , beta =  0.432291894875094


## Part B:

### Assume that you have no budget constraint. Using the alpha, beta parameters from Part A and the LTV and conversion rate values, estimate the optimal bids for each of the four keywords. Hand-in: the optimal bid value, the corresponding profit and the corresponding total expenditure for each of the four keywords. No additional writeup required. 

In [33]:
ltv = pd.read_excel('hw-kw-ltv-conv.rate-data.xlsx')

In [111]:
def profit(bid_value, alpha, beta, ltv, conv_rate):
    return alpha * (1 - np.exp(-beta * bid_value)) * (ltv * conv_rate - bid_value)

def optimal_bid_value(k, k_name):
    alpha, beta = est_params(k)

    # Look up the keyword's lifetime value and conversion rate.
    ltv_ = ltv[ltv['keyword'] == k_name]['ltv'].iloc[0]
    conv_rate = ltv[ltv['keyword'] == k_name]['conv.rate'].iloc[0]

    # Maximize profit by minimizing its negative. method='bounded' makes
    # sure the search is actually restricted to bids in [0, 50].
    fm = lambda x: -profit(x, alpha, beta, ltv_, conv_rate)
    optimal_bid = scipy.optimize.minimize_scalar(fm, bounds=(0, 50), method='bounded').x

    return alpha, beta, ltv_, conv_rate, optimal_bid

def expenditure(bid_value, alpha, beta):
    return bid_value * alpha * (1 - np.exp(-beta * bid_value))
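
For this profit function the unconstrained optimum also satisfies a first-order condition, which gives an independent check on the numerical optimizer. Writing v for LTV × conversion rate, setting the derivative of alpha * (1 - exp(-beta * b)) * (v - b) to zero gives beta * (v - b) = exp(beta * b) - 1. Below is a minimal sketch comparing a root-finder on that condition with a bounded minimization. The alpha and beta are the Part A estimates for kw8322228, while `value_per_click = 100.0` is a made-up number, not taken from the homework data.

```python
import numpy as np
from scipy.optimize import brentq, minimize_scalar

# Part A estimates for kw8322228.
alpha, beta = 74.09086171091433, 0.03944902761835571
# ASSUMPTION: illustrative LTV * conversion rate, not the homework value.
value_per_click = 100.0

def profit_at_bid(bid):
    return alpha * (1 - np.exp(-beta * bid)) * (value_per_click - bid)

def first_order_condition(bid):
    # d(profit)/d(bid) divided by alpha * exp(-beta * bid), which is always positive,
    # so this expression has the same root as the derivative itself.
    return beta * (value_per_click - bid) - (np.exp(beta * bid) - 1)

# The condition is positive at bid = 0 and negative at bid = value_per_click,
# so a bracketing root-finder applies on that interval.
bid_from_root = brentq(first_order_condition, 0.0, value_per_click)

bid_from_search = minimize_scalar(
    lambda b: -profit_at_bid(b),
    bounds=(0.0, value_per_click),
    method='bounded',
).x

print("bid from first-order condition:", bid_from_root)
print("bid from bounded search:       ", bid_from_search)
```

If the two bids agree closely, the bounded search has most likely found the interior maximum rather than stopping at an endpoint.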

In [65]:
alpha, beta, ltv_, conv_rate, optimal_bid = optimal_bid_value(k1, 'kw8322228')
expenditure_1 = expenditure(optimal_bid, alpha, beta)

print('kw8322228 optimal bid: ', optimal_bid)
print('kw8322228 profit: ', profit(optimal_bid, alpha, beta, ltv_, conv_rate))
print('kw8322228 expenditure: ', expenditure_1)

kw8322228 optimal bid:  34.127622021575114
kw8322228 profit:  3950.4569566203595
kw8322228 expenditure:  1870.6154231292371


In [66]:
alpha, beta, ltv_, conv_rate, optimal_bid = optimal_bid_value(k2, 'kw8322392')
expenditure_2 = expenditure(optimal_bid, alpha, beta)

print('kw8322392 optimal bid: ', optimal_bid)
print('kw8322392 profit: ', profit(optimal_bid, alpha, beta, ltv_, conv_rate))
print('kw8322392 expenditure: ', expenditure_2)

kw8322392 optimal bid:  13.563448218393251
kw8322392 profit:  6032.90219999242
kw8322392 expenditure:  1844.7546824448818


In [67]:
alpha, beta, ltv_, conv_rate, optimal_bid = optimal_bid_value(k3, 'kw8322393')
expenditure_3 = expenditure(optimal_bid, alpha, beta)

print('kw8322393 optimal bid: ', optimal_bid)
print('kw8322393 profit: ', profit(optimal_bid, alpha, beta, ltv_, conv_rate))
print('kw8322393 expenditure: ', expenditure_3)

kw8322393 optimal bid:  22.43386794519653
kw8322393 profit:  5451.614107514758
kw8322393 expenditure:  1957.8736021122334


In [68]:
alpha, beta, ltv_, conv_rate, optimal_bid = optimal_bid_value(k4, 'kw8322445')
expenditure_4 = expenditure(optimal_bid, alpha, beta)

print('kw8322445 optimal bid: ', optimal_bid)
print('kw8322445 profit: ', profit(optimal_bid, alpha, beta, ltv_, conv_rate))
print('kw8322445 expenditure: ', expenditure_4)

kw8322445 optimal bid:  5.816956121646157
kw8322445 profit:  4544.188935686043
kw8322445 expenditure:  1005.7186591361896


In [69]:
print('Total expenditure:', expenditure_1 + expenditure_2 + expenditure_3 + expenditure_4)


Total expenditure: 6678.962366822542


## Part C: 

### Assume now that you have a budget constraint of $\$3000$ across these four keywords. Compute the optimal bid amounts and the corresponding expenditures for the keywords. Note this optimization in its most obvious form involves nonlinear functions and nonlinear constraints. Decide on the initial value vector x0 and the solver on your own. Hand-in: the optimal bid value, the corresponding profit and the corresponding total expenditure for each of the four keywords.

In [125]:
k = [k1, k2, k3, k4]
k_name = ['kw8322228', 'kw8322392', 'kw8322393', 'kw8322445']

# Fit each keyword once: the fitted alpha and beta do not depend on the
# bids being optimized, so there is no need to refit on every call.
fitted_params = [est_params(k_i) for k_i in k]

def total_profit(X):
    # Returns the NEGATIVE total profit, because scipy.optimize.minimize minimizes.
    profit_k = []
    for i in range(len(k)):
        alpha, beta = fitted_params[i]
        bid_value = X[i]
        ltv_ = ltv[ltv['keyword'] == k_name[i]]['ltv'].iloc[0]
        conv_rate = ltv[ltv['keyword'] == k_name[i]]['conv.rate'].iloc[0]
        profit_k_i = profit(bid_value, alpha, beta, ltv_, conv_rate)
        profit_k.append(-profit_k_i)
    return np.sum(profit_k)

def total_expend(X):
    # Total spend across the four keywords at bids X.
    expenditure_k = []
    for i in range(len(k)):
        alpha, beta = fitted_params[i]
        expenditure_k.append(expenditure(X[i], alpha, beta))
    return np.sum(expenditure_k)

In [127]:
budget = 3000
budget_constraint_object = scipy.optimize.NonlinearConstraint(total_expend, 0, budget)
bounds_object = ((0,None),(0,None),(0,None),(0,None))
x0 = [0,0,0,0]

with_constraint = scipy.optimize.minimize(total_profit, x0=x0, method='trust-constr', bounds=bounds_object, constraints=budget_constraint_object)

In [173]:
def profit_expenditure_with_constraints(i):
    alpha,beta = est_params(k[i])
    ltv_ = ltv[ltv['keyword'] == k_name[i]]['ltv'].iloc[0]
    conv_rate = ltv[ltv['keyword'] == k_name[i]]['conv.rate'].iloc[0]
    bid_value = with_constraint.x[i]
    
    profit_k = profit(bid_value, alpha, beta, ltv_, conv_rate)
    expenditure_k = expenditure(bid_value, alpha, beta)
    return bid_value, profit_k, expenditure_k

In [183]:
bid_value, profit_k, expenditure_k = profit_expenditure_with_constraints(0)

print(k_name[0], ':')
print('Optimal bid value: ', bid_value)
print('Profit: ', profit_k)
print('Expenditure: ', expenditure_k)

kw8322228 :
Optimal bid value:  17.924260684647198
Profit:  3315.5073988114336
Expenditure:  673.208962950447


In [182]:
bid_value, profit_k, expenditure_k = profit_expenditure_with_constraints(1)

print(k_name[1], ':')
print('Optimal bid value: ', bid_value)
print('Profit: ', profit_k)
print('Expenditure: ', expenditure_k)

kw8322392 :
Optimal bid value:  8.118451293151496
Profit:  5487.232095363949
Expenditure:  894.5068508342152


In [181]:
bid_value, profit_k, expenditure_k = profit_expenditure_with_constraints(2)

print(k_name[2], ':')
print('Optimal bid value: ', bid_value)
print('Profit: ', profit_k)
print('Expenditure: ', expenditure_k)

kw8322393 :
Optimal bid value:  12.82828798861683
Profit:  4836.614288442379
Expenditure:  860.8853494724577


In [180]:
bid_value, profit_k, expenditure_k = profit_expenditure_with_constraints(3)

print(k_name[3], ':')
print('Optimal bid value: ', bid_value)
print('Profit: ', profit_k)
print('Expenditure: ', expenditure_k)

kw8322445 :
Optimal bid value:  3.775699809199757
Profit:  4286.482741767184
Expenditure:  571.3988328468971


In [184]:
print("Total profit: ", -1 * with_constraint.fun)
print("Total expenditure: ", with_constraint.constr[0])

Total profit:  17925.836524384948
Total expenditure:  [2999.9999961]
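
One way to sanity-check a constrained solution like the one above: at an interior optimum where the budget binds, the Lagrange condition implies that the marginal profit per marginal dollar of spend is the same for every keyword. Below is a rough sketch of that check. The alphas and betas are the Part A estimates, but the `values` array (LTV × conversion rate per keyword) is a set of illustrative numbers, not the homework data, so the bids it prints are not the Part C answers.

```python
import numpy as np
from scipy.optimize import NonlinearConstraint, minimize

# Part A estimates (alpha, beta) for the four keywords.
alphas = np.array([74.09086171091433, 156.4398020518242,
                   104.79929301898942, 188.1112794219063])
betas = np.array([0.03944902761835571, 0.15008283809726447,
                  0.07971659393008784, 0.432291894875094])
# ASSUMPTION: illustrative LTV * conversion rate values, not the homework data.
values = np.array([106.0, 58.0, 85.0, 32.0])
budget = 3000.0

def clicks(bids):
    return alphas * (1 - np.exp(-betas * bids))

def negative_total_profit(bids):
    # Negated because minimize() minimizes.
    return -np.sum(clicks(bids) * (values - bids))

def total_spend(bids):
    return np.sum(bids * clicks(bids))

result = minimize(
    negative_total_profit,
    x0=np.ones(4),
    method='trust-constr',
    bounds=[(0.0, v) for v in values],
    constraints=NonlinearConstraint(total_spend, 0.0, budget),
)
bids = result.x

# Analytic derivatives of each keyword's profit and spend with respect to its bid.
decay = np.exp(-betas * bids)
marginal_profit = alphas * (betas * decay * (values - bids) - (1 - decay))
marginal_spend = alphas * ((1 - decay) + bids * betas * decay)
ratios = marginal_profit / marginal_spend

print("bids:", bids)
print("total spend:", total_spend(bids))
print("marginal profit per marginal dollar:", ratios)
```

If the printed ratios differ noticeably across keywords, the solver has probably stopped early or a bid is sitting on one of its bounds.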


## Part D (Optional for Extra Credit): 

### Look back at the results that you got for Part A and Part B above. You should notice that across the four keywords, there is a relationship between LTV and alpha, a relationship between LTV and beta, and a relationship between LTV and the optimal bid. What are these relationships? What are the likely reasons for each relationship? Hand-in: Your identification of the nature of these relationships and your likely reasons. Please do not spend more than 10 minutes on this part. The relationship is easy to spot but the explanation is much less obvious. If one cannot propose the explanation in under 10 minutes, it is unlikely to happen by spending more time on this. This question is on marketing and consumer psychology rather than statistics. Hint: it has to do with consumer segments and the fact that these are generic, non-branded keywords.

Keywords with higher LTVs likely have higher alphas because they tend to attract customers with inherently higher conversion rates. These customers are more engaged and more interested in the offerings associated with these keywords.

Keywords with higher LTVs tend to have lower beta values because high-LTV customers are more likely to respond to the inherent appeal of the product or service than to the bid amount. Since beta governs how quickly clicks approach alpha as the bid rises, the increase in clicks with increasing bid (sensitivity) is less pronounced.

Keywords with higher LTVs generally have higher optimal bids because investing more in bidding for these keywords is profitable, as these customers bring more value over their lifetime. Higher bids can maximize exposure and conversions for these valuable customer segments, leading to higher overall profits.

Since these are generic, non-branded keywords, they target different consumer segments. High LTV customers might be looking for quality or specific features rather than being price-sensitive. Therefore, their conversion rates are less dependent on bid amounts, justifying higher optimal bids due to their long-term value.

## Part E: Using generative AI tools

## (a)
A. curve_fit
C. scipy.optimize.minimize with method='trust-constr': set up the bounds so the lower limit is zero, and pass a scipy.optimize.NonlinearConstraint object to scipy.optimize.minimize via the constraints argument

## (b)
ChatGPT

## (c)

In [None]:
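# A
# linear_model, x_data, and y_data are placeholders to be defined by the user.
from scipy.optimize import curve_fit

params, params_covariance = curve_fit(linear_model, x_data, y_data)

print("Fitted parameters:", params)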

In [None]:
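# C
# objective_function, initial_guess, bounds, and budget_constraint are placeholders.
from scipy import optimize

result = optimize.minimize(
    objective_function,
    initial_guess,
    method='trust-constr',
    bounds=bounds,
    constraints=[budget_constraint]
)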