# DellaVigna and Pope, 2018, "What Motivates Effort? Evidence and Expert Forecasts", Tables 5 and 6, NLS

#### Authors:  

- Massimiliano Pozzi (Bocconi University, pozzi.massimiliano@studbocconi.it)
- Salvatore Nunnari (Bocconi University, salvatore.nunnari@unibocconi.it)

#### Description:

The code in this Jupyter notebook replicates columns 2 and 4 in Panel A of Table 5; columns 3 and 6 in Panel B of Table 5; and Panel A of Table 6. The estimates in these panels and columns are derived with Non-Linear-Least-Squares.

This notebook was tested with the following package versions:
- Pozzi:   (Anaconda 4.10.3 on Windows 10 Pro) : python 3.8.3, numpy 1.18.5, pandas 1.0.5, sklearn 1.0
- Nunnari: (Anaconda 4.10.1 on macOS 10.15.7): python 3.8.10, numpy 1.20.2, pandas 1.2.4, scipy 1.6.2

In [2]:
# Import the necessary libraries

import numpy as np
import pandas as pd
from   scipy.stats import norm
import scipy.optimize as opt

## 1. Data Cleaning and Data Preparation

We import the dataset containing the number of button presses in each treatment, together with the piece-rate wage that participants received for completing the task. We then create the variables needed for estimation.

In [3]:
# import the dataset

dt = pd.read_stata('../input/mturk_clean_data_short.dta')

# Create new variables needed for estimation:

# Create piece-rate payoffs per 100 button presses (p)

dt['payoff_per_100'] = 0
dt.loc[dt.treatment == '1.1', 'payoff_per_100'] = 0.01
dt.loc[dt.treatment == '1.2', 'payoff_per_100'] = 0.1
dt.loc[dt.treatment == '1.3', 'payoff_per_100'] = 0.0
dt.loc[dt.treatment == '2'  , 'payoff_per_100'] = 0.001
dt.loc[dt.treatment == '1.4', 'payoff_per_100'] = 0.04
dt.loc[dt.treatment == '4.1', 'payoff_per_100'] = 0.01
dt.loc[dt.treatment == '4.2', 'payoff_per_100'] = 0.01
dt.loc[dt.treatment == '6.2', 'payoff_per_100'] = 0.02
dt.loc[dt.treatment == '6.1', 'payoff_per_100'] = 1

# (alpha/a) create payoff per 100 to charity and dummy charity

dt['payoff_charity_per_100'] = 0
dt.loc[dt.treatment == '3.1', 'payoff_charity_per_100'] = 0.01
dt.loc[dt.treatment == '3.2', 'payoff_charity_per_100'] = 0.1
dt['dummy_charity'] = 0
dt.loc[dt.treatment == '3.1', 'dummy_charity'] = 1
dt.loc[dt.treatment == '3.2', 'dummy_charity'] = 1

# (beta/delta) create the number of weeks by which payment is delayed (2 or 4) and dummy delay

dt['delay_wks'] = 0
dt.loc[dt.treatment == '4.1', 'delay_wks'] = 2
dt.loc[dt.treatment == '4.2', 'delay_wks'] = 4
dt['delay_dummy'] = 0
dt.loc[dt.treatment == '4.1', 'delay_dummy'] = 1
dt.loc[dt.treatment == '4.2', 'delay_dummy'] = 1

# probability weights to back out curvature and dummy

dt['prob'] = 1
dt.loc[dt.treatment == '6.2', 'prob'] = 0.5
dt.loc[dt.treatment == '6.1', 'prob'] = 0.01
dt['weight_dummy'] = 0
dt.loc[dt.treatment == '6.1', 'weight_dummy'] = 1

# dummy for gift exchange

dt['gift_dummy'] = 0
dt.loc[dt.treatment == '10', 'gift_dummy'] = 1

# Generate effort and log effort. The authors round buttonpresses to the nearest 100 and set the result to 25 if it is 0.

# Python rounds exact halves to the nearest even multiple (50 -> 0), while Stata rounds 50 up to 100. Adding a small value avoids this mismatch.
dt['buttonpresses'] = dt['buttonpresses'] + 0.1
dt['buttonpresses_nearest_100'] = round(dt['buttonpresses'],-2)
dt.loc[dt.buttonpresses_nearest_100 == 0, 'buttonpresses_nearest_100'] = 25
dt['logbuttonpresses_nearest_100']  = np.log(dt['buttonpresses_nearest_100'])
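
The rounding mismatch mentioned in the comments above can be seen on a few toy values. This is only a sketch with made-up numbers (not the experimental data); it takes the comment's statement that Stata rounds 50 up to 100 as given.

```python
import numpy as np

# Toy button-press counts; the first three sit exactly halfway between two multiples of 100
presses = np.array([50.0, 150.0, 250.0, 349.0])

# NumPy rounds exact halves to the nearest even multiple: 50 -> 0, 150 -> 200, 250 -> 200
plain = np.round(presses, -2)

# Adding a small offset before rounding pushes exact halves upward: 50 -> 100, 250 -> 300
shifted = np.round(presses + 0.1, -2)

print(plain)
print(shifted)
```

Values that are not exact halves (such as 349) are unaffected by the offset.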

## 2. Model and Estimation Technique (Section 2 in the Paper)

The model is one of costly effort: an agent chooses the optimal effort (here, the number of button presses in a 10-minute session) by trading off the disutility of effort against the consumption utility from the resulting payment. On top of this simple problem, the authors use 18 different treatments to examine the effects of standard monetary incentives, behavioral factors (e.g., social preferences and reference dependence) and non-monetary incentives. Here we briefly describe the benchmark model and how it is estimated by non-linear least squares.

The model for treatment 1.1, 1.2 and 1.3 can be written as follows:

$$ \max_{e\geq0} \;\; (s+p)e-c(e) $$

where e is the number of button presses, p is the piece rate that varies across treatments, s is a parameter that captures intrinsic motivation, and c(e) is a heterogeneous convex cost function, of either power or exponential form:

$$ c(e)=\frac{k e^{1+\gamma}}{1+\gamma}\exp(-\gamma \epsilon_j) \qquad \qquad c(e)=\frac{k\exp(\gamma e)}{\gamma}\exp(-\gamma \epsilon_j) $$

The variable &epsilon;<sub>j</sub> is normally distributed, &epsilon;<sub>j</sub>~N(0,&sigma;<sub>j</sub>), so that the additional noise term exp(-&gamma;&epsilon;<sub>j</sub>) has a lognormal distribution. The first order condition implied by the maximization problem after taking logs is the following:

$$ \log(e_j)=\frac{1}{\gamma}[\log(s+p)-\log(k)]+\epsilon_j \qquad \qquad e_j=\frac{1}{\gamma}[\log(s+p)-\log(k)]+\epsilon_j $$

where the first equation assumes a power cost function and the second equation assumes an exponential cost function. By using non-linear-least-squares, our goal is to minimize the sum of squared distances between the observed effort and the optimal effort computed above, namely:

$$ \min \sum_{j=1}^J(y_j-f(x_j,\theta))^2 $$

where j is a generic individual observation, y is the observed effort, and f(x,&theta;) is the function which computes the optimal effort (the first order condition) depending on the data and a set of parameters &theta;.
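
For completeness, the two first-order conditions stated earlier in this section can be derived in two steps. This is a sketch based only on the cost functions as written in this notebook. Setting the marginal benefit s+p equal to the marginal cost c'(e_j) gives, for the power and the exponential cost function respectively:

$$ s+p = k e_j^{\gamma}\exp(-\gamma \epsilon_j) \qquad \qquad s+p = k\exp(\gamma e_j)\exp(-\gamma \epsilon_j) $$

Taking logs of both sides:

$$ \log(s+p) = \log(k)+\gamma\log(e_j)-\gamma\epsilon_j \qquad \qquad \log(s+p) = \log(k)+\gamma e_j-\gamma\epsilon_j $$

Solving the first equation for log(e_j) and the second for e_j, and dividing through by &gamma;, gives the two estimating equations. This is why the dependent variable is log effort under the power cost function and the level of effort under the exponential cost function.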

## 3. Estimation

### Point Estimates and Standard Errors

We now compute the NLS estimates for Tables 5 and 6. Since there are many different specifications (5 columns for the power cost function and 5 for the exponential cost function), we write a separate function to compute f(x,&theta;) for each one instead of a single function with many if statements, so that each specification is easier to read.
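
Before turning to the actual data, the pattern used throughout this section can be tried on simulated data. The sketch below is not part of the replication: the parameter values, the piece rates and the name `f_toy` are made up for illustration, and no scalers are used because the toy parameters are of moderate size.

```python
import numpy as np
import scipy.optimize as opt

rng = np.random.default_rng(0)

# Hypothetical "true" parameters, chosen only so that the toy problem is well behaved
g_true, k_true, s_true = 0.5, 0.2, 0.05

# Simulated piece rates and effort from e = (1/g) * [log(s + p) - log(k)] + noise
p = np.repeat([0.0, 0.01, 0.04, 0.1, 0.5, 1.0], 100)
effort = (np.log(s_true + p) - np.log(k_true)) / g_true + rng.normal(0, 0.1, p.size)

def f_toy(p, g, k, s):
    # Guard the arguments of the logs so they stay strictly positive
    k_pos = max(k, 1e-12)
    inner = np.maximum(s + p, 1e-10)
    return (np.log(inner) - np.log(k_pos)) / g

start = [0.6, 0.3, 0.1]
popt, pcov = opt.curve_fit(f_toy, p, effort, start)

se = np.sqrt(np.diagonal(pcov))   # non-robust standard errors
sse_start = np.sum((effort - f_toy(p, *start)) ** 2)
sse_hat = np.sum((effort - f_toy(p, *popt)) ** 2)

print(popt, se, sse_start, sse_hat)
```

`curve_fit` returns the point estimates and their variance-covariance matrix, which is how the estimates and standard errors in the cells of this section are obtained.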

In [4]:
# Estimation procedure for s, k, gamma in the benchmark case with exp cost function

# Define the benchmark sample by creating dummies equal to one if in treatment 1.1, 1.2, 1.3 

dt['t1.1']= (dt['treatment']=='1.1').astype(int)
dt['t1.2']= (dt['treatment']=='1.2').astype(int)
dt['t1.3']= (dt['treatment']=='1.3').astype(int)
dt['dummy1']= dt['t1.1']+dt['t1.2']+dt['t1.3']

# Set the initial values for the optimization procedure and scalers for k and s in the exp cost function case

gamma_init_exp, k_init_exp, s_init_exp =  0.015645717, 1.69443, 3.69198
st_values_exp = [gamma_init_exp, k_init_exp, s_init_exp]
k_scaler_exp, s_scaler_exp = 1e+16, 1e+6
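
The scalers defined above let the optimizer work with values of k and s of moderate magnitude, while the model itself uses the tiny raw values. The sketch below uses illustrative numbers and the hypothetical names `f_raw` and `f_scaled`; it checks that dividing by the scaler inside f(x,θ) leaves the prediction unchanged and shows how a scaled estimate is converted back.

```python
import numpy as np

K_SCALER, S_SCALER = 1e16, 1e6   # same orders of magnitude as the exp-cost scalers

def f_raw(pay100, g, k, s):
    # Prediction written in terms of the raw (tiny) parameters k and s
    return (np.log(s + pay100) - np.log(k)) / g

def f_scaled(pay100, g, k_scaled, s_scaled):
    # Same prediction, but the optimizer would see k_scaled and s_scaled, which are O(1)
    return f_raw(pay100, g, k_scaled / K_SCALER, s_scaled / S_SCALER)

pay100 = np.array([0.0, 0.01, 0.1])
g, k_scaled, s_scaled = 0.0156, 1.7, 3.7   # illustrative values only

pred_scaled = f_scaled(pay100, g, k_scaled, s_scaled)
pred_raw = f_raw(pay100, g, k_scaled / K_SCALER, s_scaled / S_SCALER)

# Estimates (and standard errors) on the scaled parameters are divided by the scaler when reported
k_reported = k_scaled / K_SCALER
print(pred_scaled, k_reported)
```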

In [5]:
# Define the function that computes the optimal effort, what we called f(x,θ) above
# pay100 is the column we created containing the piece rate for different treatments
# g, k, s are the parameters to estimate (our θ vector). g stands for gamma.

def benchmarkExp(pay100, g, k, s):
    
    check1 = k/k_scaler_exp            # 'first'  component to compute f(x,θ). We call it check1 since it will enter a log, so we need to be careful with its value being > 0
    check2 = s/s_scaler_exp + pay100   # 'second' component to compute f(x,θ)
    
    f_x = (-1/g * np.log(check1) +1/g * np.log(check2))   # f(x,θ) written above
    
    return f_x

# Find the solution to the problem by non-linear least squares 

sol = opt.curve_fit(benchmarkExp,
                    dt.loc[dt['dummy1']==1].payoff_per_100,
                    dt.loc[dt['dummy1']==1].buttonpresses_nearest_100,
                    st_values_exp)

be54 = sol[0]                        # sol[0] is the array containing our estimates
se54 = np.sqrt(np.diagonal(sol[1]))  # sol[1] is a 3x3 variance-covariance matrix of our estimates

In [6]:
# Estimation procedure for s, k, gamma in the benchmark case with power cost function

gamma_init_power, k_init_power, s_init_power =  19.8117987, 1.66306e-10, 7.74996
st_values_power = [gamma_init_power, k_init_power, s_init_power]
k_scaler_power, s_scaler_power = 1e+57,1e+6

# Define f(x,θ) in the power case

def benchmarkPower(pay100, g, k, s):
    
    check1= max(k/k_scaler_power, 1e-115)                  # since check1 will enter log it must be greater than zero
    check2= np.maximum(s/s_scaler_power + pay100, 1e-10)   # np.maximum computes the max element wise. We do not want a negative value inside log
    
    f_x = (-1/g * np.log(check1) +1/g * np.log(check2))
    
    return f_x

# Find the solution to the problem by non-linear least squares. 
# We find some differences with respect to the results found by the authors in the case of the power cost function. Even by 
# changing the initial guesses or minimization algorithm we still end up with slightly different results.

sol = opt.curve_fit(benchmarkPower,
                    dt.loc[dt['dummy1']==1].payoff_per_100,
                    dt.loc[dt['dummy1']==1].logbuttonpresses_nearest_100,
                    st_values_power)
bp52 = sol[0]                       # sol[0] is the array containing our estimates
sp52 = np.sqrt(np.diagonal(sol[1])) # sol[1] is a 3x3 variance-covariance matrix of our estimates

In [7]:
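# We try a different routine to find the NLS estimates to see if we get closer to the authors', but without success.
# Note: opt.least_squares expects the vector of residuals and minimizes 0.5*sum(residuals**2) itself. The function below
# returns 0.5*residual**2 for each observation, so the solver effectively minimizes a sum of fourth powers of the residuals
# rather than the NLS criterion. This likely explains why its estimates differ from the others in the table below.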

def benchmarkPower_least_squares(params):
    
    pay100 = np.array(dt.loc[dt['dummy1']==1].payoff_per_100)
    logbuttonpresses = np.array(dt.loc[dt['dummy1']==1].logbuttonpresses_nearest_100)
    g, k, s = params
    
    check1= max(k/k_scaler_power, 1e-115)
    check2= np.maximum(s/s_scaler_power + pay100, 1e-10)   
    
    f_x = 0.5*((-1/g * np.log(check1) +1/g * np.log(check2))-logbuttonpresses)**2
    
    return f_x

sol_least_square = opt.least_squares(benchmarkPower_least_squares,
                        st_values_power,
                        xtol=1e-15,
                        ftol=1e-15,
                        gtol=1e-15,
                        method='lm')
bp52_least_square = sol_least_square.x  # sol.x is the array containing our estimates

# We tried minimizing the objective function also using a general framework and not a package specific for non-linear-least-square
# When using opt.minimize we need to use as input directly the function to minimize, in this case the sum of squared residuals

def benchmarkPower_opt(params):
    
    pay100 = np.array(dt.loc[dt['dummy1']==1].payoff_per_100)
    logbuttonpresses = np.array(dt.loc[dt['dummy1']==1].logbuttonpresses_nearest_100)
    g, k, s = params
    
    check1= max(k/k_scaler_power, 1e-115)
    check2= np.maximum(s/s_scaler_power + pay100, 1e-10)   
    
    f_x = np.sum(0.5*((-1/g * np.log(check1) +1/g * np.log(check2))-logbuttonpresses)**2)
    
    return f_x

sol_opt = opt.minimize(benchmarkPower_opt,
                       st_values_power,
                       method='Nelder-Mead',
                       options={'maxiter': 2500})
bp52_opt = sol_opt.x

# We create a table and show the results we obtained

from IPython.display import display
pn = ["Curvature γ of cost function","Level k of cost of effort", "Intrinsic motivation s","Min obj. function"]
bp52_aut = [20.546,5.12e-13,3.17]
r1 = pd.DataFrame({'parameters':pn,'curve_fit':np.round([*bp52,2*benchmarkPower_opt(bp52)],3),
                   'least_square':np.round([*bp52_least_square,2*benchmarkPower_opt(bp52_least_square)],3),
                   'minimize_nd':np.round([*bp52_opt,2*benchmarkPower_opt(bp52_opt)],3),
                   'authors':np.round([*bp52_aut,2*benchmarkPower_opt(bp52_aut)],3)})

We obtain estimates with different minimization algorithms implemented by different functions available in the scipy package. Note that the estimates for k and s are very small in absolute value: in the table below, the estimates of k must be divided by 1e+57 and the estimates of s by 1e+6. We also show the authors' estimates.

In [8]:
display(r1)

| | parameters | curve_fit | least_square | minimize_nd | authors |
|---|---|---|---|---|---|
| 0 | Curvature γ of cost function | 21.194 | 21.787 | 21.266 | 20.546 |
| 1 | Level k of cost of effort | 0.0 | 0.0 | 0.0 | 0.0 |
| 2 | Intrinsic motivation s | 1.377 | 0.096 | 1.331 | 3.17 |
| 3 | Min obj. function | 670.61 | 1112.339 | 670.61 | 672.387 |


As we can see, the least_squares results are the worst. Note, however, that the objective passed to opt.least_squares above returns squared residuals, whereas that routine expects raw residuals and squares them internally, so it is not minimizing the same criterion as the other two. The curve_fit function (which uses the Levenberg-Marquardt minimization algorithm) and the minimize function (which uses the Nelder-Mead minimization algorithm) return a similar value for the objective function but slightly different estimates for the parameters. Since, in this case, different minimization algorithms implemented in the same programming language (Python) result in different estimates and/or values of the objective function, it is not surprising that there are small discrepancies between our estimates and the authors' estimates (the authors use the Gauss-Newton minimization algorithm implemented in Stata). At the same time, the differences are small in absolute value (and limited to the NLS estimation method; there are no discrepancies when using GMM) and the estimated values of k and s are always statistically indistinguishable from 0. More importantly, the economic implications of the estimated parameters and the qualitative conclusions on what motivates effort in the experiment are unaffected by the choice of programming language and minimization algorithm. Below, we report the results we obtained with the curve_fit function since this also returns an estimate of the variance-covariance matrix of the parameters.
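
For reference, opt.least_squares is designed to receive a function that returns the vector of raw residuals; it then minimizes half the sum of their squares. The sketch below illustrates this usage on simulated data. It does not re-estimate the model on the experimental data, and all parameter values and names are made up for illustration.

```python
import numpy as np
import scipy.optimize as opt

rng = np.random.default_rng(1)

# Toy data from y = (1/g) * [log(s + x) - log(k)] + noise; all numbers are illustrative
g_true, k_true, s_true = 0.5, 0.2, 0.05
x = np.repeat([0.0, 0.01, 0.04, 0.1, 0.5, 1.0], 100)
y = (np.log(s_true + x) - np.log(k_true)) / g_true + rng.normal(0, 0.1, x.size)

def residuals(params):
    # Return the raw residual vector; least_squares squares and sums it internally
    g, k, s = params
    pred = (np.log(np.maximum(s + x, 1e-10)) - np.log(max(k, 1e-12))) / g
    return pred - y

start = [0.6, 0.3, 0.1]
sol = opt.least_squares(residuals, start, method='lm')

sse_start = np.sum(residuals(start) ** 2)
sse_hat = np.sum(residuals(sol.x) ** 2)

# sol.cost is defined as 0.5 * sum(residuals**2), so 2 * sol.cost is the sum of squared errors
print(sol.x, sse_hat, 2 * sol.cost)
```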

Next, we replicate Panel B of Table 5, where we estimate all parameters of interest without the weight on probability.

In [9]:
# Allnoweight Exp. Create dummies for this specification

dt['t3.1']= (dt['treatment']=='3.1').astype(int)
dt['t3.2']= (dt['treatment']=='3.2').astype(int)
dt['t4.1']= (dt['treatment']=='4.1').astype(int)
dt['t4.2']= (dt['treatment']=='4.2').astype(int)
dt['t10'] = (dt['treatment']=='10').astype(int)
dt['samplenw']= dt['dummy1']+dt['t3.1']+dt['t3.2']+dt['t4.1']+dt['t4.2']+dt['t10']

# Define the initial guesses for the exponential cost function case

alpha_init, a_init, beta_init, delta_init, gift_init = 0.003, 0.13, 1.16, 0.75, 5e-6
stvale_spec = [alpha_init, a_init, gift_init, beta_init, delta_init]

In [10]:
# Define the f(x,θ) to estimate all parameters but the probability weight in the exp case

# xdata is the vector containing the explanatory variables:

# gd is gift dummy
# dd is delay dummy
# dw is delay weeks
# paychar is pay in charity treatment
# dc is dummy charity

# parameters:

# g, k, s are the same parameters from before
# alpha is the pure altruism coefficient
# a is the warm glow coefficient
# gift is the gift exchange coefficient Δs
# beta is the present bias parameter
# delta is the (weekly) discount factor

def noweightExp(xdata, g, k, s, alpha, a, gift, beta, delta):
    
    pay100 = xdata[0]
    gd = xdata[1]
    dd = xdata[2]
    dw = xdata[3]
    paychar = xdata[4]
    dc = xdata[5]
    
    check1 = k/k_scaler_exp
    check2 = s/s_scaler_exp + gift*0.4*gd + (beta**dd)*(delta**dw)*pay100 + alpha*paychar +a*0.01*dc
    f_x = (-1/g * np.log(check1) + 1/g*np.log(check2))
    
    return f_x

# Find the solution to the problem by non-linear least squares 

st_valuesnoweight_exp = np.concatenate((st_values_exp,stvale_spec)) # starting values

args = [dt.loc[dt['samplenw']==1].payoff_per_100, dt.loc[dt['samplenw']==1].gift_dummy, dt.loc[dt['samplenw']==1].delay_dummy,
        dt.loc[dt['samplenw']==1].delay_wks, dt.loc[dt['samplenw']==1].payoff_charity_per_100, dt.loc[dt['samplenw']==1].dummy_charity]

sol = opt.curve_fit(noweightExp, 
                    args,
                    dt.loc[dt['samplenw']==1].buttonpresses_nearest_100,
                    st_valuesnoweight_exp)
be56 = sol[0]
se56 = np.sqrt(np.diagonal(sol[1]))

In [11]:
# Define the f(x,θ) to estimate all parameters but the probability weight in the power case

def noweightPower(xdata, g, k, s, alpha, a, gift, beta, delta):
    
    pay100 = xdata[0]
    gd = xdata[1]
    dd = xdata[2]
    dw = xdata[3]
    paychar = xdata[4]
    dc = xdata[5]
    
    check1= max(k/k_scaler_power, 1e-115)
    check2= np.maximum(s/s_scaler_power + gift*0.4*gd + (beta**dd)*(delta**dw)*pay100 + alpha*paychar + a*0.01*dc, 1e-10)  
    f_x = (-1/g * np.log(check1) + 1/g*np.log(check2))
    
    return f_x

# Find the solution to the problem by non-linear least squares 

st_valuesnoweight_power = np.concatenate((st_values_power,stvale_spec)) # starting values

sol = opt.curve_fit(noweightPower, 
                    args,
                    dt.loc[dt['samplenw']==1].logbuttonpresses_nearest_100,
                    st_valuesnoweight_power)
bp53 = sol[0] 
sp53 = np.sqrt(np.diagonal(sol[1])) 

In [12]:
# Create and save the dataframe for table 5 NLS estimates. We leave standard errors for all parameters instead of confidence intervals for some.
# Point estimates for power case do not coincide precisely as explained above. Standard errors do not coincide precisely because of 
# the differences in the point estimates and because we leave here non-robust standard errors provided by curve_fit. To see an implementation of 
# the formula for robust standard errors please refer to the python or julia notebooks for table_1 of augenblick-rabin or table_1 of bruhin-fehr-schunk.
# The formula is the same as in the cited notebooks without considering the clustering at the individual level.

from decimal import Decimal

params_name = ["Curvature γ of cost function", "Level k of cost of effort", "Intrinsic motivation s","Social preferences α",
                "Warm glow coefficient a","Gift exchange Δs", "Present bias β","(Weekly) discount factor δ"]

be5 = ['{0:.3}'.format(Decimal(be54[0])), '{0:.2e}'.format(Decimal(be54[1]/1e+16)), '{0:.2e}'.format(Decimal(be54[2]/1e+6)),
       round(be56[3],3), round(be56[4],3), '{0:.2e}'.format(Decimal(be56[5])), round(be56[6],2), round(be56[7],2)]
se5 = ['{0:.3}'.format(Decimal(se54[0])), '{0:.2e}'.format(Decimal(se54[1]/1e+16)), '{0:.2e}'.format(Decimal(se54[2]/1e+6)),
       round(se56[3],3), round(se56[4],3), '{0:.2e}'.format(Decimal(se56[5])), round(se56[6],2), round(se56[7],2)]

bp5 = ['{0:.5}'.format(Decimal(bp52[0])), '{0:.2e}'.format(Decimal(bp52[1]/1e+57)), '{0:.2e}'.format(Decimal(bp52[2]/1e+6)),
       round(bp53[3],4), round(bp53[4],4), '{0:.2e}'.format(Decimal(bp53[5])), round(bp53[6],4), round(bp53[7],4)]
sp5 = ['{0:.5}'.format(Decimal(sp52[0])), '{0:.2e}'.format(Decimal(sp52[1]/1e+57)), '{0:.2e}'.format(Decimal(sp52[2]/1e+6)),
       round(sp53[3],4), round(sp53[4],4), '{0:.2e}'.format(Decimal(sp53[5])), round(sp53[6],4), round(sp53[7],4)]

t5 = pd.DataFrame({'parameters':params_name,'power_est':bp5,'power_se':sp5,'exp_est':be5,'exp_se':se5})
t5.to_csv('../output/table5NLS_python.csv')

print('Table 5: non-linear-least-squares estimates of behavioural parameters')
display(t5)

Table 5: non-linear-least-squares estimates of behavioural parameters


| | parameters | power_est | power_se | exp_est | exp_se |
|---|---|---|---|---|---|
| 0 | Curvature γ of cost function | 21.194 | 7.399 | 0.0156 | 0.00415 |
| 1 | Level k of cost of effort | 5.95e-72 | 3.33e-70 | 1.71e-16 | 1.49e-15 |
| 2 | Intrinsic motivation s | 1.38e-06 | 4.93e-06 | 3.72e-06 | 9.16e-06 |
| 3 | Social preferences α | 0.0132 | 0.0295 | 0.004 | 0.011 |
| 4 | Warm glow coefficient a | 0.2628 | 0.2869 | 0.143 | 0.143 |
| 5 | Gift exchange Δs | 3.17e-05 | 8e-05 | 2.35e-05 | 4.82e-05 |
| 6 | Present bias β | 1.6123 | 2.055 | 1.24 | 1.3 |
| 7 | (Weekly) discount factor δ | 0.75 | 0.2923 | 0.75 | 0.24 |
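
The standard errors above are the non-robust ones returned by curve_fit. As a rough illustration of what a heteroskedasticity-robust alternative looks like, the sketch below computes an HC0 sandwich estimator on simulated data. The choice of HC0 is an assumption made here and may differ from the formula used in the notebooks cited in the comments; the model and numbers are made up and unrelated to the experiment.

```python
import numpy as np
import scipy.optimize as opt

rng = np.random.default_rng(2)

# Toy heteroskedastic data; the model and numbers are illustrative only
x = np.linspace(0.1, 2.0, 300)
y = 1.5 * np.exp(-0.8 * x) + rng.normal(0, 0.05 * (1 + x), x.size)

def residuals(theta):
    a, b = theta
    return a * np.exp(-b * x) - y

sol = opt.least_squares(residuals, [1.0, 1.0])

J = sol.jac            # Jacobian of the residuals at the solution (n x 2)
e = sol.fun            # residuals at the solution
n, k = J.shape

bread = np.linalg.inv(J.T @ J)
meat = J.T @ (J * (e ** 2)[:, None])        # J' diag(e^2) J

vcov_robust = bread @ meat @ bread          # HC0 sandwich estimator (assumed variant)
vcov_classic = bread * (e @ e) / (n - k)    # homoskedastic formula

se_robust = np.sqrt(np.diagonal(vcov_robust))
se_classic = np.sqrt(np.diagonal(vcov_classic))
print(se_robust, se_classic)
```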


In [13]:
# Check for possible mistakes in the power case since authors' estimates are different from ours.
# We compare the sum of squared errors using our estimates and the authors'.
# By running the "1_NLS_main.do" do-file provided in the replication code they obtain an sse = 1542.141

# define the function that computes the sse

def noweight_sse(xdata, g, k, s, alpha, a, gift, beta, delta):

    pay100 = xdata[0]
    gd = xdata[1]
    dd = xdata[2]
    dw = xdata[3]
    paychar = xdata[4]
    dc = xdata[5]
    
    check1= max(k/k_scaler_power, 1e-115)
    check2= np.maximum(s/s_scaler_power + gift*0.4*gd + (beta**dd)*(delta**dw)*pay100 + alpha*paychar + a*0.01*dc, 1e-10)  
    f_x = (-1/g * np.log(check1) + 1/g*np.log(check2))
    sse = np.sum((f_x-dt.loc[dt['samplenw']==1].logbuttonpresses_nearest_100)**2) 
    
    return sse

nwest_aut = [20.51815, 5.17e-13, 3.26977, 0.0064462, 0.1818249, 0.0000204, 1.357934, 0.7494928] # authors' estimates
sse_our = round(noweight_sse(args,*bp53),3)
sse_aut = round(noweight_sse(args,*nwest_aut),3)

print('The sum of squared errors using our estimates is: ' + str(sse_our))
print("The sum of squared errors using the authors' estimates is: " + str(sse_aut))
print('The small difference between the Stata sse and the sse computed by us is most likely due to rounding.')

The sum of squared errors using our estimates is: 1539.446
The sum of squared errors using the authors' estimates is: 1543.057
The small difference between the Stata sse and the sse computed by us is most likely due to rounding.


Finally, we replicate the estimates from Panel A in Table 6.

In [14]:
# Create the sample used for Table 6 panel A
    
dt['t6.1']= (dt['treatment']=='6.1').astype(int)
dt['t6.2']= (dt['treatment']=='6.2').astype(int)
dt['samplepr']= dt['dummy1']+dt['t6.1']+dt['t6.2']
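
To see what identifies the probability weighting coefficient in this sample, it helps to compute the monetary term that enters the log for each treatment. The sketch below uses the codings from the data preparation cell; the helper name `effective_piece_rate` and the value 0.25 for the weight are hypothetical.

```python
import numpy as np

def effective_piece_rate(pay100, prob, wd, p_weight, curv=1.0):
    # Term that enters the log in the probability-weighting specifications:
    # p_weight**wd * prob * pay100**curv (intrinsic motivation s is added separately)
    return p_weight ** wd * prob * pay100 ** curv

# (pay100, prob, weight_dummy) for treatments 1.1, 6.2 and 6.1, as coded in the data preparation cell
treatments = {'1.1': (0.01, 1.0, 0), '6.2': (0.02, 0.5, 0), '6.1': (1.0, 0.01, 1)}

# With p_weight = 1 (no probability weighting) the three incentives coincide at 0.01
no_weighting = {t: effective_piece_rate(*v, p_weight=1.0) for t, v in treatments.items()}

# A hypothetical p_weight of 0.25 scales down only the 1% treatment
underweighting = {t: effective_piece_rate(*v, p_weight=0.25) for t, v in treatments.items()}

print(no_weighting)
print(underweighting)
```

With a linear value function, the three treatments offer the same expected piece rate, so any difference in effort between treatment 6.1 and the others is attributed to the weight placed on the 1% probability.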

In [15]:
# Define f(x,θ) for the exponential cost function. Here we assume curvature of utility over piece rate = 1, (Column 4)
# wd is the weight_dummy
# prob is the probability of receiving the piece rate (the 'prob' column)
# g, k and s are the same parameters as before
# p_weight is the probability weighting coefficient under the assumption of linear value function in this case 
# curv is the curvature of the value function. Here curv = 1

def probweight4Exp(xdata, g, k, s, p_weight):
    
    pay100 = xdata[0]
    wd = xdata[1]
    prob = xdata[2]
    
    check1 = k/k_scaler_exp
    check2 = s/s_scaler_exp + p_weight**wd*prob*pay100
    
    f_x = (-1/g * np.log(check1) + 1/g*np.log(check2))
    
    return f_x

# Find the solutions for column 4 exponential cost function

prob_weight_init = [0.2]
st_valuesprobweight_exp = np.concatenate((st_values_exp,prob_weight_init))
args = [dt.loc[dt['samplepr']==1].payoff_per_100, dt.loc[dt['samplepr']==1].weight_dummy, dt.loc[dt['samplepr']==1].prob]

sol = opt.curve_fit(probweight4Exp,
                    args,
                    dt.loc[dt['samplepr']==1].buttonpresses_nearest_100,
                    st_valuesprobweight_exp)
be64 = sol[0] 
se64 = np.sqrt(np.diagonal(sol[1])) 

# Define f(x,θ). Here we assume curvature of utility over piece rate = 0.88, Column (5)

def probweight5Exp(xdata, g, k, s, p_weight):
    
    pay100 = xdata[0]
    wd = xdata[1]
    prob = xdata[2]
    
    check1=k/k_scaler_exp
    check2=s/s_scaler_exp + p_weight**wd*prob*pay100**0.88
    f_x = (-1/g * np.log(check1) + 1/g*np.log(check2))
    
    return f_x

# Find the solutions for column 5 exponential cost function

sol = opt.curve_fit(probweight5Exp,
                    args,
                    dt.loc[dt['samplepr']==1].buttonpresses_nearest_100,
                    st_valuesprobweight_exp)
be65 = sol[0]
se65 = np.sqrt(np.diagonal(sol[1])) 

# Define f(x,θ). Here we also estimate the curvature of utility over piece rate, Column (6)

def probweight6Exp(xdata, g, k, s, p_weight, curv):
    
    pay100 = xdata[0]
    wd = xdata[1]
    prob = xdata[2]
    
    check1=k/k_scaler_exp
    check2=s/s_scaler_exp + p_weight**wd*prob*pay100**curv
    
    f_x = (-1/g * np.log(check1) + 1/g*np.log(check2))
    
    return f_x

# Find the solutions for column 6 exponential cost function

curv_init = [0.5]
st_valuesprobweight6_exp = np.concatenate((st_valuesprobweight_exp,curv_init))

sol = opt.curve_fit(probweight6Exp,
                    args,
                    dt.loc[dt['samplepr']==1].buttonpresses_nearest_100,
                    st_valuesprobweight6_exp)
be66 = sol[0]
se66 = np.sqrt(np.diagonal(sol[1])) 

In [16]:
# We do the same for the power cost function specification

# column 4

def probweight4Power(xdata, g, k, s, p_weight):
    
    pay100 = xdata[0]
    wd = xdata[1]
    prob = xdata[2]
    
    check1 = max(k/k_scaler_power, 1e-115)
    check2 = np.maximum(s/s_scaler_power + p_weight**wd*prob*pay100, 1e-10)
    f_x = (-1/g * np.log(check1) + 1/g*np.log(check2))
    
    return f_x

st_valuesprobweight_power = np.concatenate((st_values_power,prob_weight_init))

sol = opt.curve_fit(probweight4Power,
                    args,
                    dt.loc[dt['samplepr']==1].logbuttonpresses_nearest_100,
                    st_valuesprobweight_power)
bp61 = sol[0]
sp61 = np.sqrt(np.diagonal(sol[1]))

# column 5

def probweight5Power(xdata, g, k, s, p_weight):
    
    pay100 = xdata[0]
    wd = xdata[1]
    prob = xdata[2]
    
    check1 = max(k/k_scaler_power, 1e-115)
    check2 = np.maximum(s/s_scaler_power+p_weight**wd*prob*pay100**0.88, 1e-10)
    f_x = (-1/g * np.log(check1) + 1/g*np.log(check2))
    
    return f_x

sol = opt.curve_fit(probweight5Power,
                    args,
                    dt.loc[dt['samplepr']==1].logbuttonpresses_nearest_100,
                    st_valuesprobweight_power)
bp62 = sol[0]
sp62 = np.sqrt(np.diagonal(sol[1])) 

# column 6

def probweight6Power(xdata, g, k, s, p_weight, curv):
    
    pay100 = xdata[0]
    wd = xdata[1]
    prob = xdata[2]
    
    check1 = max(k/k_scaler_power, 1e-115)
    check2 = np.maximum(s/s_scaler_power+p_weight**wd*prob*pay100**curv, 1e-10)
    f_x = (-1/g * np.log(check1) + 1/g*np.log(check2))
    
    return f_x

st_valuesprobweight6_power = np.concatenate((st_valuesprobweight_power,curv_init))

sol = opt.curve_fit(probweight6Power,
                    args,
                    dt.loc[dt['samplepr']==1].logbuttonpresses_nearest_100,
                    st_valuesprobweight6_power)
bp63 = sol[0]
sp63 = np.sqrt(np.diagonal(sol[1]))

In [17]:
# Create the dataframe relative to table 6 and save it as a csv file

# To create arrays of the same length
bp61 = np.append(bp61,1)
sp61 = np.append(sp61,0)
bp62 = np.append(bp62,0.88)
sp62 = np.append(sp62,0)
be64 = np.append(be64,1)
se64 = np.append(se64,0)
be65 = np.append(be65,0.88)
se65 = np.append(se65,0)

pnames = ["Curvature γ of cost function", "Level k of cost of effort", "Intrinsic motivation s", "Probability weighting π (1%) (in %)",
          "Curvature of utility over piece rate"]

t6 = pd.DataFrame({'parameters':pnames,'p_est1':bp61,'p_se1':sp61,'p_est2':bp62,'p_se2':sp62,
                   'p_est3':bp63,'p_se3':sp63,
                   'e_est4':be64,'e_se4':se64,'e_est5':be65,'e_se5':se65,'e_est6':be66,
                   'e_se6':se66})

t6.to_csv('../output/table6_python.csv', index=False)

In [18]:
# Print table 6

# Formatting nicely the results:

columns = [bp61, sp61, bp62, sp62, bp63, sp63]
vs = []
for col in columns:
    col = [round(col[0],2), '{0:.2e}'.format(Decimal(col[1]/1e+57)), '{0:.2e}'.format(Decimal(col[2]/1e+6)),
           round(col[3],2), round(col[4],2)]
    vs.append(col)

columns = [be64, se64, be65, se65, be66, se66]
for col in columns:
    col = [round(col[0],4), '{0:.2e}'.format(Decimal(col[1]/1e+16)), '{0:.2e}'.format(Decimal(col[2]/1e+6)),
           round(col[3],2), round(col[4],2)]
    vs.append(col)
    
t6 = pd.DataFrame({'parameters':pnames,'p_est1':vs[0],'p_se1':vs[1],'p_est2':vs[2],'p_se2':vs[3],'p_est3':vs[4],'p_se3':vs[5],
                   'e_est4':vs[6],'e_se4':vs[7],'e_est5':vs[8],'e_se5':vs[9],'e_est6':vs[10], 'e_se6':vs[11]})

# There are some differences in the standard errors since we leave here non robust standard errors provided by curve_fit.
# Point estimates for the power cost function are again a little different from the authors', while they are the same for the 
# exponential cost function

print('Table 6: Estimate of model on effort in three benchmark treatments and two probability treatments')
display(t6)
print('Nr. of observations: ' + str('{0:,}'.format(len(dt.loc[dt['samplepr']==1].logbuttonpresses_nearest_100))))

Table 6: Estimate of model on effort in three benchmark treatments and two probability treatments


| | parameters | p_est1 | p_se1 | p_est2 | p_se2 | p_est3 | p_se3 | e_est4 | e_se4 | e_est5 | e_se5 | e_est6 | e_se6 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Curvature γ of cost function | 20.95 | 5.78 | 18.96 | 5.27 | 19.64 | 17.32 | 0.0134 | 0.0026 | 0.0119 | 0.0023 | 0.0072 | 0.0029 |
| 1 | Level k of cost of effort | 3.89e-71 | 1.7e-69 | 1.96e-64 | 7.68e-63 | 1.01e-66 | 1.36e-64 | 2.42e-14 | 1.29e-13 | 7.5e-13 | 3.56e-12 | 5.46e-08 | 3.7e-07 |
| 2 | Intrinsic motivation s | 1.57e-06 | 4.23e-06 | 5.96e-06 | 1.47e-05 | 3.75e-06 | 4.17e-05 | 1.64e-05 | 2.4e-05 | 5.55e-05 | 7.2e-05 | 0.00314 | 0.0075 |
| 3 | Probability weighting π (1%) (in %) | 0.19 | 0.17 | 0.37 | 0.3 | 0.3 | 1.57 | 0.24 | 0.14 | 0.47 | 0.25 | 4.3 | 5.46 |
| 4 | Curvature of utility over piece rate | 1.0 | 0.0 | 0.88 | 0.0 | 0.92 | 0.93 | 1.0 | 0.0 | 0.88 | 0.0 | 0.47 | 0.24 |


Nr. of observations: 2,787
