
Stata vs. Matlab Code #8

Open
pmgblz opened this issue Apr 2, 2019 · 1 comment
pmgblz commented Apr 2, 2019

I have a question regarding the Stata vs. the Matlab code. The two different programs seem to give different results for me. I am using your example code and example data and the same variables for both commands (the difference in means is the same). Discrepancies also seem to persist when using a larger number of bootstrap iterations in both programs (e.g. 10,000).

To make this reproducible, the code I ran for your first example is below.

Here is the code I ran in Stata (from mhtexp_examples.do):

```stata
clear all
insheet using data.csv, comma names
// Creating outcome variable
gen amountmat = amount * (1 + ratio)
gen groupid = (redcty==1 & red0==1) + (redcty==0 & red0==1)*2 + (redcty==0 & red0==0)*3 + (redcty==1 & red0==0)*4
replace groupid = . if groupid == 0

mata: mata mlib index

// help mhtexp

// Example 1: Hypothesis testing with multiple outcomes:
//   We consider four outcome variables: response rate, dollars given not including
//   match, dollars given including match, and amount change.
mhtexp gave amount amountmat amountchange, treatment(treatment)
// results are the same when explicitly using bootstrap(3000), as it is the default
mhtexp gave amount amountmat amountchange, treatment(treatment) bootstrap(3000)
```

This is the code I ran in Matlab (from mhtexp_examples.m):

```matlab
data = importdata('data.csv');
data = data.data;               % read the dataset
B = 3000;                       % the number of simulated samples

%% Hypothesis testing with multiple outcomes:
% We consider four outcome variables: response rate, dollars given not including
% match, dollars given including match, and amount change.
amountmat = data(:,1).*(data(:,10)+ones(size(data,1),1));  % dollars raised per letter including match
Y = [data(:,[12,1]) amountmat data(:,35)];     % the matrix of outcomes
D = data(:,8);                                 % the vector of treatment statuses
sub = ones(size(D,1),1);                       % the subgroup IDs
numoc = size(Y,2);                             % the number of outcomes
numsub = size(unique(sub),1);                  % the number of subgroups
numg = size(unique(D),1)-1;                    % the number of treatment groups (not including the control group)
combo = [zeros(numg,1) (1:numg)'];             % We compare each treatment to the control.
numpc = size(combo,1);                         % the number of pairs of treatment (control) groups of interest
select = ones(numoc,numsub,numpc);             % We are interested in all numoc*numsub*numpc hypotheses.
[example1] = mhtexp(B,Y,sub,D,combo,select)
```

I have also attached screenshots of the results I am getting. Could you please clarify?

Thank you!

Stata Output:
[screenshot: stata_results]

Matlab Output:
[screenshot: matlab_results]

seidelj (Owner) commented Apr 2, 2019

Hi, I wrote this code for Yang Xu, who is the corresponding author on the paper detailing this method. Others have raised similar questions. Here is quoted text from Yang on the issue:

It seems like even if you set the same seed, Stata and MATLAB have different initial states. So it would be a very involved process to generate the same simulated samples. The real discrepancy lies in Table 4. I think the 4th and the 6th hypotheses have almost the same p-value for single testing. So a little bit of discrepancy between the single p-values due to different “seeds” may lead to different orderings in multiple testing. But this issue will only show up when the single testing p-values are very large, so that it won’t qualitatively change our results.
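Yang's point can be sketched in Python (neither Stata nor MATLAB can share an RNG state with the other program): with the same seed value but different generator algorithms, two bootstrap runs on identical data yield slightly different single-test p-values, and near-ties can therefore order differently in a step-down multiple-testing procedure. Everything below is a hypothetical illustration — the data, seeds, and the simple pooled bootstrap are stand-ins, not the paper's data or the mhtexp implementation.

```python
import numpy as np

def boot_pvalue(x, y, B, rng):
    # Two-sided bootstrap p-value for a difference in means,
    # resampling from the pooled data under the null of no difference.
    obs = abs(x.mean() - y.mean())
    pooled = np.concatenate([x, y])
    exceed = 0
    for _ in range(B):
        s = rng.choice(pooled, size=pooled.size, replace=True)
        if abs(s[:x.size].mean() - s[x.size:].mean()) >= obs:
            exceed += 1
    return exceed / B

# Fixed hypothetical dataset: 200 "treatment" and 200 "control" outcomes.
data_rng = np.random.default_rng(0)
x = data_rng.normal(0.2, 1.0, 200)
y = data_rng.normal(0.0, 1.0, 200)

# Same seed value (42), two different generator algorithms -> different
# draw streams, analogous to Stata and MATLAB not sharing an RNG state.
p_pcg = boot_pvalue(x, y, 3000, np.random.Generator(np.random.PCG64(42)))
p_mt  = boot_pvalue(x, y, 3000, np.random.Generator(np.random.MT19937(42)))
print(p_pcg, p_mt)   # close, but typically not identical
```

Both runs estimate the same true bootstrap p-value, so the gap between them is only Monte Carlo noise — yet when two hypotheses have single-test p-values within that noise band, their relative ranking (and hence the step-down adjusted p-values) can flip between programs, exactly as described above.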
