new command - ieddtable - neat tables for diff-in-diff regressions #135

kbjarkefur · 2018-04-26T15:52:31Z

This new command was suggested by Esteban J. Quiñones (@estebanjq).

There are many commands (estout, outreg, etc.) that can create neat tables from regressions. This command will target the diff-in-diff (or double difference) regression model only and create tables tailored for exactly this model specification.

We want the table to have the format shown in the mock-up below:

The command is expected to be specified in the following format:

ieddtable varlist, dummies(D T DT)

Where varlist is a list of outcome variables, D is the treatment dummy, T is the time dummy and DT is the interaction of the two. The command will test that the regression is valid in the sense that there are at least some observations in each group, that D * T actually equals DT, and so forth.
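
The validity checks described above can be sketched as follows. This is only an illustration of the logic in Python, not the Stata implementation; the function name `check_dd_dummies` and the input layout (one `(D, T, DT)` tuple per observation) are assumptions made for the example.

```python
def check_dd_dummies(rows):
    """Validate diff-in-diff dummies.

    rows: iterable of (d, t, dt) tuples, one per observation (hypothetical layout).
    Returns the number of observations in each of the four (D, T) groups.
    """
    counts = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 0}
    for d, t, dt in rows:
        # Both dummies must be binary.
        if d not in (0, 1) or t not in (0, 1):
            raise ValueError(f"dummies must be 0 or 1, got D={d}, T={t}")
        # The interaction must equal the product of the two dummies.
        if dt != d * t:
            raise ValueError(f"DT={dt} does not equal D*T={d * t}")
        counts[(d, t)] += 1
    # Every group needs at least one observation.
    empty = [group for group, n in counts.items() if n == 0]
    if empty:
        raise ValueError(f"no observations in group(s) {empty}")
    return counts
```

A call such as `check_dd_dummies([(0, 0, 0), (0, 1, 0), (1, 0, 0), (1, 1, 1)])` returns one observation per group, while a row like `(1, 1, 0)` raises an error.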

Only the last column, showing the double difference, will be taken from the diff-in-diff regression. The other four means will be calculated separately as regular group means. The reason for this is that we want to allow the user to include fixed effects, control variables etc., and when those are used the intercept and the coefficients on D, T and D*T can no longer be used by themselves to calculate the means in the four groups. We do not want the means in the four leftmost columns of the mock-up to be affected by fixed effects and control variables, as that can create odd values such as a negative harvest. When control variables or fixed effects are used, a note will be included at the bottom of the table explaining why the value in the fifth column cannot be calculated from the first four.
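
To illustrate the point about the fifth column: in a regression with only the intercept, D, T and D*T, the double-difference coefficient equals the difference of the endline-minus-baseline changes in the raw group means. The Python sketch below computes that quantity directly from the four means. The function names and the data are made up for the example. Once controls or fixed effects are added, the regression coefficient generally no longer matches this number, which is why the note is needed.

```python
from statistics import mean


def group_means(obs):
    """obs: list of (y, d, t) tuples. Returns {(d, t): mean of y in that group}."""
    groups = {}
    for y, d, t in obs:
        groups.setdefault((d, t), []).append(y)
    return {group: mean(ys) for group, ys in groups.items()}


def dd_from_means(m):
    """Double difference: (treatment change) minus (control change)."""
    return (m[(1, 1)] - m[(1, 0)]) - (m[(0, 1)] - m[(0, 0)])


# Hypothetical harvest data: (outcome, D, T)
obs = [
    (10, 0, 0), (12, 0, 0),   # control, baseline   -> mean 11
    (13, 0, 1), (15, 0, 1),   # control, endline    -> mean 14
    (11, 1, 0), (13, 1, 0),   # treatment, baseline -> mean 12
    (18, 1, 1), (20, 1, 1),   # treatment, endline  -> mean 19
]
means = group_means(obs)
dd = dd_from_means(means)  # (19 - 12) - (14 - 11) = 4
```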

The command should also be able to display the number of observations and the variation for each group and each outcome variable, in addition to the mean shown in the mock-up. We do not know yet what the default should be. It should also be possible to display the number of observations for each group at the bottom of the table, in which case the command will test that the number is the same for that group across all outcome variables.

The table will be possible to export in CSV or in TeX.
The default labels in the table will be those in the mock-up, but all of them should be possible to specify manually.
The variable labels for the outcome variables should be possible to set to varname, var label or to be specified manually.
We have not decided yet if we want stars in a separate column.
Star intervals should be possible to set manually.

This is just a first draft of the specifications for this command. Please comment below if you have any additional options you want us to include.

bbdaniels · 2018-05-02T21:46:01Z

For first differences I have previously written a command with a similar reporting layout – you can see it at https://github.com/worldbank/stata/tree/development/dev/Statistics/RandomTrialRegression and the corresponding formatted Table 1 of http://science.sciencemag.org/content/354/6308/aaf7384/tab-figures-data

kbjarkefur · 2018-05-03T09:29:21Z

That's a really cool command. Can you post a picture of Table 1 from the Science link here in the thread? It requires a login to view (it might log in automatically when browsing from a WB IP).

It is in many ways similar to what we want to do, but I think we should write our own for the following reasons (this is not a list against your command, it is just my reflections when comparing your implementation to the one I had envisioned for ieddtable, which I wanted to document somewhere):

We want something that outputs to LaTeX and Excel as well as to Stata's results window. Your command would need some work to write to more than just Excel.
The way you write to Excel requires putexcel. That would require us to raise the lowest version of Stata needed for ietoolkit, which we do not yet intend to do. (Everyone in well-funded institutions has newer versions of Stata, but that is not the only audience we are targeting.)
We want to test something on this command that we intend to use for a rewrite of iebaltab. That rewrite would make the section where stats are generated agnostic to the output type. That is, that section only creates a matrix with all output values, and then separate sections for the different output types (Excel, LaTeX etc.) read that matrix. The code for iebaltab is starting to get very difficult to follow, as we write the output in between the code that generates the stats.
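
The output-agnostic design in the last point can be sketched roughly as below. This is a Python illustration of the idea only, not code from ietoolkit: the stats step produces one plain table of numbers, and each output format has its own small function that reads it. All names and column labels here are hypothetical.

```python
import csv
import io

# Hypothetical column labels; the real defaults would come from the mock-up.
HEADER = ["Outcome", "C base", "C end", "T base", "T end", "DD"]


def to_csv(header, rows):
    """Render the stats table as CSV text. rows: list of (name, [values])."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    for name, values in rows:
        writer.writerow([name] + [f"{v:.2f}" for v in values])
    return buf.getvalue()


def to_latex(header, rows):
    """Render the same stats table as a minimal LaTeX tabular."""
    lines = ["\\begin{tabular}{l" + "c" * (len(header) - 1) + "}"]
    lines.append(" & ".join(header) + " \\\\")
    lines.append("\\hline")
    for name, values in rows:
        cells = [name] + [f"{v:.2f}" for v in values]
        lines.append(" & ".join(cells) + " \\\\")
    lines.append("\\end{tabular}")
    return "\n".join(lines)


# The stats section would fill this once; both writers read it unchanged.
stats = [("harvest", [11.0, 14.0, 12.0, 19.0, 4.0])]
csv_text = to_csv(HEADER, stats)
tex_text = to_latex(HEADER, stats)
```

Adding another output format then means adding one more function that reads `stats`, without touching the code that computes it.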

@luizaandrade , let me know what you think!

We will let you know if we intend to borrow something from your code.

kbjarkefur · 2018-05-03T15:32:39Z

In commit f26dd5c I have made a quick but documented draft of what I meant by the stats section being agnostic to the output format: it creates a matrix of all stats that can then be passed to a sub-command that creates the outputs.

bbdaniels · 2018-05-03T15:35:35Z

Totally agree with all of the above! The reason I did this one using putexcel is that I wanted to write confidence intervals and CIs with ( ) so it couldn't go in a matrix. I have since decided that it is a terrible idea especially since putexcel has major backwards compatibility issues even between Stata 13 and 14.

You may also be interested to look at the regression output handling commands I wrote recently for working with CSV tables in TeX if you haven't already (mat2csv and reg2csv here). These leave all the line styling out currently but have the useful convention of building two underlying matrices: results and results_STARS, which can be sensibly looped over to add non-numeric characters to a table like this before exporting to CSV.
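
The two-matrix convention could look something like the Python sketch below. It is not the mat2csv or reg2csv code. It assumes that the second matrix holds a star count per cell, derived here from p-values with user-settable cutoffs, and the function names are invented for the example.

```python
def stars_matrix(pvalues, cutoffs=(0.1, 0.05, 0.01)):
    """Number of stars per cell: one for each cutoff the p-value falls below.

    Cells with no p-value (None) get zero stars.
    """
    return [
        [0 if p is None else sum(p < c for c in cutoffs) for p in row]
        for row in pvalues
    ]


def format_cells(results, stars, digits=3):
    """Combine the numeric matrix and the star matrix into printable strings."""
    return [
        [f"{value:.{digits}f}" + "*" * n for value, n in zip(res_row, star_row)]
        for res_row, star_row in zip(results, stars)
    ]


results = [[1.5, 0.2], [3.25, -0.75]]
pvalues = [[0.03, 0.5], [0.001, None]]
cells = format_cells(results, stars_matrix(pvalues))
```

Keeping the numbers and the stars in separate matrices means the non-numeric characters are only attached at the very last step, right before export.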

estebanjq · 2018-05-03T16:35:41Z

Wow @bbdaniels , rctreg looks like a great command, thanks for sharing it. Hopefully, ietoolkit can further generalize it across input and output formats, as well as providing additional flexibility.

The option of being able to present SEs or CIs in an appropriate format would certainly be appreciated.

One thing I mentioned in a previous (off thread) conversation with @luizaandrade and @kbjarkefur is that it would be great if a single command could handle and present the relevant information for single differences, single differences controlling for group means at baseline (i.e., ancova), and difference in differences (aka, double difference).

Looking forward to seeing the fruits of this labor!

kbjarkefur · 2018-05-03T16:52:30Z

Showing the first difference instead of simple means for all groups was also the main feedback when we showed this to some of the economists at our unit. So that will definitely be included, either as the default or as an option; we have not decided yet which.

- The first differences are here calculated properly
- The N are included
- SE also for the mean instead of SD
- All stats are restricted to the same sample

luizaandrade · 2018-05-07T18:31:44Z

I've presented the idea for this command in our lightning seminar, here's some of the feedback:

When there's attrition, we should only include complete observations, i.e., those in the double difference regression, in the table. I think we can also add an option to include all observations, as long as the complete observations are the default.
It was suggested that it would be more intuitive to display the baseline levels, the baseline to endline change and then the double difference. That would look something like the figure below. The argument for this is that it may be confusing for a less technical audience if the dd coefficient is not the difference of the means displayed. This could either be an option or the default, and we would probably need to give some thought as to whether we want the two main columns to be the rounds or the treatment arms (i.e. Control and Treatment with subcolumns Baseline and Endline, or Baseline and Endline with subcolumns for Control and Treatment).
People like both the single difference and the ANCOVA options. Single difference would be something like the image below, and ANCOVA would be similar to diff-in-diff, but with a different title for the regression coefficient in the last column.
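
The default suggested in the first point, restricting every statistic to the observations used in the double-difference regression, could be sketched as follows. This Python illustration assumes that "complete" means no missing value in any variable entering the regression; the function name and the `complete_only` switch are hypothetical.

```python
def regression_sample(rows, variables, complete_only=True):
    """Return the observations the table should be based on.

    rows: list of dicts, one per observation (missing values are None).
    variables: names of every variable used in the regression.
    complete_only: if True (the default), drop rows with any missing value.
    """
    if not complete_only:
        return list(rows)
    return [r for r in rows if all(r.get(v) is not None for v in variables)]


rows = [
    {"y": 10.0, "D": 0, "T": 0},
    {"y": None, "D": 1, "T": 1},   # outcome missing, e.g. due to attrition
    {"y": 18.0, "D": 1, "T": 1},
]
sample = regression_sample(rows, ["y", "D", "T"])
```

Computing the group means and the regression on the same `sample` keeps all columns of the table based on the same observations.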

estebanjq · 2018-05-08T20:32:04Z

Sounds good @luizaandrade . I find the means more informative and intuitive than showing the differences, but I can imagine how others would feel differently. It is fair to say that the options to show either (or both, i.e. means followed by the differences) may be quite useful regardless of the default that is chosen.

kbjarkefur · 2018-10-19T22:02:01Z

This command was merged to the development branch in merge #159. We will finalize this version of ietoolkit and submit to SSC.

kbjarkefur · 2018-10-19T22:02:51Z

@estebanjq , thanks again for suggesting this command!

Please let us know if you do NOT want to be mentioned in the help file where we currently give you credit for suggesting this command.

We are looking forward to any feedback you might have once this is published, unless you want to sync the files from this repository and try out the command before it is online on SSC. Let us know if you want any advice on how to do that.

We have not yet implemented any more advanced estimation models, such as ANCOVA, as you suggested. We might do that later, but we will first collect feedback on the first version before deciding what to do next with this command.

Thanks again!

estebanjq · 2018-10-20T07:17:02Z

@kbjarkefur

It is great to hear that this idea has come to fruition.

Please feel free to mention me as you see fit. FYI, my affiliation is the University of Wisconsin-Maryland.
I can wait for it to be available via SSC, unless you think that will take a long time. If so, let me know the best way to sync the files.
It makes sense to start with the most straightforward approach. Additional capabilities can be added later on.

Thanks again for creating this public good!!!

kbjarkefur · 2018-10-20T22:47:43Z

Great! Thanks!

I am spelling your name Esteban J. Quinnones as the ñ does not display properly in earlier versions of Stata. I hope that is OK. When I went to your GitHub profile page to get the spelling of your name I saw that your affiliation was listed there as University of Wisconsin-Madison, and unless you are doing some cross program with University of Maryland then I think that is what you meant to write.

We intend to submit the new version next week, and then it usually takes a day or two, or at least not more than a week.

estebanjq · 2018-10-20T23:04:06Z

Kristoffer, Yes, I think we can thank autocorrect for Maryland. It is definitely University of Wisconsin - Madison. If you can’t spell my surname as Quiñones, then just leave it as Quinones (replace the ñ with an n). No hurry on access to the package. Thanks again, Esteban

luizaandrade · 2018-10-20T23:11:02Z

@estebanjq, publishing the command on SSC may take a few more days, but you can already use the version in the develop branch. You can find instructions here on how to use it.

estebanjq · 2018-10-21T01:03:54Z

Thanks Luiza!

Version 6.0 - merge from Develop Addressing issue #135, , #137, #139, #141, #142. #145, #146, #153. #158 and partially addressing #152.

kbjarkefur · 2018-10-29T00:14:56Z

ietoolkit is now updated and ieddtab is now released. Type `adoupdate, update` to install all available updates to all SSC commands you have previously installed, or type `ssc install ietoolkit, replace` to update only ietoolkit.

I will now close this issue.

bajwaih · 2019-04-22T08:27:58Z

Thank you all, it is a great help.

kbjarkefur · 2019-04-22T10:50:36Z

We are happy you found it helpful!

kbjarkefur added the new command label Apr 26, 2018

kbjarkefur assigned kbjarkefur and luizaandrade Apr 26, 2018

kbjarkefur added a commit that referenced this issue May 4, 2018

ieddtable : proper 1st diff, and N. Issue #135

a17cbd6

- The first differences are here calculated properly
- The N are included
- SE also for the mean instead of SD
- All stats are restricted to the same sample

luizaandrade mentioned this issue May 9, 2018

Feature request: Attrition table #136

Open

kbjarkefur added this to Issues in progress in Version 6.0 Sep 27, 2018

kbjarkefur moved this from Issues in progress to Issues waiting to be tested in Version 6.0 Oct 17, 2018

kbjarkefur moved this from Issues waiting to be tested to Update documentation in Version 6.0 Oct 17, 2018

kbjarkefur added a commit that referenced this issue Oct 18, 2018

ieddtab : change name of command #135

ba37003

kbjarkefur added the resolved but not yet published label Oct 19, 2018

kbjarkefur mentioned this issue Oct 19, 2018

Develop ieddtab - new command #159

Merged

kbjarkefur added a commit that referenced this issue Oct 20, 2018

ieddtab help : update credit #135

15731e2

kbjarkefur moved this from Update documentation to Issues ready to be published in Version 6.0 Oct 22, 2018

kbjarkefur added a commit that referenced this issue Oct 22, 2018

Merge pull request #164 from worldbank/develop

e4ee5f4

Version 6.0 - merge from Develop Addressing issue #135, , #137, #139, #141, #142. #145, #146, #153. #158 and partially addressing #152.

kbjarkefur closed this as completed Oct 29, 2018
