# News Hour - Part II

In this notebook, we handle the second half of the setup process: rendering our data into a shape suitable for an explicitly dynamic panel analysis. This mostly amounts to some shifting operations in `Mata`, so that the data are laid out the way the dynamic model needs them.

We also estimate a dynamic version of the viewership model on these data. Estimating this model is time-consuming, so to save time and trouble we use the method described in [this paper](http://papers.ssrn.com/sol3/papers.cfm?abstract_id=420371), which is also described in [this paper](http://www.stata-journal.com/article.html?article=st0354).

To estimate things using this method, read the latter paper above, and download and install the `AMCMC` `Stata` module from `SSC` (typically via `ssc install amcmc`). It can be found [here](https://ideas.repec.org/c/boc/bocode/s457613.html).

In [1]:
import ipystata
import os

CWD = os.getcwd()
print(CWD)

Terminated 1 unattached Stata session(s).


C:\Users\mjbaker\Documents\GitHub\NewsHour


In [2]:
%%stata
clear all
cd C:\Users\mjbaker\Documents\GitHub\NewsHour


C:\Users\mjbaker\Documents\GitHub\NewsHour


Get our data (which we assembled in Part I) - it is an averaged data set that should be ready to use:

In [5]:
%%stata
use "Data\AveragedData.dta", clear
set more off
set seed 5150
tsset stationid timeslot


       panel variable:  stationid (strongly balanced)
        time variable:  timeslot, 3 to 8
                delta:  1 unit


## Wrangling the Data into Dynamic form (i.e., the usual panel-like setup)

The first step is to build dynamic leads and lags, so that we can estimate some simple models resembling what we will do later with a complete set of random effects. This is useful as a guide to how one might think about our more complex `Mata` code in `Stata`, and it also lets us compare our random effects models with a more traditional two-way fixed effects model.

Here are the lagged variables: lagged shares, and dummies indicating whether a broadcast of type $b$ in period $t$ follows a broadcast of type $b'$ in period $t-1$. Lags that are missing because there is no previous time slot are set to zero:

In [6]:
%%stata
gen double lsi=l.lnsi
replace lsi=0 if lsi==.

gen lnewslnews=lnews*l.lnews
gen lnewsnnews=lnews*l.nnews
gen nnewslnews=nnews*l.lnews
gen nnewsnnews=nnews*l.nnews
replace lnewslnews=0 if lnewslnews==.
replace lnewsnnews=0 if lnewsnnews==.
replace nnewslnews=0 if nnewslnews==.
replace nnewsnnews=0 if nnewsnnews==.



(16622 missing values generated)
(16622 real changes made)
(16622 missing values generated)
(16622 missing values generated)
(16622 missing values generated)
(16622 missing values generated)
(16622 real changes made)
(16622 real changes made)
(16622 real changes made)
(16622 real changes made)
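
As a rough illustration of what the cell above does, here is a minimal Python sketch of the same lag-and-fill logic on a toy panel. The data values and the helper name `add_lags` are invented for this example; only the variable names (`lnsi`, `lsi`, `lnews`, `nnews`, and the four transition dummies) come from the notebook.

```python
# Toy panel: one row per (stationid, timeslot). All values are invented.
rows = [
    {"stationid": 1, "timeslot": 3, "lnsi": -2.0, "lnews": 1, "nnews": 0},
    {"stationid": 1, "timeslot": 4, "lnsi": -1.5, "lnews": 0, "nnews": 1},
    {"stationid": 1, "timeslot": 5, "lnsi": -1.8, "lnews": 1, "nnews": 0},
    {"stationid": 2, "timeslot": 3, "lnsi": -3.0, "lnews": 0, "nnews": 1},
    {"stationid": 2, "timeslot": 4, "lnsi": -2.5, "lnews": 0, "nnews": 1},
]

def add_lags(rows):
    """Mimic Stata's l. operator followed by `replace x=0 if x==.` (hypothetical helper)."""
    by_key = {(r["stationid"], r["timeslot"]): r for r in rows}
    out = []
    for r in rows:
        # Previous time slot of the same station; None when there is no such row.
        prev = by_key.get((r["stationid"], r["timeslot"] - 1))
        new = dict(r)
        new["lsi"] = prev["lnsi"] if prev else 0.0
        for cur in ("lnews", "nnews"):
            for lag in ("lnews", "nnews"):
                # e.g. lnewsnnews = 1 when local news now follows national news.
                new[cur + lag] = r[cur] * prev[lag] if prev else 0
        out.append(new)
    return out

lagged = add_lags(rows)
```

The first time slot of each station has no predecessor, so its lag and all four dummies are zero, which is consistent with one missing value per station in the output above.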


We now interact the lagged share (`lsi`) with each of the $t,t-1$ combinations of broadcasts that we care about. Each interaction carries last period's share in the event that a station broadcast, say, national news last period and is following it with local news this period, and is zero otherwise.

In [7]:
%%stata
gen double siXlnln=lnewslnews*lsi
gen double siXlnnn=lnewsnnews*lsi
gen double siXnnln=nnewslnews*lsi
gen double siXnnnn=nnewsnnews*lsi
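
To make the structure of these interactions concrete, the following Python sketch builds them for a single hypothetical row. The numbers are invented; the variable names follow the cell above.

```python
# Hypothetical row: national news now (nnews=1) following local news last period.
row = {"lsi": -2.0, "lnewslnews": 0, "lnewsnnews": 0, "nnewslnews": 1, "nnewsnnews": 0}

# Map each transition dummy to the name of its interaction with the lagged share.
names = {"lnewslnews": "siXlnln", "lnewsnnews": "siXlnnn",
         "nnewslnews": "siXnnln", "nnewsnnews": "siXnnnn"}
interactions = {new: row[old] * row["lsi"] for old, new in names.items()}

# At most one transition dummy is 1 per row, so at most one interaction is nonzero.
nonzero = [name for name, value in interactions.items() if value != 0]
```

In effect, the lagged share is routed into a separate regressor for each type of news-to-news transition.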





We also generate, for each station, the total cumulative shares of local news broadcast up to (but not including) each time slot during the night, and likewise the total cumulative shares of national news:

In [8]:
%%stata
* Running total of local news shares in earlier slots (zero in the first slot)
gen double totslnews=0
sort stationid timeslot
bysort stationid (timeslot): replace totslnews=slnews[_n-1] if _n==2
bysort stationid (timeslot): replace totslnews=totslnews[_n-1]+slnews[2] if _n==3
bysort stationid (timeslot): replace totslnews=totslnews[_n-1]+slnews[3] if _n==4
bysort stationid (timeslot): replace totslnews=totslnews[_n-1]+slnews[4] if _n==5
bysort stationid (timeslot): replace totslnews=totslnews[_n-1]+slnews[5] if _n==6

* Same running total for national news shares
gen double totsnnews=0
sort stationid timeslot
bysort stationid (timeslot): replace totsnnews=snnews[_n-1] if _n==2
bysort stationid (timeslot): replace totsnnews=totsnnews[_n-1]+snnews[2] if _n==3
bysort stationid (timeslot): replace totsnnews=totsnnews[_n-1]+snnews[3] if _n==4
bysort stationid (timeslot): replace totsnnews=totsnnews[_n-1]+snnews[4] if _n==5
bysort stationid (timeslot): replace totsnnews=totsnnews[_n-1]+snnews[5] if _n==6


(16402 real changes made)
(16476 real changes made)
(16622 real changes made)
(16622 real changes made)
(16622 real changes made)
(15225 real changes made)
(16554 real changes made)
(16622 real changes made)
(16622 real changes made)
(16622 real changes made)
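
The step-by-step `replace` statements above amount to a running total that excludes the current time slot. A minimal Python sketch of that idea, on invented data and with a hypothetical helper name `prior_totals`, might look like this:

```python
from itertools import groupby

# Invented (stationid, timeslot, slnews) triples, for illustration only.
data = [
    (1, 3, 0.10), (1, 4, 0.00), (1, 5, 0.25), (1, 6, 0.05),
    (2, 3, 0.00), (2, 4, 0.30), (2, 5, 0.00), (2, 6, 0.00),
]

def prior_totals(data):
    """Sum of shares in earlier time slots of the same station (hypothetical helper)."""
    out = {}
    ordered = sorted(data)  # sorts by stationid, then timeslot
    for station, group in groupby(ordered, key=lambda t: t[0]):
        running = 0.0
        for _, slot, share in group:
            out[(station, slot)] = running  # record the total before adding this slot
            running += share
    return out

totals = prior_totals(data)
```

Recording the running total before adding the current share is what makes the first slot of every station zero, as in the `Stata` version.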


Because these totals can be zero, and to keep the coefficients small and stable, we typically use a $\ln(1+x)$ transformation of them in estimation, interacted with the current broadcast type:

In [10]:
%%stata
gen double lnewstot=lnews*ln(1+totslnews)
gen double nnewstot=nnews*ln(1+totsnnews)

gen l_ACS_HH=ln(ACS_HH)
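
For intuition about this transformation, here is a small Python sketch. The function name `damped_total` is made up for this example; the formula mirrors `lnews*ln(1+totslnews)` from the cell above.

```python
import math

def damped_total(indicator, total):
    """indicator * ln(1 + total): zero when total is zero or the indicator is off."""
    return indicator * math.log1p(total)

no_history = damped_total(1, 0.0)          # a zero total maps to zero, not to ln(0)
some_history = damped_total(1, math.e - 1)  # ln(1 + (e - 1)) = 1
other_type = damped_total(0, 5.0)           # switched off for other broadcast types
```

The point of adding one inside the logarithm is that a station with no prior news broadcasts still gets a well-defined value.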





[Ackerberg and Rysman (2006)](http://www.econ.ucla.edu/ackerber/pdfinal2.pdf) show that a multinomial logit can be quite restrictive, and that it helps to include (functions of) group counts in estimation. Accordingly, we include counts of each type of broadcast within each market-time (`mt`) in our estimation. These counts are easy to calculate and can then be converted into functions, as Ackerberg and Rysman (2006) suggest:

In [12]:
%%stata
bysort mt: egen total_lnews=total(lnews)
bysort mt: egen total_nnews=total(nnews)
bysort mt: egen total_otherl=total(otherl)
bysort mt: egen total_otherc=total(otherc)

gen double lnewsn=lnews*ln(1+total_lnews)
gen double nnewsn=nnews*ln(1+total_nnews)
gen double otherln=otherl*ln(1+total_otherl)
gen double othercn=otherc*ln(1+total_otherc)
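
As a sketch of the count construction, the following Python snippet reproduces the `egen ... total()` and `ln(1+count)` steps for the local news dummy on a few invented rows. The `mt` labels are placeholders, not values from the data.

```python
import math
from collections import Counter

# Invented rows: `mt` is a market-time identifier, `lnews` a local-news dummy.
rows = [
    {"mt": "A3", "lnews": 1}, {"mt": "A3", "lnews": 1}, {"mt": "A3", "lnews": 0},
    {"mt": "B3", "lnews": 1}, {"mt": "B3", "lnews": 0},
]

# Count local news broadcasts within each market-time (like `bysort mt: egen ... total()`).
totals = Counter()
for r in rows:
    totals[r["mt"]] += r["lnews"]

# Attach the group count to every row and build the log-count control.
for r in rows:
    r["total_lnews"] = totals[r["mt"]]
    r["lnewsn"] = r["lnews"] * math.log(1 + r["total_lnews"])
```

Every row in a market-time carries the same count, but the control is nonzero only for rows that are themselves local news broadcasts.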





## Some preliminary estimates
### The Viewership model

We can now estimate a proto-viewership model, which ignores endogeneity but has the same basic shape as the specification we shall employ. One needs the `a2reg` package to do this, as we include market-time and station-level fixed effects in this estimation.

Here goes (without the A/R control variables):

In [15]:
%%stata
a2reg dln ln_swg ln_swgXslnews ln_swgXsotherl ln_swgXsnnews lnews otherl nnews lnewslnews lnewsnnews nnewslnews nnewsnnews lsi siXlnln siXlnnn siXnnln siXnnnn lnewstot nnewstot l_ACS_HH, individual(stationid) unit(mt)


99732 observations, 19 covariates, 16622 individuals, 1206 units, 99732 cells
Beginning Iterations
Starting Conjugate Gradient Algorithm
Iteration 0, norm of residual 4260.23108, relative error 1
Iteration 1, norm of residual 226.981112, relative error .053279061
Iteration 2, norm of residual 38.3122062, relative error .008992988
Iteration 3, norm of residual 18.9410761, relative error .004446021
Iteration 4, norm of residual 20.967133, relative error .004921595
Iteration 5, norm of residual 26.1236141, relative error .006131971
Iteration 6, norm of residual 12.3186342, relative error .002891541
Iteration 7, norm of residual 10.9890762, relative error .002579455
Iteration 8, norm of residual 6.56774712, relative error .001541641
Iteration 9, norm of residual 5.54913896, relative error .001302544
Iteration 10, norm of residual 4.10267068, relative error .00096301
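
To illustrate what a two-way fixed effects fit does, here is a minimal Python sketch with a single regressor. It is not how `a2reg` works internally (the log above shows that `a2reg` uses a conjugate gradient algorithm); the sketch swaps in simple alternating demeaning, and the data, function names, and number of sweeps are all assumptions made for the example.

```python
from collections import defaultdict

def demean_two_way(values, ids, units, sweeps=200):
    """Remove id and unit means by alternating demeaning (illustrative, not a2reg's method)."""
    v = list(values)
    for _ in range(sweeps):
        for keys in (ids, units):
            sums, counts = defaultdict(float), defaultdict(int)
            for k, x in zip(keys, v):
                sums[k] += x
                counts[k] += 1
            v = [x - sums[k] / counts[k] for k, x in zip(keys, v)]
    return v

def two_way_fe_slope(y, x, ids, units):
    """OLS slope of y on a single x after sweeping out both sets of fixed effects."""
    yt = demean_two_way(y, ids, units)
    xt = demean_two_way(x, ids, units)
    return sum(a * b for a, b in zip(xt, yt)) / sum(a * a for a in xt)

# Invented unbalanced panel: y = station effect + market-time effect + 2 * x.
ids   = [1, 1, 1, 2, 2, 3, 3, 3]
units = ["a", "b", "c", "a", "b", "a", "b", "c"]
x     = [0.5, 1.0, 2.5, 1.5, 0.2, 3.0, 0.7, 1.1]
alpha = {1: 1.0, 2: -2.0, 3: 0.5}
gamma = {"a": 0.3, "b": -0.4, "c": 1.2}
y = [alpha[i] + gamma[u] + 2.0 * xi for i, u, xi in zip(ids, units, x)]

slope = two_way_fe_slope(y, x, ids, units)
```

Because the toy outcome contains no noise, the recovered slope should be very close to 2; with the 19 covariates used above, the same demeaning would be applied to every regressor before a multivariate least squares fit.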

And here is a version with the Ackerberg/Rysman controls:

In [17]:
%%stata
a2reg dln ln_swg ln_swgXslnews ln_swgXsotherl ln_swgXsnnews lnews otherl nnews lnewslnews lnewsnnews nnewslnews nnewsnnews lsi siXlnln siXlnnn siXnnln siXnnnn lnewstot nnewstot l_ACS_HH lnewsn otherln nnewsn othercn, individual(stationid) unit(mt)


99732 observations, 23 covariates, 16622 individuals, 1206 units, 99732 cells
Beginning Iterations
Starting Conjugate Gradient Algorithm
Iteration 0, norm of residual 4269.64854, relative error 1
Iteration 1, norm of residual 177.372896, relative error .041542739
Iteration 2, norm of residual 25.5208496, relative error .005977272
Iteration 3, norm of residual 16.5499861, relative error .003876194
Iteration 4, norm of residual 19.206008, relative error .004498264
Iteration 5, norm of residual 28.7209617, relative error .006726774
Iteration 6, norm of residual 7.87969834, relative error .001845515
Iteration 7, norm of residual 7.47044031, relative error .001749662
Iteration 8, norm of residual 4.9641564, relative error .001162662
Iteration 9, norm of residual 4.05031447, relative error .000948629
Iteration 10, norm of residual 2.42099