## Project Name: Dementia Probabilities in HRS

Project Synopsis: Follows the method of Hurd et al. (2013) for assigning dementia
probabilities, fitting separate ordered probits for self- and proxy-respondents on
the ADAMS Wave A cohort as described in that paper's online appendix. Recreates
tables from the appendix and compares the sensitivity, specificity, and AUC of the
model against Hurd's publicly available probabilities, in the manner of
Gianattasio et al. (2019). The final probabilities correlate with Hurd's at .96;
relative to Hurd's original probabilities they show slightly higher sensitivity,
lower specificity, and marginally higher AUC both in and out of sample.<br>

Principal Investigator: Amy Kelley<br>
Created By: Evan Bollens-Lund<br>
Date Created: 12/10/15<br>
Last Modified By: Evan Bollens-Lund<br>
Date Last Modified: 2/25/19<br>

Data: HRS Core public files, 1998-2014<br>
Software: SAS and Stata<br>

Papers that use data derived from this code:
Ornstein, K.A., Garrido, M.M., Siu, A.L., Bollens‐Lund, E., Langa, K.M. and Kelley, A.S., 2018. Impact of In‐Hospital Death on Spending for Bereaved Spouses. Health Services Research, 53, pp.2696-2717.

Ornstein, K.A., Garrido, M.M., Siu, A.L., Bollens-Lund, E., Rahman, O.K. and Kelley, A.S., 2019. An Examination of Downstream Effects of Bereavement on Healthcare Utilization for Surviving Spouses in a National Sample of Older Adults. Forthcoming from PharmacoEconomics.

Wachterman, M.W., O’Hare, A.M., Rahman, O.K., Lorenz, K.A., Marcantonio, E.R., Alicante, G.K. and Kelley, A.S., 2019. Framing Prognostic Expectations: One-year Mortality After Dialysis Initiation Among Older Adults. Forthcoming from JAMA Internal Medicine.

Using the original Hurd probabilities:
Kelley, A.S., McGarry, K., Gorges, R. and Skinner, J.S., 2015. The burden of health care costs for patients with dementia in the last 5 years of life. Annals of Internal Medicine, 163(10), pp.729-736.

Reference papers:
Hurd, Michael D., Paco Martorell, Adeline Delavande, Kathleen J. Mullen,
 and Kenneth M. Langa. "Monetary costs of dementia in the United States."
 New England Journal of Medicine 368, no. 14 (2013): 1326-1334.

Gianattasio, Kan Z., Qiong Wu, M. Maria Glymour, and Melinda C. Power.
 "Comparison of methods for algorithmic classification of dementia status
 in the Health and Retirement Study." Epidemiology 30, no. 2 (2019): 291-302.
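
The comparison described in the synopsis (classifying probable dementia at a .5 cutoff, then computing sensitivity, specificity, and AUC against a reference diagnosis) can be sketched on toy data as follows. This is a minimal illustration under stated assumptions, not the project's analysis: the eight observations are made-up values rather than HRS or ADAMS records, and the variable names `dementia`, `pdem`, and `mydem` simply mirror those used in the project's Stata code.

```stata
* Toy data: hypothetical values for illustration only, not HRS/ADAMS records
clear
input byte dementia float pdem
1 .90
1 .70
1 .40
0 .60
0 .30
0 .20
0 .10
0 .05
end

* Classify as probable dementia when the fitted probability exceeds .5
gen byte mydem = pdem > .5 if !missing(pdem)

* Sensitivity: share of true cases that are flagged
quietly summarize mydem if dementia == 1
scalar sens = r(mean)

* Specificity: share of non-cases that are not flagged
quietly summarize mydem if dementia == 0
scalar spec = 1 - r(mean)

display "Sensitivity = " %5.3f scalar(sens) "  Specificity = " %5.3f scalar(spec)

* Nonparametric area under the ROC curve for the continuous probability
roctab dementia pdem
scalar auc = r(area)
display "AUC = " %5.3f scalar(auc)
```

With real data, the same commands would be restricted to the sample of interest with an `if` qualifier.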

## Stata

Run this code after running the SAS code.

In [None]:
use "E:\data\serious_ill\int_data\proxy.dta", clear


local fvars 1389 1394 1399 1404 1409 1414 1419 1424 1429 1434 ///
 1439 1444 1448 1451 1454 1457

local i=506
foreach f of local fvars {
    gen fd`i'=f`f'
    gen fd`=`i'+1'=f`=`f'+1'
    gen fd`=`i'+2'=f`=`f'+2'
    local i=`i'+3
}

local gvars 1543 1548 1553 1558 1563 1568 1573 1578 1583 1588 1593 ///
1598 1602 1605 1608 1611

local i=506
foreach g of local gvars {
    gen gd`i'=g`g'
    gen gd`=`i'+1'=g`=`g'+1'
    gen gd`=`i'+2'=g`=`g'+2'
    local i=`i'+3
}
keep hhid pn *d* core_year
tokenize f g h j k l m n o
gen cy2=(core_year-1996)/2
levelsof cy2, local(levels)

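* Score each of the 16 proxy items: 3 if the base response is 2, otherwise the
* better follow-up value (1 or 2) or the worse follow-up value (4 or 5).
* A base response of 4 is flagged as not done.
local k=506
forvalues i=1/16 {
    gen base=.
    gen bet=.
    gen worse=.
    gen pc`i'_notdone=0
    * The nested macro on j expands to the wave letter set by tokenize above
    foreach j of local levels {
        qui replace pc`i'_notdone=1 if ``j''d`k'==4 & cy2==`j'
        qui replace base=``j''d`k' if cy2==`j'
        qui replace bet=``j''d`=`k'+1' if cy2==`j'
        qui replace worse=``j''d`=`k'+2' if cy2==`j'
    }
    gen pc`i'=3 if base==2
    qui replace pc`i'=bet if inlist(bet,1,2)
    qui replace pc`i'=worse if inlist(worse,4,5)
    drop base bet worse
    local k=`k'+3
}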

forvalues i=1/16 {
    local pc `pc' pc`i'
}
foreach x in miss total mean {
    egen iq`x'=row`x'(`pc')
}

drop if iqmiss==16
save E:/data/serious_ill/int_data/proxycog.dta, replace



* Get pdem for all respondents and export the dataset
use "E:\data\hrs_public_2014\rand2014\main\trk2014tr_r.dta", clear


tokenize a b c d e f g h j k l m n o

local yr=1
foreach x in 1992 1993 1994 1995 1996 1998 2000 2002 2004 2006 2008 2010 2012 2014 {
	rename ``yr''age age`x'
	local yr=`yr'+1
}	
keep hhid pn age* gender
reshape long age, i(hhid pn) j(core_year)
tempfile track
save `track'

use "E:\data\hrs_cleaned\core_00_to_14.dta", clear
gen prediction_year=core_year+1
merge 1:1 hhid pn core_year using "E:\data\hrs_cleaned\working\tics_9814.dta" , nogen keep(match master)
merge 1:1 hhid pn prediction_year using  "E:\data\hrs_public_2012\dementia\pdem_withvarnames.dta", keepusing(prob)  keep(match master) nogen
merge 1:1 hhid pn core_year using "E:\data\hrs_public_2012\dementia\ADAMS\dementia_dx_adams_wave1_only.dta", nogen
merge m:1 hhid  pn core_year using `track', keepusing(age gender) keep(match master) nogen
merge 1:1 hhid pn core_year using "E:\data\serious_ill\int_data\proxycog.dta", nogen keep(match master) keepusing(iq*)

replace female=gender-1 if !missing(gender)

replace c_ivw_date=td(31dec2014) if core_year==2014 & missing(c_ivw_date)

keep if age>=65
gen age_cat=1
replace age_cat=2 if age>=70
replace age_cat=3 if age>=75
replace age_cat=4 if age>=80
replace age_cat=5 if age>=85
replace age_cat=6 if age>=90
tab age_cat, gen(age_cat)
label define age_cat 1 "Age<70" 2 "Age 70-74" 3 "Age 75-79" 4 "Age 80-84" ///
5 "Age 85-89" 6 "Age>=90"
label values age_cat age_cat

gen ed_hs_only=educ==2
gen ed_gt_hs=educ>2
gen n=1
sort id core_year
by id: gen hasn1=!missing(core_year[_n-1])
egen adl_diff_index=rowtotal(adl_diff*)
egen adlmiss=rowmiss(adl_diff*)
local iadl iadl_diff_mp iadl_diff_gr iadl_diff_ph iadl_diff_rx iadl_diff_m 
sum n `iadl'
egen iadl_diff_index=rowtotal(`iadl')
egen iadlmiss=rowmiss(`iadl')
replace adlmiss=1 if adlmiss>1
replace iadlmiss=1 if iadlmiss>1
*replace adl_diff_index=. if missing(adl_index_core)
gen dates=mo+dy+yr+dw
gen dates_imp=mo_imp==1 |dy_imp==1 | yr_imp==1 | dw_imp==1
gen iqmissany=iqmiss>0 if !missing(iqmiss)
gen iqmissgt2=iqmiss>2 if !missing(iqmiss)
local cogvars proxy_core  imrc dlrc ser7 bwc20 dates scis cact pres vp adl_diff_index ///
iadl_diff_index adlmiss iadlmiss iqmean iqmissgt2 iqmissany
sum n `cogvars' if !proxy & !missing(dx_a)
sum n `cogvars' if proxy & !missing(dx_a)


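* For each cognition variable, build the prior-wave value, the change since the
* prior wave, and missingness flags for the current and prior wave
foreach x of local cogvars {
    sort id core_year
    qui by id: gen prev`x'=`x'[_n-1]
    qui gen ch_`x'=`x'-prev`x'
    * Cognition test scores are flagged from their imputation indicators;
    * all other variables are flagged when the value itself is missing
    if inlist("`x'","imrc","dlrc","ser7","bwc20","dates","scis","cact","pres","vp") ///
        qui gen miss`x'=`x'_imp==1
    if !inlist("`x'","imrc","dlrc","ser7","bwc20","dates","scis","cact","pres","vp") ///
        qui gen miss`x'=`x'==.
    qui gen missprev`x'=prev`x'==.
    * Note: the if command below tests proxy[_n-1] on the first observation only
    if inlist("`x'","imrc","dlrc","ser7","bwc20","dates","scis","cact","pres","vp") ///
        & proxy[_n-1]==0 qui replace missprev`x'=`x'_imp[_n-1]==1

    * Missing changes are set to 0 and missing prior values to the current value
    qui replace ch_`x'=0 if ch_`x'==.
    qui replace prev`x'=`x' if prev`x'==.
    sum prev`x' if proxy==1
    replace prev`x'=0 if missprev`x' & proxy==1 & prevproxy==1 & "`x'" !="proxy_core"
}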
gen adliadlmiss=adlmiss | iadlmiss
egen cogmissany=rowmax(missbwc missser7 missscis misscact misspres missimrc missdlrc ///
missdates adliadlmiss)

egen missprev=rowmax(missprevimrc missprevdlrc missprevser7 missprevbwc20 missprevdates ///
missprevscis missprevcact missprevpres missprevvp)

/* Note: most missingness indicators are excluded due to collinearity */
local regvars  age_cat3 age_cat4 age_cat5 age_cat6 ed_hs_only ed_gt_hs female ///
adl_diff_index iadl_diff_index ch_adl_diff ch_iadl_diff  dates bwc20 ///
ser7 scis cact pres imrc dlrc ch_dates ch_bwc20 ch_ser7 ch_scis ch_cact ch_pres ///
ch_imrc ch_dlrc 

local proxyvars  age_cat3 age_cat4 age_cat5 age_cat6 ed_hs_only ed_gt_hs female ///
adl_diff_index iadl_diff_index ch_adl_diff ch_iadl_diff iqmean  ///
prevproxy c.ch_iqmean prevdates prevser7 prevpres previmrc prevdlrc



sum `regvars' if core_year==2012
sum `regvars' if core_year==2014
sum `regvars' if !proxy & !missing(dx_a)
sum `proxyvars' if proxy & !missing(dx_a)


oprobit dx `regvars' if proxy==0
estimate store oprob1
predict pself if proxy==0
predict pself2 if proxy==0, outcome(#2)
predict pself3 if proxy==0, outcome(#3)
oprobit dx `proxyvars' if proxy==1
estimates store oprob2
predict pdem if proxy==1
predict pdem2 if proxy==1, outcome(#2)
predict pdem3 if proxy==1, outcome(#3)

replace pdem=pself if proxy==0
replace pdem2=pself2 if proxy==0
replace pdem3=pself3 if proxy==0
/* Note: this gets the most likely diagnosis, but we more commonly use a cutoff
of pdem>.5 for probable dementia */
gen ldem=1 if !missing(pdem)
replace ldem=2 if pdem2>pdem
replace ldem=3 if pdem3>pdem2 & pdem3>pdem
gen likely_dem=ldem==1 if !missing(pdem)
gen likely_cind=ldem==2 if !missing(pdem)
gen likely_normal=ldem==3 if !missing(pdem)
preserve
keep id hhid pn proxy pdem* prob_dem core_year ldem 
rename prob_dem prob_hurd
save E:\data\hrs_public_2014\dementia\pdem_withvarnames_00_14, replace
restore
gen dem=dx_adams==1 if !missing(dx_adams)
*logit dem pdem
*lroc
gen mdem=missing(pdem)
tab mdem
gen mhurd=missing(prob) if core_year<=2006 & age>=70
tab mdem mhurd if !missing(dx_a)

sum `regvars' if !proxy & core_year>=2000
sum `proxyvars' if proxy & core_year>=2000



* ***********************************
* Recreate Hurd tables
capture log close
set more off

cd "E:\data\hrs_public_2014\dementia"


use "E:\Files to move out\New Data\ucsf code for dementia in hrs\cogvars_gdr_20170518.dta", clear
rename hhidpn id
reshape long memimp dementpimp, i(id) j(wave)

gen core_year=wave*2+1992
tempfile wu
save `wu'


use  E:\data\hrs_public_2014\dementia\ADAMS\dementia_dx_adams if !missing(dx), clear
rename dx_a dx1
gen dx_adams=dx1 if adams_wave==1
rename core_year core_prior_to_adams
gen core_year=core_prior if adams_wave==1
replace core_year=2*(floor(adams_year/2)) if adams_wave>1
replace core_year=2006 if adams_wave==3
duplicates tag id core_year, gen(dup)
replace core_year=core_year+2 if dup==1 & adams_wave>1
merge 1:1 id core_year using E:\data\hrs_public_2014\dementia\pdem_withvarnames_00_14, nogen keep(match)
merge m:1 hhid pn  using "E:\data\hrs_public_2014\rand2014\main\trk2014tr_r.dta", ///
nogen keep(match) keepusing(gender)

gen female=gender-1

gen stratum=.
* Assign strata 1-10 by tenths of the fitted probability: (0,.1], (.1,.2], ...
forvalues i=0/9 {
    replace stratum=`i'+1 if pdem>`i'/10 & pdem<=(`i'+1)/10
}

xtile bin=pdem if adams_wave==1, nq(10)
mat tab=J(11,6,.)
local r=1
local c=1
forvalues i=1/10 {
    mat tab[`r',1]=`i'
    sum pdem if bin==`i'
    mat tab[`r',2]=r(N)
    mat tab[`r',3]=r(mean)
    mat tab[`r',5]=r(mean)*r(N)
    sum dementia if bin==`i'
    mat tab[`r',4]=r(mean)
    mat tab[`r',6]=r(mean)*r(N)
    local r=`r'+1
}
foreach i in 0 {
    mat tab[`r',1]=`i'
    sum pdem if !missing(dx_a)
    mat tab[`r',2]=r(N)
    mat tab[`r',3]=r(mean)
    mat tab[`r',5]=r(mean)*r(N)
    sum dementia if !missing(dx_a) & !missing(pdem)
    mat tab[`r',4]=r(mean)
    mat tab[`r',6]=r(mean)*r(N)
    local r=`r'+1
}

frmttable using "E:\data\hrs_public_2014\dementia\hurd_replication_tables`c(current_date)'.rtf", replace statmat(tab) ctitles("Bin" "N" "Fitted Prob" "Actual prob" ///
"Estimated cases" "Actual cases") sdec(0,0,3,3,3,0) ///
title(Replication Hurd Table S1)

rename dx1 dx_inwave
sort id adams_wave
by id: egen lw=max(adams_wave)
levelsof adams_wave, local(levels)
foreach l of local levels {
    by id: egen dx`l'=max(cond(adams_wave==`l',dx_inwave,.))
}

by id, sort: egen adx=max(dx_a)
label values adx dx_adams


preserve
keep if adams_wave==3 & !missing(ldem) & !missing(prob_)

tab adx dx1, row nofreq
tab adx ldem, row nofreq

mat tab=J(3,4,.)
local r=1
local c=2

foreach x in 3 2 "3,2" {
    sum adx if inlist(adx,`x')
    mat tab[`r',1]=r(N)
    local denom=r(N)
    foreach y in 3 2 1 {
        sum adx if dx_inwave==`y' & inlist(adx,`x')
        mat tab[`r',`c']=(r(N)/`denom')*100
        local c=`c'+1
}
    local r=`r'+1
    local c=2
}

mat rownames tab=Normal CIND Total

frmttable using "E:\data\hrs_public_2014\dementia\hurd_replication_tables`c(current_date)'.rtf", addtable statmat(tab) ctitles("" "" "Wave C" "" \"Wave A" "N" "Normal" "CIND" ///
"Demented") title(Replication Hurd table S4) sdec(0,1,1,1)


gen likely_normal=ldem==3 if !missing(ldem)
gen likely_cind=ldem==2 if !missing(ldem)
gen likely_dem=ldem==1 if !missing(ldem)
mat tab=J(4,3,.)
local r=1
local c=1

foreach x in normal cind dem {
    foreach y in 3 2 1 "1,2,3" {
        sum likely_`x' if inlist(dx3,`y') 
        mat tab[`r',`c']=r(mean)*100
        local r=`r'+1
}
    local c=`c'+1
    local r=1
}


mat rownames tab=Normal CIND Demented Total

frmttable using "E:\data\hrs_public_2014\dementia\hurd_replication_tables`c(current_date)'.rtf", ///
addtable statmat(tab) ctitles("Wave C" "" "Model" "" \"" "Normal" "CIND" ///
"Demented") title(Replication Hurd table S5) sdec(1)

restore

        * Compare the model classification (pdem>.5) with Hurd's probabilities
        * against the ADAMS dementia diagnosis


        destring id, replace
        merge 1:1 id core_year using `wu', keep(match master)

        sort id adams_wave
        qui gen init =adams_wave==1 & !missing(prob) & !missing(dementpimp)
        qui by id, sort: egen fdem=min(cond(dementia==1,adams_wave,.))

        gen followup=adams_wave>1 & !missing(prob) & adams_wave<=fdem & !missing(dementpimp)
        qui gen mydem=pdem>.5 if !missing(pdem)

        di "Test sample"
        qui logit dementia pdem if init==1
        lroc, nograph
        roctab dementia mydem if init==1 , detail
        di "Validation sample"
        qui logit dementia pdem if followup==1
        lroc, nograph
        roctab dementia mydem if followup==1 , detail

        di "Direct comparison to Hurd in Adams Wave C"
        qui gen hdem=prob>.5 if !missing(prob) 
        roctab dementia mydem if adams_wave==3 & !missing(prob) , detail
        roctab dementia hdem if adams_wave==3 & !missing(pdem), detail

        di "Follow-up comparison among proxies"
        roctab dementia mydem if proxy==1 & followup==1, detail
        roctab dementia hdem if proxy==1 & followup==1, detail
        di "Follow-up comparison among self-interviews"
        roctab dementia mydem if proxy==0 & followup==1, detail
        roctab dementia hdem if proxy==0 & followup==1, detail

        di "Follow-up comparison among women"
        roctab dementia mydem if female==1 & followup==1, detail
        roctab dementia hdem if female==1 & followup==1, detail
        di "Follow-up comparison among men"
        roctab dementia mydem if female==0 & followup==1, detail
        roctab dementia hdem if female==0 & followup==1, detail

        di "Follow-up comparison among all"
        roctab dementia hdem if followup==1, detail
        roctab dementia mydem if followup==1, detail
