Multimorbidity in households and health care costs
Status: In Progress
As the number of people with multiple long-term conditions grows, meeting their needs will be one of the biggest challenges facing the NHS. Patients with two or more conditions account for over half of primary and secondary care activity and costs.
The responsibility for managing these conditions falls mostly on the individuals themselves and on their informal carers, but people who are less able to manage their conditions effectively need more care from the NHS. It is important to understand the wider context of people living with multiple conditions and any impact this has on health care activity and costs.
This project examines whether the health care activity and costs of a person with multiple conditions depend on their household health context.
We used data from the Clinical Practice Research Datalink (CPRD) linked to Hospital Episode Statistics (HES), ONS mortality and Indices of Multiple Deprivation. ISAC protocol number 17_150RMn2.
From an original random sample, we selected a subsample of two-person households. We counted the number of conditions each person had at baseline (1 April 2014) and calculated their health care activity and costs over two years of follow-up.
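As a rough illustration of the baseline count and follow-up totals described above, here is a minimal Python sketch. It is not the project's code: the record layouts, the example rows, and the follow-up end date of 31 March 2016 are all assumptions made for the example, and the real condition definitions may be more involved than a simple date check.

```python
from datetime import date

BASELINE = date(2014, 4, 1)
FOLLOW_UP_END = date(2016, 3, 31)  # assumption: exactly two years from baseline

# Hypothetical diagnosis records: (patient_id, condition, diagnosis_date)
diagnoses = [
    ("p1", "diabetes", date(2010, 5, 1)),
    ("p1", "diabetes", date(2012, 7, 9)),      # repeat record, counted once
    ("p1", "hypertension", date(2013, 2, 3)),
    ("p1", "asthma", date(2015, 1, 20)),       # after baseline, not counted
    ("p2", "copd", date(2014, 4, 1)),          # on the baseline date, counted
]

# Hypothetical costed events: (patient_id, event_date, cost)
events = [
    ("p1", date(2013, 12, 1), 60.0),    # before baseline, excluded
    ("p1", date(2014, 6, 1), 45.0),
    ("p1", date(2016, 5, 1), 300.0),    # after follow-up, excluded
    ("p2", date(2014, 4, 1), 45.0),
    ("p2", date(2015, 3, 10), 1200.0),
]


def conditions_at_baseline(rows):
    """Count distinct conditions recorded on or before the baseline date."""
    seen = {}
    for patient_id, condition, diagnosed in rows:
        if diagnosed <= BASELINE:
            seen.setdefault(patient_id, set()).add(condition)
    return {patient_id: len(conds) for patient_id, conds in seen.items()}


def follow_up_cost(rows):
    """Sum costs for events dated within the follow-up window (inclusive)."""
    totals = {}
    for patient_id, event_date, cost in rows:
        if BASELINE <= event_date <= FOLLOW_UP_END:
            totals[patient_id] = totals.get(patient_id, 0.0) + cost
    return totals


condition_counts = conditions_at_baseline(diagnoses)
costs = follow_up_cost(events)
print(condition_counts, costs)
```

Patients with no qualifying records are simply absent from the returned dictionaries, so a real pipeline would need to fill in zeros for them before modelling.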
How does it work?
As the data used for this analysis are not publicly available, the code cannot be used to replicate the analysis on this dataset. However, with modifications the code could be used on other patient-level CPRD extracts.
These scripts were written in SAS Enterprise Guide Version 7.12, RStudio Version 1.1.383 and Stata MP15. The following R packages are used:
The SAS folder contains:
- Derive_twopersonhouseholds - Uses the denominator file from CPRD to select patients in two-person households. We limited the sample to people aged 50+ who registered at their practice within 1 year of each other. The resulting list of patient IDs can then be merged with other variables for analysis; we merged it with data on multiple conditions and on primary and secondary care activity and costs.
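The selection rules above can be sketched in Python as follows. This is an illustration only, not a translation of the SAS script: the row layout and the household identifier are hypothetical, and the sketch assumes that both household members must be aged 50+ at baseline, that age is approximated from year of birth, and that "within 1 year" means at most 365 days apart.

```python
from collections import defaultdict
from datetime import date

BASELINE = date(2014, 4, 1)
MIN_AGE = 50
MAX_REG_GAP_DAYS = 365  # assumption: "within 1 year" read as at most 365 days

# Hypothetical denominator rows:
# (patient_id, household_id, year_of_birth, registration_date)
patients = [
    ("p1", "h1", 1950, date(2001, 3, 1)),
    ("p2", "h1", 1948, date(2001, 9, 15)),
    ("p3", "h2", 1990, date(2010, 1, 1)),   # under 50 at baseline
    ("p4", "h2", 1955, date(2010, 2, 1)),
    ("p5", "h3", 1940, date(1995, 5, 1)),   # registered ten years apart
    ("p6", "h3", 1942, date(2005, 5, 1)),
    ("p7", "h4", 1945, date(2000, 1, 1)),   # three-person household
    ("p8", "h4", 1946, date(2000, 2, 1)),
    ("p9", "h4", 1947, date(2000, 3, 1)),
]


def select_two_person_households(rows):
    """Return sorted patient IDs from households that pass all three rules."""
    households = defaultdict(list)
    for row in rows:
        households[row[1]].append(row)

    selected = []
    for members in households.values():
        # Rule 1: exactly two registered people in the household
        if len(members) != 2:
            continue
        # Rule 2: both members aged 50+ (age approximated from year of birth)
        if any(BASELINE.year - yob < MIN_AGE for _, _, yob, _ in members):
            continue
        # Rule 3: registration dates no more than a year apart
        gap_days = abs((members[0][3] - members[1][3]).days)
        if gap_days > MAX_REG_GAP_DAYS:
            continue
        selected.extend(member[0] for member in members)
    return sorted(selected)


selected_ids = select_two_person_households(patients)
print(selected_ids)
```

In this toy data only household h1 passes all three rules; the resulting ID list plays the role of the patient list that is merged with other variables.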
The R folder contains:
- 01_Derive_variables_for_regression - Prepares variables for the two-part regression models.
- 02_Two_part_regression_models - Combines a logistic model for whether a person has any non-zero cost with a gamma model for the size of the cost where it is non-zero. Predicted values are estimated for each level of household multimorbidity. One option for obtaining confidence intervals is bootstrapping; we opted instead to switch to Stata and use the twopm command.
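To make the two-part idea and the bootstrapping option concrete, here is a minimal Python sketch. It is deliberately simplified and is not the R script: it uses household multimorbidity level as the only predictor and no other adjustments, in which case the logistic part reduces to the observed share of people with non-zero cost and the gamma part to the mean of the non-zero costs in each level. The level names, the data, and the function names are all hypothetical.

```python
import random
from statistics import mean

# Hypothetical records: (household multimorbidity level, cost over follow-up)
records = [
    ("neither", 0.0), ("neither", 0.0), ("neither", 120.0), ("neither", 80.0),
    ("one", 0.0), ("one", 300.0), ("one", 500.0), ("one", 400.0),
    ("both", 0.0), ("both", 900.0), ("both", 1500.0), ("both", 1200.0),
]


def two_part_means(rows):
    """Expected cost per level = P(cost > 0) * mean(cost | cost > 0)."""
    by_level = {}
    for level, cost in rows:
        by_level.setdefault(level, []).append(cost)
    result = {}
    for level, level_costs in by_level.items():
        positive = [c for c in level_costs if c > 0]
        p_nonzero = len(positive) / len(level_costs)       # part 1
        mean_positive = mean(positive) if positive else 0.0  # part 2
        result[level] = p_nonzero * mean_positive
    return result


def bootstrap_ci(rows, n_boot=1000, alpha=0.05, seed=1):
    """Percentile bootstrap interval for each level's two-part mean."""
    rng = random.Random(seed)
    draws = {}
    for _ in range(n_boot):
        # Resample people with replacement and refit both parts
        resample = [rng.choice(rows) for _ in rows]
        for level, value in two_part_means(resample).items():
            draws.setdefault(level, []).append(value)
    intervals = {}
    for level, values in draws.items():
        values.sort()
        lower = values[int((alpha / 2) * len(values))]
        upper = values[int((1 - alpha / 2) * len(values)) - 1]
        intervals[level] = (lower, upper)
    return intervals


estimates = two_part_means(records)
intervals = bootstrap_ci(records)
for level, estimate in estimates.items():
    print(level, estimate, intervals[level])
```

With covariates in either part, the group proportions and means would be replaced by fitted values from the two regression models, and each bootstrap replicate would need to refit both models, which is considerably slower.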
The Stata folder contains:
- twopm_costs - Runs two-part models and obtains estimated mean (s.e.) costs across both parts of the model.
Authors - please feel free to get in touch
This project is licensed under the MIT License.