A simplified spatial+ approach to mitigate spatial confounding in multivariate spatial areal models

This repository contains the R code to implement the methods described in the paper "A simplified spatial+ approach to mitigate spatial confounding in multivariate spatial areal models" (Urdangarin et al., 2024), as well as the R code to create the figures and tables presented in the paper.

Data

Rapes and dowry deaths data in Uttar Pradesh in 2011 (Vicente et al., 2020)

The data_UttarPradesh_2011.Rdata file contains the following objects:

  • data: contains the data set used. It is a dataframe with the following variables:

    • ID_area: numeric identifiers of districts
    • dist: names of the districts of Uttar Pradesh
    • year: year in which the data were collected
    • pop: population of each district in 2011
    • obs: observed number of cases (rapes or dowry deaths, according to Crime) in each district in 2011
    • exp: number of expected cases of rapes and dowry deaths in each district in 2011
    • Crime: 1=rapes, 2=dowry deaths
    • X1: standardized sex ratio covariate (number of females per 1000 males)
    • X5: standardized murder (per 100000 people) covariate
    • X6: standardized burglary (per 100000 people) covariate
    • X3: standardized female literacy rate (%) covariate
  • carto_UP: cartography of the 70 districts of Uttar Pradesh
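As a quick orientation, the sketch below computes the ratio of observed to expected cases for each crime, a common exploratory summary for count data of this kind. It uses a small made-up data frame that only mimics the column names listed above; the values, the three toy districts and the `SIR` column are assumptions for illustration, and the real object would come from loading data_UttarPradesh_2011.Rdata.

```r
# Toy stand-in for the `data` object. Column names follow the list above,
# but every value is invented for illustration only.
# With the real file one would instead run something like:
#   load("data_UttarPradesh_2011.Rdata")
data <- data.frame(
  ID_area = rep(1:3, times = 2),
  dist    = rep(c("A", "B", "C"), times = 2),
  obs     = c(10, 4, 6, 8, 3, 9),
  exp     = c(8, 5, 6, 10, 3, 6),
  Crime   = rep(1:2, each = 3)   # 1 = rapes, 2 = dowry deaths
)

# Ratio of observed to expected cases (hypothetical column name SIR).
data$SIR <- data$obs / data$exp

# Split the ratios by crime and look at their range.
sir_by_crime <- split(data$SIR, data$Crime)
sapply(sir_by_crime, range)
```

This assumes that both crimes are stacked in one data frame and distinguished by Crime, which is how the variable list reads.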

Simulated data

The Simulated_data folder contains 11 .Rdata files, one for each scenario of Simulation Study 1 and Simulation Study 2. Each .Rdata file contains the same objects as data_UttarPradesh_2011.Rdata (data and carto_UP). In Simulation Study 1, the simulated covariate $X=(X_2, X_3)'$ is added to data; in Simulation Study 2, both the simulated covariate $X_1^{*}$ and the spatial effects $\theta=(\theta_1, \theta_2)'$ are added. Every .Rdata file also contains the following objects:

  • log.risk: a vector that contains the simulated log risks for both crimes
  • log.risk.crime1: a vector that contains the simulated log risks for crime 1
  • log.risk.crime2: a vector that contains the simulated log risks for crime 2
  • simu.O: a list with 300 simulated counts data sets for both crimes
  • simu.O.crime1: a list with 300 simulated counts data sets for crime 1
  • simu.O.crime2: a list with 300 simulated counts data sets for crime 2
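To make the structure of these objects concrete, here is a minimal sketch of how a list of 300 count data sets could be built from a vector of log risks. It is not the code in SimuStudy1_simulate_data.R: the Poisson model with mean equal to expected cases times exp(log risk), the toy expected counts `exp_cases` and the randomly drawn log risks are all assumptions for illustration.

```r
set.seed(1)

n_areas <- 70   # districts in Uttar Pradesh
n_sim   <- 300  # simulated data sets per scenario

# Hypothetical expected counts and log risks for one crime.
exp_cases       <- runif(n_areas, min = 5, max = 50)
log.risk.crime1 <- rnorm(n_areas, mean = 0, sd = 0.3)

# Assumption: counts are Poisson with mean = expected * exp(log risk).
simu.O.crime1 <- lapply(seq_len(n_sim), function(i) {
  rpois(n_areas, lambda = exp_cases * exp(log.risk.crime1))
})

length(simu.O.crime1)        # number of simulated data sets
length(simu.O.crime1[[1]])   # counts per data set, one per district
```

Under this layout, each list element holds one simulated vector of counts, so a model-fitting script can loop over the list elements.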

R code

The R folder contains the code to fit the M-models with the simplified spatial+ approach and to reproduce the tables and figures of the paper.

  • R/Real_data_analysis folder contains the R code used in the real data analysis.

    The main files to fit the M-Spatial and M-SpatPlus models with ICAR, PCAR and BYM2 priors are run_MICAR.R, run_MPCAR.R and run_MBYM2.R respectively.

    • Figure1.R: R script to reproduce Figure 1 of the paper.
    • functions: folder containing the functions that implement the M-models through the rgeneric function of INLA.
    • Tables_2_3_4.R: R code to reproduce Tables 2, 3 and 4 of the paper. Before running the models, the spatial argument (one of "ICAR", "PCAR" or "BYM2") must be defined at the top of the code.
  • R/Simulation_Study_1 folder contains the R code used in Simulation Study 1. Before running the scripts, the scenario argument (one of "Scenario1", "Scenario2", "Scenario3", "Scenario4", "Scenario5" or "Scenario6") must be defined.

    • SimuStudy1_simulate_data.R: R code to simulate the 300 counts datasets for crime 1 and crime 2.
    • Figure2.R: R code to reproduce Figure 2 of the paper.
    • run_MICAR_Spat.R: R script to fit the M-Spatial model with ICAR prior to the 300 simulated datasets.
    • run_MICAR_SpatPlus.R: R script to fit the M-SpatPlus models with ICAR prior to the 300 simulated datasets.
    • run_MPCAR_Spat.R: R script to fit the M-Spatial model with PCAR prior to the 300 simulated datasets.
    • run_MPCAR_SpatPlus.R: R script to fit the M-SpatPlus models with PCAR prior to the 300 simulated datasets.
    • run_MBYM2_Spat.R: R script to fit the M-Spatial model with BYM2 prior to the 300 simulated datasets.
    • run_MBYM2_SpatPlus.R: R script to fit the M-SpatPlus models with BYM2 prior to the 300 simulated datasets.
    • SimuStudy1_merge_results.R: R code to combine the models fitted across 300 simulated datasets into a single list.
    • Tables_5_6_7_8_9.R: R code to reproduce Tables 5, 6, 7, 8 and 9 of the paper for each scenario and prior. Before running the code, the spatial argument (one of "MICAR", "MPCAR" or "MBYM2") must be defined at the top of the code.
    • Figure4.R: R code to reproduce the boxplots in Figure 4.
    • Figure5.R: R code to reproduce the boxplots in Figure 5.
    • FigureA1_supplementary.R: R code to reproduce Figure A.1 in the supplementary material.
    • Tables_A1toA11_supplementary.R: R code to reproduce Tables A.1 to A.11 of the supplementary material A for each scenario and prior.
    • FigureA2_supplementary.R: R code to reproduce the boxplots in Figure A.2.
    • FigureA3_supplementary.R: R code to reproduce the boxplots in Figure A.3.
  • R/Simulation_Study_2 folder contains the R code used in Simulation Study 2. Before running the scripts, the scenario argument (one of "Scenario1", "Scenario2", "Scenario3", "Scenario4" or "Scenario5") must be defined.

    • SimuStudy2_simulate_data.R: R code to simulate the 300 counts datasets for crime 1 and crime 2.
    • Figure3.R: R code to reproduce Figure 3 of the paper.
    • run_MICAR_Spat.R: R script to fit the M-Spatial model with ICAR prior to the 300 simulated datasets.
    • run_MICAR_SpatPlus.R: R script to fit the M-SpatPlus models with ICAR prior to the 300 simulated datasets.
    • run_MPCAR_Spat.R: R script to fit the M-Spatial model with PCAR prior to the 300 simulated datasets.
    • run_MPCAR_SpatPlus.R: R script to fit the M-SpatPlus models with PCAR prior to the 300 simulated datasets.
    • run_MBYM2_Spat.R: R script to fit the M-Spatial model with BYM2 prior to the 300 simulated datasets.
    • run_MBYM2_SpatPlus.R: R script to fit the M-SpatPlus models with BYM2 prior to the 300 simulated datasets.
    • SimuStudy2_merge_results.R: R code to combine the models fitted across 300 simulated datasets into a single list.
    • Tables_10_11_12_13_14.R: R code to reproduce Tables 10, 11, 12, 13 and 14 of the paper for each scenario and prior. Before running the code, the spatial argument (one of "MICAR", "MPCAR" or "MBYM2") must be defined at the top of the code.
    • Figure6.R: R code to reproduce the boxplots in Figure 6.
    • Figure7.R: R code to reproduce the boxplots in Figure 7.
    • FigureB4_supplementary.R: R code to reproduce Figure B.4 in the supplementary material.
    • Tables_B12toB22_supplementary.R: R code to reproduce Tables B.12 to B.22 of the supplementary material B for each scenario and prior. Before running the code, the spatial argument (one of "MICAR", "MPCAR" or "MBYM2") must be defined at the top of the code.
    • FigureB5_supplementary.R: R code to reproduce the boxplots in Figure B.5.
    • FigureB6_supplementary.R: R code to reproduce the boxplots in Figure B.6.
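Several scripts above expect the scenario and spatial arguments to be defined before they run. The sketch below shows one way to validate such values and derive the matching script names for Simulation Study 1. How the scripts themselves read these arguments is not shown here, so the `match.arg` checks and the `scripts` vector are illustrative assumptions rather than part of the repository.

```r
# Values to set by hand before running a script.
scenario <- "Scenario1"
spatial  <- "MICAR"

# Allowed values as listed above (Simulation Study 2 stops at "Scenario5").
scenarios_study1 <- paste0("Scenario", 1:6)
spatial_priors   <- c("MICAR", "MPCAR", "MBYM2")

# match.arg() stops with an error if the value is not one of the choices.
scenario <- match.arg(scenario, scenarios_study1)
spatial  <- match.arg(spatial, spatial_priors)

# Script names follow the run_<prior>_<model>.R pattern in the list above.
scripts <- file.path(
  "R", "Simulation_Study_1",
  paste0("run_", spatial, "_", c("Spat", "SpatPlus"), ".R")
)
scripts
```

Note that the real data analysis uses "ICAR", "PCAR" and "BYM2" for the spatial argument, without the leading "M".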

Computations were run using R-4.2.1, INLA version 22.12.16 (dated 2022-12-23).
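Results may differ slightly under other software versions, so it can help to record the local setup before reproducing the tables. A minimal check, assuming nothing beyond base R (the variable names `r_version` and `inla_version` are illustrative):

```r
# Report the R version in use, e.g. "4.2.1".
r_version <- paste(R.version$major, R.version$minor, sep = ".")
r_version

# INLA is not distributed through CRAN, so only query its version
# if the package is actually installed.
inla_version <- if (requireNamespace("INLA", quietly = TRUE)) {
  as.character(utils::packageVersion("INLA"))
} else {
  NA_character_
}
inla_version
```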

Acknowledgements

This work has been supported by Project PID2020-113125RB-I00/ MCIN/ AEI/ 10.13039/501100011033.

References

Urdangarin, A., Goicoa, T., Kneib, T. and Ugarte, M.D. (2024). A simplified spatial+ approach to mitigate spatial confounding in multivariate spatial areal models. Spatial Statistics, 59, 100804. DOI: 10.1016/j.spasta.2023.100804.
