<img src="figs/rise_logo.png" alt="RiSE group logo" width="300" height="300" align="left">

## $S$-maup: Statistical Test to Measure the Sensitivity to the Modifiable Areal Unit Problem


Juan C. Duque$^{1,2}$, Henry Laniado$^{1}$, Adriano Polo$^{2,3}$


$^{1}$ Department of Mathematical Sciences, Universidad EAFIT, Medellin, Colombia

$^{2}$ RiSE-group, Universidad EAFIT, Medellin, Colombia

$^{3}$ Department of Economics, Universidad EAFIT, Medellin, Colombia


__maintainer__ = "RiSE Group"  (http://www.rise-group.org/). Universidad EAFIT

__Corresponding author__ = jduquec1@eafit.edu.co (JCD)

### Abstract 

This work presents a nonparametric statistical test, $S$-maup, to measure the sensitivity of a spatially intensive variable to the effects of the Modifiable Areal Unit Problem (MAUP). To the best of our knowledge, $S$-maup is the first statistic of its type; it focuses on determining how much the distribution of the variable, at its highest level of spatial disaggregation, will change when it is spatially aggregated. Through a computational experiment, we obtain the basis for the design of the statistical test under the null hypothesis of non-sensitivity to MAUP. We performed an exhaustive simulation study to approximate the empirical distribution of the test statistic, obtain its critical values, and compute its power and size. The results indicate that, in general, both the statistical size and power improve with increasing sample size. Finally, for illustrative purposes, we present an empirical application using the Mincer equation in South Africa, where, starting from 206 municipalities, the $S$-maup statistic is used to find the maximum level of spatial aggregation that avoids the negative consequences of the MAUP.

# Run the $S$-maup.

[<span style="color:red">Download the code</span>](data/results.csv)

In [14]:
import smaup

In [None]:
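N = 150         # number of areas at the highest level of spatial disaggregation
k = 90          # number of regions after spatial aggregation
rhoEst = 0.801  # estimated spatial autocorrelation of the variable
# Qualified call: testSmaup is assumed to be defined in the imported smaup module
smaup.testSmaup(N, k, rhoEst)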

# Computational Experiment on MAUP effects

<img src="figs/scheme.png" alt="Folder structure" width="900" height="300" align="left">

## Folder: <span style="color:red">1_SAR_realizations</span>

ID, SAR1_0.9, SAR2_0.9, SAR3_0.9,..., SAR50_0.9,..., SAR1_-0.9, SAR2_-0.9, SAR3_-0.9,..., SAR48_-0.9, SAR49_-0.9, SAR50_-0.9

Fields description:

ID: Area ID

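SAR$<$realization ID$>$_$<\rho$ value$>$: SAR realization, identified by its realization ID (1 to 50) and its value of $\rho$ (for example, SAR1_0.9 is realization 1 with $\rho=0.9$)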
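The naming scheme above can be unpacked programmatically. The following is a minimal sketch, assuming every realization column follows the pattern `SAR<realization ID>_<rho value>` seen in the header listing; the helper name `parse_sar_column` and the regular expression are illustrative and are not part of the repository.

```python
import re

# Assumed pattern, inferred from the header listing: SAR<realization ID>_<rho value>
SAR_COLUMN = re.compile(r"^SAR(\d+)_(-?\d+(?:\.\d+)?)$")


def parse_sar_column(name):
    """Split a column name such as 'SAR12_-0.9' into (realization ID, rho).

    Hypothetical helper for illustration only.
    """
    match = SAR_COLUMN.match(name)
    if match is None:
        # Columns such as 'ID' do not describe a SAR realization
        raise ValueError(f"not a SAR realization column: {name!r}")
    return int(match.group(1)), float(match.group(2))


print(parse_sar_column("SAR1_0.9"))    # (1, 0.9)
print(parse_sar_column("SAR50_-0.9"))  # (50, -0.9)
```

Parsing the names this way makes it straightforward to group the columns by their $\rho$ value before comparing realizations.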


In [None]:
# go to the folder

In [None]:
# read a CSV file

In [None]:
# display the first records
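The three placeholder cells above (go to the folder, read a CSV file, display the first records) could be filled in along the following lines. This is only a sketch using the Python standard library; the function name `preview_csv` is hypothetical, and the file names inside `1_SAR_realizations` are not listed in this notebook, so the path must be supplied by the reader.

```python
import csv
from itertools import islice
from pathlib import Path


def preview_csv(path, n_rows=5):
    """Return the header and the first n_rows data rows of a CSV file.

    Hypothetical helper for illustration; values are returned as strings.
    """
    with Path(path).open(newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader)                # e.g. ID, SAR1_0.9, SAR2_0.9, ...
        rows = list(islice(reader, n_rows))  # first records only
    return header, rows
```

Calling `preview_csv` with the path to one of the CSV files in the folder returns the column names and the first few records without loading the whole file into memory.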

# Tables:

### Table 2. Effect on mean.

[<span style="color:red">code</span>](data/results.csv)

[data](data/results.csv)

### Table 3. Critical Values ($M_{\alpha;\rho, N}$).

### Table 4. Example $S$-maup.

### Table 5. Estimated power of $S$-maup.

### Table 6. Estimated size of $S$-maup.

### Table 7. Descriptive Statistics.

### Table 8. Mincer Model Estimate: South Africa.

### Table 9. Estimator of the statistic $S$-maup: South Africa.

# Figures:

### Figure 3. Relative change in mean - Average effect. (a) $N=25$; (b) $N=100$; (c) $N=225$; (d) $N=400$; (e) $N=625$; (f) $N=900$.

### Figure 4. Relative change in variance - Average effect. (a) $N=25$; (b) $N=100$; (c) $N=225$; (d) $N=400$; (e) $N=625$; (f) $N=900$.

### Figure 5. Proportion of instances for which the Levene test rejects the null hypothesis of equality of variance, with a level of significance $\alpha=0.05$. (a) $N=25$; (b) $N=100$; (c) $N=225$; (d) $N=400$; (e) $N=625$; (f) $N=900$.    

### Figure 6. MAUP effects at three levels of spatial autocorrelation, (a) $\rho=-0.9$, (b) $\rho=0$, and (c) $\rho=0.9$. Solid line: original variable with $N=900$; dashed lines: 30 aggregations with $k=240$. The vertical lines indicate $\mu_{o}$ and $\mu_{ag}$.

### Figure 7. Median $\overline{RCM}$ for $N=100$.

### Figure 8. Adjustments of robust linear regression models: (a) Linearized logistic function ($L$); (b) Linearized power function ($\eta$); (c) Linear function ($\tau$).

### Figure 9. Municipalities: (a), (b) and (c). Districts: (d), (e) and (f). Provinces: (g), (h) and (i).

### Figure 10. Distribution of coefficients, $k=136$: (a) YRSCHOOL; (b) EXP; (c) EXP2. Horizontal black line: coefficient (206 municipalities); dashed lines: the respective 95\% confidence intervals.

### Figure 11. Distribution of coefficients. Line: $k=136$; dotted line: $k=52$. (a) YRSCHOOL; (b) EXP; (c) EXP2. Horizontal black line: coefficient (206 municipalities). Horizontal dotted line: coefficient (52 districts).