**Cross Section Spatial Models in Julia**

This is a brief tutorial on using some of the functions in the SpatialEconometrics.jl package.

First, we load the packages required for this exercise; then we install and load the SpatialEconometrics.jl package.

1. Loading and installing required packages

In [18]:
using Pkg,DataFrames,Shapefile,SpatialDependence
Pkg.add(url="https://github.com/alanleal-econ/SpatialEconometrics.jl")
using SpatialEconometrics

[32m[1m    Updating[22m[39m git-repo `https://github.com/alanleal-econ/SpatialEconometrics.jl`
[32m[1m   Resolving[22m[39m package versions...
[32m[1m  No Changes[22m[39m to `~/.julia/environments/v1.8/Project.toml`
[32m[1m  No Changes[22m[39m to `~/.julia/environments/v1.8/Manifest.toml`


2. Reading and detailing data

We use data on Zika cases in the Brazilian state of Ceará, taken from: Amaral, P., Resende de Carvalho, L., Hernandes Rocha, T. A., da Silva, N. C., & Vissoci, J. R. N. (2019). Geospatial modeling of microcephaly and zika virus spread patterns in Brazil. PLoS ONE, 14(9), e0222668.

In [19]:
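# Assumption: ceara_zika is a table read from the Ceará shapefile beforehand, e.g.
# ceara_zika = DataFrame(Shapefile.Table("path/to/ceara_zika.shp")) # hypothetical path
W=polyneigh(ceara_zika.geometry) # spatial weights built from the polygon geometries
W=Matrix(W) # convert the spatial weights object into a regular matrix
n=184 # number of observations
y = ceara_zika.inc_zik_3q # select a variable from the ceara_zika shapefile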
df = DataFrame(
    constant=ones(n),
    ln_gdp = ceara_zika.ln_gdp,
    ln_pop=ceara_zika.ln_pop,
    mobility = ceara_zika.mobility,
    environ = ceara_zika.environ,
    sanitation = ceara_zika.sanitation
)
X = Matrix(select(df, [:constant,:ln_gdp,:ln_pop,:mobility, :environ, :sanitation]))
X

184×6 Matrix{Union{Missing, Float64}}:
 1.0  4.55599  4.02102  0.958  0.835  0.538
 1.0  4.83451  4.18577  0.941  0.644  0.728
 1.0  5.49065  4.76005  0.971  0.973  0.535
 1.0  5.31389  4.70893  0.975  0.885  0.687
 1.0  4.73446  4.2096   0.99   0.96   0.68
 1.0  4.57723  4.03226  1.0    0.932  0.798
 1.0  4.3909   3.83607  0.968  0.538  0.644
 1.0  4.88703  4.21376  0.972  0.905  0.573
 1.0  5.26288  4.59364  0.933  0.877  0.524
 1.0  4.45245  3.8441   0.956  0.926  0.752
 ⋮                                    ⋮
 1.0  4.71922  4.1586   0.95   0.859  0.564
 1.0  5.29099  4.50225  0.981  0.788  0.607
 1.0  4.46604  3.87766  0.976  0.837  0.625
 1.0  4.85375  4.2742   0.933  0.8    0.589
 1.0  5.14761  4.2959   0.943  0.906  0.612
 1.0  4.68884  4.11002  0.978  0.938  0.659
 1.0  4.99156  4.24534  0.98   0.917  0.586
 1.0  5.20945  4.58472  0.956  0.823  0.679
 1.0  5.3319   4.74001  0.973  0.77   0.557

3. SAR Model

First, we estimate a simple SAR model (spatially lagged dependent variable)
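
For reference, the SAR specification is usually written as below. The notation follows the names used in this tutorial (`y`, `X`, `W`, `rho`, `sigma2`). The normal error term is the standard assumption behind maximum likelihood estimation; it is assumed here rather than taken from the package documentation.

$$
y = \rho W y + X\beta + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2 I_n)
$$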

In [20]:
sar_model=sar(X,y,W)

(coefs = [16.627062538573206 39.88016540681598 … 95.33194035633201 0.6772408888156662; -2.599588111262541 6.923082061885459 … 11.063352406779705 0.7077439429713919; … ; 5.229096817467874 11.160015891891737 … 27.253771947462845 0.6399665543158717; 3.2474414030528043 10.118056359099391 … 23.215773564483893 0.748625237236324], sigma2 = 176.21419089914, rho = [0.23790174741605125, 0.11315758630908089], nobs = 184, dof = 178, ll = -737.9017538201629)

Next, we present these results in a formatted table

In [21]:
names_col=names(df)
sar_summary(sar_model,names_col)

.------------.---------.---------.----------.----------.-----------.
Maximum Likelihood Estimation of SAR Model
.------------.---------.---------.----------.----------.-----------.
Log-Likelihood: -737.902
Number of observations: 184
σ2: 176.214
.------------.---------.---------.----------.----------.----------.
|  Variable  |    β    | Std Dev | Lower CI | Upper CI | p-value  |
:------------+---------+---------+----------+----------+----------:
|  constant  | 16.627  | 39.880  | -62.078  |  95.332  | 0.677241 |
|   ln_gdp   | -2.600  |  6.923  | -16.263  |  11.063  | 0.707744 |
|   ln_pop   |  1.228  |  8.970  | -16.475  |  18.931  | 0.89124  |
|  mobility  | -14.590 | 35.789  | -85.221  |  56.041  | 0.684016 |
|  environ   |  5.229  | 11.160  | -16.796  |  27.254  | 0.639967 |
| sanitation |  3.247  | 10.118  | -16.721  |  23.216  | 0.748625 |
'------------'---------'---------'----------'----------'----------'
ρ: 0.238, Standard Deviati

Finally, we calculate their direct, indirect and total effects

In [22]:
beta_complet=vcat(sar_model.sigma2,sar_model.rho[1],sar_model.coefs[:,1])
sar_effects=effects_sar(y,X,W,beta_complet)

5×3 Matrix{Float64}:
  -2.62955  -0.781538   -3.41109
   1.24248   0.369282    1.61176
 -14.7581   -4.3863    -19.1444
   5.28938   1.57207     6.86145
   3.28488   0.976308    4.26118
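
The three columns above are the direct, indirect and total effects of each regressor. As a rough illustration of where such numbers come from, the sketch below applies the usual summary measures based on the spatial multiplier (I - ρW)^(-1): the direct effect averages its diagonal, the total effect averages its row sums, and the indirect effect is the difference. The helper `sar_impacts` and the toy matrix `W_toy` are hypothetical names introduced for this example; this is not necessarily how `effects_sar` is implemented.

```julia
using LinearAlgebra

# Hypothetical helper, not part of SpatialEconometrics.jl.
# Returns a k×3 matrix with columns: direct, indirect, total effects.
function sar_impacts(W::AbstractMatrix, rho::Real, beta::AbstractVector)
    n = size(W, 1)
    A = inv(I - rho * W)        # spatial multiplier (I - ρW)^(-1)
    direct_unit = tr(A) / n     # average diagonal element of the multiplier
    total_unit = sum(A) / n     # average row sum of the multiplier
    direct = direct_unit .* beta
    total = total_unit .* beta
    indirect = total .- direct
    return hcat(direct, indirect, total)
end

# Toy row-standardized weights matrix for three units on a line (assumed data).
W_toy = [0.0 1.0 0.0;
         0.5 0.0 0.5;
         0.0 1.0 0.0]

# Slope coefficients only; the constant is excluded.
effects = sar_impacts(W_toy, 0.2, [1.5, -0.7])
```

When W is row-standardized, the total effect reduces to β/(1 - ρ). With the estimates above, -2.600/(1 - 0.238) ≈ -3.411 for ln_gdp, which agrees with the total effect reported in the output.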

Now, we print these effects in a proper table

In [23]:
effects_summary(sar_effects,names_col[2:end])

.------------.----------------.------------------.---------------.
|  Variable  | Direct Effects | Indirect Effects | Total Effects |
:------------+----------------+------------------+---------------:
|   ln_gdp   |     -2.630     |      -0.782      |    -3.411     |
|   ln_pop   |     1.242      |      0.369       |     1.612     |
|  mobility  |    -14.758     |      -4.386      |    -19.144    |
|  environ   |     5.289      |      1.572       |     6.861     |
| sanitation |     3.285      |      0.976       |     4.261     |
'------------'----------------'------------------'---------------'


4. SEM Model

Next, we estimate a simple SEM model (spatially autocorrelated error term)
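
For reference, the SEM specification is usually written as below, with the spatial dependence placed in the error term rather than in the dependent variable. The symbols u and ε are notation introduced here; `lambda` is the spatial error parameter returned by the estimation.

$$
y = X\beta + u, \qquad u = \lambda W u + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2 I_n)
$$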

In [24]:
sem_model=sem(X,y,W)

(coefs = [16.627065894473088 39.88016536263677 … 95.3319436250427 0.6772408270409711; -2.5995886217459963 6.923082054339273 … 11.063351881403591 0.7077438879374547; … ; 5.22909751107866 11.160015879537013 … 27.25377261669116 0.6399665095992313; 3.247440152045304 10.118056348367233 … 23.21577229229611 0.7486253305186223], sigma2 = 176.21419051045956, lambda = [0.23790174680532192 0.11315758627772404], nobs = 184, dof = 178, ll = -737.9017538201629)

Now, we print these results in a formatted table

In [25]:
names_col=names(df)
sem_summary(sem_model,names_col)

.------------.---------.---------.----------.----------.-----------.
Maximum Likelihood Estimation of SEM Model
.------------.---------.---------.----------.----------.-----------.
Log-Likelihood: -737.902
Number of observations: 184
σ2: 176.214
.------------.---------.---------.----------.----------.----------.
|  Variable  |    β    | Std Dev | Lower CI | Upper CI | p-value  |
:------------+---------+---------+----------+----------+----------:
|  constant  | 16.627  | 39.880  | -62.078  |  95.332  | 0.677241 |
|   ln_gdp   | -2.600  |  6.923  | -16.263  |  11.063  | 0.707744 |
|   ln_pop   |  1.228  |  8.970  | -16.475  |  18.931  | 0.89124  |
|  mobility  | -14.590 | 35.789  | -85.221  |  56.041  | 0.684015 |
|  environ   |  5.229  | 11.160  | -16.796  |  27.254  | 0.639967 |
| sanitation |  3.247  | 10.118  | -16.721  |  23.216  | 0.748625 |
'------------'---------'---------'----------'----------'----------'
λ: 0.238, Standard Deviati

5. SARAR (or SAC) model

Now, we estimate a SARAR model (spatially lagged dependent variable and spatially lagged error), using the same weights matrix for both spatial processes
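
For reference, the SARAR (SAC) specification is usually written as below. It combines a spatial lag of the dependent variable, weighted by W, with a spatially autocorrelated error, weighted by M. The symbols u and ε are notation introduced here; in the cell that follows, M is set equal to W.

$$
y = \rho W y + X\beta + u, \qquad u = \lambda M u + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2 I_n)
$$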

In [26]:
M=W
sarar_model=sarar(X,y,W,M)

(coefs = [15.360436670683146 40.52477342713172 … 95.34063452434818 0.705118998004481; -2.890154467951806 7.13547317056372 … 11.19250430341391 0.6859425500289715; … ; 5.9644537947777305 11.522584631271286 … 28.7055704169788 0.6053690504944904; 4.277448559999447 10.677467064749516 … 25.35063062109659 0.6891994799473498], sigma2 = 176.6763776861746, rho = [0.08497190564358824 0.32230592861399615], lambda = [0.17106463701354635 0.3114059827534564], nobs = 184, dof = 178, ll = -737.7635471482148)

Now, we print these results in a formatted table

In [27]:
sarar_summary(sarar_model,names_col)

.------------.---------.---------.----------.----------.-----------.
Maximum Likelihood Estimation of SAC Model
.------------.---------.---------.----------.----------.-----------.
Log-Likelihood: -737.764
Number of observations: 184
σ2: 176.676
.------------.---------.---------.----------.----------.----------.
|  Variable  |    β    | Std Dev | Lower CI | Upper CI | p-value  |
:------------+---------+---------+----------+----------+----------:
|  constant  | 15.360  | 40.525  | -64.620  |  95.341  | 0.705119 |
|   ln_gdp   | -2.890  |  7.135  | -16.973  |  11.193  | 0.685943 |
|   ln_pop   |  1.486  |  9.109  | -16.491  |  19.463  | 0.870596 |
|  mobility  | -13.968 | 37.093  | -87.175  |  59.240  | 0.706961 |
|  environ   |  5.964  | 11.523  | -16.777  |  28.706  | 0.605369 |
| sanitation |  4.277  | 10.677  | -16.796  |  25.351  | 0.689199 |
'------------'---------'---------'----------'----------'----------'
λ: 0.085, Standard Deviati

Now, we compute the direct, indirect and total effects for this model

In [28]:
beta_complet=vcat(sarar_model.sigma2,sarar_model.rho[1],sarar_model.lambda[1],sarar_model.coefs[:,1])
SARAR_effects=effects_sarar(y,X,W,M,beta_complet)

5×3 Matrix{Float64}:
  -2.89411  -0.264432   -3.15854
   1.48802   0.135959    1.62398
 -13.9866   -1.27794   -15.2646
   5.97262   0.545712    6.51833
   4.2833    0.391361    4.67466
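
In the standard treatment of the SARAR model, the spatial error process does not enter the conditional mean of y, so the effects depend only on ρ and W. The worked check below uses that formulation for the k-th regressor, with ι denoting a vector of ones (notation introduced here). It assumes W is row-standardized, which the numbers above are consistent with; it is not necessarily how `effects_sarar` computes them.

$$
S_k(W) = (I_n - \rho W)^{-1}\beta_k, \qquad
\text{total}_k = \frac{1}{n}\,\iota' S_k(W)\,\iota = \frac{\beta_k}{1-\rho}
$$

$$
\text{total}_{\text{ln\_gdp}} \approx \frac{-2.890}{1 - 0.085} \approx -3.159
$$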

Finally, we present these results in a formatted table

In [29]:
effects_summary(SARAR_effects,names_col[2:end])

.------------.----------------.------------------.---------------.
|  Variable  | Direct Effects | Indirect Effects | Total Effects |
:------------+----------------+------------------+---------------:
|   ln_gdp   |     -2.894     |      -0.264      |    -3.159     |
|   ln_pop   |     1.488      |      0.136       |     1.624     |
|  mobility  |    -13.987     |      -1.278      |    -15.265    |
|  environ   |     5.973      |      0.546       |     6.518     |
| sanitation |     4.283      |      0.391       |     4.675     |
'------------'----------------'------------------'---------------'
