# Spatial Dependence Index

> Is the dataset worth modeling?

## Table of Contents:

1. What is the spatial dependency index?
2. Why do we use the spatial dependency index?
3. Example 1: Weak spatial dependence.
4. Example 2: Strong spatial dependence.
5. API links.

## Level: Basic

## Changelog

| Date | Change description | Author |
|------|--------------------|--------|
| 2023-04-08 | The first version of the tutorial | @SimonMolinsky |

## Introduction

In this tutorial, we will learn how to estimate the **spatial dependency index**. The algorithm is based on the following work:

> [1] CAMBARDELLA, C.A.; MOORMAN, T.B.; PARKIN, T.B.; KARLEN, D.L.; NOVAK, J.M.; TURCO, R.F.; KONOPKA, A.E. Field-scale variability of soil properties in central Iowa soils. Soil Science Society of America Journal, v. 58, n. 5, p. 1501-1511, 1994.


## 1. What is the spatial dependency index?

The spatial dependency index (SDI) measures the strength of the spatial process we are modeling. The underlying ratio is normalized to the interval between 0 and 1, so we can express it as a percentage and assign a level of spatial dependency, from weak to strong.

The SDI is a ratio of the nugget to the total variance (sill) of a model:

$$\text{SDI} = \frac{\text{nugget}}{\text{sill}} \times 100$$

Whenever we fit a theoretical variogram with the `pyinterpolate` package, the SDI is calculated as well, and we will take advantage of this in the examples. The SDI is represented by two values:

- **numeric**, the ratio of nugget to sill in percent,
- **categorical**, a description of the strength of spatial dependency.

There are four levels of spatial dependency.

| Lower Limit (included) | Upper Limit (excluded) | Strength |
|------------------------|------------------------|----------|
| 0                      | 25                     | strong   |
| 25                     | 75                     | moderate |
| 75                     | 95                     | weak     |
| 95                     | _inf_                  | no spatial dependence |


**The lower the ratio, the stronger the spatial dependence**. If the ratio is 75 percent or greater, we should be cautious with spatial modeling because spatial similarities may not explain the process.
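
The ratio and the thresholds from the table above can be sketched as a small standalone function. This is only an illustration and not the `pyinterpolate` implementation: the function name `spatial_dependence_index`, the input checks, and the sample nugget and sill values are assumptions made for this example.

```python
from typing import Tuple


def spatial_dependence_index(nugget: float, sill: float) -> Tuple[float, str]:
    """Return the SDI in percent and its strength label.

    Hypothetical helper that mirrors the table above; it is not
    part of pyinterpolate.
    """
    # Assumed input checks: the ratio is undefined for a non-positive sill.
    if sill <= 0:
        raise ValueError("sill must be positive")
    if nugget < 0:
        raise ValueError("nugget must be non-negative")

    ratio = 100.0 * nugget / sill

    # Lower limits are included, upper limits are excluded.
    if ratio < 25:
        strength = "strong"
    elif ratio < 75:
        strength = "moderate"
    elif ratio < 95:
        strength = "weak"
    else:
        strength = "no spatial dependence"
    return ratio, strength


# Made-up nugget and sill pairs, one per strength level.
for nugget, sill in [(1.0, 10.0), (5.0, 10.0), (9.0, 10.0), (10.0, 10.0)]:
    ratio, strength = spatial_dependence_index(nugget, sill)
    print(f"nugget={nugget}, sill={sill}: SDI={ratio:.1f}% ({strength})")
```

Note that a ratio of exactly 25 falls into the moderate class and a ratio of exactly 75 into the weak class, because each lower limit is included.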

## 2. Why do we use the spatial dependency index?

In a world where Tobler’s Law applied to every spatial phenomenon, we could use kriging all the time without a second thought. We would know that spatial dependence exists and that close neighbors are always similar.

This world is not our world! Not every process follows Tobler’s Law. We can find variables that are sampled over the same area and at the same scale, yet their spatial dependence indexes differ. In [1] (SDI in `pyinterpolate` is based on this publication), multiple chemical compounds are sampled from the same field. Every compound has its own spatial distribution and variogram. Some soil parameters are not spatially dependent at all (for example, _Mg_ or _Ca_, which are randomly distributed).

The spatial dependence index level guides the next decision about what to do with the data.

- Strong: just krige it!
- Moderate: other factors might explain part of the process variation.
- Weak: a non-spatial process has more influence on the data than spatial similarities.
- No spatial dependence: the process is random, or spatial dependencies cannot explain variance.

**Note**: The last two levels are red flags, but be careful: a process with low variogram variability at one scale may still be explained by spatial relations at a different scale. A good practical example is a comparison of rental apartment prices per night. At a scale of hundreds of kilometers, the spatial dependence may be extremely weak. On the other hand, the spatial similarity between prices of apartments close to each other (up to 10 kilometers, about 6 miles) tends to show a _classic variogram curve_. The reason is simple: most managers and algorithms use information about the pricing of the closest neighbors, and neighborhood prices are affected by the same external objects or events.

In summary, we should look into the spatial dependence index to ensure that our path is not a dead end.