## Inverse Distance Weighting (IDW) Interpolation

Suppose we have a dataset that shows how one quantity of interest varies across space.
Equivalently, it can be viewed as {$(\vec{x_1}, y_1)$, $(\vec{x_2}, y_2)$, $(\vec{x_3}, y_3)$, ...}, where the $\vec{x_i}$'s are the coordinates of the points where we have data and the $y_i$'s are the measured values at those points. <br><br>
We would like to interpolate from these data points such that two things are satisfied:
1. The interpolation is exact: the estimated value at each known data point is the same as the measured value there, and
2. Points far away from a given source data point receive less weight from it than nearby points.

Wikipedia has an excellent article on IDW. I am linking it [here](https://en.wikipedia.org/wiki/Inverse_distance_weighting).
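
The requirements above can be written compactly. As a sketch, in the standard (Shepard) form of IDW the estimate at an unsampled location $\vec{x}$ is a weighted average of the known values. The symbols $\hat{y}$, $w_i$ and $p$ are introduced here for illustration; $p$ plays the role of the `exponent` argument in the code below.

$$\hat{y}(\vec{x}) = \frac{\sum_{i} w_i(\vec{x})\, y_i}{\sum_{i} w_i(\vec{x})}, \qquad w_i(\vec{x}) = \frac{1}{|\vec{x} - \vec{x_i}|^{p}} \quad \text{for } \vec{x} \neq \vec{x_i}$$

At a data point itself we set $\hat{y}(\vec{x_i}) = y_i$, which gives exactness. Because $w_i$ shrinks as the distance $|\vec{x} - \vec{x_i}|$ grows, far-away data points contribute less than nearby ones.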

In [5]:
# NOTE: the function name `idw` is a placeholder; the body is still a skeleton.
def idw(X, y, extent, exponent=2, resolution='high', coordinate_type='euclidean'):
    """
        X        - the set of spatial locations, usually assumed to be Lat-Long
                   (to be extended to higher dimensions)
        y        - the values at those locations
        exponent - controls how much weight far-off data points receive
                   (larger values give them less)
        extent   - bounds of the grid to interpolate over: xmax, xmin, ymax, ymin
    """
    if coordinate_type == 'latlong_small':
        # Assume that the Earth is a sphere and use polar coordinates:
        # |r2 - r1| ≈ R * sqrt((Lat2 - Lat1)^2 + (Long2 - Long1)^2)
        pass
    if coordinate_type == 'latlong_large':
        # Code to be written after understanding all the projections.
        return

    return

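$|\vec{r_2} - \vec{r_1}| \approx R \times \sqrt{(Lat_2 - Lat_1)^{2} + (Long_2 - Long_1)^{2}}$

Here $R$ is the radius of the Earth, and the latitudes and longitudes are in radians. This treats a unit of longitude like a unit of latitude; strictly, the longitude difference should be scaled by the cosine of the latitude (about 0.88 near Delhi).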
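
As a rough illustration, the approximation could be coded as below. This is only a sketch: the function name `small_angle_distance` and the value used for the Earth's radius are assumptions made for this example, and the function follows the formula exactly as written, without any cosine correction on the longitude term.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius, in kilometres


def small_angle_distance(lat1, lon1, lat2, lon2, radius=EARTH_RADIUS_KM):
    """Approximate distance between two nearby points given in degrees."""
    # The formula needs angle differences in radians.
    dlat = np.radians(lat2 - lat1)
    dlon = np.radians(lon2 - lon1)
    return radius * np.sqrt(dlat**2 + dlon**2)


# Two points exactly one degree of latitude apart
d = small_angle_distance(28.0, 77.0, 29.0, 77.0)
print(round(d, 2))  # about 111.19 km
```

One degree of latitude comes out at roughly 111 km, which is a quick sanity check on the unit conversion.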

In [6]:
import numpy as np
import pandas as pd

# Throwaway test arrays; both are redefined from the data extent further down
x_arr = np.linspace(0,10,11)
y_arr = np.linspace(0,10,10)

In [7]:
df = pd.read_csv('30-03-18.csv')

In [9]:
df = df.drop('Unnamed: 0',axis=1)

In [10]:
df.head()

| | location | parameter | value | latitude | longitude |
|---|---|---|---|---|---|
| 0 | Jawaharlal Nehru Stadium, Delhi - DPCC | pm25 | 194.0 | 28.581197 | 77.234291 |
| 1 | Sonia Vihar, Delhi - DPCC | pm25 | 267.0 | 28.739434 | 77.245721 |
| 2 | Narela, Delhi - DPCC | pm25 | 273.0 | 28.822931 | 77.101961 |
| 3 | Najafgarh, Delhi - DPCC | pm25 | 129.0 | 28.620806 | 76.991463 |
| 4 | NSIT Dwarka, New Delhi - CPCB | pm25 | 176.0 | 28.60909 | 77.032541 |


In [11]:
y_min = df['latitude'].min()-0.2
y_max = df['latitude'].max()+0.2
x_min = df['longitude'].min()-0.2
x_max = df['longitude'].max()+0.2

In [12]:
x_arr = np.linspace(x_min,x_max,100)

In [13]:
y_arr = np.linspace(y_min,y_max,100)

In [14]:
xx,yy = np.meshgrid(x_arr,y_arr)
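
It helps to remember how `np.meshgrid` lays out its outputs, since the loops that follow index into `xx` and `yy`. A tiny sketch with made-up coordinates (the `_demo` names are only for this example):

```python
import numpy as np

x_demo = np.array([0.0, 1.0, 2.0])   # 3 x-coordinates (longitude-like)
y_demo = np.array([10.0, 20.0])      # 2 y-coordinates (latitude-like)
xx_demo, yy_demo = np.meshgrid(x_demo, y_demo)

# Both outputs have shape (len(y), len(x)): rows follow y, columns follow x.
print(xx_demo.shape)

# Entry [row, col] holds the x-coordinate of column `col` and the y-coordinate of row `row`.
print(xx_demo[1, 2], yy_demo[1, 2])
```

So `xx[0][i]` walks along the longitudes and `yy[j][0]` walks along the latitudes.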

In [15]:
data = np.zeros((len(xx),len(yy)))

In [16]:
new_arr = np.array(df[['longitude','latitude','value']])

In [17]:
new_arr[0]

array([ 77.234291,  28.581197, 194.      ])

In [18]:
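# For each station, find the grid cell nearest to it (brute-force search).
new = []
for points in new_arr:                  # points = [longitude, latitude, value]
    mindist = np.inf
    val = 0
    for j in range(len(yy)):            # j indexes latitude (rows of yy)
        temp = yy[j][0]
        for i in range(len(xx[0])):     # i indexes longitude (columns of xx)
            dist = np.linalg.norm(np.array([xx[0][i],temp]) - points[:2])
            if dist<mindist:
                mindist = dist
                val = (i,j)             # (longitude index, latitude index)
    new.append((points,val))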
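
Because the grid is regular, the nearest cell can also be found one axis at a time, which avoids the nested loops. The following is a minimal sketch rather than a drop-in replacement; `nearest_cell` and the `_demo` arrays are names made up for this example.

```python
import numpy as np


def nearest_cell(x_arr, y_arr, px, py):
    """Return (i, j): indices of the x and y grid coordinates closest to (px, py)."""
    # On a rectilinear grid the squared distance splits into an x part and a
    # y part, so each axis can be minimised independently.
    i = int(np.abs(np.asarray(x_arr) - px).argmin())
    j = int(np.abs(np.asarray(y_arr) - py).argmin())
    return i, j


x_demo = np.linspace(0.0, 1.0, 5)   # 0, 0.25, 0.5, 0.75, 1
y_demo = np.linspace(0.0, 2.0, 5)   # 0, 0.5, 1, 1.5, 2
cell = nearest_cell(x_demo, y_demo, 0.3, 1.6)
print(cell)  # (1, 3): x = 0.25 and y = 1.5 are the closest grid coordinates
```

The returned pair uses the same (longitude index, latitude index) order as `val` in the loop above.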

In [20]:
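# Place each station's value in its nearest grid cell.
# new_grid is indexed [longitude index][latitude index], the transpose of xx and yy.
new_grid = np.zeros((len(xx[0]),len(yy)))
for i in range(len(new)):
    x = new[i][1][0]
    y = new[i][1][1]
    new_grid[x][y] = new[i][0][2]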

In [39]:
points

array([77.22445, 28.63576, 96.     ])

In [37]:
xx

array([[76.791463  , 76.79879251, 76.80612201, ..., 77.50242499,
        77.50975449, 77.517084  ],
       [76.791463  , 76.79879251, 76.80612201, ..., 77.50242499,
        77.50975449, 77.517084  ],
       [76.791463  , 76.79879251, 76.80612201, ..., 77.50242499,
        77.50975449, 77.517084  ],
       ...,
       [76.791463  , 76.79879251, 76.80612201, ..., 77.50242499,
        77.50975449, 77.517084  ],
       [76.791463  , 76.79879251, 76.80612201, ..., 77.50242499,
        77.50975449, 77.517084  ],
       [76.791463  , 76.79879251, 76.80612201, ..., 77.50242499,
        77.50975449, 77.517084  ]])

In [38]:
yy

array([[28.29968   , 28.29968   , 28.29968   , ..., 28.29968   ,
        28.29968   , 28.29968   ],
       [28.30698557, 28.30698557, 28.30698557, ..., 28.30698557,
        28.30698557, 28.30698557],
       [28.31429113, 28.31429113, 28.31429113, ..., 28.31429113,
        28.31429113, 28.31429113],
       ...,
       [29.00831987, 29.00831987, 29.00831987, ..., 29.00831987,
        29.00831987, 29.00831987],
       [29.01562543, 29.01562543, 29.01562543, ..., 29.01562543,
        29.01562543, 29.01562543],
       [29.022931  , 29.022931  , 29.022931  , ..., 29.022931  ,
        29.022931  , 29.022931  ]])

The known values are now stored in the grid, each at the cell nearest to its station.
How do we interpolate next?
Oh! Of course! We can compute the distance from every grid cell to each station from their latitudes and longitudes.
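
One possible way to carry out that step is sketched below. It is an illustration under assumptions, not the finished implementation: the function name `idw_grid` is made up, and it uses plain Euclidean distance on the raw coordinates (degrees, for this data) rather than a distance on the sphere.

```python
import numpy as np


def idw_grid(stations, xx, yy, exponent=2):
    """IDW estimate on a meshgrid. stations: rows of (x, y, value)."""
    stations = np.asarray(stations, dtype=float)
    # Distance from every grid cell to every station: shape (n_stations, rows, cols)
    dx = xx[None, :, :] - stations[:, 0, None, None]
    dy = yy[None, :, :] - stations[:, 1, None, None]
    dist = np.sqrt(dx**2 + dy**2)

    # Zero distances give infinite weights; silence the warnings and fix them below.
    with np.errstate(divide="ignore", invalid="ignore"):
        weights = 1.0 / dist**exponent
        est = (weights * stations[:, 2, None, None]).sum(axis=0) / weights.sum(axis=0)

    # Where a cell coincides with a station, copy the station's value (exactness).
    for k, row, col in zip(*np.nonzero(dist == 0)):
        est[row, col] = stations[k, 2]
    return est


# Two made-up stations on a one-row grid
stations_demo = np.array([[0.0, 0.0, 10.0], [2.0, 0.0, 20.0]])
xx_demo, yy_demo = np.meshgrid(np.array([0.0, 1.0, 2.0]), np.array([0.0]))
est_demo = idw_grid(stations_demo, xx_demo, yy_demo)
print(est_demo)  # 10 and 20 at the stations, 15 midway between them
```

With the notebook's data this could be called as `idw_grid(new_arr, xx, yy)`. The result is indexed like `xx` and `yy`, that is, [latitude index][longitude index].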

In [30]:
new_grid[x][y]

96.0