# Uniform Point Density in a Circle
This notebook investigates the spatial homogeneity of the heat flow data.

In [None]:
import numpy as np
from pyproj import Proj
from pickle import Unpickler
from cache import cached_call
import matplotlib.pyplot as plt
from scipy.integrate import quad, dblquad
from sklearn.neighbors import KernelDensity
from matplotlib.patches import Circle, Wedge
from loaducerf3 import Polygon, PolygonSelector
from reheatfunq.data import distance_distribution
from matplotlib.patches import Polygon as MPolygon
from reheatfunq.coverings import random_global_R_disk_coverings

In [None]:
%config InlineBackend.figure_format = 'retina'

In [None]:
from math import acos, cos, sin, degrees, pi

## Data

In [None]:
hf_continental = np.load('intermediate/heat-flow-selection-mW_m2.npy')

In [None]:
with open('intermediate/02-Geometry.pickle','rb') as f:
    saf_geometry = Unpickler(f).load()

proj_saf = Proj(saf_geometry["proj_str"])

In [None]:
mask = np.ones(hf_continental.shape[1], dtype=bool)
hf_xy = np.stack(proj_saf(*hf_continental[1:3,:]), axis=1)

for poly in saf_geometry["selection_polygons_xy"]:
    select = PolygonSelector(Polygon(*poly[:-1].T))
    mask &= ~select.array_mask(hf_xy)
hf_independent = (hf_continental.T)[mask]

In [None]:
with open('intermediate/03-Buffered-Poly.pickle','rb') as f:
    buffered_poly = Unpickler(f).load()

## Distance distribution for uniform points in a circle
First, we determine the probability density of the distance $d$ between two
points drawn independently from the uniform distribution on a disk of radius $R$.
A sketch illustrates the derivation:

In [None]:
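def plot_circle_sketch(ax, x, d, R, fontsize=8):
    """
    Sketch the geometry used to derive the distance density: a disk of
    radius R centered at the origin and the circle of radius d around
    the point p0, which lies at distance x below the center.
    """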
    # Location of the large R-circle:
    xR, yR = 0, 0

    # Location of the small circle
    xd, yd = 0, -x
    
    # The azimuth from the small circle center to
    # the intercept:
    if x + d > R:
        alpha = acos((x**2 + d**2 - R**2) / (2*x*d))
    else:
        alpha = pi
    
    # Location of the left intercept of the two:
    xint = -d * sin(alpha)
    yint = d * cos(alpha) - x
    
    if x + d > R:
        ax.add_patch(Wedge((xd,yd), 0.2*d, 90.0, 90.00+degrees(alpha), width=0,
                           facecolor='none', edgecolor='k'))
    ax.add_patch(Circle((xR,yR), R, facecolor='none', edgecolor='k'))
    ax.add_patch(Circle((xd,yd), d, facecolor='none', edgecolor='gray'))
    ax.add_patch(Wedge((xd,yd), d, 90.00-degrees(alpha), 90.0+degrees(alpha), width=0,
                       facecolor='none', edgecolor='tab:orange', linewidth=2.0))
    ax.scatter(xR, yR, marker='*', color='tab:blue')
    ax.scatter(xd, yd, marker='o', color='tab:blue')
    ax.scatter(xint, yint, marker='.', color='tab:red')
    if x + d > R:
        ax.plot([xR, xd, xint, xR], [yR, yd, yint, yR], color='k', linestyle='--', zorder=0, linewidth=1.0)
    else:
        ax.plot([xR, xint], [yR, yint], color='k', linestyle='--', zorder=0, linewidth=1.0)
    ax.text(0.5*(xd + xint), 0.5*(yd+yint), "$d$", ha='left' if degrees(alpha) > 90 else 'right', va='top',
            fontsize=fontsize)
    ax.text(10e-3*R, 0.5*(yd+yR), "$x$", ha='left', va='center', fontsize=fontsize)
    if x + d > R:
        ax.text(xd-0.05*d, yd, "$\\alpha$", ha='right', va='bottom', fontsize=fontsize)
        ax.text(0.5*(xR + xint), 0.5*(yR+yint), "$R$", ha='right' if degrees(alpha) > 90 else 'left',
                va='bottom', fontsize=fontsize)

In [None]:
fig = plt.figure(figsize=(7.510, 2.8))
#ax_bg = fig.add_axes((0,0,1,1))
ax0 = fig.add_axes((0.0, 0.0, 0.33, 1.0))
ax0.set_xlim(-1.9, 1.9)
ax0.set_ylim(-2.8, 1.2)
ax0.set_aspect('equal')
ax0.set_axis_off()
ax0.text(-1.9, 1.2, "(a)", ha='left', va='top')
plot_circle_sketch(ax0, 0.4, 0.7, 1.0)

ax1 = fig.add_axes((0.34, 0.0, 0.33, 1.0))
ax1.set_xlim(-1.9, 1.9)
ax1.set_ylim(-2.8, 1.2)
ax1.set_aspect('equal')
ax1.set_axis_off()
ax1.text(-1.9, 1.2, "(b)", ha='left', va='top')
plot_circle_sketch(ax1, 0.9, 1.85, 1.0)

ax2 = fig.add_axes((0.67, 0.0, 0.33, 1.0))
ax2.set_xlim(-1.9, 1.9)
ax2.set_ylim(-2.8, 1.2)
ax2.set_aspect('equal')
ax2.set_axis_off()
ax2.text(-1.9, 1.2, "(c)", ha='left', va='top')
plot_circle_sketch(ax2, 0.7, 0.2, 1.0)

Suppose that we have two points $p_0$ and $p_1$ drawn independently from the uniform distribution on the disk.
Without loss of generality, we can rotate the disk as indicated in (a) so that the first point $p_0$ lies
directly below the center at distance $x$. For a point drawn uniformly from the disk, this distance follows
the density (for $0 \leq x \leq R$)
$$
    p(x) = \frac{2x}{R^2}\,.
$$
The orange arc shows the set of points within the disk that are located at distance $d$ from $p_0$.
For the configuration shown in (a), the arc meets the disk's border at the red dot. This dot can be
parameterized by the angle $\alpha$, measured counterclockwise from the line from $p_0$ to the center. The angle
$\alpha$ can be computed from the law of cosines:
$$
    \alpha = \arccos\left(\frac{x^2 + d^2 - R^2}{2xd}\right)\,.
$$
The configuration of (a) is valid only for a limited range of $d$ that depends on $x$ (and vice versa).
Panel (b) shows that as $d$ increases, the arc shrinks to a point. This yields an upper bound on $d$
for a given $x$:
$$
    d \leq R + x\,.
$$
Panel (c) illustrates the opposite limit of the case shown in (a). If the sum of $d$ and $x$ does not exceed the
disk radius, $d + x \leq R$, the full circle of radius $d$ around $p_0$ lies within the disk. Both cases can be
combined into
$$
    \alpha(x,d,R) = \left\lbrace
                    \begin{array}{ll}
                        \,\arccos \left(\frac{x^2 + d^2 - R^2}{2xd}\right)\, & : x > R - d \\
                        \pi &: x \leq R - d
                    \end{array}\right. \,.
$$
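
The piecewise definition of $\alpha(x,d,R)$ can be sketched in plain Python. The function name `arc_half_angle`, the explicit `d >= R + x` branch, and the clamping of the cosine against rounding errors are assumptions made for this illustration and are not part of the notebook's own code. The sketch assumes $0 \leq x \leq R$.

```python
from math import acos, pi

def arc_half_angle(x, d, R=1.0):
    """Half opening angle alpha of the arc of radius d around p0 inside the disk.

    Assumes 0 <= x <= R, where x is the distance of p0 from the disk center.
    """
    if x <= R - d:
        # The full circle around p0 lies inside the disk (panel c).
        return pi
    if d >= R + x:
        # The circle around p0 no longer reaches into the disk (limit of panel b).
        return 0.0
    # Law of cosines; clamp to [-1, 1] to guard against rounding errors.
    c = (x**2 + d**2 - R**2) / (2.0 * x * d)
    return acos(max(-1.0, min(1.0, c)))

# The three configurations drawn in panels (a), (b) and (c):
for x, d in ((0.4, 0.7), (0.9, 1.85), (0.7, 0.2)):
    print(x, d, arc_half_angle(x, d, 1.0))
```

For the parameters of panel (c), $x + d < R$, so the function returns $\pi$.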

We can now construct the density of point pairs $p_0$ and $p_1$ at distance $d$. Conditional on $p_0$ and its $x$,
the density of points $p_1$ at distance $d$ is proportional to the length of the orange arc, $2\alpha d$; the constant factor is absorbed into the normalization. Integrating over all possible $p_0$, we find:
$$
f(d) = \frac{1}{F} \int\limits_{\max\{d-R,\, 0\}}^R \!\!\!\!\!\!\!\mathrm{d}x\; p(x) \alpha(x,d,R) d
$$
for $0 \leq d \leq 2R$. The normalization constant is therefore
$$
F = \int\limits_0^{2R}\!\!\!\mathrm{d}y \int\limits_{\max\{y-R,\, 0\}}^R \!\!\!\!\!\!\!\mathrm{d}x\; p(x) \alpha(x,y,R) y\,.
$$
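
As a plausibility check of the integral above, the following sketch evaluates $f(d)$ for $R=1$ with a simple midpoint rule and compares it with the closed-form density of the distance between two uniform points in a unit disk, $f(d) = \frac{4d}{\pi}\arccos\frac{d}{2} - \frac{2d^2}{\pi}\sqrt{1-\frac{d^2}{4}}$. The closed form is a standard result that is quoted here as an assumption rather than derived in this notebook. The helper names are hypothetical, and the value $F = \pi/2$ used below is also an assumption of the sketch: it follows from normalizing the conditional arc density $2\alpha d / (\pi R^2)$ for $R=1$.

```python
from math import acos, pi, sqrt

def alpha(x, d):
    """Half opening angle for the unit disk (R = 1), for x > 0 and d > 0."""
    if x <= 1.0 - d:
        return pi
    c = (x**2 + d**2 - 1.0) / (2.0 * x * d)
    return acos(max(-1.0, min(1.0, c)))  # clamp against rounding errors

def pdf_midpoint(d, n=2000):
    """f(d) from the x-integral with n midpoint nodes, assuming F = pi/2."""
    lo, hi = max(d - 1.0, 0.0), 1.0
    h = (hi - lo) / n
    total = 0.0
    for k in range(n):
        x = lo + (k + 0.5) * h
        total += 2.0 * x * alpha(x, d) * d  # p(x) * alpha * d with p(x) = 2x
    return total * h / (pi / 2.0)

def pdf_closed_form(d):
    """Closed-form pair-distance density in the unit disk (assumed reference)."""
    return (4.0 * d / pi) * acos(d / 2.0) - (2.0 * d**2 / pi) * sqrt(1.0 - d**2 / 4.0)

for d in (0.1, 0.5, 1.0, 1.5, 1.9):
    print(f"d={d:.1f}  midpoint={pdf_midpoint(d):.4f}  closed form={pdf_closed_form(d):.4f}")
```

The midpoint rule is chosen only to keep the sketch free of dependencies; an adaptive quadrature routine handles the square-root behavior of $\alpha$ at the case boundary more efficiently.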

In [None]:
def integrand(x,d):
    """
    Unnormalized integrand p(x) * alpha(x,d,R) * d for the unit disk (R = 1).
    """
    if x == 0 or x == 1:
        return 0.0
    if x <= 1.0 - d:
        return 2 * pi * d * x
    c = (x**2 + d**2 - 1) / (2*x*d)
    if c > 1:
        # Should not happen, but might be due to numerics.
        if c - 1.0 < 1e-5:
            return 0.0
    if c < -1:
        # Should not happen, but might be due to numerics.
        if abs(c+1) < 1e-5:
            return 2 * pi * d * x
        
    return 2 * x * d * acos(c)

In [None]:
I = dblquad(integrand, 0, 2, lambda d : max(0.0, d-1.0), 1.0)[0]

In [None]:
def pdf(d):
    return quad(integrand, max(0.0, d-1.0), 1.0, args=(d,))[0] / I

def cdf(d):
    return dblquad(integrand, 0, d, lambda x : max(0.0, x-1.0), 1.0)[0] / I

In [None]:
D = np.linspace(0, 2, 100)
Ypdf = np.array([pdf(d) for d in D])
Ycdf = np.array([cdf(d) for d in D])

In [None]:
fig = plt.figure(figsize=(10,4))
ax = fig.add_subplot(121)
ax.plot(D,Ypdf)
ax = fig.add_subplot(122)
ax.plot(D,1.0 - Ycdf)
fig.tight_layout()
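
The numerically integrated density can also be cross-checked by simulation. The sketch below draws pairs of uniform points in the unit disk, using the radial density $p(x) = 2x$ through the square root of a uniform variate, and compares the mean pair distance with the known value $128/(45\pi)$ for $R=1$. That reference value, the sample size, the seed, and the function names are assumptions of this illustration and do not come from the notebook.

```python
import random
from math import cos, sin, sqrt, pi, hypot

def sample_unit_disk(rng):
    """Uniform point in the unit disk: radius follows p(x) = 2x."""
    r = sqrt(rng.random())
    phi = 2.0 * pi * rng.random()
    return r * cos(phi), r * sin(phi)

def mean_pair_distance(n_pairs=100_000, seed=1):
    """Monte Carlo estimate of the mean distance between two uniform points."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_pairs):
        x0, y0 = sample_unit_disk(rng)
        x1, y1 = sample_unit_disk(rng)
        total += hypot(x1 - x0, y1 - y0)
    return total / n_pairs

estimate = mean_pair_distance()
reference = 128.0 / (45.0 * pi)  # assumed reference value for R = 1
print(estimate, reference)
```

The same samples could be binned into a histogram and overlaid on the `pdf` curve for a visual comparison.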

## Real-World Data Point Distance
Now we evaluate the real-world data from the NGHF (with the data filtering of notebook 01
applied and the data from the study area excluded).

In [None]:
R = 80e3
MIN_POINTS = 10

In [None]:
dist_lola_dmin = {}
DMIN_RANGE = [0.001, 0.1,0.5,1,2,10,20, 30]
for dmin in DMIN_RANGE:
    valid_points, _, distributions, distribution_lola, distribution_indices \
       = cached_call(random_global_R_disk_coverings, R, MIN_POINTS, hf_independent,
                     buffered_poly, saf_geometry["proj_str"], dmin=dmin*1e3, seed=908392)
    dist_lola_dmin[dmin] = distribution_lola

In [None]:
distance_distribution_dmin = {}
for dmin in DMIN_RANGE:
    ddd_i = []
    for lola in dist_lola_dmin[dmin]:
        #ddd_i.append(nearest_neighbor_distance_brute(lola[:,0], lola[:,1]))
        ddd_i.append(distance_distribution(lola[:,0], lola[:,1]))
    distance_distribution_dmin[dmin] = np.concatenate(ddd_i)

Rescale the limit density by simply cutting off the $d < d_\mathrm{min}$ part:

In [None]:
Ycdf_dmin = np.array([cdf(dmin / 80) for dmin in DMIN_RANGE])
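
Cutting off the part $d < d_\mathrm{min}$ amounts to conditioning on $d \geq d_\mathrm{min}$, so that the survival function becomes $(1 - F(d)) / (1 - F(d_\mathrm{min}))$. The following sketch illustrates this rescaling. It substitutes the closed-form CDF of the pair distance in a unit disk for the numerically integrated `cdf`; that closed form is a standard result assumed here, and the function names and the default radius of 80 km are choices made for the illustration.

```python
from math import acos, pi, sqrt

def cdf_unit_disk(d):
    """Closed-form CDF of the pair distance in the unit disk, 0 <= d <= 2."""
    return (1.0 + (2.0 / pi) * (d**2 - 1.0) * acos(d / 2.0)
            - (d / pi) * (1.0 + d**2 / 2.0) * sqrt(1.0 - d**2 / 4.0))

def truncated_survival(d_km, dmin_km, R_km=80.0):
    """P(distance > d | distance >= dmin) for uniform points in a disk of radius R_km."""
    if d_km <= dmin_km:
        return 1.0
    return (1.0 - cdf_unit_disk(d_km / R_km)) / (1.0 - cdf_unit_disk(dmin_km / R_km))

for dmin in (1, 10, 20, 30):
    print(dmin, truncated_survival(80.0, dmin))
```

By construction, the rescaled survival function equals one at $d_\mathrm{min}$ and drops to zero at $2R$.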

Investigate the CDFs:

In [None]:
fig = plt.figure(figsize=(8,8))
xmax = 0.0
axes = []
for i,dmin in enumerate(DMIN_RANGE):
    ax = fig.add_subplot(3,3,i+1)
    #ax.hist(distance_distribution_dmin[dmin], density=False, bins='auto')
    xi = np.sort(distance_distribution_dmin[dmin])
    ax.plot(xi, (np.arange(xi.size)+1)[::-1] / xi.size)
    #dfr = dflat_redux[dmin]
    x = 80e3 * D
    mask = x >= dmin * 1e3
    ax.plot(x[mask], (1.0-Ycdf[mask]) / (1.0 - Ycdf_dmin[i]))
    xmax = max(ax.get_xlim()[1],xmax)
    axes.append(ax)
    ax.set_title(dmin)

xplot = np.linspace(min(DMIN_RANGE),xmax)
yplot = xplot*np.exp(-3e-9*xplot**2)
yplot /= (yplot * (xplot[1]-xplot[0])).sum()
yplot *= 2e6
for ax in axes:
    ax.set_xlim(0,xmax)
fig.tight_layout();
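
The first curve in each panel is an empirical survival function: for the sorted distances, it gives the fraction of samples that are at least as large as each value. A dependency-free sketch of the same construction, with a hypothetical function name, could look as follows.

```python
def empirical_survival(samples):
    """Return the sorted samples and, for each, the fraction of samples >= it."""
    xs = sorted(samples)
    n = len(xs)
    # Mirrors (np.arange(n) + 1)[::-1] / n: values n/n, (n-1)/n, ..., 1/n.
    surv = [(n - i) / n for i in range(n)]
    return xs, surv

xs, surv = empirical_survival([3.0, 1.0, 2.0, 4.0])
print(xs, surv)
```

With tied values the fraction is evaluated per sample rather than per distinct distance, which matches the array expression used in the plotting cell.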

### Final results plot

In [None]:
with plt.rc_context({'axes.labelsize' : 'small',
                     'axes.titlesize' : 'medium',
                     'xtick.labelsize': 'small',
                     'ytick.labelsize': 'small'}):
    fig = plt.figure(figsize=(7.510, 2.5), dpi=300)
    #ax_bg = fig.add_axes((0,0,1,1))
    xmax = 0.0
    ymax=0.0
    axes = []
    ax_leg = fig.add_axes((0,0,1.0, 0.14))
    ax_leg.set_axis_off()
    for i,dmin in enumerate([1,10,20, 30]):
        ax = fig.add_axes((0.07 + 0.24*i, 0.33, 0.2, 0.58))
        if i == 0:
            ax.set_ylabel(r'Neighbor density ($10^{-3}\mathrm{km}^{-1}$)')
        ax.set_xlabel('Distance $d$ (km)')
        xi = np.sort(1e-3*distance_distribution_dmin[dmin])

        kd = KernelDensity(kernel='gaussian', bandwidth=1.0).fit(xi.reshape((-1,1)))

        x0 = 80 * D
        x0 = x0[x0 >= dmin]
        y0 = 1e3*np.exp(kd.score_samples(x0.reshape((-1,1))))
        h0 = ax.plot(x0, y0, color='tab:blue', linewidth=1.0)

        x1 = 80 * D
        mask = x1 >= dmin
        x1 = x1[mask]
        # Ycdf_dmin is ordered like DMIN_RANGE, so look up the entry of this dmin:
        y1 = Ypdf[mask] / (1.0 - Ycdf_dmin[DMIN_RANGE.index(dmin)]) * 1e3 * 2/(160-dmin)
        h1 = ax.plot(x1, y1, color='k', linestyle='--', linewidth=0.8,
                     label='Uniform')

        # Polygon showing the integral difference between the two curves:
        xpoly = np.concatenate((x0,x1[::-1]))
        ypoly = np.concatenate((y0,y1[::-1]))
        h2 = ax.add_patch(MPolygon(np.stack((xpoly, ypoly),axis=1), color='lightgray'))

        xmax = max(ax.get_xlim()[1],xmax)
        ymax = max(ax.get_ylim()[1],ymax)
        axes.append(ax)
        ax.set_title(rf"$d_\mathrm{{min}}={dmin}\,\mathrm{{km}}$", color='k' if dmin == 20 else '#505050')


    ax_leg.legend(handles=(h0[0],h1[0],h2),
                  labels=('Disk-covered NGHF','Uniform points in disk','Difference'),
                  ncol=3, loc='center')

    xplot = np.linspace(min(DMIN_RANGE),xmax)
    yplot = xplot*np.exp(-3e-9*xplot**2)
    yplot /= (yplot * (xplot[1]-xplot[0])).sum()
    yplot *= 2e6
    for ax in axes:
        ax.set_xlim(0,xmax)
        ax.set_ylim(0,ymax)
        ax.set_xticks([0,80,160])

    fig.savefig('figures/A5-NGHF-Neighbor-Density.pdf')

### License
```
A notebook to investigate the spatial uniformity of the heat flow
data base and determine the minimum data distance d_min.

This file is part of the REHEATFUNQ model.

Author: Malte J. Ziebarth (ziebarth@gfz-potsdam.de)

Copyright © 2019-2022 Deutsches GeoForschungsZentrum Potsdam,
            2022 Malte J. Ziebarth
            

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program.  If not, see <https://www.gnu.org/licenses/>.
```