# Lab 4: Point Pattern Analysis

#### In this lab, you will practice using inferential statistical methods to analyze point patterns.

#### Specifically, the following methods will be used.

- Nearest Neighbor Distance
- Ripley's K-function

Before the lab, please download the data from [here]()

## 1. Read and process data

Import the required packages for this lab.

In [None]:
import pandas as pd
import matplotlib.pyplot as plt

import scipy.spatial
import libpysal as ps
import numpy as np
from pointpats import PointPattern,PoissonPointProcess, as_window, G, F, J, K, L, Genv, Fenv, Jenv, Kenv, Lenv

Change path of PROJ_LIB and import basemap.
You can copy the path from your work in Lab 3.

In [None]:
import os
# Replace this path with the Anaconda folder on your computer
os.environ['PROJ_LIB'] = 'C:/ProgramData/Anaconda3/Library/share/'

from mpl_toolkits.basemap import Basemap

Read the Chicago supermarkets into a data frame

In [None]:
df = pd.read_csv("C:/Users/yi/Documents/UH_work/Teaching/GEOG389/labs/lab4_data/chicago_stores.csv")

Preview data

In [None]:
df.head()

Setting `PROJ_LIB` before importing Basemap works around an import bug.

If the import fails, change the path of `PROJ_LIB` to the Anaconda folder on your computer (you may copy this part from [Lab3_part2](http://localhost:8888/notebooks/Documents/UH_work/Teaching/GEOG389/git/Jupyter/Lab3_part2.ipynb))

Plot the store locations in a basemap

In [None]:
f, ax1 = plt.subplots(1, figsize=(15, 10))

map = Basemap(llcrnrlon=-87.956941,llcrnrlat=41.630464,urcrnrlon=-87.429476,urcrnrlat=42.039636, epsg=4269, ax=ax1)
#https://www.bdmweather.com/2018/04/python-m-arcgisimage-basemap-options/

map.arcgisimage(service='ESRI_StreetMap_World_2D', xpixels = 2000, verbose= True)

#ct.plot(color='white', edgecolor='black', linewidth = .1,ax=ax1)
ax1.plot(df['LONGITUDE'],df['LATITUDE'],'b*',markersize=5)


plt.show()

Extract stores within a rectangle. The coordinates of the bottom-left corner are [-87.7449, 41.7220] and the coordinates of the top-right corner are [-87.6247,41.8727]

In [None]:
df = df[(df['LATITUDE']> 41.721999) &(df['LATITUDE']<41.872709)]
df = df[(df['LONGITUDE']> -87.744988) &(df['LONGITUDE']<-87.624699)]

## 2. Convert coordinate system

In [None]:
from pyproj import Proj

Create the projected coordinate system for UTM zone 16T (the zone that contains Chicago)

Click [here](https://jswhit.github.io/pyproj/) for information on other projections.

![alt_text](images/fig12.jpg)

In [None]:
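# Create a projection object for UTM zone 16.
# The latitude band letter (T) is not part of the +zone parameter,
# and the northern hemisphere is the default.
myProj = Proj("+proj=utm +zone=16 +ellps=GRS80 +datum=NAD83 +units=m +no_defs")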

Convert the lat,lon of the points to northing and easting. 

Please write code to extract the LONGITUDE and LATITUDE columns into two lists

In [None]:
lat = 
lon = 

In [None]:
# Proj expects (longitude, latitude), in that order
easting, northing = myProj(lon, lat)

Convert the lists of easting and northing coordinates into a data frame, which PointPattern accepts as input

Note: when manipulating data in Python, you often need to convert data between types to match the input that each function expects.

In [None]:
df = pd.DataFrame({'e': easting, 'n': northing})

In [None]:
pp = PointPattern(df)

Print the indices of the nearest neighbor of each point and the distance to the nearest neighbor

In [None]:
pp.knn()
#pp.knn(2)

Calculate the NNDs

In [None]:
pp.nnd
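
To see what a nearest neighbor distance (NND) is, here is a minimal brute-force sketch on four made-up points. The function `nearest_neighbor_distances` and the coordinates in `toy_points` are hypothetical and only for illustration; this is not how `pointpats` computes `pp.nnd` internally.

```python
import math

def nearest_neighbor_distances(points):
    """Return, for each point, the Euclidean distance to its closest other point."""
    nnd = []
    for i, p in enumerate(points):
        # Compare point i with every other point and keep the smallest distance
        nnd.append(min(math.dist(p, q) for j, q in enumerate(points) if j != i))
    return nnd

# Hypothetical projected coordinates in meters (easting, northing)
toy_points = [(0.0, 0.0), (3.0, 4.0), (3.0, 5.0), (10.0, 0.0)]

toy_nnd = nearest_neighbor_distances(toy_points)
toy_mean_nnd = sum(toy_nnd) / len(toy_nnd)
print(toy_nnd)
print(toy_mean_nnd)
```

The brute-force loop compares every pair of points, so it is only practical for small point sets.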

Mean NND

In [None]:
pp.mean_nnd

Maximum nearest neighbor distance

In [None]:
pp.max_nnd

Minimum nearest neighbor distance

In [None]:
pp.min_nnd

Plot the points

In [None]:
pp.plot()
#plt.axis('equal')

## 3. Statistical Test of Nearest Neighbor Distance

Simulate 999 randomly distributed point sets in the window of the store locations.

Note: this process may take a while to run

In [None]:
csr_process = PoissonPointProcess(pp.window, pp.n, 999, asPP=True)

Get the random point sets.

In [None]:
csr_real = csr_process.realizations

Plot the points in one of the sets (realizations).

In [None]:
csr_real[6].plot()

In [None]:
csr_real[4].mean_nnd

Use a for loop to get the mean NND of the 999 random sets and store them in a list `csr_mean_ls`
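
The loop pattern can be sketched on stand-in data. The example below is self-contained and does not use `pointpats`: `simulate_csr`, `mean_nnd`, `toy_realizations`, and `toy_mean_ls` are hypothetical names, and the window is assumed to be a 1000 m by 1000 m rectangle. In the lab itself, the analogous loop would iterate over `csr_real` and append each realization's `mean_nnd` to `csr_mean_ls`.

```python
import math
import random

def mean_nnd(points):
    """Mean distance from each point to its nearest other point."""
    total = 0.0
    for i, p in enumerate(points):
        total += min(math.dist(p, q) for j, q in enumerate(points) if j != i)
    return total / len(points)

def simulate_csr(n, xmin, ymin, xmax, ymax, rng):
    """Draw n points uniformly at random inside a rectangle."""
    return [(rng.uniform(xmin, xmax), rng.uniform(ymin, ymax)) for _ in range(n)]

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Stand-in for csr_real: 99 random sets of 30 points each (assumed sizes)
toy_realizations = [simulate_csr(30, 0, 0, 1000, 1000, rng) for _ in range(99)]

# Collect the mean NND of every random set in a list
toy_mean_ls = []
for realization in toy_realizations:
    toy_mean_ls.append(mean_nnd(realization))

print(len(toy_mean_ls))
```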

Get the mean of the mean NNDs

In [None]:
np.mean(csr_mean_ls)

Let's check the mean Nearest Neighbor Distance of the observed sample (supermarket locations).

In [None]:
pp.mean_nnd

Get the mean and standard deviation of the mean NNDs of the 999 random sets.

In [None]:
csr_mean = np.mean(csr_mean_ls)
csr_std = np.std(csr_mean_ls)

In [None]:
pd.DataFrame(csr_mean_ls).hist(bins=50)

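# Blue lines: mean of the simulated mean NNDs and +/- 1 and 2 standard deviations
plt.axvline(csr_mean - 2*csr_std, color='b')
plt.axvline(csr_mean - 1*csr_std, color='b')
plt.axvline(csr_mean, color='b')
plt.axvline(csr_mean + csr_std, color='b')
plt.axvline(csr_mean + 2*csr_std, color='b')

# Red line: observed mean NND of the store locations
plt.axvline(pp.mean_nnd, color='r')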

In [None]:
from scipy import stats
stats.ttest_1samp(csr_mean_ls,pp.mean_nnd)

In [None]:
z = (pp.mean_nnd - np.mean(csr_mean_ls))/np.std(csr_mean_ls)
z

In [None]:
stats.norm.cdf(z)
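
The z-score step can be traced on a small made-up example. The values in `sim_means` and `observed` are hypothetical, not results from the store data. The sketch uses the population standard deviation, which matches the default of `np.std`, and also shows a rank-based Monte Carlo pseudo p-value as an alternative that does not assume the simulated means are normally distributed.

```python
from statistics import NormalDist, mean, pstdev

# Hypothetical mean NNDs from simulated random sets, and a hypothetical observed value
sim_means = [98.0, 102.0, 100.0, 95.0, 105.0, 101.0, 99.0, 100.0]
observed = 90.0

# z-score of the observed mean NND relative to the simulated distribution
z_toy = (observed - mean(sim_means)) / pstdev(sim_means)

# Lower-tail probability P(Z <= z): a small value suggests the observed
# mean NND is shorter than expected under randomness (clustering)
p_lower = NormalDist().cdf(z_toy)

# Two-sided p-value, for testing any departure from randomness
p_two_sided = 2 * min(p_lower, 1 - p_lower)

# Rank-based pseudo p-value: share of simulations at least as small as observed
pseudo_p = (sum(m <= observed for m in sim_means) + 1) / (len(sim_means) + 1)

print(z_toy, p_lower, p_two_sided, pseudo_p)
```

A z-score far below zero places the observed value in the lower tail, and a z-score far above zero places it in the upper tail, which would suggest dispersion instead.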

Referring to the following illustration, how would you interpret the result?

![](images/fig11.jpg)

### K-function

In [None]:
realizations = PoissonPointProcess(pp.window, pp.n, 100, asPP=True) # simulate CSR 100 times

In [None]:
type(realizations)

In [None]:
kenv = Kenv(pp, intervals=20, realizations=realizations)
kenv.plot()

Remember how to analyze the K-function result.
![alt_text](images/fig13.png)
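
To make the K-function concrete, here is a minimal sketch of a naive estimator, K(d) = A / n² × (number of ordered pairs of distinct points within distance d). It applies no edge correction, so it is not necessarily what `Kenv` computes; `naive_k` and the `corners` data are hypothetical. Under complete spatial randomness the expected value is πd², so estimates above πd² suggest clustering at that distance and estimates below it suggest dispersion.

```python
import math

def naive_k(points, area, d):
    """Naive Ripley's K at distance d, without edge correction."""
    n = len(points)
    # Count ordered pairs (i, j), i != j, separated by at most d
    pairs = sum(
        1
        for i, p in enumerate(points)
        for j, q in enumerate(points)
        if i != j and math.dist(p, q) <= d
    )
    return area * pairs / (n * n)

# Hypothetical pattern: four points at the corners of a unit square (area 1)
corners = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

for d in (0.5, 1.0, 1.5):
    # Compare the estimate with the CSR expectation pi * d^2
    print(d, naive_k(corners, 1.0, d), math.pi * d ** 2)
```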