# Point Pattern Analysis

## Introduction

This notebook analyzes the spatial distribution of PM2.5 monitoring stations to determine if they are clustered, random, or dispersed across the study area.

## 1. Setup

Before running this notebook, you will need to install and load the following packages into your R environment:

...

If you are working in the I-GUIDE environment, these packages should already be installed. However, you will still need to load them into your workspace with the base R `library()` function.

In [None]:
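library(sf)        # simple features: reading and transforming spatial data
library(dplyr)     # data manipulation
library(spatstat)  # point pattern analysis (ppp, owin, and related functions)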

## 2. Data Acquisition

Briefly describe the data sources (e.g., PM2.5 monitoring station locations).
Load the data and convert it to an sf object for spatial analysis.

In [None]:
# Read the saved station data and make sure it is an sf object
pm25_sf <- readRDS("pm25_sf.rds")
pm25_sf <- st_as_sf(pm25_sf)

## 3. Data Preparation

Transform coordinates to a suitable projected CRS (e.g., meters) to ensure accurate distance calculations.
Define the study area boundary to constrain the analysis.
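
As a minimal sketch of the reprojection step, the cell below uses `st_transform()` from `sf`. The target CRS, EPSG:5070 (NAD83 / Conus Albers, in meters), is an assumption chosen for illustration; pick the projected CRS that suits your study area. The fallback coordinates are hypothetical and only exist so the cell runs on its own, and the sketch assumes `pm25_sf` already has a CRS defined.

```r
library(sf)

# Fallback toy data (hypothetical coordinates) so this cell runs on its own
if (!exists("pm25_sf")) {
  pm25_sf <- st_as_sf(
    data.frame(lon = c(-88.2, -87.6, -89.6), lat = c(40.1, 41.9, 39.8)),
    coords = c("lon", "lat"), crs = 4326
  )
}

# Reproject to a CRS with units in meters (EPSG:5070 is an assumed choice)
pm25_sf <- st_transform(pm25_sf, 5070)

# Bounding box of the projected points, a simple stand-in for the study area
study_bbox <- st_bbox(pm25_sf)
print(study_bbox)
```

A bounding box is the simplest possible boundary. If a polygon of the actual study area is available, it is usually a better window because empty corners of the box can bias the later statistics.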

In [None]:
# Convert the sf object to a spatstat 'ppp' (planar point pattern) object.
# Assumes 'pm25_sf' holds POINT geometries in a projected CRS.
coords <- st_coordinates(pm25_sf)
# Use the bounding box of the points as the observation window
win <- spatstat.geom::owin(xrange = range(coords[, 1]),
                           yrange = range(coords[, 2]))
pm25_ppp <- spatstat.geom::ppp(coords[, 1], coords[, 2], window = win)

## 4. Exploratory Visualization of Points

Plot the points to visually assess their spatial distribution and prepare for further analysis.
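
One possible way to do this is with the `plot()` and `summary()` methods that `spatstat` provides for `ppp` objects. This is a sketch that assumes the point pattern is stored in `pm25_ppp`; the fallback pattern of 100 uniform random points is hypothetical and only lets the cell run by itself.

```r
library(spatstat)

# Fallback toy pattern so this cell runs on its own: 100 uniform random points
if (!exists("pm25_ppp")) {
  set.seed(42)
  pm25_ppp <- runifpoint(100, win = owin(c(0, 1000), c(0, 1000)))
}

# Map of station locations inside the observation window
plot(pm25_ppp, main = "PM2.5 monitoring stations", pch = 16, cex = 0.6)

# Number of points, average intensity, and window dimensions
summary(pm25_ppp)
```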

## 5. Quadrat Analysis

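Objective: Test whether the number of points per grid cell is consistent with complete spatial randomness (CSR) by dividing the study area into a grid and counting the points in each cell.
Use spatstat functions to create quadrats, calculate the observed variance-to-mean ratio (VMR), and interpret the results: a VMR near 1 suggests a random pattern, a value well above 1 suggests clustering, and a value well below 1 suggests dispersion.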
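
A minimal sketch of the quadrat workflow, using `quadratcount()` and `quadrat.test()` from `spatstat`. The 4 x 4 grid is an arbitrary assumption; the grid should be coarse enough that expected counts per cell are not tiny. The fallback pattern is hypothetical.

```r
library(spatstat)

# Fallback toy pattern so this cell runs on its own: 100 uniform random points
if (!exists("pm25_ppp")) {
  set.seed(42)
  pm25_ppp <- runifpoint(100, win = owin(c(0, 1000), c(0, 1000)))
}

# Count points in a 4 x 4 grid of quadrats (grid size is an assumed choice)
qc <- quadratcount(pm25_ppp, nx = 4, ny = 4)
plot(qc, main = "Quadrat counts")

# Variance-to-mean ratio of the cell counts
counts <- as.vector(qc)
vmr <- var(counts) / mean(counts)
print(vmr)

# Chi-squared test of the counts against complete spatial randomness
qt <- quadrat.test(pm25_ppp, nx = 4, ny = 4)
print(qt)
```

The result depends on the grid resolution, so it is worth repeating the test with a few different values of `nx` and `ny` before drawing conclusions.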

## 6. Nearest Neighbor Analysis

Objective: Measure distances to the nearest neighbor for each point and compare to an expected random distribution.
Calculate the mean nearest neighbor distance, then assess clustering or dispersion based on whether observed values differ from expected values.
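
The comparison can be sketched as follows. Under CSR with intensity λ (points per unit area), the expected mean nearest neighbor distance is 1 / (2√λ), so the ratio of observed to expected distance (the Clark-Evans ratio) is below 1 for clustered patterns and above 1 for dispersed ones. The code uses `nndist()` and `clarkevans.test()` from `spatstat` with no edge correction, which is a simplifying assumption; the fallback pattern is hypothetical.

```r
library(spatstat)

# Fallback toy pattern so this cell runs on its own: 100 uniform random points
if (!exists("pm25_ppp")) {
  set.seed(42)
  pm25_ppp <- runifpoint(100, win = owin(c(0, 1000), c(0, 1000)))
}

# Observed distance from each point to its nearest neighbor
nn <- nndist(pm25_ppp)
obs_mean <- mean(nn)

# Expected mean distance under CSR: 1 / (2 * sqrt(intensity))
lambda <- npoints(pm25_ppp) / area(Window(pm25_ppp))
exp_mean <- 1 / (2 * sqrt(lambda))

# Ratio < 1 suggests clustering, > 1 suggests dispersion
nn_ratio <- obs_mean / exp_mean
print(c(observed = obs_mean, expected = exp_mean, ratio = nn_ratio))

# Clark-Evans test of the ratio (no edge correction applied here)
ce <- clarkevans.test(pm25_ppp, correction = "none")
print(ce)
```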

## 7. Ripley’s K-Function

Objective: Test clustering at multiple scales.
Use spatstat or spatialEco to compute Ripley’s K, which assesses clustering across varying distances, and plot the results for interpretation.
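
A sketch of the `spatstat` route, using `Kest()` for the estimate and `envelope()` for a simulation envelope under CSR. The number of simulations (`nsim = 39`) is an assumed value that keeps the cell fast; the fallback pattern is hypothetical.

```r
library(spatstat)

# Fallback toy pattern so this cell runs on its own: 100 uniform random points
if (!exists("pm25_ppp")) {
  set.seed(42)
  pm25_ppp <- runifpoint(100, win = owin(c(0, 1000), c(0, 1000)))
}

# Estimate Ripley's K over a range of distances r
K <- Kest(pm25_ppp)
plot(K, main = "Ripley's K-function")

# Pointwise envelope from 39 simulated CSR patterns (nsim is an assumed choice)
env <- envelope(pm25_ppp, Kest, nsim = 39, verbose = FALSE)
plot(env, main = "K-function with CSR envelope")
```

Where the observed curve lies above the envelope, the pattern is more clustered than CSR at that distance; where it lies below, the pattern is more dispersed.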

## 8. Kernel Density Estimation (KDE)

Objective: Generate a density map showing regions of high and low point concentrations.
Use spatstat or ggplot2 with stat_density_2d() for KDE to identify hotspot areas.
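
A sketch of the `spatstat` option, using the `density()` method for `ppp` objects with a Gaussian kernel. The bandwidth is chosen by `bw.diggle()`, which is only one of several selectors and an assumption here; the fallback pattern is hypothetical.

```r
library(spatstat)

# Fallback toy pattern so this cell runs on its own: 100 uniform random points
if (!exists("pm25_ppp")) {
  set.seed(42)
  pm25_ppp <- runifpoint(100, win = owin(c(0, 1000), c(0, 1000)))
}

# Data-driven bandwidth (Diggle's cross-validation; an assumed choice)
sigma <- bw.diggle(pm25_ppp)

# Kernel-smoothed intensity surface, returned as a pixel image ('im')
dens <- density(pm25_ppp, sigma = sigma)

# Density map with the station locations overlaid
plot(dens, main = "Kernel density of PM2.5 stations")
plot(pm25_ppp, add = TRUE, pch = 16, cex = 0.4)
```

The surface is sensitive to the bandwidth, so comparing a few values of `sigma` helps separate stable hotspots from smoothing artifacts.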

## 9. Interpretation of Results

Discuss insights from each analysis, noting evidence of clustering, dispersion, or randomness.
Example insight: "High-density areas of PM2.5 stations could indicate regions where monitoring is prioritized, potentially due to known sources of pollution."

## Conclusion

Summarize findings and propose additional analyses or refinements (e.g., adding demographic data to study population exposure in dense PM2.5 station areas).

## Next Steps

From here, we recommend exploring the following notebooks:

* ...