# Distance Calculations

## Introduction

Proximity analysis measures distances between spatial features in order to assess accessibility and identify relationships between locations. It is widely used in accessibility studies, urban planning, environmental research, and many other fields.

This notebook demonstrates how to perform proximity analysis using R, covering common scenarios and applications.

## 1. Setup
This section walks through installing and loading the packages used in this notebook.

##### Required Packages

[**geosphere**](https://cran.r-project.org/web/packages/geosphere/index.html) · Spherical Trigonometry. Spherical trigonometry for geographic applications; that is, it computes distances and related measures for angular (longitude/latitude) locations. This notebook uses the following functions from *geosphere*.

* [*distm*](https://rdrr.io/rforge/geosphere/man/distm.html) · Distance matrix
* [*distHaversine*](https://rdrr.io/rforge/geosphere/man/distHaversine.html) · Haversine great circle distance

[**leaflet**](https://cran.r-project.org/web/packages/leaflet/index.html) · Create Interactive Web Maps with the [*JavaScript Leaflet library*](https://leafletjs.com).  Create and customize interactive maps using the 'Leaflet' JavaScript library and the [*htmlwidgets*](https://www.htmlwidgets.org) package. These maps can be used directly from the R console, from 'RStudio', in Shiny applications and R Markdown documents.  This notebook uses the following functions from the *leaflet* package.

* *addControl* · Graphics elements and layers
  * *addTiles* · Add a tile layer to the map
  * *addCircleMarkers* · Add circle markers to the map
  * *addPolylines* · Add polylines to the map
  * *addPolygons* · Add polygons to the map
* [*addLayersControl*](https://rdrr.io/cran/leaflet/man/addLayersControl.html) · Add UI controls to switch layers on and off
* [*leaflet*](https://rdrr.io/cran/leaflet/man/leaflet.html) · Create a Leaflet map widget.

[**rnaturalearth**](https://cran.r-project.org/web/packages/rnaturalearth/index.html) · World Map Data from [Natural Earth](https://www.naturalearthdata.com). Facilitates mapping by making Natural Earth map data more easily available to R users. This notebook uses the following functions from *rnaturalearth*.

* [*ne_download*](https://rdrr.io/cran/rnaturalearth/man/ne_download.html) · Download data from Natural Earth and (optionally) read into R

[**rnaturalearthdata**](https://cran.r-project.org/web/packages/rnaturalearthdata/index.html) · World Vector Map Data from [Natural Earth](https://www.naturalearthdata.com) Used in [rnaturalearth](https://cran.r-project.org/web/packages/rnaturalearth/index.html). Access functions are provided in the accompanying package [rnaturalearth](https://cran.r-project.org/web/packages/rnaturalearth/index.html).

[**sf**](https://cran.r-project.org/web/packages/sf/index.html) · Support for simple features, a standardized way to encode spatial vector data. Binds to 'GDAL' for reading and writing data, to 'GEOS' for geometrical operations, and to 'PROJ' for projection conversions and datum transformations. Uses by default the 's2' package for spherical geometry operations on ellipsoidal (long/lat) coordinates.  This notebook uses the following functions from *sf*.

* [*geos_measures*](https://rdrr.io/cran/sf/man/geos_measures.html) · Compute geometric measurements
  * *st_distance* · Compute distance
* [*st_coordinates*](https://rdrr.io/cran/sf/man/st_coordinates.html) · Retrieve coordinates in matrix form
* [*st_nearest_feature*](https://rdrr.io/cran/sf/man/st_nearest_feature.html) · Get index of nearest feature
* [*st_transform*](https://rdrr.io/cran/sf/man/st_transform.html) · Transform or convert coordinates of simple feature
* *st_buffer* · Compute a buffer around a geometry
* *st_intersects* · Identify which geometries intersect

### 1a. Install and Load Required Packages
If you have not already installed the required packages, uncomment and run the code below:

In [None]:
# install.packages(c("geosphere", "leaflet", "rnaturalearth", "rnaturalearthdata", "sf"))

Load the packages into your workspace.

In [None]:
library(geosphere)
library(leaflet)
library(rnaturalearth)
library(rnaturalearthdata)
library(sf)

## 2. Load and Explore the Data

For this notebook, we’ll use the [*rnaturalearth*](https://cran.r-project.org/web/packages/rnaturalearth/index.html) package to access and load datasets from [*Natural Earth*](https://www.naturalearthdata.com). The package provides direct access to Natural Earth’s geographic data without requiring an API key, making it simple to bring boundaries and point data directly into R.

We’ll import the following data files:

* Populated Places (points): locations of major cities and towns
* Coastline (lines): ocean coastline

Each dataset will be returned as an sf object, allowing us to work easily with the files in R using the [*sf*](https://cran.r-project.org/web/packages/sf/index.html) package.

In [None]:
# populated places point locations
cities <- ne_download(scale = "medium", type = "populated_places", category = "cultural", returnclass = "sf")

# coastline
coastline <- ne_download(scale = "medium", type = "coastline", category = "physical", returnclass = "sf")

### Visualize the Data

To better understand our datasets, let's visualize them using an interactive map.

In [None]:
leaflet() %>%
    addTiles() %>%
    addCircleMarkers(data = cities, color = "blue", radius = 3, label = ~NAME, group = "Cities") %>%
    addPolylines(data = coastline, color = "green", group = "Coastline") %>%
    addLayersControl(overlayGroups = c("Cities", "Coastline"))

For the rest of this notebook, we analyze only the major cities of the United States: populated places whose sovereign country is the United States and whose maximum population (`POP_MAX`) is at least 1,000,000.

In [None]:
# Attribute filter: keep populated places under United States sovereignty
# with a maximum population of at least 1 million
cities_within_us <- cities[cities$SOV0NAME == "United States" & cities$POP_MAX >= 1000000, ]

# Display the selected cities
cities_within_us[order(cities_within_us$NAME), c("NAME", "POP_MAX")]

In [None]:
# Visualize the result
leaflet() %>%
  addTiles() %>%
  addCircleMarkers(data = cities_within_us, color = "blue", radius = 3, label = ~NAME, group = "Cities in US")

## 3. Basic Distance Calculations

### 3a. Euclidean Distance

Euclidean distance is the "straight-line" distance between two points in Cartesian space. It is calculated from projected coordinates and is best suited to smaller areas, where the distortion introduced by the map projection stays small.

In [None]:
# Transform to projected CRS for accurate Euclidean distance
cities_projected <- st_transform(cities_within_us, crs = 26915)  # UTM Zone 15N

# Select a sample of cities for analysis
cities_projected <- cities_projected[1:10, ]

# Distance matrix (Euclidean)
dist_matrix_euclidean <- st_distance(cities_projected)

# Display the distance matrix (in meters)
dist_matrix_euclidean
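
For intuition about what a Euclidean distance matrix contains, the calculation can be reproduced by hand for planar coordinates. The sketch below uses base R only. The three points and their coordinates (in meters) are made-up values for illustration, not taken from the Natural Earth data, and `euclidean` is a hypothetical helper name.

```r
# Hypothetical projected coordinates in meters (illustrative values only)
pts <- matrix(
  c(0,    0,
    3000, 4000,
    6000, 8000),
  ncol = 2, byrow = TRUE,
  dimnames = list(c("A", "B", "C"), c("x", "y"))
)

# Straight-line distance between two points: sqrt(dx^2 + dy^2)
euclidean <- function(p, q) sqrt(sum((p - q)^2))

d_ab <- euclidean(pts["A", ], pts["B", ])

# Full pairwise matrix using base R's dist()
d_all <- as.matrix(dist(pts, method = "euclidean"))
d_all
```

The matrix is symmetric with zeros on the diagonal, which is the same shape of result that a pairwise distance call on a single projected point layer produces.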

### 3b. Geodesic Distance

Geodesic distance is the length of the shortest path between two points along the Earth's curved surface. It is more accurate than Euclidean distance for larger areas or when working with lat/lon coordinates.

In [None]:
# Requires the geosphere package

# Extract longitude/latitude coordinates (the Natural Earth data is in WGS84)
coords <- st_coordinates(cities_within_us)

# Calculate geodesic distances for the first 10 cities
dist_matrix_geodesic <- distm(
  x = coords[1:10, ],
  fun = distHaversine  # Haversine great-circle distance
)

# Display the distance matrix (in meters)
dist_matrix_geodesic
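
To show what the Haversine formula computes, here is a minimal base R sketch of the great-circle distance between two longitude/latitude points. The function name `haversine_m` is hypothetical, and the sphere radius of 6378137 m is chosen to match the default radius used by `distHaversine`; it treats the Earth as a perfect sphere, so results differ slightly from ellipsoidal methods.

```r
# Haversine great-circle distance between two lon/lat points given in degrees.
# r is the sphere radius in meters (assumed spherical Earth).
haversine_m <- function(lon1, lat1, lon2, lat2, r = 6378137) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  # pmin() guards against rounding pushing sqrt(a) slightly above 1
  2 * r * asin(pmin(1, sqrt(a)))
}

# A quarter of the way around the equator: (0, 0) to (90, 0)
quarter_equator <- haversine_m(0, 0, 90, 0)

# Equator to the North Pole along a meridian covers the same arc length
equator_to_pole <- haversine_m(0, 0, 0, 90)
```

Both test cases span a 90 degree arc, so each should equal a quarter of the sphere's circumference, `pi / 2 * r`.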

### 3c. Distance from Cities to Coastline
Here we calculate the distance from each major US city to the nearest point on the coastline.

In [None]:
# Calculate the distance from each city to every coastline feature (cities x features matrix)
city_to_coast_distances <- st_distance(cities_within_us, coastline)

# Find the minimum distance for each city
min_distances <- apply(city_to_coast_distances, 1, min)

# Convert distances to kilometers
min_distances_km <- round(min_distances / 1000, 1)

# Create a data frame with city names and nearest coastline distances
city_distance_to_coastline <- data.frame(
  city_name = cities_within_us$NAME,
  state = cities_within_us$ADM1NAME,
  distance_to_nearest_coastline_km = min_distances_km
)

# Display the results
city_distance_to_coastline[order(city_distance_to_coastline$distance_to_nearest_coastline_km),]

Based on this analysis, San Juan, Puerto Rico is the closest major city to the coast, while Denver, Colorado is the farthest major city from the coast.
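
The distance from a point to a line feature is the minimum distance to any of the line's segments. The base R sketch below illustrates that idea in planar coordinates only; it is not how *sf* computes distances on longitude/latitude data, and the helper names and the tiny "coastline" are hypothetical.

```r
# Planar distance from point p to the segment a-b (all length-2 numeric vectors)
point_segment_dist <- function(p, a, b) {
  ab <- b - a
  len2 <- sum(ab^2)
  if (len2 == 0) return(sqrt(sum((p - a)^2)))  # degenerate segment (a == b)
  # Position of the projection of p along a-b, clamped to the segment ends
  frac <- max(0, min(1, sum((p - a) * ab) / len2))
  closest <- a + frac * ab
  sqrt(sum((p - closest)^2))
}

# Distance from p to a polyline given as a matrix of vertices (one row each)
point_line_dist <- function(p, line) {
  min(vapply(
    seq_len(nrow(line) - 1),
    function(i) point_segment_dist(p, line[i, ], line[i + 1, ]),
    numeric(1)
  ))
}

# Hypothetical "coastline" with two segments, coordinates in meters
coast <- matrix(c(0, 0,  10, 0,  10, 10), ncol = 2, byrow = TRUE)

d_inland <- point_line_dist(c(5, 3), coast)    # nearest point lies on the first segment
d_beyond <- point_line_dist(c(12, 12), coast)  # nearest point is the final vertex
```

Clamping the projection to the segment is what distinguishes distance to a segment from distance to an infinite line, which matters for points lying past the end of a feature.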

## 4. Nearest Neighbor Analysis

Nearest neighbor analysis identifies the closest spatial feature for each point in a dataset. This is useful for applications like finding the nearest service center or analyzing spatial clustering.

In [None]:
# Find the index of the nearest coastline feature for each city
nearest_indices <- st_nearest_feature(cities, coastline)

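# Extract the distance from each city to its own nearest coastline feature
# (by_element = TRUE pairs row i with row i instead of building a full matrix)
nearest_distances <- st_distance(cities, coastline[nearest_indices, ], by_element = TRUE)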

# Convert distances to kilometers
nearest_distances_km <- as.numeric(nearest_distances) / 1000

# Combine results into a data frame
nearest_results <- data.frame(
  city_name = cities$NAME,
  nearest_coastline_index = nearest_indices,
  distance_to_coastline_km = nearest_distances_km
)

# Display nearest neighbor results
nearest_results
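
The two-step pattern used here (find the index of the nearest feature, then look up the matching distance) can be illustrated in base R with a small distance matrix. The coordinates below are hypothetical planar values in meters, and the "service centers" are an invented example.

```r
# Hypothetical planar coordinates (meters): three cities and four service centers
cities_xy  <- matrix(c(0, 0,  10, 10,  20, 0), ncol = 2, byrow = TRUE)
centers_xy <- matrix(c(1, 1,  9, 12,  30, 0,  19, -1), ncol = 2, byrow = TRUE)

# Distance from every city (rows) to every center (columns)
d <- outer(
  seq_len(nrow(cities_xy)), seq_len(nrow(centers_xy)),
  Vectorize(function(i, j) sqrt(sum((cities_xy[i, ] - centers_xy[j, ])^2)))
)

# Step 1: column index of the nearest center for each city
nearest_idx <- apply(d, 1, which.min)

# Step 2: the distance to that center, picked out one element per row
nearest_dist <- d[cbind(seq_len(nrow(d)), nearest_idx)]

data.frame(city = seq_len(nrow(d)), nearest_idx, nearest_dist)
```

Indexing the matrix with a two-column (row, column) matrix returns exactly one distance per city, so the result has the same length as the number of cities.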

## 5. Buffer Analysis

Buffer analysis creates zones of influence around spatial features, which can be used to analyze proximity impacts, such as areas within a certain distance of cities.

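In [None]:
# Create buffer zones around cities (projected CRS, so dist is in meters)
buffers <- st_buffer(cities_projected[1:10, ], dist = 50000)  # 50 km buffer

# Check which coastlines intersect with buffers
# (transform the buffers to the coastline's CRS so both layers match)
intersections <- st_intersects(st_transform(buffers, st_crs(coastline)), coastline)

# Summarize results, taking city names from the buffered features themselves
buffer_results <- data.frame(
  city_name = buffers$NAME,
  num_coastline_intersections = sapply(intersections, length)
)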

# Display buffer analysis results
buffer_results

# Visualize buffers on a map
leaflet() %>%
  addTiles() %>%
  addPolygons(data = st_transform(buffers, crs = 4326), color = "green", fillOpacity = 0.2, group = "Buffers") %>%
  addCircleMarkers(data = cities, color = "blue", radius = 5, group = "Cities") %>%
  addPolylines(data = coastline, color = "red", group = "Coastline") %>%
  addLayersControl(overlayGroups = c("Buffers", "Cities", "Coastline"))
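
For point features, asking whether something falls inside a circular buffer is equivalent to asking whether it lies within the buffer distance of the center. The base R sketch below shows that equivalence with hypothetical planar coordinates in meters; a buffer polygon only approximates a circle, so results right at the boundary can differ.

```r
# Hypothetical planar coordinates in meters
city <- c(0, 0)
features <- matrix(
  c(30000, 40000,   # 50 km from the city
    10000, 0,       # 10 km from the city
    60000, 0),      # 60 km from the city
  ncol = 2, byrow = TRUE
)
buffer_m <- 50000  # 50 km buffer radius

# Distance from the city to each feature (sweep subtracts city from every row)
dist_m <- sqrt(rowSums(sweep(features, 2, city)^2))

# A feature is "in the buffer" when its distance does not exceed the radius
in_buffer <- dist_m <= buffer_m
n_in_buffer <- sum(in_buffer)
```

Counting the `TRUE` values per city gives the same kind of summary as counting intersections per buffer.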