# Sampling Points and Building Grids

Learn how to sample grids and random points using GeoPandas. 

The example below shows you how to sample random locations and grid points from shapes in GeoPandas GeoDataFrames. We will discuss both how to sample points at random *inside of* individual shapes, and also how to build grids that *cover* an entire GeoDataFrame. These are contained within two separate functions: the `.sample_points()` method of GeoDataFrames and GeoSeries allows us to sample points *within* geometries, and the `geopandas.tools.grids.make_grid()` function allows us to *cover* geometries with grids.  

## Import packages

First, import the packages we will use:

In [None]:
import matplotlib.pyplot as plt
import geopandas

## Get example data

For this example, we will use the New York City borough example data (`nybb`) provided with GeoPandas.

In [None]:
nybb = geopandas.read_file(geopandas.datasets.get_path("nybb"))

To see what this looks like, view the dataframe:

In [None]:
nybb

Or visualize the data:

In [None]:
nybb.explore()

## Sampling random points from within geometries

To sample points from within a GeoDataFrame, use the `sample_points()` method:

In [None]:
sampled_points = nybb.sample_points()
sampled_points.explore()

By default, ten points are sampled from within each feature. To sample a different number, pass it explicitly. For example, we can sample 200 points randomly from each feature:

In [None]:
n200_sampled_points = nybb.sample_points(200)
n200_sampled_points.explore()

This functionality also works for line geometries. For example, let's look only at the boundary of Manhattan Island:

In [None]:
manhattan_parts = nybb.iloc[[3]].explode()
manhattan_island = manhattan_parts.iloc[[30]]
manhattan_island.boundary.explore()

Sampling randomly from along this boundary can use the same `sample_points()` method:

In [None]:
manhattan_border_points = manhattan_island.boundary.sample_points(200)
m = manhattan_island.explore()
manhattan_border_points.explore(m=m, color='red')

Keep in mind that sampled points are returned as a single multi-part geometry, and that the distances over the line segments are calculated *along* the line. 

In [None]:
manhattan_border_points
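To make "distance *along* the line" concrete, here is a minimal sketch in plain Python. It is not how GeoPandas implements line sampling; the helper name `sample_along_line` and the rectangle coordinates are made up for illustration. Each point is placed by drawing a random distance from the start of the line and walking that far along its segments:

```python
import math
import random

def sample_along_line(coords, n, seed=None):
    """Sample n points uniformly by distance along a polyline (hypothetical helper)."""
    rng = random.Random(seed)
    segments = list(zip(coords[:-1], coords[1:]))
    lengths = [math.dist(a, b) for a, b in segments]
    total = sum(lengths)
    points = []
    for _ in range(n):
        # Distance from the start of the line, measured along the line
        d = rng.uniform(0, total)
        for (a, b), length in zip(segments, lengths):
            if length == 0:
                continue  # skip degenerate segments
            if d <= length:
                t = d / length  # fraction of the way along this segment
                points.append((a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1])))
                break
            d -= length
        else:
            # Guard against floating-point leftovers past the final segment
            points.append(tuple(coords[-1]))
    return points

# Boundary of a long, thin rectangle (a closed ring)
ring = [(0, 0), (10, 0), (10, 1), (0, 1), (0, 0)]
border_points = sample_along_line(ring, 200, seed=0)
```

Because the two long sides account for 20 of the 22 units of perimeter, most of the sampled points should fall on them: longer stretches of a boundary receive proportionally more points.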

If you want to separate out the individual sampled points, use the `.explode()` method on the result:

In [None]:
manhattan_border_points.explode().reset_index(drop=True).head()

## Building fixed grids over an entire GeoDataFrame

By default, points are sampled within each geometry from what's known as a *Poisson point process* ([Wikipedia](https://en.wikipedia.org/wiki/Poisson_point_process)). Practically speaking, this means that the locations of points are drawn uniformly at random from within the geometry, and their locations do not depend on one another. 
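As a rough illustration of what "uniformly at random and independently" means, the following plain-Python sketch uses rejection sampling: draw candidates uniformly in the bounding box and keep the ones that land inside the shape. This is only a sketch for intuition, not the GeoPandas implementation; the helper names and the L-shaped polygon are hypothetical.

```python
import random

def point_in_polygon(x, y, ring):
    """Ray-casting test: is (x, y) inside the polygon given by its vertex ring?"""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            # x-coordinate at which the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def sample_uniform(ring, n, seed=None):
    """Draw n independent, uniformly distributed points inside the polygon."""
    rng = random.Random(seed)
    xs = [p[0] for p in ring]
    ys = [p[1] for p in ring]
    minx, maxx, miny, maxy = min(xs), max(xs), min(ys), max(ys)
    points = []
    while len(points) < n:
        # Each candidate is drawn without reference to earlier points
        x = rng.uniform(minx, maxx)
        y = rng.uniform(miny, maxy)
        if point_in_polygon(x, y, ring):
            points.append((x, y))
    return points

# An L-shaped polygon: the square [0, 4] x [0, 4] with its upper-right part removed
l_shape = [(0, 0), (4, 0), (4, 1), (1, 1), (1, 4), (0, 4)]
points = sample_uniform(l_shape, 200, seed=0)
```

Since every candidate is drawn independently, nothing stops two points from landing very close together, which is why random samples look "clumpier" than grids.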

A different kind of sampling can use *gridding*, where points are spaced evenly across a shape. We support two kinds of grids, "square" and "hexagonal". We can build both of them to cover an entire study area using the `geopandas.tools.grids.make_grid()` function:

In [None]:
from geopandas.tools import grids

In [None]:
squaregrid_cover = grids.make_grid(nybb, size=30, method='square')

For a square grid, points are arranged in squares. By default, grids are also clipped to the extent of the shapes being gridded:

In [None]:
squaregrid_cover.explore(color='red')

For a "hexagonal" grid, points are arranged in hexagons (or triangles, depending on your perspective): 

In [None]:
hexgrid_cover = grids.make_grid(nybb, size=30, method='hex')
hexgrid_cover.explore(color='blue')
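To see how the two layouts differ geometrically, here is a small plain-Python sketch that generates square and hexagonal grid points over a bounding box. It is an illustration only and does not reproduce `make_grid()`: the function names and the `spacing` parameter (distance between neighbouring points) are assumptions, and no clipping to the shapes is done.

```python
import math

def square_grid(minx, miny, maxx, maxy, spacing):
    """Points on a square lattice: rows and columns are both `spacing` apart."""
    nx = int((maxx - minx) // spacing) + 1
    ny = int((maxy - miny) // spacing) + 1
    return [(minx + i * spacing, miny + j * spacing)
            for j in range(ny) for i in range(nx)]

def hex_grid(minx, miny, maxx, maxy, spacing):
    """Points on a hexagonal (triangular) lattice.

    Rows are spacing * sqrt(3) / 2 apart and every other row is shifted
    by half a spacing, so each point is `spacing` from its nearest neighbours.
    """
    row_height = spacing * math.sqrt(3) / 2
    ny = int((maxy - miny) / row_height) + 1
    points = []
    for j in range(ny):
        offset = spacing / 2 if j % 2 else 0.0
        nx = int((maxx - minx - offset) / spacing) + 1
        for i in range(nx):
            points.append((minx + offset + i * spacing, miny + j * row_height))
    return points

square_points = square_grid(0, 0, 4, 3, spacing=1)
hex_points = hex_grid(0, 0, 4, 3, spacing=1)
```

The half-spacing shift on alternating rows is what produces the hexagons (or triangles) visible in the map above.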

To sample a grid like this from *within* a shape, you can use the `.sample_points()` method with the option `method='grid'`. This grids *each* shape separately, and supports similar options to `geopandas.tools.grids.make_grid()` above. However, this time, the style of grid is controlled using a `tile` argument, since `method` is used to choose between "random" and "grid":

In [None]:
nybb.sample_points(method='grid').explore()

Note how the density of the grid varies by shape in this case. The `sample_points()` method attempts to sample the same "size" of grid for each shape, so the points in smaller shapes are packed more closely together.

In [None]:
nybb.sample_points(method='grid', tile='hex').explore()
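The arithmetic behind the varying density can be sketched as follows. This assumes, purely for illustration, that the grid "size" means the number of grid cells along the longest side of each shape's bounding box; the text above does not define it precisely, and the helper `grid_spacing` is hypothetical.

```python
def grid_spacing(bounds, size):
    """Spacing needed to fit `size` cells along the longest side (assumed meaning of size)."""
    minx, miny, maxx, maxy = bounds
    return max(maxx - minx, maxy - miny) / size

large_bounds = (0, 0, 100, 50)  # a large shape's bounding box
small_bounds = (0, 0, 10, 5)    # the same proportions, ten times smaller

size = 10
spacing_large = grid_spacing(large_bounds, size)
spacing_small = grid_spacing(small_bounds, size)

# Points per unit area scale with 1 / spacing**2
density_large = 1 / spacing_large ** 2
density_small = 1 / spacing_small ** 2
```

Under this assumption both shapes receive roughly the same number of points, so the shape that is ten times smaller ends up with points ten times closer together.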

## Sampling random grids over polygons

In addition to so-called "fixed" grids, which are the same every time they are sampled, GeoPandas also supports "random" grids. These are given a different random shift and rotation each time they are sampled.

For example, we can build two "random" grids over New York City using the same `sample_points()` method, but requesting `method='random'` and setting a `tile` shape:

In [None]:
sample_1 = nybb.sample_points(method='random', tile='square')
sample_2 = nybb.sample_points(method='random', tile='square')

Then, to see that the two grids are distinct, we can overlay them:

In [None]:
m = sample_1.explore(color='red', style_kwds=dict(opacity=.4))
sample_2.explore(m=m, color='blue', style_kwds=dict(opacity=.4))
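The "shift and rotation" idea can be sketched in plain Python: build a square lattice around the centre of a bounding box, move it by a random offset, rotate it by a random angle, and keep the points that fall inside the box. This is an illustrative sketch under those assumptions, not the GeoPandas implementation; the function name and `spacing` parameter are hypothetical.

```python
import math
import random

def random_square_grid(minx, miny, maxx, maxy, spacing, seed=None):
    """Square lattice with a random shift and rotation, kept where it falls in the box."""
    rng = random.Random(seed)
    angle = rng.uniform(0, math.pi / 2)  # a square lattice repeats every 90 degrees
    dx = rng.uniform(0, spacing)         # random shift of less than one cell
    dy = rng.uniform(0, spacing)
    cx, cy = (minx + maxx) / 2, (miny + maxy) / 2
    # Build enough lattice to cover the box however it is rotated
    radius = math.hypot(maxx - minx, maxy - miny) / 2 + spacing
    k = math.ceil(radius / spacing)
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    points = []
    for i in range(-k, k + 1):
        for j in range(-k, k + 1):
            # Shifted lattice coordinates relative to the centre
            u = i * spacing + dx
            v = j * spacing + dy
            # Rotate about the centre of the box
            x = cx + u * cos_a - v * sin_a
            y = cy + u * sin_a + v * cos_a
            if minx <= x <= maxx and miny <= y <= maxy:
                points.append((x, y))
    return points

grid_a = random_square_grid(0, 0, 10, 10, spacing=1, seed=1)
grid_b = random_square_grid(0, 0, 10, 10, spacing=1, seed=2)
```

The two grids have the same spacing but different orientations and offsets, which is the effect the overlay above makes visible.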

To sample random grids from over an entire study area, use the bounding box of the GeoDataFrame:

In [None]:
from shapely.geometry import box

In [None]:
bounds = box(*nybb.total_bounds)
nybb_bbox = geopandas.GeoSeries(bounds, crs=nybb.crs)

In [None]:
sample_1 = nybb_bbox.sample_points(method='random', tile='hex', size=30)
sample_2 = nybb_bbox.sample_points(method='random', tile='hex', size=30)

In [None]:
m = sample_1.explore(color='red', style_kwds=dict(opacity=.4))
sample_2.explore(m=m, color='blue', style_kwds=dict(opacity=.4))

## Sampling from more complicated point pattern processes

Finally, the `sample_points()` method can use different sampling processes than those described above, so long as they are implemented in the `pointpats` package for spatial point pattern analysis. For example, a "cluster Poisson" process (`method='cluster_poisson'`) is a spatially random cluster process where the "seeds" of clusters are chosen randomly, and then the points around these seeds are again distributed randomly.

To see what this looks like, consider the following, where points are distributed around four seeds within each of the boroughs in New York City, with `size=50` setting how many points are drawn:

In [None]:
sample_t = nybb.sample_points(method='cluster_poisson', size=50, n_seeds=4)

In [None]:
sample_t.explore()
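For intuition, a cluster process of this kind can be sketched in a few lines of plain Python: pick seeds uniformly at random, then scatter each point uniformly within a disc around a randomly chosen seed. This is a simplified illustration, not the `pointpats` implementation. The `size` and `n_seeds` names mirror the call above, while `cluster_radius` and the bounding-box arguments are assumptions made for this sketch, and points are not restricted to a geometry.

```python
import math
import random

def cluster_poisson_sketch(minx, miny, maxx, maxy, size, n_seeds, cluster_radius, seed=None):
    """Scatter `size` points around `n_seeds` randomly placed cluster seeds."""
    rng = random.Random(seed)
    # Step 1: cluster seeds are placed uniformly at random in the box
    seeds = [(rng.uniform(minx, maxx), rng.uniform(miny, maxy)) for _ in range(n_seeds)]
    points = []
    for _ in range(size):
        # Step 2: each point picks a seed at random...
        sx, sy = rng.choice(seeds)
        # ...and lands uniformly in a disc around it (sqrt keeps the density even)
        r = cluster_radius * math.sqrt(rng.random())
        theta = rng.uniform(0, 2 * math.pi)
        points.append((sx + r * math.cos(theta), sy + r * math.sin(theta)))
    return seeds, points

seeds, clustered = cluster_poisson_sketch(
    0, 0, 100, 100, size=50, n_seeds=4, cluster_radius=5, seed=0
)
```

Unlike the uniform samples earlier, these points depend on where the seeds fall, so they form visible clumps rather than spreading evenly.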

## Conclusion

- Use the `.sample_points()` method to build random samples, random grids, or fixed grids within the geometries of each row of a dataframe.
- Use `geopandas.tools.grids.make_grid()` to build grids that *cover* a given GeoPandas dataframe.