# Spatial Joins

## Introduction

Spatial joins are the bread-and-butter of spatial databases. They allow you to combine information from different tables by using spatial relationships as the join key. Much of what we think of as “standard GIS analysis” can be expressed as spatial joins.

## Installation

Uncomment the following cell to install the required packages if needed.

In [None]:
# %pip install duckdb leafmap

## Library Import and Configuration

In [2]:
import duckdb
import leafmap

## Sample Data

The datasets in the database are in NAD83 / UTM zone 18N projection, EPSG:26918.

In [3]:
url = "https://open.gishub.org/data/duckdb/nyc_data.db.zip"
leafmap.download_file(url, unzip=True)

nyc_data.db.zip already exists. Skip downloading. Set overwrite=True to overwrite.


'c:\\Users\\vance\\OneDrive\\1 Consulting\\Spatial\\geog-414-copy\\book\\duckdb\\nyc_data.db.zip'

## Connecting to DuckDB

Connect to DuckDB with `duckdb.connect`. You may connect either to an in-memory database or to a file-backed one; here we open the downloaded `nyc_data.db` file.

In [4]:
con = duckdb.connect('nyc_data.db')

In [5]:
con.install_extension('spatial')
con.load_extension('spatial')

In [6]:
con.sql("SHOW TABLES;")

┌─────────────────────┐
│        name         │
│       varchar       │
├─────────────────────┤
│ nyc_census_blocks   │
│ nyc_homicides       │
│ nyc_neighborhoods   │
│ nyc_streets         │
│ nyc_subway_stations │
└─────────────────────┘

## Intersection

In the previous section, we explored spatial relationships using a two-step process: first we extracted a subway station point for ‘Broad St’; then, we used that point to ask further questions such as “what neighborhood is the ‘Broad St’ station in?”

Using a spatial join, we can answer the question in one step, retrieving information about the subway station and the neighborhood that contains it. 
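
To make the idea of a spatial relationship acting as the join key concrete, here is a minimal pure-Python sketch of a nested-loop spatial join between points and polygons. It is only an illustration: the neighborhood and station names and coordinates are hypothetical, `point_in_polygon` and `spatial_join` are made-up helpers, and a real database would normally use a spatial index rather than testing every pair.

```python
# Toy nested-loop spatial join: pair each point with the polygon that contains it.
# All names and coordinates are hypothetical, not taken from nyc_data.db.

def point_in_polygon(x, y, ring):
    """Ray-casting test: True if (x, y) lies inside the ring of (x, y) vertices."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

neighborhoods = {
    "West Square": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "East Square": [(10, 0), (20, 0), (20, 10), (10, 10)],
}
stations = {"Station A": (3, 4), "Station B": (15, 2), "Station C": (30, 30)}

def spatial_join(stations, neighborhoods):
    """Return (station, neighborhood) pairs, like an inner join on a containment test."""
    return [
        (s_name, n_name)
        for s_name, (x, y) in stations.items()
        for n_name, ring in neighborhoods.items()
        if point_in_polygon(x, y, ring)
    ]

pairs = spatial_join(stations, neighborhoods)
print(pairs)
```

Station C falls outside both squares, so it drops out of the result, just as an inner join drops rows with no match. The ray-casting test does not handle points lying exactly on a boundary consistently, whereas `ST_Intersects` counts boundary contact as an intersection.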

Let's start by looking at the subway stations and neighborhoods separately.

In [6]:
# Coordinates are in UTM (meters). Data stored as latitude/longitude must be transformed to a projected CRS such as UTM before measuring distances.
con.sql("FROM nyc_neighborhoods SELECT * LIMIT 5;")

┌───────────┬──────────────────────┬───────────────────────────────────────────────────────────────────────────────────┐
│ BORONAME  │         NAME         │                                       geom                                        │
│  varchar  │       varchar        │                                     geometry                                      │
├───────────┼──────────────────────┼───────────────────────────────────────────────────────────────────────────────────┤
│ Brooklyn  │ Bensonhurst          │ MULTIPOLYGON (((582771.4257198056 4495167.427365481, 584651.2943549604 4497541.…  │
│ Manhattan │ East Village         │ MULTIPOLYGON (((585508.7534890148 4509691.267208001, 586826.3570590394 4508984.…  │
│ Manhattan │ West Village         │ MULTIPOLYGON (((583263.2776595836 4509242.626023987, 583276.8199068634 4509378.…  │
│ The Bronx │ Throggs Neck         │ MULTIPOLYGON (((597640.0090688139 4520272.719938631, 597647.7457808304 4520617.…  │
│ The Bronx │ Wakefield-Williams

In [7]:
con.sql("FROM nyc_subway_stations SELECT * LIMIT 5;")

┌──────────┬────────┬──────────────┬─────────────────┬───┬─────────┬─────────┬─────────┬──────────────────────┐
│ OBJECTID │   ID   │     NAME     │    ALT_NAME     │ … │  COLOR  │ EXPRESS │ CLOSED  │         geom         │
│  double  │ double │   varchar    │     varchar     │   │ varchar │ varchar │ varchar │       geometry       │
├──────────┼────────┼──────────────┼─────────────────┼───┼─────────┼─────────┼─────────┼──────────────────────┤
│      1.0 │  376.0 │ Cortlandt St │ NULL            │ … │ YELLOW  │ NULL    │ NULL    │ POINT (583521.8544…  │
│      2.0 │    2.0 │ Rector St    │ NULL            │ … │ RED     │ NULL    │ NULL    │ POINT (583324.4866…  │
│      3.0 │    1.0 │ South Ferry  │ NULL            │ … │ RED     │ NULL    │ NULL    │ POINT (583304.1823…  │
│      4.0 │  125.0 │ 138th St     │ Grand Concourse │ … │ GREEN   │ NULL    │ NULL    │ POINT (590250.1059…  │
│      5.0 │  126.0 │ 149th St     │ Grand Concourse │ … │ GREEN   │ express │ NULL    │ POINT (590454.7

In [8]:
# When the table is too wide to display every column, convert the result to a DataFrame.
# To list just the column names, append .columns:
# con.sql("SELECT * FROM nyc_subway_stations;").to_df().columns

con.sql("SELECT * FROM nyc_subway_stations;").to_df()

Unnamed: 0,OBJECTID,ID,NAME,ALT_NAME,CROSS_ST,LONG_NAME,LABEL,BOROUGH,NGHBHD,ROUTES,TRANSFERS,COLOR,EXPRESS,CLOSED,geom
0,1.0,376.0,Cortlandt St,,Church St,"Cortlandt St (R,W) Manhattan","Cortlandt St (R,W)",Manhattan,,"R,W","R,W",YELLOW,,,"[0, 0, 24, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0,..."
1,2.0,2.0,Rector St,,,Rector St (1) Manhattan,Rector St (1),Manhattan,,1,1,RED,,,"[0, 0, 24, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0,..."
2,3.0,1.0,South Ferry,,,South Ferry (1) Manhattan,South Ferry (1),Manhattan,,1,1,RED,,,"[0, 0, 24, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0,..."
3,4.0,125.0,138th St,Grand Concourse,Grand Concourse,"138th St / Grand Concourse (4,5) Bronx","138th St / Grand Concourse (4,5)",Bronx,,45,45,GREEN,,,"[0, 0, 24, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0,..."
4,5.0,126.0,149th St,Grand Concourse,Grand Concourse,149th St / Grand Concourse (4) Bronx,149th St / Grand Concourse (4),Bronx,,4,245,GREEN,express,,"[0, 0, 24, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0,..."
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
486,487.0,909.0,JFK Terminal 8,,,"JFK Terminal 8, Queens",JFK Terminal 8,Queens,,,,AIR-BLUE,,,"[0, 0, 24, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0,..."
487,488.0,903.0,Federal Circle,Rental Car,,"Federal Circle / Rental Car, Queens",Federal Circle / Rental Car,Queens,,,,AIR-BLUE,,,"[0, 0, 24, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0,..."
488,489.0,902.0,Long Term Parking,,,"Long Term Parking, Queens",Long Term Parking,Queens,,,,AIR-BLUE,,,"[0, 0, 24, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0,..."
489,490.0,901.0,Howard Beach,,159th Ave,"Howard Beach, Queens",Howard Beach,Queens,,,A,AIR-BLUE,,,"[0, 0, 24, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0,..."


Let's find out what neighborhood the `Broad St` station is in:

In [9]:
con.sql("""
SELECT
  subways.name AS subway_name,
  neighborhoods.name AS neighborhood_name,
  neighborhoods.boroname AS borough
FROM nyc_neighborhoods AS neighborhoods
JOIN nyc_subway_stations AS subways
ON ST_Intersects(neighborhoods.geom, subways.geom)
WHERE subways.NAME = 'Broad St';
""")

┌─────────────┬────────────────────┬───────────┐
│ subway_name │ neighborhood_name  │  borough  │
│   varchar   │      varchar       │  varchar  │
├─────────────┼────────────────────┼───────────┤
│ Broad St    │ Financial District │ Manhattan │
└─────────────┴────────────────────┴───────────┘

Note that the subway stations table has a `color` column.

In [10]:
con.sql("""
SELECT DISTINCT COLOR FROM nyc_subway_stations;
""")

┌───────────────┐
│     COLOR     │
│    varchar    │
├───────────────┤
│ RED-GREEN     │
│ BLUE-GREY     │
│ GREEN-ORANGE  │
│ GREEN-RED     │
│ BLUE-LIME     │
│ BLUE-BROWN    │
│ ORANGE        │
│ BROWN-ORANGE  │
│ AIR-BLUE      │
│ YELLOW        │
│  ·            │
│  ·            │
│  ·            │
│ GREY          │
│ BLUE          │
│ RED           │
│ BLUE-ORANGE   │
│ CLOSED        │
│ ORANGE-YELLOW │
│ BROWN         │
│ MULTI         │
│ GREY-ORANGE   │
│ PURPLE        │
├───────────────┤
│    29 rows    │
│  (20 shown)   │
└───────────────┘

Let's find out which neighborhoods the `RED` subway stations are in:

In [11]:
con.sql("""
SELECT
  subways.name AS subway_name,
  subways.express AS express,
  neighborhoods.name AS neighborhood_name,
  neighborhoods.boroname AS borough
FROM nyc_neighborhoods AS neighborhoods
JOIN nyc_subway_stations AS subways
ON ST_Intersects(neighborhoods.geom, subways.geom)
WHERE subways.color = 'RED';
""")

┌──────────────┬─────────┬──────────────────────────┬───────────┐
│ subway_name  │ express │    neighborhood_name     │  borough  │
│   varchar    │ varchar │         varchar          │  varchar  │
├──────────────┼─────────┼──────────────────────────┼───────────┤
│ 242nd St     │ NULL    │ Riverdale                │ The Bronx │
│ 241st St     │ NULL    │ Wakefield-Williamsbridge │ The Bronx │
│ 238th St     │ NULL    │ Kings Bridge             │ The Bronx │
│ 231st St     │ NULL    │ Kings Bridge             │ The Bronx │
│ 225th St     │ NULL    │ Inwood                   │ Manhattan │
│ 215th St     │ NULL    │ Inwood                   │ Manhattan │
│ 207th St     │ NULL    │ Inwood                   │ Manhattan │
│ Dyckman St   │ NULL    │ Washington Heights       │ Manhattan │
│ 191st St     │ NULL    │ Washington Heights       │ Manhattan │
│ 181st St     │ NULL    │ Washington Heights       │ Manhattan │
│    ·         │  ·      │    ·                     │     ·     │
│    ·    

In [12]:
con.sql("""
SELECT
  subways.name AS subway_name,
  subways.express AS express,
  neighborhoods.name AS neighborhood_name,
  neighborhoods.boroname AS borough
FROM nyc_neighborhoods AS neighborhoods
JOIN nyc_subway_stations AS subways
ON ST_Intersects(neighborhoods.geom, subways.geom)
WHERE subways.color = 'RED'
AND express = 'express';
""")

┌──────────────┬─────────┬────────────────────┬───────────┐
│ subway_name  │ express │ neighborhood_name  │  borough  │
│   varchar    │ varchar │      varchar       │  varchar  │
├──────────────┼─────────┼────────────────────┼───────────┤
│ 148th St     │ express │ Harlem             │ Manhattan │
│ 145th St     │ express │ Harlem             │ Manhattan │
│ 135th St     │ express │ Harlem             │ Manhattan │
│ 125th St     │ express │ Harlem             │ Manhattan │
│ 116th St     │ express │ Harlem             │ Manhattan │
│ 110th St     │ express │ Harlem             │ Manhattan │
│ 96th St      │ express │ Upper West Side    │ Manhattan │
│ 72nd St      │ express │ Upper West Side    │ Manhattan │
│ 59th St      │ express │ Upper West Side    │ Manhattan │
│ Times Sq     │ express │ Garment District   │ Manhattan │
│ 34th St      │ express │ Garment District   │ Manhattan │
│ 14th St      │ express │ Greenwich Village  │ Manhattan │
│ Chambers St  │ express │ Tribeca      

In [14]:
con.sql("""
FROM nyc_census_blocks SELECT * LIMIT 5;
""").df()

Unnamed: 0,BLKID,POPN_TOTAL,POPN_WHITE,POPN_BLACK,POPN_NATIV,POPN_ASIAN,POPN_OTHER,BORONAME,geom
0,360850009001000,97,51,32,1,5,8,Staten Island,"[5, 4, 184, 0, 0, 0, 0, 0, 55, 3, 13, 73, 151,..."
1,360850020011000,66,52,2,0,7,5,Staten Island,"[5, 4, 136, 0, 0, 0, 0, 0, 178, 58, 13, 73, 72..."
2,360850040001000,62,14,18,2,25,3,Staten Island,"[5, 4, 120, 0, 0, 0, 0, 0, 82, 227, 12, 73, 55..."
3,360850074001000,137,92,12,0,13,20,Staten Island,"[5, 4, 184, 0, 0, 0, 0, 0, 204, 85, 13, 73, 10..."
4,360850096011000,289,230,0,0,32,27,Staten Island,"[5, 4, 89, 0, 0, 0, 0, 0, 107, 247, 12, 73, 7,..."


## Distance Within

One of the common spatial operations is to find all the features within a certain distance of another feature. For example, you might want to find all the subway stations within 500 meters of a bike share station. Let’s explore the racial geography of New York using distance queries.
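
As a rough sketch of what a "within distance" test means for two points in a projected coordinate system, the check reduces to comparing a Euclidean distance against a threshold. The helper name `dwithin` and the coordinates below are hypothetical; for lines and polygons the database compares the minimum distance between the geometries, which this point-only sketch does not cover.

```python
import math

def dwithin(p, q, distance):
    """True if points p and q (x, y in meters) are at most `distance` meters apart."""
    return math.dist(p, q) <= distance

# Hypothetical projected coordinates in meters (for example UTM eastings/northings).
station = (583500.0, 4506700.0)
bike_docks = {
    "dock_1": (583800.0, 4507000.0),  # 300 m east and 300 m north, about 424 m away
    "dock_2": (584500.0, 4506700.0),  # 1000 m east
}

nearby = [name for name, xy in bike_docks.items() if dwithin(station, xy, 500)]
print(nearby)
```

The comparison is only meaningful when the coordinates are in a projected CRS with linear units such as meters; with longitude/latitude degrees the threshold would not be a distance on the ground.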

First, let’s get the baseline racial make-up of the city.

In [15]:
con.sql("""
SELECT
  100.0 * Sum(popn_white) / Sum(popn_total) AS white_pct,
  100.0 * Sum(popn_black) / Sum(popn_total) AS black_pct,
  100.0 * Sum(popn_asian) / Sum(popn_total) AS asian_pct,
        
  Sum(popn_total) AS popn_total
FROM nyc_census_blocks;
""")

┌───────────────────┬────────────────────┬───────────────────┬────────────┐
│     white_pct     │     black_pct      │     asian_pct     │ popn_total │
│      double       │       double       │      double       │   int128   │
├───────────────────┼────────────────────┼───────────────────┼────────────┤
│ 44.00395007628105 │ 25.546578900241613 │ 12.70171174865126 │    8175032 │
└───────────────────┴────────────────────┴───────────────────┴────────────┘

So, of the 8M people in New York, about 44% are recorded as “white” and 26% are recorded as “black”.
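
Note that the query divides one city-wide sum by another, which is not the same as averaging the percentage of each block. The small sketch below, using two hypothetical blocks, shows how far apart the two calculations can be when block populations differ.

```python
# Hypothetical blocks as (popn_total, popn_white); not real census values.
blocks = [(1000, 100), (10, 9)]

# What the SQL computes: a single ratio over the summed counts.
ratio_of_sums = 100.0 * sum(w for _, w in blocks) / sum(t for t, _ in blocks)

# A tempting alternative: average the per-block percentages.
mean_of_ratios = sum(100.0 * w / t for t, w in blocks) / len(blocks)

print(round(ratio_of_sums, 2), mean_of_ratios)
```

The ratio of sums weights every person equally, while the mean of ratios weights every block equally, so a tiny block can dominate the second figure.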

Next we want to find the A-train. The field of interest is `routes` in the `nyc_subway_stations` table, and its values are a little complex.

In [16]:
con.sql("""
SELECT DISTINCT routes FROM nyc_subway_stations;
""")

┌────────────┐
│   ROUTES   │
│  varchar   │
├────────────┤
│ 4,5        │
│ 2          │
│ N,R        │
│ M,D        │
│ A,C,G      │
│ A,C,E,L    │
│ G,R,V      │
│ E,F,G,R,V  │
│ E,G,R,V    │
│ 2,5        │
│ ·          │
│ ·          │
│ ·          │
│ L          │
│ C          │
│ 4,5,6      │
│ L,N,Q,R,W  │
│ D,M        │
│ Q          │
│ B,D,E      │
│ E,F        │
│ E          │
│ NULL       │
├────────────┤
│  73 rows   │
│ (20 shown) │
└────────────┘

So to find the A-train, we will want any row in `routes` that has an ‘A’ in it. We can do this a number of ways, but here we will use the fact that `strpos(routes, 'A')` returns a non-zero number only if ‘A’ is in the `routes` field.

In [17]:
# strpos returns the position of the first occurrence of a substring in a string
con.sql("""
SELECT DISTINCT routes
FROM nyc_subway_stations AS subways
WHERE strpos(subways.routes,'A') > 0;
""")

┌─────────┐
│ ROUTES  │
│ varchar │
├─────────┤
│ A,C     │
│ A,B,C,D │
│ A,C,G   │
│ A,C,E,L │
│ A       │
│ A,C,F   │
│ A,C,E   │
│ A,S     │
│ A,B,C   │
└─────────┘
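
For readers more at home in Python, the filter above behaves roughly like the sketch below. The `strpos` helper is a hypothetical stand-in that mimics the SQL function's 1-based result, and the route strings are sample values.

```python
def strpos(text, search):
    """Mimic SQL strpos: 1-based position of `search` in `text`, or 0 if absent."""
    return text.find(search) + 1

routes = ["A,C,E", "4,5", "B,D,E", "A"]

# Substring match, as in the query above.
a_train = [r for r in routes if strpos(r, "A") > 0]

# Stricter alternative: compare whole comma-separated tokens.
a_train_tokens = [r for r in routes if "A" in r.split(",")]

print(a_train, a_train_tokens)
```

Both approaches agree here because every route name is a single character. A substring match would give false positives if one route name were contained in another, in which case the token comparison is the safer choice.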

Let’s summarize the racial make-up of the population within 200 meters of the A-train line.

In [18]:
# ST_DWithin keeps census blocks within 200 m of a subway station; strpos keeps only stations whose routes contain 'A'
con.sql("""
SELECT
  100.0 * Sum(popn_white) / Sum(popn_total) AS white_pct,
  100.0 * Sum(popn_black) / Sum(popn_total) AS black_pct,
  Sum(popn_total) AS popn_total
FROM nyc_census_blocks AS census
JOIN nyc_subway_stations AS subways
ON ST_DWithin(census.geom, subways.geom, 200)
WHERE strpos(subways.routes,'A') > 0;
""")

┌───────────────────┬───────────────────┬────────────┐
│     white_pct     │     black_pct     │ popn_total │
│      double       │      double       │   int128   │
├───────────────────┼───────────────────┼────────────┤
│ 45.59012559002023 │ 22.09362356709373 │     189824 │
└───────────────────┴───────────────────┴────────────┘

So the racial make-up along the A-train isn’t radically different from the make-up of New York City as a whole.

## Advanced Join

In the last section we saw that the A-train didn’t serve a population that differed much from the racial make-up of the rest of the city. Are there any trains that have a non-average racial make-up?

To answer that question, we’ll add another join to our query so that we can calculate the make-up of many subway lines at once. To do that, we’ll need to create a new table that enumerates all the lines we want to summarize.

In [19]:
# route char(1) is a single character column
con.sql("""
CREATE OR REPLACE TABLE subway_lines ( route char(1) );
INSERT INTO subway_lines (route) VALUES
  ('A'),('B'),('C'),('D'),('E'),('F'),('G'),
  ('J'),('L'),('M'),('N'),('Q'),('R'),('S'),
  ('Z'),('1'),('2'),('3'),('4'),('5'),('6'),
  ('7');
""")

In [20]:
con.sql("FROM subway_lines;")

┌─────────┐
│  route  │
│ varchar │
├─────────┤
│ A       │
│ B       │
│ C       │
│ D       │
│ E       │
│ F       │
│ G       │
│ J       │
│ L       │
│ M       │
│ N       │
│ Q       │
│ R       │
│ S       │
│ Z       │
│ 1       │
│ 2       │
│ 3       │
│ 4       │
│ 5       │
│ 6       │
│ 7       │
├─────────┤
│ 22 rows │
└─────────┘
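
The list of lines was typed in by hand above. As an alternative, here is a hedged sketch of how the distinct line names could be derived by splitting comma-separated route strings. The sample `routes` values are hypothetical stand-ins for the column contents, including a missing value.

```python
# Sample route strings; None stands in for a NULL in the routes column.
routes = ["A,C,E", "4,5", None, "N,R", "A"]

# Split each non-empty value on commas and collect the distinct tokens.
lines = sorted({token.strip() for r in routes if r for token in r.split(",")})

print(lines)
```

A list built this way includes every token that appears in the data, so it may contain lines you do not want to summarize; the hand-written table keeps that choice explicit.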

Now we can join the table of subway lines onto our original query.

In [21]:
con.sql("""
SELECT
  lines.route,
  100.0 * Sum(popn_white) / Sum(popn_total) AS white_pct,
  100.0 * Sum(popn_black) / Sum(popn_total) AS black_pct,
  Sum(popn_total) AS popn_total
FROM nyc_census_blocks AS census
JOIN nyc_subway_stations AS subways
ON ST_DWithin(census.geom, subways.geom, 200)
JOIN subway_lines AS lines
ON strpos(subways.routes, lines.route) > 0
GROUP BY lines.route
ORDER BY black_pct DESC;
""")

┌─────────┬────────────────────┬────────────────────┬────────────┐
│  route  │     white_pct      │     black_pct      │ popn_total │
│ varchar │       double       │       double       │   int128   │
├─────────┼────────────────────┼────────────────────┼────────────┤
│ S       │ 39.839644455121466 │ 46.503108014774334 │      33301 │
│ 3       │  42.72731756087282 │  42.06198693548893 │     223047 │
│ 5       │  33.79377760724286 │  41.38562664729877 │     218919 │
│ 2       │  39.26304853922876 │  38.39114588512005 │     291661 │
│ C       │  46.87871806640494 │ 30.598767440098747 │     224411 │
│ 4       │  37.55300060572121 │ 27.428313466439615 │     174998 │
│ B       │  39.95588172248356 │ 26.852519457641385 │     256583 │
│ A       │  45.59012559002023 │  22.09362356709373 │     189824 │
│ J       │  37.62955269040576 │ 21.637651380013697 │     132861 │
│ Q       │  56.88447982881239 │  20.63141166844987 │     127112 │
│ Z       │  38.35718630567766 │  20.15700496952864 │      871

As before, the joins create a virtual table of all the possible combinations available within the constraints of the JOIN ON restrictions, and those rows are then fed into a GROUP BY summary. The spatial magic is in the `ST_DWithin` function, which ensures that only census blocks close to the appropriate subway stations are included in the calculation.
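
The "virtual table, then GROUP BY" description can be traced by hand with a small pure-Python sketch. Everything in it is hypothetical toy data: two census blocks, two stations, and a precomputed set of block/station pairs standing in for the result of the distance test.

```python
from collections import defaultdict

# Hypothetical toy data standing in for the three tables.
blocks = {"b1": {"total": 100, "black": 40}, "b2": {"total": 50, "black": 5}}
stations = {"s1": "A,C", "s2": "C"}
near = {("b1", "s1"), ("b2", "s1"), ("b2", "s2")}  # pairs that pass the distance test
lines = ["A", "C"]

# The two joins: every (block, station, line) combination satisfying both ON clauses.
joined = [
    (line, blocks[b])
    for (b, s) in sorted(near)
    for line in lines
    if line in stations[s]  # substring test, like strpos(routes, route) > 0
]

# GROUP BY line, followed by the Sum(...) aggregates.
totals = defaultdict(lambda: {"total": 0, "black": 0})
for line, block in joined:
    totals[line]["total"] += block["total"]
    totals[line]["black"] += block["black"]

black_pct = {line: 100.0 * t["black"] / t["total"] for line, t in totals.items()}
print(black_pct)
```

In this sketch block `b2` is near two stations on line C, so it contributes to that line's totals twice. The same thing happens in a join of this shape whenever a census block lies within the distance threshold of more than one station on the same line, which is worth keeping in mind when reading `popn_total`.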

## Projection

DuckDB provides the `ST_Transform` function to transform geometries from one projection to another. The function takes the geometry to transform, the code of the input geometry's projection (for example `'EPSG:4326'`), and the code of the projection to transform to. An optional fourth boolean argument, used in the example below, forces coordinates to be read in x, y (longitude, latitude) order.

In [22]:
# The geometry is stored as WKB in longitude/latitude; transform it to a projected CRS before measuring distances
url = 'https://open.gishub.org/data/duckdb/cities.parquet'
con.sql(f"SELECT * EXCLUDE geometry, ST_GeomFromWKB(geometry) AS geometry FROM '{url}'")

┌─────────┬────────┬───────────┬───────────┬──────────────────┬────────────┬─────────────────────────────┐
│ country │   id   │ latitude  │ longitude │       name       │ population │          geometry           │
│ varchar │ double │  double   │  double   │     varchar      │   double   │          geometry           │
├─────────┼────────┼───────────┼───────────┼──────────────────┼────────────┼─────────────────────────────┤
│ UGA     │    1.0 │    0.5833 │   32.5333 │ Bombo            │    75000.0 │ POINT (32.5333 0.5833)      │
│ UGA     │    2.0 │     0.671 │    30.275 │ Fort Portal      │    42670.0 │ POINT (30.275 0.671)        │
│ ITA     │    3.0 │    40.642 │    15.799 │ Potenza          │    69060.0 │ POINT (15.799 40.642)       │
│ ITA     │    4.0 │    41.563 │    14.656 │ Campobasso       │    50762.0 │ POINT (14.656 41.563)       │
│ ITA     │    5.0 │    45.737 │     7.315 │ Aosta            │    34062.0 │ POINT (7.315 45.737)        │
│ ALD     │    6.0 │    60.097 │    1

Let's convert the data from EPSG:4326 to EPSG:5070 (NAD 83 CONUS Albers).

In [23]:
# The fourth argument (true) forces x, y (lon, lat) axis order; EPSG:5070 is NAD83 / CONUS Albers
con.sql(f"""
SELECT * EXCLUDE geometry, ST_Transform(ST_GeomFromWKB(geometry), 'EPSG:4326', 'EPSG:5070', true) AS geometry FROM '{url}'
""")

┌─────────┬────────┬───────────┬───────────┬──────────────────┬────────────┬───────────────────────────────────────────┐
│ country │   id   │ latitude  │ longitude │       name       │ population │                 geometry                  │
│ varchar │ double │  double   │  double   │     varchar      │   double   │                 geometry                  │
├─────────┼────────┼───────────┼───────────┼──────────────────┼────────────┼───────────────────────────────────────────┤
│ UGA     │    1.0 │    0.5833 │   32.5333 │ Bombo            │    75000.0 │ POINT (11942089.442723729 7279926.80195…  │
│ UGA     │    2.0 │     0.671 │    30.275 │ Fort Portal      │    42670.0 │ POINT (11867630.052040448 6998929.07050…  │
│ ITA     │    3.0 │    40.642 │    15.799 │ Potenza          │    69060.0 │ POINT (7358230.68251168 6866592.8408855…  │
│ ITA     │    4.0 │    41.563 │    14.656 │ Campobasso       │    50762.0 │ POINT (7226148.687146555 6819079.567060…  │
│ ITA     │    5.0 │    45.737 │

In [26]:
# ST_AsText returns WKT with coordinates in x, y (lon, lat) order
df = con.sql(f"SELECT * EXCLUDE geometry, ST_AsText(ST_GeomFromWKB(geometry)) AS geometry FROM '{url}'").df()
df

Unnamed: 0,country,id,latitude,longitude,name,population,geometry
0,UGA,1.0,0.58330,32.53330,Bombo,75000.0,POINT (32.5333 0.5833)
1,UGA,2.0,0.67100,30.27500,Fort Portal,42670.0,POINT (30.275 0.671)
2,ITA,3.0,40.64200,15.79900,Potenza,69060.0,POINT (15.799 40.642)
3,ITA,4.0,41.56300,14.65600,Campobasso,50762.0,POINT (14.656 41.563)
4,ITA,5.0,45.73700,7.31500,Aosta,34062.0,POINT (7.315 45.737)
...,...,...,...,...,...,...,...
1244,BRA,1245.0,-22.92502,-43.22502,Rio de Janeiro,11748000.0,POINT (-43.22502 -22.92502)
1245,BRA,1246.0,-23.55868,-46.62502,Sao Paulo,18845000.0,POINT (-46.62502 -23.55868)
1246,AUS,1247.0,-33.92001,151.18518,Sydney,4630000.0,POINT (151.18518 -33.92001)
1247,SGP,1248.0,1.29303,103.85582,Singapore,5183700.0,POINT (103.85582 1.29303)


In [28]:
gdf = leafmap.df_to_gdf(df, src_crs='EPSG:4326', dst_crs='EPSG:5070')
gdf

Unnamed: 0,country,id,latitude,longitude,name,population,geometry
0,UGA,1.0,0.58330,32.53330,Bombo,75000.0,POINT (11942089.443 7279926.802)
1,UGA,2.0,0.67100,30.27500,Fort Portal,42670.0,POINT (11867630.052 6998929.071)
2,ITA,3.0,40.64200,15.79900,Potenza,69060.0,POINT (7358230.683 6866592.841)
3,ITA,4.0,41.56300,14.65600,Campobasso,50762.0,POINT (7226148.687 6819079.567)
4,ITA,5.0,45.73700,7.31500,Aosta,34062.0,POINT (6552378.112 6487242.066)
...,...,...,...,...,...,...,...
1244,BRA,1245.0,-22.92502,-43.22502,Rio de Janeiro,11748000.0,POINT (7516340.693 -2185094.428)
1245,BRA,1246.0,-23.55868,-46.62502,Sao Paulo,18845000.0,POINT (7101948.951 -2487683.329)
1246,AUS,1247.0,-33.92001,151.18518,Sydney,4630000.0,POINT (-13937208.783 4302569.879)
1247,SGP,1248.0,1.29303,103.85582,Singapore,5183700.0,POINT (-12084880.760 11316811.940)


## Function List

The DuckDB spatial extension documentation lists the available spatial relationship functions: <https://duckdb.org/docs/archive/0.9.2/extensions/spatial#spatial-relationships>

![](https://i.imgur.com/ogJojVX.png)

## References

- [Introduction to PostGIS - Spatial Joins](https://postgis.net/workshops/postgis-intro/joins.html)