# Export a Delta Lake table to GeoParquet with DuckDB

## Setup

In [0]:
%pip install duckdb --quiet

import duckdb

duckdb.sql("install spatial; load spatial")

In [0]:
CATALOG = "workspace"
SCHEMA = "default"
VOLUME = "default"

GEOMETRY_COLUMN = "geometry"

spark.sql(f"create volume if not exists {CATALOG}.{SCHEMA}.{VOLUME}")

Let's first create an example table with a GEOMETRY column:

In [0]:
%sql
create or replace table tmp_geometries as
select
  st_point(0, 0, 4326) as geometry,
  "Null Island" as name
union all
select
  st_transform(st_point(155000, 463000, 28992), 4326) as geometry,
  "Onze Lieve Vrouwetoren" as name
union all
select
  st_makepolygon(
    st_makeline(
      array(
        st_point(- 80.1935973, 25.7741566, 4326),
        st_point(- 64.7563086, 32.3040273, 4326),
        st_point(- 66.1166669, 18.4653003, 4326),
        st_point(- 80.1935973, 25.7741566, 4326)
      )
    )
  ) as geometry,
  "Bermuda Triangle" as name;

select
  *
from
  tmp_geometries
-- Returns:

-- _sqldf:pyspark.sql.connect.dataframe.DataFrame
-- geometry:geometry(OGC:CRS84)
-- name:string

-- geometry	name
-- SRID=4326;POINT(0 0)	Null Island
-- SRID=4326;POINT(5.3872035084137675 52.15517230119224)	Onze Lieve Vrouwetoren
-- SRID=4326;POLYGON((-80.1935973 25.7741566,-64.7563086 32.3040273,-66.1166669 18.4653003,-80.1935973 25.7741566))	Bermuda Triangle

We'll use DuckDB Spatial to write the GeoParquet file, so first we write the Delta Lake table above to a directory of Parquet files, using lon/lat coordinates (EPSG:4326).

(You could also use Databricks [Temporary Table Credentials API](https://docs.databricks.com/api/workspace/temporarytablecredentials) to directly read the Delta Lake table with the DuckDB [Delta Extension](https://duckdb.org/docs/stable/core_extensions/delta.html) instead.)

In [0]:
from pyspark.sql import functions as F

spark.table("tmp_geometries").withColumn(
    "geometry", F.expr("st_transform(geometry, 4326)")
).write.mode("overwrite").parquet(f"/Volumes/{CATALOG}/{SCHEMA}/{VOLUME}/geometries.parquet")
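
Spark writes the output as a directory rather than a single file, and alongside the `part-*` data files it typically leaves marker files such as `_SUCCESS`. The sketch below uses only the standard library and a made-up directory listing (the file names are illustrative assumptions, not real output) to show why a `part-*.parquet` glob is a convenient way to pick out just the data files.

```python
from fnmatch import fnmatch

# Hypothetical listing of the Spark output directory; real names will differ.
listing = [
    "_SUCCESS",
    "_committed_1234567890",
    "_started_1234567890",
    "part-00000-tid-1234567890-c000.snappy.parquet",
    "part-00001-tid-1234567890-c000.snappy.parquet",
]

# Keep only the Parquet data files, skipping Spark's marker files.
data_files = [name for name in listing if fnmatch(name, "part-*.parquet")]

for name in data_files:
    print(name)
```

The same pattern is what the DuckDB `read_parquet` call further down relies on.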

Now we can use DuckDB to convert the Parquet files into a valid GeoParquet file:

::: {.callout-note}

(Note that if you didn't load the DuckDB Spatial extension, the query below would still succeed but GeoParquet metadata would _not_ be written.)

:::

In [0]:
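# Parse the WKB column into a DuckDB GEOMETRY. With the spatial extension loaded,
# copy ... (format parquet) also writes the GeoParquet metadata.
query = f"""
load spatial;
copy (
select
    * replace (st_geomfromwkb({GEOMETRY_COLUMN}) as {GEOMETRY_COLUMN})
from
    read_parquet('/Volumes/{CATALOG}/{SCHEMA}/{VOLUME}/geometries.parquet/part-*.parquet')
) to '/Volumes/{CATALOG}/{SCHEMA}/{VOLUME}/geometries_geo.parquet' (format parquet)"""
duckdb.sql(query)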
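
For readers unfamiliar with WKB, here is a small standard-library sketch of the byte layout that a function like `st_geomfromwkb` parses, shown for the simplest case of a 2D point. It assumes the Parquet column holds plain ISO WKB, as the query above implies; the helper names `encode_wkb_point` and `decode_wkb_point` are made up for illustration and are not part of DuckDB or Spark.

```python
import struct


def encode_wkb_point(x: float, y: float) -> bytes:
    """Encode a 2D point as little-endian WKB (hypothetical helper)."""
    # 1 byte: byte order (1 = little-endian)
    # 4 bytes: geometry type (1 = Point)
    # 16 bytes: x and y as float64
    return struct.pack("<BIdd", 1, 1, x, y)


def decode_wkb_point(wkb: bytes) -> tuple[float, float]:
    """Decode a 2D WKB point back into (x, y) (hypothetical helper)."""
    endian = "<" if wkb[0] == 1 else ">"
    geom_type, x, y = struct.unpack(endian + "Idd", wkb[1:21])
    if geom_type != 1:
        raise ValueError(f"expected a Point (type 1), got type {geom_type}")
    return x, y


null_island = encode_wkb_point(0.0, 0.0)
print(len(null_island), null_island.hex())
print(decode_wkb_point(null_island))
```

Lines and polygons follow the same idea, with a point count (and ring count for polygons) placed before the coordinates.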

There are more details to writing GeoParquet, such as writing custom CRSs or defining a ["covering"](https://geoparquet.org/releases/v1.1.0/) using bounding boxes, but the file written above is already valid GeoParquet. For example, if your QGIS build supports the Parquet format (as of Aug 2025, the latest Windows version does but the latest macOS version doesn't), you can open this file in QGIS:

![geoparquet in qgis](img/geoparquet_qgis.png)

(In fact, the [GDAL Parquet reader](https://gdal.org/en/stable/drivers/vector/parquet.html) used by QGIS can even open Parquet files that are not valid GeoParquet, as long as they have a WKB or WKT column and the column name and CRS either match the expected defaults or are specified correctly.)
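
To make the "covering" and CRS remarks above more concrete: GeoParquet stores its metadata as a JSON document under the `geo` key of the Parquet file metadata. The following is a minimal, standard-library-only sketch of what such a document could look like for the three example geometries under GeoParquet 1.1.0. It assumes a per-row struct column named `bbox`, which is a hypothetical column that the file written above does not contain. The code only builds and prints the JSON; it is not necessarily what DuckDB emits.

```python
import json

# (lon, lat) coordinates of the three example geometries.
geometries = {
    "Null Island": [(0.0, 0.0)],
    "Onze Lieve Vrouwetoren": [(5.3872035084137675, 52.15517230119224)],
    "Bermuda Triangle": [
        (-80.1935973, 25.7741566),
        (-64.7563086, 32.3040273),
        (-66.1166669, 18.4653003),
        (-80.1935973, 25.7741566),
    ],
}


def bounds(coords):
    """Return the bounding box of a list of (x, y) pairs."""
    xs, ys = zip(*coords)
    return {"xmin": min(xs), "ymin": min(ys), "xmax": max(xs), "ymax": max(ys)}


# Per-row boxes: the values a "bbox" covering column would hold.
row_bboxes = {name: bounds(coords) for name, coords in geometries.items()}

# File-level box over all geometries.
total = bounds([pt for coords in geometries.values() for pt in coords])

geo_metadata = {
    "version": "1.1.0",
    "primary_column": "geometry",
    "columns": {
        "geometry": {
            "encoding": "WKB",
            "geometry_types": ["Point", "Polygon"],
            # No "crs" key: the spec then defaults to OGC:CRS84 (lon/lat).
            "bbox": [total["xmin"], total["ymin"], total["xmax"], total["ymax"]],
            # Points readers at the (assumed) per-row "bbox" struct column.
            "covering": {
                "bbox": {
                    "xmin": ["bbox", "xmin"],
                    "ymin": ["bbox", "ymin"],
                    "xmax": ["bbox", "xmax"],
                    "ymax": ["bbox", "ymax"],
                }
            },
        }
    },
}

print(json.dumps(geo_metadata, indent=2))
```

A reader that understands the covering can skip row groups whose bounding boxes fall outside a query window without decoding any geometries.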

## Cleanup

In [0]:
%sql
drop table tmp_geometries