# Storing spatial data in Delta Lake as WKB

You can store geometry or geography data in a Delta Lake table in a [BINARY](https://docs.databricks.com/aws/en/sql/language-manual/data-types/binary-type) column as [well-known binary](https://en.wikipedia.org/wiki/Well-known_text_representation_of_geometry#Well-known_binary) (WKB or EWKB). This representation is more compact than well-known text (WKT) and is widely supported, including by the GeoParquet specification. On the other hand, unlike the newer [GEOMETRY](https://docs.databricks.com/aws/en/sql/language-manual/data-types/geometry-type) and GEOGRAPHY types, a BINARY column carries no spatial semantics of its own. You also need to convert the values with a function such as [st_geomfromwkb](https://docs.databricks.com/gcp/en/sql/language-manual/functions/st_geomfromwkb) or [st_geomfromewkb](https://docs.databricks.com/gcp/en/sql/language-manual/functions/st_geomfromewkb) before applying any other ST function.

In [0]:
%sql
create temporary view t_ewkb as
select
  st_asewkb(st_point(3, 0, 4326)) wkb_geometry

In [0]:
%sql
select
  st_astext(st_geomfromewkb(wkb_geometry)) wkt,
  st_srid(st_geomfromewkb(wkb_geometry)) srid,
  wkb_geometry
from
  t_ewkb
-- Returns:
--   wkt	srid	wkb_geometry
-- POINT(3 0)	4326	AQEAACDmEAAAAAAAAAAACEAAAAAAAAAAAA==
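
To see what the stored bytes contain, the value in the output above can be decoded by hand. The following is a minimal sketch in plain Python, assuming the notebook renders BINARY values as base64 and that the value is a 2D point; `parse_ewkb_point` is a hypothetical helper written for this illustration, not a Databricks function.

```python
import base64
import struct

# Base64 rendering of the EWKB value shown in the query output above.
EWKB_B64 = "AQEAACDmEAAAAAAAAAAACEAAAAAAAAAAAA=="

SRID_FLAG = 0x20000000  # EWKB sets this bit in the type word when an SRID follows
ZM_FLAGS = 0xC0000000   # EWKB flags for Z and M coordinates (not handled here)


def parse_ewkb_point(data: bytes) -> dict:
    """Decode a 2D WKB/EWKB point (hypothetical helper, points only)."""
    # Byte 0 is the byte order marker: 1 = little endian, 0 = big endian.
    endian = "<" if data[0] == 1 else ">"
    # Bytes 1-4 hold the geometry type plus the EWKB flag bits.
    (type_word,) = struct.unpack_from(endian + "I", data, 1)
    if type_word & ZM_FLAGS or (type_word & 0x1FFFFFFF) != 1:
        raise ValueError("this sketch only handles 2D points")
    offset = 5
    srid = None
    if type_word & SRID_FLAG:
        # EWKB only: a 4-byte SRID sits between the type word and the coordinates.
        (srid,) = struct.unpack_from(endian + "I", data, offset)
        offset += 4
    # Two 8-byte IEEE 754 doubles: x, then y.
    x, y = struct.unpack_from(endian + "dd", data, offset)
    return {"srid": srid, "x": x, "y": y}


point = parse_ewkb_point(base64.b64decode(EWKB_B64))
print(point)  # {'srid': 4326, 'x': 3.0, 'y': 0.0}
```

The only difference from plain WKB here is the flag bit in the type word and the four SRID bytes that follow it, which is why the EWKB variant can round-trip the SRID while plain WKB cannot.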

Another example of Delta Lake tables with WKB columns is the [Overture Maps datasets](https://marketplace.databricks.com/provider/dd56dcf4-cb70-449e-abad-c8038c0de3d9/CARTO) prepared by CARTO, available via the Databricks Marketplace. If you haven't yet, follow the previous link to add any of them (at no cost) to your catalog. For example, the query below uses [Divisions](https://dbc-63c92876-2d84.cloud.databricks.com/marketplace/consumer/listings/2b2d3511-55cf-493c-8224-f5c5103a8d74?o=3737817604111714) (borders of countries and other administrative divisions):

::: {.callout-caution}

The CARTO/Overture Maps tables are stored in `us-west-2` at the time of writing, so if you are _not_ using Databricks Free Edition and your workspace is in any other region, you will pay egress charges based on the amount of data you read.

:::

In [0]:
%sql
select
  names:primary as name,
  geometry
from
  carto_overture_maps_divisions.carto.division_area
where
  subtype = 'country'
-- Returns:
-- name	geometry
-- """Australia"""	AQYAAAAjAAAAAQMAAAABAAAA (truncated)
-- """Timor-Leste"""	AQYAAAAEAAAAAQMAAAABAAAA (truncated)
-- """Vanuatu"""	AQYAAABlAAAAAQMAAAABAAAA (truncated)
-- ... ...

These CARTO tables also show one pattern for organizing and clustering tables with geometries: they include the bounding box columns `__carto_xmin`, `__carto_xmax`, `__carto_ymin`, `__carto_ymax` and are clustered by these columns.
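
The idea behind bounding box columns is that a query for a region can first discard rows whose box cannot overlap the query window, using plain numeric comparisons, and decode the WKB only for the rows that remain. Below is a minimal sketch of that overlap test in plain Python. `BBox`, `bbox_overlaps`, the row names and all coordinates are hypothetical, chosen for illustration and not taken from the dataset. In SQL, the same four comparisons would go in a `where` clause on the bounding box columns.

```python
from typing import NamedTuple


class BBox(NamedTuple):
    """Axis-aligned bounding box (hypothetical type for this sketch)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def bbox_overlaps(a: BBox, b: BBox) -> bool:
    """True if the two boxes share at least one point."""
    return (
        a.xmin <= b.xmax
        and a.xmax >= b.xmin
        and a.ymin <= b.ymax
        and a.ymax >= b.ymin
    )


# Illustrative query window and per-row bounding boxes (made-up values).
query_window = BBox(xmin=-6.0, ymin=49.0, xmax=2.0, ymax=59.0)
rows = {
    "row_a": BBox(xmin=-8.0, ymin=50.0, xmax=-5.0, ymax=52.0),
    "row_b": BBox(xmin=110.0, ymin=-45.0, xmax=155.0, ymax=-10.0),
}

# Cheap prefilter: only the candidates need an exact geometry test afterwards.
candidates = [name for name, box in rows.items() if bbox_overlaps(box, query_window)]
print(candidates)  # ['row_a']
```

A box overlap is only a necessary condition, so an exact spatial predicate on the decoded geometry is still needed for the surviving rows.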

Another pattern would be to make use of spatial indexing such as [H3](https://docs.databricks.com/aws/en/sql/language-manual/sql-ref-h3-geospatial-functions).

## Example usage with ST functions

In [0]:
%sql
with countries as (
  select
    country,
    st_geogfromwkb(geometry) geography
  from
    carto_overture_maps_divisions.carto.division_area
  where
    subtype = 'country'
    and class = 'land'
    and country in ('GB', 'FR')
)
select
  country,
  st_area(geography) / 1e6 area_km2s
from
  countries
-- Returns:
-- country	area_km2s
-- FR	549231.6644010496
-- GB	244408.1099778328