# ES|QL concepts
## FOSS4G Europe - Mostar

July 2025

## Resources

* [Blog post announcement](https://www.elastic.co/blog/esql-elasticsearch-piped-query-language)
* [Documentation](https://www.elastic.co/docs/explore-analyze/query-filter/languages/esql)
* [Reference](https://www.elastic.co/docs/reference/query-languages/esql)
* Webinar: [ES|QL: Search. Aggregate. Transform. Visualize. All with one query](https://www.elastic.co/virtual-events/cd-esql-search-aggregate-transform-visualize-all-with-one-query)
* Technical blog posts in [Search Labs](https://www.elastic.co/search-labs/blog/category/esql)
* [Run this notebook in Google Colaboratory](https://colab.research.google.com/github/jsanz/foss4g_europe_lab/blob/main/02-esql.ipynb)

## Setup

In [1]:
# Install the required dependencies with pip
!pip install -qU elasticsearch geopandas

Start with the necessary imports and a couple of tweaks, then define an `esql` helper function that makes query results easier to inspect by returning them as a Pandas or GeoPandas dataframe.

In [2]:
import os
import io

import warnings

from elasticsearch import Elasticsearch
from elasticsearch import ElasticsearchWarning
from elasticsearch.exceptions import BadRequestError

import pandas as pd
import geopandas as gpd
from shapely import wkb

# Hide the warning raised when an ES|QL query has no LIMIT
warnings.filterwarnings('ignore', category=ElasticsearchWarning)

# Allow wide columns
pd.set_option('display.max_colwidth', None)

# Convert Well-known Binary to Text
def wkb_to_wkt(wkb_bytes):
    if wkb_bytes is None:
        return None
    try:
        return wkb.loads(wkb_bytes).wkt
    except Exception as e:
        print(f"Error converting WKB: {wkb_bytes} - {e}")
        return None

# Generate a Pandas Dataframe or a Geopandas Dataframe from an ES|QL query
def esql(query, geometry_col:str = "geometry", use_arrow:bool = False):
    try:
        # Query ES and create a Pandas Dataframe
        if use_arrow:
            es_response = client.esql.query(query=query.strip(), format="arrow", columnar=True)
            df = es_response.to_pandas()
        else:
            es_response = client.esql.query(query=query.strip(), format="csv")
            df = pd.read_csv(io.StringIO(str(es_response)))

        # Promote to a Geopandas Dataframe if the geometry column is present
        if geometry_col in df.columns:
            if use_arrow:
                # Arrow geometries are transferred as WKB
                df[geometry_col] = df[geometry_col].apply(wkb_to_wkt)
            gs = gpd.GeoSeries.from_wkt(df[geometry_col])
            gdf = gpd.GeoDataFrame(df, geometry=gs, crs="EPSG:4326")
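            if geometry_col != "geometry":
                # Drop the original WKT text column; the parsed geometries
                # live in the active "geometry" column
                gdf = gdf.drop(columns=geometry_col)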
            return gdf
        else:
            return df
    except BadRequestError as e:
        print("Something went wrong!")
        print(e.message)
        print("\r\n".join([c['reason'] for c in e.info['error']['root_cause']]))

Connect to Elasticsearch and print some cluster details.

In [3]:
# Login details
ES_URL=os.getenv("ES_URL","https://foss4ge-lab.es.us-central1.gcp.cloud.es.io")
KB_URL=os.getenv("KB_URL","https://foss4ge-lab.kb.us-central1.gcp.cloud.es.io")

# API key that allows reading indices
ES_APIKEY=os.getenv("ES_APIKEY", "YlhrdDlwY0JPaUxuOUVMNlpHWDI6TG1LMXNFQTZQOXlGZUg5bFppaHN0UQ==")

# Load the client
client = Elasticsearch(hosts=[ES_URL], api_key=ES_APIKEY)

# Check the client
if client.ping():
  print("Connected to Elasticsearch")
  c_info = client.info()
  is_serverless = c_info['version']['build_flavor'] == 'serverless'

  # Print some cluster details
  print(f"Elasticsearch URL: {ES_URL}")
  print(f"Node name: {c_info['name']}")
  print(f"Version: {c_info['version']['number'] if not is_serverless else 'serverless'}")
  print("Number of documents indexed: ", client.count(index="*")['count'])
else:
  print("Connection failed")


Connected to Elasticsearch
Elasticsearch URL: https://foss4ge-lab.es.us-central1.gcp.cloud.es.io
Node name: instance-0000000003
Version: 9.0.3
Number of documents indexed:  2321937


## Syntax and API

### Basic syntax

<https://www.elastic.co/docs/reference/query-languages/esql/esql-syntax>

An ES|QL query consists of a source command, which selects the data to retrieve, followed by a list of processing commands, each starting with the pipe `|` character.

```text
source-command
| processing-command1
| processing-command2
```
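
Since a query is just a source command followed by piped processing commands, it can also be assembled programmatically. This is a minimal sketch; `build_query` is a hypothetical helper written for this notebook, not part of any Elastic library.

```python
def build_query(source: str, *commands: str) -> str:
    """Join a source command and processing commands with pipes.

    Hypothetical helper, for illustration only.
    """
    lines = [source.strip()]
    # Each processing command goes on its own line, prefixed with a pipe
    lines.extend(f"| {command.strip()}" for command in commands)
    return "\n".join(lines)


query = build_query("FROM places-*", "KEEP name, category", "LIMIT 5")
print(query)
```

The resulting string can be passed to the `esql` helper defined in the Setup section.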

A query can contain single-line and multi-line comments.

```
source-command           // Single line comment
| processing-command1    // another comment
/*
a multi
line comment in between
processing commands
*/
| processing-command2
```

About string literals:

* String literals are double-quoted
* If a literal contains a double quote, triple quotes can be used instead

```text
ROW name = """Indiana "Indy" Jones"""
```
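
When a query is built from Python values, the quoting rule above can be applied with a small function. This is a sketch under stated assumptions: `esql_string` is a hypothetical helper, and it deliberately refuses values it cannot quote safely (backslashes, or double quotes at the edges) instead of trying to escape them.

```python
def esql_string(value: str) -> str:
    """Render a Python string as an ES|QL string literal (hypothetical helper)."""
    if "\\" in value:
        # Assumption: escape sequences are out of scope for this sketch
        raise ValueError("backslashes are not handled by this sketch")
    if '"' not in value:
        return f'"{value}"'
    if value.startswith('"') or value.endswith('"') or '"""' in value:
        # The triple-quoted form would be ambiguous for these values
        raise ValueError("value cannot be wrapped safely in triple quotes")
    return f'"""{value}"""'


print("ROW name = " + esql_string('Indiana "Indy" Jones'))
print("ROW b = " + esql_string("two"))
```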


### Query API

* [Documentation](https://www.elastic.co/docs/explore-analyze/query-filter/languages/esql-rest)
* [Reference](https://www.elastic.co/docs/api/doc/elasticsearch/group/endpoint-esql)


Elasticsearch exposes the `_query` endpoint to execute ES|QL queries, with a `format` parameter to select among output types such as `csv`, `tsv`, `txt`, `arrow`, or `json`.

With `curl`, a request looks like this:

```
curl -X POST \
  -H "Authorization: ApiKey $ES_APIKEY" \
  -H "Content-Type: application/json" \
  -d '{"query":"FROM places-* | STATS count = count(name) | LIMIT 1"}'\
  "$ES_URL/_query?pretty&format=txt"

     count     
---------------
230573
```

In the [Kibana Console](https://www.elastic.co/docs/explore-analyze/query-filter/tools/console):

```
POST /_query?format=txt
{
  "query": "FROM places-* | STATS count = count(name) | LIMIT 1"
}
```

Multiline queries can be sent using triple quotes:

```text
POST /_query?format=txt
{
  "query": """
  FROM places-*
  | STATS count = count(name)
  | LIMIT 1
  """
}
```

An ES|QL API request can also include a filter written in the Elasticsearch Query DSL:


```text
POST /_query?format=txt
{
  "query": """
  FROM places-*
  | STATS count = count(name)
  | LIMIT 1
  """,
  "filter": {
    "range": {
      "confidence": {
        "gte": 0.1,
        "lte": 1
      }
    }
  }
}
```
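
The same request can be prepared from Python with the standard library only. The sketch below builds the request but does not send it; `build_esql_request` is a hypothetical helper, and the URL and API key are placeholders to be replaced with `ES_URL` and `ES_APIKEY`.

```python
import json
import urllib.request
from urllib.parse import urlencode


def build_esql_request(es_url, api_key, query, fmt="txt", dsl_filter=None):
    """Build (but do not send) a POST request for the `_query` endpoint."""
    body = {"query": query}
    if dsl_filter is not None:
        # Optional Query DSL filter sent next to the ES|QL query
        body["filter"] = dsl_filter
    return urllib.request.Request(
        url=f"{es_url.rstrip('/')}/_query?{urlencode({'format': fmt})}",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"ApiKey {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_esql_request(
    "https://localhost:9200",  # placeholder URL
    "my-api-key",              # placeholder API key
    "FROM places-* | STATS count = count(name) | LIMIT 1",
    dsl_filter={"range": {"confidence": {"gte": 0.1, "lte": 1}}},
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing `request` to `urllib.request.urlopen` would send it to the cluster.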

Other API endpoints available:

* [`_query/async`](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-esql-async-query): start, stop, and get results asynchronously
* [`_query/queries`](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-esql-list-queries): get details about running queries

## ES|QL sources: `ROW`, `SHOW`, `FROM`

In [4]:
# Creating a row directly, useful to test functions
esql('ROW a = 1, b = "two", c = null')

Unnamed: 0,a,b,c
0,1,two,


In [5]:
# SHOW INFO returns the Elasticsearch version, build date, and hash
esql("SHOW INFO")

Unnamed: 0,version,date,hash
0,9.0.3,2025-06-18T22:09:56.772581489Z,cc7302afc8499e83262ba2ceaa96451681f0609d


In [6]:
# Basic query against all places indices
esql("FROM places-*")

Unnamed: 0,addresses.country,addresses.freeform,addresses.locality,addresses.postcode,addresses.region,alt_categories,brand,category,confidence,emails,geometry,id,name,phones,socials,source,updated,version,websites
0,AL,,Shkodër,,,pharmacy,,shopping,0.242152,,POINT (19.50835 42.06465),5d6e1da6-cb9e-4426-887f-a836e4f1728e,Partner T,,https://www.facebook.com/292322095037844,meta,2025-06-02T07:00:00.000Z,1,
1,AL,Bulevardi zogu i pare,Shkodër,4001,,restaurant,,fast_food_restaurant,0.337662,,POINT (19.50859 42.06464),a47b3111-9c5a-4258-a248-ef2c1a919ed5,vini_fast_food,3.556891e+11,https://www.facebook.com/108697217283038,meta,2025-06-02T07:00:00.000Z,1,
2,AL,"Ngjitur me FOTO KADIA, Sheshi Parruce",,4001,,professional_services,,real_estate,0.920482,,POINT (19.50839 42.06485),e2c9579f-3552-496e-8143-f72390e865a7,DANI Real Estate & more,3.556941e+11,https://www.facebook.com/104336115778521,meta,2025-06-02T07:00:00.000Z,1,
3,AL,Xhabije bulevardi zogu 1,Shkodër,hairdresser,,beauty_and_spa,,beauty_salon,0.396144,,POINT (19.50892 42.06486),20b194dd-c5a1-488f-8406-0a3779009862,Studio Esmeralda,,https://www.facebook.com/708994329153213,meta,2025-06-02T07:00:00.000Z,1,
4,AL,,,,,"[advertising_agency, flowers_and_gifts_shop]",,business_advertising,0.331395,,POINT (19.50892 42.06486),39d9d75f-0473-45e0-b377-0e1956b0efcf,OEL Design,3.556933e+11,https://www.facebook.com/1503227819987610,meta,2025-06-02T07:00:00.000Z,1,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
995,BR,"Avenida Duque de Caxias, 1540",Belém,66093-030,PA,,,appliance_repair_service,0.945926,,POINT (-48.45794 -1.42422),dc76e273-2363-4f04-977e-0a364d6ec634,Vacuomatic Maquinas e Embalagens,5.591324e+11,https://www.facebook.com/661427137357008,meta,2025-06-02T07:00:00.000Z,1,https://vacuomatic.com/
996,BR,"Avenida Duque de Caxias, 1546",Belém,66093-030,PA,,,convenience_store,0.337662,,POINT (-48.45773 -1.42419),dc5639c1-6b11-4783-ae8e-6bc03f1475f8,Depositodorondi,5.591981e+12,https://www.facebook.com/104434701179977,meta,2025-06-02T07:00:00.000Z,1,
997,BR,"Avenida Duque de Caxias, 175",Belém,66093-026,PA,"[fitness_trainer, martial_arts_club]",,gym,0.945926,,POINT (-48.457 -1.42475),256cb0e0-3c56-4923-8069-2cb81732d370,Radical Training Academia,,https://www.facebook.com/912764958810405,meta,2025-06-02T07:00:00.000Z,1,
998,BR,Passagem Augusto Numa Pinto,Belem do Pará,66123-190,PA,"[fitness_trainer, gym]",,martial_arts_club,0.754817,,POINT (-48.45718 -1.42459),ef18bd58-ab1f-44df-8536-1457b81fcdc8,Porto taekwondo,5.591983e+12,https://www.facebook.com/874383179371843,meta,2025-06-02T07:00:00.000Z,1,http://www.youtube.com/user/MrPortotaekwondo


## Control the output: `LIMIT`

By default, an ES|QL query returns at most `1000` rows. Use `LIMIT` to reduce that number.

In [7]:
# Basic query against all places indices,
# returning the first 5 rows (in no particular order)

esql("""
FROM places-*
| LIMIT 5
""")

Unnamed: 0,addresses.country,addresses.freeform,addresses.locality,addresses.postcode,addresses.region,alt_categories,brand,category,confidence,emails,geometry,id,name,phones,socials,source,updated,version,websites
0,ES,"Avinguda Alqueria de Mina, 3",Paiporta,46200,,professional_services,,energy_equipment_and_solution,0.909824,,POINT (-0.40806 39.42637),21999719-67e9-4fc3-9071-48959e7cd0d7,Plug and Play Energy,34960431153,https://www.facebook.com/111858700512041,meta,2025-06-02T07:00:00.000Z,1,http://www.pnp.energy/
1,ES,"Carretera a Benetússer, 66",Paiporta,46200,,"[automotive_parts_and_accessories, automotive_repair]",,car_dealer,0.337662,,POINT (-0.4076 39.42689),3a01ec81-15b0-45f8-8b21-f6f43896d2ec,Auto Villmon,1963974741,https://www.facebook.com/249819715212219,meta,2025-06-02T07:00:00.000Z,1,http://www.auto-villmon.es
2,ES,"Carretera a Benetússer, 68",Paiporta,46200,,home_and_garden,,carpenter,0.941538,,POINT (-0.40705 39.42682),2a965aa7-3b61-4f90-bb1f-69f928f3ecd6,Chapas Tarín e Hijos,34963975629,https://www.facebook.com/381282395384290,meta,2025-06-02T07:00:00.000Z,1,http://www.valenciaswood.com/
3,ES,"Carretera a Benetússer, 68",Paiporta,46200,,,,shopping,0.566292,,POINT (-0.40691 39.4268),28152cf9-f8fd-48b7-8ade-10ec2a85d5ac,Valencias Wood Luxury,34600415476,https://www.facebook.com/104516684646116,meta,2025-06-02T07:00:00.000Z,1,http://Valenciaswood.com/
4,ES,"Carretera a Benetússer, 43",Paiporta,46200,,"[automotive, motorsports_store]",,motorcycle_dealer,0.978451,,POINT (-0.40689 39.42715),2fecd47f-8ec8-446e-9c83-94c0fbd0dec8,Dubon Racing,34961265437,https://www.facebook.com/157476124339653,meta,2025-06-02T07:00:00.000Z,1,https://www.ktmdubonvalencia.es/


## Change the output with `KEEP`, `RENAME`, and `SORT`

In [8]:
# Rename a field and only return a limited set of fields
esql("""
FROM places-*
| RENAME name as title
| KEEP title, category
| LIMIT 5
""")

Unnamed: 0,title,category
0,Quest Parnell,hotel
1,Spa Parnell,beauty_salon
2,Hulena Architects Limited,architectural_designer
3,Ray White Taylor Rentals,property_management
4,Sal's Authentic NY Pizza,pizza_restaurant


In [9]:
# KEEP also establishes the order of the columns returned,
# sometimes relevant for post-processing in client code
esql("""
FROM places-*
| RENAME name as title
| KEEP category, title
| LIMIT 5
""")

Unnamed: 0,category,title
0,automotive_repair,Pit Stop New Lynn
1,tire_dealer_and_repair,A Grade Tuning Mechanical & Tyres
2,pet_services,Pet liner
3,car_dealer,Mike Vinsen Motors
4,automotive_repair,Automotive New Lynn


In [10]:
# Once renamed, the previous identifier is not available anymore
esql("""
FROM places-*
| RENAME name as title
| KEEP name, category
| LIMIT 5
""")

Something went wrong!
verification_exception
Found 1 problem
line 3:8: Unknown column [name]
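
The error follows from how the pipeline works: each command only sees the columns produced by the previous one. As a rough mental model in plain Python (not how Elasticsearch executes queries), with hypothetical `rename` and `keep` functions and one row taken from the earlier output:

```python
def rename(rows, old, new):
    """Model of `RENAME old AS new`: the old column name disappears."""
    return [
        {(new if key == old else key): value for key, value in row.items()}
        for row in rows
    ]


def keep(rows, *columns):
    """Model of `KEEP`: reject unknown columns, return them in the given order."""
    for row in rows:
        for column in columns:
            if column not in row:
                raise ValueError(f"Unknown column [{column}]")
    return [{column: row[column] for column in columns} for row in rows]


rows = [{"name": "Quest Parnell", "category": "hotel"}]
renamed = rename(rows, "name", "title")

# Works: `title` exists after the rename, and KEEP sets the column order
print(keep(renamed, "category", "title"))

# Fails: `name` no longer exists at this point of the pipeline
try:
    keep(renamed, "name", "category")
except ValueError as error:
    print(error)
```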


In [11]:
# Sort by one field ascending and by another descending
esql("""
FROM places-bosnia
| RENAME name AS title
| SORT category ASC, title DESC
| KEEP category, title
| LIMIT 5
""")

Unnamed: 0,category,title
0,abuse_and_addiction_treatment,Physio Ben
1,abuse_and_addiction_treatment,Odvikavanje
2,abuse_and_addiction_treatment,MedTim International
3,abuse_and_addiction_treatment,Laser centar
4,abuse_and_addiction_treatment,Klinika MedTiM


## Include metadata with `METADATA`

Use `METADATA` to get access to the `_index` and `_id`:

In [12]:
# Also retrieve the source index and document ID using the METADATA keyword
esql("""
FROM places-* METADATA _index, _id
| KEEP _index, _id, name, category
| LIMIT 5
""")

Unnamed: 0,_index,_id,name,category
0,places-auckland,0d9d2eed-4b0e-43c4-b506-045d9bcdc216,Pit Stop New Lynn,automotive_repair
1,places-auckland,766fe257-cefd-47ce-8ee4-721f61e00eba,A Grade Tuning Mechanical & Tyres,tire_dealer_and_repair
2,places-auckland,4d99e28a-707b-4581-bc40-22d5b3af6566,Pet liner,pet_services
3,places-auckland,a588996d-3a41-4af9-8cec-ad56efbf0dba,Mike Vinsen Motors,car_dealer
4,places-auckland,8467c445-10ed-4ec1-8e80-b959f9f8eca6,Automotive New Lynn,automotive_repair


## Filtering and processing

In [13]:
# A basic filter
esql("""
FROM places-* METADATA _index
| RENAME _index as dataset
| WHERE name LIKE "*Burger*"
    AND category IN ("restaurant", "burger_restaurant")
    AND confidence < 0.3
| SORT confidence DESC
| KEEP dataset, name, category, confidence
| LIMIT 5
""")

Unnamed: 0,dataset,name,category,confidence
0,places-bosnia,Burgers by Manzoni,burger_restaurant,0.296943
1,places-belem,Prime Burger food truck,burger_restaurant,0.296943
2,places-belem,Nick Burger,burger_restaurant,0.296943
3,places-belem,Purple Burgers,burger_restaurant,0.296943
4,places-valencia,TORO Burger Lounge,restaurant,0.296943


In [14]:
# STATS runs aggregations.
# After this count aggregation, only the aggregated column is available
esql("""
FROM ne_countries
| STATS counts = count(id)
""")

Unnamed: 0,counts
0,257


In [15]:
# When grouping by other fields, those are also available
# for further operations like sorting or filtering
esql("""
FROM ne_countries
| WHERE type in ("Country", "Sovereign country")
| STATS counts = count(id) BY continent
| WHERE counts > 30
| SORT continent
| KEEP continent, counts
| LIMIT 5
""")

Unnamed: 0,continent,counts
0,Africa,53
1,Asia,48
2,Europe,48


In [16]:
# Aggregate: count by more than one grouping field
esql("""
FROM ne_countries
| WHERE type not in ("Country", "Sovereign country")
| STATS counts = count(id) BY continent, type
| WHERE counts > 1
| SORT continent, type
| KEEP continent, type, counts
| LIMIT 50
""")

Unnamed: 0,continent,type,counts
0,Africa,Indeterminate,2
1,Asia,Dependency,3
2,Asia,Indeterminate,5
3,Europe,Disputed,2
4,North America,Dependency,12
5,North America,Indeterminate,2
6,Oceania,Dependency,12
7,Seven seas (open ocean),Dependency,5
8,South America,Indeterminate,2
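
Grouping collapses the input into one row per key, which is why a `WHERE` placed after `STATS` filters on the aggregated values. The plain-Python sketch below is only a mental model, not how Elasticsearch executes the query; it takes the rows of the output above and mimics a hypothetical follow-up `STATS total = SUM(counts) BY continent | WHERE total > 5 | SORT continent`.

```python
from collections import defaultdict

# (continent, type, counts) rows copied from the output above
rows = [
    ("Africa", "Indeterminate", 2),
    ("Asia", "Dependency", 3),
    ("Asia", "Indeterminate", 5),
    ("Europe", "Disputed", 2),
    ("North America", "Dependency", 12),
    ("North America", "Indeterminate", 2),
    ("Oceania", "Dependency", 12),
    ("Seven seas (open ocean)", "Dependency", 5),
    ("South America", "Indeterminate", 2),
]

# STATS total = SUM(counts) BY continent
totals = defaultdict(int)
for continent, _type, counts in rows:
    totals[continent] += counts

# WHERE total > 5 | SORT continent
result = sorted((continent, total) for continent, total in totals.items() if total > 5)
print(result)
```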


In [17]:
# Use EVAL to compute new fields
esql("""
FROM ne_countries
| WHERE gdp_md IS NOT NULL
    AND pop_est > 0
    AND type IN ("Country", "Sovereign country")
| EVAL gdp_pop = ROUND(( gdp_md * 1e6) / ( pop_est::double))::integer
| SORT gdp_pop DESC
| KEEP name, type, gdp_md, pop_est, gdp_pop
| LIMIT 10
""")

Unnamed: 0,name,type,gdp_md,pop_est,gdp_pop
0,Monaco,Sovereign country,7188,38964,184478
1,Liechtenstein,Sovereign country,6876,38019,180857
2,Luxembourg,Sovereign country,71104,619896,114703
3,Isle of Man,Country,7491,84584,88563
4,Macao,Country,53859,640445,84096
5,Switzerland,Sovereign country,703082,8574832,81994
6,Ireland,Sovereign country,388698,4941444,78661
7,Norway,Sovereign country,403336,5347896,75420
8,Iceland,Sovereign country,24188,361313,66945
9,United States of America,Country,21433226,328239523,65298
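
The `EVAL` expression can be checked by hand. The query scales `gdp_md` by `1e6`, which suggests the field is expressed in millions; under that assumption, the arithmetic in Python for three rows of the output above is:

```python
def gdp_per_capita(gdp_md: float, pop_est: int) -> int:
    """Mirror ROUND((gdp_md * 1e6) / pop_est::double)::integer."""
    # Python's round() treats exact .5 ties differently from ES|QL's ROUND,
    # which does not matter for the values used here
    return round((gdp_md * 1e6) / float(pop_est))


print(gdp_per_capita(7188, 38964))          # Monaco
print(gdp_per_capita(6876, 38019))          # Liechtenstein
print(gdp_per_capita(21433226, 328239523))  # United States of America
```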


## Joins

Traditionally, Elasticsearch has not been able to join datasets in a comfortable way for developers and analysts. This has changed with ES|QL and with the introduction of the `index.mode: lookup` setting.

More details in:

* [`LOOKUP JOIN` docs](https://www.elastic.co/docs/reference/query-languages/esql/esql-lookup-join)
* [`index.mode`](https://www.elastic.co/docs/reference/elasticsearch/index-settings/index-modules#index-mode-setting)

Our `ne_countries` index was created with that setting so we can now join that dataset with our places indices.

In [18]:
# Let's find outlier data in our places-* indices
esql(
"""
FROM places-*
// Aggregate all our places by country
| STATS counts = count(addresses.country) BY addresses.country

// Keep only the groups with more than 1 and fewer than 50 places
| WHERE counts > 1 AND counts < 50

// The join field must have the same name in both indices
| RENAME addresses.country AS iso_a2, counts AS places

// Run the LOOKUP JOIN by the iso_a2 field
| LOOKUP JOIN ne_countries ON iso_a2

// Merge records for repeated iso_a2 entries
// (SUM adds `places` once per matching ne_countries row, which is why AU exceeds 50)
| STATS places = SUM(places), names = VALUES(name) BY iso_a2

// Sort and print
| SORT places DESC
| KEEP iso_a2, places, names
"""
)

Unnamed: 0,iso_a2,places,names
0,AU,76,"[Australia, Indian Ocean Ter., Coral Sea Is., Ashmore and Cartier Is.]"
1,HU,29,Hungary
2,SA,25,Saudi Arabia
3,FR,20,"[France, Clipperton I.]"
4,US,15,United States of America
5,PL,15,Poland
6,SK,14,Slovakia
7,SI,12,Slovenia
8,GB,11,United Kingdom
9,TR,11,Turkey
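
`LOOKUP JOIN` emits one output row for every matching row in the lookup index, which explains why countries with several `ne_countries` entries end up with larger sums. The sketch below is a toy model in plain Python: the lookup rows come from the output above, while the pre-join count for `FR` (10) is an assumption inferred from the final value (20 across 2 matches).

```python
from collections import defaultdict

# Lookup rows copied from the output above: FR has two entries
lookup = [
    {"iso_a2": "FR", "name": "France"},
    {"iso_a2": "FR", "name": "Clipperton I."},
    {"iso_a2": "HU", "name": "Hungary"},
]

# Assumption: the FR count before the join is inferred, not shown in the notebook
places = [{"iso_a2": "FR", "places": 10}, {"iso_a2": "HU", "places": 29}]

# LOOKUP JOIN: every matching lookup row produces one output row.
# Rows without a match (kept with nulls by LOOKUP JOIN) are not modelled here.
joined = [
    {**left, "name": right["name"]}
    for left in places
    for right in lookup
    if left["iso_a2"] == right["iso_a2"]
]

# STATS places = SUM(places), names = VALUES(name) BY iso_a2
summed = defaultdict(int)
names = defaultdict(list)
for row in joined:
    summed[row["iso_a2"]] += row["places"]
    names[row["iso_a2"]].append(row["name"])

print(dict(summed))
print(dict(names))
```

In this toy model, taking the maximum of `places` per group instead of the sum would return the original per-country count.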


## Wrap up

This covers just the basics. There are plenty of functions and operators to process your datasets; please refer to the [documentation](https://www.elastic.co/docs/explore-analyze/query-filter/languages/esql) for further details.