# ES|QL concepts
## FOSS4G Europe - Mostar

July 2025

## Resources

* [Blog post announcement](https://www.elastic.co/blog/esql-elasticsearch-piped-query-language)
* [Documentation](https://www.elastic.co/docs/explore-analyze/query-filter/languages/esql)
* [Reference](https://www.elastic.co/docs/reference/query-languages/esql)
* Webinar: [ES|QL: Search. Aggregate. Transform. Visualize. All with one query](https://www.elastic.co/virtual-events/cd-esql-search-aggregate-transform-visualize-all-with-one-query)
* Technical blog posts in [Search Labs](https://www.elastic.co/search-labs/blog/category/esql)

## Setup

In [48]:
# Install required dependencies
!pip install -qU elasticsearch geopandas

Start with the necessary imports and a couple of tweaks, then define an `esql` helper function that makes query results easier to inspect by returning a Pandas DataFrame, or a GeoPandas GeoDataFrame when the result includes a geometry column.

In [49]:
import os
import io

import warnings

from elasticsearch import Elasticsearch
from elasticsearch import ElasticsearchWarning
from elasticsearch.exceptions import BadRequestError

import pandas as pd
import geopandas as gpd
from shapely import wkb

# Hide the warning raised when no LIMIT is passed in an ES|QL query
warnings.filterwarnings('ignore', category=ElasticsearchWarning)

# Allow wide columns
pd.set_option('display.max_colwidth', None)

# Convert Well-known Binary to Text
def wkb_to_wkt(wkb_bytes):
    if wkb_bytes is None:
        return None
    try:
        return wkb.loads(wkb_bytes).wkt
    except Exception as e:
        print(f"Error converting WKB: {wkb_bytes} - {e}")
        return None

# Generate a Pandas DataFrame or a GeoPandas GeoDataFrame from an ES|QL query
def esql(query, geometry_col:str = "geometry", use_arrow:bool = True):
    try:
        # Query ES and create a Pandas Dataframe
        if use_arrow:
            es_response = client.esql.query(query=query.strip(), format="arrow", columnar=True)
            df = es_response.to_pandas()
        else:
            es_response = client.esql.query(query=query.strip(), format="csv")
            df = pd.read_csv(io.StringIO(str(es_response)))

        # Promote to a GeoPandas GeoDataFrame if the geometry column is present
        if geometry_col in df.columns:
            if use_arrow:
                # Arrow geometries are transferred as WKB
                df[geometry_col] = df[geometry_col].apply(wkb_to_wkt)
            gs = gpd.GeoSeries.from_wkt(df[geometry_col])
            gdf = gpd.GeoDataFrame(df, geometry=gs, crs="EPSG:4326")
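            if geometry_col != "geometry":
                # The parsed geometries live in "geometry", so drop the
                # original text column (drop() returns a new frame)
                gdf = gdf.drop(columns=geometry_col)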
            return gdf
        else:
            return df
    except BadRequestError as e:
        print("Something went wrong!")
        print(e.message)
        print("\r\n".join([c['reason'] for c in e.info['error']['root_cause']]))

Connect to Elasticsearch and print some cluster details:

In [50]:
# Login details
ES_URL=os.getenv("ES_URL","https://foss4geurope.es.us-central1.gcp.cloud.es.io")
KB_URL=os.getenv("KB_URL","https://foss4geurope.kb.us-central1.gcp.cloud.es.io")

# API key that allows reading indices
ES_APIKEY=os.getenv("ES_APIKEY", "WkdPUjZKY0JhVEI4aFAyRmpWM186MmRvQVlLaGVwck1WbV9RSkdJT1N6UQ==")

# Load the client
client = Elasticsearch(hosts=[ES_URL], api_key=ES_APIKEY)
c_info = client.info()
is_serverless = c_info['version']['build_flavor'] == 'serverless'


# Print some cluster details
print(f"Elasticsearch URL: {ES_URL}")
print(f"Node name: {c_info['name']}")
print(f"Version: {c_info['version']['number'] if not is_serverless else 'serverless'}")
print("Number of documents indexed: ", client.count(index="*")['count'])

Elasticsearch URL: https://foss4geurope.es.us-central1.gcp.cloud.es.io
Node name: instance-0000000000
Version: 9.0.3
Number of documents indexed:  243006


## Syntax and API

### Basic syntax

<https://www.elastic.co/docs/reference/query-languages/esql/esql-syntax>

An ES|QL query is made of a source command that sets the data to retrieve, followed by a list of processing commands, each starting with the pipe `|` character.

```text
source-command
| processing-command1
| processing-command2
```
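Because a query is just a source command followed by piped processing commands, client code can assemble one from its parts. The following is a minimal sketch; `build_query` is a hypothetical helper written for this example, not part of the Elasticsearch client.

```python
def build_query(source, *commands):
    """Join a source command and processing commands with the pipe character."""
    lines = [source.strip()]
    lines += [f"| {command.strip()}" for command in commands]
    return "\n".join(lines)


query = build_query("FROM places-*", "KEEP name, category", "LIMIT 5")
print(query)
```

The resulting string has one command per line, which is the layout used in the rest of this notebook.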

A query can contain single-line and multi-line comments.

```
source-command           // Single line comment
| processing-command1    // another comment
/*
a multi
line comment in between
processing commands
*/
| processing-command2
```

About string literals:

* String literals are double-quoted
* If a literal needs to contain a double quote, it can be wrapped in triple quotes

```text
ROW name = """Indiana "Indy" Jones"""
```
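When queries are built from Python values, the quoting rule above can be wrapped in a small function. This is only a sketch: `esql_string` is a hypothetical helper, and it deliberately rejects the inputs it cannot quote safely (backslashes, three quotes in a row, or a trailing quote) rather than guessing at escape rules.

```python
def esql_string(value):
    """Quote a Python string as an ES|QL string literal (sketch with limits)."""
    if "\\" in value:
        raise ValueError("backslashes are not handled by this sketch")
    if '"' not in value:
        return f'"{value}"'
    # Triple quotes avoid escaping, but become ambiguous if the value contains
    # three quotes in a row or ends with a quote, so those cases are rejected.
    if '"""' in value or value.endswith('"'):
        raise ValueError("value needs escaping that this sketch does not handle")
    return f'"""{value}"""'


name = 'Indiana "Indy" Jones'
print("ROW name = " + esql_string(name))
```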


### Query API

* [Documentation](https://www.elastic.co/docs/explore-analyze/query-filter/languages/esql-rest)
* [Reference](https://www.elastic.co/docs/api/doc/elasticsearch/group/endpoint-esql)


Elasticsearch exposes the `_query` endpoint to execute ES|QL queries, with a `format` parameter to select the output type, for example `csv`, `tsv`, `arrow`, or `json`.

As a `curl` command, a request looks like this:

```
curl -X POST \
  -H "Authorization: ApiKey $ES_APIKEY" \
  -H "Content-Type: application/json" \
  -d '{"query":"FROM places-* | STATS count = count(name) | LIMIT 1"}'\
  "$ES_URL/_query?pretty&format=txt"

     count     
---------------
230573
```
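For readers who prefer Python over `curl`, the same request can be sketched with the standard library alone. `build_esql_request` is a hypothetical helper, and the URL and API key below are placeholders; the function only builds the request so that it can be inspected before anything is sent.

```python
import json
import urllib.request


def build_esql_request(es_url, api_key, query, fmt="txt"):
    """Build (but do not send) a POST request for the _query endpoint."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=f"{es_url.rstrip('/')}/_query?format={fmt}",
        data=body,
        method="POST",
        headers={
            "Authorization": f"ApiKey {api_key}",
            "Content-Type": "application/json",
        },
    )


request = build_esql_request(
    "https://localhost:9200",  # placeholder URL
    "my-api-key",              # placeholder API key
    "FROM places-* | STATS count = count(name) | LIMIT 1",
)
print(request.get_method(), request.full_url)
```

Sending it would then be a matter of calling `urllib.request.urlopen(request)` and decoding the response body.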

In the [Kibana Console](https://www.elastic.co/docs/explore-analyze/query-filter/tools/console):

```
POST /_query?format=txt
{
  "query": "FROM places-* | STATS count = count(name) | LIMIT 1"
}
```

Multi-line queries can be sent by wrapping them in triple quotes:

```text
POST /_query?format=txt
{
  "query": """
  FROM places-*
  | STATS count = count(name)
  | LIMIT 1
  """
}
```

An ES|QL API request can also include a filter written in the Elasticsearch Query DSL:


```text
POST /_query?format=txt
{
  "query": """
  FROM places-*
  | STATS count = count(name)
  | LIMIT 1
  """,
  "filter": {
    "range": {
      "confidence": {
        "gte": 0.1,
        "lte": 1
      }
    }
  }
}
```
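The request body above can also be assembled in Python as a plain dictionary. In this sketch `esql_body` is a hypothetical helper; the `confidence` field and its bounds are taken from the Console example.

```python
import json


def esql_body(query, field, gte, lte):
    """Combine an ES|QL query with a Query DSL range filter in one request body."""
    return {
        "query": query.strip(),
        "filter": {"range": {field: {"gte": gte, "lte": lte}}},
    }


body = esql_body(
    """
    FROM places-*
    | STATS count = count(name)
    | LIMIT 1
    """,
    field="confidence",
    gte=0.1,
    lte=1,
)
print(json.dumps(body, indent=2))
```

With the Python client used in this notebook, the two parts should map to the `query` and `filter` arguments of `client.esql.query`; check that against the client version you have installed.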

Other API endpoints available:

* [`_query/async`](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-esql-async-query): start, stop, and get results asynchronously
* [`_query/queries`](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-esql-list-queries): get details about running queries

## ES|QL sources: `ROW`, `SHOW`, `FROM`

In [51]:
# Creating a row directly, useful to test functions
esql('ROW a = 1, b = "two", c = null')

Unnamed: 0,a,b,c
0,1,two,


In [52]:
# The SHOW INFO source command returns the Elasticsearch version, build date, and hash
esql("SHOW INFO")

Unnamed: 0,version,date,hash
0,9.0.3,2025-06-18T22:09:56.772581489Z,cc7302afc8499e83262ba2ceaa96451681f0609d


In [53]:
# Basic query against all places indices
esql("FROM places-*")

Unnamed: 0,addresses.country,addresses.freeform,addresses.locality,addresses.postcode,addresses.region,alt_categories,brand,category,confidence,emails,geometry,id,name,phones,socials,source,updated,version,websites
0,BR,"Rua São Miguel, 1439",Belém,66065-695,PA,[beauty_and_spa],,beauty_salon,0.941538,,POINT (-48.47892 -1.46419),055bd127-0586-4a67-9a2a-1c222f2c08c1,Studio hair KAIRÓS,+5591981953171,https://www.facebook.com/2097663423785081,[meta],2025-06-02 07:00:00,1,
1,BR,"Passagem Teixeira, 235",Belém,66045-228,PA,[liquor_store],,beer_bar,0.883117,,POINT (-48.47836 -1.4651),d48822f2-387c-4b4f-b2b6-5afb0143684c,Cantinho Retro,,https://www.facebook.com/101333455169456,[meta],2025-06-02 07:00:00,1,
2,BR,rua são silvestre,Belém,,PA,[accommodation],,home_developer,0.492940,,POINT (-48.47767 -1.46515),f6af9253-ee62-4e82-a1e0-ae1ee05c4ab3,Vila Duque De Caxias Cremação,,https://www.facebook.com/1519295374952987,[meta],2025-06-02 07:00:00,1,
3,BR,Cremação,Belém,66045-590,PA,,,accommodation,0.492940,,POINT (-48.47828 -1.4643),0a41ff4a-446c-43b7-86b3-36170bd67a98,Bairro Cremação,,https://www.facebook.com/512180752244482,[meta],2025-06-02 07:00:00,1,
4,BR,"Passagem Teixeira, 105",Belém,66045-228,PA,,,professional_services,0.337662,,POINT (-48.47828 -1.46395),18636827-f732-4820-987a-58b419a35cca,AD Teixeira,,https://www.facebook.com/101695871672868,[meta],2025-06-02 07:00:00,1,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
995,BR,"Avenida Almirante Barroso, 5501",Belém,66645-250,PA,"[car_wash, energy_company]",,gas_station,0.978451,,POINT (-48.43726 -1.40811),a2abf0d4-e153-48ef-a186-1054eb8a62d4,Posto Shell,+559132385267,https://www.facebook.com/321430038024834,[meta],2025-06-02 07:00:00,1,http://www.shell.com.br/
996,BR,"Rodovia BR-316, S/N Km 1",,66080-710,,"[cafe, coffee_shop]",,fast_food_restaurant,0.296943,,POINT (-48.43693 -1.40816),ed58d5f4-555e-45c8-99f4-dca6fc388d63,Papa Xibé Food em Castanheira,+19132252776,https://www.facebook.com/379534509216590,[meta],2025-06-02 07:00:00,1,
997,BR,"Rod BR 316, s/n km 12",Belém,,PA,[business_management_services],,public_service_and_government,0.396144,,POINT (-48.43695 -1.40814),6d0cf9e3-60e5-484c-aa93-f3cd6e96e068,EMATER Empresa Assist Tec e Extensão Rural Geral,,https://www.facebook.com/179829788736502,[meta],2025-06-02 07:00:00,1,http://agriculturafamiliarater.blogspot.com/2011/07/queijos-maely-agroindustria-familiar.html
998,BR,"Rodovia br , 316, 1001",,66645-000,,,,mobile_phone_store,0.838462,,POINT (-48.43691 -1.40815),01369f82-4cbc-49f0-a8fe-85aac22ad8dd,Jóia Celular,+559132314532,https://www.facebook.com/103970045785617,[meta],2025-06-02 07:00:00,1,http://www.casadocelular.com.br/


## Control the output: `LIMIT`

By default, an ES|QL query returns at most `1000` rows. Use `LIMIT` to reduce that number.

In [54]:
# Basic query against all places indices,
# returning the first 5 rows (in no particular order)

esql("""
FROM places-*
| LIMIT 5
""")

Unnamed: 0,addresses.country,addresses.freeform,addresses.locality,addresses.postcode,addresses.region,alt_categories,brand,category,confidence,emails,geometry,id,name,phones,socials,source,updated,version,websites
0,AL,,Shkodër,,,[pharmacy],,shopping,0.242152,,POINT (19.50835 42.06465),5d6e1da6-cb9e-4426-887f-a836e4f1728e,Partner T,,https://www.facebook.com/292322095037844,meta,2025-06-02 07:00:00,1,
1,AL,Bulevardi zogu i pare,Shkodër,4001,,[restaurant],,fast_food_restaurant,0.337662,,POINT (19.50859 42.06464),a47b3111-9c5a-4258-a248-ef2c1a919ed5,vini_fast_food,355689118888.0,https://www.facebook.com/108697217283038,meta,2025-06-02 07:00:00,1,
2,AL,"Ngjitur me FOTO KADIA, Sheshi Parruce",,4001,,[professional_services],,real_estate,0.920482,,POINT (19.50839 42.06485),e2c9579f-3552-496e-8143-f72390e865a7,DANI Real Estate & more,355694054888.0,https://www.facebook.com/104336115778521,meta,2025-06-02 07:00:00,1,
3,AL,Xhabije bulevardi zogu 1,Shkodër,hairdresser,,[beauty_and_spa],,beauty_salon,0.396144,,POINT (19.50892 42.06486),20b194dd-c5a1-488f-8406-0a3779009862,Studio Esmeralda,,https://www.facebook.com/708994329153213,meta,2025-06-02 07:00:00,1,
4,AL,,,,,"[advertising_agency, flowers_and_gifts_shop]",,business_advertising,0.331395,,POINT (19.50892 42.06486),39d9d75f-0473-45e0-b377-0e1956b0efcf,OEL Design,355693310324.0,https://www.facebook.com/1503227819987610,meta,2025-06-02 07:00:00,1,


## Change the output with `KEEP`, `RENAME`, and `SORT`

In [55]:
# Rename a field and only return a limited set of fields
esql("""
FROM places-*
| RENAME name as title
| KEEP title, category
| LIMIT 5
""")

Unnamed: 0,title,category
0,Partner T,shopping
1,vini_fast_food,fast_food_restaurant
2,DANI Real Estate & more,real_estate
3,Studio Esmeralda,beauty_salon
4,OEL Design,business_advertising


In [56]:
# KEEP also establishes the order of the columns returned,
# sometimes relevant for post-processing in client code
esql("""
FROM places-*
| RENAME name as title
| KEEP category, title
| LIMIT 5
""")

Unnamed: 0,category,title
0,shopping,Partner T
1,fast_food_restaurant,vini_fast_food
2,real_estate,DANI Real Estate & more
3,beauty_salon,Studio Esmeralda
4,business_advertising,OEL Design


In [57]:
# Once renamed, the previous identifier is not available anymore
esql("""
FROM places-*
| RENAME name as title
| KEEP name, category
| LIMIT 5
""")

Something went wrong!
verification_exception
Found 1 problem
line 3:8: Unknown column [name]


In [58]:
# Sort ascending by one field and descending by another
esql("""
FROM places-bosnia
| RENAME name AS title
| SORT category ASC, title DESC
| KEEP category, title
| LIMIT 5
""")

Unnamed: 0,category,title
0,abuse_and_addiction_treatment,Physio Ben
1,abuse_and_addiction_treatment,Odvikavanje
2,abuse_and_addiction_treatment,MedTim International
3,abuse_and_addiction_treatment,Laser centar
4,abuse_and_addiction_treatment,Klinika MedTiM


## Include metadata with `METADATA`

Use `METADATA` to get access to the `_index` and `_id`:

In [59]:
# Also return the source index and document ID using the METADATA keyword
esql("""
FROM places-* METADATA _index, _id
| KEEP _index, _id, name, category
| LIMIT 5
""")

Unnamed: 0,_index,_id,name,category
0,places-bosnia,b7a4dbac-774d-4f99-a8ac-3e2f0c6309fc,TinkerLabs Bijeljina,preschool
1,places-bosnia,fc30b259-a172-4ab9-9520-b82962efbd3c,Petar Pan,preschool
2,places-bosnia,cd1c86f8-cc62-4138-ab49-e92bad59d346,Mamasita,pancake_house
3,places-bosnia,b8df9b8f-7316-461d-9b48-3aab9c04f021,Modni Studio Madness,womens_clothing_store
4,places-bosnia,763ed266-3a30-46f7-abe1-b2f6e6124acd,Happy Travel Bijeljina,tours


## Filtering and processing

In [60]:
# A basic filter
esql("""
FROM places-* METADATA _index
| RENAME _index as dataset
| WHERE name LIKE "*Burger*"
    AND category IN ("restaurant", "burger_restaurant")
    AND confidence < 0.3
| SORT confidence DESC
| KEEP dataset, name, category, confidence
| LIMIT 5
""")

Unnamed: 0,dataset,name,category,confidence
0,places-bosnia,Burgers by Manzoni,burger_restaurant,0.296943
1,places-belem,Nick Burger,burger_restaurant,0.296943
2,places-belem,Prime Burger food truck,burger_restaurant,0.296943
3,places-belem,Purple Burgers,burger_restaurant,0.296943
4,places-valencia,TORO Burger Lounge,restaurant,0.296943


In [61]:
# STATS runs aggregations.
# After this count aggregation, only the aggregated column is available
esql("""
FROM ne_countries
| STATS counts = count(id)
""")

Unnamed: 0,counts
0,257


In [62]:
# When grouping by other fields, those are also available
# for further operations like sorting or filtering
esql("""
FROM ne_countries
| WHERE type in ("Country", "Sovereign country")
| STATS counts = count(id) BY continent
| WHERE counts > 30
| SORT continent
| KEEP continent, counts
| LIMIT 5
""")

Unnamed: 0,continent,counts
0,Africa,53
1,Asia,48
2,Europe,48


In [63]:
# Aggregate: count by more than one grouping field
esql("""
FROM ne_countries
| WHERE type not in ("Country", "Sovereign country")
| STATS counts = count(id) BY continent, type
| WHERE counts > 1
| SORT continent, type
| KEEP continent, type, counts
| LIMIT 50
""")

Unnamed: 0,continent,type,counts
0,Africa,Indeterminate,2
1,Asia,Dependency,3
2,Asia,Indeterminate,5
3,Europe,Disputed,2
4,North America,Dependency,12
5,North America,Indeterminate,2
6,Oceania,Dependency,12
7,Seven seas (open ocean),Dependency,5
8,South America,Indeterminate,2


In [64]:
# Use EVAL to compute new fields
esql("""
FROM ne_countries
| WHERE gdp_md IS NOT NULL
    AND pop_est > 0
    AND type IN ("Country", "Sovereign country")
| EVAL gdp_pop = ROUND(( gdp_md * 1e6) / ( pop_est::double))::integer
| SORT gdp_pop DESC
| KEEP name, type, gdp_md, pop_est, gdp_pop
| LIMIT 10
""")

Unnamed: 0,name,type,gdp_md,pop_est,gdp_pop
0,Monaco,Sovereign country,7188,38964,184478
1,Liechtenstein,Sovereign country,6876,38019,180857
2,Luxembourg,Sovereign country,71104,619896,114703
3,Isle of Man,Country,7491,84584,88563
4,Macao,Country,53859,640445,84096
5,Switzerland,Sovereign country,703082,8574832,81994
6,Ireland,Sovereign country,388698,4941444,78661
7,Norway,Sovereign country,403336,5347896,75420
8,Iceland,Sovereign country,24188,361313,66945
9,United States of America,Country,21433226,328239523,65298


## Joins

Traditionally, Elasticsearch has not been able to join datasets in a comfortable way for developers and analysts. This has changed with ES|QL and with the introduction of the `index.mode: lookup` setting.

More details:

* [`LOOKUP JOIN` docs](https://www.elastic.co/docs/reference/query-languages/esql/esql-lookup-join)
* [`index.mode`](https://www.elastic.co/docs/reference/elasticsearch/index-settings/index-modules#index-mode-setting)

Our `ne_countries` index was created with that setting so we can now join that dataset with our places indices.

In [89]:
# Let's find outlier data in our places-* indices
esql(
"""
FROM places-*
// Aggregate all our places by country
| STATS counts = count(addresses.country) BY addresses.country

// Keep only the groups with more than 1 and fewer than 50 places
| WHERE counts > 1 AND counts < 50

// The join field must have the same name on both sides of the join
| RENAME addresses.country AS iso_a2, counts AS places

// Run the LOOKUP JOIN by the iso_a2 field
| LOOKUP JOIN ne_countries ON iso_a2

// Merge records for repeated iso_a2 entries
| STATS places = SUM(places), names = VALUES(name) BY iso_a2

// Sort and print
| SORT places DESC
| KEEP iso_a2, places, names
"""
)

Unnamed: 0,iso_a2,places,names
0,HU,29,[Hungary]
1,SA,24,[Saudi Arabia]
2,FR,20,"[France, Clipperton I.]"
3,AU,16,"[Australia, Indian Ocean Ter., Coral Sea Is., Ashmore and Cartier Is.]"
4,PL,15,[Poland]
5,SK,14,[Slovakia]
6,SI,12,[Slovenia]
7,US,11,[United States of America]
8,CZ,11,[Czechia]
9,TR,11,[Turkey]


## Wrap up

These are just the basics. There are plenty of functions and operators to process your datasets; please refer to the [documentation](https://www.elastic.co/docs/explore-analyze/query-filter/languages/esql) for further details.