[[full-text-filter-tutorial]]
== Basic full-text search and filtering in {es}
++++
<titleabbrev>Basics: Full-text search and filtering</titleabbrev>
++++
#
This is a hands-on introduction to the basics of full-text search with {es}, also known as _lexical search_, using the <<search-search,`_search` API>> and <<query-dsl,Query DSL>>.
You'll also learn how to filter data, to narrow down search results based on exact criteria.
#
In this scenario, we're implementing a search function for a cooking blog.
The blog contains recipes with various attributes including textual content, categorical data, and numerical ratings.
#
The goal is to create search queries that enable users to:
#
* Find recipes based on ingredients they want to use or avoid
* Discover dishes suitable for their dietary needs
* Find highly-rated recipes in specific categories
* Find recent recipes from their favorite authors
#
To achieve these goals, we'll use different {es} queries to perform full-text search, apply filters, and combine multiple search criteria.
#
[discrete]
[[full-text-filter-tutorial-create-index]]
=== Step 1: Create an index
#
Create the `cooking_blog` index to get started:
#
#

In [None]:
!pip install elasticsearch
from getpass import getpass
from elasticsearch import Elasticsearch

In [None]:
CLOUD_ID = getpass("Elastic Cloud ID: ")

ELASTIC_API_KEY = getpass("Elastic API key: ")

client = Elasticsearch(
    cloud_id=CLOUD_ID,
    api_key=ELASTIC_API_KEY,
)

In [8]:
resp = client.indices.create(
    index="cooking_blog",
)
print(resp)

#
#
Now define the mappings for the index:
#
#

In [9]:
resp = client.indices.put_mapping(
    index="cooking_blog",
    properties={
        "title": {
            "type": "text",
            "analyzer": "standard",
            "fields": {
                "keyword": {
                    "type": "keyword",
                    "ignore_above": 256
                }
            }
        },
        "description": {
            "type": "text",
            "fields": {
                "keyword": {
                    "type": "keyword"
                }
            }
        },
        "author": {
            "type": "text",
            "fields": {
                "keyword": {
                    "type": "keyword"
                }
            }
        },
        "date": {
            "type": "date",
            "format": "yyyy-MM-dd"
        },
        "category": {
            "type": "text",
            "fields": {
                "keyword": {
                    "type": "keyword"
                }
            }
        },
        "tags": {
            "type": "text",
            "fields": {
                "keyword": {
                    "type": "keyword"
                }
            }
        },
        "rating": {
            "type": "float"
        }
    },
)
print(resp)

{'acknowledged': True}


#
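Note that if you used <<dynamic-field-mapping,dynamic mapping>>, these multi-fields would be created automatically.
The `ignore_above` setting on `title.keyword` skips indexing values longer than 256 characters in that keyword sub-field, which helps to save disk space and avoid potential issues with Lucene's term byte-length limit.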
#
[TIP]
====
Full-text search is powered by <<analysis,text analysis>>.
Text analysis normalizes and standardizes text data so it can be efficiently stored in an inverted index and searched in near real-time.
Analysis happens at both <<analysis-index-search-time,index and search time>>.
This tutorial won't cover analysis in detail, but it's important to understand how text is processed to create effective search queries.
====
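To make the idea of an inverted index more concrete, here is a minimal pure-Python sketch. The `simple_analyze` function is a hypothetical stand-in that only lowercases text and splits on non-alphanumeric characters; the real `standard` analyzer is more thorough. The two titles are taken from the sample recipes used later in this tutorial.

```python
import re
from collections import defaultdict


def simple_analyze(text):
    """Hypothetical, simplified analyzer: lowercase, then split on non-alphanumerics."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(docs):
    """Map each token to the sorted list of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in simple_analyze(text):
            index[token].add(doc_id)
    return {token: sorted(ids) for token, ids in index.items()}


titles = {
    "1": "Perfect Pancakes: A Fluffy Breakfast Delight",
    "3": "Classic Beef Stroganoff: A Creamy Comfort Food",
}
inverted = build_inverted_index(titles)

print(inverted["pancakes"])  # IDs of titles containing the token "pancakes"
print(inverted["a"])         # IDs of titles containing the token "a"
```

Because tokens are lowercased before they are stored, a lookup for `"Pancakes"` with a capital P finds nothing in this sketch. That is why query text is normally analyzed the same way as the indexed text.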
#
[discrete]
[[full-text-filter-tutorial-index-data]]
=== Step 2: Add sample blog posts to your index
#
Now you'll need to index some example blog posts using the <<bulk, Bulk API>>.
Note that `text` fields are analyzed and multi-fields are generated at index time.
#
#

In [10]:
resp = client.bulk(
    index="cooking_blog",
    refresh="wait_for",
    operations=[
        {
            "index": {
                "_id": "1"
            }
        },
        {
            "title": "Perfect Pancakes: A Fluffy Breakfast Delight",
            "description": "Learn the secrets to making the fluffiest pancakes, so amazing you won't believe your tastebuds. This recipe uses buttermilk and a special folding technique to create light, airy pancakes that are perfect for lazy Sunday mornings.",
            "author": "Maria Rodriguez",
            "date": "2023-05-01",
            "category": "Breakfast",
            "tags": [
                "pancakes",
                "breakfast",
                "easy recipes"
            ],
            "rating": 4.8
        },
        {
            "index": {
                "_id": "2"
            }
        },
        {
            "title": "Spicy Thai Green Curry: A Vegetarian Adventure",
            "description": "Dive into the flavors of Thailand with this vibrant green curry. Packed with vegetables and aromatic herbs, this dish is both healthy and satisfying. Don't worry about the heat - you can easily adjust the spice level to your liking.",
            "author": "Liam Chen",
            "date": "2023-05-05",
            "category": "Main Course",
            "tags": [
                "thai",
                "vegetarian",
                "curry",
                "spicy"
            ],
            "rating": 4.6
        },
        {
            "index": {
                "_id": "3"
            }
        },
        {
            "title": "Classic Beef Stroganoff: A Creamy Comfort Food",
            "description": "Indulge in this rich and creamy beef stroganoff. Tender strips of beef in a savory mushroom sauce, served over a bed of egg noodles. It's the ultimate comfort food for chilly evenings.",
            "author": "Emma Watson",
            "date": "2023-05-10",
            "category": "Main Course",
            "tags": [
                "beef",
                "pasta",
                "comfort food"
            ],
            "rating": 4.7
        },
        {
            "index": {
                "_id": "4"
            }
        },
        {
            "title": "Vegan Chocolate Avocado Mousse",
            "description": "Discover the magic of avocado in this rich, vegan chocolate mousse. Creamy, indulgent, and secretly healthy, it's the perfect guilt-free dessert for chocolate lovers.",
            "author": "Alex Green",
            "date": "2023-05-15",
            "category": "Dessert",
            "tags": [
                "vegan",
                "chocolate",
                "avocado",
                "healthy dessert"
            ],
            "rating": 4.5
        },
        {
            "index": {
                "_id": "5"
            }
        },
        {
            "title": "Crispy Oven-Fried Chicken",
            "description": "Get that perfect crunch without the deep fryer! This oven-fried chicken recipe delivers crispy, juicy results every time. A healthier take on the classic comfort food.",
            "author": "Maria Rodriguez",
            "date": "2023-05-20",
            "category": "Main Course",
            "tags": [
                "chicken",
                "oven-fried",
                "healthy"
            ],
            "rating": 4.9
        }
    ],
)
print(resp)


{'errors': False, 'took': 600, 'items': [{'index': {'_index': 'cooking_blog', '_id': '1', '_version': 1, 'result': 'created', '_shards': {'total': 2, 'successful': 2, 'failed': 0}, '_seq_no': 0, '_primary_term': 1, 'status': 201}}, {'index': {'_index': 'cooking_blog', '_id': '2', '_version': 1, 'result': 'created', '_shards': {'total': 2, 'successful': 2, 'failed': 0}, '_seq_no': 1, '_primary_term': 1, 'status': 201}}, {'index': {'_index': 'cooking_blog', '_id': '3', '_version': 1, 'result': 'created', '_shards': {'total': 2, 'successful': 2, 'failed': 0}, '_seq_no': 2, '_primary_term': 1, 'status': 201}}, {'index': {'_index': 'cooking_blog', '_id': '4', '_version': 1, 'result': 'created', '_shards': {'total': 2, 'successful': 2, 'failed': 0}, '_seq_no': 3, '_primary_term': 1, 'status': 201}}, {'index': {'_index': 'cooking_blog', '_id': '5', '_version': 1, 'result': 'created', '_shards': {'total': 2, 'successful': 2, 'failed': 0}, '_seq_no': 4, '_primary_term': 1, 'status': 201}}]}
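A bulk request can succeed overall while individual items fail, so it is worth checking the `errors` flag and the per-item results. The helper below is a sketch that works on a plain dictionary with the same shape as the response printed above. The function name `failed_bulk_items` and the failing sample item are hypothetical and only serve as illustration.

```python
def failed_bulk_items(resp):
    """Return (id, status, error) tuples for every bulk item that reports an error."""
    failures = []
    if not resp["errors"]:
        return failures
    for item in resp["items"]:
        # Each item is keyed by its action type, for example "index" or "delete".
        for result in item.values():
            if "error" in result:
                failures.append((result.get("_id"), result.get("status"), result["error"]))
    return failures


# Trimmed copy of the successful response shape shown above.
ok_response = {
    "errors": False,
    "items": [{"index": {"_index": "cooking_blog", "_id": "1", "status": 201}}],
}

# Hypothetical failing response, invented for this example.
bad_response = {
    "errors": True,
    "items": [
        {"index": {"_index": "cooking_blog", "_id": "1", "status": 201}},
        {
            "index": {
                "_index": "cooking_blog",
                "_id": "6",
                "status": 400,
                "error": {"type": "mapper_parsing_exception", "reason": "example failure"},
            }
        },
    ],
}

print(failed_bulk_items(ok_response))
print(failed_bulk_items(bad_response))
```

In the notebook, `failed_bulk_items(resp)` should work directly on the bulk response, assuming the client's response object supports dictionary-style access.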


#
#
[discrete]
[[full-text-filter-tutorial-match-query]]
=== Step 3: Perform basic full-text searches
#
Full-text search involves executing text-based queries across one or more document fields.
These queries calculate a relevance score for each matching document, based on how closely the document's content aligns with the search terms.
{es} offers various query types, each with its own method for matching text and <<relevance-scores,relevance scoring>>.
#
[discrete]
==== `match` query
#
The <<query-dsl-match-query, `match`>> query is the standard query for full-text, or "lexical", search.
The query text will be analyzed according to the analyzer configuration specified on each field (or at query time).
#
First, search the `description` field for "fluffy pancakes":
#
#

In [11]:
resp = client.search(
    index="cooking_blog",
    query={
        "match": {
            "description": {
                "query": "fluffy pancakes"
            }
        }
    },
)
print(resp)


{'took': 1, 'timed_out': False, '_shards': {'total': 1, 'successful': 1, 'skipped': 0, 'failed': 0}, 'hits': {'total': {'value': 1, 'relation': 'eq'}, 'max_score': 1.8378843, 'hits': [{'_index': 'cooking_blog', '_id': '1', '_score': 1.8378843, '_source': {'title': 'Perfect Pancakes: A Fluffy Breakfast Delight', 'description': "Learn the secrets to making the fluffiest pancakes, so amazing you won't believe your tastebuds. This recipe uses buttermilk and a special folding technique to create light, airy pancakes that are perfect for lazy Sunday mornings.", 'author': 'Maria Rodriguez', 'date': '2023-05-01', 'category': 'Breakfast', 'tags': ['pancakes', 'breakfast', 'easy recipes'], 'rating': 4.8}}]}}


#
#
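By default, the `match` query combines the analyzed query terms with OR logic, so a document matches if its `description` contains "fluffy", "pancakes", or both.
#
At search time, {es} defaults to the analyzer defined in the field mapping. In this example, that is the `standard` analyzer. Using a different analyzer at search time is an <<different-analyzers,advanced use case>>.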
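The raw search response is verbose, so a small helper that pulls out the total hit count and a few fields per hit makes the results easier to read. This is a sketch: `summarize_hits` is a hypothetical helper, and the `sample` dictionary is a trimmed copy of the response shape printed above.

```python
def summarize_hits(resp):
    """Return the total hit count and one (id, score, title) tuple per hit."""
    hits = resp["hits"]
    rows = [
        (hit["_id"], hit["_score"], hit["_source"].get("title"))
        for hit in hits["hits"]
    ]
    return hits["total"]["value"], rows


# Trimmed copy of the response printed above.
sample = {
    "hits": {
        "total": {"value": 1, "relation": "eq"},
        "max_score": 1.8378843,
        "hits": [
            {
                "_index": "cooking_blog",
                "_id": "1",
                "_score": 1.8378843,
                "_source": {"title": "Perfect Pancakes: A Fluffy Breakfast Delight"},
            }
        ],
    }
}

total, rows = summarize_hits(sample)
print(f"{total} hit(s)")
for doc_id, score, title in rows:
    print(f"{doc_id}  {score:.2f}  {title}")
```

Assuming the client's response object supports dictionary-style access, `summarize_hits(resp)` can be called on any of the search responses in this tutorial.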
#
#
#
[discrete]
==== Require all terms in a match query
#
Specify the `and` operator to require both terms in the `description` field.
This stricter search returns zero hits on our sample data, as no document contains both "fluffy" and "pancakes" in the description.
#
#

In [12]:
resp = client.search(
    index="cooking_blog",
    query={
        "match": {
            "description": {
                "query": "fluffy pancakes",
                "operator": "and"
            }
        }
    },
)
print(resp)


{'took': 0, 'timed_out': False, '_shards': {'total': 1, 'successful': 1, 'skipped': 0, 'failed': 0}, 'hits': {'total': {'value': 0, 'relation': 'eq'}, 'max_score': None, 'hits': []}}


#
#
#
#
[discrete]
==== Specify a minimum number of terms to match
#
Use the <<query-dsl-minimum-should-match,`minimum_should_match`>> parameter to specify the minimum number of terms a document should have to be included in the search results.
#
Search the title field to match at least 2 of the 3 terms: "fluffy", "pancakes", or "breakfast".
This is useful for improving relevance while allowing some flexibility.
#
#

In [13]:
resp = client.search(
    index="cooking_blog",
    query={
        "match": {
            "title": {
                "query": "fluffy pancakes breakfast",
                "minimum_should_match": 2
            }
        }
    },
)
print(resp)


{'took': 2, 'timed_out': False, '_shards': {'total': 1, 'successful': 1, 'skipped': 0, 'failed': 0}, 'hits': {'total': {'value': 1, 'relation': 'eq'}, 'max_score': 4.0408072, 'hits': [{'_index': 'cooking_blog', '_id': '1', '_score': 4.0408072, '_source': {'title': 'Perfect Pancakes: A Fluffy Breakfast Delight', 'description': "Learn the secrets to making the fluffiest pancakes, so amazing you won't believe your tastebuds. This recipe uses buttermilk and a special folding technique to create light, airy pancakes that are perfect for lazy Sunday mornings.", 'author': 'Maria Rodriguez', 'date': '2023-05-01', 'category': 'Breakfast', 'tags': ['pancakes', 'breakfast', 'easy recipes'], 'rating': 4.8}}]}}
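The three `match` variants above differ only in how many query terms a document must contain. The sketch below approximates that selection logic in plain Python, without any scoring. Both `analyze_to_terms` (a simplified stand-in for the `standard` analyzer) and `matches` are hypothetical helpers, and the `description` string is the first sentence of the pancake recipe's description.

```python
import re


def analyze_to_terms(text):
    """Hypothetical, simplified analyzer: the set of lowercase alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def matches(field_text, query_text, operator="or", minimum_should_match=None):
    """Approximate whether a `match` query would select a document (no scoring)."""
    query_terms = analyze_to_terms(query_text)
    found = len(query_terms & analyze_to_terms(field_text))
    if minimum_should_match is not None:
        return found >= minimum_should_match
    if operator == "and":
        return found == len(query_terms)
    return found >= 1


description = (
    "Learn the secrets to making the fluffiest pancakes, so amazing you won't "
    "believe your tastebuds."
)
title = "Perfect Pancakes: A Fluffy Breakfast Delight"

# Default OR: the token "pancakes" alone is enough.
print(matches(description, "fluffy pancakes"))
# AND: the description contains "fluffiest", not the token "fluffy".
print(matches(description, "fluffy pancakes", operator="and"))
# At least 2 of 3 terms must appear in the title.
print(matches(title, "fluffy pancakes breakfast", minimum_should_match=2))
```

This mirrors the results above: the default query finds the pancake recipe, the `and` variant returns no hits, and the `minimum_should_match` query on `title` matches.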


#
#
[discrete]
[[full-text-filter-tutorial-multi-match]]
=== Step 4: Search across multiple fields at once
#
When users enter a search query, they often don't know (or care) whether their search terms appear in a specific field.
A <<query-dsl-multi-match-query,`multi_match`>> query allows searching across multiple fields simultaneously.
#
Let's start with a basic `multi_match` query:
#
#

In [14]:
resp = client.search(
    index="cooking_blog",
    query={
        "multi_match": {
            "query": "vegetarian curry",
            "fields": [
                "title",
                "description",
                "tags"
            ]
        }
    },
)
print(resp)

{'took': 6, 'timed_out': False, '_shards': {'total': 1, 'successful': 1, 'skipped': 0, 'failed': 0}, 'hits': {'total': {'value': 1, 'relation': 'eq'}, 'max_score': 2.8276732, 'hits': [{'_index': 'cooking_blog', '_id': '2', '_score': 2.8276732, '_source': {'title': 'Spicy Thai Green Curry: A Vegetarian Adventure', 'description': "Dive into the flavors of Thailand with this vibrant green curry. Packed with vegetables and aromatic herbs, this dish is both healthy and satisfying. Don't worry about the heat - you can easily adjust the spice level to your liking.", 'author': 'Liam Chen', 'date': '2023-05-05', 'category': 'Main Course', 'tags': ['thai', 'vegetarian', 'curry', 'spicy'], 'rating': 4.6}}]}}


#
#
This query searches for "vegetarian curry" across the `title`, `description`, and `tags` fields. Each field is treated with equal importance.
#
However, in many cases, matches in certain fields (like the title) might be more relevant than others. We can adjust the importance of each field using field boosting:
#
#

In [15]:
resp = client.search(
    index="cooking_blog",
    query={
        "multi_match": {
            "query": "vegetarian curry",
            "fields": [
                "title^3",
                "description^2",
                "tags"
            ]
        }
    },
)
print(resp)


{'took': 1, 'timed_out': False, '_shards': {'total': 1, 'successful': 1, 'skipped': 0, 'failed': 0}, 'hits': {'total': {'value': 1, 'relation': 'eq'}, 'max_score': 7.546015, 'hits': [{'_index': 'cooking_blog', '_id': '2', '_score': 7.546015, '_source': {'title': 'Spicy Thai Green Curry: A Vegetarian Adventure', 'description': "Dive into the flavors of Thailand with this vibrant green curry. Packed with vegetables and aromatic herbs, this dish is both healthy and satisfying. Don't worry about the heat - you can easily adjust the spice level to your liking.", 'author': 'Liam Chen', 'date': '2023-05-05', 'category': 'Main Course', 'tags': ['thai', 'vegetarian', 'curry', 'spicy'], 'rating': 4.6}}]}}


#
#
* `title^3`: The title field is 3 times more important than an unboosted field
* `description^2`: The description field is 2 times more important than an unboosted field
* `tags`: No boost applied (equivalent to `^1`)
#
These boosts help tune relevance, prioritizing matches in the title over the description, and matches in the description over tags.
#
Learn more about fields and per-field boosting in the <<query-dsl-multi-match-query,`multi_match` query>> reference.
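The effect of boosting can be sketched with simple arithmetic. The example below assumes the default `best_fields` type of `multi_match`, where the best-matching field determines the score and a boost multiplies that field's score. The per-field scores are hypothetical numbers chosen for illustration, and both helper functions are illustrative names, not part of the client library.

```python
def boosted_fields(boosts):
    """Turn {"title": 3, "tags": 1} into ["title^3", "tags"] for a multi_match query."""
    return [name if boost == 1 else f"{name}^{boost}" for name, boost in boosts.items()]


def best_fields_score(field_scores, boosts, tie_breaker=0.0):
    """Simplified best-field scoring: highest boosted score plus optional tie breaker."""
    weighted = [score * boosts.get(name, 1) for name, score in field_scores.items()]
    best = max(weighted)
    return best + tie_breaker * (sum(weighted) - best)


# Hypothetical per-field scores, chosen only to illustrate the arithmetic.
raw_scores = {"title": 1.0, "description": 0.5, "tags": 2.0}
boosts = {"title": 3, "description": 2, "tags": 1}

print(boosted_fields(boosts))
print(best_fields_score(raw_scores, {}))      # unboosted: the tags match wins
print(best_fields_score(raw_scores, boosts))  # boosted: the title match wins
```

In this sketch the boost changes which field decides the final score, not just its magnitude. Real scores come from the full relevance calculation, so treat the numbers as an illustration of the mechanism only.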
#
#
#
[TIP]
====
The `multi_match` query is often recommended over a single `match` query for most text search use cases, as it provides more flexibility and better matches user expectations.
====
#
[discrete]
[[full-text-filter-tutorial-filtering]]
=== Step 5: Filter and find exact matches
#
<<filter-context,Filtering>> allows you to narrow down your search results based on exact criteria.
Unlike full-text searches, filters are binary (yes/no) and do not affect the relevance score.
Filters typically execute faster than scoring queries because no relevance score has to be computed for the matching documents, and {es} can cache frequently used filters.
#
This <<query-dsl-bool-query,`bool`>> query will return only blog posts in the "Breakfast" category.
#
#

In [16]:
resp = client.search(
    index="cooking_blog",
    query={
        "bool": {
            "filter": [
                {
                    "term": {
                        "category.keyword": "Breakfast"
                    }
                }
            ]
        }
    },
)
print(resp)


{'took': 0, 'timed_out': False, '_shards': {'total': 1, 'successful': 1, 'skipped': 0, 'failed': 0}, 'hits': {'total': {'value': 1, 'relation': 'eq'}, 'max_score': 0.0, 'hits': [{'_index': 'cooking_blog', '_id': '1', '_score': 0.0, '_source': {'title': 'Perfect Pancakes: A Fluffy Breakfast Delight', 'description': "Learn the secrets to making the fluffiest pancakes, so amazing you won't believe your tastebuds. This recipe uses buttermilk and a special folding technique to create light, airy pancakes that are perfect for lazy Sunday mornings.", 'author': 'Maria Rodriguez', 'date': '2023-05-01', 'category': 'Breakfast', 'tags': ['pancakes', 'breakfast', 'easy recipes'], 'rating': 4.8}}]}}


#
#
[TIP]
====
The `.keyword` suffix accesses the unanalyzed version of a field, enabling exact, case-sensitive matching. This works in two scenarios:
#
1. *When using dynamic mapping for text fields*. Elasticsearch automatically creates a `.keyword` sub-field.
2. *When text fields are explicitly mapped with a `.keyword` sub-field*. For example, we explicitly mapped the `category` field in <<full-text-filter-tutorial-create-index,Step 1>> of this tutorial.
====
#
[discrete]
[[full-text-filter-tutorial-range-query]]
==== Search for posts within a date range
#
Often users want to find content published within a specific time frame.
A <<query-dsl-range-query,`range`>> query finds documents that fall within numeric or date ranges.
#
#

In [17]:
resp = client.search(
    index="cooking_blog",
    query={
        "range": {
            "date": {
                "gte": "2023-05-01",
                "lte": "2023-05-31"
            }
        }
    },
)
print(resp)

{'took': 1, 'timed_out': False, '_shards': {'total': 1, 'successful': 1, 'skipped': 0, 'failed': 0}, 'hits': {'total': {'value': 5, 'relation': 'eq'}, 'max_score': 1.0, 'hits': [{'_index': 'cooking_blog', '_id': '1', '_score': 1.0, '_source': {'title': 'Perfect Pancakes: A Fluffy Breakfast Delight', 'description': "Learn the secrets to making the fluffiest pancakes, so amazing you won't believe your tastebuds. This recipe uses buttermilk and a special folding technique to create light, airy pancakes that are perfect for lazy Sunday mornings.", 'author': 'Maria Rodriguez', 'date': '2023-05-01', 'category': 'Breakfast', 'tags': ['pancakes', 'breakfast', 'easy recipes'], 'rating': 4.8}}, {'_index': 'cooking_blog', '_id': '2', '_score': 1.0, '_source': {'title': 'Spicy Thai Green Curry: A Vegetarian Adventure', 'description': "Dive into the flavors of Thailand with this vibrant green curry. Packed with vegetables and aromatic herbs, this dish is both healthy and satisfying. Don't worry abo
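Hard-coding the last day of a month is easy to get wrong, so the bounds can be computed instead. The sketch below builds the same `range` query body for any month. The helper name `month_range_query` is hypothetical, and the date strings follow the `yyyy-MM-dd` format declared in the mapping.

```python
import calendar
from datetime import date


def month_range_query(field, year, month):
    """Build a `range` query covering every day of the given month (inclusive)."""
    last_day = calendar.monthrange(year, month)[1]
    return {
        "range": {
            field: {
                "gte": date(year, month, 1).isoformat(),
                "lte": date(year, month, last_day).isoformat(),
            }
        }
    }


query = month_range_query("date", 2023, 5)
print(query)
```

The resulting dictionary can be passed as the `query` argument of `client.search`, in the same way as the hand-written query above.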

#
#
[discrete]
[[full-text-filter-tutorial-term-query]]
==== Find exact matches
#
Sometimes users want to search for exact terms to eliminate ambiguity in their search results.
A <<query-dsl-term-query,`term`>> query searches for an exact term in a field without analyzing it.
Exact, case-sensitive matches on specific terms are often referred to as "keyword" searches.
#
Here you'll search for the author "Maria Rodriguez" in the `author.keyword` field.
#
#

In [18]:
resp = client.search(
    index="cooking_blog",
    query={
        "term": {
            "author.keyword": "Maria Rodriguez"
        }
    },
)
print(resp)


{'took': 0, 'timed_out': False, '_shards': {'total': 1, 'successful': 1, 'skipped': 0, 'failed': 0}, 'hits': {'total': {'value': 2, 'relation': 'eq'}, 'max_score': 0.87546873, 'hits': [{'_index': 'cooking_blog', '_id': '1', '_score': 0.87546873, '_source': {'title': 'Perfect Pancakes: A Fluffy Breakfast Delight', 'description': "Learn the secrets to making the fluffiest pancakes, so amazing you won't believe your tastebuds. This recipe uses buttermilk and a special folding technique to create light, airy pancakes that are perfect for lazy Sunday mornings.", 'author': 'Maria Rodriguez', 'date': '2023-05-01', 'category': 'Breakfast', 'tags': ['pancakes', 'breakfast', 'easy recipes'], 'rating': 4.8}}, {'_index': 'cooking_blog', '_id': '5', '_score': 0.87546873, '_source': {'title': 'Crispy Oven-Fried Chicken', 'description': 'Get that perfect crunch without the deep fryer! This oven-fried chicken recipe delivers crispy, juicy results every time. A healthier take on the classic comfort foo

#
#
[TIP]
====
Avoid using the `term` query for <<text,`text` fields>> because they are transformed by the analysis process.
====
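The difference between `text` and `keyword` fields can be illustrated without a cluster. The sketch below models what each field type indexes for the same value and how an unanalyzed `term` lookup behaves against it. All three functions are hypothetical simplifications; in particular, `text_field_terms` only approximates the `standard` analyzer.

```python
import re


def text_field_terms(value):
    """Simplified view of what a `text` field indexes: lowercase tokens."""
    return set(re.findall(r"[a-z0-9]+", value.lower()))


def keyword_field_terms(value):
    """A `keyword` field indexes the whole value unchanged as a single term."""
    return {value}


def term_query_matches(indexed_terms, term):
    """A `term` query compares the given term to indexed terms without analyzing it."""
    return term in indexed_terms


author = "Maria Rodriguez"

# The keyword sub-field holds the exact value.
print(term_query_matches(keyword_field_terms(author), "Maria Rodriguez"))
# The text field holds "maria" and "rodriguez", not the full name.
print(term_query_matches(text_field_terms(author), "Maria Rodriguez"))
# A single lowercase token does match the text field.
print(term_query_matches(text_field_terms(author), "maria"))
```

This is why the query above targets `author.keyword`: the same `term` lookup against the analyzed `author` field would find no indexed term equal to the full name.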
#
[discrete]
[[full-text-filter-tutorial-complex-bool]]
=== Step 6: Combine multiple search criteria
#
A <<query-dsl-bool-query,`bool`>> query allows you to combine multiple query clauses to create sophisticated searches.
In this tutorial scenario, it's useful when users have complex requirements for finding recipes.
#
Let's create a query that addresses the following user needs:
#
* Must be a vegetarian main course
* Should contain "curry" or "spicy" in the title or description
* Must not be a dessert
* Must have a rating of at least 4.5
* Should prefer recipes published in the last month
#
#

In [19]:
resp = client.search(
    index="cooking_blog",
    query={
        "bool": {
            "must": [
                {
                    "term": {
                        "category.keyword": "Main Course"
                    }
                },
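                {
                    # `tags` is an analyzed `text` field. This `term` query matches only because
                    # "vegetarian" is already a single lowercase token; `tags.keyword` is a safer target.
                    "term": {
                        "tags": "vegetarian"
                    }
                },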
                {
                    "range": {
                        "rating": {
                            "gte": 4.5
                        }
                    }
                }
            ],
            "should": [
                {
                    "multi_match": {
                        "query": "curry spicy",
                        "fields": [
                            "title^2",
                            "description"
                        ]
                    }
                },
                {
                    "range": {
                        "date": {
                            "gte": "now-1M/d"
                        }
                    }
                }
            ],
            "must_not": [
                {
                    "term": {
                        "category.keyword": "Dessert"
                    }
                }
            ]
        }
    },
)
print(resp)


{'took': 2, 'timed_out': False, '_shards': {'total': 1, 'successful': 1, 'skipped': 0, 'failed': 0}, 'hits': {'total': {'value': 1, 'relation': 'eq'}, 'max_score': 7.9835095, 'hits': [{'_index': 'cooking_blog', '_id': '2', '_score': 7.9835095, '_source': {'title': 'Spicy Thai Green Curry: A Vegetarian Adventure', 'description': "Dive into the flavors of Thailand with this vibrant green curry. Packed with vegetables and aromatic herbs, this dish is both healthy and satisfying. Don't worry about the heat - you can easily adjust the spice level to your liking.", 'author': 'Liam Chen', 'date': '2023-05-05', 'category': 'Main Course', 'tags': ['thai', 'vegetarian', 'curry', 'spicy'], 'rating': 4.6}}]}}
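Since the category, tag, and rating requirements are yes/no conditions, one possible variant places them in `filter` context so they restrict the results without contributing to the score. The sketch below builds such a query body. The function name and its parameters are hypothetical, and it targets `tags.keyword` (defined in the Step 1 mapping) instead of the analyzed `tags` field. Scores returned by this variant would differ from the output above, because the filtered clauses no longer add to `_score`.

```python
def vegetarian_main_course_query(min_rating=4.5, text="curry spicy"):
    """Variant of the Step 6 query with the yes/no requirements in filter context."""
    return {
        "bool": {
            # Exact requirements: must hold, but do not contribute to _score.
            "filter": [
                {"term": {"category.keyword": "Main Course"}},
                {"term": {"tags.keyword": "vegetarian"}},
                {"range": {"rating": {"gte": min_rating}}},
            ],
            # Preferences: optional, but matching documents score higher.
            "should": [
                {"multi_match": {"query": text, "fields": ["title^2", "description"]}},
                {"range": {"date": {"gte": "now-1M/d"}}},
            ],
            "must_not": [{"term": {"category.keyword": "Dessert"}}],
        }
    }


query = vegetarian_main_course_query()
print(sorted(query["bool"]))
```

The dictionary can be passed as the `query` argument of `client.search`. With `filter` clauses present, the `should` clauses stay optional and only influence ranking.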


#
#
#
#
[discrete]
[[full-text-filter-tutorial-learn-more]]
=== Learn more
#
This tutorial introduced the basics of full-text search and filtering in {es}.
Building a real-world search experience requires understanding many more advanced concepts and techniques.
Here are some resources once you're ready to dive deeper:
#
* <<search-analyze, Elasticsearch basics — Search and analyze data>>: Understand all your options for searching and analyzing data in {es}.
* <<analysis,Text analysis>>: Understand how text is processed for full-text search.
* <<search-with-elasticsearch>>: Learn about more advanced search techniques using the `_search` API, including semantic search.
#
#
#