# Keyword querying and filtering

<a target="_blank" href="https://colab.research.google.com/github/elasticsearch-labs/blob/main/search/01-keyword-querying-filtering.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

This interactive notebook introduces basic Elasticsearch queries using the official Elasticsearch Python client. Before starting this section, we recommend working through our [quick start](https://github.com/elastic/elasticsearch-labs/blob/main/notebooks/search/00-quick-start.ipynb).

# Install and import libraries

In [None]:
!pip install elasticsearch

In [None]:

from elasticsearch import Elasticsearch
import pandas as pd
from google.colab import data_table
import getpass


data_table.enable_dataframe_formatter()

# Create the client instance


In [None]:
cloud_id = getpass.getpass('Cloud ID: ')
elastic_username = 'elastic'
elastic_password = getpass.getpass('Password: ') 
client = Elasticsearch(
    cloud_id=cloud_id,
    basic_auth=(elastic_username, elastic_password)
)

# Pretty print the response

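def pretty_response_transform(response):
    """Flatten the hits of a search response into a list of row dicts."""
    result = []
    for hit in response['hits']['hits']:
        result.append({
            'id' : hit['_id'],
            # The index stores the date as `publish_date`; it is displayed as `publication_date`.
            'publication_date' : hit['_source']['publish_date'],
            # Relevance score computed by Elasticsearch for this hit.
            'score' : hit['_score'],
            'title' : hit['_source']['title'],
            'summary' : hit['_source']['summary']
        })
    return result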

## Querying
In the query context, a query clause answers the question _“How well does this document match this query clause?”_. In addition to deciding whether or not the document matches, the query clause also calculates a relevance score, returned in the `_score` metadata field.

### Full text queries

Full text queries enable you to search analyzed text fields such as the body of an email. The query string is processed using the same analyzer that was applied to the field during indexing.
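To make the role of the analyzer concrete, here is a minimal sketch in plain Python. The `simple_analyze` function is a hypothetical stand-in that only lowercases and splits on non-alphanumeric characters; the real Elasticsearch analyzers do considerably more, so treat this as an illustration of the idea rather than their actual behavior.

```python
import re

def simple_analyze(text):
    """Rough stand-in for an analyzer: lowercase, then split on non-alphanumerics."""
    return re.findall(r"[a-z0-9]+", text.lower())

# Terms produced when the document was indexed.
indexed_terms = simple_analyze("A Guide to pragmatic programming")
# Terms produced from the query string at search time.
query_terms = simple_analyze("GUIDE")

print(indexed_terms)
# The query matches because both sides were normalized the same way.
print(any(term in indexed_terms for term in query_terms))
```

Because the same normalization runs at index time and at query time, differences in case or punctuation between the query and the stored text do not prevent a match.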

* **match**.
    The standard query for performing full text queries, including fuzzy matching and phrase or proximity queries.
* **multi_match**.
    The multi-field version of the match query.

### Match query
Returns documents that `match` a provided text, number, date or boolean value. The provided text is analyzed before matching.

The `match` query is the standard query for performing a full-text search, including options for fuzzy matching.

[Read more](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-match-query.html#match-query-ex-request).



In [None]:
response = client.search(index="book_index", query={
    "match": {
        "summary": {
            "query": "guide"
            }
        }
    })

pd.DataFrame.from_records(pretty_response_transform(response))

| | id | publication_date | score | title | summary |
|---|---|---|---|---|---|
| 0 | 3cXgIYkBfxlbyhU5Krfc | 2019-10-29 | 0.704228 | The Pragmatic Programmer: Your Journey to Mastery | A guide to pragmatic programming for software ... |
| 1 | 3sXgIYkBfxlbyhU5Krfc | 2019-05-03 | 0.704228 | Python Crash Course | A fast-paced, no-nonsense guide to programming... |
| 2 | 5MXgIYkBfxlbyhU5Krfd | 2011-05-13 | 0.677165 | The Clean Coder: A Code of Conduct for Profess... | A guide to professional conduct in the field o... |
| 3 | 4MXgIYkBfxlbyhU5Krfc | 2008-08-11 | 0.628835 | Clean Code: A Handbook of Agile Software Craft... | A guide to writing code that is easy to read, ... |
| 4 | 48XgIYkBfxlbyhU5Krfd | 1994-10-31 | 0.628835 | Design Patterns: Elements of Reusable Object-O... | Guide to design patterns that can be used in a... |
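The `match` query also accepts options such as `operator` and `fuzziness`. The sketch below builds such a query body with a hypothetical helper, `build_match_query`, which is not part of the Elasticsearch client; the misspelled query text is only an illustration. The search call is left commented out because it needs a live cluster.

```python
def build_match_query(field, text, operator="or", fuzziness=None):
    """Build a match query body (hypothetical helper, not a client API)."""
    options = {"query": text, "operator": operator}
    if fuzziness is not None:
        options["fuzziness"] = fuzziness
    return {"match": {field: options}}

# Require every term to match, while tolerating small typos in each term.
query = build_match_query("summary", "programing guide", operator="and", fuzziness="AUTO")
print(query)

# With a live cluster, the body could be passed to the client like this:
# response = client.search(index="book_index", query=query)
```

With `operator` set to `"and"`, all analyzed terms must be present; the default `"or"` requires only one of them.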


### Multi-match query

The `multi_match` query builds on the match query to allow multi-field queries.

[Read more](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-multi-match-query.html).

In [None]:
response = client.search(index="book_index", query={
    "multi_match": {
        "query": "javascript",
        "fields": ["summary", "title"]
        }
    })

pd.DataFrame.from_records(pretty_response_transform(response))

| | id | publication_date | score | title | summary |
|---|---|---|---|---|---|
| 0 | 4sXgIYkBfxlbyhU5Krfc | 2018-12-04 | 2.030753 | Eloquent JavaScript | A modern introduction to programming |
| 1 | 5cXgIYkBfxlbyhU5Krfd | 2008-05-15 | 1.706409 | JavaScript: The Good Parts | A deep dive into the parts of JavaScript that ... |
| 2 | 4cXgIYkBfxlbyhU5Krfc | 2015-03-27 | 1.636058 | You Don't Know JS: Up & Going | Introduction to JavaScript and programming as ... |


Individual fields can be boosted with the caret (^) notation.

In [None]:
response = client.search(index="book_index", query={
    "multi_match": {
        "query": "javascript",
        "fields": ["summary", "title^3"]
        }
    })

pd.DataFrame.from_records(pretty_response_transform(response))

| | id | publication_date | score | title | summary |
|---|---|---|---|---|---|
| 0 | 4sXgIYkBfxlbyhU5Krfc | 2018-12-04 | 6.092258 | Eloquent JavaScript | A modern introduction to programming |
| 1 | 5cXgIYkBfxlbyhU5Krfd | 2008-05-15 | 5.119226 | JavaScript: The Good Parts | A deep dive into the parts of JavaScript that ... |
| 2 | 4cXgIYkBfxlbyhU5Krfc | 2015-03-27 | 1.636058 | You Don't Know JS: Up & Going | Introduction to JavaScript and programming as ... |
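The effect of `title^3` can be checked against the two result sets shown above. The sketch below assumes, based on the displayed titles and summaries, that the first two hits were scored on their `title` match, while the third hit matched only in `summary` and therefore kept its score. The numbers are copied from the outputs; nothing here queries Elasticsearch.

```python
# Scores from the unboosted multi_match output.
unboosted = {
    "Eloquent JavaScript": 2.030753,
    "JavaScript: The Good Parts": 1.706409,
}
# Scores reported after boosting the title field with title^3.
boosted_reported = {
    "Eloquent JavaScript": 6.092258,
    "JavaScript: The Good Parts": 5.119226,
}

TITLE_BOOST = 3
for title, score in unboosted.items():
    expected = score * TITLE_BOOST
    # Allow for rounding in the displayed six-decimal scores.
    assert abs(expected - boosted_reported[title]) < 1e-5
    print(f"{title}: {score} * {TITLE_BOOST} = {expected:.6f} (reported {boosted_reported[title]})")
```

Both title-matched scores are three times their unboosted values, up to rounding in the displayed output.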


### Prefix search

Returns documents that contain a specific prefix in a provided field.

[Read more](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-prefix-query.html)

In [None]:
response = client.search(index="book_index", query={
    "prefix": {
        "title": {
            "value": 'java'
            }
        }
    })

pd.DataFrame.from_records(pretty_response_transform(response))

| | id | publication_date | score | title | summary |
|---|---|---|---|---|---|
| 0 | 4sXgIYkBfxlbyhU5Krfc | 2018-12-04 | 1.0 | Eloquent JavaScript | A modern introduction to programming |
| 1 | 5cXgIYkBfxlbyhU5Krfd | 2008-05-15 | 1.0 | JavaScript: The Good Parts | A deep dive into the parts of JavaScript that ... |
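The prefix is compared against indexed terms, not against the raw title string, which is why `java` matches a title where "JavaScript" is the second word. The following in-memory sketch imitates that behavior on the three JavaScript-related titles from this notebook. It assumes the field was indexed as lowercased terms; `has_term_with_prefix` is a hypothetical helper and only approximates real tokenization.

```python
titles = [
    "Eloquent JavaScript",
    "JavaScript: The Good Parts",
    "You Don't Know JS: Up & Going",
]

def has_term_with_prefix(text, prefix):
    """True if any lowercased, whitespace-separated term starts with the prefix."""
    terms = [term.strip(":&") for term in text.lower().split()]
    return any(term.startswith(prefix) for term in terms)

matches = [title for title in titles if has_term_with_prefix(title, "java")]
print(matches)

# The prefix itself is not analyzed, so in this sketch a capitalized prefix
# does not match the lowercased terms.
print(has_term_with_prefix("Eloquent JavaScript", "Java"))
```

The two matching titles are the same two documents returned by the query above.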


### Fuzzy search

Returns documents that contain terms similar to the search term, as measured by a Levenshtein edit distance.

An edit distance is the number of one-character changes needed to turn one term into another. These changes can include:

* Changing a character (box → fox)
* Removing a character (black → lack)
* Inserting a character (sic → sick)
* Transposing two adjacent characters (act → cat)

[Read more](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-fuzzy-query.html)
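The four kinds of change listed above can be counted with a small dynamic-programming routine. This is a sketch of the optimal string alignment variant of edit distance (Levenshtein plus adjacent transpositions), written for illustration; it is not the code Elasticsearch runs internally.

```python
def edit_distance(a, b):
    """Optimal string alignment distance: Levenshtein plus adjacent transpositions."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # remove a character
                d[i][j - 1] + 1,         # insert a character
                d[i - 1][j - 1] + cost,  # change a character (or keep it)
            )
            # Transpose two adjacent characters.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[rows - 1][cols - 1]

pairs = [("box", "fox"), ("black", "lack"), ("sic", "sick"), ("act", "cat"),
         ("pyvascript", "javascript")]
for source, target in pairs:
    print(f"{source} -> {target}: {edit_distance(source, target)}")
```

Each of the four example pairs is one edit apart, while `pyvascript` is two substitutions away from `javascript`, so it can only match when two edits are allowed.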



In [None]:
response = client.search(index="book_index", query={
    "fuzzy": {
        "title": {
            "value": 'pyvascript'
            }
        }
    })

pd.DataFrame.from_records(pretty_response_transform(response))

| | id | publication_date | score | title | summary |
|---|---|---|---|---|---|
| 0 | 4sXgIYkBfxlbyhU5Krfc | 2018-12-04 | 1.624602 | Eloquent JavaScript | A modern introduction to programming |
| 1 | 5cXgIYkBfxlbyhU5Krfd | 2008-05-15 | 1.365127 | JavaScript: The Good Parts | A deep dive into the parts of JavaScript that ... |


## Filtering

In a filter context, a query clause answers the question *“Does this document match this query clause?”* The answer is a simple Yes or No — no scores are calculated. Filter context is mostly used for filtering structured data, for example:
* Does this `timestamp` fall into the range 2015 to 2016?
* Is the `status` field set to `"published"`?

Filter context is in effect whenever a query clause is passed to a `filter` parameter, such as the `filter` or `must_not` parameters in the `bool` query.

[Read more](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-bool-query.html)
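A date-range question like the first bullet above could be expressed as a `range` clause inside `bool.filter`. The sketch below assumes the `publish_date` field used by the books in this notebook; `published_between` is a hypothetical helper, and the search call is commented out because it needs a live cluster.

```python
def published_between(start, end):
    """Build a bool query with a single range filter (hypothetical helper)."""
    return {
        "bool": {
            "filter": [
                {"range": {"publish_date": {"gte": start, "lte": end}}}
            ]
        }
    }

query = published_between("2015-01-01", "2016-12-31")
print(query)

# With a live cluster, the body could be passed to the client like this:
# response = client.search(index="book_index", query=query)
```

Because the clause sits under `filter`, each document is simply in or out of the range and no relevance score is computed for it.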

### **bool.must**
The clause (query) must appear in matching documents and will contribute to the score.

In [None]:
response = client.search(index="book_index", query={
    "bool": {
        "must": [{
            "term": {
                "summary": "guide"
                }
            }, {
            "term": {
                "summary": "code"
                }
            }]
        }
    })

pd.DataFrame.from_records(pretty_response_transform(response))

| | id | publication_date | score | title | summary |
|---|---|---|---|---|---|
| 0 | 4MXgIYkBfxlbyhU5Krfc | 2008-08-11 | 1.97297 | Clean Code: A Handbook of Agile Software Craft... | A guide to writing code that is easy to read, ... |


### **bool.should**

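The clause (query) should appear in the matching document. Each matching `should` clause adds to the score, so documents that match more clauses rank higher. When the `bool` query has no `must` or `filter` clause, at least one `should` clause has to match.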

In [None]:
response = client.search(index="book_index", query={
    "bool": {
        "should": [{
            "term": {
                "summary": "guide"
                }
            }, {
              "term": {
                  "summary": "code"
                  }
            }]
        }
    })

pd.DataFrame.from_records(pretty_response_transform(response))

| | id | publication_date | score | title | summary |
|---|---|---|---|---|---|
| 0 | 4MXgIYkBfxlbyhU5Krfc | 2008-08-11 | 1.97297 | Clean Code: A Handbook of Agile Software Craft... | A guide to writing code that is easy to read, ... |
| 1 | 5cXgIYkBfxlbyhU5Krfd | 2008-05-15 | 1.254593 | JavaScript: The Good Parts | A deep dive into the parts of JavaScript that ... |
| 2 | 3cXgIYkBfxlbyhU5Krfc | 2019-10-29 | 0.704228 | The Pragmatic Programmer: Your Journey to Mastery | A guide to pragmatic programming for software ... |
| 3 | 3sXgIYkBfxlbyhU5Krfc | 2019-05-03 | 0.704228 | Python Crash Course | A fast-paced, no-nonsense guide to programming... |
| 4 | 5MXgIYkBfxlbyhU5Krfd | 2011-05-13 | 0.677165 | The Clean Coder: A Code of Conduct for Profess... | A guide to professional conduct in the field o... |
| 5 | 48XgIYkBfxlbyhU5Krfd | 1994-10-31 | 0.628835 | Design Patterns: Elements of Reusable Object-O... | Guide to design patterns that can be used in a... |
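How many `should` clauses have to match can be tuned with the `minimum_should_match` parameter of the `bool` query. The sketch below uses a hypothetical helper, `should_query`, to build both variants; it only constructs query bodies and does not contact a cluster.

```python
def should_query(field, terms, minimum_should_match=1):
    """Build a bool.should query over term clauses (hypothetical helper)."""
    clauses = [{"term": {field: term}} for term in terms]
    return {
        "bool": {
            "should": clauses,
            "minimum_should_match": minimum_should_match,
        }
    }

# At least one of the two terms must appear (the behavior shown above).
either = should_query("summary", ["guide", "code"], minimum_should_match=1)
# Both terms must appear, which narrows the hits to those of the bool.must example.
both = should_query("summary", ["guide", "code"], minimum_should_match=2)

print(either)
print(both)

# With a live cluster:
# response = client.search(index="book_index", query=both)
```

Raising `minimum_should_match` to the number of clauses makes the query match the same documents as the equivalent `must` query.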


### **bool.filter**

The clause (query) must appear in matching documents. **However, unlike `must`, the clause does not contribute to the `score`.** Filter clauses are executed in filter context, so no score is calculated for them and their results are considered for caching.

In [None]:
response = client.search(index="book_index", query={
    "bool": {
        "filter": [{
            "term": {
                "summary": "guide"
                }
            }]
        }
    })

pd.DataFrame.from_records(pretty_response_transform(response))

| | id | publication_date | score | title | summary |
|---|---|---|---|---|---|
| 0 | 3cXgIYkBfxlbyhU5Krfc | 2019-10-29 | 0.0 | The Pragmatic Programmer: Your Journey to Mastery | A guide to pragmatic programming for software ... |
| 1 | 3sXgIYkBfxlbyhU5Krfc | 2019-05-03 | 0.0 | Python Crash Course | A fast-paced, no-nonsense guide to programming... |
| 2 | 4MXgIYkBfxlbyhU5Krfc | 2008-08-11 | 0.0 | Clean Code: A Handbook of Agile Software Craft... | A guide to writing code that is easy to read, ... |
| 3 | 48XgIYkBfxlbyhU5Krfd | 1994-10-31 | 0.0 | Design Patterns: Elements of Reusable Object-O... | Guide to design patterns that can be used in a... |
| 4 | 5MXgIYkBfxlbyhU5Krfd | 2011-05-13 | 0.0 | The Clean Coder: A Code of Conduct for Profess... | A guide to professional conduct in the field o... |
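In practice, `must` and `filter` are often combined: the `must` clause ranks the results, while the `filter` clause narrows them without touching the score. The following sketch assumes the `summary` and `publish_date` fields used throughout this notebook; the helper name `scored_with_filter` and the cutoff date are illustrative only.

```python
def scored_with_filter(text, published_since):
    """Rank by a match on summary, restricted by a publish_date filter (hypothetical helper)."""
    return {
        "bool": {
            # Contributes to _score.
            "must": [{"match": {"summary": text}}],
            # Only includes or excludes documents; does not affect _score.
            "filter": [{"range": {"publish_date": {"gte": published_since}}}],
        }
    }

query = scored_with_filter("guide", "2010-01-01")
print(query)

# With a live cluster:
# response = client.search(index="book_index", query=query)
# pd.DataFrame.from_records(pretty_response_transform(response))
```

Unlike the filter-only query above, hits from a query of this shape would carry non-zero scores, because the `must` clause is evaluated in query context.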
