# Search using query rules

<a target="_blank" href="https://colab.research.google.com/github/kderusso/elasticsearch-labs/blob/kderusso/query-rules-notebook/notebooks/search/05-query-rules.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

This interactive notebook introduces you to query rules, using the official [Elasticsearch Python client](https://www.elastic.co/guide/en/elasticsearch/client/python-api/current/connecting.html).
You'll store query rules in Elasticsearch using the [query rules API](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-rules-apis.html) and query them using [rule_query](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-rule-query.html).

## Create Elastic Cloud deployment

If you don't have an Elastic Cloud deployment, sign up [here](https://cloud.elastic.co/registration?fromURI=%2Fhome) for a free trial.

- Go to the [Create deployment](https://cloud.elastic.co/deployments/create) page
   - Select **Create deployment**

## Install packages and import modules

To get started, we'll need to connect to our Elastic deployment using the Python client.
Because we're using an Elastic Cloud deployment, we'll use the **Cloud ID** to identify our deployment.

First we need to install the `elasticsearch` Python client.

In [None]:
!pip install -qU elasticsearch

## Initialize the Elasticsearch client

Now we can instantiate the [Elasticsearch Python client](https://www.elastic.co/guide/en/elasticsearch/client/python-api/current/index.html), providing the Cloud ID and password for your deployment.

In [30]:
from elasticsearch import Elasticsearch
from getpass import getpass

CLOUD_ID = getpass("Elastic Cloud ID")
CLOUD_PASSWORD = getpass("Elastic Password")

# Create the client instance
client = Elasticsearch(
    cloud_id=CLOUD_ID,
    basic_auth=("elastic", CLOUD_PASSWORD)
)

Elastic Cloud ID··········
Elastic Password··········


If you're running Elasticsearch locally or self-managed, you can pass in the Elasticsearch host instead. [Read more](https://www.elastic.co/guide/en/elasticsearch/client/python-api/current/connecting.html#_verifying_https_with_certificate_fingerprints_python_3_10_or_later) on how to connect to Elasticsearch locally.
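
For a local or self-managed cluster, the connection arguments might look like the sketch below. The helper name `local_client_kwargs`, the host URL, and the placeholder password are assumptions for illustration. `hosts`, `basic_auth`, and `ssl_assert_fingerprint` are keyword arguments of the Python client, and the fingerprint check needs Python 3.10 or later.

```python
# Sketch: build keyword arguments for a local or self-managed connection.
# The host URL, password and fingerprint used here are placeholders.
def local_client_kwargs(host, password, username="elastic", fingerprint=None):
    kwargs = {"hosts": [host], "basic_auth": (username, password)}
    if fingerprint is not None:
        # Verify the server certificate by its SHA-256 fingerprint
        kwargs["ssl_assert_fingerprint"] = fingerprint
    return kwargs


kwargs = local_client_kwargs("https://localhost:9200", "<your-password>")
print(kwargs["hosts"])

# With the elasticsearch package installed, the client would be created with:
# from elasticsearch import Elasticsearch
# client = Elasticsearch(**kwargs)
```

The helper only assembles a dictionary, so nothing is sent to the cluster until the client makes its first request.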

Confirm that the client has connected with this test.

In [31]:
print(client.info())

{'name': 'instance-0000000000', 'cluster_name': '1a56ad21587c44d3930932eb9fa1d8e8', 'cluster_uuid': 'gX4zlwtlR4qhZpp1SPm4Yg', 'version': {'number': '8.8.2', 'build_flavor': 'default', 'build_type': 'docker', 'build_hash': '98e1271edf932a480e4262a471281f1ee295ce6b', 'build_date': '2023-06-26T05:16:16.196344851Z', 'build_snapshot': False, 'lucene_version': '9.6.0', 'minimum_wire_compatibility_version': '7.17.0', 'minimum_index_compatibility_version': '7.0.0'}, 'tagline': 'You Know, for Search'}


## Index some test data

Our client is set up and connected to our Elastic deployment.
Now we need some data to test query rules against.
We'll use a small index of products with the following fields:

- `name`
- `description`
- `price`
- `currency`
- `plug_type`
- `voltage`

### Index test data

Run the following command to upload some sample data.

In [None]:
import json
from urllib.request import urlopen

url = "https://raw.githubusercontent.com/kderusso/elasticsearch-labs/kderusso/query-rules-notebook/notebooks/search/query-rules-data.json"
response = urlopen(url)
docs = json.loads(response.read())

actions = []
for doc in docs:
    actions.append({"index": {"_index": "products_index", "_id": doc["id"]}})
    actions.append(doc["content"])
client.bulk(index="products_index", operations=actions)
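
The loop above builds the bulk request as alternating entries: an action line naming the target index and document ID, followed by the document source. The sketch below shows that layout on two made-up documents. The helper name `build_bulk_operations` and the sample field values are hypothetical; only the `id` / `content` shape comes from the cell above.

```python
# Hypothetical documents in the same {"id": ..., "content": {...}} shape
# that the loading cell expects; the field values are invented.
sample_docs = [
    {"id": "us1", "content": {"name": "Example charger (US)", "currency": "USD"}},
    {"id": "uk1", "content": {"name": "Example charger (UK)", "currency": "GBP"}},
]


def build_bulk_operations(docs, index_name):
    """Return the flat list of action/source pairs used by the bulk API."""
    operations = []
    for doc in docs:
        # Action line: which index and which document ID to write
        operations.append({"index": {"_index": index_name, "_id": doc["id"]}})
        # Source line: the document body itself
        operations.append(doc["content"])
    return operations


operations = build_bulk_operations(sample_docs, "products_index")
for entry in operations:
    print(entry)
```

Each document contributes exactly two entries, so the list is always twice as long as the input.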


First, let's search our data for a reliable wireless charger. (We'll include a `pretty_response` function to make the hard-to-read JSON output more readable.)

In [None]:
def pretty_response(response):
    # Print the ID, source fields and relevance score of each hit
    for hit in response["hits"]["hits"]:
        doc_id = hit["_id"]
        score = hit["_score"]
        source = hit["_source"]
        pretty_output = (
            f"\nID: {doc_id}"
            f"\nName: {source['name']}"
            f"\nDescription: {source['description']}"
            f"\nPrice: {source['price']}"
            f"\nCurrency: {source['currency']}"
            f"\nPlug type: {source['plug_type']}"
            f"\nVoltage: {source['voltage']}"
            f"\nScore: {score}"
        )
        print(pretty_output)

response = client.search(index="products_index", body={
    "query": {
      "multi_match": {
          "query": "reliable wireless charger for iPhone",
          "fields": [ "name^5", "description" ]
      }
    }
})

pretty_response(response)

As we can see from the response, the European result is ranked first. This might not be desirable if, for example, we know that the searcher is in the US or the UK, where plugs and electrical specifications differ.

Query rules can help here!

## Creating rules

Let's assume that we separately know which country our users are in (perhaps from IP address geolocation or from logged-in user account information). Now we want to create query rules that pin the right wireless charger for that country whenever people search for anything containing the phrase `wireless charger`.

In [None]:
client.query_ruleset.put(ruleset_id="promotion-rules", rules=[
    {
      "rule_id": "us-charger",
      "type": "pinned",
      "criteria": [
        {
          "type": "contains",
          "metadata": "my_query",
          "values": ["wireless charger"]
        },
        {
          "type": "exact",
          "metadata": "country",
          "values": ["us"]
        }
      ],
      "actions": {
        "ids": [
          "us1"
        ]
      }
    },
    {
      "rule_id": "uk-charger",
      "type": "pinned",
      "criteria": [
        {
          "type": "contains",
          "metadata": "my_query",
          "values": ["wireless charger"]
        },
        {
          "type": "exact",
          "metadata": "country",
          "values": ["uk"]
        }
      ],
      "actions": {
        "ids": [
          "uk1"
        ]
      }
    }
  ])

A rule matches only when all of its criteria are met, so for a document to be pinned one of the following must be true:

- `my_query` contains the string "wireless charger" *AND* `country` is "us"
- `my_query` contains the string "wireless charger" *AND* `country` is "uk"
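
The matching logic described above can be modeled locally. This is a simplified sketch, not Elasticsearch's implementation: the function names `criterion_matches` and `pinned_ids` are hypothetical, and the plain case-sensitive substring check is an assumption about how `contains` behaves.

```python
# The same two rules as in the ruleset above, reduced to what matching needs
rules = [
    {
        "rule_id": "us-charger",
        "criteria": [
            {"type": "contains", "metadata": "my_query", "values": ["wireless charger"]},
            {"type": "exact", "metadata": "country", "values": ["us"]},
        ],
        "actions": {"ids": ["us1"]},
    },
    {
        "rule_id": "uk-charger",
        "criteria": [
            {"type": "contains", "metadata": "my_query", "values": ["wireless charger"]},
            {"type": "exact", "metadata": "country", "values": ["uk"]},
        ],
        "actions": {"ids": ["uk1"]},
    },
]


def criterion_matches(criterion, match_criteria):
    """Check one criterion against the supplied match criteria."""
    value = match_criteria.get(criterion["metadata"])
    if value is None:
        return False
    if criterion["type"] == "exact":
        return value in criterion["values"]
    if criterion["type"] == "contains":
        # Assumption: simple substring test, no normalization
        return any(candidate in value for candidate in criterion["values"])
    raise ValueError(f"criterion type not modeled: {criterion['type']}")


def pinned_ids(rules, match_criteria):
    """Collect pinned IDs from every rule whose criteria ALL match."""
    ids = []
    for rule in rules:
        if all(criterion_matches(c, match_criteria) for c in rule["criteria"]):
            ids.extend(rule["actions"]["ids"])
    return ids


print(pinned_ids(rules, {"my_query": "reliable wireless charger for iPhone", "country": "us"}))
```

Changing `country` to `"uk"` selects the other rule, and a country with no rule (or a query without the phrase) yields an empty list, leaving the organic ranking untouched.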

We can view our ruleset using the API as well (with another `pretty_ruleset` function for readability):

In [None]:
def pretty_ruleset(response):
    # Print each rule's ID, type, criteria and pinned document IDs
    print("Ruleset ID: " + response['ruleset_id'])
    for rule in response['rules']:
        rule_id = rule['rule_id']
        rule_type = rule['type']
        print(f"\nRule ID: {rule_id}\n\tType: {rule_type}\n\tCriteria:")
        criteria = rule['criteria']
        for rule_criteria in criteria:
            criteria_type = rule_criteria['type']
            metadata = rule_criteria['metadata']
            values = rule_criteria['values']
            print(f"\t\t{metadata} {criteria_type} {values}")
        ids = rule['actions']['ids']
        print(f"\tPinned ids: {ids}")
        
response = client.query_ruleset.get(ruleset_id="promotion-rules")
pretty_ruleset(response)

Next, we use `rule_query` to run the same organic query as above, this time with our query rules applied. The values in `match_criteria` are what each rule's criteria are evaluated against:

In [None]:
response = client.search(index="products_index", body={
    "query": {
        "rule_query": {
            "organic": {
                "multi_match": {
                    "query": "reliable wireless charger for iPhone",
                    "fields": [ "name^5", "description" ]
                }
            },
            "match_criteria": {
              "my_query": "reliable wireless charger for iPhone",
              "country": "us"
            },
            "ruleset_id": "promotion-rules"
        }
    }
})

pretty_response(response)

Now the rule query pins the document we want (`us1` for a searcher in the US) to the top of the results!
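
To try other countries without retyping the request, the query body can be assembled by a small helper. This is a minimal sketch: `build_rule_query` is a hypothetical name, and it assumes the same text should serve as both the organic query and the `my_query` match criterion, as in the search above.

```python
def build_rule_query(query_text, country, ruleset_id="promotion-rules"):
    """Build a rule_query body that reuses query_text for the organic
    query and for the my_query match criterion."""
    return {
        "rule_query": {
            "organic": {
                "multi_match": {
                    "query": query_text,
                    "fields": ["name^5", "description"],
                }
            },
            "match_criteria": {"my_query": query_text, "country": country},
            "ruleset_id": ruleset_id,
        }
    }


uk_query = build_rule_query("reliable wireless charger for iPhone", "uk")
print(uk_query["rule_query"]["match_criteria"])

# With the client from earlier in the notebook:
# response = client.search(index="products_index", query=uk_query)
# pretty_response(response)
```

With `country` set to `"uk"`, the `uk-charger` rule should apply instead, so `uk1` would be the pinned document.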