# Synonyms API quick start

<a target="_blank" href="https://colab.research.google.com/github/elastic/elasticsearch-labs/blob/main/notebooks/search/06-synonyms-api.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

This interactive notebook will introduce you to the Synonyms API ([blog post](https://www.elastic.co/blog/update-synonyms-elasticsearch-introducing-synonyms-api), [API documentation](https://www.elastic.co/guide/en/elasticsearch/reference/current/synonyms-apis.html)) using the official [Elasticsearch Python client](https://www.elastic.co/guide/en/elasticsearch/client/python-api/current/connecting.html). Synonyms allow you to enhance search relevancy by defining relationships between terms that have similar meanings. In this notebook, you'll create and update synonyms sets, configure an index to use synonyms, and run queries that leverage synonyms for enhanced relevancy.

## Create Elastic Cloud deployment

If you don't have an Elastic Cloud deployment, sign up [here](https://cloud.elastic.co/registration?utm_source=github&utm_content=elasticsearch-labs-notebook) for a free trial.

Once logged in to your Elastic Cloud account, go to the [Create deployment](https://cloud.elastic.co/deployments/create) page and select **Create deployment**. Leave all settings with their default values.

## Install packages and import modules

To get started, we'll need to connect to our Elastic deployment using the Python client.
Because we're using an Elastic Cloud deployment, we'll use the **Cloud ID** to identify our deployment.

First we need to install the `elasticsearch` Python client.

In [None]:
!pip install -qU elasticsearch

## Initialize the Elasticsearch client

Now we can instantiate the [Elasticsearch Python client](https://www.elastic.co/guide/en/elasticsearch/client/python-api/current/index.html), providing the Cloud ID and an API key for your deployment.

In [None]:
from elasticsearch import Elasticsearch
from getpass import getpass

# https://www.elastic.co/search-labs/tutorials/install-elasticsearch/elastic-cloud#finding-your-cloud-id
ELASTIC_CLOUD_ID = getpass("Elastic Cloud ID: ")

# https://www.elastic.co/search-labs/tutorials/install-elasticsearch/elastic-cloud#creating-an-api-key
ELASTIC_API_KEY = getpass("Elastic API Key: ")

# Create the client instance
client = Elasticsearch(
    # For local development
    # hosts=["http://localhost:9200"] 
    cloud_id=ELASTIC_CLOUD_ID,
    api_key=ELASTIC_API_KEY,
)

If you're running Elasticsearch locally or self-managed, you can pass in the Elasticsearch host instead. [Read more](https://www.elastic.co/guide/en/elasticsearch/client/python-api/current/connecting.html#_verifying_https_with_certificate_fingerprints_python_3_10_or_later) on how to connect to Elasticsearch locally.
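
As a minimal sketch of the self-managed case, the helper below only assembles keyword arguments for `Elasticsearch(...)`. The function name `local_connection_kwargs`, the default host URL, and the use of the built-in `elastic` user are assumptions for illustration, so adjust them to match your own setup.

```python
from typing import Any, Dict, Optional


def local_connection_kwargs(
    host: str = "https://localhost:9200",
    password: Optional[str] = None,
    cert_fingerprint: Optional[str] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for Elasticsearch(...) against a self-managed node."""
    kwargs: Dict[str, Any] = {"hosts": [host]}
    if password is not None:
        # Assumption: the built-in "elastic" user is used for local testing.
        kwargs["basic_auth"] = ("elastic", password)
    if cert_fingerprint is not None:
        # Fingerprint of the HTTP CA certificate, used to verify HTTPS.
        kwargs["ssl_assert_fingerprint"] = cert_fingerprint
    return kwargs
```

With a node running, something like `client = Elasticsearch(**local_connection_kwargs(password=getpass("Password: ")))` would replace the Cloud ID based client above.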

Confirm that the client has connected with this test.

In [None]:
print(client.info())

## Configure & populate the index

Our client is set up and connected to our Elastic deployment. Now we need to configure the index that will store our test data and populate it with some documents. We'll use a small index of books with the following fields:

- `title`
- `summary`
- `authors`
- `publish_date`
- `num_reviews`
- `publisher`

### Create synonyms set

Let's create our initial synonyms set first.

In [None]:
synonyms_set = [
    {
        "id": "synonym-1",
        "synonyms": "js, javascript, java script"
    }
]

client.synonyms.put_synonym(id="my-synonyms-set", synonyms_set=synonyms_set)
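
Each rule's `synonyms` value is a single string in Solr synonym format. As a rough sketch, the hypothetical helpers below (`equivalent_rule` and `explicit_rule` are not part of the client) build rule dictionaries from Python lists, which may be easier to maintain than hand-written strings. The `=>` form is the Solr notation for explicit one-way mappings.

```python
from typing import Dict, Iterable


def _clean(terms: Iterable[str]) -> list:
    """Strip whitespace and drop empty terms."""
    return [t.strip() for t in terms if t.strip()]


def equivalent_rule(rule_id: str, terms: Iterable[str]) -> Dict[str, str]:
    """Rule in which all terms are interchangeable, e.g. "js, javascript"."""
    cleaned = _clean(terms)
    if len(cleaned) < 2:
        raise ValueError("an equivalent synonym rule needs at least two terms")
    return {"id": rule_id, "synonyms": ", ".join(cleaned)}


def explicit_rule(
    rule_id: str, sources: Iterable[str], targets: Iterable[str]
) -> Dict[str, str]:
    """Rule that maps the source terms onto the target terms ("a, b => c")."""
    left, right = _clean(sources), _clean(targets)
    if not left or not right:
        raise ValueError("an explicit rule needs terms on both sides")
    return {"id": rule_id, "synonyms": f"{', '.join(left)} => {', '.join(right)}"}
```

For example, `[equivalent_rule("synonym-1", ["js", "javascript", "java script"])]` produces the same `synonyms_set` list used above.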

### Configure the index

Ensure that you do not have a previously created index with the name `book_index`.

In [None]:
client.indices.delete(index="book_index", ignore_unavailable=True)

🔐 NOTE: at any time you can come back to this section and run the `delete` function above to remove your index and start from scratch.



In order to use synonyms, we need to define a [custom analyzer](https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-custom-analyzer.html) that uses the [`synonym`](https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-synonym-tokenfilter.html) or [`synonym_graph`](https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-synonym-graph-tokenfilter.html) token filter. Let's create an index that's configured to use an appropriate custom analyzer.


In [None]:
settings = {
    "analysis": {
        "analyzer": {
            "my_custom_index_analyzer": {
                "tokenizer": "standard",
                "filter": [
                    "lowercase"
                ]
            },
            "my_custom_search_analyzer": {
                "tokenizer": "standard",
                "filter": [
                    "lowercase",
                    "my_synonym_filter"
                ]
            }
        },
        "filter": {
            "my_synonym_filter": {
                "type": "synonym_graph",
                "synonyms_set": "my-synonyms-set",
                "updateable": True
            }
        }
    }
}

mappings = {
    "properties": {
        "title": {
            "type": "text",
            "analyzer": "my_custom_index_analyzer",
            "search_analyzer": "my_custom_search_analyzer"
        },
        "summary": {
            "type": "text",
            "analyzer": "my_custom_index_analyzer",
            "search_analyzer": "my_custom_search_analyzer"
        }
    }
}

client.indices.create(index='book_index', mappings=mappings, settings=settings)

There are a few things to note in the configuration:

- We are using the [`synonym_graph` token filter](https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-synonym-graph-tokenfilter.html).
- We have defined two analyzers: `my_custom_index_analyzer` and `my_custom_search_analyzer`. `my_custom_search_analyzer` is used as a [search analyzer](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-analyzer.html).
- `my_synonym_filter` is used only in `my_custom_search_analyzer`.

The `synonym_graph` token filter allows us to use multi-word synonyms. It should be applied only at search time, which is why it appears only in `my_custom_search_analyzer`. Because the filter is marked `updateable` and synonyms are applied only at search time, we can update them without reindexing.

See [_The same, but different: Boosting the power of Elasticsearch with synonyms_](https://www.elastic.co/blog/boosting-the-power-of-elasticsearch-with-synonyms) for more background information about search-time synonyms.
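
One way to check that the search analyzer picks up the synonyms set is the analyze API, which returns the tokens an analyzer produces for a piece of text. The helper below is a sketch: `search_time_tokens` is a hypothetical name, and it assumes the index and analyzer names from the configuration above.

```python
from typing import Any, List, Tuple


def search_time_tokens(
    client: Any,
    text: str,
    index: str = "book_index",
    analyzer: str = "my_custom_search_analyzer",
) -> List[Tuple[int, str]]:
    """Return (position, token) pairs emitted by the search analyzer for `text`."""
    response = client.indices.analyze(index=index, analyzer=analyzer, text=text)
    # Each entry in "tokens" describes one emitted token and its position.
    return [(tok["position"], tok["token"]) for tok in response["tokens"]]
```

Calling `search_time_tokens(client, "js")` after the index exists should list the synonym tokens alongside the original term. Running the same text through `my_custom_index_analyzer` should return only the lowercased input, since that analyzer has no synonym filter.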

### Populate the index

Run the following code to upload some test data, containing information about 10 popular programming books from this [dataset](https://raw.githubusercontent.com/elastic/elasticsearch-labs/main/notebooks/search/data.json).

In [None]:
import json
from urllib.request import urlopen

url = "https://raw.githubusercontent.com/elastic/elasticsearch-labs/main/notebooks/search/data.json"
response = urlopen(url)
books = json.loads(response.read())

operations = []
for book in books:
    operations.append({"index": {"_index": "book_index"}})
    operations.append(book)
client.bulk(index="book_index", operations=operations, refresh=True)
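
A bulk request can succeed as a whole while individual documents fail, so it can be worth inspecting the response. The sketch below assumes the response of `client.bulk(...)` has been captured in a variable. `bulk_failures` is a hypothetical helper that reads the `errors` and `items` fields of the bulk response.

```python
from typing import Any, Dict, List, Mapping


def bulk_failures(bulk_response: Mapping[str, Any]) -> List[Dict[str, Any]]:
    """Collect per-document errors from a bulk API response."""
    if not bulk_response["errors"]:
        return []
    failures = []
    for item in bulk_response["items"]:
        # Each item has a single key naming the action ("index", "create", ...).
        for action, result in item.items():
            if "error" in result:
                failures.append(
                    {
                        "action": action,
                        "id": result.get("_id"),
                        "error": result["error"],
                    }
                )
    return failures
```

For instance, `bulk_failures(client.bulk(index="book_index", operations=operations, refresh=True))` returns an empty list when every document was indexed.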

## Aside: Pretty printing Elasticsearch search results

Your `search` API calls will return hard-to-read nested JSON.
We'll create a little function called `pretty_search_response` to return nice, human-readable outputs from our examples.

In [None]:
def pretty_search_response(response):
    if len(response['hits']['hits']) == 0:
        print('Your search returned no results.')
    else:
        for hit in response['hits']['hits']:
            id = hit['_id']
            publication_date = hit['_source']['publish_date']
            score = hit['_score']
            title = hit['_source']['title']
            summary = hit['_source']['summary']
            publisher = hit["_source"]["publisher"]
            num_reviews = hit["_source"]["num_reviews"]
            authors = hit["_source"]["authors"]
            pretty_output = (f"\nID: {id}\nPublication date: {publication_date}\nTitle: {title}\nSummary: {summary}\nPublisher: {publisher}\nReviews: {num_reviews}\nAuthors: {authors}\nScore: {score}")
            print(pretty_output)

## Run queries

Let's use our synonyms in some Elasticsearch queries. We'll start by searching for books about JavaScript.

In [None]:
response = client.search(
    index="book_index",
    query={
        "multi_match": {
            "query": "java script",
            "fields": [
                "title^10",
                "summary",
            ]
        }
    }
)

pretty_search_response(response)


ID: 3NfpXIsBGHjk6-WLlqOE
Publication date: 2018-12-04
Title: Eloquent JavaScript
Summary: A modern introduction to programming
Publisher: no starch press
Reviews: 38
Authors: ['marijn haverbeke']
Score: 20.307524

ID: 29fpXIsBGHjk6-WLlqOE
Publication date: 2015-03-27
Title: You Don't Know JS: Up & Going
Summary: Introduction to JavaScript and programming as a whole
Publisher: oreilly
Reviews: 36
Authors: ['kyle simpson']
Score: 19.787104

ID: 39fpXIsBGHjk6-WLlqOE
Publication date: 2008-05-15
Title: JavaScript: The Good Parts
Summary: A deep dive into the parts of JavaScript that are essential to writing maintainable code
Publisher: oreilly
Reviews: 51
Authors: ['douglas crockford']
Score: 17.064087


Notice that even though we searched for the term "java script", we got results containing the terms "JS" and "JavaScript". Our synonyms are working!

Now let's try searching for books about AI.

In [None]:
response = client.search(
    index="book_index",
    query={
        "multi_match": {
            "query": "AI",
            "fields": [
                "title^10",
                "summary",
            ]
        }
    }
)

pretty_search_response(response)

Your search returned no results.


We didn't get any results! There are some books that use the terms "artificial intelligence", but not "AI". Let's try using the Synonyms API to add a new synonym rule for "AI" so the previous query returns results.

In [None]:
client.synonyms.put_synonym_rule(set_id="my-synonyms-set", rule_id="synonym-2", synonyms="ai, artificial intelligence")
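
Updating a rule should trigger a reload of the search analyzers that use the synonyms set. The sketch below summarizes which analyzers were reloaded. It assumes the response carries a `reload_analyzers_details` object with a `reload_details` list, as described in the Synonyms API documentation; `reloaded_analyzers` is a hypothetical helper, and the field names are worth checking against your Elasticsearch version.

```python
from typing import Any, Dict, List, Mapping


def reloaded_analyzers(response: Mapping[str, Any]) -> Dict[str, List[str]]:
    """Map each index name to the analyzers reloaded after a synonyms update."""
    if "reload_analyzers_details" not in response:
        # Assumption: older or different responses may omit this section.
        return {}
    details = response["reload_analyzers_details"].get("reload_details", [])
    return {d["index"]: d.get("reloaded_analyzers", []) for d in details}
```

If the response of the `put_synonym_rule` call above is stored in a variable, passing it to `reloaded_analyzers` should show `book_index` together with `my_custom_search_analyzer`.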

If we run the query again, we should now get some results.

In [None]:
response = client.search(
    index="book_index",
    query={
        "multi_match": {
            "query": "AI",
            "fields": [
                "title^10",
                "summary",
            ]
        }
    }
)

pretty_search_response(response)


ID: 2dfpXIsBGHjk6-WLlqOE
Publication date: 2020-04-06
Title: Artificial Intelligence: A Modern Approach
Summary: Comprehensive introduction to the theory and practice of artificial intelligence
Publisher: pearson
Reviews: 39
Authors: ['stuart russell', 'peter norvig']
Score: 42.500813


## Conclusion

The Synonyms API allows you to dynamically create & modify the synonyms used in your search index in real time. After reading this notebook, you should have all you need to start integrating the Synonyms API into your search experience!