# Showcase - Google Cloud Enterprise Search - Structured Search Engine
_Structured data is stored in Google Cloud Enterprise Search with a data schema in a JSON-like format_


---


* Authors: Nutchanon Leelapornudom (nutchanon@google.com)
* Created: 28/07/2023
* Last Updated: 03/08/2023

---

## Objective

This notebook shows how to use the **Google Cloud Enterprise Search API** with a **Structured Search Engine** in **Generative AI App Builder**.

We will walk through the Enterprise Search features exposed by the API, which you can call via the REST API, the RPC API, or the Cloud Client Libraries (SDK).

With these features, you can integrate Google Cloud Enterprise Search into your application to add intelligent search to your systems.

**Google Cloud Enterprise Search - Structured Search Engine provides the following features**

Search/Query features (covered in this notebook):
1. Semantic Search
2. Result Filtering (Post-Filtering)
3. Result Ordering
4. Spell Correction
5. Query Expansion
6. Boost and Bury
7. Dynamic Facets
8. Search Autocomplete

Engine configuration and data management features (not covered in this notebook):
1. Search Engine Management (Provisioning, Delete)
2. Schema Management (Metadata, Data Type, Field Semantic Meaning)
3. Data Management (Import, List, Delete, Purge)
4. Field Management (Retrievable, Searchable, Indexable, Completable)

Features that exist only in the console and widget (not covered in this notebook):
* Metrics Analytics
* User events
* Result Feedback

Other features for architects (not covered in this notebook):
* Security control
* Serverless operations

In order to run this notebook you must have access to Google Cloud Enterprise Search.

In this example, we use the Kaggle Movie data ([link](gs://cloud-samples-data/gen-app-builder/search/kaggle_movies)), a public dataset, on Google Cloud Storage.

This notebook is tested on Google Colab.

Users may wish to:
1. Search structured data (JSON documents) by semantic meaning
2. Filter the search results for a specific business purpose
3. Order the search results before showing them in the user interface
4. Get relevant facets based on the search results for further filtering
5. Get autocomplete suggestions in the search box

---

This notebook works through the following example:

- ✅ Example of searching structured data on Enterprise Search with the Python SDK

---

**References:**

- [Google Cloud Enterprise Search Documentation](https://cloud.google.com/generative-ai-app-builder/docs/enterprise-search-introduction)

## Download and install required SDKs
Additional packages required to run this notebook:
* langchain
* google-cloud-discoveryengine

**Note:** Remember to restart the kernel after installing the packages.

In [None]:
required_restart = False

# Install langchain
try:
  import langchain
except ImportError:
  ! pip install langchain==0.0.236

# Install the Enterprise Search (Discovery Engine) SDK
try:
    from google.cloud import discoveryengine_v1beta
except ImportError:
    ! pip install google-cloud-discoveryengine
    required_restart = True

# Report whether a restart is needed after package installation
print("Do I need to restart the runtime ?: {}".format(required_restart))

Do I need to restart the runtime ?: False


---

#### ⚠️ Do not forget to click the "RESTART RUNTIME" button above.

---

## Authentication to the platform

If you are running in Colab, authenticate with `google.colab.auth`; otherwise, the notebook assumes it is running on Vertex AI Workbench.

In [None]:
import sys

if 'google.colab' in sys.modules:
  from google.colab import auth as google_auth
  google_auth.authenticate_user()

## Configure Google Cloud project
Configure your Google Cloud project to use Gen App Builder and Cloud Storage.

You need the role below to run this notebook. If you reuse this code in a production system, apply fine-grained access control to reduce security risk.
* Discovery Engine Admin

**Note:** At the time this notebook was published, your Google Cloud project may need to be allowlisted to access the Gen App Builder platform. Please contact your Google Cloud account team to request access.

In [None]:
GOOGLE_CLOUD_PROJECT = '<google_cloud_project_id>' #@param {"type": "string"}
GOOGLE_CLOUD_REGION = 'global' #@param {"type": "string"}

## Optional packages that will be used later
You do not need to install these packages, but without them you will have to skip some of the functionality in this notebook.

In [None]:
# pandas DataFrame for pretty debug
try:
  import pandas as pd
except ImportError:
  ! pip install pandas

## Extended GoogleCloudEnterpriseSearchRetriever class from official "langchain.retrievers"
The main Enterprise Search class, which wraps the search functionality.

In [None]:
import json
from typing import List, Any
from langchain.retrievers import GoogleCloudEnterpriseSearchRetriever
from google.cloud import discoveryengine_v1beta
from google.cloud.discoveryengine_v1beta.types import SearchResponse, Document
from google.protobuf.json_format import MessageToDict
from google.cloud.discoveryengine_v1beta import CompletionServiceClient

class ExtendedGoogleCloudEnterpriseSearchRetriever(GoogleCloudEnterpriseSearchRetriever):
    """Extension of GoogleCloudEnterpriseSearchRetriever from langchain.retrievers."""
    # variables from GoogleCloudEnterpriseSearchRetriever
    _client = None
    _serving_config = None
    _project_id = None
    _search_engine_id = None
    def __init__(self, **data: Any) -> None:
        super().__init__(**data)
        self._client = self._client # get from GoogleCloudEnterpriseSearchRetriever
        self._serving_config = self._serving_config # get from GoogleCloudEnterpriseSearchRetriever
        self._project_id = self.project_id # get from GoogleCloudEnterpriseSearchRetriever
        self._search_engine_id = self.search_engine_id # get from GoogleCloudEnterpriseSearchRetriever
        self._complete_client = CompletionServiceClient()

    # additional state collected from the last search response
    _spell_correction: str = None
    _facets: dict = None

    # auto complete
    _complete_client: CompletionServiceClient = None

    # get relevant documents from structured search engine
    def get_relevant_structured_documents(self,
                                          query: str,
                                          filter: str = None,
                                          order_by: str = None,
                                          spell_correction: int = 2,
                                          query_expansion: int = 1,
                                          boost_spec: dict = None,
                                          facet_specs: list = None) -> List[dict]:
        # search on the engine
        request = discoveryengine_v1beta.SearchRequest(
            serving_config=self._serving_config,
            query=query,
            filter=filter,
            order_by=order_by,
            spell_correction_spec={'mode': spell_correction},
            query_expansion_spec={'condition': query_expansion},
            boost_spec=boost_spec,
            facet_specs=facet_specs,
        )
        res = self._client.search(request)

        # collect additional metadata from the response
        self._set_spell_correction(res.corrected_query)

        # set facets
        self._set_facets(res.facets)

        # return the results
        return self._structured_documents_formatting(res)

    def _structured_documents_formatting(self, res: SearchResponse) -> list:
        # formatting the response
        documents = []
        for result in res.results:
          data = MessageToDict(result.document._pb)
          content = data.get('structData', {})
          content['_id'] = data.get('id')
          documents.append(content)
        return documents

    def pretty_results(self, results: list, top: int = 5) -> pd.DataFrame:
        # show the first `top` results as a table
        # generic alternative: return pd.DataFrame(results).head(top)
        return self._kaggle_movies_pretty_results(results, top)

    def _kaggle_movies_pretty_results(self, results: list, top: int) -> pd.DataFrame:
        # keep only the columns of interest from the Kaggle Movies dataset
        raw_df = pd.DataFrame(results)
        df = raw_df[['id', 'title', 'status', 'release_date', 'runtime', 'revenue', 'vote_average', 'vote_count', 'overview', 'poster_path', 'genres']]
        return df.head(top)

    # metadata control for structured search engine
    def _set_spell_correction(self, corrected_query: str) -> None:
        self._spell_correction = corrected_query

    def get_spell_correction(self) -> str:
        return self._spell_correction

    def _set_facets(self, facets: list) -> None:
        # keep only each facet's key and the list of its values
        self._facets = [
            {'key': facet.key, 'values': [facet_value.value for facet_value in facet.values]}
            for facet in facets
        ]

    def get_facets(self) -> list:
        return self._facets

    def get_autocompletes(self, search_complete: str, query_model: str = "document", user_pseudo_id: str = "user-0001"):
        data_store="projects/{}/locations/global/collections/default_collection/dataStores/{}".format(self._project_id, self._search_engine_id)
        request = discoveryengine_v1beta.CompleteQueryRequest(
            data_store=data_store,
            query=search_complete,
            query_model=query_model,
            user_pseudo_id=user_pseudo_id,
        )
        res = self._complete_client.complete_query(request)

        # format the response
        suggestions = []
        for suggestion in res.query_suggestions:
          suggestions.append(suggestion.suggestion)

        return suggestions
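
The formatting step in `_structured_documents_formatting` can be tried without calling the service. The sketch below assumes the dictionaries come from `MessageToDict` and therefore use the `structData` and `id` keys that the class reads; `flatten_documents` and the sample input are hypothetical names and data used only for illustration.

```python
from typing import Any, Dict, List


def flatten_documents(raw_documents: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Merge each document's structData with its ID stored under '_id'."""
    flattened = []
    for data in raw_documents:
        # Copy so the input dictionaries are left untouched.
        content = dict(data.get("structData", {}))
        content["_id"] = data.get("id")
        flattened.append(content)
    return flattened


# Hypothetical input shaped like MessageToDict(result.document._pb).
sample = [{"id": "doc-1", "structData": {"title": "Aladdin", "vote_average": 7.4}}]
flat = flatten_documents(sample)
print(flat)
```

Unlike the method in the class, this version copies `structData` before adding `_id`, which is a small design choice that avoids mutating the input.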

## Gen App Builder - Enterprise Search - Structured Engine Configuration
Configure your Enterprise Search engine here. You need the full "Engine ID".

In [None]:
ENTERPRISE_SEARCH_ENGINE = "<enterprise_search_engine_id>" #@param {type: "string"}
retriever = ExtendedGoogleCloudEnterpriseSearchRetriever(
    project_id=GOOGLE_CLOUD_PROJECT, search_engine_id=ENTERPRISE_SEARCH_ENGINE
)
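
The Engine ID ends up inside a resource name. `get_autocompletes` in the class above builds the data store name with the location fixed to `global`. As a minimal sketch, the same string can be built with the location as a parameter; `data_store_name` is a hypothetical helper and the IDs are placeholders.

```python
def data_store_name(project_id: str, data_store_id: str, location: str = "global") -> str:
    """Build a data store resource name in the format used by get_autocompletes."""
    return (
        f"projects/{project_id}/locations/{location}"
        f"/collections/default_collection/dataStores/{data_store_id}"
    )


# Placeholder IDs for illustration only.
name = data_store_name("my-project", "my-engine-id")
print(name)
```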

## Test: Running a search on Enterprise Search - Structured Engine
Let's run a search and look at the results returned by the search engine.

The sample response is shown as a table using the pandas `DataFrame.head()` method.

In [None]:
SEARCH_QUERY = "harry potter" #@param {type: "string"}
res = retriever.get_relevant_structured_documents(SEARCH_QUERY)
df = retriever.pretty_results(res)
df

Unnamed: 0,id,title,status,release_date,runtime,revenue,vote_average,vote_count,overview,poster_path,genres
0,671,Harry Potter and the Philosopher's Stone,Released,2001-11-16,152.0,976475550.0,7.5,7188.0,Harry Potter has lived under the stairs at his...,https://image.tmdb.org/t/p/original/wuMc08IPKE...,"[{'name': 'Adventure', 'id': '12'}, {'name': '..."
1,767,Harry Potter and the Half-Blood Prince,Released,2009-07-07,153.0,933959197.0,7.4,5435.0,"As Harry begins his sixth year at Hogwarts, he...",https://image.tmdb.org/t/p/original/z7uo9zmQdQ...,"[{'name': 'Adventure', 'id': '12'}, {'name': '..."
2,672,Harry Potter and the Chamber of Secrets,Released,2002-11-13,161.0,876688482.0,7.4,5966.0,"Ignoring threats to his life, Harry returns to...",https://image.tmdb.org/t/p/original/sdEOH0992Y...,"[{'id': '12', 'name': 'Adventure'}, {'id': '14..."
3,675,Harry Potter and the Order of the Phoenix,Released,2007-06-28,138.0,938212738.0,7.4,5633.0,Returning for his fifth year of study at Hogwa...,https://image.tmdb.org/t/p/original/5aOyriWkPe...,"[{'id': '12', 'name': 'Adventure'}, {'id': '14..."
4,12444,Harry Potter and the Deathly Hallows: Part 1,Released,2010-10-17,146.0,954305868.0,7.5,5708.0,"Harry, Ron and Hermione walk away from their l...",https://image.tmdb.org/t/p/original/iGoXIpQb7P...,"[{'id': '12', 'name': 'Adventure'}, {'id': '14..."


## Walkthrough Search features
**Search/Query features:**
1. Semantic Search
2. Result Filtering (Post-Filtering)
3. Result Ordering
4. Spell Correction
5. Query Expansion
6. Boost and Bury
7. Dynamic Facets
8. Search Autocomplete

## 1. Semantic Search
[Semantic Search](https://en.wikipedia.org/wiki/Semantic_search) is a similarity search based on the meaning of the words and sentences in the query, rather than on exact keyword matches.

In [None]:
SEARCH_QUERY = "Marvel Studio movies" #@param {type: "string"}
res = retriever.get_relevant_structured_documents(SEARCH_QUERY)
df = retriever.pretty_results(res,10)
df

Unnamed: 0,id,title,status,release_date,runtime,revenue,vote_average,vote_count,overview,poster_path,genres
0,259910,Marvel Studios: Assembling a Universe,Released,2014-03-18,43.0,0.0,6.6,44.0,A look at the story behind Marvel Studios and ...,https://image.tmdb.org/t/p/original/5LLBKBH4ud...,"[{'id': '10770', 'name': 'TV Movie'}, {'id': '..."
1,299969,"Marvel: 75 Years, From Pulp to Pop!",Released,2014-11-04,41.0,0.0,7.7,22.0,In celebration of the publisher's 75th anniver...,https://image.tmdb.org/t/p/original/g716rAncfs...,"[{'id': '99', 'name': 'Documentary'}]"
2,99861,Avengers: Age of Ultron,Released,2015-04-22,141.0,1405404000.0,7.3,6908.0,When Tony Stark tries to jumpstart a dormant p...,https://image.tmdb.org/t/p/original/t90Y3G8UGQ...,"[{'name': 'Action', 'id': '28'}, {'name': 'Adv..."
3,284274,Iron Man & Captain America: Heroes United,Released,2014-07-29,71.0,0.0,5.8,21.0,Iron Man and Captain America battle to keep th...,https://image.tmdb.org/t/p/original/5li3ZIIPrj...,"[{'name': 'Adventure', 'id': '12'}, {'id': '16..."
4,230896,Iron Man & Hulk: Heroes United,Released,2013-12-03,71.0,0.0,5.4,48.0,The Invincible Iron Man and the Incredible Hul...,https://image.tmdb.org/t/p/original/4vPNRJtPjt...,"[{'name': 'Action', 'id': '28'}, {'name': 'Adv..."
5,102899,Ant-Man,Released,2015-07-14,117.0,519312000.0,7.0,6029.0,Armed with the astonishing ability to shrink i...,https://image.tmdb.org/t/p/original/D6e8RJf2qU...,"[{'id': '878', 'name': 'Science Fiction'}, {'n..."
6,284053,Thor: Ragnarok,Post Production,2017-10-25,0.0,0.0,0.0,0.0,Thor is imprisoned on the other side of the un...,https://image.tmdb.org/t/p/original/avy7IR8UMl...,"[{'id': '28', 'name': 'Action'}, {'id': '12', ..."
7,283995,Guardians of the Galaxy Vol. 2,Released,2017-04-19,137.0,863416100.0,7.6,4858.0,The Guardians must fight to keep their newfoun...,https://image.tmdb.org/t/p/original/y4MBh0EjBl...,"[{'id': '28', 'name': 'Action'}, {'name': 'Adv..."
8,36586,Blade II,Released,2002-03-22,117.0,155010000.0,6.3,1556.0,A rare mutation has occurred within the vampir...,https://image.tmdb.org/t/p/original/jlURNpXCMK...,"[{'name': 'Fantasy', 'id': '14'}, {'id': '27',..."


In [None]:
#@title Try another example
SEARCH_QUERY = "Disney Princess movies" #@param {type: "string"}
res = retriever.get_relevant_structured_documents(SEARCH_QUERY)
df = retriever.pretty_results(res)
df

Unnamed: 0,id,title,status,release_date,runtime,revenue,vote_average,vote_count,overview,poster_path,genres
0,14128,Cinderella II: Dreams Come True,Released,2002-02-23,74.0,0.0,5.5,265.0,"As a newly crowned princess, Cinderella quickl...",https://image.tmdb.org/t/p/original/36nD4aDvoF...,"[{'name': 'Family', 'id': '10751'}, {'name': '..."
1,15969,The Return of Jafar,Released,1994-12-15,69.0,0.0,5.7,447.0,The evil Jafar escapes from the magic lamp as ...,https://image.tmdb.org/t/p/original/sC4wDVBMPM...,"[{'id': '10751', 'name': 'Family'}, {'name': '..."
2,812,Aladdin,Released,1992-11-25,90.0,504050200.0,7.4,3495.0,Princess Jasmine grows tired of being forced t...,https://image.tmdb.org/t/p/original/7f53XAE4nP...,"[{'id': '16', 'name': 'Animation'}, {'id': '10..."
3,109445,Frozen,Released,2013-11-27,102.0,1274219000.0,7.3,5440.0,Young princess Anna of Arendelle dreams about ...,https://image.tmdb.org/t/p/original/jIjdFXKUNt...,"[{'name': 'Animation', 'id': '16'}, {'name': '..."
4,81,Nausicaä of the Valley of the Wind,Released,1984-03-11,117.0,3301446.0,7.7,808.0,"After a global war, the seaside kingdom known ...",https://image.tmdb.org/t/p/original/hnYowHwLq0...,"[{'id': '12', 'name': 'Adventure'}, {'id': '16..."


## 2. Result Filtering (Post-Filtering)
Result Filtering (Post-Filtering) - You can restrict the results to those that match your business requirements.

Think of it as the WHERE clause of a SQL statement, which controls which rows/documents your application receives.

In this example, we keep only **movies with a vote average score of 7.0 or higher** (the `i` suffix in `7.0i` marks the bound as inclusive).

The filter syntax is described in [Google Cloud Retail Filter](https://cloud.google.com/retail/docs/filter-and-order).
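
As a rough sketch of the numeric range syntax, the helper below assembles `field: IN(lower, upper)` expressions. It assumes, based on my reading of the Retail filter documentation, that an `i` suffix marks an inclusive bound, an `e` suffix an exclusive bound, and `*` an open bound. `numeric_range_filter` and `_bound` are hypothetical names, not part of any SDK.

```python
from typing import Optional


def _bound(value: Optional[float], inclusive: bool) -> str:
    """Render one bound: '*' when open, else the number plus an i/e suffix."""
    if value is None:
        return "*"
    return f"{value}{'i' if inclusive else 'e'}"


def numeric_range_filter(
    field: str,
    lower: Optional[float] = None,
    upper: Optional[float] = None,
    lower_inclusive: bool = True,
    upper_inclusive: bool = False,
) -> str:
    """Build a `field: IN(lower, upper)` numeric range filter expression."""
    low = _bound(lower, lower_inclusive)
    high = _bound(upper, upper_inclusive)
    return f"{field}: IN({low}, {high})"


at_least_7 = numeric_range_filter("vote_average", lower=7.0)
above_7 = numeric_range_filter("vote_average", lower=7.0, lower_inclusive=False)
print(at_least_7)
print(above_7)
```

The resulting string can be passed as the `filter` argument of `get_relevant_structured_documents`.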

In [None]:
#@title Filter only movies that have a vote average score of 7.0 or higher
SEARCH_QUERY = "Disney Princess movies" #@param {type: "string"}
FILTER_RESULT = "vote_average: IN(7.0i,*)" #@param {type: "string"}
res = retriever.get_relevant_structured_documents(SEARCH_QUERY, FILTER_RESULT)
df = retriever.pretty_results(res)
df

Unnamed: 0,id,title,status,release_date,runtime,revenue,vote_average,vote_count,overview,poster_path,genres
0,812,Aladdin,Released,1992-11-25,90.0,504050200.0,7.4,3495.0,Princess Jasmine grows tired of being forced t...,https://image.tmdb.org/t/p/original/7f53XAE4nP...,"[{'name': 'Animation', 'id': '16'}, {'name': '..."
1,109445,Frozen,Released,2013-11-27,102.0,1274219000.0,7.3,5440.0,Young princess Anna of Arendelle dreams about ...,https://image.tmdb.org/t/p/original/jIjdFXKUNt...,"[{'name': 'Animation', 'id': '16'}, {'id': '12..."
2,81,Nausicaä of the Valley of the Wind,Released,1984-03-11,117.0,3301446.0,7.7,808.0,"After a global war, the seaside kingdom known ...",https://image.tmdb.org/t/p/original/hnYowHwLq0...,"[{'id': '12', 'name': 'Adventure'}, {'name': '..."


## 3. Result Ordering
Result Ordering - You can sort the results by the fields/columns that you specify.

Think of it as the ORDER BY clause of a SQL statement.

By default, the results are ordered by the relevance score calculated by the search engine.
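
To make the ORDER BY analogy concrete, here is a local sketch of what ordering by `vote_average desc, revenue desc` means, using rows copied from the "Disney Princess movies" output shown earlier. It assumes the engine's ordering behaves like an ordinary multi-key sort; it illustrates the semantics and does not describe how the engine is implemented.

```python
# Rows in the default relevance order, copied from the earlier sample output.
rows = [
    {"title": "Cinderella II: Dreams Come True", "vote_average": 5.5, "revenue": 0.0},
    {"title": "The Return of Jafar", "vote_average": 5.7, "revenue": 0.0},
    {"title": "Aladdin", "vote_average": 7.4, "revenue": 504050200.0},
    {"title": "Frozen", "vote_average": 7.3, "revenue": 1274219000.0},
    {"title": "Nausicaä of the Valley of the Wind", "vote_average": 7.7, "revenue": 3301446.0},
]

# Negating both numeric keys gives a descending sort on each;
# revenue only matters when two vote averages are equal.
ordered = sorted(rows, key=lambda row: (-row["vote_average"], -row["revenue"]))
titles = [row["title"] for row in ordered]
print(titles)
```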

In [None]:
#@title Order the results by vote average score and revenue of movies in descending order
SEARCH_QUERY = "Disney Princess movies" #@param {type: "string"}
ORDER_BY = "vote_average desc, revenue desc" #@param {type: "string"}
res = retriever.get_relevant_structured_documents(query=SEARCH_QUERY, order_by=ORDER_BY)
df = retriever.pretty_results(res)
df

Unnamed: 0,id,title,status,release_date,runtime,revenue,vote_average,vote_count,overview,poster_path,genres
0,81,Nausicaä of the Valley of the Wind,Released,1984-03-11,117.0,3301446.0,7.7,808.0,"After a global war, the seaside kingdom known ...",https://image.tmdb.org/t/p/original/hnYowHwLq0...,"[{'id': '12', 'name': 'Adventure'}, {'id': '16..."
1,812,Aladdin,Released,1992-11-25,90.0,504050200.0,7.4,3495.0,Princess Jasmine grows tired of being forced t...,https://image.tmdb.org/t/p/original/7f53XAE4nP...,"[{'id': '16', 'name': 'Animation'}, {'id': '10..."
2,109445,Frozen,Released,2013-11-27,102.0,1274219000.0,7.3,5440.0,Young princess Anna of Arendelle dreams about ...,https://image.tmdb.org/t/p/original/jIjdFXKUNt...,"[{'name': 'Animation', 'id': '16'}, {'id': '12..."
3,15969,The Return of Jafar,Released,1994-12-15,69.0,0.0,5.7,447.0,The evil Jafar escapes from the magic lamp as ...,https://image.tmdb.org/t/p/original/sC4wDVBMPM...,"[{'id': '10751', 'name': 'Family'}, {'id': '12..."
4,14128,Cinderella II: Dreams Come True,Released,2002-02-23,74.0,0.0,5.5,265.0,"As a newly crowned princess, Cinderella quickl...",https://image.tmdb.org/t/p/original/36nD4aDvoF...,"[{'name': 'Family', 'id': '10751'}, {'name': '..."


## 4. Spell Correction
Spell Correction - The search engine's ability to understand what you actually meant to search for and to correct typos before running the query.

(0) MODE_UNSPECIFIED - Uses the engine default, which in this case is AUTO.

(1) SUGGESTION_ONLY - Tries to find a spelling suggestion, but does not use it as the search query.

(2) AUTO (default) - Automatic spell correction
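
The retriever in this notebook takes the mode as a bare integer. A small sketch of a more readable alternative is to mirror the values listed above in a local `IntEnum`; `SpellCorrectionMode` and `spell_correction_spec` are hypothetical names defined here for illustration, not SDK classes.

```python
from enum import IntEnum


class SpellCorrectionMode(IntEnum):
    """Local mirror of the numeric spell correction modes listed above."""

    MODE_UNSPECIFIED = 0
    SUGGESTION_ONLY = 1
    AUTO = 2


def spell_correction_spec(mode: SpellCorrectionMode) -> dict:
    """Build the dictionary the retriever passes as spell_correction_spec."""
    return {"mode": int(mode)}


spec = spell_correction_spec(SpellCorrectionMode.SUGGESTION_ONLY)
print(spec)
```

Because `IntEnum` members are integers, a member such as `SpellCorrectionMode.AUTO` can also be passed directly as the `spell_correction` argument.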

In [None]:
#@title Try to search "harry pettor", which is an intentional typo
SEARCH_QUERY = "harry pettor" #@param {type: "string"}
res = retriever.get_relevant_structured_documents(query=SEARCH_QUERY,spell_correction=2)
df = retriever.pretty_results(res)
print("Search Spell Correction: {}".format(retriever.get_spell_correction()))
df

Search Spell Correction: harry potter


Unnamed: 0,id,title,status,release_date,runtime,revenue,vote_average,vote_count,overview,poster_path,genres
0,671,Harry Potter and the Philosopher's Stone,Released,2001-11-16,152.0,976475550.0,7.5,7188.0,Harry Potter has lived under the stairs at his...,https://image.tmdb.org/t/p/original/wuMc08IPKE...,"[{'id': '12', 'name': 'Adventure'}, {'name': '..."
1,767,Harry Potter and the Half-Blood Prince,Released,2009-07-07,153.0,933959197.0,7.4,5435.0,"As Harry begins his sixth year at Hogwarts, he...",https://image.tmdb.org/t/p/original/z7uo9zmQdQ...,"[{'id': '12', 'name': 'Adventure'}, {'name': '..."
2,672,Harry Potter and the Chamber of Secrets,Released,2002-11-13,161.0,876688482.0,7.4,5966.0,"Ignoring threats to his life, Harry returns to...",https://image.tmdb.org/t/p/original/sdEOH0992Y...,"[{'name': 'Adventure', 'id': '12'}, {'id': '14..."
3,675,Harry Potter and the Order of the Phoenix,Released,2007-06-28,138.0,938212738.0,7.4,5633.0,Returning for his fifth year of study at Hogwa...,https://image.tmdb.org/t/p/original/5aOyriWkPe...,"[{'id': '12', 'name': 'Adventure'}, {'id': '14..."
4,12444,Harry Potter and the Deathly Hallows: Part 1,Released,2010-10-17,146.0,954305868.0,7.5,5708.0,"Harry, Ron and Hermione walk away from their l...",https://image.tmdb.org/t/p/original/iGoXIpQb7P...,"[{'name': 'Adventure', 'id': '12'}, {'name': '..."


## 5. Query Expansion
Query Expansion - Expands the search query with synonyms, related words, and so on, to return more results.

(0) CONDITION_UNSPECIFIED - Uses the engine default, which in this case is DISABLED.

(1) DISABLED (default) - Query expansion is disabled. Only the exact search query is used.

(2) AUTO - Automatic query expansion

In [None]:
#@title Disabled Query Expansion
SEARCH_QUERY = "movies released in 2020" #@param {type: "string"}
res = retriever.get_relevant_structured_documents(query=SEARCH_QUERY,query_expansion=1)
df = retriever.pretty_results(res)
df

Unnamed: 0,id,title,status,release_date,runtime,revenue,vote_average,vote_count,overview,poster_path,genres
0,10483,Death Race,Released,2008-08-22,105.0,73762516.0,6.0,1205.0,"Terminal Island, New York: 2020. Overcrowding ...",https://image.tmdb.org/t/p/original/3dIZ049Axm...,"[{'id': '28', 'name': 'Action'}, {'name': 'Thr..."
1,79611,Target,Released,2011-06-26,158.0,73000.0,6.4,5.0,"In the year 2020, a group of wealthy Moscovite...",https://image.tmdb.org/t/p/original/uMzkG73Y5q...,"[{'name': 'Fantasy', 'id': '14'}, {'id': '18',..."


In [None]:
#@title Enabled Query Expansion
SEARCH_QUERY = "movies released in 2020" #@param {type: "string"}
res = retriever.get_relevant_structured_documents(query=SEARCH_QUERY,query_expansion=2)
df = retriever.pretty_results(res)
df

Unnamed: 0,id,title,status,release_date,runtime,revenue,vote_average,vote_count,overview,poster_path,genres
0,10483,Death Race,Released,2008-08-22,105.0,73762516.0,6.0,1205.0,"Terminal Island, New York: 2020. Overcrowding ...",https://image.tmdb.org/t/p/original/3dIZ049Axm...,"[{'name': 'Action', 'id': '28'}, {'id': '53', ..."
1,79611,Target,Released,2011-06-26,158.0,73000.0,6.4,5.0,"In the year 2020, a group of wealthy Moscovite...",https://image.tmdb.org/t/p/original/uMzkG73Y5q...,"[{'id': '14', 'name': 'Fantasy'}, {'name': 'Dr..."
2,58419,2020 Texas Gladiators,Released,1983-01-01,91.0,0.0,3.9,4.0,"In a post-apocalyptic Texas, a band of warrior...",https://image.tmdb.org/t/p/original/HSmbkvvLnJ...,"[{'id': '28', 'name': 'Action'}, {'name': 'Sci..."
3,205908,Recon 2020: The Caprini Massacre,Released,2004-05-22,92.0,0.0,1.0,2.0,Soldiers land on Caprini and confront diabolic...,https://image.tmdb.org/t/p/original/3qezbjLEw7...,"[{'name': 'Science Fiction', 'id': '878'}, {'n..."
4,63077,The Coming Days,Released,2010-11-04,130.0,0.0,5.5,11.0,Welcome to 2020: The European Union has collap...,https://image.tmdb.org/t/p/original/wz62NIe9Js...,"[{'name': 'Drama', 'id': '18'}, {'name': 'Scie..."
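
To see what query expansion contributed, the two result lists above can be compared by document ID. The sketch below uses IDs and titles copied from the two outputs; `added_by_expansion` is a hypothetical helper name.

```python
from typing import Dict, List


def added_by_expansion(baseline: List[Dict], expanded: List[Dict]) -> List[Dict]:
    """Return documents in `expanded` whose id is absent from `baseline`."""
    baseline_ids = {doc["id"] for doc in baseline}
    return [doc for doc in expanded if doc["id"] not in baseline_ids]


# IDs and titles copied from the two sample outputs above.
disabled = [{"id": 10483, "title": "Death Race"}, {"id": 79611, "title": "Target"}]
enabled = disabled + [
    {"id": 58419, "title": "2020 Texas Gladiators"},
    {"id": 205908, "title": "Recon 2020: The Caprini Massacre"},
    {"id": 63077, "title": "The Coming Days"},
]

extra = added_by_expansion(disabled, enabled)
extra_titles = [doc["title"] for doc in extra]
print(extra_titles)
```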


## 6. Boost and Bury
Boosting lets you influence the order of the result set. Rather than relying only on the default ordering by relevance score, you can boost (promote) or bury (demote) results that match conditions you define.

The "boost" parameter is a floating-point value in the range [-1.0, 1.0]: positive values promote matching results (1.0 is the strongest promotion) and negative values demote them (-1.0 is the strongest demotion).

In [None]:
#@title Without Boosting
SEARCH_QUERY = "Disney Princess movies" #@param {type: "string"}
res = retriever.get_relevant_structured_documents(query=SEARCH_QUERY)
df = retriever.pretty_results(res)
df

Unnamed: 0,id,title,status,release_date,runtime,revenue,vote_average,vote_count,overview,poster_path,genres
0,14128,Cinderella II: Dreams Come True,Released,2002-02-23,74.0,0.0,5.5,265.0,"As a newly crowned princess, Cinderella quickl...",https://image.tmdb.org/t/p/original/36nD4aDvoF...,"[{'id': '10751', 'name': 'Family'}, {'id': '16..."
1,15969,The Return of Jafar,Released,1994-12-15,69.0,0.0,5.7,447.0,The evil Jafar escapes from the magic lamp as ...,https://image.tmdb.org/t/p/original/sC4wDVBMPM...,"[{'id': '10751', 'name': 'Family'}, {'id': '12..."
2,812,Aladdin,Released,1992-11-25,90.0,504050200.0,7.4,3495.0,Princess Jasmine grows tired of being forced t...,https://image.tmdb.org/t/p/original/7f53XAE4nP...,"[{'name': 'Animation', 'id': '16'}, {'name': '..."
3,109445,Frozen,Released,2013-11-27,102.0,1274219000.0,7.3,5440.0,Young princess Anna of Arendelle dreams about ...,https://image.tmdb.org/t/p/original/jIjdFXKUNt...,"[{'id': '16', 'name': 'Animation'}, {'id': '12..."
4,81,Nausicaä of the Valley of the Wind,Released,1984-03-11,117.0,3301446.0,7.7,808.0,"After a global war, the seaside kingdom known ...",https://image.tmdb.org/t/p/original/hnYowHwLq0...,"[{'id': '12', 'name': 'Adventure'}, {'id': '16..."


In [None]:
#@title With Boosting based on vote average score and number of votes
SEARCH_QUERY = "Disney Princess movies" #@param {type: "string"}

# Define boosting conditions
boost_spec = {'condition_boost_specs' : [
    {'condition': 'vote_average: IN(7.0i, *) AND vote_count: IN(1000, *)', 'boost': 1.0},
    {'condition': 'vote_average: IN(7.5i, *)', 'boost': 0.75},
    {'condition': 'vote_count: IN(1000i, *)', 'boost': 0.5},
    {'condition': 'vote_count: IN(*, 100e)', 'boost': -0.5}
]}
res = retriever.get_relevant_structured_documents(query=SEARCH_QUERY, boost_spec=boost_spec)
df = retriever.pretty_results(res)
df

Unnamed: 0,id,title,status,release_date,runtime,revenue,vote_average,vote_count,overview,poster_path,genres
0,812,Aladdin,Released,1992-11-25,90.0,504050200.0,7.4,3495.0,Princess Jasmine grows tired of being forced t...,https://image.tmdb.org/t/p/original/7f53XAE4nP...,"[{'name': 'Animation', 'id': '16'}, {'name': '..."
1,109445,Frozen,Released,2013-11-27,102.0,1274219000.0,7.3,5440.0,Young princess Anna of Arendelle dreams about ...,https://image.tmdb.org/t/p/original/jIjdFXKUNt...,"[{'id': '16', 'name': 'Animation'}, {'name': '..."
2,81,Nausicaä of the Valley of the Wind,Released,1984-03-11,117.0,3301446.0,7.7,808.0,"After a global war, the seaside kingdom known ...",https://image.tmdb.org/t/p/original/hnYowHwLq0...,"[{'id': '12', 'name': 'Adventure'}, {'name': '..."
3,14128,Cinderella II: Dreams Come True,Released,2002-02-23,74.0,0.0,5.5,265.0,"As a newly crowned princess, Cinderella quickl...",https://image.tmdb.org/t/p/original/36nD4aDvoF...,"[{'id': '10751', 'name': 'Family'}, {'id': '16..."
4,15969,The Return of Jafar,Released,1994-12-15,69.0,0.0,5.7,447.0,The evil Jafar escapes from the magic lamp as ...,https://image.tmdb.org/t/p/original/sC4wDVBMPM...,"[{'name': 'Family', 'id': '10751'}, {'id': '12..."
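
Since the boost value must stay within [-1.0, 1.0], it can help to validate the rules before sending them. This is a minimal sketch that produces the same dictionary shape as the `boost_spec` cell above; `condition_boost` and `build_boost_spec` are hypothetical helper names.

```python
from typing import Iterable, Tuple


def condition_boost(condition: str, boost: float) -> dict:
    """Build one boost rule, rejecting values outside [-1.0, 1.0]."""
    if not -1.0 <= boost <= 1.0:
        raise ValueError(f"boost must be within [-1.0, 1.0], got {boost}")
    return {"condition": condition, "boost": boost}


def build_boost_spec(rules: Iterable[Tuple[str, float]]) -> dict:
    """Wrap (condition, boost) pairs in the condition_boost_specs structure."""
    return {"condition_boost_specs": [condition_boost(c, b) for c, b in rules]}


boost_spec_example = build_boost_spec([
    ("vote_average: IN(7.5i, *)", 0.75),  # promote highly rated movies
    ("vote_count: IN(*, 100e)", -0.5),    # bury movies with few votes
])
print(boost_spec_example)
```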


## 7. Dynamic Facets
Facets can be shown next to the search results to help users narrow the results to the ones that are relevant to them.

Dynamic facets are returned based on your search query.

In [None]:
#@title Return dynamic facets on Genres
SEARCH_QUERY = "Disney Princess movies" #@param {type: "string"}
facets = [{'facet_key': {'key': 'genres.name'}, 'limit': 10}]
res = retriever.get_relevant_structured_documents(query=SEARCH_QUERY,facet_specs=facets)
df = retriever.pretty_results(res)
print(retriever.get_facets())
df

[{'key': 'genres.name', 'values': ['Adventure', 'Animation', 'Comedy', 'Family', 'Fantasy', 'Romance']}]


Unnamed: 0,id,title,status,release_date,runtime,revenue,vote_average,vote_count,overview,poster_path,genres
0,14128,Cinderella II: Dreams Come True,Released,2002-02-23,74.0,0.0,5.5,265.0,"As a newly crowned princess, Cinderella quickl...",https://image.tmdb.org/t/p/original/36nD4aDvoF...,"[{'name': 'Family', 'id': '10751'}, {'id': '16..."
1,15969,The Return of Jafar,Released,1994-12-15,69.0,0.0,5.7,447.0,The evil Jafar escapes from the magic lamp as ...,https://image.tmdb.org/t/p/original/sC4wDVBMPM...,"[{'name': 'Family', 'id': '10751'}, {'name': '..."
2,812,Aladdin,Released,1992-11-25,90.0,504050200.0,7.4,3495.0,Princess Jasmine grows tired of being forced t...,https://image.tmdb.org/t/p/original/7f53XAE4nP...,"[{'name': 'Animation', 'id': '16'}, {'id': '10..."
3,109445,Frozen,Released,2013-11-27,102.0,1274219000.0,7.3,5440.0,Young princess Anna of Arendelle dreams about ...,https://image.tmdb.org/t/p/original/jIjdFXKUNt...,"[{'id': '16', 'name': 'Animation'}, {'id': '12..."
4,81,Nausicaä of the Valley of the Wind,Released,1984-03-11,117.0,3301446.0,7.7,808.0,"After a global war, the seaside kingdom known ...",https://image.tmdb.org/t/p/original/hnYowHwLq0...,"[{'id': '12', 'name': 'Adventure'}, {'name': '..."


In [None]:
#@title Try on another search query, and see how dynamic facets work
SEARCH_QUERY = "wizard movies" #@param {type: "string"}
facets = [{'facet_key': {'key': 'genres.name'}, 'limit': 10}]
res = retriever.get_relevant_structured_documents(query=SEARCH_QUERY,facet_specs=facets)
df = retriever.pretty_results(res)
print(retriever.get_facets())
df

[{'key': 'genres.name', 'values': ['Action', 'Adventure', 'Animation', 'Comedy', 'Crime', 'Documentary', 'Drama', 'Family', 'Fantasy', 'Foreign']}]


Unnamed: 0,id,title,status,release_date,runtime,revenue,vote_average,vote_count,overview,poster_path,genres
0,334629,The Wonderful Wizard of Oz: 50 Years of Magic,Released,1990-02-20,52.0,0.0,0.0,0.0,Documentary about the making of the 1939 MGM c...,https://image.tmdb.org/t/p/original/e35f0vHsp3...,"[{'name': 'Documentary', 'id': '99'}]"
1,68894,Shadow Magic,Released,2000-09-08,116.0,0.0,6.0,1.0,"Beijing, 1902: an enterprising young portrait ...",https://image.tmdb.org/t/p/original/eHbA0xV9gr...,"[{'id': '18', 'name': 'Drama'}, {'id': '10769'..."
2,80530,Magic Beyond Words: The JK Rowling Story,Released,2011-07-18,87.0,0.0,7.4,42.0,Magic Beyond Words: The J.K. Rowling Story is ...,https://image.tmdb.org/t/p/original/bB6v5GxntC...,"[{'id': '18', 'name': 'Drama'}, {'name': 'TV M..."
3,395278,Wizard Mode,Released,2016-05-02,82.0,0.0,7.0,5.0,Mastering classic pinball arcade games require...,https://image.tmdb.org/t/p/original/vkXENvFYc3...,"[{'name': 'Documentary', 'id': '99'}]"
4,41639,Magic & Bird: A Courtship of Rivals,Released,2010-03-10,88.0,0.0,8.3,6.0,Magic &amp; Bird: A Courtship of Rivals is a 2...,https://image.tmdb.org/t/p/original/8tEPbrOD7z...,"[{'name': 'Documentary', 'id': '99'}, {'id': '..."


In [None]:
#@title Use the dynamic facets response from the previous code to filter only related results. In this case, I filter only "Romance" or "Comedy" genres on "wizard movies"
SEARCH_QUERY = "wizard movies" #@param {type: "string"}
FILTER_RESULT = "genres.name: ANY(\"Romance\", \"Comedy\")" #@param {type: "string"}
res = retriever.get_relevant_structured_documents(SEARCH_QUERY, FILTER_RESULT)
df = retriever.pretty_results(res)
df

Unnamed: 0,id,title,status,release_date,runtime,revenue,vote_average,vote_count,overview,poster_path,genres
0,45671,Rough Magic,Released,1995-09-07,100.0,0.0,6.3,7.0,A sleazy politician sends an agent (Russell Cr...,https://image.tmdb.org/t/p/original/wXoV64GylO...,"[{'id': '35', 'name': 'Comedy'}, {'id': '18', ..."
1,14096,The Magician,Released,2005-06-18,85.0,0.0,6.0,3.0,Following the dealings of a Melbourne-based hi...,https://image.tmdb.org/t/p/original/pA9fSsyqdE...,"[{'name': 'Comedy', 'id': '35'}, {'name': 'Dra..."
2,6435,Practical Magic,Released,1998-10-16,104.0,46683377.0,6.3,348.0,"Sally and Gillian Owens, born into a magical f...",https://image.tmdb.org/t/p/original/AwmToSgf2I...,"[{'name': 'Drama', 'id': '18'}, {'id': '14', '..."
3,20898,Aladdin and His Magic Lamp,Released,1967-12-30,84.0,0.0,5.4,4.0,A young boy finds a magic lantern that contain...,https://image.tmdb.org/t/p/original/h51fSNXo1d...,"[{'id': '12', 'name': 'Adventure'}, {'name': '..."
4,126104,The Magic Gloves,Released,2003-09-08,90.0,0.0,5.0,1.0,An absurdist look at depressed but adaptable B...,https://image.tmdb.org/t/p/original/wjC4eYE7mz...,"[{'id': '35', 'name': 'Comedy'}, {'id': '18', ..."
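
The step from a facet response to a filter can be automated. The sketch below turns a facet key and the values a user selected into the `ANY(...)` expression used in the cell above. `any_filter` is a hypothetical helper. Because the escaping rules for quotes inside values are not covered in this notebook, it rejects such values instead of guessing.

```python
from typing import Iterable


def any_filter(field: str, values: Iterable[str]) -> str:
    """Build a `field: ANY("a", "b")` filter from selected facet values."""
    quoted = []
    for value in values:
        if '"' in value:
            # Escaping rules are an unknown here, so fail loudly.
            raise ValueError(f"value contains a double quote: {value!r}")
        quoted.append(f'"{value}"')
    if not quoted:
        raise ValueError("at least one value is required")
    return f"{field}: ANY({', '.join(quoted)})"


genre_filter = any_filter("genres.name", ["Romance", "Comedy"])
print(genre_filter)
```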


## 8. Search Autocomplete
Enterprise Search can autocomplete queries in the search bar, which gives users a better search experience.

You can configure the autocomplete function to draw suggestions from different sources:
* **document (default)** - Suggestions generated from user-imported documents
* **search-history** - Suggestions generated from past search history
* **user-event** - Suggestions generated from user-imported search events
* **document-completable** - Suggestions taken directly from user-imported document fields marked as completable
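
A mistyped `query_model` is easy to miss, so a small guard can check the name before a request is built. This sketch only knows the four model names listed above; `QUERY_MODELS` and `validate_query_model` are hypothetical names.

```python
# The four query models described in the list above.
QUERY_MODELS = ("document", "search-history", "user-event", "document-completable")


def validate_query_model(query_model: str = "document") -> str:
    """Return the model name unchanged, or raise if it is not a known model."""
    if query_model not in QUERY_MODELS:
        raise ValueError(
            f"unknown query_model {query_model!r}; expected one of {QUERY_MODELS}"
        )
    return query_model


checked = validate_query_model("search-history")
print(checked)
```

The validated name can then be passed as the `query_model` argument of `get_autocompletes`.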

In [None]:
#@title Perform autocomplete for har
SEARCH_COMPLETE="har" #@param {type: "string"}
res = retriever.get_autocompletes(search_complete=SEARCH_COMPLETE, query_model='document')
print(res)

['harry', 'hard', 'harsh', 'harold', 'harvey', 'harvest', 'harrison', 'harlem', 'hard time', 'harmony', 'hart', 'harvard', 'harper', 'harrowing', 'hardy', 'hard working', 'harsh reality', 'harder', 'hardship', 'harm']


In [None]:
#@title Perform autocomplete for harry
SEARCH_COMPLETE="harry" #@param {type: "string"}
res = retriever.get_autocompletes(search_complete=SEARCH_COMPLETE, query_model='document')
print(res)

['harry', 'harry potter', 'harry callahan', 'harry palmer', 'harry belafonte', 'harry houdini', 'harry dean', 'harry morgan', 'harry lee', 'harry lauter', 'harry kane', 'harry holt', 'harry langdon', 'harry andrew', 'harry voss', 'harry connick', 'harry redmond', 'harry deleyer', 'harry nilsson', 'harry webb']


In [None]:
#@title Perform autocomplete for harry po
SEARCH_COMPLETE="harry po" #@param {type: "string"}
res = retriever.get_autocompletes(search_complete=SEARCH_COMPLETE, query_model='document')
print(res)

['harry potter', 'harry poole']


# Congratulations!

## I hope you've learned a lot from this notebook.

Please stay tuned for new features of Enterprise Search.

I'm Nutchanon, a Googler, and I want to say "Thank you" to all of you.