# Configure AI Applications to Optimize Search Results : Challenge Lab (GENAI091) #

Lab link: https://partner.cloudskillsboost.google/paths/2302/course_templates/1250/labs/529942

### Task 1. Create a metadata document for data store ###

In [1]:
import json

original_data = [
    {"id": "doc-1", "title": "Heaven Resort", "category": "information", "rating": 4.8, "document_uri": "gs://qwiklabs-gcp-04-f9b8c7153658/hotel1.pdf"},
    {"id": "doc-2", "title": "Paradise Reef Resort", "category": "information", "rating": 4.7, "document_uri": "gs://qwiklabs-gcp-04-f9b8c7153658/hotel2.pdf"},
    {"id": "doc-3", "title": "AquaPulse Maldives", "category": "information", "rating": 4.0, "document_uri": "gs://qwiklabs-gcp-04-f9b8c7153658/hotel3.pdf"},
    {"id": "doc-4", "title": "Heaven Resort Financials", "category": "financials", "rating": 4.8, "document_uri": "gs://qwiklabs-gcp-04-f9b8c7153658/hotel1-financials.pdf"},
    {"id": "doc-5", "title": "Paradise Reef Resort Financials", "category": "financials", "rating": 4.7, "document_uri": "gs://qwiklabs-gcp-04-f9b8c7153658/hotel2-financials.pdf"},
    {"id": "doc-6", "title": "AquaPulse Maldives Financials", "category": "financials", "rating": 4.0, "document_uri": "gs://qwiklabs-gcp-04-f9b8c7153658/hotel3-financials.pdf"}
]

transformed_data = []
for item in original_data:
    # Extract fields for jsonData
    json_data_payload = {
        "title": item["title"],
        "category": item["category"],
        "rating": item["rating"]
    }
    
    # Create the new structure
    new_item = {
        "id": item["id"],
        "jsonData": json.dumps(json_data_payload), # This nests the JSON as a string
        "content": {
            "mimeType": "application/pdf", # Assuming all are PDFs based on your URIs
            "uri": item["document_uri"]
        }
    }
    transformed_data.append(new_item)

output_filename = "metadata.json"

with open(output_filename, 'w', encoding='utf-8') as f:
    for record in transformed_data:
        f.write(json.dumps(record, ensure_ascii=False) + '\n')

print(f"Successfully created '{output_filename}' with the new structure.")

Successfully created 'metadata.json' with the new structure.
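
Before uploading the file to the bucket, it can help to confirm that every line is a standalone JSON object with the fields written above. The following is a minimal sketch under that assumption; `check_metadata_line` is a hypothetical helper name used only for illustration, not part of any Google library.

```python
import json

def check_metadata_line(line: str) -> dict:
    """Parse one JSONL record and verify the fields written in Task 1."""
    record = json.loads(line)  # raises ValueError on malformed JSON
    for key in ("id", "jsonData", "content"):
        if key not in record:
            raise ValueError(f"missing key: {key}")
    # jsonData is stored as a JSON *string*, so it has to decode a second time.
    metadata = json.loads(record["jsonData"])
    if not record["content"]["uri"].startswith("gs://"):
        raise ValueError("content.uri should be a Cloud Storage URI")
    return metadata

# One record in the same shape as the lines of metadata.json.
sample = json.dumps({
    "id": "doc-1",
    "jsonData": json.dumps({"title": "Heaven Resort", "category": "information", "rating": 4.8}),
    "content": {"mimeType": "application/pdf", "uri": "gs://qwiklabs-gcp-04-f9b8c7153658/hotel1.pdf"},
})
print(check_metadata_line(sample)["category"])
```

To check the whole file, the same helper could be called on each line of `metadata.json` in a loop.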


### Task 2. Set up Google Identity ###

In [2]:
# GCP Console:
# AI Applications => Settings => Location: Global : (*) Google Identity

### Task 3. Create and query an unstructured data search app ###

In [None]:
# GCP Console:

# 3-a: Create AI App
# In AI Applications console create a new app of type Agentspace named cymbal-travel with company name cymbal-hotels-company in the global region.
# Configure this app to use a data store named cymbal-travel that ingests linked unstructured documents (JSONL with metadata) from the metadata.json file you uploaded to gs://qwiklabs-gcp-04-f9b8c7153658/metadata.json.
# Finally, preview the app and verify that it works as intended.
# Note: It will take up to 15 minutes to import documents. Check the data store to see if the data has been parsed, ingested and indexed. Further tasks will proceed only after this task is complete.

In [None]:
# 3-b: Set up Configure Fields in Results

# Field Type    Value
# Title	        title
# Text 1        category
# Text 2        rating

# GCP Console:
# => AI Applications => Apps => App-Name => Configurations
## Modify: Data display options => Configure fields in results(!)

In [None]:
# 3-c: Set up Facet Settings as follows:

# Field            Value
# Field 1          category
# Display Name 1   Category

# GCP Console:
# => AI Applications => Apps => App-Name => Configurations
## Modify: Data display options => Facet Settings(!)

In [None]:
# 3-d: get results in "preview"
# From the Preview screen send the following search request to your app and then filter documents of the category information
# Note: You should get results in preview for the query that display only information documents.

# GCP Console:
# App => Preview => Search "What hotels are available in the Maldives?"
# Filter by category: information (Under "Search results" => select "Category": information)

### Task 4. Filter responses to user requests (command line) ###

4-a:
Send a search request from the command line using the Discovery Engine APIs and the default_search:search endpoint and add the filter for category, so that it only returns information documents.

Search query: "What hotels are available in the Maldives?"

Note: You should get results for the query using Discovery Engine APIs that display only information documents.

In [4]:
curl -X POST -H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
"https://discoveryengine.googleapis.com/v1beta/projects/qwiklabs-gcp-04-f9b8c7153658/locations/global/collections/default_collection/dataStores/cymbal-travel_1749343030266/servingConfigs/default_search:search" \
-d '{
"query": "What hotels are available in the Maldives?",
"filter": "category: ANY(\"information\")"
}'

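# Note: the error below comes from running the curl command in a Python cell; run it in a terminal (for example Cloud Shell) instead.
SyntaxError: unterminated string literal (detected at line 4) (2185810528.py, line 4)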

4-b:
Send a search request from the command line using the Discovery Engine APIs and the default_search:search endpoint and add the filter for category, so that it only returns financials documents.

Search query: "What is the revenue for the hotels in the Maldives?"

Note: You should get results for the query using Discovery Engine APIs that display only financials documents.

In [None]:
curl -X POST -H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
"https://discoveryengine.googleapis.com/v1beta/projects/qwiklabs-gcp-04-f9b8c7153658/locations/global/collections/default_collection/dataStores/cymbal-travel_1749343030266/servingConfigs/default_search:search" \
-d '{
"query": "What is the revenue for the hotels in the Maldives?",
"filter": "category: ANY(\"financials\")"
}'
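
The two request bodies in this task differ only in the query and the category value, and a hand-typed body is easy to break with a stray comma. As a rough sketch, the body could be generated in Python and serialized with `json.dumps`. The helper names `category_filter` and `search_body` are hypothetical, and the sketch assumes the category value contains no double quotes.

```python
import json

def category_filter(category: str) -> str:
    """Return a filter expression such as: category: ANY("information")."""
    # Assumption: the category value itself contains no double quotes.
    return f'category: ANY("{category}")'

def search_body(query: str, category: str) -> str:
    """Serialize the request body as strict JSON (no trailing commas)."""
    return json.dumps({"query": query, "filter": category_filter(category)})

body = search_body("What is the revenue for the hotels in the Maldives?", "financials")
print(body)
```

The printed string can be saved to a file and passed to curl with `-d @body.json` in place of the inline body.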

### Task 5. Boost results with higher ratings ###

5-1:

Create a control of type Boost/bury named ratings-boost, with 0.7 value for Boost, that filters for the property rating to be above 4.1

In [5]:
# GCP Console:
# => AI Applications => Apps => App-Name => Configurations => Control
## Create Control: "Actions" 
#### Filter: rating: IN(4.1e, *)   (lower bound 4.1 exclusive, no upper bound, i.e. rating above 4.1)
#### Boost/Bury: 0.7
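
To see which hotels the boost condition is meant to cover, the ratings from Task 1 can be compared against the threshold locally. This is only a sanity check on the data; it does not reproduce how the service applies the boost.

```python
# Ratings copied from the Task 1 metadata.
ratings = {
    "Heaven Resort": 4.8,
    "Paradise Reef Resort": 4.7,
    "AquaPulse Maldives": 4.0,
}

THRESHOLD = 4.1  # the control targets ratings strictly above this value

boosted = sorted(name for name, rating in ratings.items() if rating > THRESHOLD)
print(boosted)
```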

5-2:

Send a search request from the command line using the Discovery Engine APIs and the default_search:search endpoint. Add the filter for category so that it only returns information documents, and set page_size to 2 so that you only get the 2 documents that have the highest rating.

Search Query: "What hotels are available in the Maldives?"

Note: You should get results for the query using Discovery Engine APIs that display only information documents.

In [None]:
curl -X POST -H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
"https://discoveryengine.googleapis.com/v1beta/projects/qwiklabs-gcp-04-f9b8c7153658/locations/global/collections/default_collection/dataStores/cymbal-travel_1749343030266/servingConfigs/default_search:search" \
-d '{
"query": "What hotels are available in the Maldives?",
"filter": "category: ANY(\"information\")",
"page_size": 2,
"order_by": "rating desc"
}'
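
Once the response comes back, a quick way to confirm that only information documents were returned is to pull the metadata out of each result. The sketch below assumes the metadata appears under `results[].document.structData` in the parsed response; the sample dictionary is a hand-written stand-in, not real API output, and `summarize_results` is a hypothetical helper name.

```python
def summarize_results(response: dict) -> list:
    """Return (title, category, rating) for each result in a parsed response."""
    rows = []
    for result in response.get("results", []):
        # Assumption: metadata fields are exposed under document.structData.
        data = result.get("document", {}).get("structData", {})
        rows.append((data.get("title"), data.get("category"), data.get("rating")))
    return rows

# Hand-written stand-in for a response, for illustration only.
sample_response = {
    "results": [
        {"document": {"structData": {"title": "Heaven Resort", "category": "information", "rating": 4.8}}},
        {"document": {"structData": {"title": "Paradise Reef Resort", "category": "information", "rating": 4.7}}},
    ]
}

rows = summarize_results(sample_response)
print(rows)
```

With a real response, the output of curl could be saved to a file, loaded with `json.load`, and passed to the same function.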