# AISearch01 - Create a search index in Azure AI Search

This notebook steps through creating, loading, and querying an index in Azure AI Search by calling the azure-search-documents library in the Azure SDK for Python.

## Install packages and set variables

In [19]:
! pip install azure-search-documents==11.6.0b12 --quiet
! pip install azure-identity --quiet
! pip install python-dotenv --quiet

In [20]:
# Load environment variables from .env file
# This allows us to keep sensitive information like API keys separate from code
from dotenv import load_dotenv
from azure.identity import DefaultAzureCredential, get_bearer_token_provider
import os

load_dotenv(override=True) # Take environment variables from .env file

# Provide variables for Azure AI Search connection
# These should be set in your .env file for security
search_endpoint = os.environ["AZURE_SEARCH_ENDPOINT"]  # Your search service URL
search_api_key = os.environ["AZURE_SEARCH_API_KEY"]    # Admin API key for full access
index_name: str = "11_hotels-quickstart-csharp"        # Name for our search index
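
Because `os.environ[...]` raises a bare `KeyError` when a variable is missing, it can help to check the settings up front and report every missing name at once. The following is a minimal sketch, not part of the original notebook: the helper `missing_settings` and the stand-in `sample_env` mapping are hypothetical names introduced for illustration, and the example endpoint is a placeholder.

```python
import os

# Names the notebook reads from the .env file
REQUIRED_VARS = ("AZURE_SEARCH_ENDPOINT", "AZURE_SEARCH_API_KEY")

def missing_settings(env, required=REQUIRED_VARS):
    """Return the names of required settings that are absent or blank."""
    return [name for name in required if not env.get(name, "").strip()]

# Stand-in mapping for illustration; in the notebook you would pass os.environ
sample_env = {"AZURE_SEARCH_ENDPOINT": "https://example.search.windows.net"}

missing = missing_settings(sample_env)
if missing:
    print("Missing settings:", ", ".join(missing))
```

Calling `missing_settings(os.environ)` right after `load_dotenv` would surface a misconfigured `.env` file before any request is sent.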

## Create an index

In [21]:
# Import AzureKeyCredential for authenticating with Azure Search
from azure.core.credentials import AzureKeyCredential

# Create a credential object using your Azure Search API key
# This will be used to authenticate all requests to the search service
credential = AzureKeyCredential(search_api_key)

# Import necessary Azure Search SDK classes
from azure.search.documents.indexes import SearchIndexClient
from azure.search.documents import SearchClient
from azure.search.documents.indexes.models import (
    ComplexField,      # For nested/complex fields in the schema (like Address)
    SimpleField,       # For simple fields (string, int, etc.)
    SearchFieldDataType, # Enum for field data types
    SearchableField,   # For fields that should be full-text searchable
    SearchIndex        # The index definition object
)

# Create a search index client to manage indexes
# This client is used for index management operations (create, update, delete)
index_client = SearchIndexClient(
    endpoint=search_endpoint, credential=credential)

# Define the fields for the hotel index
# Each field defines how data will be stored and searched
fields = [
    # Primary key field - must be unique for each document
    SimpleField(name="HotelId", type=SearchFieldDataType.String, key=True),
    
    # Hotel name - searchable and sortable for easy finding and ordering
    SearchableField(name="HotelName", type=SearchFieldDataType.String, sortable=True),
    
    # Description - full-text searchable with English language analyzer for better search results
    SearchableField(name="Description", type=SearchFieldDataType.String, analyzer_name="en.lucene"),
    
    # Category - can be used for faceting (grouping), filtering, and sorting
    SearchableField(name="Category", type=SearchFieldDataType.String, facetable=True, filterable=True, sortable=True),
    
    # Tags - collection of strings, useful for faceting and filtering by amenities
    SearchableField(name="Tags", collection=True, type=SearchFieldDataType.String, facetable=True, filterable=True),
    
    # Boolean field for parking availability - can be used in filters and facets
    SimpleField(name="ParkingIncluded", type=SearchFieldDataType.Boolean, facetable=True, filterable=True, sortable=True),
    
    # Date field for renovation date - useful for filtering recent renovations
    SimpleField(name="LastRenovationDate", type=SearchFieldDataType.DateTimeOffset, facetable=True, filterable=True, sortable=True),
    
    # Numeric rating field - perfect for sorting and filtering by quality
    SimpleField(name="Rating", type=SearchFieldDataType.Double, facetable=True, filterable=True, sortable=True),
    
    # Complex field containing nested address information
    ComplexField(name="Address", fields=[
        SearchableField(name="StreetAddress", type=SearchFieldDataType.String),
        SearchableField(name="City", type=SearchFieldDataType.String, facetable=True, filterable=True, sortable=True),
        SearchableField(name="StateProvince", type=SearchFieldDataType.String, facetable=True, filterable=True, sortable=True),
        SearchableField(name="PostalCode", type=SearchFieldDataType.String, facetable=True, filterable=True, sortable=True),
        SearchableField(name="Country", type=SearchFieldDataType.String, facetable=True, filterable=True, sortable=True),
    ])
]

# Scoring profiles can be used to boost certain fields or apply custom ranking
scoring_profiles = []

# Suggester enables autocomplete and search suggestions
# These fields will be used to provide search suggestions to users
suggester = [{'name': 'sg', 'source_fields': ['Tags', 'Address/City', 'Address/Country']}]

# Create the search index with all defined components
index = SearchIndex(name=index_name, fields=fields, suggesters=suggester, scoring_profiles=scoring_profiles)
result = index_client.create_or_update_index(index)
print(f' {result.name} created')

 11_hotels-quickstart-csharp created


[Checkpoint 1]
![alt text](Image/image1.png)

## Create a documents payload

In [22]:
# Create a documents payload
# Each document represents a hotel with all the fields we defined in our index schema
# The "@search.action" field tells Azure Search what to do with each document
documents = [
    {
    "@search.action": "upload",  # Upload action adds or updates the document
    "HotelId": "1",             # Unique identifier for this hotel
    "HotelName": "Stay-Kay City Hotel",
    "Description": "This classic hotel is fully-refurbished and ideally located on the main commercial artery of the city in the heart of New York. A few minutes away is Times Square and the historic centre of the city, as well as other places of interest that make New York one of America's most attractive and cosmopolitan cities.",
    "Category": "Boutique",      # Hotel category for faceting/filtering
    "Tags": [ "view", "air conditioning", "concierge" ],  # Array of amenities
    "ParkingIncluded": False,    # Python boolean, matching the Boolean field type
    "LastRenovationDate": "2020-01-18T00:00:00Z",  # ISO 8601 date format
    "Rating": 3.60,             # Numeric rating out of 5
    "Address": {                # Nested object containing address details
        "StreetAddress": "677 5th Ave",
        "City": "New York",
        "StateProvince": "NY",
        "PostalCode": "10022",
        "Country": "USA"
        }
    },
    {
    "@search.action": "upload",
    "HotelId": "2",
    "HotelName": "Old Century Hotel",
    "Description": "The hotel is situated in a nineteenth century plaza, which has been expanded and renovated to the highest architectural standards to create a modern, functional and first-class hotel in which art and unique historical elements coexist with the most modern comforts. The hotel also regularly hosts events like wine tastings, beer dinners, and live music.",
    "Category": "Boutique",
    "Tags": [ "pool", "free wifi", "concierge" ],  # Different amenities for variety
    "ParkingIncluded": False,
    "LastRenovationDate": "2019-02-18T00:00:00Z",
    "Rating": 3.60,
    "Address": {
        "StreetAddress": "140 University Town Center Dr",
        "City": "Sarasota",       # Different city for geographic diversity
        "StateProvince": "FL",
        "PostalCode": "34243",
        "Country": "USA"
        }
    },
    {
    "@search.action": "upload",
    "HotelId": "3",
    "HotelName": "Gastronomic Landscape Hotel",
    "Description": "The Gastronomic Hotel stands out for its culinary excellence under the management of William Dough, who advises on and oversees all of the Hotel's restaurant services.",
    "Category": "Suite",          # Different category to demonstrate faceting
    "Tags": [ "restaurant", "bar", "continental breakfast" ],  # Food-focused amenities
    "ParkingIncluded": True,      # This hotel includes parking
    "LastRenovationDate": "2015-09-20T00:00:00Z",
    "Rating": 4.80,              # Higher rating for filtering examples
    "Address": {
        "StreetAddress": "3393 Peachtree Rd",
        "City": "Atlanta",
        "StateProvince": "GA",
        "PostalCode": "30326",
        "Country": "USA"
        }
    },
    {
    "@search.action": "upload",
    "HotelId": "4",
    "HotelName": "Sublime Palace Hotel",
    "Description": "Sublime Palace Hotel is located in the heart of the historic center of Sublime in an extremely vibrant and lively area within short walking distance to the sites and landmarks of the city and is surrounded by the extraordinary beauty of churches, buildings, shops and monuments. Sublime Cliff is part of a lovingly restored 19th century resort, updated for every modern convenience.",
    "Category": "Boutique",
    "Tags": [ "concierge", "view", "air conditioning" ],
    "ParkingIncluded": True,
    "LastRenovationDate": "2020-02-06T00:00:00Z",
    "Rating": 4.60,              # Another high-rated hotel
    "Address": {
        "StreetAddress": "7400 San Pedro Ave",
        "City": "San Antonio",
        "StateProvince": "TX",
        "PostalCode": "78216",
        "Country": "USA"
        }
    }
]
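
Before sending a payload like this, a quick local check can catch duplicate keys or field names that are not in the index schema. The sketch below is illustrative only: `check_documents`, `KNOWN_FIELDS`, and the two-document `sample` list are hypothetical names, and the field set is copied by hand from the index definition earlier in this notebook. It only inspects top-level names, not the nested `Address` fields.

```python
def check_documents(docs, key_field, known_fields):
    """Return a list of human-readable problems found in the payload."""
    problems = []
    seen_keys = set()
    for position, doc in enumerate(docs):
        key = doc.get(key_field)
        if not isinstance(key, str) or not key:
            problems.append(f"document {position}: missing or non-string {key_field}")
        elif key in seen_keys:
            problems.append(f"document {position}: duplicate {key_field} {key!r}")
        else:
            seen_keys.add(key)
        # Names starting with "@" (such as "@search.action") are directives, not fields
        unknown = [name for name in doc
                   if not name.startswith("@") and name not in known_fields]
        if unknown:
            problems.append(f"document {position}: unknown fields {sorted(unknown)}")
    return problems

# Top-level field names from the index schema defined earlier
KNOWN_FIELDS = {"HotelId", "HotelName", "Description", "Category", "Tags",
                "ParkingIncluded", "LastRenovationDate", "Rating", "Address"}

# Deliberately flawed sample: a repeated key and a field the schema lacks
sample = [
    {"@search.action": "upload", "HotelId": "1", "HotelName": "Stay-Kay City Hotel"},
    {"@search.action": "upload", "HotelId": "1", "Stars": 5},
]

for problem in check_documents(sample, "HotelId", KNOWN_FIELDS):
    print(problem)
```

Running `check_documents(documents, "HotelId", KNOWN_FIELDS)` on the payload above should return an empty list.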

## Upload documents

In [23]:
# Create a search client for document operations (different from index management)
# This client is used for searching, uploading, and managing documents within an index
search_client = SearchClient(endpoint=search_endpoint,
                      index_name=index_name,
                      credential=credential)

# Upload documents to the search index
# This operation adds all our hotel documents to the search index
try:
    result = search_client.upload_documents(documents=documents)
    # upload_documents returns one result per document; report whether all of them succeeded
    print("Upload of new document succeeded: {}".format(all(r.succeeded for r in result)))
except Exception as ex:
    print(ex)

Upload of new document succeeded: True
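
The upload call returns one result per document, so a single `True` can hide a partial failure in larger batches. The sketch below shows one way to summarize those results. It runs offline against stand-in objects: `summarize_results` and `fake_results` are hypothetical names, and the stand-ins assume each result exposes `key`, `succeeded`, and `error_message` attributes, as the SDK's indexing results do.

```python
from types import SimpleNamespace

def summarize_results(results):
    """Split per-document indexing results into succeeded keys and failures."""
    succeeded = [r.key for r in results if r.succeeded]
    failed = {r.key: r.error_message for r in results if not r.succeeded}
    return succeeded, failed

# Stand-in objects for illustration; in the notebook, pass the list
# returned by search_client.upload_documents instead
fake_results = [
    SimpleNamespace(key="1", succeeded=True, error_message=None),
    SimpleNamespace(key="2", succeeded=False, error_message="example failure"),
]

ok, failed = summarize_results(fake_results)
print(f"{len(ok)} succeeded, {len(failed)} failed")
for key, message in failed.items():
    print(f"  {key}: {message}")
```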


[Checkpoint 2]
![alt text](Image/image2.png)

## Run your first query

In [24]:
# Run an empty query (returns selected fields, all documents)
# The "*" search text matches all documents in the index
# This is useful for getting an overview of all data
results = search_client.search(query_type='simple',
    search_text="*",                                     # "*" means match all documents
    select='HotelName,Description,Tags',                 # Only return these specific fields
    include_total_count=True)                            # Include total count in results

print('Total Documents Matching Query:', results.get_count())
for result in results:
    print(result["@search.score"])      # Relevance score (all will be 1.0 for "*" queries)
    print(result["HotelName"])          # Hotel name
    print(result["Tags"])               # Array of amenities/tags
    print(f"Description: {result['Description']}")  # Hotel description

Total Documents Matching Query: 4
1.0
Gastronomic Landscape Hotel
['restaurant', 'bar', 'continental breakfast']
Description: The Gastronomic Hotel stands out for its culinary excellence under the management of William Dough, who advises on and oversees all of the Hotel's restaurant services.
1.0
Old Century Hotel
['pool', 'free wifi', 'concierge']
Description: The hotel is situated in a nineteenth century plaza, which has been expanded and renovated to the highest architectural standards to create a modern, functional and first-class hotel in which art and unique historical elements coexist with the most modern comforts. The hotel also regularly hosts events like wine tastings, beer dinners, and live music.
1.0
Sublime Palace Hotel
['concierge', 'view', 'air conditioning']
Description: Sublime Palace Hotel is located in the heart of the historic center of Sublime in an extremely vibrant and lively area within short walking distance to the sites and landmarks of the city and is surroun

## Run a term query

In [25]:
# Run a term query - search for documents containing specific terms
# This demonstrates full-text search capabilities across searchable fields
results = search_client.search(query_type='simple',
    search_text="wifi",                                  # Search for the term "wifi"
    select='HotelName,Description,Tags',                 # Return only these fields
    include_total_count=True)                            # Include total matching count

print('Total Documents Matching Query:', results.get_count())
for result in results:
    print(result["@search.score"])                       # Higher scores = better matches
    print(result["HotelName"])                           # Hotel name
    print(f"Description: {result['Description']}")      # Description (may contain "wifi")

Total Documents Matching Query: 1
0.6931472
Old Century Hotel
Description: The hotel is situated in a nineteenth century plaza, which has been expanded and renovated to the highest architectural standards to create a modern, functional and first-class hotel in which art and unique historical elements coexist with the most modern comforts. The hotel also regularly hosts events like wine tastings, beer dinners, and live music.


## Add a filter

In [26]:
# Add a filter to narrow down search results
# Demonstrates combining search text with filters and sorting
results = search_client.search(
    search_text="hotels",                    # Search for "hotels" in searchable fields
    select='HotelId,HotelName,Rating',       # Return only these specific fields
    filter='Rating gt 4',                   # Filter: only hotels with rating > 4.0
    order_by='Rating desc')                 # Sort by rating in descending order (highest first)

# Display results showing only high-rated hotels
for result in results:
    print("{}: {} - {} rating".format(result["HotelId"], result["HotelName"], result["Rating"]))

3: Gastronomic Landscape Hotel - 4.8 rating
4: Sublime Palace Hotel - 4.6 rating
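
Filters are OData expressions passed as plain strings, so values that come from user input need correct quoting: string literals use single quotes, with embedded single quotes doubled. The following is a minimal sketch of building such a string. The helpers `odata_literal` and `all_of` are hypothetical, and they cover only booleans, numbers, and strings.

```python
def odata_literal(value):
    """Format a Python value as an OData literal for use in a filter string."""
    if isinstance(value, bool):           # check bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return repr(value)
    if isinstance(value, str):
        return "'" + value.replace("'", "''") + "'"   # double any embedded single quotes
    raise TypeError(f"unsupported filter value: {value!r}")

def all_of(*clauses):
    """Join clauses with 'and', wrapping each one in parentheses."""
    return " and ".join(f"({clause})" for clause in clauses)

filter_text = all_of(
    f"Rating gt {odata_literal(4)}",
    f"ParkingIncluded eq {odata_literal(True)}",
    f"Address/City eq {odata_literal('San Antonio')}",
)
print(filter_text)
# (Rating gt 4) and (ParkingIncluded eq true) and (Address/City eq 'San Antonio')
```

The resulting string can be passed as the `filter` argument of `search_client.search`. Note that subfields of the complex `Address` field are addressed with a slash, as in `Address/City`.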


## Scope a query to specific searchable fields

In [27]:
# Scope a query to specific searchable fields
# This limits the search to only look in the HotelName field, not all searchable fields
results = search_client.search(
    search_text="sublime",                   # Search term to find
    search_fields=['HotelName'],             # Only search within HotelName field
    select='HotelId,HotelName')              # Return only ID and name

# This will only find hotels with "sublime" in their name
for result in results:
    print("{}: {}".format(result["HotelId"], result["HotelName"]))

4: Sublime Palace Hotel


## Return facets

In [28]:
# Return facets - useful for building search navigation/filters
# Facets provide counts of documents grouped by field values
# This is commonly used to show filter options in search UIs
results = search_client.search(search_text="*", facets=["Category"])

# Get the facet results for the Category field
facets = results.get_facets()

# Display each category and how many hotels belong to it
print("Hotel categories and their counts:")
for facet in facets["Category"]:
    print("    {}".format(facet))  # Shows category name and document count

Hotel categories and their counts:
    {'value': 'Boutique', 'count': 3}
    {'value': 'Suite', 'count': 1}
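
In a search UI, each facet bucket typically becomes a clickable option that applies a filter. The sketch below turns the facet output shown above into label and filter pairs. It is an illustration only: `facet_options` is a hypothetical helper, the `facets` dictionary is copied from the output above so the example runs offline, and the helper assumes string-valued facets.

```python
# Facet results copied from the output above, so the sketch runs without a service
facets = {"Category": [{"value": "Boutique", "count": 3},
                       {"value": "Suite", "count": 1}]}

def facet_options(facet_results, field):
    """Build a display label and an OData filter string for each facet bucket."""
    options = []
    for bucket in facet_results.get(field, []):
        value = bucket["value"]
        escaped = str(value).replace("'", "''")   # double embedded single quotes
        options.append({
            "label": f"{value} ({bucket['count']})",
            "filter": f"{field} eq '{escaped}'",
        })
    return options

for option in facet_options(facets, "Category"):
    print(option["label"], "->", option["filter"])
```

Selecting an option would then mean rerunning the query with that `filter` value, which works here because `Category` was defined as filterable.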


## Look up a document 

In [29]:
# Look up a specific document by its key (HotelId)
# This is the fastest way to retrieve a document when you know its unique identifier
# No search scoring or ranking is involved - it's a direct document retrieval
result = search_client.get_document(key="3")  # Get hotel with ID "3"

print("Details for hotel '3' are:")
print("Name: {}".format(result["HotelName"]))
print("Rating: {}".format(result["Rating"]))
print("Category: {}".format(result["Category"]))

Details for hotel '3' are:
Name: Gastronomic Landscape Hotel
Rating: 4.8
Category: Suite


## Autocomplete a query

In [30]:
# Autocomplete a query - provides search suggestions as the user types
# This uses the suggester we defined earlier with Tags, City, and Country fields
# Very useful for improving user experience in search applications
search_suggestion = 'sa'  # Partial input from user
results = search_client.autocomplete(
    search_text=search_suggestion,     # The partial text to complete
    suggester_name="sg",               # Use our "sg" suggester defined in index
    mode='twoTerms')                   # Return up to two-term suggestions

print("Autocomplete for:", search_suggestion)
# Display suggested completions (like "San Antonio", "Sarasota", etc.)
for result in results:
    print (result['text'])

Autocomplete for: sa
san antonio
sarasota


## Clean up

If you are finished with this index, you can delete it by running the following lines. Deleting unnecessary indexes frees up space for stepping through more quickstarts and tutorials.

In [31]:
# Clean up - delete the index to free up space on the search service
# This is important in demo/tutorial scenarios to prevent accumulating test indexes
# In production, you would typically keep your indexes unless truly no longer needed
try:
    index_client.delete_index(index_name)
    print('Index', index_name, 'Deleted')
except Exception as ex:
    print(ex)

Index 11_hotels-quickstart-csharp Deleted


Confirm the index deletion by running the following script, which tries to retrieve the index by name. If the request fails because the index no longer exists, you've successfully deleted the index and have completed this quickstart.

In [32]:
# Verify that the index has been successfully deleted
# get_index raises ResourceNotFoundError when the index no longer exists, confirming deletion
from azure.core.exceptions import ResourceNotFoundError

try:
    result = index_client.get_index(index_name)
    print("Index still exists:", result.name)
except ResourceNotFoundError:
    print("Index successfully deleted - it no longer exists")

Index successfully deleted - it no longer exists
