# 1. Agent Bricks: Review Aspect Extraction Agent

This notebook walks through building an Information Extraction (IE) Agent in Agent Bricks that extracts aspect-level insights and sentiment from raw reviews.

Data flow (the step covered here is in bold):
**raw reviews -> review aspect extractions** -> location aspect daily -> flag all issues -> issue diagnosis and recommendations

## Build Information Extraction Agent

Extract structured insights from the raw reviews table.

Example: 
![](/Workspace/Users/cindy.wu@databricks.com/voc_industry_demo/resource_configs/ie_agent_create.png)

- When creating the agent, keep the auto-generated `Sample JSON output` in the config; you will edit the schema once the agent exists.
- Once the agent is created, edit its `JSON Schema` and paste in the contents of [`ie_agent_config.json`](resource_configs/ie_agent_config.json). ![screenshot](/Workspace/Users/cindy.wu@databricks.com/voc_industry_demo/resource_configs/ie_agent_json_edit.png)
- Then, under `Instructions`, paste in the instructions below.
```
1. Extract ALL relevant metadata mentioned or implied in the review.
   - Include fields such as: star rating, review date, length of stay, and overall sentiment.
   - If metadata is missing, set to null — do not guess or invent.

2. Extract EVERY relevant aspect from the review text.
   - Use the predefined aspect list (Arrival & Departure, Staff & Service, In-Room Experience, Food & Beverage, Facilities & Amenities, Environment & Location, Value & Loyalty).
   - For each aspect:
       • Identify sentiment (very_positive, positive, neutral, negative, very_negative).
       • Include short, verbatim evidence (phrases directly from the review).
       • opinion_terms: array of short polarity-bearing words/phrases tied to this aspect (e.g., “spotless,” “friendly,” “overpriced,” “noisy AC”); use verbatim spans when possible
   - Deduplicate aspects — each aspect should appear at most once.
   - Do not miss subtle mentions, mixed opinions, or multiple details for the same aspect.
   - Capture both positive and negative details accurately, without omitting context.

3. Extract all **entities** explicitly or implicitly mentioned in the review.
   - Entities include: staff roles or names, attractions, nearby locations etc.
   - Keep entity names consistent and distinct.
   - Do not fabricate entities; if unclear, set to null.
   - Avoid redundancy: each unique entity should appear only once.

4. Output clean, valid JSON following the specified schema (no extra text, no commentary).
   - Ensure consistency across reviews.
   ```
- Click `Save and update` to update the agent. The update takes a few minutes, after which the agent is ready to use.
- Review the agent's outputs and provide feedback where needed.
![Example](/Workspace/Users/cindy.wu@databricks.com/voc_industry_demo/resource_configs/ie_agent_review.png)
- Update the agent again to apply the feedback.
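
When reviewing agent outputs, it can help to spot-check responses against the rules in the instructions above. The sketch below is one possible way to do that in plain Python. The aspect and sentiment vocabularies come from the instructions; the field names `aspects`, `aspect`, `sentiment`, and `evidence` are assumptions made for illustration, since the actual names are defined in `ie_agent_config.json`.

```python
import json

# Aspect and sentiment vocabularies copied from the agent instructions.
ASPECTS = {
    "Arrival & Departure", "Staff & Service", "In-Room Experience",
    "Food & Beverage", "Facilities & Amenities", "Environment & Location",
    "Value & Loyalty",
}
SENTIMENTS = {"very_positive", "positive", "neutral", "negative", "very_negative"}


def check_extraction(raw: str, review_text: str) -> list:
    """Return a list of rule violations for one agent response (empty if clean)."""
    try:
        doc = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(doc, dict):
        return ["top-level value is not an object"]

    problems = []
    seen = set()
    # NOTE: "aspects", "aspect", "sentiment", "evidence" are hypothetical field names.
    for item in doc.get("aspects") or []:
        if not isinstance(item, dict):
            problems.append("aspect entry is not an object")
            continue
        name = item.get("aspect")
        if name not in ASPECTS:
            problems.append(f"unknown aspect: {name!r}")
        if name in seen:
            problems.append(f"duplicate aspect: {name!r}")
        seen.add(name)
        if item.get("sentiment") not in SENTIMENTS:
            problems.append(f"bad sentiment for {name!r}: {item.get('sentiment')!r}")
        # Evidence should be a verbatim phrase from the review (case-insensitive check).
        evidence = item.get("evidence") or ""
        if evidence and evidence.lower() not in review_text.lower():
            problems.append(f"evidence not verbatim for {name!r}")
    return problems
```

Calling `check_extraction(response_json, review_text)` on a handful of sample responses returns an empty list for clean outputs and a readable list of violations otherwise, which can guide the feedback given in the review step.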



## Batch Inference With IE Agent Endpoint and `AI_QUERY`
#### [IE Query in SQL Editor](<queries/AI Query Extraction with KIE.dbquery.ipynb>)


In [0]:
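# Notebook parameters: the IE agent's serving endpoint plus the source and target table names.
dbutils.widgets.text("ie_agent_endpoint", "kie-8a22274f-endpoint")  # replace with your own agent endpoint name
dbutils.widgets.text("catalog", "main")
dbutils.widgets.text("schema", "voc_demo")
dbutils.widgets.text("reviews_table", "raw_reviews")  # input: raw reviews
dbutils.widgets.text("output_table", "review_extractions")  # output: extraction results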
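
The widget values are combined into three-level table names (`catalog.schema.table`). If the same names are needed from Python rather than SQL, a small helper along these lines could build them. This is a minimal sketch: `qualified_name` is a hypothetical helper, not part of `dbutils`, and in the notebook its arguments would come from `dbutils.widgets.get(...)`.

```python
def qualified_name(catalog: str, schema: str, table: str) -> str:
    """Join three identifier parts into a backtick-quoted three-level name."""
    parts = []
    for part in (catalog, schema, table):
        part = part.strip()
        if not part:
            raise ValueError("catalog, schema and table must all be non-empty")
        # Double any embedded backtick so the quoted identifier stays valid.
        parts.append("`" + part.replace("`", "``") + "`")
    return ".".join(parts)


# In the notebook, the arguments would be read from the widgets, for example:
#   qualified_name(dbutils.widgets.get("catalog"),
#                  dbutils.widgets.get("schema"),
#                  dbutils.widgets.get("reviews_table"))
print(qualified_name("main", "voc_demo", "raw_reviews"))
```

Quoting each part separately keeps names with unusual characters from being misread as extra name levels.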


In [0]:
# %sql
# -- Run the IE agent over every raw review and store the results in the output table.
# CREATE OR REPLACE TABLE IDENTIFIER(:catalog || '.' || :schema || '.' || :output_table) AS
# WITH query_results AS (
#   SELECT
#     review_uid,
#     review_text,
#     -- failOnError => false returns a struct (result, errorMessage) instead of failing the query
#     ai_query(
#       :ie_agent_endpoint,
#       review_text,
#       failOnError => false
#     ) AS response
#   FROM IDENTIFIER(:catalog || '.' || :schema || '.' || :reviews_table)
# )
# SELECT
#   review_uid,
#   review_text,
#   response.result AS response,
#   response.errorMessage AS error
# FROM query_results;
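
The output table holds one row per review, with the agent's response and any error. The next step in the data flow works at the aspect level, so the responses need to be unpacked into one record per review and aspect. The sketch below shows the idea in plain Python on a few in-memory rows. It assumes `response` is stored as a JSON string and uses the hypothetical field names `aspects`, `aspect`, and `sentiment`; the real names depend on `ie_agent_config.json`.

```python
import json
from typing import Iterable, Iterator


def flatten_aspects(rows: Iterable[dict]) -> Iterator[dict]:
    """Yield one record per (review, aspect) from rows shaped like the output table."""
    for row in rows:
        # Skip rows where ai_query reported an error or returned nothing.
        if row.get("error") or not row.get("response"):
            continue
        try:
            doc = json.loads(row["response"])
        except json.JSONDecodeError:
            continue
        if not isinstance(doc, dict):
            continue
        # NOTE: "aspects", "aspect" and "sentiment" are hypothetical field names.
        for item in doc.get("aspects") or []:
            yield {
                "review_uid": row["review_uid"],
                "aspect": item.get("aspect"),
                "sentiment": item.get("sentiment"),
            }


# Illustrative rows only; not real data from the demo tables.
sample_rows = [
    {"review_uid": "r1", "error": None,
     "response": '{"aspects": [{"aspect": "Staff & Service", "sentiment": "positive"}]}'},
    {"review_uid": "r2", "error": "endpoint timeout", "response": None},
]
flat = list(flatten_aspects(sample_rows))
```

Rows with a non-null `error` are skipped here rather than raised, mirroring the `failOnError => false` choice in the query: failed reviews stay visible in the output table and can be retried separately.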
