## 4B. AI-Generated Summaries for Dashboard

Last Updated: 6 Oct 2025 <br>
Description: Gemini (accessed through Vertex AI) was used to generate summaries and actionable insights from the top 20 positive and top 20 negative reviews for each aspect, ranked by relevance score and then by sentiment score.
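
The ranking described above uses two keys. As a minimal plain-Python sketch of that ordering, assuming hypothetical review records with `relevance` and `sentiment_score` fields (the notebook itself sorts pandas columns named per aspect), relevance acts as the primary key and the sentiment score only breaks ties:

```python
def top_reviews(records, n=20):
    """Return the n records with the highest relevance; ties are broken by sentiment score."""
    ranked = sorted(
        records,
        key=lambda r: (r["relevance"], r["sentiment_score"]),  # primary key first
        reverse=True,  # highest scores first, like ascending=False in pandas
    )
    return ranked[:n]


# Hypothetical records, for illustration only
sample = [
    {"text": "a", "relevance": 0.90, "sentiment_score": 0.6},
    {"text": "b", "relevance": 0.90, "sentiment_score": 0.8},
    {"text": "c", "relevance": 0.95, "sentiment_score": 0.1},
]
print([r["text"] for r in top_reviews(sample, n=2)])  # ['c', 'b']
```

Review `c` ranks first despite its low sentiment score, because the sentiment score is only consulted when relevance scores are equal.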

#### Import Libraries

In [2]:
import pandas as pd
from tqdm import tqdm
import time
import os
import vertexai
from vertexai.generative_models import GenerativeModel

#### Configuration

In [4]:
try:
    # 1. Set the path to your JSON credentials file.
    CREDENTIALS_FILE = "credentials/causal-cacao-473308-d4-f7b956a7b6e8.json"
    
    # 2. Set your Google Cloud Project ID.
    PROJECT_ID = "causal-cacao-473308-d4" # Replace with your project ID
    
    # 3. Set the location of your project (e.g., "us-central1").
    LOCATION = "us-central1" # Replace with your project's region

    # Set the credentials environment variable for the session
    os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = CREDENTIALS_FILE
    
    # Initialize the Vertex AI SDK
    vertexai.init(project=PROJECT_ID, location=LOCATION)
    
    print("Vertex AI (Gemini) authenticated successfully using JSON credentials.")

except Exception as e:
    print(f"Error initializing Vertex AI: {e}")
    print("Please ensure your Project ID, Location, and credentials file path are correct.")
    raise  # Re-raise so the cell stops here with the original traceback

Vertex AI (Gemini) authenticated successfully using JSON credentials.


In [13]:
# Define the model name to use
MODEL_NAME = "gemini-2.5-flash-lite"

# Define input/output file paths
INPUT_FILE = "../Output/absa_analysis_twostep_results_wide.csv"
OUTPUT_FILE = "../Output/gemini_actionable_insights.csv"

# The list of aspects you want to analyze
ASPECTS = [
    "Room Quality & Comfort", "Facilities & Amenities", "Customer Service & Staff",
    "Dining Experience", "Casino Experience", "Shopping & Retail",
    "Location & Accessibility", "Pricing & Value for Money", "Atmosphere & Ambience"
]
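
Each aspect name is later turned into three column names (`<aspect>_sentiment`, `<aspect>_relevance_score`, `<aspect>_sentiment_score`), and the reviews are read from `review_text`. A small sketch of a pre-flight check, using hypothetical helper names `expected_columns` and `missing_columns` that are not part of the notebook, could catch a mismatch between `ASPECTS` and the input file before any API call is made:

```python
def expected_columns(aspects):
    """Build the column names the script will look up for each aspect."""
    columns = ["review_text"]
    for aspect in aspects:
        columns += [
            f"{aspect}_sentiment",
            f"{aspect}_relevance_score",
            f"{aspect}_sentiment_score",
        ]
    return columns


def missing_columns(aspects, available):
    """Return the expected columns that are absent from `available`."""
    available = set(available)
    return [c for c in expected_columns(aspects) if c not in available]


# Hypothetical header row, for illustration only
header = ["review_text", "Dining Experience_sentiment", "Dining Experience_relevance_score"]
print(missing_columns(["Dining Experience"], header))
# ['Dining Experience_sentiment_score']
```

In the notebook this would presumably be called as `missing_columns(ASPECTS, df.columns)` right after the CSV is loaded.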

#### Helper Functions

In [25]:
def generate_with_retry(prompt, model_name=MODEL_NAME, retries=3, delay=5):
    """
    Calls the Vertex AI Gemini API with a retry mechanism.
    """
    model = GenerativeModel(model_name)
    for i in range(retries):
        try:
            # Generate content using the Vertex AI method
            response = model.generate_content(prompt)
            return response.text
        except Exception as e:
            print(f"API call failed (attempt {i+1}/{retries}): {e}")
            if i < retries - 1:
                print(f"Retrying in {delay} seconds...")
                time.sleep(delay)
                delay *= 2 # Exponential backoff
            else:
                print("Max retries reached. Returning error.")
                return f"ERROR: API call failed after {retries} retries."
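
To make the retry timing concrete, the sketch below reproduces only the delay arithmetic of `generate_with_retry`, without calling any API. The helper name `backoff_delays` and the `factor` parameter are illustrative assumptions; the function above hard-codes a doubling.

```python
def backoff_delays(retries=3, delay=5, factor=2):
    """Seconds slept before each retry; no sleep follows the final attempt."""
    delays = []
    for attempt in range(retries):
        if attempt < retries - 1:
            delays.append(delay)
            delay *= factor  # exponential backoff
    return delays


print(backoff_delays())       # [5, 10]
print(sum(backoff_delays()))  # 15
```

With the defaults, a call that fails every time spends 15 seconds sleeping, plus the time of the three requests, before the `ERROR:` string is returned.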

In [27]:
def generate_aspect_summary(aspect, reviews, sentiment_type):
    """
    Builds a detailed prompt, calls the Gemini API, and parses the response
    to extract themes and actionable recommendations.
    """
    reviews_text = "\n".join([f"- {review}" for review in reviews])
    
    prompt = f"""
        ### ROLE:
        You are a world-class hospitality management consultant analyzing {sentiment_type} feedback for the Marina Bay Sands hotel. Your analysis must be sharp, concise, and focused on operational improvements.
        
        ### TASK:
        Analyze the following guest reviews which are specifically about the aspect of **"{aspect}"**. Based ONLY on this feedback, identify the top 3 recurring themes and provide specific, actionable recommendations for hotel management.
        
        ### REVIEWS TO ANALYZE:
        {reviews_text}
        
        ### REQUIRED OUTPUT FORMAT:
        Your response MUST strictly follow this format, using these exact tags:
        <ANALYSIS>
        1. **[Theme 1]:** [Briefly describe the first major theme found in the reviews.]
        2. **[Theme 2]:** [Briefly describe the second major theme.]
        3. **[Theme 3]:** [Briefly describe the third major theme.]
        </ANALYSIS>
        <ACTION_PLAN>
        1. **[Action 1]:** [Provide a specific, operational action item that addresses one of the themes.]
        2. **[Action 2]:** [Provide another specific, operational action item.]
        3. **[Action 3]:** [Provide a third specific, operational action item.]
        </ACTION_PLAN>
        """
    
    response_text = generate_with_retry(prompt)
    
    try:
        analysis = response_text.split("<ANALYSIS>")[1].split("</ANALYSIS>")[0].strip()
        action_plan = response_text.split("<ACTION_PLAN>")[1].split("</ACTION_PLAN>")[0].strip()
    except IndexError:
        print(f"  - Warning: Could not parse response for {aspect} ({sentiment_type}). Returning raw response.")
        analysis = f"Parsing Error. Raw Response: {response_text[:500]}..."
        action_plan = "Parsing Error."
        
    return analysis, action_plan
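
The tag extraction can be tried in isolation on a hand-written response. The sketch below is an alternative using a regular expression, not the notebook's own `split`-based code; `extract_tag` is a hypothetical helper, and the sample string is made up for illustration. One difference worth noting: the regex version returns `None` when the closing tag is missing, whereas the `split` version would return everything after the opening tag.

```python
import re


def extract_tag(text, tag):
    """Return the stripped content between <tag> and </tag>, or None if the pair is absent."""
    pattern = rf"<{re.escape(tag)}>(.*?)</{re.escape(tag)}>"
    match = re.search(pattern, text, flags=re.DOTALL)  # DOTALL lets '.' span newlines
    return match.group(1).strip() if match else None


# Made-up response in the format requested by the prompt
sample_response = (
    "<ANALYSIS>\n1. **Theme:** detail\n</ANALYSIS>\n"
    "<ACTION_PLAN>\n1. **Action:** step\n</ACTION_PLAN>"
)
print(extract_tag(sample_response, "ANALYSIS"))     # 1. **Theme:** detail
print(extract_tag(sample_response, "ACTION_PLAN"))  # 1. **Action:** step
print(extract_tag("ERROR: API call failed after 3 retries.", "ANALYSIS"))  # None
```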

#### Script Execution

In [33]:
if __name__ == "__main__":
    
    print(f"\nLoading data from {INPUT_FILE}...")
    try:
        df = pd.read_csv(INPUT_FILE)
        print("Data loaded successfully.")
    except FileNotFoundError:
        print(f"ERROR: Input file not found at '{INPUT_FILE}'")
        print("Please make sure you have run the ABSA script first to generate this file.")
        raise  # Re-raise so the cell stops here with the original traceback

    results_list = []
    
    for aspect in tqdm(ASPECTS, desc="Generating Insights for Aspects"):
        
        sentiment_col = f"{aspect}_sentiment"
        relevance_col = f"{aspect}_relevance_score"
        sentiment_score_col = f"{aspect}_sentiment_score"

        # --- Process POSITIVE Reviews ---
        top_positive_reviews = df[df[sentiment_col] == 'Positive'].sort_values(
            by=[relevance_col, sentiment_score_col], ascending=False
        ).head(20)['review_text'].tolist()

        positive_analysis, positive_action = "", ""
        if top_positive_reviews:
            positive_analysis, positive_action = generate_aspect_summary(aspect, top_positive_reviews, "Positive")
        
        # --- Process NEGATIVE Reviews ---
        top_negative_reviews = df[df[sentiment_col] == 'Negative'].sort_values(
            by=[relevance_col, sentiment_score_col], ascending=False
        ).head(20)['review_text'].tolist()
        
        negative_analysis, negative_action = "", ""
        if top_negative_reviews:
            negative_analysis, negative_action = generate_aspect_summary(aspect, top_negative_reviews, "Negative")

        results_list.append({
            "aspect": aspect,
            "positive_sentiments_analysis": positive_analysis,
            "positive_sentiments_action": positive_action,
            "negative_sentiments_analysis": negative_analysis,
            "negative_sentiments_action": negative_action
        })

        # --- Add a delay to avoid hitting API rate limits ---
        # Pause 10 seconds after each aspect (each aspect makes up to 2 API calls); adjust to your quota.
        print("  - Pausing for 10 seconds to respect API rate limits...")
        time.sleep(10)

    final_insights_df = pd.DataFrame(results_list)

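    # --- SAVE THE FINAL RESULTS ---
    # Create the directory that OUTPUT_FILE actually points to, rather than a hard-coded "output" folder.
    os.makedirs(os.path.dirname(OUTPUT_FILE), exist_ok=True)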
        
    final_insights_df.to_csv(OUTPUT_FILE, index=False, encoding='utf-8-sig')
    
    print("\n" + "="*50)
    print("INSIGHTS GENERATION COMPLETE")
    print("="*50)
    print(f"\nFinal results have been saved to: {OUTPUT_FILE}")


Loading data from output/absa_analysis_twostep_results_wide.csv...
Data loaded successfully.


Generating Insights for Aspects:   0%|          | 0/9 [00:00<?, ?it/s]

  - Pausing for 10 seconds to respect API rate limits...


Generating Insights for Aspects:  11%|█         | 1/9 [00:23<03:04, 23.08s/it]

  - Pausing for 10 seconds to respect API rate limits...


Generating Insights for Aspects:  22%|██▏       | 2/9 [00:39<02:13, 19.02s/it]

  - Pausing for 10 seconds to respect API rate limits...


Generating Insights for Aspects:  33%|███▎      | 3/9 [00:55<01:46, 17.77s/it]

  - Pausing for 10 seconds to respect API rate limits...


Generating Insights for Aspects:  44%|████▍     | 4/9 [01:11<01:25, 17.06s/it]

  - Pausing for 10 seconds to respect API rate limits...


Generating Insights for Aspects:  56%|█████▌    | 5/9 [01:28<01:07, 16.93s/it]

  - Pausing for 10 seconds to respect API rate limits...


Generating Insights for Aspects:  67%|██████▋   | 6/9 [01:44<00:49, 16.57s/it]

  - Pausing for 10 seconds to respect API rate limits...


Generating Insights for Aspects:  78%|███████▊  | 7/9 [02:00<00:32, 16.38s/it]

  - Pausing for 10 seconds to respect API rate limits...


Generating Insights for Aspects:  89%|████████▉ | 8/9 [02:15<00:16, 16.23s/it]

  - Pausing for 10 seconds to respect API rate limits...


Generating Insights for Aspects: 100%|██████████| 9/9 [02:32<00:00, 16.93s/it]


INSIGHTS GENERATION COMPLETE

Final results have been saved to: output/gemini_actionable_insights.csv
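
The run log does not show whether every response parsed cleanly. Since `generate_aspect_summary` writes a placeholder beginning with `Parsing Error` when parsing fails, and the script leaves a field empty when an aspect has no reviews of that sentiment, a quick scan can list the cells that need a second look. This is a sketch with a hypothetical helper `failed_cells`; the sample rows are invented for illustration.

```python
def failed_cells(rows):
    """List (aspect, column) pairs whose value is empty or a parsing-error placeholder."""
    flagged = []
    for row in rows:
        for column, value in row.items():
            if column == "aspect":
                continue
            if not value or value.startswith("Parsing Error"):
                flagged.append((row["aspect"], column))
    return flagged


# Invented rows in the same shape as the entries of results_list
rows = [
    {"aspect": "Dining Experience",
     "positive_sentiments_analysis": "1. **Theme:** detail",
     "positive_sentiments_action": "1. **Action:** step",
     "negative_sentiments_analysis": "Parsing Error. Raw Response: ERROR: API call failed...",
     "negative_sentiments_action": "Parsing Error."},
    {"aspect": "Casino Experience",
     "positive_sentiments_analysis": "1. **Theme:** detail",
     "positive_sentiments_action": "1. **Action:** step",
     "negative_sentiments_analysis": "",
     "negative_sentiments_action": ""},
]
for aspect, column in failed_cells(rows):
    print(f"{aspect}: {column}")
```

In the notebook, `failed_cells(results_list)` could be called before the CSV is written.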



