# AI-Powered SQL Functions – Databricks Notebook Exercises

Workshop Notebook: AI SQL Functions – Building AI into Your SQL Workflows

In this notebook, you will explore Databricks AI Functions – a set of built-in SQL functions that integrate Large Language Models (LLMs) directly into SQL queries. This enables powerful text analysis and generation capabilities within your database (no separate API calls or pipelines required). You’ll practice using these functions on a practical scenario: analyzing customer reviews and automating responses, all in SQL. The notebook is structured with explanatory markdown cells and interactive exercises. Work through each section, running the SQL commands and observing the results. Feel free to experiment with your own inputs as well!

## 1. Built-In AI Functions Quick Tour

First, familiarize yourself with a few key AI functions provided by Databricks SQL. Each function targets a specific language task and runs on a Databricks-hosted model:
- AI_ANALYZE_SENTIMENT(text) – Analyzes the sentiment of the input text, returning a lowercase label such as “positive”, “negative”, “neutral”, or “mixed”.
- AI_CLASSIFY(text, labels_array) – Classifies the input text into one of the given categories. You provide an array of label strings (e.g., array('Spam','Not Spam')), and it returns the label that best fits the text.
- AI_SUMMARIZE(text) – Produces a concise summary of the input text.
- AI_EXTRACT(text, labels_array) – Extracts the specified entities from the text and returns them as one structured value with a field per label (displayed in a JSON-like form).
- (There are additional functions, but we’ll focus on these core ones for now.)

### Exercise 1: Sentiment Analysis of a Single Phrase
Let’s start with a simple sentiment analysis. Run the query below to test AI_ANALYZE_SENTIMENT on a sample sentence:

In [0]:
%sql
-- What sentiment do we get for a clearly positive statement?
SELECT ai_analyze_sentiment('I absolutely love the new coffee blend! It tastes wonderful.') AS sentiment;

After running, observe the output. The result should be a single value in the “sentiment” column, most likely positive (since the sentence is clearly positive).

Question: If you change the input text to something negative (for example, “I am really unhappy with the service.”), what sentiment is returned? Try it out by editing the string below. This illustrates how the function picks up on positive vs. negative tone.

In [0]:
%sql
-- Replace the placeholder with your own statement (for example, a clearly negative one)
SELECT ai_analyze_sentiment('YOUR INPUT HERE') AS sentiment;

### Exercise 2: Classifying Text into Categories
Now, test the AI_CLASSIFY function. We will classify a piece of feedback as either "Complaint" or "Praise".

In [0]:
%sql
-- Classify the tone of the feedback as 'Complaint' or 'Praise'
SELECT ai_classify(
    'The product broke after one use and I am very disappointed.',
    array('Complaint', 'Praise')
  ) AS feedback_type;

Your task: Run the above query and check the output. It should return either "Complaint" or "Praise". Given the input text (which is clearly negative), we expect "Complaint".

To confirm, you can also try a positive feedback example. For instance, change the text to “This cake was absolutely delicious and the service was excellent” and run again – you should see "Praise".

In [0]:
%sql
-- Classify the tone of the feedback as 'Complaint' or 'Praise'
SELECT ai_classify(
    'YOUR TEXT HERE',
    array('Complaint', 'Praise')
  ) AS feedback_type;

### Exercise 3: Summarizing a Customer Review
Next, let’s try out AI_SUMMARIZE. Imagine we have a long customer review, and we want a brief summary.

In [0]:
%sql
-- Summarize a verbose customer review
SELECT ai_summarize(
    'I visited the bakery yesterday. The ambiance was nice, and the staff was friendly. 
     I tried the new croissant and it was flaky and buttery, absolutely perfect! 
     I will be coming back for more, even though the coffee was a bit too strong for my taste.'
  ) AS review_summary;

Run the query and examine the review_summary output. It should condense the multi-sentence review into a shorter sentence or two, capturing the key points (e.g., positive about ambiance and croissant, minor issue with coffee strength).

Feel free to modify the input text or write your own long-ish review to see how well the summarization captures the essence.

In [0]:
%sql
-- Summarize a verbose customer review
SELECT ai_summarize(
    'YOUR LONG-ISH REVIEW HERE'
  ) AS review_summary;
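
If the summary comes back longer than you want, one option is the optional second argument of ai_summarize, which sets a best-effort target word count. The sketch below assumes that argument is available in your workspace's runtime; the 20-word target is an illustrative choice, not a value from this workshop.

```sql
-- Assumption: ai_summarize(content, max_words) is supported in your runtime.
-- max_words is a best-effort target, so the summary may not be exactly 20 words.
SELECT ai_summarize(
    'I visited the bakery yesterday. The ambiance was nice, and the staff was friendly.
     I tried the new croissant and it was flaky and buttery, absolutely perfect!
     I will be coming back for more, even though the coffee was a bit too strong for my taste.',
    20
  ) AS short_summary;
```

Comparing this output with the default summary is a quick way to see how much detail is lost as the target shrinks.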

## 2. Scenario: Automating Customer Review Analysis with AI

Now that you’ve gotten a feel for individual AI functions, let’s apply them in a realistic scenario. UpperBound Bakehouse receives customer reviews (from surveys or online forms), and we want to automate the analysis of these reviews to improve customer satisfaction. Specifically, for each review we aim to:
1. Determine the sentiment (positive/negative).
2. Decide if the review indicates the customer requires a follow-up from support.
3. If a follow-up is needed (for negative experiences), generate a suggested response message addressing the feedback.

To make this hands-on, we’ll generate some sample customer reviews using an LLM, then use AI Functions to analyze them. Finally, we’ll produce an automated response for the negative ones.

## 2.1 Generating Sample Reviews with AI_QUERY
First, we need some example reviews to work with. We’ll use the AI_QUERY function to generate synthetic review text. AI_QUERY allows us to call a specific model endpoint and get its response directly in SQL. We will use one of Databricks’ foundation models as our LLM.

### Exercise 4: Create a Temporary View of Fake Reviews
Run the following to generate 5 sample reviews. We make a single AI_QUERY call that asks the model for five reviews, split the response on blank lines, and use explode to turn the pieces into one row per review:

In [0]:
%sql
-- Generate 5 unique synthetic customer reviews with one LLM call via AI_QUERY,
-- then split the response on blank lines into one row per review
CREATE OR REPLACE TEMPORARY VIEW sample_reviews AS
SELECT explode(split(ai_query(
         "databricks-meta-llama-3-3-70b-instruct",
         'Generate 5 unique short customer reviews for bakery products.
          Include specific details; if negative, mention the issue.
          Vary the tone across the reviews (some positive, some negative).
          Output only the text of each review, separated by a blank line.
          Do not add an introductory sentence, ratings, or reviewer names.'
       ), '\n\n')) AS review_text;

This uses the databricks-meta-llama-3-3-70b-instruct model endpoint to generate five different review texts. We store them in a temporary view sample_reviews.

After creation, quickly preview the data:

In [0]:
%sql
SELECT * 
FROM sample_reviews;

You should see five rows of fairly realistic-sounding review text (e.g., “I tried the chocolate muffin and it was stale...”, “The new seasonal pie was fantastic...”, etc.). Each row is just a single column review_text. Keep these reviews in mind as we proceed. 

**Question:** Do the generated reviews include a mix of sentiments? Ideally, yes – some should be positive and some negative or neutral. If they all look similar or too positive, consider regenerating with a slightly tweaked prompt emphasizing varied sentiment.
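
One thing to watch for: sample_reviews is a view rather than a table, so the AI_QUERY call may run again whenever the view is read, and the reviews can change between queries. If you want a fixed set of reviews for the remaining exercises, a minimal sketch is to copy the current output into a table. The table name sample_reviews_snapshot is hypothetical, and the sketch assumes you are allowed to create tables in your current catalog and schema.

```sql
-- Assumption: you have permission to create tables in the current catalog and schema.
-- sample_reviews_snapshot is a hypothetical name chosen for this sketch.
CREATE OR REPLACE TABLE sample_reviews_snapshot AS
SELECT review_text
FROM sample_reviews
WHERE trim(review_text) <> '';  -- drop empty fragments left over from the split
```

If you take this route, read from sample_reviews_snapshot instead of sample_reviews in the later queries so that every step sees the same reviews.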

## 2.2 Analyzing Reviews: Sentiment and Follow-Up Requirement
With our sample reviews in hand, let’s apply AI functions to extract useful information.

### Exercise 5: Analyze Sentiment and Flag for Follow-Up
We’ll create another view that augments each review with two new pieces of information: the sentiment, and a follow-up flag.

In [0]:
%sql
-- Analyze each review: get sentiment and determine if follow-up is needed
CREATE OR REPLACE TEMPORARY VIEW reviewed_analysis AS
SELECT 
  review_text,
  ai_analyze_sentiment(review_text) AS sentiment,
  ai_classify(
      review_text, 
      array('Follow-up Needed', 'No Follow-up Needed')
    ) AS followup_status
FROM sample_reviews;


Run the above. This query goes through each review_text and:
- assigns a sentiment label (positive, negative, neutral, or mixed),
- classifies the review into either "Follow-up Needed" or "No Follow-up Needed". We chose these two labels to indicate whether the feedback is bad enough that customer support should reach out.

Now, inspect the results:

In [0]:
%sql
SELECT * 
FROM reviewed_analysis;

You’ll see each review with a sentiment and a follow-up status. For example, a very negative review might show sentiment = negative and followup_status = Follow-up Needed, whereas a glowing review would be positive and No Follow-up Needed.

**Questions:**
- Do the sentiment labels correctly reflect the tone of the reviews you read earlier? (A quick manual check helps – e.g., a review praising a product should be labeled positive.)
- Which reviews were flagged as needing follow-up? Why do you think the model marked them? Typically, reviews that mention serious issues, dissatisfaction, or defects should be tagged "Follow-up Needed". Verify if that matches your expectation for each review.

Behind the scenes: AI_CLASSIFY with those two labels is essentially doing a sentiment-based classification, but with a specific business spin: it’s looking for cues of dissatisfaction. The model likely correlates negative sentiment or complaint keywords with "Follow-up Needed". This is an example of how you can customize AI functions for business rules by simply changing the labels or phrasing.
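
To illustrate the point about changing labels, here is a minimal sketch that reuses AI_CLASSIFY with a different label set to route reviews by topic instead of urgency. The four category names are hypothetical and are not part of the workshop data; swap in whatever categories match your own business rules.

```sql
-- The category labels below are hypothetical; replace them with your own.
SELECT
  review_text,
  ai_classify(
      review_text,
      array('Product Quality', 'Customer Service', 'Pricing', 'Other')
    ) AS issue_category
FROM sample_reviews;
```

Only the label array changed; the rest of the query has the same shape as the follow-up classification above.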

## 2.3 Generating Suggested Responses for Negative Reviews
Finally, for any reviews that require follow-up, we’ll use the LLM to generate a brief response message to the customer. This could save our customer support team time by providing a draft email or reply.

### Exercise 6: Generate Follow-up Messages
Let’s query only the reviews that were flagged as needing follow-up, and for each, ask the LLM to produce a response. We will use AI_QUERY again on the same model, this time feeding it a prompt that includes the review text and asks for a helpful reply.

In [0]:
%sql
-- Generate a suggested response for each review that needs follow-up
SELECT 
  review_text,
  sentiment,
  ai_query(
    "databricks-meta-llama-3-3-70b-instruct",
    CONCAT(
      "Compose a brief, polite response to the following customer comment, acknowledging their issue and offering to help: ```", 
      review_text, 
      "```"
    )
  ) AS suggested_response
FROM reviewed_analysis
WHERE followup_status = 'Follow-up Needed';


**Task:** Execute the above query. It takes each review marked "Follow-up Needed" and sends a prompt to the LLM like: "Compose a brief, polite response to the following customer comment... [customer review]." We wrap the review text in triple backticks to clearly delineate it from the instructions in the prompt.

Results: You should get a table of (review_text, sentiment, suggested_response). The suggested_response will be a few sentences addressing the customer. For example, if the review said a product was stale or service was poor, the response might apologize and offer a replacement or discount: “We’re very sorry to hear about your experience with the muffin. We strive for freshness... Please contact us and we’ll make it right...”. 

Take a moment to read the suggested responses. They should be polite, empathetic, and address the problem.

**Question:** Do these AI-generated messages sound appropriate and helpful? Would you edit them before sending to a real customer? It’s important to critically evaluate AI outputs – in many cases they’re a great starting draft, but a human may need to review for tone or specifics. In our case, they should be fairly on-point, since the prompt explicitly asked for a polite, helpful reply.

For completeness, you might also want to see the responses for positive reviews (though we typically wouldn’t follow up on those). You can adjust the WHERE clause to include all reviews (or specifically "No Follow-up Needed") just to see what the model would say. It might produce a generic thank-you note for positive comments if asked.
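
As a sketch of that variation, the query below targets the reviews that were not flagged and asks for a thank-you note instead of an apology. The prompt wording is an assumption made for this example; everything else mirrors the Exercise 6 query.

```sql
-- Draft a short thank-you note for reviews that were NOT flagged for follow-up.
-- The prompt text is an illustrative assumption, not part of the original exercise.
SELECT
  review_text,
  sentiment,
  ai_query(
    "databricks-meta-llama-3-3-70b-instruct",
    CONCAT(
      "Compose a brief, friendly thank-you note in reply to the following customer comment: ```",
      review_text,
      "```"
    )
  ) AS suggested_response
FROM reviewed_analysis
WHERE followup_status = 'No Follow-up Needed';
```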

## 3. (Optional) Using AI_EXTRACT for Structured Extraction

This section is optional and for exploration. Instead of using separate steps for sentiment and classification, Databricks offers an AI_EXTRACT function that can pull out multiple fields from text in one go. If you wanted, you could extract, say, “sentiment” and “issue_type” from a review with one function call.

For instance:

In [0]:
%sql
SELECT ai_extract(
  review_text, 
  array('sentiment','issue_type')
) AS extract_json
FROM sample_reviews;

This would attempt to return a structured result such as {"sentiment": "negative", "issue_type": "defect in product"} for a negative review complaining about a defect. How you read the individual fields depends on the return type in your runtime: a struct supports dot notation, while a JSON string needs the SQL JSON functions. Under the hood, AI_EXTRACT uses the model to fill in the requested labels in a structured format.

We won’t rely on this in our main flow above (to keep things simple and separated), but it’s good to know such capabilities exist for more advanced patterns.
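
As a sketch of how the extracted fields might be split into separate columns, the query below assumes ai_extract returns a STRUCT with one string field per label, which is how the Databricks documentation describes it at the time of writing. If your runtime returns a JSON string or another type instead, the dot notation would need to be swapped for the matching accessor.

```sql
-- Assumption: ai_extract returns a STRUCT whose field names match the labels.
-- If your runtime returns a different type, adjust the field access accordingly.
SELECT
  review_text,
  extracted.sentiment  AS sentiment,
  extracted.issue_type AS issue_type
FROM (
  SELECT
    review_text,
    ai_extract(review_text, array('sentiment', 'issue_type')) AS extracted
  FROM sample_reviews
) AS t;
```

Calling ai_extract once in the subquery and reading both fields from the result avoids invoking the model twice per review.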

## 4. Wrap-Up and Next Steps

In this notebook, we:
- Used built-in AI Functions (ai_analyze_sentiment, ai_classify, ai_summarize) to directly analyze text in SQL, leveraging Databricks-hosted LLMs (no external infrastructure needed).
- Employed ai_query to call a specific large language model for custom tasks: generating synthetic data and creating tailored responses. This showcases the flexibility to plug in different models (Databricks Foundation Models, open-source LLMs, or external APIs) via the same SQL interface.
- Built a mini pipeline that turns raw unstructured customer reviews into actionable insights (sentiment, follow-up flag) and even auto-generated response drafts – all within a SQL workflow. This is a powerful example of an AI system in the database, useful for customer experience management, support ticket triage, and more.

**Discussion:** Think about how you could incorporate these techniques into your own projects. 

Finally, remember that with great power comes great responsibility:
- Always review AI outputs, especially if they will be customer-facing. Quality can vary, and models might occasionally produce irrelevant or incorrect content.
- Be mindful of cost and performance – LLM calls are not free or instantaneous. We used only a handful of examples here; in production you would run these functions as batch jobs over many rows, so estimate cost and latency on a small sample first.
- Ensure compliance with data privacy and model usage policies (e.g., don’t send sensitive data to external models without proper handling).

You have now completed the AI SQL Functions exercises! Feel free to experiment further in this notebook – perhaps try summarizing all reviews in one query, or use AI_CLASSIFY with more categories (like “Positive”, “Negative”, and “Neutral”) on the review text and compare the result with AI_ANALYZE_SENTIMENT. The more you play with these functions, the more ideas you’ll get for applying them in your AI-powered data pipelines.
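
As a starting point for that last comparison, a minimal sketch is to run both functions side by side on the same reviews. The column aliases are hypothetical names chosen for this example.

```sql
-- Compare the built-in sentiment function with a three-label classification.
-- Column aliases are illustrative; label casing may differ between the two outputs.
SELECT
  review_text,
  ai_analyze_sentiment(review_text) AS sentiment_builtin,
  ai_classify(
      review_text,
      array('Positive', 'Negative', 'Neutral')
    ) AS sentiment_via_classify
FROM sample_reviews;
```

Because the two functions may return labels with different casing, wrap both columns in lower() if you want to check agreement programmatically.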