In [0]:
%pip install -U databricks-sdk==0.41.0 langchain-community==0.2.16 langchain-openai==0.1.19 mlflow==2.20.2
dbutils.library.restartPython()

Collecting databricks-sdk==0.41.0
  Using cached databricks_sdk-0.41.0-py3-none-any.whl.metadata (38 kB)
Collecting langchain-community==0.2.16
  Downloading langchain_community-0.2.16-py3-none-any.whl.metadata (2.7 kB)
Collecting langchain-openai==0.1.19
  Using cached langchain_openai-0.1.19-py3-none-any.whl.metadata (2.6 kB)
Collecting mlflow==2.20.2
  Using cached mlflow-2.20.2-py3-none-any.whl.metadata (30 kB)
Collecting SQLAlchemy<3,>=1.4 (from langchain-community==0.2.16)
  Using cached sqlalchemy-2.0.41-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (9.6 kB)
Collecting aiohttp<4.0.0,>=3.8.3 (from langchain-community==0.2.16)
  Using cached aiohttp-3.11.18-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (7.7 kB)
Collecting dataclasses-json<0.7,>=0.5.7 (from langchain-community==0.2.16)
  Using cached dataclasses_json-0.6.7-py3-none-any.whl.metadata (25 kB)
Collecting langchain<0.3.0,>=0.2.16 (from langchain-community==0.2.16)
  Using cach


## Compound AI system for your cookie franchise's Instagram web campaign!

<a href="https://youtu.be/UfbyzK488Hk?si=qzMgcSjcBhXOnDBz&t=3496" target="_blank"><img style="float:right; padding: 20px" width="500px" src="https://github.com/databricks-demos/dbdemos-resources/blob/main/images/product/llm-tools-functions/llm-tools-cookie-video.png?raw=true" ></a>

In this notebook, we'll see how to create a compound AI system that analyzes your data and recommends what to publish in your Instagram web campaign, based on your own customer reviews!

We will:
- Create simple SQL functions to analyze your data and register them in Unity Catalog (UC)
- Create more advanced Python functions in UC
- Use LangChain to bind these functions as tools
- Create an Agent that can execute these tools and ask it some tough questions


### AI with general intelligence isn't enough!

LLMs are very powerful, but they don't know your business. If we ask an LLM to write a post for our cookies, it'll generate a very general message.

AI needs to be customized to your own business with your own data to be relevant!

Let's give it a try and ask a general AI to generate an Instagram post for our cookie franchise:


In [0]:
%sql
-- An uninspiring, general-purpose message :/
select ai_gen('Help me write an instagram message for the customers of my Seattle store, we want to increase our sales for the top 1 cookie only.')

"ai_gen(Help me write an instagram message for the customers of my Seattle store, we want to increase our sales for the top 1 cookie only.)"
"Here are a few options for an Instagram message to promote your top-selling cookie and drive sales: **Option 1: Simple and Sweet** ""Cookie lovers, rejoice! Our top-selling cookie is a fan favorite for a reason! Try one today and taste the difference for yourself. Limited time offer: buy a dozen of our signature cookie and get 10% off your purchase! #seattlecookies #cookiesofinstagram #yum"" **Option 2: Create a Sense of Urgency** ""COOKIE ALERT! For a limited time, we're offering a special deal on our #1 best-selling cookie! Buy a half-dozen and get a FREE upgrade to a dozen! Don't miss out on this amazing opportunity to stock up on your favorite treat. Use code COOKIELOVE at checkout. #seattlecookies #limitedtimeoffer #cookies"" **Option 3: Emphasize the Local Love** ""Seattle, we love you! To show our appreciation for our amazing customers, we're highlighting our top-selling cookie - made with love, right here in the Emerald City! Try one today and support local business. Use code SEATTLECOOKIE at checkout to receive 15% off your purchase! #seattlecookies #locallove #supportsmallbusiness"" **Option 4: Get Creative with a Challenge** ""The Great Cookie Challenge! Can you resist the allure of our top-selling cookie? We didn't think so! Take the challenge and try one today. Share a photo of you enjoying our signature cookie and tag us for a chance to win a FREE dozen! #seattlecookies #cookiechallenge #yum"" Choose the one that resonates with your brand and style, or feel free to modify any of these options to fit your needs!"


## Generating personalized, impactful AI with the Data Intelligence Platform

Because Databricks makes it easy to bring all your data together, we can easily leverage your own data and use it to super-charge the AI.

Instead of General Intelligence, you'll have an extra-smart AI that understands your business.

Let's see how Databricks lets us generate ultra-personalized content and Instagram messages to drive more cookie love!

### 1/ Prepare and review the dataset

Dbdemos loaded the dataset for you in your catalog. In the real world, you'd have a few Delta Live Tables pipelines together with Lakeflow Connect to ingest your ERP/Salesforce/website data.

Let's start by reviewing the dataset and understanding how it can help us make personalized recommendations.
*Note: the cookie data is now also available in the Databricks Marketplace, so feel free to pull it from there.*

#### Let's focus on the main tables
- **sales.franchises** - Contains information about all our business franchises (store name, location, country, size, etc.).
- **sales.transactions** - Contains sales data for all franchises.
- **media.customer_reviews** - Social media reviews for a bunch of different franchises.   

In [0]:
%run ./_resources/00-init-cookie $reset_all=false

USE CATALOG `main__build`
using catalog.database `main__build`.`dbdemos_agent_tools`


In [0]:
%sql
SELECT * FROM cookies_franchises;

franchiseID,name,city,district,zipcode,country,size,longitude,latitude,supplierID
3000002,Golden Crumbs,San Francisco,Mission,94110,US,XL,-122.4194,37.7593,4000002
3000000,The Crumbly Nook,Sydney,Bondi Beach,2026,Australia,L,151.2763,-33.8915,4000000
3000001,Tokyo Treats,Tokyo,Shibuya,150-0042,Japan,M,139.701,35.6586,4000001
3000003,Sugar Rush,Melbourne,Fitzroy,3065,Australia,S,144.9804,-37.8005,4000003
3000004,Kobe Konfections,Kobe,Kitano,657-0838,Japan,XXL,135.2216,34.6806,4000004
3000005,Floured Fantasies,Los Angeles,Silver Lake,90026,US,XL,-118.2674,34.0877,4000005
3000006,Crumby Delights,Brisbane,West End,4101,Australia,M,153.0093,-27.4792,4000006
3000007,Kyoto Kravings,Kyoto,Gion,605-0811,Japan,S,135.9792,35.0036,4000007
3000008,Doughy Dreams,Chicago,Wicker Park,60622,US,XXL,-87.6782,41.9064,4000008
3000009,The Baking Lab,Perth,Leederville,6007,Australia,L,115.8344,-31.9393,4000009


In [0]:
%sql
SELECT * FROM cookies_transactions

transactionID,customerID,franchiseID,dateTime,product,quantity,unitPrice,totalPrice,paymentMethod,cardNumber
2002961,1000253,3000047,2024-05-14T12:17:01.495952Z,Golden Gate Ginger,8,3,24,amex,378154478982993
2003007,1000226,3000047,2024-05-10T23:10:10.239954Z,Austin Almond Biscotti,36,3,108,mastercard,2244626981238094
2003017,1000108,3000047,2024-05-16T16:34:10.61372Z,Austin Almond Biscotti,40,3,120,mastercard,2490570234487424
2003068,1000173,3000047,2024-05-02T04:31:51.612094Z,Pearly Pies,28,3,84,amex,343808569426192
2003103,1000075,3000047,2024-05-04T23:44:26.902224Z,Pearly Pies,28,3,84,visa,4377080942201798
2003147,1000295,3000047,2024-05-15T16:17:06.25945Z,Austin Almond Biscotti,32,3,96,amex,371093774812677
2003196,1000237,3000047,2024-05-07T11:13:22.469231Z,Tokyo Tidbits,40,3,120,mastercard,5538807345848392
2003329,1000272,3000047,2024-05-06T03:32:16.017968Z,Outback Oatmeal,28,3,84,visa,4872480716880043
2001264,1000209,3000047,2024-05-16T17:32:28.547589Z,Pearly Pies,28,3,84,mastercard,5287105980593305
2001287,1000120,3000047,2024-05-15T08:41:28.406738Z,Austin Almond Biscotti,40,3,120,amex,376211012259783


## 2/ Create our Unity Catalog Functions

The cookie data now exists in your catalog.

We'll now start creating custom functions that our Compound AI system will understand and leverage to gain insight into your business.

### Allow your LLM to retrieve the franchises for a given city

In [0]:
%sql
--Now we create our first function. This takes in a city name and returns a table of any franchises that are in that city.
--Note that we've added a comment to the input parameter to help guide the agent later on.
CREATE OR REPLACE FUNCTION
cookies_franchise_by_city ( city_name STRING COMMENT 'City to be searched' )
returns table(franchiseID BIGINT, name STRING, size STRING)
LANGUAGE SQL
-- Make sure to add a comment so that your AI understands what it does
COMMENT 'This function takes in a city name and returns a table of any franchises that are in that city.'
return
(SELECT franchiseID, name, size from cookies_franchises where lower(city)=lower(city_name) order by size desc)

In [0]:
%sql
-- Test the function we just created
SELECT * from cookies_franchise_by_city('Seattle')

franchiseID,name,size
3000038,Dough Delights,XXL
3000014,Sweet Temptations,L
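
One thing to keep in mind about `order by size desc` in the function above: `size` is a STRING column, so rows are ordered as text rather than by store size. For Seattle the two orderings happen to agree (XXL before L), but they can differ for other cities. The sketch below illustrates the difference in plain Python using size labels from the franchises table; the `SIZE_RANK` mapping is an assumption about how the labels should rank, not something defined in the dataset.

```python
# Assumed ranking of the size labels, smallest to largest (not defined in the dataset).
SIZE_RANK = {"S": 0, "M": 1, "L": 2, "XL": 3, "XXL": 4}

# A few franchises and their sizes, taken from the cookies_franchises sample above.
franchises = [
    {"name": "Golden Crumbs", "size": "XL"},
    {"name": "Tokyo Treats", "size": "M"},
    {"name": "Sugar Rush", "size": "S"},
    {"name": "Kobe Konfections", "size": "XXL"},
    {"name": "The Crumbly Nook", "size": "L"},
]

# Plain string ordering, comparable to ORDER BY on a STRING column.
alphabetical_desc = sorted(franchises, key=lambda f: f["size"], reverse=True)

# Ordering by explicit rank, largest store first.
by_rank_desc = sorted(franchises, key=lambda f: SIZE_RANK[f["size"]], reverse=True)

print([f["size"] for f in alphabetical_desc])  # ['XXL', 'XL', 'S', 'M', 'L']
print([f["size"] for f in by_rank_desc])       # ['XXL', 'XL', 'L', 'M', 'S']
```

If the largest store should really come first, one option would be to order by a rank expression (for example a `CASE` on `size`) instead of the raw string.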


### Sum all the sales for a given franchise, grouping the result per product

This lets the AI find the products generating the most revenue.

In [0]:
%sql
--This function takes an ID as input, and this time does an aggregate to return the sales for that franchise_id.
CREATE OR REPLACE FUNCTION
cookies_franchise_sales (franchise_id BIGINT COMMENT 'ID of the franchise to be searched')
returns table(total_sales BIGINT, total_quantity BIGINT, product STRING)
LANGUAGE SQL
COMMENT 'This function takes an ID as input, and this time does an aggregate to return the sales for that franchise_id'
return
  (SELECT SUM(totalPrice) AS total_sales, SUM(quantity) AS total_quantity, product 
    FROM cookies_transactions 
      WHERE franchiseID = franchise_id GROUP BY product)

In [0]:
%sql
-- Test the function we just created
SELECT * from cookies_franchise_sales(3000005)

total_sales,total_quantity,product
243,81,Golden Gate Ginger
126,42,Tokyo Tidbits
117,39,Pearly Pies
234,78,Austin Almond Biscotti
291,97,Outback Oatmeal
261,87,Orchard Oasis
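
To make the aggregation concrete, here is a minimal plain-Python sketch of the same `GROUP BY product` logic. It runs over the ten sample rows of `cookies_transactions` shown earlier (franchise 3000047), not over the full table, and the helper name `sales_by_product` is made up for this illustration.

```python
from collections import defaultdict

# (product, quantity, totalPrice) from the cookies_transactions sample rows above.
rows = [
    ("Golden Gate Ginger", 8, 24),
    ("Austin Almond Biscotti", 36, 108),
    ("Austin Almond Biscotti", 40, 120),
    ("Pearly Pies", 28, 84),
    ("Pearly Pies", 28, 84),
    ("Austin Almond Biscotti", 32, 96),
    ("Tokyo Tidbits", 40, 120),
    ("Outback Oatmeal", 28, 84),
    ("Pearly Pies", 28, 84),
    ("Austin Almond Biscotti", 40, 120),
]

def sales_by_product(rows):
    """Sum totalPrice and quantity per product, like the SQL GROUP BY."""
    totals = defaultdict(lambda: {"total_sales": 0, "total_quantity": 0})
    for product, quantity, total_price in rows:
        totals[product]["total_sales"] += total_price
        totals[product]["total_quantity"] += quantity
    return dict(totals)

summary = sales_by_product(rows)

# The product with the highest revenue is what the agent needs for a "top 1 cookie" post.
top_product = max(summary, key=lambda p: summary[p]["total_sales"])
print(top_product, summary[top_product])
```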


### Get the best feedback for a given franchise

This will let the AI understand what local customers like about our cookies!

In [0]:
%sql
--This function takes a franchise ID as input and asks AI_GEN to summarize that franchise's best reviews (4 stars and above).
CREATE OR REPLACE FUNCTION
cookies_summarize_best_sellers_feedback (franchise_id BIGINT COMMENT 'ID of the franchise to be searched')
returns STRING
LANGUAGE SQL
COMMENT 'This function will fetch the best feedback from a product and summarize them'
return
  SELECT AI_GEN(SUBSTRING('Extract the top 3 reason people like the cookies based on this list of review:' || ARRAY_JOIN(COLLECT_LIST(review), ' - '), 1, 80000)) AS all_reviews
  FROM cookies_customer_reviews
    where franchiseID = franchise_id and review_stars >= 4
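
The prompt sent to `AI_GEN` is built by joining all matching reviews with ' - ' and keeping only the first 80,000 characters, which keeps the request bounded when a franchise has many reviews. A minimal Python sketch of that string handling follows; `build_feedback_prompt` and the sample reviews are hypothetical, and the instruction text is copied as-is from the SQL above.

```python
# Instruction text copied verbatim from the SQL function above.
INSTRUCTION = "Extract the top 3 reason people like the cookies based on this list of review:"
MAX_PROMPT_CHARS = 80000  # same limit as SUBSTRING(..., 1, 80000)

def build_feedback_prompt(reviews, max_chars=MAX_PROMPT_CHARS):
    """Join the reviews with ' - ' and truncate the whole prompt to max_chars."""
    prompt = INSTRUCTION + " - ".join(reviews)
    return prompt[:max_chars]

# Hypothetical reviews, for illustration only.
short_prompt = build_feedback_prompt(["Great cookies", "Loved the crunch"])
long_prompt = build_feedback_prompt(["x" * 50000] * 3)

print(short_prompt)
print(len(long_prompt))
```

Note that the cut happens at a fixed character position, so the last review in a very long list may be truncated mid-sentence.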

In [0]:
%sql
-- Test the function we just created
SELECT cookies_summarize_best_sellers_feedback(3000005)

main.dbdemos_agent_tools.cookies_summarize_best_sellers_feedback(3000005)
"Based on the reviews, the top 3 reasons people like the cookies at Bakehouse in Silver Lake, Los Angeles are: 1. **Unique and delicious flavor combinations**: Reviewers rave about the combination of ingredients and flavors in the cookies, such as the ""symphony of oats and raisins"" in the Outback Oatmeal cookies and the ""fruity delight"" of the Orchard Oasis cookies. 2. **Texture and crunch**: Reviewers enjoy the varying textures of the cookies, including the crunch of the Austin Almond Biscotti and the chewiness of the Outback Oatmeal cookies. 3. **High-quality ingredients and satisfying taste**: Reviewers appreciate the wholesome and satisfying taste of the cookies, with one reviewer describing the Outback Oatmeal cookies as ""wholesome and satisfying"" and another reviewer calling the Austin Almond Biscotti ""utterly addictive""."


## 3/ Testing our AI with Databricks Playground

<img src="https://github.com/databricks-demos/dbdemos-resources/blob/main/images/product/llm-tools-functions/llm-tools-functions-playground.gif?raw=true" width="500px" style="float: right; margin-left: 10px; margin-bottom: 10px;">


Your functions are now available in Unity Catalog!

To try out our functions with Playground:
- Open the [Playground](/ml/playground)
- Select a model that supports tools (like Llama3.1)
- Add the functions you want your model to leverage (`catalog.schema.function_name`)
- Ask a question (for example, to write an Instagram post for Seattle), and Playground will do the magic for you!


### 4/ DIY: Create your own Agent chaining the tools with LangChain

In this step, we're going to define three crucial parts of our agent:
- Tools for the Agent to use
- LLM to serve as the agent's "brains"
- System prompt that defines guidelines for the agent's tasks

In [0]:
from langchain_community.tools.databricks import UCFunctionToolkit
import pandas as pd
wh = get_shared_warehouse(name = None) #Get the first shared wh we can. See _resources/01-init for details
print(f'This demo will be using the {wh.name} to execute the functions')

def get_tools():
    return (
        UCFunctionToolkit(warehouse_id=wh.id)
        # Include functions as tools using their qualified names.
        # You can use "{catalog_name}.{schema_name}.*" to get all functions in a schema.
        .include(f"{catalog}.{db}.cookies_franchise_by_city", 
                 f"{catalog}.{db}.cookies_franchise_sales", 
                 f"{catalog}.{db}.cookies_summarize_best_sellers_feedback")
        .get_tools())

display_tools(get_tools()) #display the tools in a table - see _resources/00-init for details

This demo will be using the dbdemos-shared-endpoint to execute the functions


name,description,args_schema,return_direct,verbose,callbacks,callback_manager,tags,metadata,handle_tool_error,handle_validation_error,response_format,func,coroutine
main__build__dbdemos_agent_tools__cookies_franchise_by_city,This function takes in a city name and returns a table of any franchises that are in that city.,,False,False,,,,,False,False,content,.func at 0x770c7872a3e0>,
main__build__dbdemos_agent_tools__cookies_franchise_sales,"This function takes an ID as input, and this time does an aggregate to return the sales for that franchise_id",,False,False,,,,,False,False,content,.func at 0x770c9d73fc40>,
ld__dbdemos_agent_tools__cookies_summarize_best_sellers_feedback,This function will fetch the best feedback from a product and summarize them,,False,False,,,,,False,False,content,.func at 0x770c78024860>,
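
The tool names in this table look like the qualified function name with dots replaced by double underscores, and the third name starts with `ld__` because only the last 64 characters appear to be kept. The sketch below reproduces the three observed names under that assumption; it is inferred from the output above, and `tool_name` is a hypothetical helper, not the toolkit's API.

```python
MAX_TOOL_NAME_LENGTH = 64  # assumed limit, inferred from the truncated name above

def tool_name(catalog, schema, function):
    """Build a tool name from a qualified UC function name (assumed convention)."""
    qualified = f"{catalog}.{schema}.{function}"
    # Replace dots with double underscores, then keep only the trailing characters.
    return qualified.replace(".", "__")[-MAX_TOOL_NAME_LENGTH:]

catalog, schema = "main__build", "dbdemos_agent_tools"
for fn in ("cookies_franchise_by_city",
           "cookies_franchise_sales",
           "cookies_summarize_best_sellers_feedback"):
    print(tool_name(catalog, schema, fn))
```

This matters when reading the agent trace: the name the agent invokes is the (possibly truncated) tool name, not the function name as written in Unity Catalog.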


In [0]:
from langchain_community.chat_models.databricks import ChatDatabricks

#We're going to use llama 3.3 because it's tool enabled and works great. Keep temp at 0 to make it more deterministic.
llm = ChatDatabricks(endpoint="databricks-meta-llama-3-3-70b-instruct",
    temperature=0.0,
    streaming=False)

In [0]:
from langchain_core.prompts import ChatPromptTemplate

#This defines our agent's system prompt. Here we can tell it what we expect it to do and guide it on using specific functions.

#Note: history is currently unused; it defaults to None to avoid a mutable default argument.
def get_prompt(history=None, prompt=None):
    if not prompt:
            prompt = """You are a helpful assistant for a global company that oversees cookie stores. Your task is to help store owners understand more about their products and sales metrics. You have the ability to execute functions as follows: 

            Use the franchise_by_city function to retrieve the franchiseID for a given city name.

            Use the franchise_sales function to retrieve the cookie sales for a given franchiseID.

           Use the cookies_summarize_best_sellers_feedback function to understand what customers like the most.

    Make sure to call the function for each step and provide a coherent response to the user. Don't mention tools to your users. Don't skip to the next step without ensuring the function was called and a result was retrieved. Only answer what the user is asking for. If a user ask to generate instagram posts, make sure you know what customers like the most to make the post relevant."""
    return ChatPromptTemplate.from_messages([
            ("system", prompt),
            ("human", "{messages}"),
            ("placeholder", "{agent_scratchpad}"),
    ])

In [0]:
from langchain.agents import AgentExecutor, create_openai_tools_agent

prompt = get_prompt()
tools = get_tools()
agent = create_openai_tools_agent(llm, tools, prompt)

#Put the pieces together to create our Agent
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

In [0]:
from operator import itemgetter
from langchain.schema.runnable import RunnableLambda
from langchain_core.output_parsers import StrOutputParser

#Very basic chain that allows us to pass the input (messages) into the Agent and collect the (output) as a string
agent_str = ({ "messages": itemgetter("messages")} | agent_executor | itemgetter("output") | StrOutputParser())
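
The chain above pipes three steps together: pick the `messages` key from the input, run the agent, then pick the `output` key from the agent's result. Here is a rough plain-Python equivalent of that data flow. `fake_agent_executor` is a hypothetical stand-in for `agent_executor` (the real one calls the LLM and the tools), and the sketch assumes the executor returns a dict containing an `output` key.

```python
from operator import itemgetter

def fake_agent_executor(inputs):
    """Hypothetical stand-in for agent_executor: echoes the request in its output."""
    return {"messages": inputs["messages"],
            "output": f"Draft post for: {inputs['messages']}"}

def run_chain(payload):
    # Step 1: keep only the "messages" key from the caller's payload.
    agent_input = {"messages": itemgetter("messages")(payload)}
    # Step 2: run the agent, which returns a dict.
    agent_result = fake_agent_executor(agent_input)
    # Step 3: extract the final answer and return it as a string.
    return str(itemgetter("output")(agent_result))

print(run_chain({"messages": "Write a post for Seattle", "ignored": True}))
```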

In [0]:
#Let's ask our Compound AI Agent to generate an Instagram post. This requires it to:
#     1. Look up what stores are in Seattle
#     2. Use sales data to look up the best selling cookie at that store
# COMING SOON:
#     3. Retrieve reviews from a vector search endpoint
#     4. Generate images from the shutterstock model

answer=agent_str.invoke({"messages": "Help me write an instagram message for the customers of my Seattle store, we want to increase our sales for the top 1 cookie only."})

Error in StdOutCallbackHandler.on_chain_start callback: AttributeError("'NoneType' object has no attribute 'get'")


Invoking: `main__build__dbdemos_agent_tools__cookies_franchise_by_city` with `{'city_name': 'Seattle'}`

{"format": "CSV", "value": "franchiseID,name,size\n3000038,Dough Delights,XXL\n3000014,Sweet Temptations,L\n", "truncated": false}

Invoking: `main__build__dbdemos_agent_tools__cookies_franchise_sales` with `{'franchise_id': 3000038}`

{"format": "CSV", "value": "total_sales,total_quantity,product\n108,36,Austin Almond Biscotti\n348,116,Outback Oatmeal\n138,46,Pearly Pies\n189,63,Golden Gate Ginger\n51,17,Tokyo Tidbits\n108,36,Orchard Oasis\n", "truncated": false}

Invoking: `ld__dbdemos_agent_tools__cookies_summarize_best_sellers_feedback` with `{'franchise_id': 3000038}`

{"format": "SCALAR", "value": "Based on the review, here are the top 3 reasons people like the cookies from Bakehouse in Ballard, Seattle:\n\n1. **Texture**: The reviewer loves that the Outback Oatmeal cookies 

Trace(request_id=tr-b4424f0674fb444292c5d02e59d79a62)
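
In the trace, each tool returns a JSON payload whose `value` field holds the table as CSV text. The LLM reads that text directly, but the same step can be checked by hand. The sketch below parses the sales payload exactly as it appears in the trace and picks the product with the highest `total_sales`; the helper name `top_product` is made up for this illustration.

```python
import csv
import io
import json

# Sales payload copied from the trace above (raw string so "\n" stays a JSON escape).
payload = r'{"format": "CSV", "value": "total_sales,total_quantity,product\n108,36,Austin Almond Biscotti\n348,116,Outback Oatmeal\n138,46,Pearly Pies\n189,63,Golden Gate Ginger\n51,17,Tokyo Tidbits\n108,36,Orchard Oasis\n", "truncated": false}'

def top_product(payload):
    """Return (product, total_sales) for the best-selling product in a CSV payload."""
    result = json.loads(payload)
    reader = csv.DictReader(io.StringIO(result["value"]))
    best = max(reader, key=lambda row: int(row["total_sales"]))
    return best["product"], int(best["total_sales"])

print(top_product(payload))  # ('Outback Oatmeal', 348)
```

This matches the cookie the agent chose for the Seattle post.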

### That's it! Our super customized post is ready:

In [0]:
print(answer)

Based on the sales data and customer feedback, it seems that the Outback Oatmeal cookie is the top-selling cookie at your Seattle store. Customers love its soft and chewy texture, as well as its perfectly spiced flavor. Here's a possible Instagram message to increase sales for this cookie:

"Calling all cookie lovers in Seattle! Our Outback Oatmeal cookie is a fan favorite, and for good reason! Its soft and chewy texture, combined with its perfectly spiced flavor, makes it the perfect treat to brighten up your day. Try one today and taste the difference for yourself! #OutbackOatmeal #CookieLove #SeattleTreats"
