<a href="https://colab.research.google.com/github/Y-YHat/dexter_ai/blob/main/3_1_Agent_Design_in_Databricks.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>


<div style="text-align: center; line-height: 0; padding-top: 9px;">
  <img src="https://databricks.com/wp-content/uploads/2018/03/db-academy-rgb-1200px.png" alt="Databricks Learning">
</div>



# Agent Design in Databricks

In the previous demo, we built a multi-stage AI system by manually stitching its stages together. With agents, we can build the same system in an autonomous way. An agent typically has a brain that makes the decisions, a planning outline, and tools to use.

In this demo, we will create two types of agents. The first agent will use **a search engine, Wikipedia, and YouTube** to recommend a movie, collect data about the movie, and show the trailer video.

The second agent is a very specific type of agent; it will allow us to "talk with data" using natural language queries.

**Learning Objectives:**

*By the end of this demo, you will be able to:*

* Build semi-automated systems with LLM agents to perform internet searches and dataset analysis using LangChain.

* Use the appropriate tool for the task the agent needs to achieve.

* Explore LangChain’s built-in agents for specific, advanced workflows.

* Create a Pandas DataFrame Agent to interact with a Pandas DataFrame as needed.


## Requirements

Please review the following requirements before starting the lesson:

* To run this notebook, you need to use one of the following Databricks runtime(s): **14.3.x-cpu-ml-scala2.12** or **14.3.x-scala2.12**



## Classroom Setup

Before starting the demo, run the provided classroom setup script. This script will define configuration variables necessary for the demo. Execute the following cell:

In [None]:
%pip install --upgrade --quiet langchain==0.1.16 langchain-core langchain_community==0.0.36 langchain-experimental youtube_search wikipedia==1.4.0 duckduckgo-search

In [None]:
%run ../Includes/Classroom-Setup-04

In [None]:
!pip install openai



**Other Conventions:**

Throughout this demo, we'll refer to the object `DA`. This object, provided by Databricks Academy, contains variables such as your username, catalog name, schema name, working directory, and dataset locations. Run the code block below to view these details:

In [None]:
print(f"Username:          {DA.username}")
print(f"Catalog Name:      {DA.catalog_name}")
print(f"Schema Name:       {DA.schema_name}")
print(f"Working Directory: {DA.paths.working_dir}")
print(f"Dataset Location:  {DA.paths.datasets}")

## Create an Autonomous Agent (Brixo 🤖)

In the previous demo, we created chains using various combinations of prompts and tools to solve a problem defined by the prompt. In chains, we need to define the input parameters and prompts ourselves.

In this demo, we will create an agent that can **autonomously reason** about the steps to take and select **the tools** to use for each task.

**🤖 Agent name: Brixo :)**

**✅ Agent Abilities: This agent can help you by suggesting fun activities, picking videos, and even writing code.**
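
To make the idea of "reasoning about steps and selecting tools" concrete, here is a minimal sketch of an agent loop in plain Python. It is not how LangChain implements agents: the "brain" is a scripted stand-in for an LLM, and the names `TOOLS`, `scripted_brain`, and `run_agent` are hypothetical, chosen only for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical tools: plain functions keyed by name.
TOOLS: Dict[str, Callable[[str], str]] = {
    "search": lambda query: f"top result for '{query}'",
    "calculator": lambda expr: str(sum(int(x) for x in expr.split("+"))),
}

# One history entry per step: (action, action_input, observation).
Step = Tuple[str, str, str]


def scripted_brain(question: str, history: List[Step]) -> Tuple[str, str]:
    """Stand-in for the LLM: decide the next (action, action_input)."""
    if not history:
        # First step: pick a tool and its input.
        return "calculator", "2+3"
    # A tool has already run: finish using its observation.
    last_observation = history[-1][2]
    return "finish", f"The answer is {last_observation}"


def run_agent(question, brain, tools, max_steps=3):
    """Ask the brain for an action, run the tool, and repeat until it finishes."""
    history: List[Step] = []
    for _ in range(max_steps):
        action, action_input = brain(question, history)
        if action == "finish":
            return action_input
        observation = tools[action](action_input)
        history.append((action, action_input, observation))
    return "Stopped: step limit reached"


print(run_agent("What is 2 + 3?", scripted_brain, TOOLS))
```

The step limit plays a role similar to an iteration cap on a real agent: it keeps a brain that never decides to finish from looping forever.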

### Define the Brain of the Agent

The LLM is the brain of the agent. On Databricks, we use **Databricks' DBRX model** as the brain of our agent; the Colab cells below use an OpenAI model instead.

In [None]:
! pip install langchain-community langchain-core




In [None]:
from google.colab import userdata
import os

os.environ["OPENAI_API_KEY"] = userdata.get("OPENAI_API_KEY")

In [None]:
import openai
openai.api_key = os.getenv('OPENAI_API_KEY')



In [None]:
! pip install langchain-openai



In [20]:
import os

from langchain_openai import OpenAI
from langchain.agents import AgentExecutor
from langchain.agents.react.agent import create_react_agent

# Do not re-run `pip install langchain` here: it reinstalls an older
# langchain-core that conflicts with langchain-openai.

# The API key is read from the OPENAI_API_KEY environment variable set above.
# "gpt-3.5-turbo" is a chat model; the completion-style OpenAI class needs an
# instruct model such as "gpt-3.5-turbo-instruct".
llm_openai = OpenAI(model_name="gpt-3.5-turbo-instruct", max_tokens=500)




In [28]:
from langchain_core.prompts import PromptTemplate


In [37]:
template = """Question: {question}

Answer: Let's think step by step."""

prompt = PromptTemplate.from_template(template)

question = "What NFL team won the Super Bowl in the year Justin Bieber was born?"

# Pipe the prompt into the model so the template is actually applied
chain = prompt | llm_openai
chain.invoke({"question": question})


In [24]:
from langchain.agents import AgentExecutor
from langchain.agents.react.agent import create_react_agent
from langchain_core.prompts import PromptTemplate

# Define your tools and prompt
tools = []  # Define your tools

# create_react_agent expects a prompt template (not a plain string) that
# exposes the variables {tools}, {tool_names}, {input} and {agent_scratchpad}
prompt = PromptTemplate.from_template(
    "Your custom prompt goes here.\n"
    "Tools: {tools}\nTool names: {tool_names}\n"
    "Question: {input}\nThought:{agent_scratchpad}"
)

# Create the ReAct agent
agent = create_react_agent(llm_openai, tools, prompt)

# Create an executor for the agent
executor = AgentExecutor(agent=agent, tools=tools)

# Example usage
response = executor.invoke({"input": "What is the weather like today?"})
print(response["output"])


In [36]:
# langchain-openai 0.1.9 requires langchain-core>=0.2.2; with an older
# langchain-core this import fails with "cannot import name 'LangSmithParams'"
from langchain_openai import OpenAI

# Create an instance of the OpenAI class with the appropriate parameters
llm_openai = OpenAI(model_name="gpt-3.5-turbo-instruct", max_tokens=500)





In [22]:
# Define a function to interact with the model
def ask_gpt(prompt, max_tokens=500):
    # LangChain models are called through .invoke(); extra keyword arguments
    # such as max_tokens are forwarded to the model call
    response = llm_openai.invoke(prompt, max_tokens=max_tokens)
    # Completion-style LLMs return a string; chat models return a message with .content
    return getattr(response, "content", response)

# Example usage
response = ask_gpt("What is the weather like today?", max_tokens=500)
print(response)


In [None]:
# ChatDatabricks needs mlflow installed (`pip install mlflow`) and access to a
# Databricks model serving endpoint
from langchain_community.chat_models import ChatDatabricks

# play with max_tokens to define the length of the response
llm_dbrx = ChatDatabricks(endpoint="databricks-dbrx-instruct", max_tokens=500)

### Define Tools that the Agent Can Use

An agent can use various tools to complete a task. Here we define the tools that **Brixo 🤖** can use.

In [None]:
from langchain_community.tools import WikipediaQueryRun
from langchain_community.utilities import WikipediaAPIWrapper

from langchain_community.tools import YouTubeSearchTool

from langchain.agents import Tool
from langchain_experimental.utilities import PythonREPL

from langchain_community.tools import DuckDuckGoSearchRun

# Wiki tool for info retrieval
api_wrapper = WikipediaAPIWrapper(top_k_results=1, doc_content_chars_max=100)
tool_wiki = WikipediaQueryRun(api_wrapper=api_wrapper)

# tool to search youtube videos
tool_youtube = YouTubeSearchTool()

# web search tool
search = DuckDuckGoSearchRun()

# tool to write python code
python_repl = PythonREPL()
repl_tool = Tool(
    name="python_repl",
    description="A Python shell. Use this to execute python commands. Input should be a valid python command. If you want to see the output of a value, you should print it out with `print(...)`.",
    func=python_repl.run,
)

# toolset
tools = [tool_wiki, tool_youtube, search, repl_tool]
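
Each tool above carries a name, a description, and a function. The sketch below illustrates, in plain Python, why the first two matter: the LLM only ever sees the names and descriptions as text, and the executor maps the chosen name back to a function. `SimpleTool`, `render_tools`, `render_tool_names`, and `call_tool` are hypothetical helpers for illustration, not LangChain APIs, and the exact text LangChain renders may differ.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SimpleTool:
    """Hypothetical stand-in for a tool: what it is called, what it does, how to run it."""
    name: str
    description: str
    func: Callable[[str], str]


def render_tools(tools: List[SimpleTool]) -> str:
    """One 'name: description' line per tool, as text an LLM could read."""
    return "\n".join(f"{t.name}: {t.description}" for t in tools)


def render_tool_names(tools: List[SimpleTool]) -> str:
    """Comma-separated names the LLM is allowed to pick from."""
    return ", ".join(t.name for t in tools)


def call_tool(tools: List[SimpleTool], name: str, tool_input: str) -> str:
    """Look the tool up by name and run it; report unknown names as text."""
    by_name = {t.name: t for t in tools}
    if name not in by_name:
        return f"{name} is not a valid tool, try one of [{render_tool_names(tools)}]."
    return by_name[name].func(tool_input)


demo_tools = [
    SimpleTool("echo", "Repeats the input.", lambda s: s),
    SimpleTool("shout", "Upper-cases the input.", str.upper),
]
print(render_tools(demo_tools))
print(call_tool(demo_tools, "shout", "hi"))
```

A clear description is therefore part of the tool's interface: it is the only information the brain has when deciding which tool fits a step.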

### Define Planning Logic

While working on tasks, our agent will need to do some reasoning and planning. We can define the format of this plan by passing a prompt.

In [30]:
from langchain.prompts import PromptTemplate

template = '''Answer the following questions as best you can. You have access to the following tools:

{tools}

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin!

Question: {input}
Thought:{agent_scratchpad}'''

prompt = PromptTemplate.from_template(template)
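
The format in this prompt matters because the text the LLM writes has to be turned back into either a tool call or a final answer. The following is a simplified sketch of that parsing step, not LangChain's actual parser; `ParsedAction`, `ParsedFinish`, and `parse_react_output` are hypothetical names used only for illustration.

```python
import re
from typing import NamedTuple, Union


class ParsedAction(NamedTuple):
    tool: str
    tool_input: str


class ParsedFinish(NamedTuple):
    output: str


# Matches "Action: <tool>" followed on the next line by "Action Input: <input>".
ACTION_RE = re.compile(
    r"Action\s*:\s*(.*?)\s*\n\s*Action\s*Input\s*:\s*(.*)", re.DOTALL
)


def parse_react_output(text: str) -> Union[ParsedAction, ParsedFinish]:
    """Turn one LLM reply written in the ReAct format into a structured step."""
    if "Final Answer:" in text:
        return ParsedFinish(text.split("Final Answer:")[-1].strip())
    match = ACTION_RE.search(text)
    if match is None:
        raise ValueError(f"Could not parse LLM output: {text!r}")
    return ParsedAction(match.group(1).strip(), match.group(2).strip())


step = parse_react_output(
    "I should look this up.\nAction: wikipedia\nAction Input: Singin' in the Rain"
)
print(step)
print(parse_react_output("Thought: I now know the final answer\nFinal Answer: Watch it tonight."))
```

When a reply matches neither pattern, parsing fails. Setting `handle_parsing_errors=True` on the executor tells it to feed that failure back to the LLM as an observation instead of raising.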

### Create the Agent

The final step is to put all these together and build an agent.


In [33]:
from langchain.agents import AgentExecutor
from langchain.agents.react.agent import create_react_agent

agent = create_react_agent(llm_openai, tools, prompt)
brixo = AgentExecutor(agent=agent, tools=tools, verbose=True, handle_parsing_errors=True)
brixo.invoke({"input":
    """What would be a nice movie to watch in rainy weather? Follow these steps.

    First, decide which movie you would recommend.

    Second, show me the trailer video of the movie that you suggest.

    Next, collect data about the movie using the search tool and draw a bar chart using Python libraries. If you can't find the latest data, use some dummy data, as we want to show your abilities to the learners. Don't use ``` for python code. Input should be sanitized by removing any leading or trailing backticks. If the input starts with "python", remove that word as well. The output must be the result of executed code.

    Finally, tell a funny joke about agents.
    """})

In [35]:
import os

from openai import OpenAI as OpenAIClient

# openai>=1.0 replaces openai.ChatCompletion with a client object
client = OpenAIClient(api_key=os.getenv("OPENAI_API_KEY"))

# Define a function to interact with the OpenAI API
def ask_gpt(prompt, model="gpt-3.5-turbo", max_tokens=500):
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        max_tokens=max_tokens,
    )
    return response.choices[0].message.content

# Example usage
response = ask_gpt("What is the weather like today?")
print(response)

## Create an Autonomous Agent 2 (DataQio 🤖)

In this section, we will create a quite different agent: one that lets us communicate with our **pandas DataFrame** using natural language.

### Prepare Dataset

First, let's download a dataset from Hugging Face 🤗 and convert it to a pandas DataFrame.

In [None]:
from datasets import load_dataset

dataset = load_dataset("maharshipandya/spotify-tracks-dataset")
df = dataset['train'].to_pandas()

### Define the Brain and Tools

Next, we will define the model (brain) of our agent and the toolset to use.

In [None]:
from langchain_experimental.agents.agent_toolkits import create_pandas_dataframe_agent

from langchain_community.chat_models import ChatDatabricks

llm_dbrx = ChatDatabricks(endpoint="databricks-dbrx-instruct", max_tokens=500)

prefix = """Input should be sanitized by removing any leading or trailing backticks. If the input starts with "python", remove that word as well. Use the dataset provided. The output must start with a new line."""

dataqio = create_pandas_dataframe_agent(
    llm_dbrx,
    df,
    verbose=True,
    max_iterations=3,
    prefix=prefix,
    agent_executor_kwargs={
        "handle_parsing_errors": True
    }
)
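
The prefix asks the model to strip backticks and a leading "python" from the code it sends to the Python tool, because models often wrap code in Markdown fences. As a rough illustration of that rule, here is how the same cleanup could be written as a plain function. `sanitize_tool_input` is a hypothetical helper, not part of LangChain, and the agent above relies on the prompt rather than on code like this.

```python
import re


def sanitize_tool_input(raw: str) -> str:
    """Strip Markdown code-fence residue from a snippet before executing it."""
    # Remove surrounding whitespace, then any leading or trailing backticks.
    text = raw.strip().strip("`").strip()
    # Drop a leading language tag such as "python" when it is followed by
    # whitespace, so identifiers like `python_version` are left untouched.
    return re.sub(r"^python\s+", "", text, count=1, flags=re.IGNORECASE)


print(sanitize_tool_input("```python\nprint(len(df))\n```"))
print(sanitize_tool_input("`df.shape`"))
```

Doing this in code is more reliable than asking for it in a prompt, since the model may not always follow the instruction.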

### Talk with DataQio 🤖

We are ready to talk with our agent to ask questions about the data.

In [None]:
dataqio.invoke("What is the artist name of most popular song based on popularity?")

In [None]:
dataqio.invoke("What is the total number of rows?")
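
For comparison, the two questions above correspond to short pandas expressions that the agent would need to generate and run. The sketch below uses a small made-up DataFrame and assumes columns named `artists` and `popularity`; both the toy data and the column names are assumptions for illustration, so check `df.columns` on the real dataset.

```python
import pandas as pd

# Made-up rows standing in for the real dataset (assumed column names).
toy_df = pd.DataFrame(
    {
        "artists": ["Artist A", "Artist B", "Artist C"],
        "track_name": ["Song 1", "Song 2", "Song 3"],
        "popularity": [55, 91, 73],
    }
)

# "What is the artist name of the most popular song based on popularity?"
top_artist = toy_df.loc[toy_df["popularity"].idxmax(), "artists"]

# "What is the total number of rows?"
row_count = len(toy_df)

print(top_artist)
print(row_count)
```

Running such expressions by hand is a simple way to check the agent's answers.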


## Clean up Classroom

Run the following cell to remove lesson-specific assets created during this lesson.

In [None]:
DA.cleanup()


## Conclusion

In this demo, we explored agent design in Databricks, moving beyond manual system stitching to autonomous agent-based systems. Agents, equipped with a decision-making brain, a planning outline, and tools, streamline the process. We created two types of agents: one that uses a search engine, Wikipedia, and YouTube to recommend movies, and another that enables natural language data queries. Using LangChain, we learned to build semi-automated systems, choose appropriate tools, and use built-in agents for advanced workflows, including interacting with pandas DataFrames.


&copy; 2024 Databricks, Inc. All rights reserved.<br/>
Apache, Apache Spark, Spark and the Spark logo are trademarks of the
<a href="https://www.apache.org/">Apache Software Foundation</a>.<br/>
<br/><a href="https://databricks.com/privacy-policy">Privacy Policy</a> |
<a href="https://databricks.com/terms-of-use">Terms of Use</a> |
<a href="https://help.databricks.com/">Support</a>