# Introduction
Welcome to the course! Here, we’ll learn everything from simple communication with LLMs to building a complex AI Agent.

First, let's address the question: what is an AI Agent?

*Simple answer*: An AI Agent is an AI Assistant with extra tools. These tools can be anything you program them to be: querying data, routing requests, looking for keywords, searching the internet, running Python code, ordering pizza, you name it.
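
To make the "assistant with extra tools" idea concrete, here is a toy sketch. The names `TOOLS`, `run_tool`, `word_count`, and `shout` are made up for illustration and are not part of the course code. In a real agent the LLM would pick the tool name and argument; this sketch hard-codes them.

```python
from typing import Callable, Dict


def word_count(text: str) -> str:
    """Hypothetical tool: count whitespace-separated words."""
    return str(len(text.split()))


def shout(text: str) -> str:
    """Hypothetical tool: upper-case the text."""
    return text.upper()


# A tool registry: plain functions the assistant is allowed to call by name.
TOOLS: Dict[str, Callable[[str], str]] = {
    "word_count": word_count,
    "shout": shout,
}


def run_tool(name: str, argument: str) -> str:
    """Look up a tool by name and run it, or report that it is unknown."""
    tool = TOOLS.get(name)
    if tool is None:
        return f"Unknown tool: {name}"
    return tool(argument)


print(run_tool("word_count", "agents are assistants with tools"))
print(run_tool("shout", "hello"))
```

The later modules replace this hand-written dispatch with LangGraph, but the underlying shape stays the same: a model plus a set of callable functions.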

This course is divided into four modules:
 - **Introduction**
   - We'll learn about the course, the Databricks environment, and a basic way to communicate with the LLMs behind the serving endpoints.
 - **Simple Chat**
   - We'll learn to use LangGraph to create a simple chat with an LLM, which will function as a basic AI Assistant.
 - **Introducing Tools**
   - In this module we'll introduce the first tools to our agent. We'll learn how to build a more complex graph with them, and an agent network.
 - **Final Agent**
   - In the final module, we'll build a complex agent with multiple tools and functions.

# Talking to LLMs
If you're using Databricks, go to the "Serving" section in the lower-left corner to view the available models. We should have enough credits to use them comfortably for the duration of the course.
If you're using Google Colab, I'll provide my personal OpenAI API key. Please don't abuse it.

For this course we'll be using the following models:
 - Databricks serving endpoints:
   - databricks-llama-4-maverick | Great for general chat
   - databricks-meta-llama-3-1-8b-instruct | Great at following specific instructions

## 1 - Initial Configurations

### 1.1 Installs

* `python-dotenv` – loads and manages environment variables from a `.env` file (imported as `dotenv`).

* `requests` – simple HTTP client for making API calls (e.g., to Databricks endpoints).

In [34]:
%%capture
%pip install \
  python-dotenv \
  requests

### 1.2 Imports

**What we're importing and why:**

- `requests` - HTTP client used to call the serving endpoints

- `os` - Reads the environment variables that hold our Databricks configuration

- `dotenv.load_dotenv` - Reads the `.env` file and loads its values as environment variables

- `pprint.pprint` - Pretty-prints nested structures such as the JSON response

In [42]:
import requests
import os
from dotenv import load_dotenv
from pprint import pprint

### 1.3 Set Up Environment Variables

All variables should already be declared in the `.env` file.

In [40]:
# Force reload environment variables, overriding any cached values
load_dotenv('.env', override=True)

print("Databricks URL: ", os.getenv("DATABRICKS_URL"))
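# Read the token once; slicing None would raise a TypeError if the variable is missing
token = os.getenv("DATABRICKS_TOKEN")
# Show only a short prefix so the full token never lands in the notebook output
print("Databricks Token: ", f"{token[:10]} ..." if token else "NOT FOUND")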
print("Chat Endpoint: ", os.getenv("CHAT_ENDPOINT"))
print("Instruct Endpoint: ", os.getenv("INSTRUCT_ENDPOINT"))

Databricks URL:  https://adb-3104295250330834.14.azuredatabricks.net/
Databricks Token:  dapi887aeb ...
Chat Endpoint:  databricks-llama-4-maverick
Instruct Endpoint:  databricks-meta-llama-3-3-70b-instruct

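If any of these values prints as `None`, later cells fail with less obvious errors. A minimal sketch of an up-front check follows. The helper name `missing_env_vars` and the list `REQUIRED_VARS` are assumptions introduced here; the variable names themselves come from the cell above.

```python
import os

# Variable names taken from the cell above; the list itself is an assumption.
REQUIRED_VARS = [
    "DATABRICKS_URL",
    "DATABRICKS_TOKEN",
    "CHAT_ENDPOINT",
    "INSTRUCT_ENDPOINT",
]


def missing_env_vars(names, environ=None):
    """Return the names that are unset or empty in the given mapping.

    `environ` defaults to os.environ; passing a dict makes the check easy to test.
    """
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


missing = missing_env_vars(REQUIRED_VARS)
if missing:
    print("Missing variables:", ", ".join(missing))
else:
    print("All required variables are set.")
```

Run it right after `load_dotenv` so a typo in the `.env` file surfaces before the first request is sent.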

## 2 - Basic LLM Request

### 2.1 Standalone request

In [45]:
# Let's start by saving our environment variables into Python variables for ease of use.
DBKS_TOKEN=os.getenv("DATABRICKS_TOKEN")
DBKS_URL=os.getenv("DATABRICKS_URL")
CHAT_ENDPOINT=os.getenv("CHAT_ENDPOINT")

# 'messages' is the parameter passed to the endpoint—it holds the chat history.
# It's always a list of dictionaries with two keys:
#   - "role" (who sent the message): "user", "assistant", or "system"
#   - "content" (the actual message)
# The "assistant" role refers to the LLM's responses.
# We'll start with a basic interaction using a system prompt and a user message.
messages = [
  {"role":"system", "content":"You're a helpful AI Assistant."},
  {"role":"user", "content":"Can you tell me a funny joke?"}
]

# Headers contain authorization info and content type required by the endpoint
headers = {
    "Authorization": f"Bearer {DBKS_TOKEN}",
    "Content-Type":  "application/json"
}

# The body includes the message history, the temperature (controls creativity), and max tokens for the response
body = {
    "messages":   messages,
    "temperature": 0.7,
    "max_tokens":  1000
}

# Now we can make the request
response = requests.post(
    f"{DBKS_URL}/serving-endpoints/{CHAT_ENDPOINT}/invocations",
    headers=headers,
    json=body
  )

# `response` is a requests.Response object; printing it shows only the HTTP status code
print("----- RAW Response Object -----")
print(response)

# Convert the response to JSON to access the content.
# The response is stored under ["choices"][0]["message"]["content"].
# By default, only one response is returned (this can be changed, but we won’t worry about it now).
print("\n\n----- JSON Response -----")
pprint(response.json())

# Let's extract just what we need, and print it
print("\n\n----- Relevant Content -----")
print(response.json()["choices"][0]["message"]["content"])

# Optionally, append the assistant's reply to the chat history ("content" must be the text, not the whole message dict)
messages.append({"role":"assistant","content":response.json()["choices"][0]["message"]["content"]})

# Remove the credentials from the notebook namespace so later cells can't print them by accident
del DBKS_TOKEN
del DBKS_URL

----- RAW Response Object -----
<Response [200]>


----- JSON Response -----
{'choices': [{'finish_reason': 'stop',
              'index': 0,
              'logprobs': None,
              'message': {'content': "Why couldn't the bicycle stand up by "
                                     'itself? Because it was two-tired!',
                          'role': 'assistant'}}],
 'created': 1760966682,
 'id': 'chatcmpl_0223781a-fafe-41cd-8144-906f4ddb9e4f',
 'model': 'meta-llama-4-maverick-040225',
 'object': 'chat.completion',
 'usage': {'completion_tokens': 17, 'prompt_tokens': 29, 'total_tokens': 46}}


----- Relevant Content -----
Why couldn't the bicycle stand up by itself? Because it was two-tired!

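The response shape and the `messages` format can be exercised without calling the endpoint. The sketch below uses a trimmed copy of the JSON printed above. The helpers `extract_reply` and `append_turn` are hypothetical names introduced for illustration, and the follow-up question is invented.

```python
def extract_reply(response_json):
    """Return the assistant text from an OpenAI-style chat completion payload."""
    choices = response_json.get("choices") or []
    if not choices:
        raise ValueError("Response contains no choices")
    return choices[0]["message"]["content"]


def append_turn(messages, role, content):
    """Return a new history with one more message; the input list is not modified."""
    if role not in {"system", "user", "assistant"}:
        raise ValueError(f"Unexpected role: {role}")
    return messages + [{"role": role, "content": content}]


# Trimmed copy of the JSON response printed above
sample = {
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "Why couldn't the bicycle stand up by itself? "
                           "Because it was two-tired!",
            },
        }
    ]
}

history = [
    {"role": "system", "content": "You're a helpful AI Assistant."},
    {"role": "user", "content": "Can you tell me a funny joke?"},
]

# Store the reply as plain text, then add a follow-up question from the user
history = append_turn(history, "assistant", extract_reply(sample))
history = append_turn(history, "user", "Can you explain why that's funny?")

for m in history:
    print(f"{m['role']}: {m['content']}")
```

Sending `history` back as `messages` in a second request would give the model the full conversation, which is how a multi-turn chat is built on top of a stateless endpoint.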

### 2.2 Reusable Function

In [38]:
# The function below wraps the request logic above and returns the reply as a string.
# We'll reuse it in the other notebooks.
def databricks_llm(chat_history, model_endpoint, verbose=False):
    """Call a Databricks serving endpoint that follows the OpenAI chat format."""

    DBKS_TOKEN=os.getenv("DATABRICKS_TOKEN")
    DBKS_URL=os.getenv("DATABRICKS_URL")

    if verbose:
        print("\n=== LLM CALL →", model_endpoint, "===")
        for m in chat_history:
            print(f"{m['role'].upper()}: {m['content']}")

    headers = {
        "Authorization": f"Bearer {DBKS_TOKEN}",
        "Content-Type":  "application/json"
    }
    body = {
        "messages":   chat_history,
        "temperature": 0.7,
        "max_tokens":  1000
    }

    resp = requests.post(f"{DBKS_URL}/serving-endpoints/{model_endpoint}/invocations", headers=headers, json=body)
    resp.raise_for_status()
    content = resp.json()["choices"][0]["message"]["content"]

    if verbose:
        print("LLM RESPONSE:", content[:300] + ("…" if len(content) > 300 else ""))
        print("=== LLM CALL END ===")

    del DBKS_TOKEN
    del DBKS_URL

    return content

# Testing the function
messages_test = [
    {"role":"system", "content":"You are a helpful assistant."},
    {"role":"user", "content":"What's the capital of France?"}
    ]

print(databricks_llm(messages_test, model_endpoint=CHAT_ENDPOINT, verbose=True))



=== LLM CALL → databricks-llama-4-maverick ===
SYSTEM: You are a helpful assistant.
USER: What's the capital of France?
LLM RESPONSE: The capital of France is Paris.
=== LLM CALL END ===
The capital of France is Paris.

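Because `databricks_llm` takes the whole chat history and returns a string, it can sit behind a small multi-turn helper. The sketch below is one possible shape, not the course's implementation: `chat_turn` and `fake_llm` are hypothetical names, and `fake_llm` is an offline stand-in so the example runs without credentials.

```python
def chat_turn(history, user_text, llm):
    """Send one user message through `llm` and return (reply, updated_history).

    `llm` is any callable that accepts a list of message dicts and returns a string.
    """
    pending = history + [{"role": "user", "content": user_text}]
    reply = llm(pending)
    return reply, pending + [{"role": "assistant", "content": reply}]


def fake_llm(chat_history):
    """Offline stand-in for the real endpoint: reports how many messages it saw."""
    return f"(stub) I received {len(chat_history)} messages."


history = [{"role": "system", "content": "You are a helpful assistant."}]

reply, history = chat_turn(history, "What's the capital of France?", fake_llm)
print(reply)

reply, history = chat_turn(history, "And of Italy?", fake_llm)
print(reply)
```

To talk to the real model, pass `lambda h: databricks_llm(h, model_endpoint=CHAT_ENDPOINT)` in place of `fake_llm`.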