# Agent

Welcome to the fourth part of the tutorial series on building a chatbot over a corpus of private
documents using Large Language Models (LLMs).

In this notebook, you set up an intelligent agent that triages each user query to the right model
and returns the answer. If the query is about a document, the agent uses the document retrieval
model to find the most relevant document. If the query should be answered from an SQL database, the
agent uses the SQL model to generate the appropriate SQL statement and return the answer.

## Table of Contents

1. [Creating the Agent Deployment](#creating-the-agent-deployment)
1. [Conclusion and Next Steps](#conclusion-and-next-steps)

In [None]:
import subprocess

# Creating the Agent Deployment

Apply the deployment spec below to create the agent deployment. Before running the cell, set the
Python variables that populate the following environment variables in the container:

- `SQL_URL`: The URL of the SQLCoder model.
- `LLM_URL`: The URL of the Large Language Model.
- `EMBEDDINGS_URL`: The URL of the Document Embeddings model.

You also need to provide the agent's container image. You can build it using the Dockerfile in the
`dockerfiles/agent` directory.
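
The next cell ships with `"..."` placeholders for these values, and plain string formatting will happily write an unfilled placeholder into the manifest. A minimal sketch of a pre-flight check, assuming the four settings are collected in a dictionary; `find_unset` is a hypothetical helper that is not part of the tutorial code, and the example URLs are made up:

```python
def find_unset(settings):
    """Return the names of settings that are empty or still set to "..."."""
    return [
        name
        for name, value in settings.items()
        if not value or value.strip() in ("", "...")
    ]


# Hypothetical example values; replace them with your own endpoints and image.
example = {
    "SQL_URL": "http://sqlcoder.example.svc:8080",
    "LLM_URL": "...",
    "EMBEDDINGS_URL": "http://embeddings.example.svc:8080",
    "image": "",
}

print(find_unset(example))  # ['LLM_URL', 'image']
```

An empty result means every value has been filled in and the manifest can be rendered.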


In [None]:
image = "..."  # agent container image (name and tag)
sql_url = "..."  # endpoint URL of the SQLCoder model
llm_url = "..."  # endpoint URL of the Large Language Model
embeddings_url = "..."  # endpoint URL of the Document Embeddings model

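# Deployment and Service manifests for the agent. The {0}..{3} placeholders are
# filled in by the .format() call at the end of the string.
agent_deploy = """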
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    io.kompose.service: router
  name: router
spec:
  replicas: 1
  selector:
    matchLabels:
      io.kompose.service: router
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      labels:
        io.kompose.network/nvidia-llm: "true"
        io.kompose.service: router
    spec:
      containers:
      - env:
        - name: LANGSERVE_HOST
          value: 0.0.0.0
        - name: LANGSERVE_PORT
          value: "9000"
        - name: SQL_URL
          value: "{0}"
        - name: LLM_URL
          value: "{1}"
        - name: EMBEDDINGS_URL
          value: "{2}"
        image: {3}
        imagePullPolicy: IfNotPresent
        name: chain-server
        ports:
        - containerPort: 9000
          hostPort: 9000
          name: 9000-tcp
          protocol: TCP
      restartPolicy: Always
---
apiVersion: v1
kind: Service
metadata:
  labels:
    io.kompose.service: router
  name: router
spec:
  ports:
  - name: 9000-tcp
    port: 9000
    protocol: TCP
    targetPort: 9000
  selector:
    io.kompose.service: router
  type: ClusterIP
""".format(sql_url, llm_url, embeddings_url, image)

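# Write the rendered manifests to disk so that kubectl can apply them.
with open("agent-deploy.yaml", "w") as f:
    f.write(agent_deploy)

# check=True raises CalledProcessError if kubectl exits with a non-zero status.
subprocess.run(["kubectl", "apply", "-f", "agent-deploy.yaml"], check=True)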
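
`kubectl apply` returns once the API server has accepted the manifests, which is before the pod is necessarily ready. A minimal sketch of waiting for the rollout, assuming the Deployment keeps the name `router` from the spec above and lives in the current namespace; `rollout_status_cmd` and `wait_for_rollout` are hypothetical helpers, and the 120-second timeout is an arbitrary choice:

```python
import shutil
import subprocess


def rollout_status_cmd(deployment="router", timeout_seconds=120):
    """Build the kubectl command that blocks until the rollout finishes."""
    return [
        "kubectl", "rollout", "status",
        f"deployment/{deployment}",
        f"--timeout={timeout_seconds}s",
    ]


def wait_for_rollout(deployment="router", timeout_seconds=120):
    """Return True if the rollout completed, False if it failed or kubectl is missing."""
    if shutil.which("kubectl") is None:
        return False
    result = subprocess.run(rollout_status_cmd(deployment, timeout_seconds))
    return result.returncode == 0


# Show the command that wait_for_rollout() would execute.
print(" ".join(rollout_status_cmd()))
```

Calling `wait_for_rollout()` after the apply step gives a simple yes/no signal before you start sending queries to the agent.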

# Conclusion and Next Steps

Congratulations on completing this step of the tutorial series! You've deployed an intelligent
agent that triages each user query to the right model and returns the answer.