# LLM Agents, NIMs, and Guardrails: A Comprehensive Guide

## Table of Contents
1. [Introduction](#introduction)
2. [LLM Agents with LangChain](#llm-agents-with-langchain)
3. [NVIDIA Inference Models (NIMs) for Inference](#nvidia-inference-models-nims-for-inference)
4. [NeMo Guardrails for Controlled LLM Output](#nemo-guardrails-for-controlled-llm-output)
5. [Conclusion](#conclusion)

## Introduction

This notebook explores three important concepts in the world of Large Language Models (LLMs):

1. LLM Agents using LangChain
2. NVIDIA NIM (NVIDIA Inference Microservices) for inference
3. NeMo Guardrails for controlling LLM output

We'll dive into each of these topics, providing code examples and explanations to help you understand and implement these powerful tools.

## LLM Agents with LangChain

LangChain is a framework for building applications on top of language models. Its agent abstraction lets a model choose which tool to call, read the result, and decide the next step, all driven by a natural-language request.
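To make the agent idea concrete, here is a minimal sketch of the decide, act, observe loop, written in plain Python with no LangChain dependency. The names `TOOLS`, `scripted_policy`, and `run_agent` are hypothetical, and the scripted policy only stands in for the LLM; this is an illustration of the pattern, not how LangChain implements it internally.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical tools: plain functions from a text input to a text observation.
TOOLS: Dict[str, Callable[[str], str]] = {
    "word_count": lambda text: str(len(text.split())),
    "upper": lambda text: text.upper(),
}

Step = Tuple[str, str, str]  # (action, action_input, observation)


def scripted_policy(question: str, history: List[Step]) -> Tuple[str, str]:
    """Stand-in for the LLM: returns (action, action_input).

    A real agent would prompt a model with the question, the tool
    descriptions, and the history so far, then parse the model's reply.
    """
    if not history:
        # First step: decide to call a tool on the question.
        return "word_count", question
    # After one observation, finish with an answer built from it.
    last_observation = history[-1][2]
    return "final_answer", f"The question has {last_observation} words."


def run_agent(question: str, policy=scripted_policy, max_steps: int = 5) -> str:
    """Loop: ask the policy for an action, run the tool, record the observation."""
    history: List[Step] = []
    for _ in range(max_steps):
        action, action_input = policy(question, history)
        if action == "final_answer":
            return action_input
        tool = TOOLS.get(action)
        observation = tool(action_input) if tool else f"Unknown tool: {action}"
        history.append((action, action_input, observation))
    return "Stopped: step limit reached."


print(run_agent("How many words are here"))
```

The step limit matters in practice: a model that never emits a final answer would otherwise loop forever.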

### Setup

First, let's install the necessary packages:

In [None]:
!pip install -q langchain openai langchain-community langchain-openai langchain-experimental

Next, register a small CSS rule so that long lines in cell output wrap instead of scrolling horizontally:

In [None]:
from IPython.display import HTML, display

def set_css(*_):  # accept the info argument that IPython may pass to pre_run_cell callbacks
    display(HTML('''
    <style>
        pre {
            white-space: pre-wrap;
        }
    </style>
    '''))
get_ipython().events.register('pre_run_cell', set_css)

### Creating a NASA Agent

We'll create an agent that uses LangChain's NASA toolkit to search NASA's media library and report what it finds:

In [None]:
from langchain.agents import AgentType, initialize_agent
from langchain_community.agent_toolkits.nasa.toolkit import NasaToolkit
from langchain_community.utilities.nasa import NasaAPIWrapper
from langchain_openai import ChatOpenAI
from google.colab import userdata

# Initialize the OpenAI model
llm = ChatOpenAI(temperature=0, openai_api_key=userdata.get('OPENAI_API_KEY'), model_name="gpt-4-turbo")

# Set up the NASA toolkit
nasa = NasaAPIWrapper()
toolkit = NasaToolkit.from_nasa_api_wrapper(nasa)

# Initialize the agent
agent = initialize_agent(
    toolkit.get_tools(), llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True, handle_parsing_errors=True
)

# Use the NASA agent to find moon pictures
response = agent.run("""
    Can you find three pictures of the moon published between the years 2014 and 2020?
    Give me the links to the three images.
    """)

print(response)

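The same pattern works with other tools. Here we give an agent a Python REPL so it can write and run code. Note that `PythonREPL` executes whatever code the model produces, so use it only in a disposable environment such as a throwaway notebook:

In [None]: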
from langchain.agents import Tool
from langchain_experimental.utilities import PythonREPL
python_repl = PythonREPL()

# You can create the tool to pass to an agent
repl_tool = Tool(
    name="python_repl",
    description="A Python shell. Use this to execute python commands. Input should be a valid python command. If you want to see the output of a value, you should print it out with `print(...)`.",
    func=python_repl.run,
)

agent = initialize_agent(
    [repl_tool], llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True, handle_parsing_errors=True
)


In [None]:
agent("""
  Solve the 2D Poisson equation on a 5x5 grid with an interesting boundary condition, and plot the results with seaborn to show the waves
""")

These two agents show how LangChain can put both external APIs (the NASA toolkit) and local tools (the Python REPL) behind a natural-language interface.
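A tool is ultimately just a function that takes a string and returns a string. As a sketch of a narrower alternative to a full Python REPL, the hypothetical `calculator` function below evaluates only basic arithmetic by walking the parsed expression tree. It is an illustrative assumption, not part of LangChain, though it could be wrapped with `Tool(name=..., func=calculator, description=...)` in the same way as `repl_tool` above.

```python
import ast
import operator

# Only these operators are permitted; anything else is rejected.
_ALLOWED_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_ALLOWED_UNARY = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def _evaluate(node):
    """Recursively evaluate a parsed arithmetic expression."""
    if (
        isinstance(node, ast.Constant)
        and isinstance(node.value, (int, float))
        and not isinstance(node.value, bool)
    ):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _ALLOWED_BINARY:
        return _ALLOWED_BINARY[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _ALLOWED_UNARY:
        return _ALLOWED_UNARY[type(node.op)](_evaluate(node.operand))
    raise ValueError("unsupported expression")


def calculator(expression: str) -> str:
    """Tool function: evaluate arithmetic and return the result (or an error) as text."""
    try:
        tree = ast.parse(expression.strip(), mode="eval")
        return str(_evaluate(tree.body))
    except (SyntaxError, ValueError, ZeroDivisionError) as exc:
        # Return the error as text so the agent can read it and retry.
        return f"Error: {exc}"


print(calculator("2 + 3 * 4"))
print(calculator("__import__('os')"))
```

Returning errors as text rather than raising keeps the agent loop alive, which complements `handle_parsing_errors=True` in the agent setup.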

## NVIDIA Inference Models (NIMs) for Inference

NIM stands for NVIDIA Inference Microservices: models packaged as inference services that can be called through an API. Let's explore how to use one with the Meta Llama 3.1 405B model.

First, install the OpenAI package:

In [None]:
!pip install -q openai

Now, let's use the Meta Llama 3.1 405B model through NVIDIA's API:

In [None]:
from openai import OpenAI
from google.colab import userdata
import os

# Initialize the OpenAI client with NVIDIA's API
client = OpenAI(
  base_url = "https://integrate.api.nvidia.com/v1",
  api_key = userdata.get("NVIDIA_API")
)

# Generate a limerick about GPU computing
completion = client.chat.completions.create(
  model="meta/llama-3.1-405b-instruct",
  messages=[{"role":"user","content":"Write a limerick about the wonders of GPU computing."}],
  temperature=0.2,
  top_p=0.7,
  max_tokens=1024,
  stream=True
)

# Print the generated limerick as it streams in
for chunk in completion:
  # Skip chunks that carry no choices or no text
  if chunk.choices and chunk.choices[0].delta.content is not None:
    print(chunk.choices[0].delta.content, end="")

This example shows that the standard `openai` client can talk to NVIDIA's API once `base_url` and `api_key` point at it, so existing chat-completion code needs few changes.
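When the full text is needed rather than a live printout, the streamed chunks can be joined into one string. The sketch below assumes chunks shaped like the ones in the loop above (`chunk.choices[0].delta.content`); `collect_stream` and `fake_chunk` are hypothetical helpers, and the fake chunks exist only so the example runs without an API key.

```python
from types import SimpleNamespace
from typing import Iterable


def collect_stream(chunks: Iterable) -> str:
    """Join the text pieces of a streamed chat completion into one string."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:
            # Some chunks may carry no choices at all.
            continue
        text = chunk.choices[0].delta.content
        if text:
            parts.append(text)
    return "".join(parts)


def fake_chunk(text):
    """Build an object with the same attribute layout as a streamed chunk."""
    return SimpleNamespace(
        choices=[SimpleNamespace(delta=SimpleNamespace(content=text))]
    )


# Offline demo: includes an empty-text chunk and a chunk with no choices.
demo_chunks = [
    fake_chunk("GPUs "),
    fake_chunk(None),
    SimpleNamespace(choices=[]),
    fake_chunk("are fast."),
]
print(collect_stream(demo_chunks))
```

With a real response, `collect_stream(completion)` would replace the printing loop; a stream can only be consumed once, so choose one or the other.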

## NeMo Guardrails for Controlled LLM Output

NeMo Guardrails is a toolkit for controlling the output of language models. You describe example user messages, bot replies, and the flows that connect them, and these rules shape how the model responds to matching queries.
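As a rough mental model of rules and flows, the toy sketch below maps a user message to the closest example utterance and returns the scripted reply, or hands the message on when nothing matches. It uses `difflib` string similarity purely as a stand-in; NeMo Guardrails itself does the matching differently, and `USER_INTENTS`, `FLOWS`, `match_intent`, and `respond` are hypothetical names used only for illustration.

```python
from difflib import SequenceMatcher

# Intent name -> example utterances (in the spirit of "define user ..." blocks).
USER_INTENTS = {
    "express greeting": ["hello", "hi", "what's up?"],
    "ask politics": ["what are your political beliefs?", "thoughts on the president?"],
}

# Intent name -> scripted bot reply (in the spirit of "define bot ..." blocks).
FLOWS = {
    "express greeting": "Hello! How are you?",
    "ask politics": "I'm an AI assistant, I don't discuss political beliefs.",
}


def match_intent(utterance: str, threshold: float = 0.7):
    """Return the best-matching intent name, or None if nothing is close enough."""
    text = utterance.lower().strip()
    best_intent, best_score = None, 0.0
    for intent, examples in USER_INTENTS.items():
        for example in examples:
            score = SequenceMatcher(None, text, example).ratio()
            if score > best_score:
                best_intent, best_score = intent, score
    return best_intent if best_score >= threshold else None


def respond(utterance: str, fallback=lambda text: "[forwarded to LLM]") -> str:
    """Follow the scripted flow when an intent matches; otherwise defer to the fallback."""
    intent = match_intent(utterance)
    if intent is None:
        return fallback(utterance)
    return FLOWS[intent]


print(respond("Hello!"))
print(respond("Thoughts on the president"))
print(respond("What is the capital of France?"))
```

The threshold is the design trade-off: set too low, unrelated questions trigger a rail; set too high, paraphrases slip past it.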

First, let's install the necessary packages:

In [None]:
!pip install -q nemoguardrails openai nest_asyncio langchain-nvidia-ai-endpoints

Now, let's set up our guardrails:

In [None]:
# Now, let's import the required modules
import os
import asyncio
import nest_asyncio
from nemoguardrails import RailsConfig, LLMRails
from nemoguardrails.streaming import StreamingHandler
from nemoguardrails.context import streaming_handler_var
from google.colab import userdata


In [None]:

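# Model settings for NeMo Guardrails. The API key is interpolated from Colab
# secrets, so avoid printing or sharing this string.
YAML_CONFIG = f"""
models:
  - type: main
    engine: nim
    model: meta/llama-3.1-405b-instruct
    parameters:
      base_url: https://integrate.api.nvidia.com/v1
      api_key: {userdata.get('NVIDIA_API')}
"""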
import os
import asyncio
from nemoguardrails import RailsConfig, LLMRails

# Try to import and apply nest_asyncio, but don't fail if it's not available
try:
    import nest_asyncio
    nest_asyncio.apply()
except ImportError:
    pass

# # YAML configuration for the model
# YAML_CONFIG = """
# models:
#   - type: main
#     engine: openai
#     model: gpt-3.5-turbo
# """

# Colang configuration for defining flows and rules
COLANG_CONFIG = """
# define niceties
define user express greeting
    "hello"
    "hi"
    "what's up?"

define flow greeting
    user express greeting
    bot express greeting
    bot ask how are you

# define limits
define user ask politics
    "what are your political beliefs?"
    "thoughts on the president?"
    "left wing"
    "right wing"

define bot answer politics
    "I'm an AI assistant, I don't discuss political beliefs."

define flow politics
    user ask politics
    bot answer politics
    bot offer help
"""

# Combine YAML and Colang configs
config = RailsConfig.from_content(yaml_content=YAML_CONFIG, colang_content=COLANG_CONFIG)

# Initialize the LLMRails object
rails = LLMRails(config)

# Asynchronous function to chat with the bot
async def chat_with_bot_async(user_input):
    response = await rails.generate_async(messages=[{"role": "user", "content": user_input}])
    return response.get("content", "No response generated.")

# Synchronous function to chat with the bot
def chat_with_bot_sync(user_input):
    response = rails.generate(messages=[{"role": "user", "content": user_input}])
    return response.get("content", "No response generated.")

# Function to chat with the bot from a notebook or a plain script
def chat_with_bot(user_input):
    try:
        loop = asyncio.get_event_loop()
        # nest_asyncio (applied above) lets run_until_complete work even when
        # the notebook's event loop is already running, so we always get the
        # reply text back rather than an un-awaited Task
        return loop.run_until_complete(chat_with_bot_async(user_input))
    except RuntimeError:
        # No usable event loop: fall back to the synchronous API
        return chat_with_bot_sync(user_input)

# Test the bot
def run_tests():
    print("Testing general knowledge question:")
    response = chat_with_bot("What is the capital of France?")
    print(f"Bot response: {response}")

    print("\nTesting political question:")
    response = chat_with_bot("What are your thoughts on the current president?")
    print(f"Bot response: {response}")

    print("\nTesting greeting:")
    response = chat_with_bot("Hi there!")
    print(f"Bot response: {response}")

if __name__ == "__main__":
    run_tests()

# Second variant: the same rails, but the bot declines politics as a shopping assistant
COLANG_CONFIG = """
# define niceties
define user express greeting
    "hello"
    "hi"
    "what's up?"

define flow greeting
    user express greeting
    bot express greeting
    bot ask how are you

# define limits
define user ask politics
    "what are your political beliefs?"
    "thoughts on the president?"
    "left wing"
    "right wing"

define bot answer politics
    "I'm a shopping assistant, I don't like to talk of politics."

define flow politics
    user ask politics
    bot answer politics
    bot offer help
"""

# Combine YAML and Colang configs
config = RailsConfig.from_content(yaml_content=YAML_CONFIG, colang_content=COLANG_CONFIG)
rails = LLMRails(config)

# Function to chat with the bot
def chat_with_bot(user_input):
    response = rails.generate(messages=[{"role": "user", "content": user_input}])
    return response.get("content", "No response generated.")

# Test the bot
print("Testing general knowledge question:")
response = chat_with_bot("What is the capital of France?")
print(f"Bot response: {response}")

print("\nTesting another question:")
response = chat_with_bot("Can you explain photosynthesis?")
print(f"Bot response: {response}")

In [None]:
# Define our guardrails in Colang; the next cell combines them with the model settings
COLANG_CONFIG = """
define user ask about weather
    "What's the weather like?"
    "How's the weather today?"
    "Is it going to rain?"

define bot provide weather info
    "I'm sorry, but I don't have access to real-time weather information. You might want to check a weather app or website for the most up-to-date forecast."

define flow
    user ask about weather
    bot provide weather info

define user ask about politics
    "What do you think about the current political situation?"
    "Who should I vote for?"
    "What's your opinion on [political topic]?"

define bot refuse politics
    "I'm an AI assistant designed to help with general information and tasks. I don't discuss political opinions or provide voting advice. For factual information on political processes or current events, please consult reliable news sources or government websites."

define flow
    user ask about politics
    bot refuse politics

define user ask general question
    "What is the capital of France?"
    "How do I bake a cake?"
    "Can you explain photosynthesis?"

# No canned text is defined for "bot provide general info", so the LLM
# writes the answer instead of echoing a placeholder string
define flow
    user ask general question
    bot provide general info
"""

In [None]:
# Combine YAML and Colang configs
config = RailsConfig.from_content(yaml_content=YAML_CONFIG, colang_content=COLANG_CONFIG)
rails = LLMRails(config)

# Function to chat with the bot
def chat_with_bot(user_input):
    response = rails.generate(messages=[{"role": "user", "content": user_input}])
    return response.get("content", "No response generated.")

In [None]:
print("Testing weather question:")
print(chat_with_bot("What's the weather like today?"))

In [None]:
print("\nTesting political question:")
print(chat_with_bot("Who should I vote for in the next election?"))

In [None]:
print("\nTesting general knowledge question:")
print(chat_with_bot("What is the capital of France?"))

This example shows how NeMo Guardrails steers an LLM's output: messages that match a defined flow follow the scripted path, so sensitive topics such as politics get a fixed refusal instead of a free-form answer.
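Printing replies works for a quick look, but rails are easier to trust when the checks are automated. The sketch below is one possible approach: `check_rails` is a hypothetical helper that takes any chat function (such as `chat_with_bot` above) and a list of cases, and `stub_chat` is a stand-in so the example runs without a model.

```python
from typing import Callable, List, Optional, Tuple

# (prompt, text the reply must contain, text the reply must not contain)
Case = Tuple[str, Optional[str], Optional[str]]


def check_rails(chat_fn: Callable[[str], str], cases: List[Case]) -> List[str]:
    """Run each prompt through chat_fn and return a list of failure descriptions."""
    failures = []
    for prompt, required, forbidden in cases:
        reply = chat_fn(prompt).lower()
        if required is not None and required.lower() not in reply:
            failures.append(f"{prompt!r}: expected {required!r} in the reply")
        if forbidden is not None and forbidden.lower() in reply:
            failures.append(f"{prompt!r}: reply should not mention {forbidden!r}")
    return failures


def stub_chat(prompt: str) -> str:
    """Stand-in for a rails-backed chat function, used only for this demo."""
    if "vote" in prompt.lower():
        return "I don't discuss political opinions or provide voting advice."
    return "Paris is the capital of France."


CASES: List[Case] = [
    ("Who should I vote for?", "don't discuss political", None),
    ("What is the capital of France?", "paris", "political"),
]

# An empty list means every case passed.
print(check_rails(stub_chat, CASES))
```

Substring checks are deliberately loose because model wording varies between runs; they verify that a rail fired, not the exact phrasing.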

## Conclusion

In this notebook, we've explored three powerful tools in the world of Large Language Models:

1. LangChain for creating LLM agents
2. NVIDIA NIM (NVIDIA Inference Microservices) for accessing hosted language models
3. NeMo Guardrails for controlling LLM output

Each tool covers a different layer: LangChain orchestrates tool use, NIM serves the model, and NeMo Guardrails constrains what the model says. Combined, they let you build applications that handle a wide range of tasks while adhering to specific rules and guidelines.

As you continue to explore these technologies, remember to consider the ethical implications of AI systems and always strive to create responsible and beneficial applications.