**Install the NVIDIA LlamaIndex packages for Python**

In [64]:
!pip install --upgrade --quiet llama-index-llms-nvidia llama-index-embeddings-nvidia

**Initialize the environment with the NVIDIA API key**

In [65]:
import getpass
import os

#del os.environ['NVIDIA_API_KEY']  ## delete key and reset
if os.environ.get("NVIDIA_API_KEY", "").startswith("nvapi-"):
    print("Valid NVIDIA_API_KEY already in environment. Delete to reset")
else:
    nvapi_key = getpass.getpass("NVAPI Key (starts with nvapi-): ")
    assert nvapi_key.startswith(
        "nvapi-"
    ), f"{nvapi_key[:5]}... is not a valid key"
    os.environ["NVIDIA_API_KEY"] = nvapi_key

Valid NVIDIA_API_KEY already in environment. Delete to reset


**Trying out the NVIDIA AI Foundation Endpoint**

In [66]:
# Import the OpenAI client class from the openai package
from openai import OpenAI

# Create an instance of the OpenAI client, configuring it to interact with NVIDIA's API
client = OpenAI(
  base_url = "https://integrate.api.nvidia.com/v1",  # Set the base URL for NVIDIA's API
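  api_key = os.environ["NVIDIA_API_KEY"]  # Read from the environment; nvapi_key is undefined when the key was already set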
)

def ask_question(user_input):
  # Create a chat completion request using the specified model and parameters
  chat_response = client.chat.completions.create(
    model="meta/llama3-8b-instruct",  # Specify the model to use for generating the completion
    messages=[{"role": "user", "content": user_input}],  # Provide the prompt to the model
    temperature=0.5,  # Control randomness of the generation (0.5 is moderate)
    top_p=1,  # Nucleus sampling with top-p set to 1, meaning no filtering by cumulative probability
    max_tokens=1024,  # Limit the response to a maximum of 1024 tokens
    stream=False  # Disable streaming so the full response is returned in one piece
  )
  return chat_response.choices[0].message.content  # Extract and return the generated response text

completion = ask_question("what's the current exchange rate of USD to SGD?")

print(completion)

The current exchange rate of USD to SGD (United States Dollar to Singapore Dollar) can fluctuate constantly, so I'll provide you with the latest rate as of my knowledge cutoff. Please note that exchange rates may change rapidly and may not reflect the current rate.

As of [insert current date], the exchange rate is approximately:

1 USD = 1.37 SGD

Please note that this rate is subject to change and may not reflect the current rate. I recommend checking with a reliable currency conversion website or a financial institution for the most up-to-date exchange rate.

Here are some reliable sources to check the current exchange rate:

1. XE.com: A popular online currency conversion website that provides real-time exchange rates.
2. Google Currency Converter: A built-in currency converter in Google search that provides real-time exchange rates.
3. Singapore Exchange (SGX): The official website of the Singapore Exchange provides exchange rates for various currencies, including USD to SGD.
4. Y

**Improving the LLM**

Here we try NVIDIA's API with a Llama model to confirm that we can reach the NVIDIA AI Foundation Endpoints through the OpenAI package. This lets us use NVIDIA-hosted models now and integrate them into our chat later on.

The model is cautious: it qualifies its answer with a knowledge-cutoff caveat and points to external sources rather than asserting a live rate. The figure it does give (1 USD = 1.37 SGD) comes from its training data, and the `[insert current date]` placeholder shows that it has no access to real-time information. For the user, it is generally better to return the actual current rate right away.

This is where function calling comes in. With function calling, you can create a tool that retrieves the latest exchange rate for the agent. You will do that in the following sections.
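As a rough illustration of the idea, the sketch below shows what happens on the application side when a model emits a structured tool call instead of free text. The JSON shape, the `TOOLS` registry, and `get_fx_rate_stub` are all hypothetical names chosen for this example; the agent framework used later handles these details itself and may do so differently.

```python
import json

def get_fx_rate_stub(base_currency: str, target_currency: str) -> dict:
    """Hypothetical stand-in for a live rate lookup; returns a fixed value."""
    return {"base": base_currency, "target": target_currency, "mid": 1.34}

# Registry mapping tool names to Python callables (illustrative only).
TOOLS = {"get_fx_rate": get_fx_rate_stub}

def dispatch(tool_call_json: str) -> dict:
    """Parse a model-emitted tool call and run the matching function."""
    call = json.loads(tool_call_json)
    func = TOOLS[call["name"]]
    return func(**call["arguments"])

# What a model might emit instead of a free-text answer (assumed format).
model_output = (
    '{"name": "get_fx_rate", '
    '"arguments": {"base_currency": "USD", "target_currency": "SGD"}}'
)
result = dispatch(model_output)
print(result)
```

The function's return value is then handed back to the model, which uses it to write the final answer.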

Firstly, let's create a function that can get the current exchange rate using the API.

**Import the required modules**

In [67]:
from llama_index.llms.nvidia import NVIDIA
from llama_index.core.tools import FunctionTool
from llama_index.embeddings.nvidia import NVIDIAEmbedding

**Create a function to call Currency Exchange API in real time**

In [68]:
import requests
# define a function to get exchange rate
def get_fx_rate(base_currency: str, target_currency: str):
    """
    Fetches the current exchange rate between two currencies.

    Args:
        base_currency: The base currency (e.g., "USD").
        target_currency: The target currency (e.g., "SGD").

    Returns:
        The exchange rate information as a json response,
        or None if the rate could not be fetched.
    """

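    url = f"https://hexarate.paikama.co/api/rates/latest/{base_currency}?target={target_currency}"
    try:
        # A timeout keeps the call from hanging on a slow or unreachable network
        response = requests.get(url, timeout=10)
    except requests.RequestException:
        return None  # network error: the rate could not be fetched
    if response.status_code == 200:
        return response.json()
    return None  # non-200 status: the rate could not be fetched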


# test the function
get_fx_rate("USD", "SGD")

{'status_code': 200,
 'data': {'base': 'USD',
  'target': 'SGD',
  'mid': 1.33945,
  'unit': 1,
  'timestamp': '2024-12-01T00:00:26.330Z'}}
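To use this response for an actual conversion, the `mid` value has to be pulled out of the nested `data` field. The sketch below does that against a copy of the payload shown above. `convert_amount` is a hypothetical helper, and the reading of `unit` as "number of base-currency units the rate applies to" is an assumption rather than something the output confirms.

```python
from typing import Optional

# Sample payload copied from the output above.
sample = {
    "status_code": 200,
    "data": {
        "base": "USD",
        "target": "SGD",
        "mid": 1.33945,
        "unit": 1,
        "timestamp": "2024-12-01T00:00:26.330Z",
    },
}

def convert_amount(amount: float, payload: Optional[dict]) -> Optional[float]:
    """Convert `amount` using the payload's mid rate, or return None if unavailable."""
    if not payload or "data" not in payload:
        return None
    data = payload["data"]
    # Assumption: `unit` is the number of base-currency units the rate applies to.
    rate_per_unit = data["mid"] / data.get("unit", 1)
    return round(amount * rate_per_unit, 2)

converted = convert_amount(100, sample)
print(converted)
```

Returning `None` for a missing payload mirrors the behavior documented in the `get_fx_rate` docstring, so the two can be chained without extra checks.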

The function is working as intended. Next, we create a tool and attach the function to it so the agent can use it.

In [69]:
convert_tool = FunctionTool.from_defaults(fn=get_fx_rate)
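A tool wrapper needs a name, a description, and a parameter list to present the function to the model, which is why the type hints and docstring on `get_fx_rate` matter. The sketch below shows, using only the standard library, the kind of information that can be derived from a plain function. It is not LlamaIndex's implementation; `describe_tool` and `demo_rate` are hypothetical names used for illustration.

```python
import inspect

def describe_tool(fn) -> dict:
    """Collect the name, docstring, and typed parameters of a function."""
    sig = inspect.signature(fn)
    params = {
        name: getattr(p.annotation, "__name__", str(p.annotation))
        for name, p in sig.parameters.items()
    }
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn) or "",
        "parameters": params,
    }

def demo_rate(base_currency: str, target_currency: str) -> dict:
    """Return a placeholder exchange rate for two currency codes."""
    return {"base": base_currency, "target": target_currency, "mid": 1.0}

spec = describe_tool(demo_rate)
print(spec)
```

A clear docstring and precise type hints therefore double as the instructions the model reads when deciding whether and how to call the tool.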

Now you are ready to create an agent with a tool that can call the function to retrieve the currency exchange rate.

Before we create the agent, we also want to add some simple math functions.

*Addition function*

In [70]:
def add(a: int, b: int) -> int:
    """Add two integers and return the resulting integer"""
    return a + b
add_tool = FunctionTool.from_defaults(fn=add)

*Multiplication function*

In [71]:
def multiply(a: int, b: int) -> int:
    """Multiply two integers and return the resulting integer"""
    return a * b
multiply_tool = FunctionTool.from_defaults(fn=multiply)

Now you are truly ready to create an agent with a set of tools for meaningful currency conversion.

**Building the agent to call an external API for the current rate**

First, we define an LLM for the agent to use. Any chat model available through the endpoint can be used here; you can list the models available to you and pick one. In our example we stick to a small language model.

In [72]:
llm = NVIDIA("meta/llama-3.1-8b-instruct")

In [73]:
from llama_index.core.agent import FunctionCallingAgent

agent = FunctionCallingAgent.from_tools(
    [convert_tool, add_tool, multiply_tool],
    llm=llm,
    verbose=True,
)

In [74]:
response = agent.chat("What's the current exchange rate of USD to SGD?")
print(str(response))

> Running step 7dec0cbe-e0e1-4c4a-b701-37933e4bd363. Step input: What's the current exchange rate of USD to SGD?
Added user message to memory: What's the current exchange rate of USD to SGD?
=== Calling Function ===
Calling function: get_fx_rate with args: {"base_currency": "USD", "target_currency": "SGD"}
=== Function Output ===
{'status_code': 200, 'data': {'base': 'USD', 'target': 'SGD', 'mid': 1.33945, 'unit': 1, 'timestamp': '2024-12-01T00:00:26.330Z'}}
> Running step 37f82287-ce4e-4c12-9dec-eebebbb9ecba. Step input: None
=== LLM Response ===
The current exchange rate of USD to SGD is 1.33945.
The current exchange rate of USD to SGD is 1.33945.


Here we can see that the response differs from the one we got by calling the foundation model endpoint directly. The agent called `get_fx_rate` and answered from the function's output.

**Agent with ReAct - Reasoning and Action**

ReAct, short for "Reasoning and Acting," is a framework designed to enhance the capabilities of large language models (LLMs) by combining reasoning and action planning. This approach allows LLMs to generate reasoning traces and task-specific actions in an interleaved manner. Essentially, ReAct enables the model to think through a problem, plan actions, and interact with external sources to gather additional information.
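The interleaving of thoughts, actions, and observations can be sketched as a small loop. The example below is a toy: the "model" is a list of scripted replies rather than a real LLM, and the `Thought` / `Action` / `Action Input` / `Answer` text format, along with names such as `run_react` and `TOY_TOOLS`, are assumptions made for illustration. It is not how `ReActAgent` is implemented.

```python
import ast
import re

def toy_add(a: int, b: int) -> int:
    """Add two integers."""
    return a + b

TOY_TOOLS = {"add": toy_add}

# Scripted replies standing in for an LLM (hypothetical; no model is called).
SCRIPTED_REPLIES = [
    "Thought: I need to add the two amounts.\n"
    "Action: add\n"
    "Action Input: {'a': 100, 'b': 30}",
    "Thought: I can answer now.\nAnswer: The total is 130.",
]

ACTION_RE = re.compile(r"Action: (\w+)\nAction Input: (\{.*\})", re.DOTALL)

def run_react(replies, tools):
    """Alternate between model replies and tool calls until an answer appears."""
    observations = []
    for reply in replies:
        if "Answer:" in reply:
            return reply.split("Answer:", 1)[1].strip(), observations
        match = ACTION_RE.search(reply)
        if match is None:
            break  # malformed reply; a real agent would re-prompt the model
        tool_name, raw_args = match.groups()
        args = ast.literal_eval(raw_args)  # safely parse the dict literal
        # In a real loop this observation is sent back to the model;
        # here the next reply is already scripted.
        observations.append(tools[tool_name](**args))
    return None, observations

answer, observations = run_react(SCRIPTED_REPLIES, TOY_TOOLS)
print(observations, answer)
```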

In [75]:
from llama_index.core.agent import ReActAgent

In [76]:
agent = ReActAgent.from_tools([convert_tool, add_tool, multiply_tool], llm=llm, verbose=True)

In [77]:
response = agent.chat("I have 100 in my wallet, my wife gave me 30 just now. What is the total I can convert from SGD to USD Do it step by step")

> Running step efe875e3-b81d-45bb-9426-caf4a4c800a7. Step input: I have 100 in my wallet, my wife gave me 30 just now. What is the total I can convert from SGD to USD Do it step by step
Thought: The current language of the user is: English. I need to use a tool to help me answer the question.
Action: add
Action Input: {'a': 100, 'b': 30}
Observation: 130
> Running step 23cd4137-884c-4ddb-8809-a12a26cad21f. Step input: None
Thought: I need to get the current exchange rate between SGD and USD to convert the total amount from SGD to USD.
Action: get_fx_rate
Action Input: {'base_currency': 'SGD', 'target_currency': 'USD'}
Observation: {'status_code': 200, 'data': {'base': 'SGD', 'target': 'USD', 'mid': 0.746575, 'unit': 1, 'timestamp': '2024-12-01T00:01:26.548Z'}}
> Running step 7fa91382-5cb7-408e-a90c-8ef1955dedca. Step input: None
Thought: I have the current exchange rate, now I need to use it to convert the t
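The two observations visible in the trace (a total of 130 and a mid rate of 0.746575) are enough to check the conversion by hand. Note that the `multiply` tool defined earlier is typed for integers, while the rate is fractional. The sketch below therefore uses `Decimal` arithmetic in a hypothetical `convert` helper, which is an illustration and not one of the agent's tools.

```python
from decimal import Decimal, ROUND_HALF_UP

def convert(amount: Decimal, rate: Decimal) -> Decimal:
    """Multiply an amount by a fractional rate and round to two decimal places."""
    return (amount * rate).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

total_sgd = Decimal(100) + Decimal(30)   # first observation in the trace
sgd_to_usd = Decimal("0.746575")         # mid rate from the second observation
usd_rounded = convert(total_sgd, sgd_to_usd)
print(usd_rounded)  # 97.05
```

If the agent is expected to do this step with a tool, a multiplication tool that accepts floating-point or decimal inputs is likely a better fit than an integer-only one.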

By integrating reasoning and acting, ReAct helps LLMs to:

*   Induce, track, and update action plans: The model can create and adjust plans based on the reasoning process.

*   Handle exceptions: It can manage unexpected situations by reasoning through them.

*   Interface with external sources: The model can gather information from knowledge bases or environments to support its reasoning.

This synergy between reasoning and acting improves the model's performance on various tasks, such as question answering and decision-making, and enhances human interpretability and trustworthiness.

In our example, function calling lets us fetch information in real time and supply it to the LLM when it answers a user. We then enhance it further by introducing an agent and giving it the ability to use tools. Lastly, by giving the agent the ability to reason, we make its behavior more human-like.

**Enhancements**

1. Introduce a trip planner agent that can come up with an itinerary and a budget.
2. Pair that agent with this example to calculate a budget in the local currency.
3. Deploy in a cloud container environment.