# Create a simple chatbot using LangChain and NVIDIA AI Foundation Endpoints or NVIDIA NIM for LLMs

If you need help generating an `NVIDIA_API_KEY`, see the [LangChain NVIDIA AI Endpoints integration guide](https://python.langchain.com/docs/integrations/text_embedding/nvidia_ai_endpoints).

In [1]:
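from dotenv import load_dotenv

# Load environment variables (such as NVIDIA_API_KEY) from ../.env.
# load_dotenv returns True if it set at least one variable.
load_dotenv('../.env')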

True
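
Before creating the model, it can help to confirm that the key was actually loaded. The check below is a minimal sketch: `api_key_status` is a hypothetical helper (not part of LangChain or any NVIDIA package), and the `nvapi-` prefix test is an assumption about the key format that you may need to adjust.

```python
import os


def api_key_status(env=None) -> str:
    """Classify the NVIDIA_API_KEY found in `env` (defaults to os.environ).

    Hypothetical helper for illustration only.
    """
    env = os.environ if env is None else env
    key = env.get("NVIDIA_API_KEY", "").strip()
    if not key:
        return "missing"
    # Assumption: keys for the hosted endpoints start with "nvapi-".
    if not key.startswith("nvapi-"):
        return "unexpected format"
    return "ok"


print(api_key_status())
```

If this prints anything other than `ok`, check the path passed to `load_dotenv` and the contents of the `.env` file before moving on.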

In [2]:
# Initialize the LLM

from langchain_nvidia_ai_endpoints import ChatNVIDIA

# NVIDIA AI Foundation Endpoints
llm = ChatNVIDIA(model="mistralai/mixtral-8x7b-instruct-v0.1")

# NVIDIA NIM for LLMs (self-hosted): uncomment to use a locally served NIM instead
# llm = ChatNVIDIA(base_url="http://localhost:8000/v1", model="meta/llama3-8b-instruct")

In [3]:
# Use the ChatNVIDIA model (`llm`, created above) in LangChain Expression Language (LCEL)

from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

prompt = ChatPromptTemplate.from_messages([
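    ("system", (
        # Adjacent string literals are joined with no separator,
        # so each sentence ends with an explicit space.
        "You are a helpful and friendly AI! "
        "Your responses should be concise and no longer than two sentences. "
        "Say you don't know if you don't have this information."
    )),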
    ("user", "{question}")
])

chain = prompt | llm | StrOutputParser()
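
To make the `|` composition less opaque, here is a toy sketch of how a pipe operator can chain stages so that each one receives the previous one's output. It is not LangChain's implementation: `Step` and the three `toy_*` stages are hypothetical stand-ins, and the "model" simply echoes its input without any network call.

```python
class Step:
    """Toy stand-in for a runnable: wraps a function and supports `|`."""

    def __init__(self, fn):
        self.fn = fn

    def invoke(self, value):
        return self.fn(value)

    def __or__(self, other):
        # `a | b` builds a new step that feeds a's output into b.
        return Step(lambda value: other.invoke(self.invoke(value)))


# Hypothetical stages mirroring prompt | llm | parser.
toy_prompt = Step(lambda inputs: f"user: {inputs['question']}")
toy_llm = Step(lambda text: {"content": f"echo[{text}]"})  # fake model, no network call
toy_parser = Step(lambda message: message["content"])

toy_chain = toy_prompt | toy_llm | toy_parser
print(toy_chain.invoke({"question": "What is a GPU?"}))
```

The real chain follows the same shape: the prompt turns the input dictionary into messages, the model turns the messages into a response, and the parser extracts the response text as a string.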

In [4]:
print(chain.invoke({"question": "What's the difference between a GPU and a CPU?"}))

A GPU, or Graphics Processing Unit, is a specialized type of processor designed to quickly render graphics, while a CPU, or Central Processing Unit, handles general computational tasks. They have different architectures and strengths, with GPUs excelling at parallel processing and CPUs at single-thread performance.


In [5]:
print(chain.invoke({"question": "What does the A in the NVIDIA A100 stand for?"}))

I'm sorry, I don't have that specific information. The naming conventions of companies like NVIDIA often have internal logic, but they don't always make public the exact meaning of each letter or number in their product codes.


In [6]:
print(chain.invoke({"question": "How much memory does the NVIDIA H200 have?"}))

I'm sorry, I don't have the specific information about the NVIDIA H200 graphics card's memory size. It would be best to check the official NVIDIA website or product specifications for the most accurate information.


- The LLM can't answer the last question because it is limited to the information it was trained on. Such models only have knowledge up to a training cutoff date and have no access to proprietary or more recent data, so they struggle with questions that depend on either.
  
- The following notebooks demonstrate how to use the NeMo Retriever Embedding Microservice, the NeMo Retriever Reranking Microservice, and NeMo Retriever to build a retrieval-augmented generation (RAG) workflow that supplies new knowledge to the LLM.
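
The core idea of supplying new knowledge can be sketched without any retriever: place the retrieved text in the prompt so the model can answer from it. The snippet below is only an illustration under stated assumptions. `build_messages` is a hypothetical helper, and the passage is a placeholder where a real workflow would insert text returned by a retriever.

```python
def build_messages(question, passages):
    """Return (role, content) pairs with retrieved passages placed in the system message."""
    context = "\n".join(f"- {p}" for p in passages) or "(no context retrieved)"
    system = (
        "Answer using only the context below. "
        "Say you don't know if the context does not contain the answer.\n"
        f"Context:\n{context}"
    )
    return [("system", system), ("user", question)]


# Placeholder passage: in a real workflow this text would come from a retriever.
messages = build_messages(
    "How much memory does the NVIDIA H200 have?",
    ["<passage retrieved from an up-to-date product document>"],
)
for role, content in messages:
    print(f"{role}: {content}")
```

With an actual passage in place of the placeholder, the model no longer depends on its training data for the answer, which is the gap the RAG workflow is meant to close.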