# RAG Using LLM Endpoints from MLIS

## Importing the Libraries

### One-Time Environment Setup (Cert & Dependencies)

The following commands should be run **once** at the beginning of your session:

```python
!pip install -r requirements.txt -qq > /dev/null 2>&1
!cat /mnt/shared/ca-renewed/renewed-pcai1-crt.pem >> $(python -m certifi)
```

In [None]:
!pip install -r requirements.txt -q

### Append Custom CA Certificate to Python's Trusted Cert Store

The following command appends a custom certificate (`renewed-pcai1-crt.pem`) to Python's certifi CA bundle, allowing Python tools like `requests` to trust internal HTTPS endpoints signed by this CA:


In [None]:
!cat /mnt/shared/ca-renewed/renewed-pcai1-crt.pem >> $(python -m certifi)
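Because `>>` appends unconditionally, re-running the cell above adds the same certificate to the bundle again. As a minimal sketch (the helper name `append_cert_once` is hypothetical and not part of the original notebook), the append can be made idempotent by checking the bundle first:

```python
from pathlib import Path


def append_cert_once(cert_path: str, bundle_path: str) -> bool:
    """Append the PEM file at cert_path to bundle_path unless it is already there.

    Returns True if the bundle was modified, False if the certificate
    was already present.
    """
    cert = Path(cert_path).read_text().strip()
    bundle = Path(bundle_path)
    existing = bundle.read_text() if bundle.exists() else ""
    if cert in existing:
        return False
    # Make sure the new certificate starts on its own line.
    separator = "" if not existing or existing.endswith("\n") else "\n"
    with bundle.open("a") as fh:
        fh.write(separator + cert + "\n")
    return True
```

In the notebook this could be called as `append_cert_once("/mnt/shared/ca-renewed/renewed-pcai1-crt.pem", certifi.where())` after `import certifi`, assuming the bundle file is writable by the notebook user.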

### Restart the Kernel

After running the setup commands above, go to the Jupyter menu and select:  
**`Kernel` → `Restart Kernel`**  

This step is necessary to activate:
- The newly installed packages.
- The updated CA certificates in the Python runtime.
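After the restart, a quick check can confirm that the installed packages are visible to the new kernel. The sketch below uses only the standard library; the helper name `missing_modules` is hypothetical, and the module list is taken from the imports used later in this notebook:

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Top-level packages imported later in this notebook.
required = [
    "langchain",
    "langchain_community",
    "langchain_weaviate",
    "langchain_nvidia_ai_endpoints",
    "weaviate",
]
print("Missing packages:", missing_modules(required))
```

An empty list suggests the install step succeeded; any name that is printed probably needs to be installed again before continuing.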


### After Restart: Run the Remaining Notebook Cells

Now that the kernel has restarted, begin executing the remaining notebook cells starting from your LangChain and Weaviate imports.


In [None]:
from langchain_nvidia_ai_endpoints import ChatNVIDIA
from langchain_nvidia_ai_endpoints.reranking import NVIDIARerank
from langchain_weaviate.vectorstores import WeaviateVectorStore
from langchain.chains import RetrievalQA
from langchain_community.document_loaders import PyPDFLoader
from langchain.retrievers.contextual_compression import ContextualCompressionRetriever
from langchain.text_splitter import CharacterTextSplitter
import weaviate

## Fetching the Secret Token for RAG Essentials

In [None]:
import os
import weaviate
from weaviate.classes.init import Auth

# Read the auth token from the mounted secret file
secret_file_path = "/etc/secrets/ezua/.auth_token"

with open(secret_file_path, "r") as file:
    token = file.read().strip()
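If the secret file is missing or empty, the failure would otherwise only show up later as an authentication error from Weaviate. A minimal sketch of a stricter read, assuming a hypothetical helper named `read_token`, could look like this:

```python
from pathlib import Path


def read_token(path: str) -> str:
    """Read an auth token from a file and fail early if it is missing or empty."""
    token_file = Path(path)
    if not token_file.is_file():
        raise FileNotFoundError(f"Token file not found: {path}")
    token = token_file.read_text().strip()
    if not token:
        raise ValueError(f"Token file is empty: {path}")
    return token
```

With this helper, the cell above would reduce to `token = read_token(secret_file_path)`.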

## Connecting to Weaviate

In [None]:
domain = ".cluster.local"
http_host = "weaviate.hpe-weaviate.svc" + domain
grpc_host = "weaviate-grpc.hpe-weaviate.svc" + domain
weaviate_headers = {"x-auth-token": token}
# To test an authentication failure, use an invalid token instead:
# weaviate_headers = {"x-auth-token": "wrong token"}

client = weaviate.connect_to_custom(
    http_host=http_host,      # Hostname for the HTTP API connection
    http_port=80,             # Default is 80, WCD uses 443
    http_secure=False,        # Whether to use https (secure) for the HTTP API connection
    grpc_host=grpc_host,      # Hostname for the gRPC API connection
    grpc_port=50051,          # Default is 50051, WCD uses 443
    grpc_secure=False,        # Whether to use a secure channel for the gRPC API connection
    headers=weaviate_headers,
    skip_init_checks=False,
)

print(client.is_ready())
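The two hostnames follow the standard Kubernetes service DNS pattern `<service>.<namespace>.svc.<cluster-domain>`. As a small illustration (the helper `service_host` is hypothetical, and the service and namespace names are simply the ones used in the cell above), they can be built from their parts:

```python
def service_host(service: str, namespace: str, cluster_domain: str = "cluster.local") -> str:
    """Build the in-cluster DNS name of a Kubernetes service."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


http_host = service_host("weaviate", "hpe-weaviate")
grpc_host = service_host("weaviate-grpc", "hpe-weaviate")
print(http_host)
print(grpc_host)
```

Building both names the same way makes it easier to change the namespace or cluster domain in one place if the deployment differs.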

## Connecting to LLM through MLIS

In [None]:
llm_api_key = "Paste the Auth token from mlis for LLM"
llm_endpoint_mlis = "Paste the endpoint for LLM from MLIS"
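Pasting credentials directly into a notebook cell makes it easy to share or commit them by accident. One possible alternative, sketched below, reads them from environment variables. The helper `require_env` and the variable names `MLIS_LLM_API_KEY` and `MLIS_LLM_ENDPOINT` are assumptions for illustration, not names defined by MLIS:

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise if it is unset or blank."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# Hypothetical variable names; uncomment once they are set in your environment.
# llm_api_key = require_env("MLIS_LLM_API_KEY")
# llm_endpoint_mlis = require_env("MLIS_LLM_ENDPOINT")
```

The rest of the notebook can stay unchanged, since it only relies on `llm_api_key` and `llm_endpoint_mlis` being defined.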

In [None]:
llm = ChatNVIDIA(
    base_url=llm_endpoint_mlis,
    model="meta/llama3-8b-instruct",
    api_key=llm_api_key,
    temperature=0.5,
    max_tokens=1024,
    top_p=1.0,
)
llm.invoke("List some of the technology offerings from HPE")

## Data Extraction and Processing

In [None]:
# Replace with the path to your PDF
pdf_path = "./HPE.pdf"

# Load PDF file
loader = PyPDFLoader(pdf_path)
documents = loader.load()

# Split into manageable chunks
text_splitter = CharacterTextSplitter(chunk_size=500, chunk_overlap=50)
docs = text_splitter.split_documents(documents)

# Drop the PDF metadata so that only the chunk text is stored in Weaviate
for doc in docs:
    doc.metadata = {}
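To build intuition for `chunk_size` and `chunk_overlap`, the sketch below uses a plain fixed-size sliding window. This is a simplification and not how `CharacterTextSplitter` is implemented: the real splitter first splits on a separator and then merges pieces up to the size limit. The function name `chunk_text` is hypothetical:

```python
def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list:
    """Split text into fixed-size windows that share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in the range [0, chunk_size)")
    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reaches the end of the text
        start += step
    return chunks


print(chunk_text("abcdefghij", chunk_size=4, chunk_overlap=1))
# ['abcd', 'defg', 'ghij']
```

Each chunk repeats the last character of the previous one, which is the role the overlap plays: context that straddles a chunk boundary still appears intact in at least one chunk.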

## Vector Store Initialization

In [None]:
from langchain_ollama import OllamaEmbeddings

embeddings = OllamaEmbeddings(model="nomic-embed-text:latest", base_url="https://ollama.pcai1.genai1.hou")
vector = WeaviateVectorStore.from_documents(
    docs, embedding=embeddings, client=client, index_name="RAG", text_key="rag_key"
)


## Retriever Initialization

In [None]:
retriever = vector.as_retriever()
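Conceptually, the retriever embeds the query and returns the stored chunks whose vectors are closest to it. The toy example below illustrates that idea with cosine similarity over hand-written two-dimensional vectors. It is only a sketch: the function names are hypothetical, and the actual search is performed inside Weaviate with real embeddings:

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 for zero vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, indexed, k=2):
    """Return the texts of the k entries most similar to query_vec.

    indexed is a list of (text, vector) pairs.
    """
    ranked = sorted(
        indexed,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


# Toy index: the vectors are made up for illustration only.
toy_index = [("gpu chunk", [1, 0]), ("cpu chunk", [0, 1]), ("mixed chunk", [1, 1])]
print(top_k([1, 0], toy_index, k=2))
# ['gpu chunk', 'mixed chunk']
```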

## User Query

In [None]:
query = "Which GPU powers the HPE ProLiant Compute DL384 Gen12?"

## Output

In [None]:
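# Build the QA chain, run the query, and print only the generated answer
chain = RetrievalQA.from_chain_type(llm=llm, retriever=retriever)
response = chain.invoke(query)
print(response["result"])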
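`RetrievalQA.from_chain_type` defaults to the "stuff" chain type, which places all retrieved chunks into a single prompt together with the question. The sketch below shows the general shape of that step. The prompt wording and the function name `build_stuffed_prompt` are assumptions for illustration and do not reproduce LangChain's actual template:

```python
def build_stuffed_prompt(question: str, chunks: list) -> str:
    """Combine the retrieved chunks and the question into one prompt string."""
    context = "\n\n".join(chunks)
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Placeholder chunks standing in for whatever the retriever returns.
example_chunks = ["First retrieved chunk.", "Second retrieved chunk."]
print(build_stuffed_prompt("What does the document say?", example_chunks))
```

Because every chunk is sent in one request, the number and size of retrieved chunks need to fit within the model's context window.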