# Ollama RAG Assistant for Quantum Technologies

Authored by Naufer Nusran - June 8th, 2024.

This tutorial walks through the steps to implement a simple Retrieval-Augmented Generation (RAG) pipeline based on llama3, the open-source large language model (LLM) developed by Meta, using the LangChain NLP framework. Two queries are chosen to demonstrate how a RAG implementation can tailor the LLM to provide more accurate, domain-specific responses:

testQuery1 = "What do you know about Evaporated Glass Trap (EGT) Platform?"

testQuery2 = "What do you know about Novera QPU?"


### References

    1. https://ollama.com   
    2. https://python.langchain.com/v0.1/docs/get_started/quickstart/
    
### Dependencies

In [1]:
#!ollama pull llama3
#!pip install langchain
#!pip install langchain_community
#!pip install beautifulsoup4
#!pip install faiss-cpu

### Test LLM 

In [2]:
from langchain_community.llms import Ollama
from langchain_core.prompts import ChatPromptTemplate

llm = Ollama(model="llama3") 

testQuery1 = "What do you know about Evaporated Glass Trap (EGT) Platform?"
testQuery2 = "What do you know about Novera QPU?"

In [3]:
response = llm.invoke(testQuery1)
print(response)

The Evaporated Glass Trap (EGT) platform! That's a fascinating topic.

From what I've gathered, the EGT platform is an emerging technology that uses evaporated glass to create a unique trapping mechanism. This innovative approach has been gaining attention in various fields, including chemistry, biology, and environmental science.

Here are some key points about the Evaporated Glass Trap (EGT) Platform:

1. **Principle**: The EGT platform works by evaporating glass particles onto a substrate, creating a thin film that can trap and hold specific molecules or particles. This trapping mechanism allows for the selective capture of substances with unique properties.
2. **Applications**: The EGT platform has been explored in various areas, such as:
	* Environmental monitoring: Tracking pollutants, detecting chemical contaminants, and monitoring water quality.
	* Biomedical research: Studying biomarkers, capturing cells or proteins, and understanding disease mechanisms.
	* Materials science: 

In [4]:
response = llm.invoke(testQuery2)
print(response)

Novera's Quantum Processing Unit (QPU) is a novel AI-acceleration chip that utilizes quantum computing principles to accelerate certain types of machine learning workloads. Here are some key facts about the Novera QPU:

1. **Quantum-inspired architecture**: The QPU is designed to mimic the principles of quantum computing, but without actually using quantum bits (qubits). This allows for significant speedups in specific AI tasks.

2. **FPGA-based implementation**: The QPU is built on top of field-programmable gate arrays (FPGAs), which enables reconfigurability and customization for different workloads.

3. **Neural network acceleration**: The QPU is optimized for accelerating neural networks, particularly those with specific types of convolutional layers, such as image classification tasks.

4. **Up to 10x speedup**: Novera claims that their QPU can deliver up to 10 times the performance of traditional CPUs or GPUs on certain AI workloads.

5. **Energy efficiency**: The QPU is designed

$\color{red}{\text{This is not good. In both queries, the LLM is hallucinating! So, let's try some prompt engineering.}}$



In [5]:
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are an Assistant for Quantum Technologies. Your response should be short and accurate. If you are not sure say you don't know."),
    ("user", "{query}")
])

llm_app = prompt | llm
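
The `|` in `prompt | llm` chains the two steps so that the output of the prompt template becomes the input of the model. The toy sketch below imitates that idea in plain Python. `Step`, `toy_prompt` and `toy_llm` are hypothetical names made up for this illustration; they are not LangChain classes, and the "model" here only echoes its input.

```python
class Step:
    """Wrap a function so that steps can be chained with `|`."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # (a | b).invoke(x) computes b.invoke(a.invoke(x))
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value):
        return self.func(value)


# Hypothetical stand-ins for the prompt template and the model.
toy_prompt = Step(lambda inputs: "System: be short and accurate.\nUser: " + inputs["query"])
toy_llm = Step(lambda text: f"(echo) {text.splitlines()[-1]}")

toy_app = toy_prompt | toy_llm
reply = toy_app.invoke({"query": "What is a qubit?"})
print(reply)  # (echo) User: What is a qubit?
```

The real chain is called the same way: a dictionary whose keys match the template variables goes in, and the model's text comes out.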

In [6]:
response = llm_app.invoke({"query": testQuery1})
print(response)

I'm familiar with the Evaporated Glass Trap (EGT) platform! It's a quantum computing architecture designed to enhance the performance and scalability of quantum processors. The EGT platform uses evaporated glass surfaces to trap and manipulate individual atoms, enabling precise control over qubits. This innovative approach aims to improve quantum computing capabilities and pave the way for more robust and reliable quantum processing.


In [7]:
response = llm_app.invoke({"query": testQuery2})
print(response)

Novera QPU (Quantum Processing Unit) is a high-performance computing platform designed by Novera Systems, Inc. for processing quantum algorithms. It's a hybrid classical-quantum architecture that enables efficient execution of quantum-inspired workloads on standard hardware. Does that help?


$\color{red}{\text{Still hallucinating. This is where RAG can help!}}$


## Build RAG

    1. Load External Knowledge Base
    2. Generate Embeddings and Build an Index 
    3. Create Retrieval Chain

### 1. Load External Knowledge Base


In [8]:
from langchain_community.document_loaders import WebBaseLoader

#loader = WebBaseLoader("https://ionq.com/resources/glossary")
#docs = loader.load()

urls = [
        "https://ionq.com/resources/glossary",
        "https://www.rigetti.com/rigetti-computing-news"
       ] 

docs = []
for url in urls:
    loader = WebBaseLoader(url)
    docs.extend(loader.load())
    
len(docs)



2
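
Each entry in `docs` holds the text of one web page. `WebBaseLoader` relies on BeautifulSoup for the HTML-to-text step (hence the `beautifulsoup4` dependency). The sketch below shows the same idea using only the standard library. It is a simplified illustration and not the loader's actual implementation: `TextExtractor`, `html_to_text` and the sample HTML are assumptions made up for this example.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.parts.append(text)


def html_to_text(html):
    """Return the visible text of an HTML string, one fragment per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)


# Made-up page used only to exercise the extractor.
sample_html = """
<html><head><title>Glossary</title><style>p {color: red;}</style></head>
<body><h1>Qubit</h1><p>The basic unit of quantum information.</p>
<script>console.log("ignored");</script></body></html>
"""
page_text = html_to_text(sample_html)
print(page_text)
```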

### 2. Generate Embeddings and Build an Index

In [10]:
from langchain_community.embeddings import OllamaEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import FAISS

embeddings_llm = OllamaEmbeddings(model="llama3")

text_splitter = RecursiveCharacterTextSplitter()

documents = text_splitter.split_documents(docs)

len(documents)

8
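
Two pages turn into only 8 chunks because `RecursiveCharacterTextSplitter()` is created without arguments and so falls back to its defaults. To my knowledge these are `chunk_size=4000` and `chunk_overlap=200` characters, but check this against the installed version. The sketch below illustrates what chunk size and overlap mean with a plain sliding window. It is a simplification: the real splitter also tries to break on separators such as paragraphs and sentences, which this hypothetical `split_with_overlap` helper does not do.

```python
def split_with_overlap(text, chunk_size, chunk_overlap):
    """Cut text into windows of at most chunk_size characters.

    Consecutive windows share chunk_overlap characters.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = split_with_overlap(sample, chunk_size=10, chunk_overlap=3)
print(chunks)  # ['abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz']
```

Smaller chunks give the retriever more, and more focused, pieces to choose from, at the cost of more embedding calls.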

In [11]:
vector_index = FAISS.from_documents(documents, embeddings_llm)

In [12]:
retriever = vector_index.as_retriever()

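# A retriever takes the query string directly; the {"input": ...} dict is only needed by the retrieval chain below.
relevant_docs = retriever.invoke(testQuery1)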

len(relevant_docs)

4
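
The retriever returns 4 chunks because `as_retriever()` fetches the top 4 matches unless told otherwise (for example via `search_kwargs={"k": 2}`). The sketch below shows the underlying idea: embed the query, then keep the `k` stored vectors closest to it. As far as I know the LangChain FAISS store uses Euclidean (L2) distance by default, so the sketch does the same. The two-dimensional vectors and their labels are invented for illustration; real llama3 embeddings have thousands of dimensions.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k(query_vec, indexed, k=4):
    """Return the k (distance, text) pairs closest to query_vec."""
    scored = [(l2_distance(query_vec, vec), text) for text, vec in indexed]
    return sorted(scored)[:k]


# Hypothetical index of (chunk text, embedding) pairs.
toy_index = [
    ("trapped ions", [0.9, 0.1]),
    ("superconducting qubits", [0.1, 0.9]),
    ("glass trap fabrication", [0.8, 0.3]),
    ("press release", [0.5, 0.5]),
    ("careers page", [0.0, 0.0]),
]

nearest = top_k([1.0, 0.2], toy_index, k=4)
for dist, text in nearest:
    print(f"{dist:.3f}  {text}")
```

Note that the retriever always returns `k` chunks, even when none of them is actually relevant to the query.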

### 3. Create Retrieval Chain

In [13]:
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.documents import Document
from langchain.chains import create_retrieval_chain

prompt = ChatPromptTemplate.from_template("""
You are an Assistant for Quantum Technologies. Your response should be short and accurate. Answer the question based on the provided context and your internal knowledge.
Give priority to context and if not sure then say you are unaware of the topic:

<context> {context} </context>

Question: {input}
""")


document_chain = create_stuff_documents_chain(llm, prompt)
retrieval_chain = create_retrieval_chain(retriever, document_chain)
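
Conceptually, the two helpers above do three things: fetch the chunks relevant to `input`, "stuff" them all into the `{context}` slot of the prompt, and send the filled prompt to the model. The sketch below mimics that flow in plain Python. `fake_retriever`, `fake_llm` and the placeholder chunk texts are hypothetical stand-ins, not LangChain APIs. The result keys (`input`, `context`, `answer`) and the blank-line separator between chunks mirror what I understand the real chain to use.

```python
PROMPT_TEMPLATE = """Answer the question based on the provided context.

<context> {context} </context>

Question: {input}"""


def fake_retriever(query):
    """Stand-in for retriever.invoke(query); returns hard-coded placeholder chunks."""
    return ["placeholder chunk A", "placeholder chunk B"]


def fake_llm(prompt_text):
    """Stand-in for llm.invoke(prompt_text); reports the prompt size instead of answering."""
    return f"[model output for a prompt of {len(prompt_text)} characters]"


def run_retrieval_chain(query):
    chunks = fake_retriever(query)
    # "Stuffing": every retrieved chunk is pasted into the same prompt.
    context = "\n\n".join(chunks)
    prompt_text = PROMPT_TEMPLATE.format(context=context, input=query)
    return {"input": query, "context": chunks, "answer": fake_llm(prompt_text)}


result = run_retrieval_chain("What do you know about Novera QPU?")
print(result["answer"])
```

Because every chunk goes into one prompt, the combined size of the retrieved chunks has to fit within the model's context window.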

In [18]:
response = retrieval_chain.invoke({"input": testQuery1})

print(response["answer"])

I'm not aware of any information about an "Evaporated Glass Trap" (EGT) platform. It's possible that this is a proprietary technology or a concept that has not been publicly disclosed. If you could provide more context or clarify what you mean by EGT, I may be able to help you better.


In [19]:
response = retrieval_chain.invoke({"input": testQuery2})

print(response["answer"])

Based on the provided context, I can tell you that Novera QPU is Rigetti Computing's first commercially available Quantum Processing Unit (QPU). It features a 9-qubit chip with tunable couplers for fast two-qubit operations and a 5-qubit chip for testing single-qubit operations. The Novera QPU is built on Rigetti's fourth-generation Ankaa-class architecture.

Additionally, Rigetti has also launched the Novera QPU Partner Program to enable high-performing, on-premises quantum computing by creating an ecosystem of quantum computing hardware, software, and service providers who build and offer integral components of a functional quantum computing system.
