# RAG - Retrieval-Augmented Generation

# Recap

![With and Without RAG](notebook_images/rag-with-without.png "With and Without RAG") 

![RAG](notebook_images/rag-before-after.png "RAG")

# RAG

## Retrieval
- Set up a knowledge base
- Retrieve documents relevant to the user query

## Generation
- Using LLMs
- Use the retrieved documents as context


In [39]:
# basic imports
import os
import json
import logging
import sys
import pandas as pd

from dotenv import load_dotenv
load_dotenv(override=True)

# create and configure logger
logging.basicConfig(level=logging.INFO, datefmt='%Y-%m-%dT%H:%M:%S',
                    format='%(asctime)-15s.%(msecs)03dZ %(levelname)-7s : %(name)s - %(message)s',
                    handlers=[logging.StreamHandler(sys.stdout)]
                    )
# create log object with current module name
log = logging.getLogger(__name__)

## RAG - *Retrieval*-Augmented Generation

### Knowledge base
- Create a knowledge base with "your" data

#### Retrieval steps
1. Prepare data 
2. Create a database and insert data
3. Search the database and retrieve relevant documents according to the search query.

### 1. Prepare data
- Load data from different sources
- We will use a set of polymer chemistry research papers (in the `docs` folder).

PS: Check the licensing agreement of the documents before applying any GenAI techniques to them. For testing purposes with a small group of people, this should be OK.

### 1.1 Data Loaders
- Langchain provides different [data loaders](https://python.langchain.com/docs/how_to/#document-loaders) for different file types
- Eg: Langchain CSVLoader is essentially a wrapper for Python [csv.DictReader class](https://docs.python.org/3/library/csv.html#csv.DictReader)
- Data is loaded into Langchain Document object [format](https://api.python.langchain.com/en/latest/documents/langchain_core.documents.base.Document.html)

The code below loads all files in a directory. For now, we have only PDF files.
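To make the loader idea concrete, here is a minimal sketch using only the standard library. It mimics what a CSV loader does: each row read by `csv.DictReader` becomes one record with text content and metadata. `SimpleDocument` and `load_csv_rows` are hypothetical names for illustration, not LangChain APIs, and the sample rows are made up.

```python
import csv
import io
from dataclasses import dataclass, field


@dataclass
class SimpleDocument:
    """Hypothetical stand-in for a LangChain Document."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def load_csv_rows(csv_text: str, source: str) -> list[SimpleDocument]:
    """Turn every CSV row into one document (illustrative only)."""
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=",", quotechar='"')
    documents = []
    for row_number, row in enumerate(reader):
        # one "column: value" line per field in the row
        content = "\n".join(f"{key}: {value}" for key, value in row.items())
        documents.append(
            SimpleDocument(content, {"source": source, "row": row_number})
        )
    return documents


# made-up sample data, only to show the shape of the result
sample_csv = "polymer,solvent\nPEO,none\nPVDF,DMF\n"
sample_docs = load_csv_rows(sample_csv, source="sample.csv")
print(sample_docs[0])
```

Each row keeps its file name and row index in the metadata, which is the same kind of information the real loaders attach to every document.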



In [2]:
# data loaders
from langchain_community.document_loaders import CSVLoader, DataFrameLoader, PyPDFLoader, Docx2txtLoader, UnstructuredRSTLoader, DirectoryLoader


class DataLoaders:
    """
    various data loaders
    loads all data in a directory
    """
    def __init__(self, data_dir_path):
        self.data_dir_path = data_dir_path
    
    def csv_loader(self):
        csv_loader_kwargs = {
                            "csv_args":{
                                "delimiter": ",",
                                "quotechar": '"',
                                },
                            }
        dir_csv_loader = DirectoryLoader(self.data_dir_path, glob="**/*.csv", use_multithreading=True,
                                    loader_cls=CSVLoader, 
                                    loader_kwargs=csv_loader_kwargs,
                                    )
        return dir_csv_loader
    
    def pdf_loader(self):
        dir_pdf_loader = DirectoryLoader(self.data_dir_path, glob="**/*.pdf",
                                    loader_cls=PyPDFLoader,
                                    )
        return dir_pdf_loader
    
    def word_loader(self):
        dir_word_loader = DirectoryLoader(self.data_dir_path, glob="**/*.docx",
                                    loader_cls=Docx2txtLoader,
                                    )
        return dir_word_loader
    
    def rst_loader(self):
        rst_loader_kwargs = {
                        "mode":"single"
                        }
        dir_rst_loader = DirectoryLoader(self.data_dir_path, glob="**/*.rst",
                                    loader_cls=UnstructuredRSTLoader, 
                                    loader_kwargs=rst_loader_kwargs
                                    )
        return dir_rst_loader
    

In [20]:
# load data
data_dir_path =  os.getenv('DATA_DIR_PATH', "data")
data_loader = DataLoaders(data_dir_path=data_dir_path)
log.info("Loading files from directory %s", data_dir_path)
dir_csv_loader = data_loader.csv_loader()
dir_word_loader = data_loader.word_loader()
dir_pdf_loader = data_loader.pdf_loader()
dir_rst_loader = data_loader.rst_loader()
csv_data = dir_csv_loader.load()
word_data = dir_word_loader.load()
pdf_data = dir_pdf_loader.load()
rst_data = dir_rst_loader.load()

2024-11-06T12:11:10.033Z INFO    : __main__ - Loading files from directory data


### 1.2 Data format
- Document class has 
    - `page_content` : textual content
    - `metadata` : metadata about the document. Can be user-defined. 
        - by default, each file type has its own metadata content
        - Eg: PDF file has `source` and `page`

![Langchain document class](notebook_images/langchain-document-class.png "Langchain document class")

In [21]:
for doc in pdf_data:
    print(doc)
    break

page_content='www.afm-journal.de© 2020 Wiley-VCH GmbH 2006683 (1 of 9)Full PaPer
Direct Ink Writing of Polymer Composite Electrolytes 
with Enhanced Thermal Conductivities
Meng Cheng, Ajaykrishna Ramasubramanian, Md Golam Rasul, Yizhou Jiang, Yifei Yuan, 
Tara Foroozan, Ramasubramonian Deivanayagam, Mahmoud Tamadoni Saray, 
Ramin Rojaee, Boao Song, Vitaliy Robert Yurkiv, Yayue Pan, Farzad Mashayek, 
and Reza Shahbazian-Yassar*
Proper distribution of thermally conductive nanomaterials in polymer 
batteries offers new opportunities to mitigate performance degradations 
associated with local hot spots and safety concerns in batteries. Herein, a 
direct ink writing (DIW) method is utilized to fabricate polyethylene oxide (PEO) composite polymers electrolytes (CPE) embedded with silane-treated 
hexagonal boron nitride (S-hBN) platelets and free of any volatile organic 
solvents. It is observed that the S-hBN platelets are well aligned in the printed CPE during the DIW process. The in-plane 

In [25]:
print("Number of PDF documents: ", len(pdf_data))

Number of PDF documents:  125


### 1.3 Format into text and metadata
- Convert data to a list of texts and metadata 
- Only textual content is required for search implementation
- Metadata can be used for filtering the data
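As a rough illustration of metadata filtering, the sketch below keeps only the texts whose `source` ends with a given suffix. The helper `filter_by_source_suffix` and the sample values are assumptions made up for this example; vector stores usually offer their own filtering mechanisms.

```python
# made-up parallel lists in the same shape as the texts/metadatas used in this notebook
sample_texts = ["page about PEO electrolytes", "page about epoxy inks", "row about PVDF"]
sample_metadatas = [
    {"source": "data/paper_a.pdf", "row_page": 0},
    {"source": "data/paper_b.pdf", "row_page": 3},
    {"source": "data/table.csv", "row_page": 7},
]


def filter_by_source_suffix(texts, metadatas, suffix):
    """Return (text, metadata) pairs whose source file name ends with suffix."""
    return [
        (text, metadata)
        for text, metadata in zip(texts, metadatas)
        if metadata["source"].endswith(suffix)
    ]


pdf_only = filter_by_source_suffix(sample_texts, sample_metadatas, ".pdf")
print(pdf_only)
```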


In [22]:
# get text and metadata from the data
def get_text_metadatas(csv_data=None, pdf_data=None, word_data=None, rst_data=None):
    """
    Each document class has page_content and metadata properties
    Separate text and metadata content from Document class
    Have custom metadata if needed
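    """
    # treat missing inputs as empty lists so the default None arguments do not break iteration
    csv_data, pdf_data = csv_data or [], pdf_data or []
    word_data, rst_data = word_data or [], rst_data or []
    csv_texts = [doc.page_content for doc in csv_data]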
    # custom metadata
    csv_metadatas = [{'source': doc.metadata['source'], 'row_page': doc.metadata['row']} for doc in csv_data]   # metadata={'source': 'filename.csv', 'row': 0}
    pdf_texts = [doc.page_content for doc in pdf_data]
    pdf_metadatas = [{'source': doc.metadata['source'], 'row_page': doc.metadata['page']} for doc in pdf_data]  # metadata={'source': 'data/filename.pdf', 'page': 8}
    word_texts = [doc.page_content for doc in word_data]
    word_metadatas = [{'source': doc.metadata['source'], 'row_page': ''} for doc in word_data] 
    rst_texts = [doc.page_content for doc in rst_data]
    rst_metadatas = [{'source': doc.metadata['source'], 'row_page': ''} for doc in rst_data]         # metadata={'source': 'docs/images/architecture/index.rst'}

    texts = csv_texts + pdf_texts + word_texts + rst_texts
    metadatas = csv_metadatas + pdf_metadatas + word_metadatas + rst_metadatas
    return texts, metadatas


texts , metadatas = get_text_metadatas(csv_data, pdf_data, word_data, rst_data)

In [34]:
print("Number of PDF texts: ", len(texts))
print("Number of PDF metadata: ", len(metadatas))

Number of PDF texts:  125
Number of PDF metadata:  125


### 1.4 Chunking

![Chunk Optimization](notebook_images/rag-chunking1.png "Chunk Optimization")

- Split texts into chunks
- Return a list of document chunks (list of langchain [document class](https://api.python.langchain.com/en/latest/documents/langchain_core.documents.base.Document.html))
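The effect of chunk size and overlap can be sketched with a simplified character-based splitter. This is not how `RecursiveCharacterTextSplitter` works internally (that splitter prefers separators such as paragraphs and sentences, and in this notebook measures length in tokens); `chunk_with_overlap` is a hypothetical helper that only shows the sliding-window idea.

```python
def chunk_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into fixed-size character windows that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # stop once the current window reaches the end of the text
        if start + chunk_size >= len(text):
            break
    return chunks


sample_chunks = chunk_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=2)
print(sample_chunks)  # ['abcd', 'cdef', 'efgh', 'ghij']
```

The overlap repeats the tail of one chunk at the head of the next, so a sentence cut at a chunk boundary still appears whole in at least one chunk when it is shorter than the overlap.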

In [23]:
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.schema import Document
from typing import List

text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
        chunk_size=1000,
        chunk_overlap=200,
        separators=[
            "\n\n", "\n", ". ", " ", ""
        ]  # try to split on paragraphs... fallback to sentences, then chars, ensure we always fit in context window
    )

docs: List[Document] = text_splitter.create_documents(texts=texts, metadatas=metadatas)


In [24]:
print(docs[0])

page_content='www.afm-journal.de© 2020 Wiley-VCH GmbH 2006683 (1 of 9)Full PaPer
Direct Ink Writing of Polymer Composite Electrolytes 
with Enhanced Thermal Conductivities
Meng Cheng, Ajaykrishna Ramasubramanian, Md Golam Rasul, Yizhou Jiang, Yifei Yuan, 
Tara Foroozan, Ramasubramonian Deivanayagam, Mahmoud Tamadoni Saray, 
Ramin Rojaee, Boao Song, Vitaliy Robert Yurkiv, Yayue Pan, Farzad Mashayek, 
and Reza Shahbazian-Yassar*
Proper distribution of thermally conductive nanomaterials in polymer 
batteries offers new opportunities to mitigate performance degradations 
associated with local hot spots and safety concerns in batteries. Herein, a 
direct ink writing (DIW) method is utilized to fabricate polyethylene oxide (PEO) composite polymers electrolytes (CPE) embedded with silane-treated 
hexagonal boron nitride (S-hBN) platelets and free of any volatile organic 
solvents. It is observed that the S-hBN platelets are well aligned in the printed CPE during the DIW process. The in-plane 

In [None]:
print("Number of documents: ", len(docs))

### 1.5 Vector Embeddings
![representing language](notebook_images/representing-language.png "Representing language")

Image source: MIT Deep Learning course [slides](https://introtodeeplearning.com/slides/6S191_MIT_DeepLearning_L2.pdf)

#### Embeddings
- Vector representation of text.
- Individual words are represented as real-valued vectors.
- Captures semantic meaning and relationships of the text in a high-dimensional space.
- Words that have similar meanings have similar representations.
- Eg: [GloVe](https://nlp.stanford.edu/projects/glove/) word vectors; see also the [MTEB leaderboard](https://huggingface.co/spaces/mteb/leaderboard) on Hugging Face for embedding models.
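"Similar representation" is usually measured with cosine similarity. The sketch below uses tiny hand-written 3-dimensional vectors. They are made up for illustration and are not real embeddings, which typically have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# made-up toy vectors, NOT produced by an embedding model
toy_embeddings = {
    "polymer": [0.9, 0.1, 0.0],
    "resin": [0.8, 0.2, 0.1],
    "capital": [0.0, 0.1, 0.9],
}

related = cosine_similarity(toy_embeddings["polymer"], toy_embeddings["resin"])
unrelated = cosine_similarity(toy_embeddings["polymer"], toy_embeddings["capital"])
print(f"polymer vs resin:   {related:.3f}")
print(f"polymer vs capital: {unrelated:.3f}")
```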

![vector embeddings](notebook_images/vectors.png "Vector embeddings")

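We will be using an [OpenAI text embedding model](https://platform.openai.com/docs/guides/embeddings/embedding-models). Called without arguments, `OpenAIEmbeddings()` uses `text-embedding-ada-002` by default (check your installed `langchain_openai` version), which accepts up to 8191 input tokens and returns 1536-dimensional vectors.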


In [8]:
# embeddings 
from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings()

### RAG - Retrieval Steps

~~1. Prepare data~~

2. Create a database and insert data

3. Search the database and retrieve relevant documents according to the search query.




### 2. Create a database and insert data

**Vector database** (Beginners [blog 1](https://medium.com/data-and-beyond/vector-databases-a-beginners-guide-b050cbbe9ca0), Pinecone [blog 2](https://www.pinecone.io/learn/vector-database/))

- Efficiently store, index and search high-dimensional data
- Store data as vector embeddings
- Optimized for fast retrieval and similarity search
- Calculate the distance between user query embedding and other data points
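The points above can be sketched as a toy in-memory vector store that scans every stored vector and returns the closest ones. `ToyVectorStore` is a hypothetical class for illustration; production vector databases typically rely on approximate nearest-neighbour indexes rather than this linear scan, and the vectors here are made up.

```python
import math


class ToyVectorStore:
    """Hypothetical brute-force vector store (illustration only)."""

    def __init__(self):
        self._items = []  # list of (vector, payload) pairs

    def add(self, vector, payload):
        self._items.append((vector, payload))

    @staticmethod
    def _cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(y * y for y in b))
        return dot / (norm_a * norm_b)

    def search(self, query_vector, k=2):
        """Return the k (score, payload) pairs most similar to the query."""
        scored = [
            (self._cosine(query_vector, vector), payload)
            for vector, payload in self._items
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]


toy_store = ToyVectorStore()
toy_store.add([1.0, 0.0], "epoxy page")
toy_store.add([0.0, 1.0], "battery page")
toy_store.add([0.9, 0.1], "resin page")

top_hits = toy_store.search([1.0, 0.05], k=2)
print([payload for _, payload in top_hits])
```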

![Vector DB](notebook_images/vectorDB-comparison.png "Vector DB")


## RAG - *Retrieval*-Augmented Generation

### 2.1 Vector DB Retrieval

![Vector DB Retrieval](notebook_images/vectordb-retrieval.png "Vector DB Retrieval")

### 2.2 Vector Store

- We will use [Qdrant](https://qdrant.tech/) vector store for this example
- For today, we will keep the vector store in local memory
- Qdrant also provides a Docker image that can be used to run a vector store and host it remotely

Eg: [Qdrant docker container running locally](http://localhost:6333/dashboard)

- Blog post on vector stores [link](https://medium.com/google-cloud/vector-databases-are-all-the-rage-872c888fa348)


![Inserting into DB](notebook_images/inserting-db.png "Inserting into DB")

Source Credits : [Blog.demir](https://blog.demir.io/hands-on-with-rag-step-by-step-guide-to-integrating-retrieval-augmented-generation-in-llms-ac3cb075ab6f)


In [26]:
# creating a qdrant vector store in local memory

from langchain_community.vectorstores import Qdrant

# qdrant collection name
collection_name = os.getenv('QDRANT_COLLECTION_NAME', "data-collection")

# create vector store in local memory
vectorstore = Qdrant.from_documents(
    documents=docs,
    embedding=embeddings,
    location=":memory:",  # Local mode with in-memory storage only
    collection_name=collection_name,
    )

2024-11-06T12:12:37.962Z INFO    : httpx - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2024-11-06T12:12:39.865Z INFO    : httpx - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2024-11-06T12:12:41.871Z INFO    : httpx - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2024-11-06T12:12:43.316Z INFO    : httpx - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2024-11-06T12:12:44.658Z INFO    : httpx - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2024-11-06T12:12:45.392Z INFO    : httpx - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"


### RAG - Retrieval Steps

~~1. Prepare data~~

~~2. Create a vector store and insert data~~

3. Search the vector store and retrieve relevant documents

## 3. Retrieve relevant documents
Create a retriever from the vector store

In [27]:
# Retriever to retrieve relevant snippets
retriever = vectorstore.as_retriever()

### RAG - Retrieval Steps

~~1. Prepare data~~

~~2. Create a vector store and insert data~~

~~3. Search the vector store and retrieve relevant documents~~

## RAG - Retrieval-Augmented *Generation*

### LLM

- Pre-trained transformer models 
- Trained to predict the next word (token), given some input text.
- Open-source models - HuggingFace [leaderboard](https://huggingface.co/collections/open-llm-leaderboard/llm-leaderboard-best-models-652d6c7965a4619fb5c27a03)

- For the hands-on session: OpenAI GPT-4o-mini and the Ollama Llama3.2:3.2B model


![LLM prompting](notebook_images/rag-prompting.png "LLM Prompting")

## 4. Call LLM


- LilianWeng [blog post](https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/), [medium blog post](https://medium.com/thedeephub/llm-prompt-engineering-for-beginners-what-it-is-and-how-to-get-started-0c1b483d5d4f#:~:text=In%20essence%2C%20a%20prompt%20is,you%20want%20it%20to%20do.) on prompt engineering


### 4.1 Prompting

- Use [Langchain hub](https://smith.langchain.com/hub) to pull prompts
    - easy to share and reuse prompts
    - can see what are the popular prompts for specific use cases
    - Eg: [rlm/rag-prompt](https://smith.langchain.com/hub/rlm/rag-prompt)

![RLM RAG prompt](notebook_images/rlm-rag-prompt.png "rlm/rag-prompt")

- Use a prompt template [Langchain PromptTemplate](https://api.python.langchain.com/en/latest/prompts/langchain_core.prompts.prompt.PromptTemplate.html) to generate custom prompts
    - includes input parameters that can be dynamically changed
    
```
from langchain_core.prompts import PromptTemplate

# custom prompt with two input variables: {context} and {question}
qa_prompt_template = """Use the following pieces of context to answer the question at the end. Please follow these rules:
    1. If the question has some initial findings, use that as context.
    2. If you don't know the answer, don't try to make up an answer. Just say **I can't find the final answer but you may want to check the following sources** and add the source documents as a list.
    3. If you find the answer, write the answer in a concise way and add the list of sources that are **directly** used to derive the answer. Exclude the sources that are irrelevant to the final answer.

    {context}

    Question: {question}
    Helpful Answer:"""

rag_chain_prompt = PromptTemplate.from_template(qa_prompt_template)
```
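To see what "input parameters that can be dynamically changed" means, the sketch below fills the two placeholders with plain Python string formatting. It is a simplified stand-in for what a prompt template does; `toy_template` and `fill_prompt` are hypothetical names, and the context sentence is a short example in the spirit of the papers used here.

```python
toy_template = "Context:\n{context}\n\nQuestion: {question}\nHelpful Answer:"


def fill_prompt(template: str, context: str, question: str) -> str:
    """Substitute the {context} and {question} placeholders."""
    return template.format(context=context, question=question)


filled = fill_prompt(
    toy_template,
    context="The deposition nozzle had a diameter of 1.63 mm.",
    question="What was the nozzle diameter?",
)
print(filled)
```

The same template can be reused for every query; only the retrieved context and the question change between calls.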


In [14]:
# prompting

from langchain import hub
prompt = hub.pull("rlm/rag-prompt")

### 4.2 Call LLM
- We will use 
    - OpenAI GPT-4o-mini and 
    - Ollama llama3.2 model (hosted by NCSA)
- Each model has its own formats and parameters

In [28]:
# formatting the documents as a string before calling the LLM

def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

#### Without RAG

In [16]:
# call open ai GPT-4o-mini
from langchain_openai import ChatOpenAI

# create a chat openai model
llm: ChatOpenAI = ChatOpenAI(
            temperature=0,
            model="gpt-4o-mini",
            max_retries=500,
        )

In [17]:
# call GPT4o-mini. 
# No RAG. Not giving any instructions/context to the LLM.

llm.invoke("What is the capital of the world?")

# Notice the OpenAI LLM response format: content , metadata

2024-11-06T10:06:53.653Z INFO    : httpx - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


AIMessage(content='There is no official "capital of the world," as each country has its own capital city. However, some cities are often referred to as global capitals due to their significant influence in international politics, finance, culture, and diplomacy. Examples include New York City (home to the United Nations), London, and Washington, D.C. Each of these cities plays a crucial role on the world stage, but there is no single city that serves as the capital of the entire world.', response_metadata={'token_usage': {'completion_tokens': 95, 'prompt_tokens': 15, 'total_tokens': 110, 'prompt_tokens_details': {'cached_tokens': 0, 'audio_tokens': 0}, 'completion_tokens_details': {'reasoning_tokens': 0, 'audio_tokens': 0, 'accepted_prediction_tokens': 0, 'rejected_prediction_tokens': 0}}, 'model_name': 'gpt-4o-mini-2024-07-18', 'system_fingerprint': 'fp_0ba0d124f1', 'finish_reason': 'stop', 'logprobs': None}, id='run-78e59ba3-36f9-4689-a8b7-791266b4d593-0', usage_metadata={'input_toke

In [35]:
llm.invoke("What is the material used?")

2024-11-06T16:45:25.718Z INFO    : httpx - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


AIMessage(content='Could you please provide more context or specify what you are referring to? The material used can vary widely depending on the subject, such as construction, clothing, technology, or art.', response_metadata={'token_usage': {'completion_tokens': 36, 'prompt_tokens': 13, 'total_tokens': 49, 'prompt_tokens_details': {'cached_tokens': 0, 'audio_tokens': 0}, 'completion_tokens_details': {'reasoning_tokens': 0, 'audio_tokens': 0, 'accepted_prediction_tokens': 0, 'rejected_prediction_tokens': 0}}, 'model_name': 'gpt-4o-mini-2024-07-18', 'system_fingerprint': 'fp_9b78b61c52', 'finish_reason': 'stop', 'logprobs': None}, id='run-eab6bc66-bb48-4616-8a1f-e2e40d4c0d94-0', usage_metadata={'input_tokens': 13, 'output_tokens': 36, 'total_tokens': 49})

In [36]:
llm.invoke("What was the print temperature, speed, nozzle diameter?")

2024-11-06T16:46:04.109Z INFO    : httpx - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


AIMessage(content='To provide accurate information, I need to know the specific material or 3D printer you are referring to, as print temperature, speed, and nozzle diameter can vary widely depending on the type of filament (like PLA, ABS, PETG, etc.) and the printer model. \n\nFor example, here are some general guidelines for common materials:\n\n1. **PLA (Polylactic Acid)**\n   - Print Temperature: 180-220°C\n   - Print Speed: 40-100 mm/s\n   - Nozzle Diameter: 0.4 mm (common, but can vary)\n\n2. **ABS (Acrylonitrile Butadiene Styrene)**\n   - Print Temperature: 220-250°C\n   - Print Speed: 40-80 mm/s\n   - Nozzle Diameter: 0.4 mm (common, but can vary)\n\n3. **PETG (Polyethylene Terephthalate Glycol-Modified)**\n   - Print Temperature: 220-250°C\n   - Print Speed: 40-100 mm/s\n   - Nozzle Diameter: 0.4 mm (common, but can vary)\n\nIf you provide more details about the specific material or printer, I can give you more tailored information!', response_metadata={'token_usage': {'comple

## 5. RAG 

![RAG system](notebook_images/rag-system.png "RAG system")

### 5. RAG Chain
Combining it all together

- `context` holds the documents returned by the retriever (vector DB), formatted by `format_docs`
- `RunnablePassthrough()` passes the user query through unchanged as `question`
- `format_docs` joins the retrieved documents into a single string
- `prompt` is the prompt template that is filled and sent to the LLM
- `llm` calls the LLM
- `StrOutputParser()` extracts the response text from the LLM output as a plain string
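The data flow through these steps can be sketched in plain Python, without LangChain or any network call. Everything below is a hypothetical stand-in: `toy_retriever` ranks by keyword overlap instead of embeddings, `fake_llm` just echoes the prompt, and `pipe` imitates the `|` composition.

```python
from functools import reduce


def pipe(*steps):
    """Compose steps left to right, similar in spirit to chaining with `|`."""
    def run(value):
        return reduce(lambda acc, step: step(acc), steps, value)
    return run


toy_corpus = [
    "The deposition nozzle had a 1.63 mm diameter.",
    "Yield stress is measured in pascals.",
]


def _words(text):
    return {word.strip(".,?!").lower() for word in text.split()}


def toy_retriever(question, top_k=1):
    """Rank corpus entries by shared words (stand-in for vector search)."""
    ranked = sorted(
        toy_corpus,
        key=lambda doc: len(_words(doc) & _words(question)),
        reverse=True,
    )
    return ranked[:top_k]


def toy_format_docs(docs):
    return "\n\n".join(docs)


def build_inputs(question):
    # "context" comes from retrieval, "question" is passed through unchanged
    return {"context": toy_format_docs(toy_retriever(question)), "question": question}


def toy_prompt(inputs):
    return f"Context:\n{inputs['context']}\n\nQuestion: {inputs['question']}"


def fake_llm(prompt_text):
    # a real LLM call would go here; this fake model only echoes its input
    return {"content": "FAKE ANSWER, prompt received:\n" + prompt_text}


def parse_output(message):
    return message["content"]


toy_chain = pipe(build_inputs, toy_prompt, fake_llm, parse_output)
toy_answer = toy_chain("What nozzle diameter was used?")
print(toy_answer)
```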

In [29]:
# rag chain
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

openai_rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

In [30]:
# call openai rag chain
openai_rag_chain.invoke("What material is used?")


2024-11-06T12:13:52.648Z INFO    : httpx - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2024-11-06T12:13:54.651Z INFO    : httpx - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


'The materials used are poly(vinylidene fluoride) (PVDF) and poly(vinylidene fluoride-co-hexaﬂuoropropylene) (PVDF-HFP). These polymers are prepared through direct-ink-writing techniques for applications in sensing and energy storage. They exhibit various morphologies and properties suitable for advanced electronic devices.'

In [31]:
openai_rag_chain.invoke("What is the yield stress value or unit?")


2024-11-06T12:14:27.490Z INFO    : httpx - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2024-11-06T12:14:28.708Z INFO    : httpx - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


'The yield stress values for the materials mentioned are 1000 Pa for NIPAM/Lap and 300 Pa for NIPAM/Lap/NaPyrPh. Yield stress is typically measured in pascals (Pa). It represents the minimum stress required to induce flow in a material.'

In [32]:
openai_rag_chain.invoke("Is there any epoxy, epoxy-based resin?")

2024-11-06T12:15:04.344Z INFO    : httpx - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2024-11-06T12:15:05.890Z INFO    : httpx - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


'Yes, there are epoxy-based resins, including those used in 3D printing technologies. These resins can be formulated with various additives, such as acrylates and nanoparticles, to enhance their properties for specific applications. The context mentions the use of epoxy oligomers and the development of inks for direct-ink write (DIW) printing methods.'

In [33]:
openai_rag_chain.invoke("What was the print temperature, speed, nozzle diameter?")

2024-11-06T12:15:08.719Z INFO    : httpx - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2024-11-06T12:15:11.185Z INFO    : httpx - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


'The print temperature was initially set to 45°C and later increased to 80°C for post-curing. The printing speed was a maximum of 180 mm/s, and the nozzle diameter used was 1.63 mm. Additionally, a 22 GA nozzle with an inner diameter of 0.41 mm was also mentioned for different printed architectures.'

In [None]:
# call openai rag chain
# This should ideally answer "I don't know", because rlm/rag-prompt tells the model to say so when it does not know the answer
openai_rag_chain.invoke("What is the capital of the world?")

### RAG - LLM
- Let's try the Llama3.2:3.2B model.
- It is hosted on an NCSA system.

In [40]:
# call ollama llama3.2:latest

from langchain_community.llms import Ollama

ollama_api_key = os.getenv('OLLAMA_API_KEY')
ollama_headers = {"Authorization": f"Bearer {ollama_api_key}"}

# create an Ollama model
ollamallm: Ollama = Ollama(
    base_url="https://ollama.software.ncsa.illinois.edu/ollama",
    model="llama3.2:latest",
    headers=ollama_headers
    )

In [41]:
# call llama3.2 model
# No RAG. Simple LLM call.
ollamallm.invoke("What is the capital of the world?")

# Notice the difference in response format: a plain string instead of an AIMessage.

'There is no single "capital of the world". Each country has its own capital city, and there isn\'t a universally recognized capital that represents all nations.\n\nThat being said, if we consider the headquarters of international organizations or global institutions, some examples could be:\n\n* New York City (United Nations Headquarters)\n* Geneva (Council of Europe)\n* Brussels (European Union)\n* Paris (Organization for Economic Co-operation and Development)\n\nHowever, it\'s essential to note that these cities are not necessarily " capitals" in the classical sense, as they don\'t serve as the capital of a country.\n\nIf you\'re looking for information on international organizations or global institutions, I\'d be happy to help!'

In [42]:
# ollama rag chain
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

ollama_rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | ollamallm
    | StrOutputParser()
)

In [43]:
# call ollama rag chain
ollama_rag_chain.invoke("Who is the president of USA?")

2024-11-06T17:00:03.244Z INFO    : httpx - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"


"I can help you with that. However, I don't see a specific problem or question in the text provided. The text appears to be a collection of citations from various scientific papers and journals.\n\nIf you could provide more context or clarify what you would like me to do (e.g., summarize the papers, identify key findings, etc.), I'll do my best to assist you."

In [44]:
## adding sources to openai rag chain

from langchain_core.runnables import RunnableParallel

openai_rag_chain_from_docs = (
    RunnablePassthrough.assign(context=(lambda x: format_docs(x["context"])))
    | prompt
    | llm
    | StrOutputParser()
)

openai_rag_chain_with_source = RunnableParallel(
    {"context": retriever, "question": RunnablePassthrough()}
).assign(answer=openai_rag_chain_from_docs)
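The chain with sources returns a dictionary with `context`, `question`, and `answer` keys. A small helper can reduce the retrieved documents to a de-duplicated list of sources. The sketch below is an assumption-laden illustration: `list_sources` is a hypothetical helper, and `toy_result` uses `SimpleNamespace` objects as stand-ins for Document objects, with a placeholder answer.

```python
from types import SimpleNamespace


def list_sources(result):
    """Collect unique (source, row_page) pairs in retrieval order."""
    seen = []
    for doc in result["context"]:
        entry = (doc.metadata["source"], doc.metadata.get("row_page"))
        if entry not in seen:
            seen.append(entry)
    return seen


# toy result in the same shape as the chain output; the answer is a placeholder
toy_result = {
    "context": [
        SimpleNamespace(metadata={"source": "data/c7sm02362f.pdf", "row_page": 0}, page_content="..."),
        SimpleNamespace(metadata={"source": "data/c7sm02362f.pdf", "row_page": 0}, page_content="..."),
        SimpleNamespace(
            metadata={"source": "data/He_2022_Int._J._Extrem._Manuf._4_015301.pdf", "row_page": 11},
            page_content="...",
        ),
    ],
    "question": "Is there any epoxy, epoxy-based resin?",
    "answer": "(placeholder answer)",
}

for source, page in list_sources(toy_result):
    print(f"{source} (page index {page})")
```

Because the helper only reads the `metadata` attribute, it should also work on the documents returned in the `context` key of the real chain output.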

In [45]:
# call openai rag chain with source
# this will return the answer and the sources (context)
openai_rag_chain_with_source.invoke("What material is used?")

2024-11-06T17:02:16.314Z INFO    : httpx - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2024-11-06T17:02:17.952Z INFO    : httpx - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


{'context': [Document(metadata={'source': 'data/Macro Materials   Eng - 2021 - Pinto - Direct‐Ink‐Writing of Electroactive Polymers for Sensing and Energy Storage.pdf', 'row_page': 10, '_id': '58d542e5e82344668a92af9695791a6d', '_collection_name': 'delta-collection'}, page_content='www.advancedsciencenews.com www.mame-journal.de\n[23] S.P.Muduli,S.Parida,S.K.Rout,S.Rajput,M.Kar, Mater.Res.Express\n2019,6,095306.\n[24] D.Yang,Y.Chen, J.Mater.Sci.Lett. 1987,6,599.\n[25] L.Ruan,X.Yao,Y.Chang,L.Zhou,G.Qin,X.Zhang, Polymers2018,\n10,228.\n[26] a)B.Wang,H.-X.Huang, Composites,PartA 2014,66,16;b)K.Polat,\nAppl.Phys.A 2020,126,497.\n[ 2 7 ]R .G o n ç a l v e s ,D .M i r a n d a ,A .M .A l m e i d a ,M .M .S i l v a ,J .M .\nMeseguer-Dueñas,J.L.G.Ribelles,S.Lanceros-Méndez,C.M.Costa,\nSustainableMater.Technol. 2019,21,e00104.\n[28] a)P.Costa,J.Silva,V.Sencadas,C.M.Costa,F.W.J.VanHattum,J.G.\nRocha, S. Lanceros-Mendez, Carbon2009,47, 2590; b) J. Vicente, P.\nCosta,S.Lanceros-Mendez,J.M.Abete,A.I

In [46]:
openai_rag_chain_with_source.invoke("What is the yield stress value or unit?")

2024-11-06T17:02:28.159Z INFO    : httpx - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2024-11-06T17:02:29.479Z INFO    : httpx - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


{'context': [Document(metadata={'source': 'data/Adv Materials Inter - 2018 - Rauzan - A Printing‐Centric Approach to the Electrostatic Modification of Polymer Clay.pdf', 'row_page': 5, '_id': '81f0eabfa59f4a2997d4d75efb1c4076', '_collection_name': 'delta-collection'}, page_content='www.advancedsciencenews.com© 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim 1701579 (6 of 8)\nwww.advmatinterfaces.deThe complex moduli of the composite inks were measured \nwith a 109.5 µ m gap to determine how much pressure must \nbe applied to the material before it flows. The minimum \nstress to induce flow, or yield stress, is defined as the stress \nat G ʹ–G″ crossover (Figure 4c), and those for NIPAM/Lap \nand NIPAM/Lap/NaPyrPh are 1000 and 300 Pa, respectively. The yield stress measured in the bulk rheology can be com-\npared with the apparent shear stress applied at the wall of \nthe nozzle (\nτy) during extrusion of filament as calculated by \nEquation (1)PP r\nL()\n2flow atmτ=−\nγ  (1)\nwhere Pfl

In [47]:
openai_rag_chain_with_source.invoke("Is there any epoxy, epoxy-based resin?")

2024-11-06T17:02:38.844Z INFO    : httpx - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2024-11-06T17:02:40.575Z INFO    : httpx - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


{'context': [Document(metadata={'source': 'data/c7sm02362f.pdf', 'row_page': 0, '_id': '81a838c271e649fab6ecdedcabbb06c2', '_collection_name': 'delta-collection'}, page_content='the past, epoxy thermosets were directly printed using epoxy\ndiacrylate viaphoto-induced radical polymerization or epoxy\noligomer viaUV light-induced cationic polymerization.\n17Although\nt h eh i g hm o d u l u sa n dy i e l ds t r e s sc a nb eo b t a i n e d ,t h es t r a i na tbreak is low at room temperatur e. In addition, the cationic\ninitiators used in photocurable epoxy are more expensive, makingthem hard to use in large scale applications. As an alternative,other low-cost 3D printing technologies such as the direct-inkwrite (DIW) approach\n18,19have been developed to print thermally\ncurable epoxy resin. In the work by Lewis’ group,19the cellular\nstructures (in a 2D pattern and with several layers in thethickness direction) were printed using epoxy compositesink reinforced by chopped carbon fibers 

In [48]:
openai_rag_chain_with_source.invoke("What was the print temperature, speed, nozzle diameter?")

2024-11-06T17:02:43.773Z INFO    : httpx - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2024-11-06T17:02:46.069Z INFO    : httpx - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


{'context': [Document(metadata={'source': 'data/He_2022_Int._J._Extrem._Manuf._4_015301.pdf', 'row_page': 11, '_id': '9087b53a9aca44fca4ee8b55ba9f48b0', '_collection_name': 'delta-collection'}, page_content='using a rheometer (TA Instruments, AR-G2, New Castle, DE,\nUSA).Therheometerwasequippedwithparallelplateswitha\ndiameterof20mmdiameter.Totesttheinkviscosity,theresin\nwas filled in the 1 mm gap between the two plates. The shear\nrate changed between 10−3and 200 s−1. To determine the\nmodulus, the parallel plates oscillated at 1 Hz, and the stress\nlevel ranged between 0.01 Pa and 104Pa.\n5.2. DIW 3D printing\nThe 3D printing was performed using a customized DIW\nprinter. The printable ink was loaded in a plastic syringe with\nadepositionnozzle(1.63mmdiameter).Thesyringewascon-\nnected to a digital pump (Ultimus V high precision dispenser,\nelectron fusion devices (EFD)) to control the deposition pres-\nsure at 80 psi. A Makebot moving stage was used to control\nthe motion of the de

## RAG Steps

1. Prepare data
2. Create a vector store and insert the data
3. Search the vector store and retrieve relevant documents
4. Call the LLM with the user query and the retrieved documents
5. Return the LLM response to the user