# 3. LLM Pipeline
- based on [LangChain QA](https://python.langchain.com/docs/use_cases/question_answering/) and [LangChain RAG over in-memory documents](https://python.langchain.com/docs/use_cases/question_answering/in_memory_question_answering)
- **make sure to place your OpenAI API key in `.env`**

### ToDos
- use LLM to generate answer to user
- retrieve all (relevant) chunks from all retrieved sources
    - check for each chunk whether it is relevant for the user query
    - chain: let the LLM summarize each chunk or decide on its relevance
    - use [refine](https://python.langchain.com/docs/modules/chains/document/refine) or [map reduce](https://python.langchain.com/docs/modules/chains/document/map_reduce)
- rewrite user query / get additional queries for more relevant retrieval
    - see [MultiQueryRetriever](https://python.langchain.com/docs/modules/data_connection/retrievers/MultiQueryRetriever)
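
The last ToDo (several query variants for better retrieval) comes down to running each variant through the store and merging the hits without duplicates. A minimal offline sketch of that merge step follows. `retrieve_for_queries` and `fake_search` are hypothetical names, and the toy corpus is made up; real code would call something like `vectordb.similarity_search(q, k=4)` and let an LLM generate the variants.

```python
from typing import Callable, Iterable


def retrieve_for_queries(
    queries: Iterable[str],
    search: Callable[[str], list],
) -> list:
    """Run every query through `search` and merge the hits without duplicates."""
    seen = set()
    merged = []
    for q in queries:
        for hit in search(q):
            # A hit counts as a duplicate if source and text are both identical
            key = (hit["source"], hit["text"])
            if key not in seen:
                seen.add(key)
                merged.append(hit)
    return merged


# Hypothetical stand-in for a vector store search, so the sketch runs offline
def fake_search(q: str) -> list:
    corpus = {
        "augmented reality": [
            {"source": "acke", "text": "Augmented Reality, Mixed Reality"},
        ],
        "AR research": [
            {"source": "acke", "text": "Augmented Reality, Mixed Reality"},
            {"source": "pach", "text": "Windenergieanlagen mit Augmented Reality"},
        ],
    }
    return corpus.get(q, [])


merged_hits = retrieve_for_queries(["augmented reality", "AR research"], fake_search)
print([hit["source"] for hit in merged_hits])
```

The order of first appearance is preserved, so hits from the original query stay ahead of hits that only the rewritten queries found.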


## Prerequisites

In [1]:
import os
import openai

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv())

openai.api_key = os.environ['OPENAI_API_KEY']

In [2]:
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Chroma

embedding_function = OpenAIEmbeddings(show_progress_bar=True)

### Load database from disk

In [3]:
vectordb = Chroma(persist_directory="./chroma_db", embedding_function=embedding_function)

In [4]:
# The number of entries in the database equals the number of chunks created previously
print(vectordb._collection.count())

57834


## Query the database

In [5]:
# German for: "who works on augmented reality?"
query = "wer setzt sich mit augmented reality auseinander?"

### Using `semantic similarity search`
- we can specify `k` - the number of documents (here: chunks) retrieved
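
To make the role of `k` concrete, here is a toy sketch of top-`k` retrieval over hand-made 2-D vectors. It assumes cosine similarity as the score; the actual store may use a different metric (for example an L2 distance), and real embeddings have far more dimensions. `cosine_similarity` and `top_k` are illustrative helpers, not part of LangChain or Chroma.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, chunk_vecs, k):
    """Return the indices of the k chunks most similar to the query, best first."""
    scored = [(cosine_similarity(query_vec, v), i) for i, v in enumerate(chunk_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [i for _, i in scored[:k]]


toy_query = [1.0, 0.0]
toy_chunks = [[0.9, 0.1], [0.0, 1.0], [0.7, 0.7], [1.0, 0.0]]
print(top_k(toy_query, toy_chunks, k=2))  # → [3, 0]
```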

In [6]:
results_sss = vectordb.similarity_search(query, k=4)


In [7]:
# Example of first result retrieved
print(results_sss[0].page_content)

Augmented Reality, Mixed Reality, Virtual Reality, Information Visualization, Digital Health, Mobile Apps, Audio & Music Processing. Aus- und Fortbildung Dr. oec. publ. UniZHResearch Assistant at Multimedia Lab / Institut für Informatik / Universität Zürich Beruflicher Werdegang Gründer und ehem. Geschäftsführer der Perspectix AG Mitglied in Netzwerken IEEE ACM Schweizerische Gesellschaft für Medizinische Informatik (SGMI) Projekte Mixed Reality im Anlagenbau / ProjektleiterIn / Projekt laufend


In [8]:
print(results_sss[0].metadata["source"])

acke


In [9]:
# Number of different sources retrieved
sources = set()
for result in results_sss:
    sources.add(result.metadata["source"])

print(len(sources))
print(sources)

3
{'weei', 'acke', 'webw'}


In [10]:
len(results_sss)

4

### Using `maximum marginal relevance`
- strives to achieve both relevance to the query and diversity among the results
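
The trade-off can be sketched as a greedy loop: each step picks the candidate with the best `lambda_mult * relevance - (1 - lambda_mult) * redundancy`, where redundancy is the highest similarity to anything already picked. This is a simplified illustration on toy 2-D vectors, not LangChain's implementation; `unit`, `cosine` and `mmr` are hypothetical helpers. As far as we know, `max_marginal_relevance_search` exposes the same trade-off through its `lambda_mult` argument and pre-selects candidates via `fetch_k`.

```python
import math


def unit(deg):
    """2-D unit vector at the given angle, used as a toy embedding."""
    rad = math.radians(deg)
    return (math.cos(rad), math.sin(rad))


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def mmr(query_vec, cand_vecs, k, lambda_mult=0.5):
    """Greedy MMR: relevance to the query versus similarity to earlier picks."""
    selected = []
    remaining = list(range(len(cand_vecs)))
    while remaining and len(selected) < k:

        def score(i):
            relevance = cosine(query_vec, cand_vecs[i])
            redundancy = max(
                (cosine(cand_vecs[i], cand_vecs[j]) for j in selected),
                default=0.0,
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


mmr_query = unit(0)
mmr_chunks = [unit(10), unit(15), unit(-40)]  # chunks 0 and 1 are near-duplicates

print(mmr(mmr_query, mmr_chunks, k=2))                   # → [0, 2]
print(mmr(mmr_query, mmr_chunks, k=2, lambda_mult=1.0))  # → [0, 1]
```

With `lambda_mult=1.0` the redundancy term vanishes and the result matches plain similarity search, which returns both near-duplicates.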

In [11]:
results_mmr = vectordb.max_marginal_relevance_search(query, k=4)


In [12]:
# Number of different sources retrieved
sources_mmr = set()
for result in results_mmr:
    sources_mmr.add(result.metadata["source"])

print(len(sources_mmr))
print(sources_mmr)

4
{'shmt', 'mede', 'acke', 'pach'}


In [13]:
results_mmr[2]

Document(page_content='von Windenergieanlagen mit Augmented Reality / Teammitglied / Projekt abgeschlossen Wasserregime Sur Chaunt Blais, St. Moritz / Teammitglied / Projekt abgeschlossen Beschneiung Minschuns - Umweltverträglichkeitsbericht (UVB) / Teammitglied / Projekt abgeschlossen Umweltverträglichkeitsbericht (UVB) Umfahrungsstrasse Sta. Maria, Kt. GR / Teammitglied / Projekt abgeschlossen Publikationen Beiträge in wissenschaftlicher Zeitschrift, peer-reviewed Bergauer, Miro; Dembicz, Iwona; Boch, Steffen;', metadata={'source': 'pach'})

## Approach 1: Building the context for our LLM prompt based on chunks
- we can retrieve all chunks for a retrieved source (`"shorthandSymbol"`) and use these as inputs to `refine` our context / prompt
- **TBD**
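
Conceptually, `refine` drafts an answer from the first chunk and then asks the LLM once per remaining chunk to improve that answer. A minimal offline sketch of the loop follows. The prompt wording and `refine_answer` are our own illustration, not LangChain's templates, and `fake_llm` is a hypothetical stand-in for a real model call.

```python
from typing import Callable


def refine_answer(chunks: list, question: str, llm: Callable[[str], str]) -> str:
    """Draft an answer from the first chunk, then refine it chunk by chunk."""
    if not chunks:
        return ""
    answer = llm(f"Context:\n{chunks[0]}\n\nQuestion: {question}")
    for chunk in chunks[1:]:
        answer = llm(
            f"Existing answer:\n{answer}\n\n"
            f"New context:\n{chunk}\n\n"
            "Refine the existing answer if the new context helps; "
            f"otherwise repeat it unchanged.\nQuestion: {question}"
        )
    return answer


# Hypothetical stand-in for an LLM: it records each prompt and reports the call count
calls = []


def fake_llm(prompt: str) -> str:
    calls.append(prompt)
    return f"answer after {len(calls)} call(s)"


print(refine_answer(["chunk A", "chunk B", "chunk C"], "who works on AR?", fake_llm))
```

The loop makes one LLM call per chunk, so a source with 13 chunks costs 13 sequential calls.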

In [47]:
complete_query = vectordb.get(where={"source": results_mmr[3].metadata["source"]})
all_chunks_from_single_source = complete_query["documents"]

In [None]:
print(complete_query)

In [48]:
print(len(all_chunks_from_single_source))
print(all_chunks_from_single_source[0])

13
Martin Schuler Martin Schuler ZHAW Angewandte Linguistik Institut für Übersetzen und Dolmetschen Theaterstrasse 15c 8400 Winterthur +41 (0) 58 934 62 21 martin.schuler@zhaw.ch Persönliches Profil Leitungsfunktion Leitung Usability-Labor Tätigkeit an der ZHAW Wiss. Mitarbeiter im Forschungs- und Arbeitsbereich Technikkommunikationaktuelle Lehrtätigkeit an der ZHAWVorlesungen- Usability mit Schwerpunkt Web-Usability- LokalisierungSeminare- Usability-Testing- Projekt in der Technischen


In [49]:
from langchain.schema import Document

In [50]:
chunk_list = []
for i, chunk in enumerate(all_chunks_from_single_source):
    # Use the chunk's position as its "source" so the chain can cite it
    chunk_list.append(Document(page_content=chunk, metadata={"source": i}))

In [51]:
len(chunk_list)

13

In [52]:
from langchain.chains.qa_with_sources.loading import load_qa_with_sources_chain
from langchain.llms import OpenAI

# Load the refine QA chain
chain = load_qa_with_sources_chain(OpenAI(), chain_type="refine", verbose=True)

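# Define the input documents and the question.
# The German prompt says: "Answer in German. Your answer should contain only what helps
# answer the question. If nothing helps, leave your answer empty. The question is:
# why was this person found with this search query: {query}"
docs = chunk_list
question = f"antworte auf deutsch. deine antwort soll nur das enthalten, was zur beantwortung der frage beiträgt. wenn nichts dazu beiträgt, dann lasse deine antwort einfach leer. die frage lautet: warum wurde diese person gefunden mit dieser suchanfrage: {query}"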

# Run the refine QA chain
result = chain({"input_documents": docs, "question": question}, return_only_outputs=True)

# Retrieve the output answer
answer = result["output_text"]

print(answer)



> Entering new RefineDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
Context information is below.
---------------------
Content: Martin Schuler Martin Schuler ZHAW Angewandte Linguistik Institut für Übersetzen und Dolmetschen Theaterstrasse 15c 8400 Winterthur +41 (0) 58 934 62 21 martin.schuler@zhaw.ch Persönliches Profil Leitungsfunktion Leitung Usability-Labor Tätigkeit an der ZHAW Wiss. Mitarbeiter im Forschungs- und Arbeitsbereich Technikkommunikationaktuelle Lehrtätigkeit an der ZHAWVorlesungen- Usability mit Schwerpunkt Web-Usability- LokalisierungSeminare- Usability-Testing- Projekt in der Technischen
Source: 0
---------------------
Given the context information and not prior knowledge, answer the question: antworte auf deutsch. deine antwort soll nur das enthalten, was zur beantwortung der frage beiträgt. wenn nichts dazu beiträgt, dann lasse deine antwort einfach leer. die frage lautet: warum wurde diese per

### Use our database as a retriever for LangChain
- see [Agent with retrieval tool](https://python.langchain.com/docs/use_cases/question_answering/conversational_retrieval_agents)

In [5]:
retriever = vectordb.as_retriever()

**Retriever tool**

In [7]:
from langchain.agents.agent_toolkits import create_retriever_tool

In [9]:
tool = create_retriever_tool(
    retriever, 
    "search_zhaw_employees",
    "Searches the user profiles of the ZHAW employees."
)
tools = [tool]

**Agent constructor**

In [10]:
from langchain.agents.agent_toolkits import create_conversational_retrieval_agent

In [11]:
from langchain.chat_models import ChatOpenAI
llm = ChatOpenAI(temperature = 0)

In [12]:
agent_executor = create_conversational_retrieval_agent(llm, tools, verbose=True)

In [13]:
result = agent_executor({"input": "hi, im bob"})



> Entering new AgentExecutor chain...
Hello Bob! How can I assist you today?

> Finished chain.


In [14]:
result["output"]

'Hello Bob! How can I assist you today?'

In [15]:
result = agent_executor({"input": "whats my name?"})



> Entering new AgentExecutor chain...
Your name is Bob.

> Finished chain.


In [16]:
result = agent_executor({"input": "is there someone at ZHAW who works on augmented reality?"})



> Entering new AgentExecutor chain...

Invoking: `search_zhaw_employees` with `{'query': 'augmented reality'}`

[Document(page_content='Augmented reality for the visually impaired : navigation aid and scene semantics for indoor use cases [Paper]. In: 2022 IEEE 12th International Conference on Consumer Electronics (ICCE-Berlin). 12th IEEE International Conference on Consumer Technology (ICCE-Berlin), Berlin, Germany, 2-6 September 2022. IEEE. Verfügbar unter: https://doi.org/10.1109/ICCE-Berlin 56473.2022.9937109', metadata={'source': 'fath'}), Document(page_content='Augmented Reality, Mixed Reality, Virtual Reality, Information Visualization, Digital Health, Mobile Apps, Audio & Music Processing. Aus- und Fortbildung Dr. oec. publ. UniZHResearch Assistant at Multimedia Lab / Institut für Informatik / Universität Zürich Beruflicher Werdegang Gründer und ehem. Geschäftsführer der Perspectix AG Mitglied in Netzwerken IEEE ACM Schweizerische Gesellschaft für Medizinische Informatik (SGMI) Projekte Mixed Reality im Anlagenbau / ProjektleiterIn / Projekt laufend', metadata={'source': 'ack

In [18]:
result = agent_executor({"input": "can you tell me more about the people working on augmented reality at ZHAW?"})



> Entering new AgentExecutor chain...
I apologize, but I don't have access to specific information about the individuals working on augmented reality at ZHAW. The search results only provided limited details about their research papers and projects. If you are interested in learning more about the people working on augmented reality at ZHAW, I recommend visiting the ZHAW website or contacting the university directly for more information.

> Finished chain.


## Approach 2: Adding all user profiles to a custom chatbot with `embedchain`
- see *"4. Chat with a profile.ipynb"* for more background

In [14]:
from embedchain import App
zhaw_user_bot = App()

#### Add the source profile to the chatbot (based on `"shorthandSymbol"`)

In [15]:
for source in sources_mmr:
    zhaw_user_bot.add(f"https://www.zhaw.ch/de/ueber-uns/person/{source}")

Inserting batches from 0 to 20 in chromadb
Successfully saved https://www.zhaw.ch/de/ueber-uns/person/pach (DataType.WEB_PAGE). New chunks count: 20
Inserting batches from 0 to 16 in chromadb
Successfully saved https://www.zhaw.ch/de/ueber-uns/person/acke (DataType.WEB_PAGE). New chunks count: 16
Inserting batches from 0 to 9 in chromadb
Successfully saved https://www.zhaw.ch/de/ueber-uns/person/shmt (DataType.WEB_PAGE). New chunks count: 9
Inserting batches from 0 to 47 in chromadb
Successfully saved https://www.zhaw.ch/de/ueber-uns/person/mede (DataType.WEB_PAGE). New chunks count: 47


#### Query the chatbot

In [23]:
zhaw_user_bot.query("What did Mr. Ackermann work on so far?")

'Mr. Ackermann has worked on Human-Centered Computing at ZHAW Zürcher Hochschule für Angewandte Wissenschaften.'

**Caveats**

- this approach is probably bloated and unnecessary, but it serves as a proof of concept
- prompt needs to be changed to German
- retrieval does not work as intended
- no follow-up questions / no memory used
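
One possible way to address the missing memory is to keep the last few question/answer pairs and prepend them to each new query. The sketch below only illustrates that idea; `ChatHistory` is a hypothetical helper and not part of `embedchain`, and the resulting prompt would still have to be passed to the chatbot.

```python
class ChatHistory:
    """Keeps the last `max_turns` question/answer pairs for prompt building."""

    def __init__(self, max_turns=3):
        self.max_turns = max_turns
        self.turns = []

    def add(self, question, answer):
        self.turns.append((question, answer))
        # Drop the oldest turns so the prompt does not grow without bound
        self.turns = self.turns[-self.max_turns:]

    def build_prompt(self, question):
        lines = []
        for q, a in self.turns:
            lines.append(f"User: {q}")
            lines.append(f"Assistant: {a}")
        lines.append(f"User: {question}")
        return "\n".join(lines)


history = ChatHistory(max_turns=2)
history.add("hi, im bob", "Hello Bob! How can I assist you today?")
print(history.build_prompt("whats my name?"))
```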