# 3. LLM Pipeline
- based on [LangChain QA](https://python.langchain.com/docs/use_cases/question_answering/) and [LangChain RAG over in-memory documents](https://python.langchain.com/docs/use_cases/question_answering/in_memory_question_answering)
- **make sure to place your OpenAI API key in `.env`**

### ToDos
- use LLM to generate answer to user
- retrieve all (relevant) chunks from all retrieved sources
    - check for each chunk whether it is relevant for the user query
    - chain: let the LLM summarize each chunk or decide whether it is relevant
    - use [refine](https://python.langchain.com/docs/modules/chains/document/refine) or [map reduce](https://python.langchain.com/docs/modules/chains/document/map_reduce)
- rewrite user query / get additional queries for more relevant retrieval
    - see [MultiQueryRetriever](https://python.langchain.com/docs/modules/data_connection/retrievers/MultiQueryRetriever)
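
The last ToDo (additional queries for better retrieval) could be sketched as follows. This is a minimal, library-independent illustration and not the `MultiQueryRetriever` implementation: `rewrite` and `search` are hypothetical callables standing in for an LLM-based query rewriter and a vector store search, and the chunk dictionaries with `"source"` and `"text"` keys are an assumed shape.

```python
from typing import Callable, Iterable


def multi_query_retrieve(
    query: str,
    rewrite: Callable[[str], Iterable[str]],  # hypothetical: returns alternative phrasings
    search: Callable[[str], list[dict]],      # hypothetical: returns chunks for one query
) -> list[dict]:
    """Run the original query plus its rewrites and merge the unique chunks."""
    seen: set[tuple[str, str]] = set()
    merged: list[dict] = []
    for q in [query, *rewrite(query)]:
        for chunk in search(q):
            # A chunk is identified by its source and text (assumed keys).
            key = (chunk["source"], chunk["text"])
            if key not in seen:
                seen.add(key)
                merged.append(chunk)
    return merged
```

Merging by `(source, text)` keeps the first occurrence of each chunk, so results for the original query come first.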


## Prerequisites

In [1]:
import os
import openai

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv())

openai.api_key = os.environ['OPENAI_API_KEY']

In [2]:
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Chroma

embedding_function = OpenAIEmbeddings(show_progress_bar=True)

### Load database from disk

In [3]:
vectordb = Chroma(persist_directory="./chroma_db", embedding_function=embedding_function)

In [4]:
# The number of entries in the database equals the number of chunks created previously
print(vectordb._collection.count())

57834


## Query the database

In [5]:
# German for: "Who works on augmented reality?"
query = "wer setzt sich mit augmented reality auseinander?"

### Using `semantic similarity search`
- we can specify `k`, the number of documents (here: chunks) to retrieve

In [6]:
results_sss = vectordb.similarity_search(query, k=4)


In [7]:
# Example of first result retrieved
print(results_sss[0].page_content)

Augmented Reality, Mixed Reality, Virtual Reality, Information Visualization, Digital Health, Mobile Apps, Audio & Music Processing. Aus- und Fortbildung Dr. oec. publ. UniZHResearch Assistant at Multimedia Lab / Institut für Informatik / Universität Zürich Beruflicher Werdegang Gründer und ehem. Geschäftsführer der Perspectix AG Mitglied in Netzwerken IEEE ACM Schweizerische Gesellschaft für Medizinische Informatik (SGMI) Projekte Mixed Reality im Anlagenbau / ProjektleiterIn / Projekt laufend


In [8]:
print(results_sss[0].metadata["source"])

acke


In [9]:
# Number of different sources retrieved
sources = set()
for result in results_sss:
    sources.add(result.metadata["source"])

print(len(sources))
print(sources)

3
{'webw', 'acke', 'weei'}


In [10]:
len(results_sss)

4
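
To illustrate what a top-`k` similarity search does conceptually, here is a small pure-Python sketch. It assumes cosine similarity as the scoring function; the metric Chroma actually applies depends on how the collection was configured, so this shows the idea rather than the library's internals. The function names are illustrative.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list[float], doc_vecs: list[list[float]], k: int = 4) -> list[int]:
    """Return the indices of the k document vectors most similar to the query."""
    ranked = sorted(
        range(len(doc_vecs)),
        key=lambda i: cosine_similarity(query_vec, doc_vecs[i]),
        reverse=True,
    )
    return ranked[:k]
```

Because only similarity to the query is scored, several near-identical chunks from the same source can occupy the top positions.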

### Using `maximum marginal relevance`
- strives to achieve both relevance to the query and diversity among the results

In [11]:
results_mmr = vectordb.max_marginal_relevance_search(query, k=4)


In [12]:
# Number of different sources retrieved
sources_mmr = set()
for result in results_mmr:
    sources_mmr.add(result.metadata["source"])

print(len(sources_mmr))
print(sources_mmr)

4
{'pach', 'acke', 'shmt', 'mede'}


In [13]:
results_mmr[2]

Document(page_content='von Windenergieanlagen mit Augmented Reality / Teammitglied / Projekt abgeschlossen Wasserregime Sur Chaunt Blais, St. Moritz / Teammitglied / Projekt abgeschlossen Beschneiung Minschuns - Umweltverträglichkeitsbericht (UVB) / Teammitglied / Projekt abgeschlossen Umweltverträglichkeitsbericht (UVB) Umfahrungsstrasse Sta. Maria, Kt. GR / Teammitglied / Projekt abgeschlossen Publikationen Beiträge in wissenschaftlicher Zeitschrift, peer-reviewed Bergauer, Miro; Dembicz, Iwona; Boch, Steffen;', metadata={'source': 'pach'})
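
The trade-off between relevance and diversity can be made concrete with a short sketch of the greedy maximum marginal relevance selection. This is a simplified illustration under assumptions (cosine similarity, a weighting parameter named `lambda_mult` here, no pre-fetch step), not the code that `max_marginal_relevance_search` runs internally.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr(query_vec, doc_vecs, k=4, lambda_mult=0.5):
    """Greedily pick k indices, balancing query relevance against redundancy."""
    selected: list[int] = []
    candidates = list(range(len(doc_vecs)))
    while candidates and len(selected) < k:
        def score(i: int) -> float:
            relevance = cosine(query_vec, doc_vecs[i])
            # Redundancy: highest similarity to anything already selected.
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected), default=0.0
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

With `lambda_mult=1.0` the selection reduces to plain similarity ranking; lower values penalize chunks that resemble ones already chosen, which is consistent with the four distinct sources retrieved above.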

## Approach 1: Building the context for our LLM prompt based on chunks
- we can fetch every chunk that belongs to a retrieved source (identified by its `"shorthandSymbol"`) and feed these chunks to a `refine` chain that builds up our context / prompt step by step
- **TBD**

In [28]:
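# Fetch every stored chunk whose metadata "source" matches the first MMR result
complete_query = vectordb.get(where={"source": results_mmr[0].metadata["source"]})
# The raw chunk texts are stored under the "documents" key
all_chunks_from_single_source = complete_query["documents"]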

In [29]:
print(len(all_chunks_from_single_source))
print(all_chunks_from_single_source[0])

63
Prof. Dr. Irina Nast Prof. Dr. Irina Nast ZHAW Gesundheit Institut für Physiotherapie Katharina-Sulzer-Platz 9 8400 Winterthur +41 (0) 58 934 65 16 irina.nast@zhaw.ch Persönliches Profil Tätigkeit an der ZHAW Dozentin und wissenschaftliche Mitarbeiterinwww.zhaw.ch/de/gesundheit/institute-fachstelle/institut-fuer-physiotherapie.html Lehrtätigkeit in der Weiterbildung Reflektierte Praxis - Wissenschaft verstehen WBK Evidence-based Health Care: Methodische Grundlagen MAS Managed Health Care MAS
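
Since this approach is still marked as to be done, the following is only a sketch of how a refine-style loop over the chunks of one source might look. It is not LangChain's refine chain: `llm` is a hypothetical callable that maps a prompt string to a completion string, and the English prompt wording is an assumption.

```python
from typing import Callable, Sequence


def refine_answer(
    question: str,
    chunks: Sequence[str],
    llm: Callable[[str], str],  # hypothetical: prompt in, completion out
) -> str:
    """Build an answer from the first chunk, then refine it with each later chunk."""
    answer = ""
    for i, chunk in enumerate(chunks):
        if i == 0:
            prompt = f"Context:\n{chunk}\n\nQuestion: {question}\nAnswer:"
        else:
            prompt = (
                f"Question: {question}\n"
                f"Existing answer: {answer}\n"
                f"New context:\n{chunk}\n\n"
                "Refine the existing answer if the new context helps; "
                "otherwise repeat it unchanged."
            )
        answer = llm(prompt)
    return answer
```

The loop makes one LLM call per chunk, so a source with 63 chunks would need 63 sequential calls; filtering for relevant chunks first would reduce that cost.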


## Approach 2: Adding all user profiles to a custom chatbot with `embedchain`
- see *"4. Chat with a profile.ipynb"* for more background

In [14]:
from embedchain import App
zhaw_user_bot = App()

#### Add the source profiles to the chatbot (based on `"shorthandSymbol"`)

In [15]:
for source in sources_mmr:
    zhaw_user_bot.add(f"https://www.zhaw.ch/de/ueber-uns/person/{source}")

Inserting batches from 0 to 20 in chromadb
Successfully saved https://www.zhaw.ch/de/ueber-uns/person/pach (DataType.WEB_PAGE). New chunks count: 20
Inserting batches from 0 to 16 in chromadb
Successfully saved https://www.zhaw.ch/de/ueber-uns/person/acke (DataType.WEB_PAGE). New chunks count: 16
Inserting batches from 0 to 9 in chromadb
Successfully saved https://www.zhaw.ch/de/ueber-uns/person/shmt (DataType.WEB_PAGE). New chunks count: 9
Inserting batches from 0 to 47 in chromadb
Successfully saved https://www.zhaw.ch/de/ueber-uns/person/mede (DataType.WEB_PAGE). New chunks count: 47


#### Query the chatbot

In [23]:
zhaw_user_bot.query("What did Mr. Ackermann work on so far?")

'Mr. Ackermann has worked on Human-Centered Computing at ZHAW Zürcher Hochschule für Angewandte Wissenschaften.'

**Caveats**

- the approach is probably bloated and unnecessary, but it serves as a proof of concept
- the prompt needs to be changed to German
- retrieval does not work as intended
- no follow-up questions are possible because no memory is used
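
One possible way to address the missing memory is to prepend recent turns to every prompt. The sketch below is independent of `embedchain`: the class name `ChatWithMemory`, the `llm` callable, and the transcript format are all hypothetical choices made for illustration.

```python
from typing import Callable


class ChatWithMemory:
    """Keeps the last few question/answer pairs and prepends them to each prompt."""

    def __init__(self, llm: Callable[[str], str], max_turns: int = 5):
        self.llm = llm              # hypothetical: prompt in, completion out
        self.max_turns = max_turns  # number of past turns to include
        self.history: list[tuple[str, str]] = []

    def ask(self, question: str) -> str:
        recent = self.history[-self.max_turns:]
        transcript = "\n".join(f"User: {q}\nAssistant: {a}" for q, a in recent)
        # lstrip removes the leading newline when there is no history yet.
        prompt = f"{transcript}\nUser: {question}\nAssistant:".lstrip()
        answer = self.llm(prompt)
        self.history.append((question, answer))
        return answer
```

Capping the history at `max_turns` keeps the prompt size bounded, at the cost of forgetting older turns.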