# Test Docker-Compose Setup

In this notebook, we test the Docker Compose setup. First, open a terminal,
`cd` into the root directory of this repo, and run the following command (see
[README.md](../README.md) for more details):
```bash
# .env must be available at the root as shown in .env.example
docker compose up -d
```

You should see a series of logs indicating a successful run of your services.
If so, run the following cells to test your services.
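
Before running the cells, it can help to confirm that each service accepts TCP connections. The sketch below is not part of the original setup: `is_port_open` and `check_services` are hypothetical helpers, and the port numbers are assumptions taken from the URLs and ports used later in this notebook, so adjust them if your compose file maps different ports.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# Assumed ports, copied from the values used further down in this notebook.
SERVICES = {"qdrant": 6333, "embed_api": 8081, "gen_api": 8000}


def check_services(host: str = "localhost") -> dict:
    """Map each service name to whether its port accepts connections."""
    return {name: is_port_open(host, port) for name, port in SERVICES.items()}
```

Calling `check_services()` returns a dict of booleans, one per service. An open port only shows that something is listening; the cells below remain the real functional test.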

This notebook is deliberately minimal and as standalone as possible, so
feel free to reuse it and adapt it to your needs.

## Setup

### Imports

In [1]:
# import required libs
import arxiv
from rich import print
import qdrant_client
from langchain_community.embeddings import HuggingFaceHubEmbeddings
from langchain_core.documents import Document
from langchain_community.vectorstores import Qdrant
from langchain_community.llms import VLLMOpenAI
from langchain.prompts import ChatPromptTemplate, HumanMessagePromptTemplate, PromptTemplate
from langchain.chains import RetrievalQA

### Config

In [2]:
# set constants
QDRANT_URL = "http://localhost"
QDRANT_PORT = 6333  # REST port, matching the one passed to QdrantClient below
EMBED_API_URL = "http://localhost:8081"
GEN_API_URL = "http://localhost:8000/v1"

In [3]:
import dotenv
envvars = dotenv.dotenv_values()
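
The cells below read `EMBED_MODEL_ID` and `GEN_MODEL_ID` from `envvars`, so a missing key would only surface later as a `KeyError`. A minimal sketch of an early check follows; `missing_keys` and `REQUIRED_KEYS` are hypothetical names introduced for illustration, and the example dict is a stand-in rather than real configuration.

```python
# Keys that this notebook reads from the .env file further down.
REQUIRED_KEYS = ("EMBED_MODEL_ID", "GEN_MODEL_ID")


def missing_keys(values, required=REQUIRED_KEYS):
    """Return the required keys that are absent, None, or empty in `values`."""
    return [key for key in required if not values.get(key)]


# Stand-in mapping; in the notebook you would pass `envvars` instead.
example = {"EMBED_MODEL_ID": "some-embedding-model", "GEN_MODEL_ID": ""}
print(missing_keys(example))  # ['GEN_MODEL_ID']
```

If `missing_keys(envvars)` returns a non-empty list, compare your `.env` with `.env.example` before going further.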

## Connect to Microservices

In [4]:
# connect to embed_api
embed_model = HuggingFaceHubEmbeddings(model=EMBED_API_URL)

# test it!
text = "What is deep learning?"
query_result = embed_model.embed_query(text)
print(f"Embeddings from '{envvars['EMBED_MODEL_ID']}'")
query_result[:3]

[-0.0076529826, -0.02522551, -0.024398882]
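
The output above shows the first three components of the embedding vector. Similarity search compares such vectors, and cosine similarity is a common way to do it. The sketch below is a plain-Python illustration, not the computation Qdrant performs internally; which distance a collection uses depends on how it was created.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)
```

With the services running, you could compare two queries with `cosine_similarity(embed_model.embed_query("deep learning"), embed_model.embed_query("neural networks"))`. Values closer to 1 indicate more similar texts.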

In [5]:
# connect to gen_api
llm = VLLMOpenAI(
    openai_api_key="EMPTY",
    openai_api_base=GEN_API_URL,
    model_name=envvars["GEN_MODEL_ID"],
    # model_kwargs={"stop": ["."]},
)

# test it! (invoke() replaces the deprecated predict())
answer = llm.invoke("Write me a poem about Machine Learning.")
print(f"Generation from '{envvars['GEN_MODEL_ID']}'")
print(answer)

In [6]:
# connect to qdrant
client = qdrant_client.QdrantClient(
    url=QDRANT_URL,
    port=6333,
    # api_key="<qdrant-api-key>", # For Qdrant Cloud, None for local instance
)

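# test it: list the existing collections and delete them to start from a
# clean state (WARNING: this removes ALL collections on this Qdrant instance)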
collections = client.get_collections()
for c in collections.collections:
    client.delete_collection(c.name)
    print(f"Deleted collection '{c.name}'")

## Data Collection

In [7]:
# run a query through Arxiv API
search = arxiv.Search(
    query="ti:deep AND ti:learning",
    max_results=300,
    sort_by=arxiv.SortCriterion.Relevance,
)

# display some results (Search.results() is deprecated, so go through a Client)
for result in list(arxiv.Client().results(search))[:2]:
    print(f"--- '{result.title}' [{result.published}] ---")
    print(result.summary, "\n")
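
The query string above uses the arXiv field prefix `ti:` (title) combined with the boolean operator `AND`. A small sketch of building such a query from a list of terms follows; `build_title_query` is a hypothetical helper, and it only handles single-word terms (multi-word phrases would need quoting, which is not covered here).

```python
def build_title_query(terms, operator="AND"):
    """Join single-word terms into a title-only arXiv query string."""
    cleaned = [term.strip() for term in terms if term.strip()]
    if not cleaned:
        raise ValueError("at least one non-empty term is required")
    return f" {operator} ".join(f"ti:{term}" for term in cleaned)


print(build_title_query(["deep", "learning"]))  # ti:deep AND ti:learning
```

The returned string can be passed as the `query` argument of `arxiv.Search`.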

## Feed Vectorstore

In [8]:
# create Langchain documents from query results
docs = [
    Document(
        page_content=result.summary,
        metadata={
            "title": result.title,
            "publish_date": result.published,
        },
    )
    for result in arxiv.Client().results(search)
]

In [9]:
# store results in qdrant
qdrant = Qdrant.from_documents(
    docs, 
    embed_model, 
    url=QDRANT_URL, 
    # prefer_grpc=True, 
    collection_name="arxiv_documents",
    batch_size=16,
)
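
With `batch_size=16`, the documents are expected to be processed in groups of at most 16 rather than all at once. The sketch below only illustrates the chunking arithmetic under that assumption; `batched` is a hypothetical helper and not LangChain's own implementation. It uses 300 placeholder items, the upper bound set by `max_results` (the API may return fewer).

```python
def batched(items, batch_size):
    """Yield consecutive slices of `items` holding at most `batch_size` elements."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# 300 placeholder items stand in for the documents.
batches = list(batched(list(range(300)), 16))
print(len(batches), len(batches[-1]))  # 19 12
```

So 300 documents give 18 full batches plus a final batch of 12. A smaller batch size lowers the size of each request to the embedding service at the cost of more round trips.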

## QA using Qdrant & vLLM

In [10]:
# from langchain import hub
# prompt = hub.pull("rlm/rag-prompt", api_url="https://api.hub.langchain.com")

prompt = ChatPromptTemplate(
    input_variables=['context', 'question'], 
    messages=[
        HumanMessagePromptTemplate(
            prompt=PromptTemplate(
                input_variables=['context', 'question'], 
                template=(
                    "You are an assistant for question-answering tasks. "
                    "Use the following pieces of retrieved context to answer "
                    "the question. If you don't know the answer, just say "
                    "that you don't know. Use three sentences maximum and "
                    "keep the answer concise.\nQuestion: {question} "
                    "\nContext: {context} \nAnswer:"
                ),
            )
        )
    ]
)
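
To see what text the model receives, the same template can be filled in with plain string formatting. This is only an illustration of the placeholder substitution, not how LangChain renders chat prompts internally; `render_prompt` is a hypothetical helper and the context string is a made-up placeholder.

```python
# Same template text as in the prompt defined above.
TEMPLATE = (
    "You are an assistant for question-answering tasks. "
    "Use the following pieces of retrieved context to answer "
    "the question. If you don't know the answer, just say "
    "that you don't know. Use three sentences maximum and "
    "keep the answer concise.\nQuestion: {question} "
    "\nContext: {context} \nAnswer:"
)


def render_prompt(question: str, context: str) -> str:
    """Substitute the two placeholders of the template."""
    return TEMPLATE.format(question=question, context=context)


print(render_prompt("What is deep learning?", "<retrieved abstracts go here>"))
```

The prompt ends with `Answer:`, which cues a completion-style model to continue with the answer.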

In [11]:
# declare Q&A chain
qa_chain = RetrievalQA.from_chain_type(
    llm=llm, 
    retriever=qdrant.as_retriever(
        search_type="similarity",
        search_kwargs={
            "k": 4,
            "filter": None,
        }
    ), 
    chain_type_kwargs={"prompt": prompt},
    return_source_documents=True,
)
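
Conceptually, the chain keeps the `k` best-matching documents and concatenates their text into the `{context}` slot of the prompt. The sketch below mimics that flow in plain Python, assuming the default "stuff" behaviour of joining documents with blank lines; `top_k` and `stuff_context` are hypothetical helpers, and the scores and texts are made up for illustration.

```python
import heapq


def top_k(scored_chunks, k=4):
    """Return the k (score, text) pairs with the highest scores, best first."""
    return heapq.nlargest(k, scored_chunks, key=lambda pair: pair[0])


def stuff_context(chunks, separator="\n\n"):
    """Concatenate the text of the selected chunks into a single context string."""
    return separator.join(text for _, text in chunks)


# Made-up similarity scores and placeholder texts.
scored = [
    (0.91, "abstract A"),
    (0.42, "abstract B"),
    (0.77, "abstract C"),
    (0.65, "abstract D"),
    (0.30, "abstract E"),
    (0.88, "abstract F"),
]
selected = top_k(scored, k=4)
print([score for score, _ in selected])  # [0.91, 0.88, 0.77, 0.65]
print(stuff_context(selected))
```

Raising `k` gives the model more context but lengthens the prompt, which matters when the model's context window is small.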

In [12]:
# run Q&A chain
question = "Give examples of how Vision Transformers can be applied to real-life problems."
result = qa_chain.invoke({"query": question})

# display the answer
print(result["result"])

In [13]:
print(result)
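
Because the chain was created with `return_source_documents=True`, the result should also carry the retrieved documents under `"source_documents"`. A minimal sketch for listing them is shown below; `format_sources` is a hypothetical helper, and the stand-in result uses placeholder titles and dates so that the block runs without the services.

```python
from types import SimpleNamespace


def format_sources(result):
    """Build a numbered list of titles and dates from result['source_documents']."""
    lines = []
    for rank, doc in enumerate(result.get("source_documents", []), start=1):
        title = doc.metadata.get("title", "<untitled>")
        date = doc.metadata.get("publish_date", "unknown date")
        lines.append(f"{rank}. {title} ({date})")
    return "\n".join(lines)


# Stand-in for the chain output; the metadata keys match those set in this notebook.
fake_result = {
    "query": "example question",
    "result": "example answer",
    "source_documents": [
        SimpleNamespace(metadata={"title": "Placeholder Paper One", "publish_date": "2020-01-01"}),
        SimpleNamespace(metadata={"title": "Placeholder Paper Two"}),
    ],
}
print(format_sources(fake_result))
```

In the notebook, `print(format_sources(result))` would list the papers the answer was based on, which makes it easier to judge whether the retrieval step found relevant abstracts.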