This notebook shows how to do question answering with source references over a text corpus, using LangChain for the QA chain and Weaviate as the vector store.

Some imports:

In [None]:
import os

import weaviate
from langchain.chains.qa_with_sources import load_qa_with_sources_chain
from langchain.document_loaders import GutenbergLoader
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import Weaviate

Create a dataset:

In [None]:
# Grimms' Fairy Tales by Jacob Grimm and Wilhelm Grimm
loader = GutenbergLoader("https://www.gutenberg.org/files/2591/2591-0.txt")

documents = loader.load()
text_splitter = CharacterTextSplitter(
    chunk_size=500, chunk_overlap=0, length_function=len
)

docs = text_splitter.split_documents(documents)
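
To make the chunking step more concrete, here is a minimal pure-Python sketch of what a separator-based splitter with `chunk_overlap=0` roughly does: split on a separator, then greedily merge pieces until adding the next one would exceed the size limit. `greedy_chunks` is a hypothetical helper written for illustration, not LangChain's implementation, and the blank-line separator is an assumption.

```python
def greedy_chunks(text, chunk_size=500, separator="\n\n"):
    """Split `text` on `separator`, then greedily merge pieces into chunks.

    Hypothetical simplification: no overlap, and a single piece longer than
    `chunk_size` is kept whole instead of being split further.
    """
    pieces = [p for p in text.split(separator) if p.strip()]
    chunks, current = [], ""
    for piece in pieces:
        candidate = piece if not current else current + separator + piece
        if current and len(candidate) > chunk_size:
            # The next piece does not fit: close the current chunk.
            chunks.append(current)
            current = piece
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


# Ten short paragraphs of 53 characters each, separated by blank lines.
sample = "\n\n".join(f"Paragraph {i}: " + "x" * 40 for i in range(10))
chunks = greedy_chunks(sample, chunk_size=120)
print(len(chunks), [len(c) for c in chunks])
```

Because chunks are cut only at separators, their lengths vary and stay at or below the limit whenever the individual pieces are short enough.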


Set up Weaviate:

In [None]:
WEAVIATE_URL = "http://weaviate:8080"
client = weaviate.Client(
    url=WEAVIATE_URL,
    additional_headers={"X-OpenAI-Api-Key": os.environ["OPENAI_API_KEY"]},
)
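
The cell above raises a bare `KeyError` when `OPENAI_API_KEY` is not set. As a sketch, the connection settings could be collected in one place with a clearer error message. `weaviate_client_kwargs` is a hypothetical helper and `WEAVIATE_URL` is an assumed environment variable name; the default URL is the one used in this notebook.

```python
import os


def weaviate_client_kwargs(env=os.environ):
    """Build keyword arguments for `weaviate.Client` from a mapping of settings.

    Hypothetical helper: `WEAVIATE_URL` is an assumed variable name.
    """
    api_key = env.get("OPENAI_API_KEY")
    if not api_key:
        raise RuntimeError("Set OPENAI_API_KEY before connecting to Weaviate.")
    return {
        "url": env.get("WEAVIATE_URL", "http://weaviate:8080"),
        "additional_headers": {"X-OpenAI-Api-Key": api_key},
    }


# A plain dict stands in for os.environ so the sketch runs without real credentials.
kwargs = weaviate_client_kwargs({"OPENAI_API_KEY": "sk-example"})
print(kwargs["url"])
# With a running Weaviate instance: client = weaviate.Client(**kwargs)
```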


In [None]:
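# Caution: delete_all() removes every class, and the objects stored in it, from this Weaviate instance.
client.schema.delete_all()
client.schema.get()  # returns the now-empty schema; the result is discarded mid-cell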
schema = {
    "classes": [
        {
            "class": "Paragraph",
            "description": "A written paragraph",
            "vectorizer": "text2vec-openai",
            "moduleConfig": {"text2vec-openai": {"model": "ada", "type": "text"}},
            "properties": [
                {
                    "dataType": ["text"],
                    "description": "The content of the paragraph",
                    "moduleConfig": {
                        "text2vec-openai": {
                            "skip": False,
                            "vectorizePropertyName": False,
                        }
                    },
                    "name": "content",
                },
            ],
        },
    ]
}

client.schema.create(schema)
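
The class name (`"Paragraph"`) and the text property (`"content"`) have to match what is passed to the vector store later on. One way to keep them in sync, sketched here under that assumption, is to build the class definition from the same two values. `paragraph_class_schema` is a hypothetical helper; the dictionary it returns mirrors the schema in the cell above.

```python
def paragraph_class_schema(class_name="Paragraph", text_property="content"):
    """Return a Weaviate class definition equivalent to the one used above."""
    return {
        "class": class_name,
        "description": "A written paragraph",
        "vectorizer": "text2vec-openai",
        "moduleConfig": {"text2vec-openai": {"model": "ada", "type": "text"}},
        "properties": [
            {
                "name": text_property,
                "dataType": ["text"],
                "description": "The content of the paragraph",
                "moduleConfig": {
                    "text2vec-openai": {
                        "skip": False,
                        "vectorizePropertyName": False,
                    }
                },
            }
        ],
    }


sketch_schema = {"classes": [paragraph_class_schema()]}
property_names = [p["name"] for p in sketch_schema["classes"][0]["properties"]]
print(sketch_schema["classes"][0]["class"], property_names)
```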


In [None]:
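# Arguments: client, class (index) name, text property, metadata attributes to return with each hit
vectorstore = Weaviate(
    client, index_name="Paragraph", text_key="content", attributes=["source"]
)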


Store the docs:

In [None]:
text_meta_pair = [(doc.page_content, doc.metadata) for doc in docs]

texts, meta = list(zip(*text_meta_pair))

vectorstore.add_texts(texts, meta)


['7ada8598-fd83-4ead-994b-ef6fa79d1aac',
 'fb98e3e8-95c1-4bdb-82c2-c956dbf44d2e',
 'b6d9949e-3ba9-4913-b434-73ac439f6494',
 'd96c66ef-ce9d-4cb1-a5c8-5e9f430ff176',
 '53fda11a-9ccc-49d2-bf2c-95c0d0b63202',
 '0d5466e5-c183-45b2-8b98-0ae2a9055170',
 'f9fdac74-8388-45c1-9856-cbe408d1a1e5',
 '18d46baa-8b1c-4fd1-a6ec-74dadf1af85f',
 '389c5c05-02ed-4b01-9b37-c1699c102a1a',
 '241b368a-597a-4636-97ef-5fb483b1f5be',
 '6d47dc98-9ba3-4f3b-8ee2-e2f5cda7db97',
 '348b3098-74dc-4299-87a9-1fc7e074f6d8',
 'fcd6085d-f513-45f7-b3b0-23949f5543a7',
 '3f1ec97a-1224-46f4-84a1-95564215ea7d',
 '749bf142-d957-48e0-ac1f-fc230417d5a4',
 '77cd74ef-8b33-4c39-9085-562fcfc13203',
 '7f2b667e-5fd9-4509-9254-87d4be6cd79c',
 '3f225183-c45e-477d-a8ee-19d495fe7145',
 '22751080-6561-45f1-a8d8-8fedaca3cbe7',
 '0011d299-3938-4f0e-80ce-c6f9c0f5328b',
 '82f726ae-4c0b-4a6e-b145-8af4ffb92f99',
 'c8d9da31-7735-4612-a415-05101b29a11d',
 '5b7a5e3f-3d90-49ef-a241-96cc222b7e80',
 '26d24b76-382d-4cc4-9475-318591abd6e2',
 '9e129a4e-b3c8-
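
The `zip(*pairs)` idiom used in the cell above transposes a list of `(text, metadata)` pairs into one tuple of texts and one tuple of metadata dicts. The sketch below shows it on toy data, together with a hypothetical `batched` helper that could be used to insert a large corpus in smaller requests; the toy file names are made up for illustration.

```python
pairs = [
    ("first chunk", {"source": "a.txt"}),
    ("second chunk", {"source": "b.txt"}),
    ("third chunk", {"source": "a.txt"}),
]

# Transpose the list of pairs into two parallel tuples.
texts, metadatas = zip(*pairs)


def batched(items, batch_size):
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


text_batches = list(batched(texts, 2))
meta_batches = list(batched(metadatas, 2))
print(text_batches)

# With the vector store from this notebook, each batch could be stored separately:
# for t, m in zip(text_batches, meta_batches):
#     vectorstore.add_texts(list(t), list(m))
```

Note that `zip(*pairs)` fails to unpack when `pairs` is empty, so an empty corpus needs a guard before this step.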

Build the QnA chain:

In [9]:
template = """Given the following extracted parts of a long document and a question, create a final answer with references ("SOURCES"). 
If you don't know the answer, just say that you don't know. Don't try to make up an answer.
ALWAYS return a "SOURCES" part in your answer.

QUESTION: {question}
=========
{summaries}
=========
"""
PROMPT = PromptTemplate(template=template, input_variables=["summaries", "question"])

chain = load_qa_with_sources_chain(
    OpenAI(temperature=0), chain_type="stuff", prompt=PROMPT
)
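
With `chain_type="stuff"`, all retrieved chunks are placed into the single `{summaries}` slot of the prompt. The sketch below imitates that step in plain Python. The `Content: ... / Source: ...` layout is inferred from the prompt printed later in this notebook, and `render_stuff_prompt` is a hypothetical stand-in, not the chain's actual code.

```python
TEMPLATE = """QUESTION: {question}
=========
{summaries}
=========
"""


def render_stuff_prompt(question, chunks):
    """Fill the template with every chunk at once.

    `chunks` is an iterable of (content, source) pairs.
    """
    summaries = "\n\n".join(
        f"Content: {content}\nSource: {source}" for content, source in chunks
    )
    return TEMPLATE.format(question=question, summaries=summaries)


prompt_text = render_stuff_prompt(
    "Who threw breadcrumbs?",
    [
        ("The witch placed herself on the shore,", "tale-1"),
        ("threw breadcrumbs in,", "tale-1"),
    ],
)
print(prompt_text)
```

Since every chunk ends up in one prompt, the number and size of retrieved chunks is bounded by the model's context window.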


Ask a question:

In [11]:
question = "Why did the witch angry at the mirror?"

# retrieve the 5 most similar chunks (the keyword argument is `k`, not `top_k`)
docs = vectorstore.similarity_search(question, k=5)

# create answer
answer = chain({"input_documents": docs, "question": question})


Given the following extracted parts of a long document and a question, create a final answer with references ("SOURCES"). 
If you don't know the answer, just say that you don't know. Don't try to make up an answer.
ALWAYS return a "SOURCES" part in your answer.

QUESTION: Why did the witch angry at the mirror?
Content: said the young lady; ‘he has already lost his wealth.’ Then the witch


was very angry, and said, ‘Such a cloak is a very rare and wonderful


thing, and I must and will have it.’ So she did as the old woman told


her, and set herself at the window, and looked about the country and


seemed very sorrowful; then the huntsman said, ‘What makes you so sad?’


‘Alas! dear sir,’ said she, ‘yonder lies the granite rock where all the


costly diamonds grow, and I want so much to go there, that whenever I
Source: https://www.gutenberg.org/files/2591/2591-0.txt

Content: swimming in the middle of it. The witch placed herself on the shore,


threw breadcrumbs in, and went to endl

Show the answer:

In [16]:
print(answer["output_text"])


The witch was angry at the mirror because it told her that someone else was fairer than her. When she asked the mirror "Of all the ladies in the land, who is fairest, tell me, who?", the mirror replied "Thou, lady, art loveliest here, I ween; But lovelier far is the new-made queen." This made the witch so angry that she set out to see the bride. 

SOURCES: https://www.gutenberg.org/files/2591/2591-0.txt
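
The answer comes back as one string with the references appended after a `SOURCES:` marker. A minimal sketch for separating the two parts follows; `split_answer` is a hypothetical helper and assumes the model actually followed the prompt's instruction to end with a `SOURCES:` part.

```python
import re


def split_answer(output_text, marker="SOURCES:"):
    """Split the chain output into (answer, list_of_sources)."""
    if marker not in output_text:
        # The model ignored the instruction; return the text without sources.
        return output_text.strip(), []
    answer, _, sources = output_text.rpartition(marker)
    # Sources may be separated by commas and/or whitespace.
    source_list = [s for s in re.split(r"[,\s]+", sources) if s]
    return answer.strip(), source_list


example = (
    "The witch was angry at the mirror.\n\n"
    "SOURCES: https://www.gutenberg.org/files/2591/2591-0.txt"
)
answer_text, sources = split_answer(example)
print(answer_text)
print(sources)
```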


Show the input documents:

In [18]:
for doc in answer["input_documents"]:
    print(doc.page_content)
    print("*" * 80)


said the young lady; ‘he has already lost his wealth.’ Then the witch


was very angry, and said, ‘Such a cloak is a very rare and wonderful


thing, and I must and will have it.’ So she did as the old woman told


her, and set herself at the window, and looked about the country and


seemed very sorrowful; then the huntsman said, ‘What makes you so sad?’


‘Alas! dear sir,’ said she, ‘yonder lies the granite rock where all the


costly diamonds grow, and I want so much to go there, that whenever I
********************************************************************************
swimming in the middle of it. The witch placed herself on the shore,


threw breadcrumbs in, and went to endless trouble to entice the duck;


but the duck did not let herself be enticed, and the old woman had to


go home at night as she had come. At this the girl and her sweetheart


Roland resumed their natural shapes again, and they walked on the whole


night until daybreak. Then the maiden changed herself 
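
The printed chunks contain empty lines between every line of text, which spends part of each 500-character chunk on whitespace. One possible cleanup, sketched here, is to collapse runs of line breaks before splitting. `collapse_blank_lines` is a hypothetical helper; it also removes genuine paragraph breaks, so whether it is appropriate depends on the separator the splitter relies on.

```python
import re


def collapse_blank_lines(text):
    """Replace each run of line breaks (and surrounding spaces) with one newline."""
    return re.sub(r"[ \t]*(?:\r?\n[ \t]*)+", "\n", text).strip()


raw = (
    "was very angry, and said,\r\n\r\n\r\n"
    "thing, and I must and will have it.\n\n\n"
    "her, and set herself at the window"
)
cleaned = collapse_blank_lines(raw)
print(cleaned)
```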