This notebook shows how to do question answering with sources using LangChain and Weaviate.

Some imports:

In [1]:
import os

import weaviate
from langchain.chains.qa_with_sources import load_qa_with_sources_chain
from langchain.document_loaders import GutenbergLoader
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import Weaviate


Create a dataset:

In [2]:
# Grimms' Fairy Tales by Jacob Grimm and Wilhelm Grimm
loader = GutenbergLoader("https://www.gutenberg.org/files/2591/2591-0.txt")

documents = loader.load()
text_splitter = CharacterTextSplitter(
    chunk_size=500, chunk_overlap=0, length_function=len
)

docs = text_splitter.split_documents(documents)
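To make the `chunk_size` setting more concrete, here is a minimal pure-Python sketch of separator-based splitting: the text is cut at a separator and the pieces are greedily packed into chunks of at most `chunk_size` characters. This is an illustration only, not LangChain's implementation; the function name `simple_split` and the `"\n\n"` default separator are assumptions made for the example.

```python
def simple_split(text, chunk_size=500, separator="\n\n"):
    """Greedily pack separator-delimited pieces into chunks of at most chunk_size characters.

    A single piece longer than chunk_size is kept whole (this sketch never cuts inside a piece).
    """
    chunks, current = [], ""
    for piece in text.split(separator):
        if not piece:
            continue  # skip empty pieces produced by repeated separators
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= chunk_size or not current:
            current = candidate
        else:
            chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    for chunk in simple_split("aaaa\n\nbbbb\n\ncccc", chunk_size=10):
        print(repr(chunk))
```

With no overlap between chunks, a sentence that straddles a chunk boundary is only ever seen in halves by the retriever, which is worth keeping in mind when reading the retrieved passages later on.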


Set up Weaviate:

In [3]:
WEAVIATE_URL = "http://weaviate:8080"
client = weaviate.Client(
    url=WEAVIATE_URL,
    additional_headers={"X-OpenAI-Api-Key": os.environ["OPENAI_API_KEY"]},
)


In [4]:
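# WARNING: this removes every class (and its objects) from the Weaviate instance.
client.schema.delete_all()
# Fetches the current schema; the return value is not used here.
client.schema.get()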
schema = {
    "classes": [
        {
            "class": "Paragraph",
            "description": "A written paragraph",
            "vectorizer": "text2vec-openai",
            "moduleConfig": {"text2vec-openai": {"model": "ada", "type": "text"}},
            "properties": [
                {
                    "dataType": ["text"],
                    "description": "The content of the paragraph",
                    "moduleConfig": {
                        "text2vec-openai": {
                            "skip": False,
                            "vectorizePropertyName": False,
                        }
                    },
                    "name": "content",
                },
            ],
        },
    ]
}

client.schema.create(schema)
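As a quick sanity check on a schema like this one, the sketch below walks the plain dictionary and reports which properties of each class are left enabled for vectorization (that is, not marked `"skip": True`). It makes no calls to Weaviate; the helper name `vectorized_properties` and the trimmed `example_schema` are hypothetical and exist only for this example.

```python
def vectorized_properties(schema, module="text2vec-openai"):
    """Map each class name to the property names the given module does not skip."""
    result = {}
    for cls in schema.get("classes", []):
        names = []
        for prop in cls.get("properties", []):
            cfg = prop.get("moduleConfig", {}).get(module, {})
            if not cfg.get("skip", False):
                names.append(prop["name"])
        result[cls["class"]] = names
    return result


# Trimmed stand-in for the schema dictionary, used only for this example.
example_schema = {
    "classes": [
        {
            "class": "Paragraph",
            "properties": [
                {
                    "name": "content",
                    "dataType": ["text"],
                    "moduleConfig": {"text2vec-openai": {"skip": False}},
                },
            ],
        },
    ]
}

if __name__ == "__main__":
    print(vectorized_properties(example_schema))
```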


In [5]:
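# Arguments: the client, the class to query, the property holding the text,
# and the extra properties to return as metadata with each result.
vectorstore = Weaviate(client, "Paragraph", "content", ["source"])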


Store the docs:

In [6]:
text_meta_pair = [(doc.page_content, doc.metadata) for doc in docs]

texts, meta = list(zip(*text_meta_pair))

vectorstore.add_texts(texts, meta)


{'error': [{'message': 'update vector: failed with status: 500 error: The server had an error while processing your request. Sorry about that! You can retry your request, or contact us through our help center at help.openai.com if the error persists. (Please include the request ID 21172893251aa916c9fc019475c8d6b3 in your message.)'}]}
{'error': [{'message': 'update vector: failed with status: 500 error: The server had an error while processing your request. Sorry about that! You can retry your request, or contact us through our help center at help.openai.com if the error persists. (Please include the request ID 3c7297370852bce09e08ca27c8e5dfac in your message.)'}]}
{'error': [{'message': 'update vector: failed with status: 500 error: The server had an error while processing your request. Sorry about that! You can retry your request, or contact us through our help center at help.openai.com if the error persists. (Please include the request ID 337b7ae029fa283d5af4ce45553e72c5 in your mes



['0095e77d-f487-4332-bc72-d2b6a78bc576',
 '0c26d40a-4bea-457f-b96f-864b7d25ba4e',
 '156f01a9-e72e-4657-b6d7-4e45b297e2ae',
 '341c6a88-18ef-4b76-afc2-d086e4050e74',
 'a9f815e2-c6a2-403c-a022-98bf27e34071',
 'bed0f6bd-9a99-4104-9856-1c56e2458f14',
 'd3ea2684-c8af-4221-adce-2a85bdcbe208',
 'a6406b20-77df-45cf-b9cc-e8f76bad22b0',
 'cf80778a-b1f8-462b-9816-c87ed2b7edb7',
 '96285ab8-f681-4065-b2a6-48908fe13bc0',
 '62dec518-e42b-429d-b217-45473de2f66b',
 '53c1ee42-70c9-4119-a2cb-b98a1c812029',
 '92f9d26f-1264-4057-9adf-2100c32f22d6',
 '5ae43cef-1bf4-4a03-b638-3da8966cb071',
 'ff5b0a49-0492-4a65-82bc-d904df41684a',
 'c084aab3-9be1-44d2-85a2-073d739ef031',
 '7b72bfa7-8acf-4f08-951b-570c8cd53575',
 '972ecdf5-56db-4345-9f33-5a0425b36d90',
 '8832efd0-2ede-4085-827b-750dac2d10be',
 'd3a66557-3cfc-4c42-b956-747c5217d51f',
 '4c283c92-4457-4956-b91c-10a8063ed53f',
 '5c1509e5-d743-44c9-8eaa-ab128158b7c3',
 '30ebb595-3a13-4d29-bf26-12ee5cf2f09a',
 'bc8ff3f8-00d4-4efb-8fa1-83e5244606e3',
 'fecbb87b-62b4-
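
The 500 errors above are transient server-side failures, and the error message itself suggests retrying. One possible way to handle this is to resubmit only the items that failed, waiting a little longer after each round. The sketch below is generic: `submit` is a hypothetical callable that sends a batch and returns the items that failed, and how per-object failures are detected with the Weaviate client is deliberately left out because it depends on the client version.

```python
import time


def submit_with_retries(items, submit, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Submit items, then resubmit only the failed ones with exponential backoff.

    `submit(batch)` is a hypothetical callable returning the subset of `batch`
    that failed. Returns whatever is still failing after the last attempt.
    """
    pending = list(items)
    for i in range(attempts):
        pending = submit(pending)
        if not pending:
            return []
        if i < attempts - 1:
            sleep(base_delay * 2 ** i)  # wait 1x, 2x, 4x, ... the base delay
    return pending


if __name__ == "__main__":
    seen = []

    def flaky_submit(batch):
        # Fake backend for the demo: the item "b" fails on the first round only.
        seen.append(list(batch))
        return [x for x in batch if x == "b"] if len(seen) == 1 else []

    print(submit_with_retries(["a", "b", "c"], flaky_submit, base_delay=0.01))
```

Whether resubmitting is safe depends on how the store treats an object that was created but not vectorized, so check for duplicates before using this pattern on real data.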

Build the QnA chain:

In [7]:
template = """Given the following extracted parts of a long document and a question, create a final answer with references ("SOURCES"). 
If you don't know the answer, just say that you don't know. Don't try to make up an answer.
ALWAYS return a "SOURCES" part in your answer.

QUESTION: {question}
=========
{summaries}
=========
"""
PROMPT = PromptTemplate(
    template=template, input_variables=["summaries", "question"]
)

chain = load_qa_with_sources_chain(
    OpenAI(temperature=0), chain_type="stuff", prompt=PROMPT
)
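
To see roughly what a "stuff" chain sends to the model, the sketch below fills a template of the same shape by hand: every retrieved document is rendered as a block and all blocks are concatenated into `{summaries}`. The `Content:` / `Source:` layout per document is an assumption about how the chain formats documents, and `TEMPLATE`, `build_prompt`, and the dictionary-based documents are stand-ins made up for this example.

```python
# Abbreviated stand-in for the notebook's template (same placeholders).
TEMPLATE = """Create a final answer with references ("SOURCES").

QUESTION: {question}
=========
{summaries}
=========
"""


def build_prompt(question, docs, template=TEMPLATE):
    """Render each doc as a Content/Source block and stuff them all into one prompt."""
    summaries = "\n\n".join(
        f"Content: {d['content']}\nSource: {d['source']}" for d in docs
    )
    return template.format(question=question, summaries=summaries)


if __name__ == "__main__":
    demo_docs = [
        {"content": "First passage.", "source": "doc-1"},
        {"content": "Second passage.", "source": "doc-2"},
    ]
    print(build_prompt("What happens in the first passage?", demo_docs))
```

Because everything is placed into a single prompt, the number and size of retrieved chunks must fit in the model's context window.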


Ask a question:

In [8]:
question = "Why was the witch angry at the mirror?"

# retrieve
docs = vectorstore.similarity_search(question, k=5)

# create answer
answer = chain({"input_documents": docs, "question": question})
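
Conceptually, the retrieval step ranks stored chunks by how close their embedding is to the embedding of the question. The sketch below shows that idea as a brute-force cosine-similarity ranking over toy vectors. It is not how Weaviate does it internally (Weaviate computes the embeddings through its vectorizer module and uses an index rather than a full scan); `cosine`, `nearest_texts`, and the two-dimensional vectors are invented for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def nearest_texts(query_vec, items, k=5):
    """items is a list of (text, vector); return the k texts most similar to query_vec."""
    ranked = sorted(items, key=lambda it: cosine(query_vec, it[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


if __name__ == "__main__":
    toy_items = [("x", [1.0, 0.0]), ("y", [0.0, 1.0]), ("z", [1.0, 1.0])]
    print(nearest_texts([1.0, 0.0], toy_items, k=2))
```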


Show the answer:

In [9]:
print(answer["output_text"])


The question does not appear to be related to any of the content provided. Therefore, I do not know the answer to the question. SOURCES: https://www.gutenberg.org/files/2591/2591-0.txt
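
The answer and its references arrive as one string, so a small helper can separate them for downstream use. This is a sketch that assumes the model followed the prompt and emitted a `SOURCES:` marker followed by a comma-separated list; neither is guaranteed, and the name `split_sources` is made up for this example.

```python
def split_sources(output_text, marker="SOURCES:"):
    """Split an answer into (answer_text, [sources]) at the last marker occurrence."""
    head, sep, tail = output_text.rpartition(marker)
    if not sep:
        # No marker found: treat the whole text as the answer.
        return output_text.strip(), []
    sources = [s.strip() for s in tail.split(",") if s.strip()]
    return head.strip(), sources


if __name__ == "__main__":
    text = (
        "I do not know the answer to the question. "
        "SOURCES: https://www.gutenberg.org/files/2591/2591-0.txt"
    )
    answer_text, sources = split_sources(text)
    print(answer_text)
    print(sources)
```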


Show the input documents:

In [10]:
for doc in answer["input_documents"]:
    print(doc.page_content)
    print("*" * 80)


swimming in the middle of it. The witch placed herself on the shore,
threw breadcrumbs in, and went to endless trouble to entice the duck;
but the duck did not let herself be enticed, and the old woman had to
go home at night as she had come. At this the girl and her sweetheart
Roland resumed their natural shapes again, and they walked on the whole
night until daybreak. Then the maiden changed herself into a beautiful
flower which stood in the midst of a briar hedge, and her sweetheart
********************************************************************************
and as she was dressing herself in fine rich clothes, she looked in the
glass and said:

 ‘Tell me, glass, tell me true!
  Of all the ladies in the land,
  Who is fairest, tell me, who?’

And the glass answered:

 ‘Thou, lady, art loveliest here, I ween;
  But lovelier far is the new-made queen.’

When she heard this she started with rage; but her envy and curiosity
were so great, that s