How do you create compatibility with ConversationalRetrievalChain #58

Closed

Haste171 opened this issue May 22, 2023 · 4 comments

@Haste171

The following code snippet is an example of how you can stream a response from Langchain's ConversationalRetrievalChain into the console, but I don't understand how you can add compatibility with Lanarky. This documentation doesn't make a whole lot of sense to me: https://lanarky.readthedocs.io/en/latest/advanced/custom_callbacks.html

from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.llms import OpenAI
from langchain.chains import ConversationalRetrievalChain
import pinecone
from langchain.vectorstores import Pinecone
from langchain.prompts.prompt import PromptTemplate

from langchain.chains.llm import LLMChain
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain.chains.question_answering import load_qa_chain
from langchain.chat_models import ChatOpenAI

OPENAI_API_KEY = '...'
PINECONE_API_KEY = '...' # replace with your key
PINECONE_ENV = '...' # replace with your environment
PINECONE_INDEX = '...' # replace with your index name

# Construct a ConversationalRetrievalChain with a streaming llm for combine docs
# and a separate, non-streaming llm for question generation
llm = OpenAI(temperature=0, openai_api_key=OPENAI_API_KEY)
# streaming_llm = OpenAI(streaming=True, callbacks=[StreamingStdOutCallbackHandler()], temperature=0, openai_api_key=OPENAI_API_KEY)
streaming_llm = ChatOpenAI(streaming=True, callbacks=[StreamingStdOutCallbackHandler()], openai_api_key=OPENAI_API_KEY, temperature=0, verbose=True)

QA_V2 = """You are a helpful AI assistant. Use the following pieces of context to answer the question at the end.
# If you don't know the answer, just say you don't know. DO NOT try to make up an answer.
# If the question is not related to the context, politely respond that you are tuned to only answer questions that are related to the context.
# Use as much detail when as possible when responding.

# {context}

# Question: {question}
# All answers should be in MARKDOWN (.md) Format:"""

qap = PromptTemplate(
    template=QA_V2, input_variables=["context", "question"]
)

CD_V2 = """Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question.

Chat History:
{chat_history}
Follow Up Input: {question}
All answers should be in MARKDOWN (.md) Format:
Standalone question:"""

cdp = PromptTemplate.from_template(CD_V2)



question_generator = LLMChain(llm=llm, prompt=cdp)
doc_chain = load_qa_chain(streaming_llm, chain_type="stuff", prompt=qap)


pinecone.init(api_key=PINECONE_API_KEY,environment=PINECONE_ENV)
embeddings = OpenAIEmbeddings(model='text-embedding-ada-002', openai_api_key=OPENAI_API_KEY)
vectorstore = Pinecone.from_existing_index(index_name=PINECONE_INDEX, embedding=embeddings, text_key='text', namespace='testing_rtd1')

qa = ConversationalRetrievalChain(retriever=vectorstore.as_retriever(), combine_docs_chain=doc_chain, return_source_documents=True, question_generator=question_generator)


chat_history = []
query = input('Enter Question: ')
result = qa({"question": query, "chat_history": chat_history})
@Haste171 (Author)

I guess my question is how do you change the inputs and outputs to use Lanarky and FastAPI...
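
One way the wiring could look, as a rough sketch: Lanarky's StreamingResponse (from the 0.7-era docs) can return a chain's streamed tokens from a FastAPI route. The /chat path, the ChatRequest model, the empty chat_history, and the exact StreamingResponse.from_chain signature below are assumptions to verify against the installed Lanarky version; qa refers to the chain built in the snippet above.

from fastapi import FastAPI
from pydantic import BaseModel
from lanarky.responses import StreamingResponse

app = FastAPI()


class ChatRequest(BaseModel):
    # hypothetical request model; the field mirrors the chain's "question" input
    question: str


@app.post("/chat")
async def chat(request: ChatRequest):
    # from_chain is expected to run the chain and stream the tokens emitted by
    # the streaming LLM's callbacks back to the client; an empty chat_history
    # is passed because the qa chain above has no memory attached
    return StreamingResponse.from_chain(
        qa,
        {"question": request.question, "chat_history": []},
        media_type="text/event-stream",
    )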

@Haste171 (Author)

Figured out the issue.

@talhaanwarch

@Haste171 it would be great if you posted the solution too.

@brejz commented Jul 30, 2023

@talhaanwarch I managed to get this working with the following. I am using Weaviate as a vector store. I struggled quite a lot trying to get this to work, and I'm still not 100% sure why it works with this library instead of just using LangChain; I'll need to dig into their codebase a bit more.

In the meantime you can refer to this example:

from dotenv import load_dotenv
from fastapi import FastAPI
from langchain.chains import ConversationalRetrievalChain
from langchain.chat_models import ChatOpenAI
from langchain.memory import ConversationBufferMemory
from langchain.prompts import PromptTemplate
from langchain.vectorstores.weaviate import Weaviate
from lanarky import LangchainRouter
import weaviate


load_dotenv()

app = FastAPI(title="ConversationalRetrievalChainDemo")


chatTemplate = """
Answer the question based on the chat history(delimited by <hs></hs>) and context(delimited by <ctx> </ctx>) below.
-----------
<ctx>
{context}
</ctx>
-----------
<hs>
{chat_history}
</hs>
-----------
Question: {question}
Answer:
"""

PROMPT = PromptTemplate(
    input_variables=["context", "question", "chat_history"], template=chatTemplate
)


def create_chain():
    weaviate_client = weaviate.Client("http://localhost:8080")

    # existing Weaviate index and text key for this setup
    vectorstore = Weaviate(weaviate_client, "Idx_664773d4e6", "text")

    # ConversationBufferMemory only needs the memory key and message format;
    # it does not take a prompt or token limit
    memory = ConversationBufferMemory(
        return_messages=True,
        memory_key="chat_history",
    )

    # from_llm builds the question-generator and combine-docs chains internally,
    # so the combined prompt is passed via combine_docs_chain_kwargs
    chain = ConversationalRetrievalChain.from_llm(
        llm=ChatOpenAI(
            temperature=0,
            streaming=True,
        ),
        retriever=vectorstore.as_retriever(),
        memory=memory,
        combine_docs_chain_kwargs={"prompt": PROMPT},
        verbose=True,
    )

    return chain


chain = create_chain()


langchain_router = LangchainRouter(
    langchain_url="/chat", langchain_object=chain, streaming_mode=0
)


app.include_router(langchain_router)

if __name__ == "__main__":
    import uvicorn

    uvicorn.run(host="0.0.0.0", port=8000, app=app)
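
For reference, a rough sketch of how the /chat route above might be called once the server is running, assuming LangchainRouter builds its request model from the chain's input keys (so the body is a single "question" field) and that streaming_mode=0 returns a plain, non-streamed response; both assumptions should be checked against the installed Lanarky version.

import requests

# hypothetical client call against the /chat route mounted above
response = requests.post(
    "http://localhost:8000/chat",
    json={"question": "What does the indexed document cover?"},
    timeout=60,
)
print(response.status_code)
print(response.text)  # the exact response shape depends on the Lanarky version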
