Change MultiQuery Prompt, Add Hybrid Search (BM25 + Embedding), Cohere Reranker & LLM Chain Filter #247

Merged: 9 commits into main on Jan 10, 2024

Conversation

@davidgxue (Collaborator) commented Jan 4, 2024

Description

  • Changed MultiQueryRetriever's prompt and parameters so that we now keep the user's original query plus 2 reworded queries (previously, 3 different reworded queries were used)
  • Added Hybrid Search with alpha = 0.5 (equal weight between BM25/TF-IDF and the embedding model), retrieving 100 documents for each prompt
  • Added Cohere Reranker (keeping 10 documents), applied after the documents from the multi-query retriever's 3 prompts are combined
  • Added LLM Chain Filter (using GPT-3.5) to double-check that each of the remaining 10 documents is relevant to the question asked
    • This essentially uses GPT-3.5 to ask whether document X is relevant to the question (with a YES/NO response)
    • Added our custom boolean parser, as the default one in LangChain can be too strict and raises errors in specific cases
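
The retrieval flow in the list above can be sketched roughly as follows. This is an illustrative, dependency-free sketch and not the code in this PR: `hybrid_score`, `merge_results`, and `keep_top_k` are hypothetical helpers, scores are assumed to be normalized to [0, 1], and the linear blend is only one common way to combine keyword and vector scores (the real fusion happens inside the vector store, and the real reranking is done by Cohere).

```python
from typing import Dict, Iterable, List, Tuple


def hybrid_score(keyword_score: float, vector_score: float, alpha: float = 0.5) -> float:
    """Blend a keyword (BM25) score with a vector score.

    Assumption: both scores are normalized to [0, 1]; alpha=0.5 weights them equally.
    """
    return alpha * vector_score + (1.0 - alpha) * keyword_score


def merge_results(per_query_results: Iterable[List[Tuple[str, float]]]) -> Dict[str, float]:
    """Combine (doc_id, score) hits from several queries, keeping each doc's best score."""
    merged: Dict[str, float] = {}
    for hits in per_query_results:
        for doc_id, score in hits:
            merged[doc_id] = max(score, merged.get(doc_id, float("-inf")))
    return merged


def keep_top_k(scores: Dict[str, float], k: int = 10) -> List[str]:
    """Stand-in for the reranking step: return the k highest-scoring doc ids."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]


if __name__ == "__main__":
    # Hypothetical hits for the original query and one reworded query.
    original = [("doc_a", hybrid_score(0.2, 0.9)), ("doc_b", hybrid_score(0.6, 0.1))]
    reworded = [("doc_a", hybrid_score(0.1, 0.3)), ("doc_c", hybrid_score(1.0, 1.0))]
    print(keep_top_k(merge_results([original, reworded]), k=2))
```

Deduplicating by document id before reranking keeps a document that matched several of the queries from taking up more than one of the 10 slots.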

Technical Changes

  • Added api/ask_astro/chains/custom_llm_filter_prompt.py because the default boolean parser in the LangChain library requires the GPT-3.5 output to be either "YES" or "NO" and raises an error otherwise. I essentially took their boolean parser from here and re-implemented it so that (1) we only check whether YES/NO is CONTAINED in the response (as the LLM sometimes outputs "YES."), and (2) if neither is found, we return True instead of raising an error (in our use case, we simply don't filter the document out and still consider it relevant).
  • Main changes are in api/ask_astro/chains/answer_question.py
  • Added cohere package in api/pyproject.toml
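
The lenient parsing behavior described in the first bullet can be illustrated with a small standalone function. This is a minimal sketch under stated assumptions, not the class added in this PR: `parse_lenient_boolean` is a hypothetical name, it does not subclass LangChain's `BaseOutputParser`, and checking for "YES" before "NO" when both appear is an assumption the description does not spell out.

```python
def parse_lenient_boolean(
    text: str,
    true_val: str = "YES",
    false_val: str = "NO",
    default: bool = True,
) -> bool:
    """Parse an LLM's YES/NO reply without raising on unexpected output.

    The reply only needs to CONTAIN the token (so "YES." or "Yes, it is" pass).
    If neither token is found, return `default` (True here, meaning the
    document is kept rather than filtered out).
    """
    cleaned = text.strip().upper()
    # Assumption: when both tokens appear, the true value wins.
    if true_val.upper() in cleaned:
        return True
    # Caveat: this is a plain substring check, so "NO" also matches inside
    # longer words such as "KNOW" or "NOT".
    if false_val.upper() in cleaned:
        return False
    return default


if __name__ == "__main__":
    for reply in ["YES.", "no", "maybe"]:
        print(repr(reply), parse_lenient_boolean(reply))
```

Defaulting to True trades some precision for robustness: a malformed reply keeps the document in the context instead of failing the whole request.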

Notes

  • A more extensive evaluation and comparison on a full Q&A dataset is still needed to determine improvements (and potential regressions).
  • My initial test/evaluation on the same 74 questions that my teammates have used in the past can be found on LangSmith here

@davidgxue davidgxue self-assigned this Jan 4, 2024
@davidgxue davidgxue linked an issue Jan 4, 2024 that may be closed by this pull request
@davidgxue davidgxue marked this pull request as ready for review January 4, 2024 02:29
@davidgxue davidgxue changed the title from "Change MultiQuery Prompt, Add Hybrid Search (BM25 + Embedding), Cohere Reranker and LLM Chain Filter" to "Change MultiQuery Prompt, Add Hybrid Search (BM25 + Embedding), Cohere Reranker & LLM Chain Filter" Jan 4, 2024

cloudflare-pages bot commented Jan 4, 2024

Deploying with Cloudflare Pages

Latest commit: 992beb5
Status: ✅  Deploy successful!
Preview URL: https://1fdf887b.ask-astro.pages.dev
Branch Preview URL: https://hybrid-search-reword-and-rer.ask-astro.pages.dev

from langchain_core.prompts import PromptTemplate


class CustomBooleanOutputParser(BaseOutputParser[bool]):

@davidgxue (Collaborator, Author) commented Jan 5, 2024

Note: this is a modified version of LangChain's implementation. The original code is at https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/output_parsers/boolean.py

I implemented this parser because of an unfixed LangChain issue, langchain-ai/langchain#11408, where the check on YES/NO is far too strict and raises unwanted errors at runtime.

@sunank200 (Collaborator) left a comment

@vatsrahul1001 did you get a chance to test this? @davidgxue, which Weaviate index should be used?

@vatsrahul1001 (Collaborator) commented Jan 8, 2024

@sunank200 no, I have not tested this end to end. Also, as per Steven's comment, can you confirm the readthedocs stuff is withdrawn?

@sunank200 (Collaborator) commented

@vatsrahul1001 yes. The readthedocs content from astro-sdk is no longer in the database.

@vatsrahul1001 (Collaborator) commented

OK, I will test it tomorrow then.

@davidgxue (Collaborator, Author) commented

Yes, I checked with Rahul and he will get onto checking the response quality today!

@vatsrahul1001 (Collaborator) commented

@davidgxue, I have completed the testing and observed an overall improvement in the quality of responses. Even for basic Astro SDK questions, responses have improved, even without the Astro SDK docs. However, for questions in Ask Astro that were specifically designed from the docs, the responses have degraded, which is as expected.

Results

@sunank200 (Collaborator) left a comment

LGTM

@davidgxue davidgxue merged commit c680cc9 into main Jan 10, 2024
8 checks passed
@davidgxue davidgxue deleted the hybrid_search_reword_and_rerank branch January 10, 2024 01:58