Fixed regex bug in RetrievalQAWithSources in previous update #9898
Hey @baskaryan, we have a demo coming up and I was hoping we could merge and deploy a new version of this as soon as possible. Thank you for your consideration!
@@ -120,9 +120,9 @@ def validate_naming(cls, values: Dict) -> Dict:

     def _split_sources(self, answer: str) -> Tuple[str, str]:
         """Split sources from answer."""
-        if re.search(r"SOURCES?[:\s]", answer, re.IGNORECASE):
+        if re.search(r"SOURCES?[:]\s", answer, re.IGNORECASE):
Are the square brackets still needed if there's only one element?
Also, what's the failure mode here? Is it that "sources: " would be split before the trailing space? Does that matter, given that we strip the source?
Hey @baskaryan, thanks for the response. You're right, I don't need the square brackets.
Previously the pattern matched "SOURCE" or "SOURCES" followed by a colon or whitespace.
The failure was that if the answer itself contained the word "source" or "sources" followed by a space, the split happened there and cut the answer off.
Example:
The source of truth for the test subject. SOURCES: 28-pl.
Current Answer:
The
Expected Answer:
The source of truth for the test subject.
SOURCES: 28-pl
I think we can also do away with the whitespace regex, since we perform a strip. Let me update my PR.
just fixed it!
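For anyone reading this thread later, here is a minimal standalone sketch of the failure mode described above. It is not the library's code: `split_sources`, `OLD_PATTERN`, and `NEW_PATTERN` are hypothetical names, the split logic is a simplified assumption, and the colon-only pattern is my reading of "do away with the whitespace regex" rather than something shown in the diff.

```python
import re
from typing import Tuple

# Pattern before the fix: keyword followed by a colon OR any whitespace.
OLD_PATTERN = r"SOURCES?[:\s]"
# Assumed pattern after the fix: keyword must be followed by a colon.
NEW_PATTERN = r"SOURCES?:"


def split_sources(answer: str, pattern: str) -> Tuple[str, str]:
    """Split text into (answer, sources) at the first match of `pattern`."""
    if re.search(pattern, answer, re.IGNORECASE):
        # maxsplit=1 guarantees exactly two parts when the pattern matches.
        answer, sources = re.split(
            pattern, answer, maxsplit=1, flags=re.IGNORECASE
        )
        # Keep only the first line of the sources and strip whitespace,
        # which is why a trailing \s in the pattern is unnecessary.
        sources = sources.split("\n")[0].strip()
    else:
        sources = ""
    return answer, sources


text = "The source of truth for the test subject. SOURCES: 28-pl"

# Old pattern matches "source " inside the answer and truncates it.
old_result = split_sources(text, OLD_PATTERN)
# Colon-only pattern splits at "SOURCES:" as intended.
new_result = split_sources(text, NEW_PATTERN)

print(old_result)
print(new_result)
```

With the old pattern the answer part comes back as just "The ", because `re.IGNORECASE` lets the lowercase "source " in the sentence match first. Requiring the colon avoids that, and the `strip()` call handles the space after it.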
@@ -27,6 +27,12 @@
         "This Agreement is governed by English law.\n",
         "28-pl",
     ),
+    (
+        "According to the sources, the agreement is governed by English law.\n"
With a comma after "sources" this would've worked before, right? Should we drop the comma?
Thought I had updated it, my bad! Thanks for pointing it out, I've updated it now.
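To illustrate why the comma matters for the regression test, here is a small sketch. The `without_comma` string is a hypothetical variant of the test input, since the updated test case is not shown in this thread, and the colon-only pattern is assumed to be the final form of the fix.

```python
import re

# Pattern before the fix: keyword followed by a colon or whitespace.
OLD_PATTERN = re.compile(r"SOURCES?[:\s]", re.IGNORECASE)
# Assumed pattern after the fix: keyword followed by a colon.
NEW_PATTERN = re.compile(r"SOURCES?:", re.IGNORECASE)

with_comma = (
    "According to the sources, the agreement is governed by English law.\n"
)
# Hypothetical variant with the comma dropped.
without_comma = (
    "According to the sources the agreement is governed by English law.\n"
)

# "sources," is followed by a comma, so the old pattern never matched it;
# a test using this string would have passed even before the fix.
old_hits_with_comma = OLD_PATTERN.search(with_comma) is not None
# "sources " is followed by a space, which is what triggered the bug.
old_hits_without_comma = OLD_PATTERN.search(without_comma) is not None
# The colon-only pattern matches neither sentence.
new_hits_without_comma = NEW_PATTERN.search(without_comma) is not None

print(old_hits_with_comma, old_hits_without_comma, new_hits_without_comma)
```

So only the comma-free sentence exercises the old bug, which is why dropping the comma makes the test meaningful.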
thank you @nik1097!!
Please make sure your PR is passing linting and testing before submitting. Run make format, make lint and make test to check this locally.
See contribution guidelines for more information on how to write/run tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include: