Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/query engine improvements #713

Merged
merged 9 commits into from
Mar 29, 2024
Merged

Conversation

collindutter
Copy link
Member

No description provided.

@collindutter collindutter force-pushed the feature/query-engine-improvements branch 3 times, most recently from 0b905ee to 6bb2735 Compare March 27, 2024 16:44
@collindutter collindutter force-pushed the feature/query-engine-improvements branch from 6bb2735 to b49cad3 Compare March 27, 2024 17:04
@collindutter collindutter marked this pull request as ready for review March 28, 2024 16:25

for artifact in artifacts:
text_segments.append(artifact.value)

message = self.template_generator.render(
user_message = self.user_template_generator.render(
metadata=metadata,
query=query,
text_segments=text_segments,
rulesets=J2("rulesets/rulesets.j2").render(rulesets=rulesets),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should remove rulesets from here.


for artifact in artifacts:
text_segments.append(artifact.value)

message = self.template_generator.render(
user_message = self.user_template_generator.render(
metadata=metadata,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Metadata should be in the system message.


for artifact in artifacts:
text_segments.append(artifact.value)

message = self.template_generator.render(
user_message = self.user_template_generator.render(
metadata=metadata,
query=query,
text_segments=text_segments,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Segments can be moved into the system message as well, so the LLM is more "grounded."

@collindutter collindutter force-pushed the feature/query-engine-improvements branch 3 times, most recently from fec940e to fa11f43 Compare March 28, 2024 18:51
@collindutter collindutter force-pushed the feature/query-engine-improvements branch from fa11f43 to de2dc6b Compare March 28, 2024 18:52
@collindutter collindutter force-pushed the feature/query-engine-improvements branch from 3a04f78 to b78d009 Compare March 28, 2024 18:56
Copy link
Contributor

@dylanholmes dylanholmes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Across my own testing, I'm seeing a significant improvement.
For my own 106 test cases, I see 46 are just as good, 25 are just as bad, 31 that were bad are now good, and only 2 that were good are now bad. (Bad just means failing to provide an answer, saying something like "I don't know").

One side observation about a qualitative difference in the answers that the prompt now mentions the "context information" pretty often.

For example:
Before: "I could not find an answer."
After: "The context information does not provide any information about XYZ".
Before: "There are 23 things."
After: "There are 23 things listed in the provided context information."

As most things LLM, it doesn't always mention "context information" all the time, but way more often then before (which was basically never).

Approving because aside from mentioning "context information" subjectively too much, the retrieval seemed to improved quite a bit.

Copy link
Member

@vasinov vasinov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One minor cleanup needed but approving, so it's not blocked.

text_segments=text_segments,
rulesets=J2("rulesets/rulesets.j2").render(rulesets=rulesets),
)
user_message = self.user_template_generator.render(query=query, text_segments=text_segments)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't need to pass text_segments here.

@vasinov vasinov merged commit 8e04019 into dev Mar 29, 2024
6 checks passed
@vasinov vasinov deleted the feature/query-engine-improvements branch March 29, 2024 15:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants