Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[context] add support for --additional_context option for query subcommand #37

Closed
signebedi opened this issue Mar 29, 2023 · 3 comments
Closed

Comments

@signebedi
Copy link
Owner

We should add an --additional_context option to the query subcommand that expects either a string or file path, which it will bolt onto the request string (where? at the start? at the end?).

This allows users to extend their questions with details that might not typically be formatted like a question, like stack traces with line breaks, etc.

We should then write some logic to structure this context (removing line breaks, etc) and applying length limits and/or keyword tokenization to it before bolting it onto the request string.

@signebedi signebedi changed the title [context] add support for outside context for query subcommand [context] add support for --additional_context option for query subcommand Mar 29, 2023
@signebedi
Copy link
Owner Author

signebedi commented Apr 2, 2023

When submitting ChatCompletions, this can be stored (after being deformatted with \n and other problem characters removed) under the 'system' tag of the message prompt.

With standard completions, we can just prepend prompts with whatever amount of the context that our token count will permit... OR, we can include it in the tokenized body of text...

Either way, we will need to add unittests for this.

@signebedi
Copy link
Owner Author

When we bootstrap additional_context, I think we need to privilege the other sources of context first. So, the order of precedence is (from highest to lowest):

  1. question
  2. gptty context (eg. past questions and responses)
  3. additional context (eg. the additional user provided context)

We have three cases:

  1. ChatCompletions
  2. Completions with keyword tokenization [context] allow keyword-only context using RAKE to minimize API usage fees #25
  3. Completions without keyword tokenization

In the cases of ChatCompletions, we should prepend the context whatever number of tokens we can add before reading our max token count, after having added all other context to the context as a "system" dictionary, see #31.

In the case of Completions with keyword tokenization, we should pass the additional context to return_most_common_phrases by prepending it the text parameter passed therein. Therefore, we do not seriously risk going over our token count.

In the case of Completions without keyword tokenization, we should prepend the context string with whatever number of tokens we can add before reading our max token count, after having added all other context to the context as a standard string.

@signebedi
Copy link
Owner Author

[tests] test additional_context passed to get_context
Now that we've added support for additional_context in gptty.context:get_context(), we should add some tests where we bootstrap additional context to the three different cases and validate the structure of the returned context, as well as its length.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant