-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add new providers: Koboldcpp and Goinfer #22
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚀
@@ -231,3 +232,12 @@ export const randomString = (): string => { | |||
} | |||
return result; | |||
}; | |||
|
|||
export function llamaMaxTokens(prompt: string, ctx: number) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this as a temporary solution but as mentioned here (#21 (comment)), I'm not sure this will work long term.
Not an issue for this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Proposal: #21 (comment)
return ctx - n; | ||
} | ||
|
||
export function formatPrompt(prompt: string, template: string, systemMessage: string) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will likely change or be removed entirely if/when the solution outlined in #21 (comment) is implemented, but works great for the purposes of this PR.
this.apiUrl = options?.apiUrl ?? DEFAULT_API_URL; | ||
} | ||
|
||
/* async complete(params: KoboldInferParams, options?: { signal?: AbortSignal }): Promise<string> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with leaving this commented code here as a point of reference. It has value in that sense.
Curious: do you see a reason for wingman to ever support anything other than a streamed response? Do you think each provider should support buffered responses like this, or no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see no reason to use anything else than streamed response: every backend provider supports it, this is a de facto standard. I am in favor of completely remove the completion
method as it has no value actually and it would simplify the code, which is better for maintenance and implementation of new providers
No description provided.