So the "max_tokens" feature from open ai sounds great, but in reality it just cuts the response off at the given limit. So if the response is 700 tokens but you ask it to cap it off at 512, then it will literally just cut the response off at 512.
> `... The AI discusses how challenges are part of",`
It doesn't care whether the result makes sense; it just drops everything after token 512, which makes the summary less cohesive. I do like being able to limit the summary, because otherwise a long chat conversation can produce a really long summary, which gets expensive.
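For context, the call that produces this behavior looks roughly like the sketch below. I'm using the sashabaranov/go-openai client here purely for illustration (it may not be the client this project actually uses); the telltale sign of truncation is the choice's `FinishReason` coming back as `"length"` instead of `"stop"`:

```go
package main

import (
	"context"
	"fmt"
	"os"

	openai "github.com/sashabaranov/go-openai"
)

func main() {
	client := openai.NewClient(os.Getenv("OPENAI_API_KEY"))
	resp, err := client.CreateChatCompletion(
		context.Background(),
		openai.ChatCompletionRequest{
			Model:     openai.GPT3Dot5Turbo,
			MaxTokens: 512, // hard cap: generation stops at 512 tokens, mid-sentence or not
			Messages: []openai.ChatCompletionMessage{
				{Role: openai.ChatMessageRoleUser, Content: "Summarize the following conversation: ..."},
			},
		},
	)
	if err != nil {
		panic(err)
	}
	choice := resp.Choices[0]
	// FinishReason "length" means the output hit max_tokens and was truncated,
	// not that the model finished its thought.
	if choice.FinishReason == "length" {
		fmt.Println("summary was cut off mid-sentence")
	}
	fmt.Println(choice.Message.Content)
}
```

So the API does tell you the summary got chopped, but it won't give you a coherent one.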
What I've found works better than sending `max_tokens` is to make the limit part of the prompt: "Summarize the following in 512 tokens or less." This keeps the response around the requested length without cutting it off prematurely.
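Here's a rough sketch of that approach, again with go-openai and with function and parameter names of my own (this is not the code from the draft PR below): the budget goes into the instruction, and `MaxTokens` is left unset so the model can finish its sentence.

```go
package main

import (
	"context"
	"fmt"
	"os"

	openai "github.com/sashabaranov/go-openai"
)

// summarize puts the token budget in the instruction itself instead of
// sending it as max_tokens, so the model aims to finish within the limit
// rather than getting chopped off at it. Names here are illustrative.
func summarize(client *openai.Client, conversation string, budget int) (string, error) {
	prompt := fmt.Sprintf(
		"Summarize the following in %d tokens or less:\n\n%s",
		budget, conversation,
	)
	resp, err := client.CreateChatCompletion(
		context.Background(),
		openai.ChatCompletionRequest{
			Model: openai.GPT3Dot5Turbo,
			// No MaxTokens here -- the limit lives in the prompt, so the
			// response ends at a natural stopping point.
			Messages: []openai.ChatCompletionMessage{
				{Role: openai.ChatMessageRoleUser, Content: prompt},
			},
		},
	)
	if err != nil {
		return "", err
	}
	return resp.Choices[0].Message.Content, nil
}

func main() {
	client := openai.NewClient(os.Getenv("OPENAI_API_KEY"))
	summary, err := summarize(client, "user: hi\nassistant: hi! how can I help? ...", 512)
	if err != nil {
		panic(err)
	}
	fmt.Println(summary)
}
```

If overshooting is a concern, you could still keep a generous `MaxTokens` well above the prompted limit as a hard safety cap.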
It would also be great if this number were easily configurable from LangChain, so we didn't have to touch the config file while figuring out which limit works best for our use case.
Update:
Here's a draft PR with my proposed solution, minus making it configurable from LangChain: #130. I've never worked on a Go project before, but you should be able to see what I'm trying to do here. Let me know what you think, and I can keep working on it if you'd like.