
Adding Caching to OpenAI chat #603

Merged
6 commits into guidance-ai:main on Feb 1, 2024

Conversation

adamgordonbell
Contributor

Caching was no longer happening in the newest version. Reintroducing it here for OpenAI chat.

Not sure if this is the ideal place to insert caching, or if model.call could somehow handle caching for all models.
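
For reference, here is a minimal, hypothetical sketch of the idea under discussion: cache chat-completion results keyed on the request, and bypass the cache for non-deterministic (temperature > 0) calls, as described later in this thread. The names (`cached_chat_call`, `call_fn`, `_chat_cache`) are illustrative and not the PR's actual code; note also that, as raised below, keying only on the prompt means different models can share a cache entry.

```python
import hashlib
import json

# Hypothetical sketch; not the code merged in this PR.
_chat_cache = {}

def cached_chat_call(call_fn, model, messages, temperature=0.0):
    """Return a chat completion, reusing a cached result when it is safe to do so.

    call_fn is any callable call_fn(model, messages, temperature) that performs
    the actual OpenAI chat request and returns the completion text.
    """
    if temperature > 0:
        # Sampling is non-deterministic, so reuse would change behavior: skip the cache.
        return call_fn(model, messages, temperature)

    # Key only on the serialized messages. As discussed later in this thread,
    # omitting the model name from the key means two models that receive the
    # same prompt will share a cache entry.
    key = hashlib.sha256(json.dumps(messages, sort_keys=True).encode()).hexdigest()
    if key not in _chat_cache:
        _chat_cache[key] = call_fn(model, messages, temperature)
    return _chat_cache[key]
```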

@slundberg
Contributor

Thanks @adamgordonbell! I am just about to merge some extensive client/server updates, then I'll come back and try to loop this in.

@slundberg
Contributor

One quick question on the PR. Adding poetry support sounds nice, but do you know if there is a way to do that without creating two copies of our dependency lists? (we can also just leave that out for now if it slows things down too much)

@adamgordonbell
Contributor Author

> One quick question on the PR. Adding poetry support sounds nice, but do you know if there is a way to do that without creating two copies of our dependency lists? (we can also just leave that out for now if it slows things down too much)

It's outside my area of knowledge, but I think so. Though I do think the cpp files probably complicate that. I can remove it from this PR; it just helped with my development.

> Thanks @adamgordonbell! I am just about to merge some extensive client/server updates, then I'll come back and try to loop this in.

Awesome!

@slundberg
Contributor

Merged this with the latest updates. Thanks again!

@slundberg slundberg merged commit 496bc53 into guidance-ai:main Feb 1, 2024
4 checks passed
@prescod

prescod commented Feb 18, 2024

Thank you for adding this @adamgordonbell and @slundberg. Caching was about 50% of the reason I used Guidance. It's easy to implement myself, but adding it to each project becomes tedious.

@maximegmd

Why was this merged?

  • It cannot be disabled
  • It doesn't distinguish between models, so calling gpt4 with a prompt after calling gpt3.5 with the same prompt will return the gpt3.5 result
  • There is no documentation on resetting the cache

@Harsha-Nori
Collaborator

Harsha-Nori commented May 18, 2024

Hi @maximegmd, I was running into similar challenges with caching, and dropped it as part of a simplification we're making to the codebase (#820).

I think this PR was a strong initial concept (thank you @adamgordonbell!!), but as we've started changing the structure of guidance in the last few months, I'd like to pursue a more universal solution (beyond just OpenAI) that also addresses your concerns. If you want to use a no-cache version of the codebase today, just install from source.

@adamgordonbell
Contributor Author

Yeah, this caching implementation had some weaknesses. Excited to see what the next iteration is.

> Why was this merged?
>
>   • It cannot be disabled

It was disabled for a call when the temperature was above 0. I thought that made sense, since with temperature 0 you are asking for a deterministic result, which temperature 0 plus caching will give.

@maximegmd

> Yeah, this caching implementation had some weaknesses. Excited to see what the next iteration is.
>
> > Why was this merged?
> >
> >   • It cannot be disabled
>
> It was disabled for a call when the temperature was above 0. I thought that made sense, since with temperature 0 you are asking for a deterministic result, which temperature 0 plus caching will give.

Except it didn't distinguish between models, so with temperature 0 you get the cached result of whichever model was used first with a given prompt. We were doing model comparisons, so we scratched our heads a little trying to understand why all the OpenAI models had the same score, haha.
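
A small, hypothetical sketch of how the collision described above can be avoided: fold the model name (and other generation parameters) into the cache key so that different models never share an entry. This is illustrative only, not code from guidance.

```python
import hashlib
import json

def cache_key(model, messages, temperature=0.0):
    """Build a cache key that distinguishes models as well as prompts."""
    payload = {"model": model, "messages": messages, "temperature": temperature}
    return hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()

# With the model in the key, gpt-4 and gpt-3.5 calls with the same prompt
# no longer collide in the cache.
assert cache_key("gpt-4", [{"role": "user", "content": "hi"}]) != \
       cache_key("gpt-3.5-turbo", [{"role": "user", "content": "hi"}])
```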
