
Dynamic FewShot GPTClassifier: does it cache Embd locally? #55

Open
KennyNg-19 opened this issue Aug 18, 2023 · 6 comments

@KennyNg-19

I wonder:

  1. whether DynamicFewShotGPTClassifier caches the embeddings from OpenAI locally the first time it is called; and
  2. whether we can access those embeddings for reuse in other cases, so that we can save some budget.
@iryna-kondr
Owner

Hello, @KennyNg-19
The embeddings are stored inside the estimator and can, in theory, be accessed. However, reusing them for other use cases might not be straightforward. Could you elaborate on how exactly you would like to reuse the embeddings?
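Since the thread does not name a documented public accessor, one hedged way to see what a fitted estimator stored is generic Python introspection with `vars()`. The sketch below uses a `DummyEstimator` stand-in (its attribute names `embeddings_` and `classes_` are illustrative assumptions, not scikit-llm's actual internals):

```python
# Hypothetical sketch: inspecting a fitted estimator's stored state.
# `DummyEstimator` stands in for DynamicFewShotGPTClassifier; the real
# classifier's attribute names are undocumented here and may differ.
class DummyEstimator:
    def fit(self, X, y):
        # A real classifier would call the embedding API here; we fake
        # the vectors (one number per text) to keep the sketch offline.
        self.embeddings_ = [[float(len(x))] for x in X]
        self.classes_ = sorted(set(y))
        return self

clf = DummyEstimator().fit(["spam text", "ham"], ["spam", "ham"])

# Generic introspection: list every attribute the estimator stored.
for name, value in vars(clf).items():
    print(name, "->", value)
```

The same `vars(clf)` call works on any fitted scikit-learn-style estimator, which is one way to locate where embeddings actually live before trying to export them.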

@KennyNg-19
Author

KennyNg-19 commented Sep 12, 2023

Hi, @iryna-kondr
Since the whole dataset is embedded before the few-shot classifier runs, those embeddings, if cached locally, could be reused in downstream tasks after the classification task, such as semantic search or similarity comparison.

If we cannot reuse the embeddings generated here, the embedding functions (especially paid API services) will have to be called again, which increases cost.
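The cost concern above can be addressed outside the library with a simple disk-backed cache around whatever embedding call is used. This is an illustrative sketch, not a scikit-llm API: `cached_embed` and `embed_fn` are hypothetical names, and `embed_fn` stands in for any paid embedding service that maps a list of strings to a list of vectors.

```python
import hashlib
import json
import os

def cached_embed(texts, embed_fn, cache_path="embedding_cache.json"):
    """Call `embed_fn` only for texts not already in the local cache.

    Hypothetical helper: `embed_fn` is any callable mapping a list of
    strings to a list of vectors (e.g. a paid embedding API).
    """
    cache = {}
    if os.path.exists(cache_path):
        with open(cache_path) as f:
            cache = json.load(f)

    # Hash each text so long documents make short, stable cache keys.
    keys = [hashlib.sha256(t.encode()).hexdigest() for t in texts]
    missing = [t for t, k in zip(texts, keys) if k not in cache]

    if missing:  # only pay for texts we have not embedded before
        for t, vec in zip(missing, embed_fn(missing)):
            cache[hashlib.sha256(t.encode()).hexdigest()] = vec
        with open(cache_path, "w") as f:
            json.dump(cache, f)

    return [cache[k] for k in keys]
```

On a second run over overlapping data, only the unseen texts trigger an API call; everything else is served from the JSON file on disk.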

@math-sasso

I am having the same problem. I don't want to recreate the embeddings on every request. I want to do it once and reuse both the embeddings and the fitted classifier for future calls in my system.

@AndreasKarasenko
Contributor

AndreasKarasenko commented May 7, 2024

One additional point to consider: if we rerun experiments at a later date, it would be nice to simply point to pre-existing embeddings instead of re-embedding the texts. So, same exact task, same exact data.
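The "same exact data" condition above can be checked mechanically by fingerprinting the dataset and keying saved embeddings by that fingerprint, so a rerun reuses them only when the texts are byte-for-byte identical. A minimal sketch, assuming a stand-in `embed_fn` (the function names here are illustrative, not a scikit-llm API):

```python
import hashlib
import os
import pickle

def dataset_fingerprint(texts):
    """Stable hash of an ordered list of texts: same data -> same key."""
    h = hashlib.sha256()
    for t in texts:
        h.update(t.encode())
        h.update(b"\x00")  # separator so ["ab"] != ["a", "b"]
    return h.hexdigest()

def embeddings_for_run(texts, embed_fn, store_dir="runs"):
    """Reuse embeddings from a previous run over the exact same data.

    Illustrative sketch: `embed_fn` stands in for whatever embedding
    call the classifier would make.
    """
    os.makedirs(store_dir, exist_ok=True)
    path = os.path.join(store_dir, dataset_fingerprint(texts) + ".pkl")
    if os.path.exists(path):  # identical data seen before: no API call
        with open(path, "rb") as f:
            return pickle.load(f)
    vectors = embed_fn(texts)
    with open(path, "wb") as f:
        pickle.dump(vectors, f)
    return vectors
```

If even one text changes, the fingerprint changes and the data is re-embedded, which avoids silently reusing stale vectors.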

@iryna-kondr is this something you might consider implementing?

@iryna-kondr
Owner

Hi, @AndreasKarasenko. You can pickle the estimator (with embeddings) and then load it at a later date. See our discussion here: https://discord.com/channels/1112768381406425138/1125476385750782012/1125478710427009044
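The pickling pattern the maintainer describes is standard Python serialization. The sketch below demonstrates it with a `FittedModel` stand-in (a hypothetical class, since running the real classifier requires an API key); the same `pickle.dump`/`pickle.load` pair applies to a fitted scikit-llm estimator.

```python
import pickle

# Minimal stand-in for a fitted estimator. The same dump/load calls
# work for a real fitted estimator object such as
# DynamicFewShotGPTClassifier (per the maintainer's suggestion).
class FittedModel:
    def __init__(self, embeddings):
        self.embeddings_ = embeddings  # illustrative attribute name

model = FittedModel([[0.1, 0.2], [0.3, 0.4]])

# Save the whole estimator, embeddings included.
with open("model.pkl", "wb") as f:
    pickle.dump(model, f)

# Later (or in another process): load and reuse without re-embedding.
with open("model.pkl", "rb") as f:
    restored = pickle.load(f)
```

One caveat worth noting: unpickling requires the same class definitions (and so the same library versions) to be importable at load time.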

@AndreasKarasenko
Contributor

Thanks for the info! Based on that, I figured out a way to get the data and embedding lists so I can store them locally. I think this issue can be closed now?
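For completeness, once the data and embedding lists have been pulled out of the estimator, storing them side by side keeps them aligned. A minimal sketch with made-up values (the thread does not specify the actual extraction code or file format):

```python
import json

# Hypothetical lists extracted from a fitted estimator's state
# (attribute names and values here are purely illustrative).
texts = ["first document", "second document"]
vectors = [[0.12, -0.05], [0.33, 0.41]]

# Store the parallel lists together so they cannot drift apart.
with open("embeddings.json", "w") as f:
    json.dump({"texts": texts, "vectors": vectors}, f)

# Reload later and rebuild a text -> vector lookup.
with open("embeddings.json") as f:
    data = json.load(f)
lookup = dict(zip(data["texts"], data["vectors"]))
```

JSON keeps the file human-readable; for large datasets a binary format (e.g. NumPy's `.npy`) would be more compact.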
