Tuning of model and dataset retrievers #314

Open
neubig opened this issue Sep 1, 2023 · 1 comment
Labels
enhancement New feature or request

Comments

@neubig
Collaborator

neubig commented Sep 1, 2023

Our model and dataset retrievers are currently imperfect, and it would be good to have a way to improve them.

One way to do so is to explicitly train the model/dataset retrievers as follows:

  1. Retrieve multiple datasets (models) and run the prompt2model pipeline with each of them
  2. Take the resulting accuracy scores and train the retriever so that it gives higher scores to datasets (models) that yield higher accuracy for the full pipeline

This would result in a training objective that explicitly rewards retrieving datasets (models) that give high accuracy.
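
To make the objective above concrete, here is a minimal sketch of one possible formulation: a pairwise (RankNet-style) ranking loss that pushes the retriever's score for candidate `i` above its score for candidate `j` whenever the pipeline accuracy for `i` was higher. This is an assumption about how the objective could be set up, not existing prompt2model code. The function names are hypothetical, the accuracies are made-up placeholder values, and the scores are optimized directly as free parameters to stand in for a real retriever's weights.

```python
import math


def pairwise_ranking_loss(scores, accuracies):
    """Mean of log(1 + exp(-(s_i - s_j))) over all pairs where
    candidate i had higher downstream accuracy than candidate j."""
    loss, n_pairs = 0.0, 0
    for i in range(len(scores)):
        for j in range(len(scores)):
            if accuracies[i] > accuracies[j]:
                loss += math.log1p(math.exp(-(scores[i] - scores[j])))
                n_pairs += 1
    return loss / n_pairs if n_pairs else 0.0


def loss_gradient(scores, accuracies):
    """Gradient of pairwise_ranking_loss with respect to each score."""
    grad = [0.0] * len(scores)
    n_pairs = 0
    for i in range(len(scores)):
        for j in range(len(scores)):
            if accuracies[i] > accuracies[j]:
                # sigmoid(-(s_i - s_j)): large when the pair is mis-ordered
                weight = 1.0 / (1.0 + math.exp(scores[i] - scores[j]))
                grad[i] -= weight
                grad[j] += weight
                n_pairs += 1
    return [g / n_pairs for g in grad] if n_pairs else grad


def train_scores(accuracies, steps=200, lr=0.5):
    """Plain gradient descent on free per-candidate scores.
    In a real retriever these scores would come from model parameters."""
    scores = [0.0] * len(accuracies)
    for _ in range(steps):
        grad = loss_gradient(scores, accuracies)
        scores = [s - lr * g for s, g in zip(scores, grad)]
    return scores


# Placeholder accuracies for three retrieved candidates (illustrative only,
# not measured results).
example_accuracies = [0.62, 0.81, 0.40]

initial_loss = pairwise_ranking_loss([0.0, 0.0, 0.0], example_accuracies)
learned_scores = train_scores(example_accuracies)
final_loss = pairwise_ranking_loss(learned_scores, example_accuracies)

# Candidate indices sorted by learned retriever score, best first.
ranking = sorted(range(len(learned_scores)),
                 key=lambda k: learned_scores[k], reverse=True)
print(initial_loss, final_loss, ranking)
```

A pairwise loss only depends on the relative order of the accuracies, which may be convenient if accuracy scales differ across tasks; a listwise loss over the full set of retrieved candidates would be another reasonable choice.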

This would also be helpful for #285, as it would reduce the need for human intervention when selecting models.

@neubig neubig added the enhancement New feature or request label Sep 1, 2023
@zhaochenyang20
Collaborator

Also, here is something related:

https://github.com/stanfordnlp/dspy

Vijay and I actually thought about using an LLM to automatically select columns and datasets, but doing this just by prompting a raw LLM seemed impractical. Now, with DSPy, it seems that we can achieve this.
