
[Chatllama] Merge the datasets to create more insightful training data #321

Open · 3 tasks
PierpaoloSorbellini opened this issue Mar 31, 2023 · 2 comments
Labels: chatllama (Issue related to the ChatLLaMA module), good first issue (Good for newcomers)

Comments

@PierpaoloSorbellini (Collaborator)

Description

Currently, the supported datasets can only be used as alternatives to one another.
It would be nice to add diversity to the training data by defining recipes that merge these datasets and produce more insightful training runs.

TODO

  • Define what parameters need to be specified to create a "recipe" for the dataset, and add them to the config files.
  • Expand the dataset class to allow parameters from the config file to generate the appropriate dataset mixture.
  • Evaluate the possible increase in model quality due to different "recipes" used.
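One way to sketch the first two TODO items: express a recipe as per-dataset sampling weights in the config file, and have the dataset class sample from each source proportionally. The sketch below is a hypothetical illustration, not the ChatLLaMA API — `DatasetMixture`, the config keys, and the dataset names are all assumptions.

```python
import random
from typing import Dict, List


class DatasetMixture:
    """Merge several datasets according to per-dataset weights (a "recipe").

    Hypothetical sketch: `recipe` maps dataset names to sampling weights,
    `sources` maps the same names to lists of training examples.
    """

    def __init__(self, sources: Dict[str, List[dict]],
                 recipe: Dict[str, float], seed: int = 42):
        # Keep only datasets with a positive weight in the recipe.
        self.sources = {name: data for name, data in sources.items()
                        if recipe.get(name, 0) > 0}
        # Normalize weights so they sum to 1.
        total = sum(recipe[name] for name in self.sources)
        self.weights = {name: recipe[name] / total for name in self.sources}
        self.rng = random.Random(seed)

    def sample(self, n: int) -> List[dict]:
        """Draw n examples, choosing the source dataset by recipe weight."""
        names = list(self.sources)
        probs = [self.weights[name] for name in names]
        out = []
        for _ in range(n):
            name = self.rng.choices(names, weights=probs, k=1)[0]
            out.append(self.rng.choice(self.sources[name]))
        return out


# A recipe like this could live in the YAML config (keys are hypothetical):
# dataset_recipe:
#   dataset_a: 0.5
#   dataset_b: 0.3
#   dataset_c: 0.2
sources = {
    "dataset_a": [{"text": f"a-{i}"} for i in range(100)],
    "dataset_b": [{"text": f"b-{i}"} for i in range(100)],
    "dataset_c": [{"text": f"c-{i}"} for i in range(100)],
}
recipe = {"dataset_a": 0.5, "dataset_b": 0.3, "dataset_c": 0.2}

mixture = DatasetMixture(sources, recipe)
batch = mixture.sample(10)
```

With weights read from the config, evaluating different "recipes" (the third TODO item) reduces to training runs that differ only in that one config section.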
@PierpaoloSorbellini PierpaoloSorbellini added good first issue Good for newcomers chatllama Issue related to the ChatLLaMA module labels Mar 31, 2023
@mohsinmahmood12

Hi there! I want to work on this issue.

@diegofiori (Collaborator)

Hello @mohsinmahmood12, thanks a lot for your interest in ChatLLaMA. I've assigned the issue to you! Let us know if you face any difficulties with the task 😄
