Recommend the dataset #22

Closed
xihuai18 opened this issue Mar 31, 2023 · 2 comments
xihuai18 commented Mar 31, 2023

  1. The dataset used in SFT by ColossalAI: https://github.com/XueFuzhao/InstructionWild
  2. A summary of available datasets: https://zhuanlan.zhihu.com/p/615277009

xihuai18 changed the title from "Recommend the dataset https://github.com/XueFuzhao/InstructionWild" to "Recommend the dataset" on Mar 31, 2023

PhoebusSi (Owner) commented

Thank you for the reminder. We will incorporate these datasets soon.

xihuai18 commented Apr 1, 2023

Would it be possible to crawl data from ShareGPT (as was done for Google Bard and Vicuna)? I think the conversations shared by real ChatGPT users are of very high quality.