Add JSON as new finetuning dataset #26

Open
CBvanYperen opened this issue Jun 17, 2020 · 3 comments

Comments

@CBvanYperen

Hi,

The documentation refers to a tutorial on how a new TFDS dataset can be created.

I am trying to create and use my own finetuning dataset. Currently I have a JSON file with text strings and corresponding summary strings that I would like to use for finetuning. Honestly, the tutorial was of very little help to me, as I found it very complicated. I was also not able to find any other informative sources on TFDS dataset creation, so I would really appreciate it if you could provide some instructions. I understand it might be a lot to ask, but any help would be very welcome!

@JingqingZ
Collaborator

Might the solution mentioned here, #21 (comment), help?

@CBvanYperen
Author

Thanks! That certainly helped to get the test dataset up and running. However, I would still like to make and implement a finetuning dataset. Is it possible to use a .tfrecord file for the training dataset as well?

If not, could I use the DatasetBuilder (as described in the tutorial) to transform a .tfrecord file to a TFDS dataset?

@JingqingZ
Collaborator

Yes, a new dataset in .tfrecord for fine-tuning should work.
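
For anyone landing here with the same JSON setup, a minimal sketch of the conversion might look like the following. It rests on several assumptions that are not confirmed in this thread: the JSON file is taken to be an array of objects with hypothetical keys `"text"` and `"summary"`, and the record feature names `"inputs"` and `"targets"` are assumed to be what the tfrecord reader expects, so please check both against your data and the repository's dataset code. The helper names `load_pairs` and `write_tfrecord` are made up for illustration.

```python
import json


def load_pairs(json_text):
    """Parse a JSON array of {"text": ..., "summary": ...} objects.

    The key names are an assumption about the JSON layout; adjust as needed.
    Returns a list of (text, summary) tuples, skipping empty examples.
    """
    records = json.loads(json_text)
    pairs = []
    for record in records:
        text = record["text"].strip()
        summary = record["summary"].strip()
        if text and summary:
            pairs.append((text, summary))
    return pairs


def write_tfrecord(pairs, path):
    """Write (text, summary) pairs to a .tfrecord file as tf.train.Example records."""
    # Imported here so that load_pairs can be used without TensorFlow installed.
    import tensorflow as tf

    def bytes_feature(value):
        # Strings are stored as UTF-8 encoded bytes.
        return tf.train.Feature(
            bytes_list=tf.train.BytesList(value=[value.encode("utf-8")])
        )

    with tf.io.TFRecordWriter(path) as writer:
        for text, summary in pairs:
            example = tf.train.Example(
                features=tf.train.Features(
                    feature={
                        # Feature names are assumed, not confirmed in this thread.
                        "inputs": bytes_feature(text),
                        "targets": bytes_feature(summary),
                    }
                )
            )
            writer.write(example.SerializeToString())
```

A possible way to use it, with a hypothetical output file name:

```python
sample = '[{"text": "A long article body.", "summary": "A short summary."}]'
pairs = load_pairs(sample)
# write_tfrecord(pairs, "train.tfrecord")  # requires TensorFlow
```

The same function could be run separately on the training, validation, and test splits to produce one file per split.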
