It would be very useful to have the training data produced by the `generate_data_parallel.py` script available to download, for both the pile and packed cases.
I appreciate this may be a large amount of data, and therefore difficult to host, so there is no expectation of course!
But it would avoid people needing to run the costly data generation process locally in order to experiment with the training.
Hey, quick heads up. The links in the README table are mismatched: the pile links lead to the packed data, and vice versa. There is also a small typo in the word "this" just before it. Both should be easy to fix! :)