Create the license-compliant version of the Pile: PubMed Central #74
Labels:
data catalog: Gathering data from data sources
language modeling script: Need Language Modeling loading script
wontfix: This will not be worked on
PubMed Central is a subset of The Pile. It was downloaded in a license-compliant fashion.