Releases: masanorihirano/llm-japanese-dataset
1.0.3
- dropped NaN inputs and outputs in alt
CC-BY-SA 4.0 version has 9,074,340 lines
MIT version has 1,481,970 lines
This release is automatically generated.
Please see the pull request for more details.
#89
1.0.2
In Wikipedia summary:
- removed samples with blank output
- updated the Wikipedia version to 20240101
CC-BY-SA 4.0 version has 9,074,350 lines
MIT version has 1,481,980 lines
This release is automatically generated.
Please see the pull request for more details.
#85
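As a quick check on the 1.0.2 and 1.0.3 counts listed above, the following sketch computes the per-license change in line count between the two releases. The numbers are copied from the release notes; the dictionary layout and the `line_delta` helper are illustrative assumptions and not part of the dataset tooling.

```python
# Line counts copied from the 1.0.2 and 1.0.3 release notes.
# The structure and helper below are hypothetical, for illustration only.
line_counts = {
    "1.0.2": {"CC-BY-SA 4.0": 9_074_350, "MIT": 1_481_980},
    "1.0.3": {"CC-BY-SA 4.0": 9_074_340, "MIT": 1_481_970},
}


def line_delta(old, new):
    """Return new-minus-old line counts for each license variant."""
    return {
        name: line_counts[new][name] - line_counts[old][name]
        for name in line_counts[old]
    }


delta = line_delta("1.0.2", "1.0.3")
for name, diff in delta.items():
    print(f"{name}: {diff:+,} lines")
```

Both variants shrink by the same amount between these two releases, which is consistent with the same samples being removed from each.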
vanilla-1.0.2
In Wikipedia summary:
- removed samples with blank output
- updated the Wikipedia version to 20240101
CC-BY-SA 4.0 version has 2,492,588 lines
MIT version has 296,422 lines
This release is automatically generated.
Please see the pull request for more details.
#86
1.0.1
- dropped alpaca
This release is automatically generated.
Please see the pull request for more details.
#79
vanilla-1.0.1
CC-BY-SA 4.0 version has 2,463,624 lines
MIT version has 296,422 lines
This release is automatically generated.
Please see the pull request for more details.
#78
1.0.0
Added:
- jqac
- wikipedia ja typo corpus
CC-BY-SA 4.0 version has 9,097,388 lines
MIT version has 1,533,982 lines
This release is automatically generated.
Please see the pull request for more details.
#72
0.1.1
Bug fix in JESC datasets.
CC-BY-SA 4.0 version has 1,811,964 lines
MIT version has 348,424 lines
This release is automatically generated.
Please see the pull request for more details.
#67
vanilla-1.0.0
Added:
- jqac
- wikipedia ja typo corpus
CC-BY-SA 4.0 version has 2,515,626 lines
MIT version has 348,424 lines
This release is automatically generated.
Please see the pull request for more details.
#69
vanilla-0.1.0
CC-BY-SA 4.0 version has 1,811,964 lines
MIT version has 348,424 lines
This release is automatically generated.
Please see the pull request for more details.
#65
0.1.0
CC-BY-SA 4.0 version has 8,393,726 lines
MIT version has 1,533,982 lines
This release is automatically generated.
Please see the pull request for more details.
#49