Release v0.1.1

@fracpete fracpete released this 15 Feb 03:45
· 67 commits to main since this release
  • added classification domain
  • added from-jsonlines-cl reader and to-jsonlines-cl writer for classification data in JSON lines format
  • added filter pretrain-sentences-to-classification to turn pretrain data into classification data (with a predefined label)
  • added filter classification-label-map that can generate a label string/int map
  • the to-llama2-format filter now has the --skip_tokens option to leave out the [INST] [/INST] tokens
  • added from-parquet-cl reader and to-parquet-cl writer for classification data in Parquet database format
  • added from-csv-cl/from-tsv-cl readers and to-csv-cl/to-tsv-cl writers for classification data in CSV/TSV file format
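
To illustrate what a label string/int map for classification data in JSON lines format might look like, here is a minimal Python sketch. It is not the library's implementation: the function name `build_label_map` and the field names `text` and `label` are assumptions made for this example only.

```python
import json


def build_label_map(jsonl_text, label_field="label"):
    """Collect the distinct labels in JSON lines text and assign each an integer.

    Hypothetical helper: the field name "label" is an assumption,
    not necessarily what the tool itself reads or writes.
    """
    labels = set()
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        labels.add(record[label_field])
    # Sort the labels so the mapping is deterministic across runs.
    return {label: index for index, label in enumerate(sorted(labels))}


# Three made-up classification records, one JSON object per line.
sample = "\n".join([
    json.dumps({"text": "great product", "label": "positive"}),
    json.dumps({"text": "arrived broken", "label": "negative"}),
    json.dumps({"text": "it is okay", "label": "neutral"}),
])

label_map = build_label_map(sample)
print(label_map)
```

Sorting the labels before numbering them keeps the map stable regardless of the order in which records appear in the file.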