Commit

feat(corprep): add new file for tokenizing dataset
entelecheia committed Jul 17, 2023
1 parent 2db1824 commit 6fb827e
Showing 1 changed file with 11 additions and 0 deletions.
11 changes: 11 additions & 0 deletions src/corprep/conf/pipe/tokenize_dataset.yaml
@@ -0,0 +1,11 @@
defaults:
  - __general_external_funcs__

run: corprep.datasets.preprocessing.tokenize.tokenize_dataset
run_with:
  num_proc: 1
  batched: true
  text_col: bodyText
verbose: false
use_pipe_obj: true
return_pipe_obj: false
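
As a rough illustration of how a config of this shape might be consumed, the sketch below resolves a dotted path like the one in `run` to a callable and passes the `run_with` entries as keyword arguments. This is an assumption about the general pattern, not the actual corprep or pipeline implementation: `resolve` and `tokenize_batch` are hypothetical helpers, whitespace splitting stands in for the real tokenizer, and `json.dumps` stands in for the `run` target so the example runs without corprep installed.

```python
import json
from importlib import import_module
from typing import Any, Callable, Dict, List


def resolve(dotted_path: str) -> Callable[..., Any]:
    """Import the object named by a dotted path such as 'pkg.module.func'."""
    module_name, _, attr = dotted_path.rpartition(".")
    return getattr(import_module(module_name), attr)


def tokenize_batch(
    batch: Dict[str, List[str]], text_col: str
) -> Dict[str, List[List[str]]]:
    """Hypothetical batched tokenizer: whitespace-split every text in the batch."""
    return {"tokens": [text.split() for text in batch[text_col]]}


# Same shape as the YAML config: a dotted path plus keyword arguments.
# json.dumps is only a stand-in target (assumption for this sketch).
config = {"run": "json.dumps", "run_with": {"sort_keys": True}}
func = resolve(config["run"])
dumped = func({"b": 1, "a": 2}, **config["run_with"])

# With batched: true, the function receives columns as lists of values,
# and text_col selects which column to tokenize.
batch = {"bodyText": ["Hello world", "tokenize this text"]}
tokens = tokenize_batch(batch, text_col="bodyText")
```

Keeping the callable's path and its arguments in the config means a different preprocessing function or text column can be swapped in without touching the calling code.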
