A hub of OpusFilter configurations
This repository provides a collection of OpusFilter configurations and welcomes contributions from users of the software. The purpose is to enable reuse of tested data filtering pipelines and to improve the replicability of experiments and published results.
Feel free to create pull requests to add your own configuration files. The repository is organized by category, and each configuration file should be accompanied by a short description in Markdown that explains its use. You can also use that description page to advertise the paper, experiment, or model for which you used OpusFilter with that configuration.
Current categories:
- publications: work related to published papers
- corpora: configurations for specific corpora and data sets
- models: configuration files used to train specific models
- examples: miscellaneous configuration examples