Issues: hplt-project/OpusCleaner
Issues list
Laser filter error: ValueError: could not convert string to float: b''
#156
opened Apr 17, 2024 by
eu9ene
One universal filter configuration to run on all datasets sequentially
enhancement
New feature or request
#151
opened Jan 22, 2024 by
mzeidhassan
Using the fix-quotes filter, and viewing the changes, makes it look as though the file has been replaced.
#149
opened Jan 19, 2024 by
bhaddow
Automatically derive filters based on a clean sample provided by the user.
#148
opened Jan 17, 2024 by
PinzhenChen
Add a separator between selected filters and the filter pool
enhancement
New feature or request
#147
opened Jan 17, 2024 by
PinzhenChen
Should OpusCleaner have the notion of a "project"?
enhancement
New feature or request
#146
opened Jan 12, 2024 by
bhaddow
Fails to install requirements-all.txt on Python 3.10
bug
Something isn't working
good first issue
Good for newcomers
help wanted
Extra attention is needed
#145
opened Jan 12, 2024 by
bhaddow
There should be sensible defaults for filters wherever possible
enhancement
New feature or request
good first issue
Good for newcomers
help wanted
Extra attention is needed
#144
opened Jan 10, 2024 by
bhaddow
Using the "detokenizer" filter rule gives an error
bug
Something isn't working
component:filter
Related to or suggestion for a new filter block
#143
opened Jan 10, 2024 by
bhaddow
The configuration of data searching and downloading directories is not linked
bug
Something isn't working
#142
opened Jan 8, 2024 by
bhaddow
Support monolingual datasets
enhancement
New feature or request
help wanted
Extra attention is needed
#141
opened Jan 7, 2024 by
jelmervdl
4 tasks
Cutting off internet during download leaves the download in a broken state.
#134
opened Nov 15, 2023 by
gregtatum
num_mismatch discards some useful entries
bug
Something isn't working
#132
opened Nov 15, 2023 by
gregtatum
Refactor filters as transformers & scorers
enhancement
New feature or request
#130
opened Oct 31, 2023 by
jelmervdl
Configure the diff view to select diffs between different steps
#129
opened Oct 31, 2023 by
jindrahelcl
Tooltip that says which filter did it
enhancement
New feature or request
#127
opened Oct 30, 2023 by
jindrahelcl
Show overlap scores when downloading datasets
component:ui
Issues related to the interface
enhancement
New feature or request
#121
opened Oct 2, 2023 by
jelmervdl