Rename PretokenizationPipeline and PosttokenizationPipeline #244
Comments
I don't like the names ending with I think it's best to remove PosttokenizationPipeline and PretokenizationPipeline altogether and move all the logic into Field and MultioutputField. Right now, the bigger issue is that MultioutputField is not a subclass of Field, so we are duplicating some logic.
True, I would also like MultioutputField to be somehow unified with Field (perhaps not directly, as MultioutputField contains Fields). I believe we had some offline discussion on this, and the way forward is to separate the hooks out (and potentially rename them; the name is still undecided), and leave Field and MultioutputField as is for the time being, as they require a larger overhaul.
I don't think
Maybe simplify
Why not just do the Python thing and drop all suffixes, going with
Actually, I think the naming is not that relevant in this case. Both PosttokenizationPipeline and PretokenizationPipeline are only used internally by Field, so keeping them private is the way to go imo.
I'd agree with @mariosasko's previous comment, as they are only used internally. We can prefix them with an underscore and be OK with this.
PosttokenizationPipeline and PretokenizationPipeline sound too similar to the pipeline.py module and should be renamed to relate more clearly to hooks.
Suggestions:
Further, the hook code could be extracted to a separate module and unified under a general HookQueue interface that defines the add_hook and clear methods, so that they are not copied in each class.
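A minimal sketch of what such a shared interface might look like. Only the names HookQueue, add_hook and clear come from this issue; the subclass names, the `__call__` methods and the hook signatures (a pretokenization hook taking and returning the raw string, a posttokenization hook taking and returning a `(raw, tokens)` pair) are assumptions made for illustration, not the project's actual API.

```python
from typing import Any, Callable, List, Tuple


class HookQueue:
    """Hypothetical base class that owns hook storage, add_hook and clear."""

    def __init__(self) -> None:
        self._hooks: List[Callable[..., Any]] = []

    def add_hook(self, hook: Callable[..., Any]) -> None:
        # Hooks run in the order in which they were added.
        self._hooks.append(hook)

    def clear(self) -> None:
        # Remove every registered hook.
        self._hooks.clear()

    def __len__(self) -> int:
        return len(self._hooks)


class PretokenizationHooks(HookQueue):
    """Assumed name: applies each hook to the raw string in turn."""

    def __call__(self, raw: str) -> str:
        for hook in self._hooks:
            raw = hook(raw)
        return raw


class PosttokenizationHooks(HookQueue):
    """Assumed name: applies each hook to the (raw, tokens) pair in turn."""

    def __call__(self, raw: str, tokens: List[str]) -> Tuple[str, List[str]]:
        for hook in self._hooks:
            raw, tokens = hook(raw, tokens)
        return raw, tokens
```

With this layout, the two queues differ only in how they invoke their hooks, so add_hook and clear are written once:

```python
pre = PretokenizationHooks()
pre.add_hook(str.strip)
pre.add_hook(str.lower)
cleaned = pre("  Hello World ")  # hooks applied in insertion order
pre.clear()                      # the queue is empty again
```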