NLPC is a project that contains solutions to NLP competitions and a shared pipeline that all of them use.
You can find the .ipynb notebooks in notebooks/.
At the moment we have solutions of:
- Plenty of useful interfaces based on clean architecture principles, for example:
  - ModelWithTransformer: easily load weights from a pretrained file or the HuggingFace servers, reset weights, etc.
  - PseudoLabeler: easily predict on test data and add pseudo labels to the train set
  - Container: download data and pass it to Datasets
  - Submitter: make a submission in one line of code
  - DataProcessor
  - Others
- Convenient components to train and use a model:
  - Trainer: supports checkpoints, evaluations, and progress bars
  - WeightsUpdater: updates weights with a built-in GradScaler, AMP, etc.
  - ModelManager: an object composed of a torch model and a data processor; it makes fitting, submitting, and other operations agnostic to many PyTorch features
  - Others
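
To make the pseudo-labeling idea behind PseudoLabeler concrete, here is a minimal sketch in plain Python. It is not NLPC's actual API: the function `add_pseudo_labels`, the `threshold` parameter, and the toy `toy_predict_proba` model are hypothetical names chosen for illustration. The sketch assumes a classifier that returns per-class probabilities, and keeps only test examples whose top predicted class is confident enough.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical type for illustration; NLPC's real PseudoLabeler may differ.
Example = Tuple[str, int]  # (text, label)


def add_pseudo_labels(
    train: List[Example],
    test_texts: Sequence[str],
    predict_proba: Callable[[str], Sequence[float]],
    threshold: float = 0.9,
) -> List[Example]:
    """Return train plus test texts whose top class probability is >= threshold."""
    augmented = list(train)  # copy so the caller's list is not mutated
    for text in test_texts:
        probs = predict_proba(text)
        # Pick the most probable class index and check its confidence.
        label = max(range(len(probs)), key=probs.__getitem__)
        if probs[label] >= threshold:
            augmented.append((text, label))
    return augmented


def toy_predict_proba(text: str) -> List[float]:
    """Stand-in for a trained model: confident only when a keyword is present."""
    if "great" in text:
        return [0.05, 0.95]
    if "awful" in text:
        return [0.97, 0.03]
    return [0.5, 0.5]


train_set = [("loved it", 1), ("hated it", 0)]
test_set = ["great movie", "awful plot", "it was a film"]
augmented_set = add_pseudo_labels(train_set, test_set, toy_predict_proba)
```

In this sketch the uncertain example ("it was a film") is left out, so only confident predictions join the train set. A confidence threshold is one common choice; the real component may select pseudo labels differently.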