A complete reusable pipeline for text classification using different huggingface models, Weights and Biases, and Hydra

seanbenhur/resusable_text_classification_template

🔥A Reusable Template for Text Classification 🔥

This is a complete, reusable template for text classification, powered by Hydra, that lets you build state-of-the-art models with little effort. Currently the template supports transformer models and transformer architectures with different pooling strategies. You can use Hydra to configure the arguments.
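To make "pooling strategies" concrete, here is a minimal, framework-free sketch of three common ways to reduce per-token embeddings to a single sentence vector. The function names and the `pool` dispatcher are hypothetical and are not taken from this template's code; the template's actual pooling layers may differ.

```python
from typing import List

Vector = List[float]


def _real_tokens(token_embeddings: List[Vector], attention_mask: List[int]) -> List[Vector]:
    """Keep only embeddings whose mask value is 1 (that is, drop padding)."""
    kept = [vec for vec, m in zip(token_embeddings, attention_mask) if m == 1]
    if not kept:
        raise ValueError("attention_mask selects no tokens")
    return kept


def cls_pooling(token_embeddings: List[Vector]) -> Vector:
    """Use the first token's embedding (the [CLS] position in BERT-style models)."""
    return list(token_embeddings[0])


def mean_pooling(token_embeddings: List[Vector], attention_mask: List[int]) -> Vector:
    """Average the embeddings of the non-padding tokens, dimension by dimension."""
    kept = _real_tokens(token_embeddings, attention_mask)
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


def max_pooling(token_embeddings: List[Vector], attention_mask: List[int]) -> Vector:
    """Take the per-dimension maximum over the non-padding tokens."""
    kept = _real_tokens(token_embeddings, attention_mask)
    dim = len(kept[0])
    return [max(vec[i] for vec in kept) for i in range(dim)]


def pool(strategy: str, token_embeddings: List[Vector], attention_mask: List[int]) -> Vector:
    """Hypothetical dispatcher: pick a pooling strategy by name, as a config value might."""
    if strategy == "cls":
        return cls_pooling(token_embeddings)
    if strategy == "mean":
        return mean_pooling(token_embeddings, attention_mask)
    if strategy == "max":
        return max_pooling(token_embeddings, attention_mask)
    raise ValueError(f"unknown pooling strategy: {strategy}")


# Toy input: three tokens with 2-dimensional embeddings; the last token is padding.
embeddings = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]

for name in ("cls", "mean", "max"):
    print(name, pool(name, embeddings, mask))
```

In a real model the same reductions would be applied to the transformer's last hidden state as tensor operations; plain lists are used here only to keep the sketch self-contained.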

🙌Using this template in your project

  • First, click the Use this template button and create your repo.
  • Clone the repo to your local machine, for example: git clone https://github.com/seanbenhur/resusable_text_classification_template.git
  • Open a terminal and navigate to the src directory.
  • Run python main.py to start training.
  • Since this project uses Hydra, it supports multirun; refer to the Hydra documentation for more information.
  • After training the model, run python inference.py to run batch prediction on the test dataset.
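As a rough illustration of what multirun means: when Hydra is launched with `--multirun` (or `-m`) and an override lists several comma-separated values, its default sweeper runs one job per combination of values. The sketch below mimics that expansion in plain Python. It is not Hydra's implementation, and the config keys (`model.name`, `training.lr`) are assumptions made for illustration, not keys confirmed to exist in this template's `configs` directory.

```python
from itertools import product
from typing import Dict, List


def expand_sweep(overrides: List[str]) -> List[Dict[str, str]]:
    """Expand 'key=v1,v2' overrides into one dict per job (Cartesian product)."""
    keys: List[str] = []
    choices: List[List[str]] = []
    for item in overrides:
        key, _, values = item.partition("=")
        keys.append(key)
        choices.append(values.split(","))
    return [dict(zip(keys, combo)) for combo in product(*choices)]


# Hypothetical sweep, analogous to a command such as:
#   python main.py --multirun model.name=bert-base-uncased,roberta-base training.lr=1e-5,3e-5
jobs = expand_sweep([
    "model.name=bert-base-uncased,roberta-base",
    "training.lr=1e-5,3e-5",
])

for index, job in enumerate(jobs):
    print(index, job)
```

Two values for each of two keys gives four training runs, which is why sweeps grow quickly as more keys are added.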

🏅How this speeds up your development

I have been working on various text classification tasks for the past six months, across different datasets, in both personal projects and internship work. I found that most of the code I write can be reused across many datasets. Although libraries like Simple Transformers exist, I find them hard to configure. Since I have been using this template for my own projects, I believe it will also help you create robust models with an easy workflow :)
This serves as a template to quickly set these things up in your repo. Projects created from this template can easily be trained and deployed, and they are hassle-free and easy to debug.

⚙Files to edit for setting up the project

  • You can edit any of the files in this repo; since this is a template, it is fully configurable.
  • Edit the YAML configuration files in the `configs` directory.
  • Edit the .md files inside the folders to suit your needs.
  • Edit the LICENSE; you may need a different one.
  • Edit requirements.txt and requirements-extra.txt (optional).

🙋‍♂️I want to Contribute!!

  • This repo is open to contributions. There are many other methods beyond text classification; if you would like to add any other model, I would highly appreciate it!
  • If you find a bug, or if anything is wrong with the repo structure, design, or code, feel free to raise an issue.
  • Please refer to CONTRIBUTING.md for further details.

🙏Inspiration

  • I would like to thank rhitsigh for his awesome notebooks; I learned more about transformers from them.
