Feature/hf callback #903

Merged
ArshaanNazir merged 6 commits into release/1.9.0 on Nov 29, 2023

Conversation

@ArshaanNazir (Collaborator) commented Nov 29, 2023

Description

This PR adds a callback class for use when training transformers models. Transformers callbacks are objects that customize the behavior of the training loop in the PyTorch or Keras Trainer: they can inspect the training loop state and make decisions (such as early stopping) or perform actions (such as logging, saving, or evaluation). LangTest builds on this mechanism with a callback that runs its tests automatically during training. The callback is configured through the parameters listed under Usage and can be passed to the Trainer for any transformers model.
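To make the callback mechanism concrete, here is a minimal pure-Python sketch of the pattern. It does not use transformers or LangTest: `ToyState`, `ToyControl`, `ToyCallback`, and `toy_train` are hypothetical stand-ins, and only the hook names (`on_epoch_end`, `on_train_end`) and the `should_training_stop` flag are borrowed from the transformers callback interface.

```python
from dataclasses import dataclass, field


@dataclass
class ToyControl:
    """Flags a callback can set to steer the loop (hypothetical stand-in)."""
    should_training_stop: bool = False


@dataclass
class ToyState:
    """Loop state a callback can inspect (hypothetical stand-in)."""
    epoch: int = 0
    eval_loss: float = float("inf")
    log: list = field(default_factory=list)


class ToyCallback:
    """Base class: every hook is a no-op unless a subclass overrides it."""

    def on_epoch_end(self, state, control):
        pass

    def on_train_end(self, state, control):
        pass


class StopWhenGoodEnough(ToyCallback):
    """Makes a decision: request early stopping once the loss is low enough."""

    def __init__(self, threshold):
        self.threshold = threshold

    def on_epoch_end(self, state, control):
        if state.eval_loss <= self.threshold:
            control.should_training_stop = True


class LogEpochs(ToyCallback):
    """Performs an action: record one log line per epoch."""

    def on_epoch_end(self, state, control):
        state.log.append(f"epoch {state.epoch}: loss={state.eval_loss:.2f}")


def toy_train(losses, callbacks):
    """Stand-in training loop; `losses` fakes one evaluation loss per epoch."""
    state, control = ToyState(), ToyControl()
    for epoch, loss in enumerate(losses, start=1):
        state.epoch, state.eval_loss = epoch, loss
        for cb in callbacks:
            cb.on_epoch_end(state, control)
        if control.should_training_stop:
            break
    for cb in callbacks:
        cb.on_train_end(state, control)
    return state


final = toy_train([0.9, 0.4, 0.2, 0.1], [LogEpochs(), StopWhenGoodEnough(0.25)])
print(final.epoch)  # 3: the third epoch's loss (0.2) falls below the threshold
```

The loop itself stays unaware of what each callback does, which is why an automatic testing callback can be added without changing the training code.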

Usage

Create a callback instance in one line, then pass it to the Trainer through its callbacks argument:

from transformers import Trainer
from langtest.callback import LangTestCallback
my_callback = LangTestCallback(...)
trainer = Trainer(..., callbacks=[my_callback])

| Parameter | Description |
| --- | --- |
| task | Task for which the model is to be evaluated (text-classification or ner). |
| data | The data to be used for evaluation, given as a dictionary so that different data sources can be used. It accepts the following keys:<br>• data_source (mandatory): The source of the data.<br>• subset (optional): The subset of the data.<br>• feature_column (optional): The column containing the features.<br>• target_column (optional): The column containing the target labels.<br>• split (optional): The data split to be used.<br>• source (optional): Set to 'huggingface' when loading a Hugging Face dataset. |
| config | Configuration for the tests to be performed, specified in the form of a YAML file. |
| print_reports | A bool value that specifies whether the reports should be printed. |
| save_reports | A bool value that specifies whether the reports should be saved. |
| run_each_epoch | A bool value that specifies whether the tests should be run after each epoch or only at the end of training. |
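
As a rough illustration of how these parameters fit together, the sketch below assembles the keyword arguments in plain Python. The helper `check_data_arg` is hypothetical (it is not part of LangTest) and only mirrors the key list above; the dataset name, column names, and `config.yml` path are placeholder values chosen for the example.

```python
# Keys listed in the parameter description for the `data` dictionary.
ALLOWED_DATA_KEYS = {
    "data_source", "subset", "feature_column",
    "target_column", "split", "source",
}


def check_data_arg(data):
    """Hypothetical helper: verify a `data` dict against the documented keys."""
    if "data_source" not in data:
        raise ValueError("'data_source' is mandatory")
    unknown = set(data) - ALLOWED_DATA_KEYS
    if unknown:
        raise ValueError(f"unexpected keys: {sorted(unknown)}")
    return data


callback_kwargs = {
    "task": "text-classification",
    "data": check_data_arg({
        "data_source": "imdb",        # placeholder dataset name
        "split": "test",
        "feature_column": "text",     # placeholder column names
        "target_column": "label",
        "source": "huggingface",
    }),
    "config": "config.yml",           # placeholder path to the YAML test config
    "print_reports": True,
    "save_reports": True,
    "run_each_epoch": True,
}

# With langtest installed, these would be passed on as in the usage snippet:
# my_callback = LangTestCallback(**callback_kwargs)
```

Checking the dictionary before training starts surfaces a missing `data_source` immediately rather than at the first evaluation.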

Screenshots:

The report after each epoch is printed and saved if configured to do so.
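
The interaction between print_reports, save_reports, and run_each_epoch can be sketched as follows. This is not the code in langtest/callback.py: `ToyReportCallback`, `fake_tests`, the report contents, and the file naming are all assumptions made for illustration, and whether the real callback also reports at the end of training when run_each_epoch is set is not stated here.

```python
import json
import tempfile
from pathlib import Path


class ToyReportCallback:
    """Hypothetical stand-in showing when a report is produced and where it goes."""

    def __init__(self, run_tests, print_reports=False, save_reports=False,
                 run_each_epoch=False, out_dir="."):
        self.run_tests = run_tests
        self.print_reports = print_reports
        self.save_reports = save_reports
        self.run_each_epoch = run_each_epoch
        self.out_dir = Path(out_dir)
        self.saved = []  # paths of the reports written so far

    def _report(self, label):
        report = self.run_tests()
        if self.print_reports:
            print(label, report)
        if self.save_reports:
            path = self.out_dir / f"report_{label}.json"
            path.write_text(json.dumps(report))
            self.saved.append(path)

    def on_epoch_end(self, epoch):
        # Report after every epoch only when asked to.
        if self.run_each_epoch:
            self._report(f"epoch_{epoch}")

    def on_train_end(self):
        # Otherwise produce a single report once training is over.
        if not self.run_each_epoch:
            self._report("final")


def fake_tests():
    """Placeholder for a real test run; the numbers are made up."""
    return {"lowercase": {"pass_rate": 0.8}}


tmp_dir = tempfile.mkdtemp()
cb = ToyReportCallback(fake_tests, save_reports=True,
                       run_each_epoch=True, out_dir=tmp_dir)
for epoch in (1, 2):
    cb.on_epoch_end(epoch)
cb.on_train_end()
saved_names = [p.name for p in cb.saved]
print(saved_names)  # ['report_epoch_1.json', 'report_epoch_2.json']
```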

Notebook:
https://colab.research.google.com/github/JohnSnowLabs/langtest/blob/main/demo/tutorials/misc/HF_Callback_Text_Classification.ipynb

@ArshaanNazir (Collaborator, Author) commented

@alytarik add NB and screenshots in description

langtest/callback.py (review thread outdated, resolved)
@ArshaanNazir merged commit 81c9a74 into release/1.9.0 on Nov 29, 2023
3 checks passed
@ArshaanNazir linked an issue on Dec 1, 2023 that may be closed by this pull request
Successfully merging this pull request may close these issues.

Langtest callback to HF Trainer
3 participants