
progress bar added for training and validation, on screen logs structured #204


Open: wants to merge 5 commits into base `develop`


@shubsraj shubsraj commented May 8, 2025

…ured.

Description

A progress bar has been added for training and validation processes, allowing better monitoring of training.

  • progress bar using tqdm
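As an illustration of the approach described above, here is a minimal sketch of wrapping a per-epoch loop in a tqdm bar. The function name `train_one_epoch`, the `batches` argument, and the placeholder loss are assumptions for this example, not the code from this PR; only `tqdm(...)` and `set_postfix(...)` are actual tqdm calls.

```python
from tqdm import tqdm


def train_one_epoch(batches, epoch):
    """Hypothetical stand-in for the engine's per-epoch training loop."""
    running_loss = 0.0
    steps = 0
    # Wrapping the iterable is enough for tqdm to draw and update the bar.
    progress = tqdm(batches, desc=f"Epoch {epoch} [train]", unit="batch")
    for batch in progress:
        loss = sum(batch) / len(batch)  # placeholder for the real loss
        running_loss += loss
        steps += 1
        # Show the running average next to the bar instead of printing a line per step.
        progress.set_postfix(avg_loss=f"{running_loss / steps:.4f}")
    return running_loss / steps if steps else 0.0


average = train_one_epoch([[1.0, 3.0], [2.0, 4.0]], epoch=0)
```

The same wrapper could be applied to the validation loop with a different `desc`, so training and evaluation show up as separate, labelled bars.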

How has this change been tested, please provide a testcase or example of how you tested the change?


No special changes: I just created a progress bar using tqdm and added it to the main engine code, in the one-epoch training function and the evaluate function. This adds structure to the logs and makes them easier to read.

For example, documentation changes, usability, usage/costs, secrets, etc.

Docs

No change in the documentation
Screenshot 2025-05-08 232538


CLAassistant commented May 8, 2025

CLA assistant check
All committers have signed the CLA.

…passing issue when only evaluating any pretrained model.

nok commented May 25, 2025

Hi @shubsraj, I created a new PR, shubsraj#1, which mainly adds a config parameter progress_bar to toggle the progress bar. In addition, I changed the color of the progress bar (see below), and I successfully tested the progress bar with multiple GPUs.

Screenshot 2025-05-26 at 00 36 25
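A toggle like the `progress_bar` parameter could be wired up roughly as follows. This is a sketch under assumptions: the helper name `with_progress`, the `args` namespace, and the colour value are illustrative and not taken from the linked PR; `colour` is tqdm's own keyword for the bar colour.

```python
from types import SimpleNamespace


def with_progress(iterable, args, desc=""):
    """Wrap `iterable` in a tqdm bar only when `args.progress_bar` is truthy."""
    if not getattr(args, "progress_bar", False):
        return iterable  # plain iteration, logs stay as they were
    # Imported lazily so the disabled path does not need tqdm at all.
    from tqdm import tqdm
    return tqdm(iterable, desc=desc, colour="green")  # colour chosen arbitrarily


# Hypothetical config object standing in for the parsed arguments.
args = SimpleNamespace(progress_bar=False)
seen = [batch for batch in with_progress(range(3), args, desc="train")]
```

Keeping the switch in one helper means the training and evaluation loops call the same function and do not need their own `if` branches.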

Add config parameter `progress_bar`
start_steps = epoch * num_training_steps_per_epoch

print("Grad accum steps: ", args.grad_accum_steps)
print("Total batch size: ", batch_size * utils.get_world_size())
# print("Grad accum steps: ", args.grad_accum_steps)

I'm not sure how helpful these print statements are in the current context, so I left them unchanged. I'll leave it to a core maintainer to decide whether they should be removed or kept.

Author


The same info was printed before every epoch, so while adding the progress bar I moved it to the main code, where it is printed once before training starts.
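The move described here might look like the sketch below: the run-level values are logged once in the entry point, and the per-epoch function no longer prints them. The names `log_run_config` and `train`, and passing `world_size` as a plain argument instead of calling `utils.get_world_size()`, are assumptions made to keep the example self-contained.

```python
from types import SimpleNamespace


def log_run_config(args, batch_size, world_size):
    """Print values that do not change between epochs, and return the lines."""
    lines = [
        f"Grad accum steps: {args.grad_accum_steps}",
        f"Total batch size: {batch_size * world_size}",
    ]
    for line in lines:
        print(line)
    return lines


def train(args, batch_size, world_size, num_epochs):
    # Logged once here, before the loop, instead of inside every epoch.
    log_run_config(args, batch_size, world_size)
    completed = 0
    for _epoch in range(num_epochs):
        completed += 1  # placeholder for the per-epoch training call
    return completed


train(SimpleNamespace(grad_accum_steps=4), batch_size=2, world_size=8, num_epochs=3)
```

With a progress bar redrawing in place, repeated prints inside the epoch loop would break up the bar, which is one reason to hoist them.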

3 participants