mlflow logging integration with yolox training #1773

Open
wants to merge 11 commits into
base: main

Conversation

Im-Himanshu

This pull request integrates MLflow experiment logging into YOLOX training. It adds "mlflow" as an additional option for the -l / --logger argument.

It requires an environment file (.env) in the root folder of the project.

It also requires two additional dependencies, mlflow and python-dotenv. If they are missing and the logger is set to mlflow, an error is raised.
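
To make the described behaviour concrete, here is a minimal sketch of a logger option with a fail-fast dependency check. It is not the code from this pull request: the function names, the `choices` list, the `tensorboard` default, and the error message are assumptions made for illustration. The only firm detail is that python-dotenv is imported under the module name `dotenv`.

```python
import argparse
import importlib.util

# Assumption: import names of the two extra packages the description mentions.
# python-dotenv is imported as `dotenv`.
MLFLOW_EXTRAS = ("mlflow", "dotenv")


def make_parser():
    """Build a parser with a hypothetical -l/--logger option."""
    parser = argparse.ArgumentParser("logger option sketch")
    parser.add_argument(
        "-l", "--logger",
        default="tensorboard",  # assumed default, for illustration only
        choices=["tensorboard", "wandb", "mlflow"],
        help="backend used to log metrics",
    )
    return parser


def missing_modules(names):
    """Return the top-level module names that cannot be located."""
    return [n for n in names if importlib.util.find_spec(n) is None]


def check_logger_deps(logger_name, required=MLFLOW_EXTRAS):
    """Raise ImportError early if 'mlflow' was requested but extras are absent."""
    if logger_name != "mlflow":
        return
    missing = missing_modules(required)
    if missing:
        raise ImportError("logger 'mlflow' needs: " + ", ".join(missing))
```

Calling `check_logger_deps(args.logger)` right after argument parsing would surface a missing package before any training starts, while users of the other loggers never need the extra packages installed.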

@Im-Himanshu changed the title from "mlflow integration with yolox training" to "mlflow logging integration with yolox training" on May 15, 2024
@Im-Himanshu
Author

Im-Himanshu commented May 15, 2024

Tested logging to Databricks; the following logs are available for all runs.

Logged params

image

Logged metrics

image

Logged artifacts

image

@Im-Himanshu
Author

@FateScript Requesting you to please review the pull request.

Member

@FateScript FateScript left a comment

@Im-Himanshu Thanks for your contribution : )

Please check my review suggestions and lint your code so that it passes the GitHub workflow.

yolox/core/trainer.py (outdated, resolved)
yolox/utils/logger.py (outdated, resolved)
yolox/utils/logger.py (outdated, resolved)
yolox/core/trainer.py (outdated, resolved)
yolox/utils/logger.py (outdated, resolved)
yolox/utils/logger.py (outdated, resolved)
yolox/core/trainer.py (outdated, resolved)
yolox/utils/logger.py (outdated, resolved)
@FateScript
Member

Any update? @Im-Himanshu

@Im-Himanshu
Author

Im-Himanshu commented Jun 17, 2024

Any update? @Im-Himanshu

@FateScript Apologies for the delayed response. I have pushed new commits that implement all the suggestions.
Kindly review.

@Im-Himanshu
Author

Any update? @Im-Himanshu

@FateScript Excuse me for the delayed response, I have pushed new commits to implement all the suggestions. Kindly review.

@FateScript Gentle reminder to please check and merge the request.

docs/mlflow_integration.md (outdated, resolved)
docs/mlflow_integration.md (outdated, resolved)
Member

@FateScript FateScript left a comment

@Im-Himanshu Please fix it.

@@ -98,8 +98,11 @@ def setup_logger(save_dir, distributed_rank=0, filename="log.txt", mode="a"):

logger.remove()
save_file = os.path.join(save_dir, filename)
crnt_log_save_file = os.path.join(save_dir, 'train_log_crnt.txt')
Member

Why is train_log_crnt.txt needed? It seems that your code redirects I/O to it and removes the file if it already exists.

Author

@Im-Himanshu Im-Himanshu Jul 4, 2024

@FateScript In your earlier review you recommended creating a new logger file.
The file is deleted at the start so that it contains only the current run's logs, and only that part is uploaded. The existing .log file holds the logs of all previous runs, which may be confusing in experiment tracking.
image

Member

My bad, I didn't make that clear. By "logger file" I meant logger.py, not the file where logs are saved.

Your code here just makes a copy of the log file. It would be better to revert the code here. Thanks!

Author

Removed it and reverted to the old code.

That said, I think it would have been better to log only the current experiment's logs to mlflow.
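
For readers weighing the trade-off discussed in this thread, the sketch below shows one possible way to isolate the current run's log lines without keeping a second log file: remember the size of the shared log when the run starts, then copy only the bytes written afterwards. This is an illustration under assumptions, not something proposed or implemented in this pull request, and both function names are hypothetical.

```python
import os


def mark_run_start(log_path):
    """Return the current size of the shared log file (0 if it does not exist)."""
    return os.path.getsize(log_path) if os.path.exists(log_path) else 0


def export_current_run(log_path, start_offset, out_path):
    """Copy everything written after `start_offset` into `out_path`."""
    with open(log_path, "rb") as src, open(out_path, "wb") as dst:
        src.seek(start_offset)  # skip the bytes left by earlier runs
        dst.write(src.read())
    return out_path
```

The exported file could then be uploaded with `mlflow.log_artifact(out_path)` at the end of training, leaving the appended log file and `setup_logger` untouched.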

docs/mlflow_integration.md (outdated, resolved)
docs/mlflow_integration.md (resolved)
yolox/utils/mlflow_logger.py (resolved)
yolox/utils/mlflow_logger.py (resolved)
yolox/utils/mlflow_logger.py (outdated, resolved)
yolox/utils/mlflow_logger.py (outdated, resolved)
yolox/utils/mlflow_logger.py (outdated, resolved)
@FateScript
Member

FateScript commented Jul 4, 2024

@Im-Himanshu Also please don't forget to lint your code.

@Im-Himanshu
Author

@Im-Himanshu Also please don't forget to lint your code.

Linted the code. The only major remaining lint issue is the import statement, which has to stay inside the class (as is done in the wandb logger) because this is an optional feature.
image
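
The pattern described in this comment, importing an optional backend inside the class rather than at module level, can be sketched as follows. The class name and error message are hypothetical and do not come from yolox/utils/mlflow_logger.py; `importlib.import_module` is used here only so the backend name can be varied in a small example.

```python
import importlib


class LazyBackendLogger:
    """Hypothetical logger that imports its backend only when constructed."""

    def __init__(self, backend="mlflow"):
        try:
            # Deferred import: merely importing this file never touches the backend.
            self.backend = importlib.import_module(backend)
        except ImportError as exc:
            raise ImportError(
                f"'{backend}' is required for this logger; install it first"
            ) from exc
```

With this layout, the module that defines the class can be imported on any installation, and the optional package is only needed by users who actually construct the logger.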

@Im-Himanshu
Author

@FateScript @Cloudhax23 I have completed all the suggestions; please check.

Member

@FateScript FateScript left a comment

yolox/utils/mlflow_logger.py (resolved)
@Im-Himanshu
Author

@FateScript Removed the additional logger and linted the code again to fix the build error.

3 participants