Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[feat] Extend aim.ext.tensorboard_tracker.run.Run to allow stdout logging and system stats and parameter logging #2671

Merged

Conversation

alansaul
Copy link
Contributor

Implemented the simple approach to resolve #2670 by simply inheriting from aim.sdk.run.Run rather than aim.sdk.run.BasicRun, and we get the behaviour for free.

Note: the default arguments for system_tracking_interval, log_system_params and capture_terminal_logs are different from the defaults provided by aim.sdk.run.Run to maintain backwards compatability, and by default not do any additional logging.

I have also added a bunch of tests to check the extended behaviour works (default no capture, and that capturing each component works as expected). Perhaps there is an easier implementation for grabbing the metrics of interest, I might need some guidance as I am new to this project. I couldn't find anywhere where the logging behaviour of aim.sdk.run.Run is captured to copy from. Let me know if you need any changes/think these tests should be removed.

Resolves #2670

@CLAassistant
Copy link

CLAassistant commented Apr 21, 2023

CLA assistant check
All committers have signed the CLA.

@mihran113 mihran113 changed the title [feat]: Extend aim.ext.tensorboard_tracker.run.Run to allow stdout logging and system stats and parameter logging [feat] Extend aim.ext.tensorboard_tracker.run.Run to allow stdout logging and system stats and parameter logging Apr 23, 2023
@mihran113
Copy link
Contributor

Everything looks good to me. Please just sign the CLA and add a CHANGELOG entry.
@alberttorosyan please have a look as well, Is there anything concerning to just inherit from aim.Run directly?

Copy link
Member

@alberttorosyan alberttorosyan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implementation looks good!
Have some thoughts about merging the BasicRun and Run classes of SDK; as the only purpose for the separation was the tb sync functionality.
But we can leave it for now.

tests/ext/test_tensorboard_run.py Outdated Show resolved Hide resolved
Copy link
Contributor

@mihran113 mihran113 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great job! 🙌

Please sign the CLA, and also there's a conflict that needs to be fixed before merging.

@gorarakelyan
Copy link
Contributor

@alansaul could you please sign the CLA, so we can merge the enhancement?

@alansaul
Copy link
Contributor Author

alansaul commented May 4, 2023

Apologies for the delay, I've merged upstream and signed the CLA, and fixed the changelog

@alansaul
Copy link
Contributor Author

alansaul commented May 4, 2023

It looks like you might have some randomly failing tests as I don't believe I touched anything relating to the failing tests.

@alberttorosyan alberttorosyan merged commit ea66c28 into aimhubio:main May 4, 2023
3 of 4 checks passed
mihran113 pushed a commit that referenced this pull request May 30, 2023
…ogging and system stats and parameter logging (#2671)

# Conflicts:
#	CHANGELOG.md
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Extend aim.ext.tensorboard_tracker.run.Run to also allow system stats, parameters, and stdout capture.
5 participants