logger: remote: use lazy formatting #3178
Benchmark: ~242 sec with this patch and ~282 sec without it. E.g.
For the record: used incorrect formatting, fixing right now...
thanks, @efiop !
🔥
Did you try to bench it?
dvc/remote/base.py
Outdated
Still use msg maybe?
msg = "dir cache file format error '%s' [skipping the file]"
logger.error(msg, path_info)
No point in a new variable; it is old code from a time when we weren't sure about formatting. We inline the message these days, so it makes sense to inline it here as well.
dvc/remote/base.py
Outdated
Same as above:
msg = "checksum '%s'(actual '%s') for '%s' has changed."
logger.debug(msg, checksum, actual, path_info)
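For reference, a minimal sketch of how placeholders behave when arguments are passed to the logger lazily. It assumes the standard library `logging` module, which merges the arguments with `msg % args` when the record is emitted. The logger name and the values are made up for illustration and are not taken from DVC.

```python
import io
import logging

# Isolated logger writing to an in-memory buffer (hypothetical name).
stream = io.StringIO()
logger = logging.getLogger("placeholder_demo")
logger.handlers = [logging.StreamHandler(stream)]
logger.setLevel(logging.DEBUG)
logger.propagate = False

checksum, actual, path_info = "abc", "def", "data/file"  # made-up values

# %-style placeholders are filled from the extra arguments at emit time.
logger.debug(
    "checksum '%s'(actual '%s') for '%s' has changed.",
    checksum,
    actual,
    path_info,
)

# A record built with '{}' placeholders fails when `msg % args` is applied.
# LogRecord is used directly so the error is visible as an exception.
record = logging.LogRecord(
    "placeholder_demo", logging.DEBUG, "demo.py", 0,
    "checksum '{}' has changed.", (checksum,), None,
)
try:
    record.getMessage()
    brace_style_ok = True
except TypeError:
    brace_style_ok = False

print(stream.getvalue().strip())
print("brace placeholders work:", brace_style_ok)
```

When the same mismatch goes through `logger.debug()` instead, the standard handlers catch the error and report it on stderr rather than raising, so it is easy to miss in review.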
dvc/rwlock.py
Outdated
Are we trying to move this?
@Suor It is a faulty import from the wrong place; it broke once I removed the no-longer-used relpath re-import from fs.py.
Part of #1843, related to #3177

As our previous investigations showed, path stringification and stuff like relpath are taking a very large chunk of time when working with giant directories. This patch removes `format()`s and uses lazy formatting provided by logger instead, so stuff like path_info is not stringified until actually needed (e.g. on `-v`).

Benchmark results coming soon
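The effect described above can be sketched as follows. This is an illustration under stated assumptions, not code from the patch: `PathInfo` here is a hypothetical stand-in that only counts how often it is stringified, `make_logger` is a helper invented for the demo, and the standard library `logging` module is assumed.

```python
import io
import logging


class PathInfo:
    """Hypothetical stand-in for a path object whose str() is costly."""

    def __init__(self, path):
        self.path = path
        self.str_calls = 0  # counts how often the object gets stringified

    def __str__(self):
        self.str_calls += 1
        return self.path


def make_logger(name, level):
    """Build an isolated logger that writes to an in-memory buffer."""
    stream = io.StringIO()
    logger = logging.getLogger(name)
    logger.handlers = [logging.StreamHandler(stream)]
    logger.setLevel(level)
    logger.propagate = False
    return logger, stream


logger, stream = make_logger("lazy_demo", logging.INFO)

eager = PathInfo("data/dir")
# format() runs before the logger checks the level, so str() is paid for
# even though the message is then discarded.
logger.debug("cache '{}' is missing".format(eager))

lazy = PathInfo("data/dir")
# The argument is passed through untouched; at INFO level the record is
# dropped and str() is never called.
logger.debug("cache '%s' is missing", lazy)

print("eager:", eager.str_calls, "lazy:", lazy.str_calls)

# With debug output enabled (roughly what `-v` would turn on), the lazy
# argument is stringified only when the record is actually emitted.
logger.setLevel(logging.DEBUG)
logger.debug("cache '%s' is missing", lazy)
print("lazy after enabling debug:", lazy.str_calls)
```

The saving per call is small, but it is repeated for every file, which is why it adds up on giant directories.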
❗ Have you followed the guidelines in the Contributing to DVC list?
📖 Check this box if this PR does not require documentation updates, or if it does and you have created a separate PR in dvc.org with such updates (or at least opened an issue about it in that repo). Please link below to your PR (or issue) in the dvc.org repo.
❌ Have you checked DeepSource, CodeClimate, and other sanity checks below? We consider their findings recommendatory and don't expect everything to be addressed. Please review them carefully and fix those that actually improve code or fix bugs.
Thank you for the contribution - we'll try to review it as soon as possible. 🙏