-
Notifications
You must be signed in to change notification settings - Fork 1.2k
remote: config checksum_jobs #3133
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
d9ce336 to
488bc3d
Compare
tests/func/test_remote.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JIoJIaJIu , sorry for the confusion and extra work, but we are in the middle of a migration from unittest to pytest.
Could you rewrite your tests to pytest? Happy to help with further questions about this)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you rewrite your tests to pytest?
Have I got right your request?
gurobokum@42c34d7
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JIoJIaJIu, you did) Thanks!
dvc/remote/base.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we going to have both, core and remote checksum_jobs? Why not going remote-only?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've implemented fallback logic as was described in the comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JIoJIaJIu Got it, thanks! Actually, thinking about it, if checksum_jobs didn't work anyway, we could consider just getting rid of it, as it doesn't make much sense in many cases. E.g. if core.checksum_jobs is set for 200, then it makes sense for local remote or s3, but will break everything for ssh remote as we will run out of connections. But that will break the backward compatibility, so not worth bothering with right now. So your current solution is correct, let's stick with it for now 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@efiop , so are we staying with both core and remote for backward compatibility issues?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@MrOutis Yep, at least for now.
|
Thanks for the PR, @JIoJIaJIu, left some comments :) Is your suggestion about renaming |
Hello @MrOutis Line 42 in fe635a5
But further in RemoteBASE it has naming as config that introduces ambiguity - Line 85 in fe635a5
I suggest just to define such terms as And change Line 85 in fe635a5
So it means that component (Remote) can be initialized with settings which have priority1 and use config if needs as fallback rather then use 2 configs where the one is a child of the second |
488bc3d to
42c34d7
Compare
dvc/remote/base.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to pass repo, you have self.repo set already.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason why I decided to pass repo as argument for avoiding initialization dependency - otherwise needs to care that self.repo is initialized above. But I'm not sure that it's a good way
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, not sure I follow. Could you elaborate/rephrase, please?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant that needs first init self.repo like
def __init__(...):
self.repo = ...
self.checksum_jobs = ...
introduce complexity that self.checksum_jobs should be initialized after self.repo, so in terms of refactoring the order can change and it breaks the constructor
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, but it will break in an obvious way with an attribute error, as self.repo won't be set. I would understand your logic if you would've made _get_checksum_jobs a static method or a helper function, but you've made it a full-blown method so using self.repo there would be totally fine. But current implementation works too. I guess another way to make it obvious would be to use _get_checksum_jobs(repo.config if repo else {}, config) 🙂But, again, these are non-vital details and neat-picking, so your call 🙂 Current implementation for this is mergeable.
dvc/remote/base.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please clarify what AttributeError is for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there are places in tests that pass None or custom Repo() to RemoteBASE so such way I am checking that case.
https://github.com/JIoJIaJIu/dvc/blob/42c34d7269c24187a09bbebe0469647862a75fdd/tests/unit/remote/test_gdrive.py#L27
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not if self.repo then? Attribute error might be too generic, as it will catch something that might happen deeper. Your first implementation had both if self.repo as well as except AttributeError, right? Let's keep only if self.repo then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not if self.repo then
Cause here there is a repo but not config, https://github.com/JIoJIaJIu/dvc/blob/42c34d7269c24187a09bbebe0469647862a75fdd/tests/unit/remote/test_gdrive.py#L27
I didn't dare to touch the test
so I need to check self.repo and self.repo can be without config. I can change it to
self.repo.get('config', {})
Should I?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That specific test will still have repo.config though, because of https://github.com/iterative/dvc/blob/master/dvc/repo/__init__.py#L82 . Maybe some other test is not mocking it properly, then it should be fixed, unless it is unreasonably hard, of course 🙂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JIoJIaJIu , I thought I had left a comment about AttributeError, but I couldn't find it.
Do you remember if I posted it? Am I going cuckoo 🙈 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For avoiding checking config attribute existence I extended Repo class in GDrive tests
gurobokum@aec7781#diff-b780ac32bee8290565619b80d7051c85R18
Could you please double check that it doesn't break anything?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@MrOutis I've resolved it marking that the changes were applied. Is it ok or better to keep all threads open?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JIoJIaJIu, it's fine to resolve them, I do the same) Just couldn't find it this time 😅
|
@JIoJIaJIu Looks good! Good point about |
42c34d7 to
aec7781
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! Please see one minor comment above.
aec7781 to
4918c56
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, @JIoJIaJIu !
|
@JIoJIaJIu Oops, looks like we forgot about docs. Could you please either create an issue in dvc.org or, even better, submit a PR to dvc.org? |
Close #2920
Additional details
I would like to suggest to define and use such terms for avoiding ambiguity
Hence refactor RemoteBase from
with all changes as
config -> settingsThen get some setting as:
❗ Have you followed the guidelines in the Contributing to DVC list?
📖 Check this box if this PR does not require documentation updates, or if it does and you have created a separate PR in dvc.org with such updates (or at least opened an issue about it in that repo). Please link below to your PR (or issue) in the dvc.org repo.
❌ Have you checked DeepSource, CodeClimate, and other sanity checks below? We consider their findings recommendatory and don't expect everything to be addressed. Please review them carefully and fix those that actually improve code or fix bugs.
Thank you for the contribution - we'll try to review it as soon as possible. 🙏