Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

support external ssh directories as dependencies and outputs #1892

Merged
merged 6 commits into from Apr 24, 2019

Conversation

2 participants
@efiop
Copy link
Member

commented Apr 15, 2019

No description provided.

@efiop efiop force-pushed the efiop:1654 branch 17 times, most recently from afdfa26 to 6f1a01e Apr 16, 2019

@efiop efiop force-pushed the efiop:1654 branch from 6f1a01e to e07d412 Apr 24, 2019

move: move directory into an existing directory
Signed-off-by: Ruslan Kuprieiev <ruslan@iterative.ai>

@efiop efiop force-pushed the efiop:1654 branch 2 times, most recently from cb06a2d to 88c6c29 Apr 24, 2019

efiop added some commits Apr 14, 2019

remote: generalize file/dir handling
Fixes #1654
Fixes #1655

Signed-off-by: Ruslan Kuprieiev <ruslan@iterative.ai>
remote: ssh: support directories
Signed-off-by: Ruslan Kuprieiev <ruslan@iterative.ai>
test: func: use pytest for outputs/dependencies
Signed-off-by: Ruslan Kuprieiev <ruslan@iterative.ai>
use ValueError instead of JSONDecodeError
JSONDecodeError is only available in never python 3 versions and still
inherits from ValueError, so we could just use it.

Signed-off-by: Ruslan Kuprieiev <ruslan@iterative.ai>
test: test ssh
Signed-off-by: Ruslan Kuprieiev <ruslan@iterative.ai>

@efiop efiop force-pushed the efiop:1654 branch from 88c6c29 to 37ee55b Apr 24, 2019

@efiop efiop changed the title [WIP] support external ssh directories as dependencies and outputs support external ssh directories as dependencies and outputs Apr 24, 2019

@efiop efiop merged commit b8018c1 into iterative:master Apr 24, 2019

3 checks passed

Travis CI - Pull Request Build Passed
Details
codeclimate Approved by Ruslan Kuprieiev.
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details

@efiop efiop self-assigned this Apr 25, 2019

@efiop efiop added this to In progress in Weekly tasks via automation Apr 25, 2019

@efiop efiop moved this from In progress to Done in Weekly tasks Apr 25, 2019

@mroutis
Copy link
Collaborator

left a comment

Late review 😞

return path.replace(from_sep, to_sep)

@classmethod
def to_posixpath(cls, path):

This comment has been minimized.

Copy link
@mroutis

mroutis May 8, 2019

Collaborator

Why are we defining these methods in RemoteBase instead of utils or something like that?
By the way, Pathlib has related methods: https://docs.python.org/3/library/pathlib.html#pathlib.PurePath.as_posix

raise NotImplementedError

def _collect_dir(self, path_info):

This comment has been minimized.

Copy link
@mroutis

mroutis May 8, 2019

Collaborator

when method in not that straight forward to read (more than a few lines, or generic naming), please add a comment explaining the reason behind that method :)

# NOTE: sorting the list by path to ensure reproducibility
return sorted(dir_info, key=itemgetter(self.PARAM_RELPATH))

def get_dir_checksum(self, path_info):

This comment has been minimized.

Copy link
@mroutis

mroutis May 8, 2019

Collaborator

same here, it is hard to read at first glance, a comment would help to grasp the idea of this method without reading all the source and mentally "interpreting" it.

@@ -165,6 +317,61 @@ def changed(self, path_info, checksum_info):
logger.debug("'{}' hasn't changed.".format(path_info))
return False

def link(self, from_info, to_info):

This comment has been minimized.

Copy link
@mroutis

mroutis May 8, 2019

Collaborator

This looks unnecessary, it is just an "alias" for self.copy. Why do you decided to add it, @efiop ?

This comment has been minimized.

Copy link
@efiop

efiop May 8, 2019

Author Member

@mroutis Take a look at RemoteLOCAL.link(). It is copy only for some remotes. For example, ssh will also get reflink/hardlink/symlink support later.

def is_empty(self, path_info):
return False

def isfile(self, path_info):

This comment has been minimized.

Copy link
@mroutis

mroutis May 8, 2019

Collaborator

I think that if we are going to do operations on path_infos, it is better to have them on a PathInfo class and define them as properties. (cc: @pared , you might be interested in this, as you are refactoring path_info)

Also, what part of path_info is required to know if it is a file or not? is it just the path? why not doing something like path_info.path.isfile ?

def isdir(self, path_info):
return False

def walk(self, path_info):

This comment has been minimized.

Copy link
@mroutis

mroutis May 8, 2019

Collaborator

Same with this method, why won't we do something like os.walk(path_info.path)?

I don't like the aesthetics of this kind of inderections:
RemoteInstance -> RemoteBase -> walk -> path_info.walk -> os.walk(path).

Prefer the flatness of `os.walk(path_info.path).

This comment has been minimized.

Copy link
@efiop

efiop May 8, 2019

Author Member

@mroutis again, os.walk is only relevant for RemoteLOCAL. We have other remotes too, for which definition of walk() is different.

return

to_info = self.checksum_to_path_info(checksum)
self._save(path_info, checksum)

This comment has been minimized.

Copy link
@mroutis

mroutis May 8, 2019

Collaborator

This kind of abstraction doesn't look like the best one to me: foo calling _foo.
If I understand correctly, the reason behind doing this is to have a private method that instances can overwrite to modify the behavior without modifying the public interface.

A good refactor could be to migrate to ABC and use super() to call the parent's method: https://www.python-course.eu/python3_abstract_classes.php

If this isn't the case and I'm missing something, maybe we should improve the naming, or it is the wrong abstraction, having save and _save is not intuitive.

@@ -226,6 +439,7 @@ def ospath(self):
return posixpath

def checksum_to_path(self, checksum):
assert checksum

This comment has been minimized.

Copy link
@mroutis

mroutis May 8, 2019

Collaborator

Maybe we can use pytype and remove these assertions.

@@ -29,11 +29,18 @@ def __init__(self, dvc_version, expected, actual):
)


def _file_metadata_changed(actual_mtime, mtime, actual_size, size):
return actual_mtime != mtime or actual_size != size
class StateBase(object):

This comment has been minimized.

Copy link
@mroutis

mroutis May 8, 2019

Collaborator

What's the reason behind having another Base/Abstract class? The only one who is using StateBase is State, so it doesn't make a lot of sense to me.

It doesn't even raise NotImplementedError, just passes silently.

@efiop

This comment has been minimized.

Copy link
Member Author

commented May 8, 2019

Thanks for a late review @mroutis ! 🙂

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.