perf: optimize cache listing for local, ssh and hdfs #2836

Merged: 1 commit, Nov 28, 2019

@Suor Suor commented Nov 22, 2019

  • make all three lazy
  • simplify local one
  • use deque instead of list for hdfs
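
As a rough illustration of the "lazy" and "deque" points above, here is a minimal sketch, not the code from this PR. The `walk_files` function and the in-memory `TREE` mapping are hypothetical stand-ins for a remote directory listing; the sketch only shows a generator that lists directories on demand and keeps the pending ones in a `collections.deque`.

```python
from collections import deque


def walk_files(tree, root):
    """Lazily yield file paths under `root`.

    `tree` is a hypothetical stand-in for a remote listing: it maps a
    directory path to its entry names; any path absent from it is a file.
    """
    dirs = deque([root])  # directories still waiting to be listed
    while dirs:
        current = dirs.popleft()  # O(1), whereas list.pop(0) is O(n)
        for name in tree[current]:
            path = current + "/" + name
            if path in tree:
                dirs.append(path)  # a directory: list it later
            else:
                yield path  # a file: hand it to the caller right away


TREE = {
    "cache": ["00", "ab"],
    "cache/00": ["1111"],
    "cache/ab": ["2222", "3333"],
}

files = walk_files(TREE, "cache")
first = next(files)  # only "cache" and "cache/00" have been listed so far
```

Because the function is a generator, a caller that stops early never pays for listing the remaining directories.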

Comment on lines +267 to +269
# If we simply return an iterator then with above closes instantly
for path in ssh.walk_files(self.path_info.path):
    yield path
Wouldn't that mean that the connection will stay open until the iterator is closed?
Considering that it is now used only in remote.all, which was supposed to create a generator in the first place, this change is good.

I am just wondering whether it will become fragile if, in the future, we start to wrap some logic around this method (for any reason). I am probably worrying prematurely here, though. I just wanted to ask about this and keep it in mind for the future.

Contributor Author

Yes, this will hold the connection open. Not sure this will be an issue.
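
The trade-off discussed in this thread can be sketched in isolation. This is an illustrative example, not code from the PR: `connection` is a hypothetical stand-in for the SSH connection context manager, and `events` merely records when it opens and closes.

```python
from contextlib import contextmanager

events = []


@contextmanager
def connection():
    """Hypothetical stand-in for an SSH connection context manager."""
    events.append("open")
    try:
        yield ["a", "b"]
    finally:
        events.append("close")


def eager_return():
    # Returning an iterator leaves the `with` block before any item is read.
    with connection() as conn:
        return iter(conn)


def lazy_yield():
    # Yielding keeps the `with` block (and the connection) open until the
    # generator is exhausted or closed.
    with connection() as conn:
        for item in conn:
            yield item


it = eager_return()
after_return = list(events)  # connection already closed at this point

events.clear()
gen = lazy_yield()
next(gen)
during_iteration = list(events)  # connection still open
gen.close()
after_close = list(events)  # closing the generator closes the connection
```

In this toy version the eagerly returned iterator still works because it wraps a plain list; with a real connection it would be reading from a resource that has already been closed. The generator version avoids that, at the cost of holding the connection for as long as the caller keeps the generator alive.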

Suor commented Nov 26, 2019

@MrOutis could you take a look at this?

@@ -68,15 +64,15 @@ def hadoop_fs(self, cmd, user=None):
         close_fds = os.name != "nt"

         executable = os.getenv("SHELL") if os.name != "nt" else None
-        p = Popen(
+        p = subprocess.Popen(
❤️

@ghost ghost left a comment

Thanks, @Suor!

efiop commented Nov 28, 2019

Thank you, @Suor! 🙏

@efiop efiop merged commit 838efb2 into iterative:master Nov 28, 2019