Skip to content

Conversation

@baxen
Copy link
Contributor

@baxen baxen commented Jul 20, 2021

Fixes #712

@baxen
Copy link
Contributor Author

baxen commented Jul 20, 2021

I couldn't really find a good place to add a test for this in fsspec, looks like this is more something that is deferred to tests in the concrete async filesystems like GCSFS? I could possibly extend https://github.com/dask/gcsfs/blob/main/gcsfs/tests/test_core.py#L460 to cover this regression

@baxen baxen force-pushed the baxen/asyn_put_lists branch from 36ea406 to 5c2ce89 Compare July 20, 2021 01:12
baxen added a commit to baxen/gcsfs that referenced this pull request Jul 20, 2021
There is currently a bug that this would have caught from
fsspec/filesystem_spec#712

Will fail until fsspec/filesystem_spec#713 is available.
baxen added a commit to baxen/gcsfs that referenced this pull request Jul 20, 2021
There is currently a bug that this would have caught from
fsspec/filesystem_spec#712

Will fail until fsspec/filesystem_spec#713 is available.
baxen added a commit to baxen/gcsfs that referenced this pull request Jul 20, 2021
There is currently a bug that this would have caught from
fsspec/filesystem_spec#712

Will fail until fsspec/filesystem_spec#713 is available.
@martindurant
Copy link
Member

Is this an issue for _get too? HTTPFileSystem is the only implementation within this repo that's async, and it's read-only. So any test would need to be reading against that.

@baxen
Copy link
Contributor Author

baxen commented Jul 26, 2021

Nope no issue with _get - this was coming just from the code that was handling making directories. And I think https implementation has a good test of this already in test_multi_download

I believe you saw it, but the test in fsspec/gcsfs#401 covers it indirectly at least

Copy link
Member

@martindurant martindurant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All looks good and function. Just the one question (see below).

I agree now that this will only get tested in gcsfs, but that's OK. Note that gcsfs tests run in CI here, so a test there ( fsspec/gcsfs#401 ) will show up here too after merging.

fsspec/asyn.py Outdated
rpaths = other_paths(lpaths, rpath)

rdirs = [r for l, r in zip(lpaths, rpaths) if os.path.isdir(l)]
file_pairs = [(l, r) for l, r in zip(lpaths, rpaths) if not os.path.isdir(l)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we possibly avoid calling isdir on every file twice?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now called just once per file

@baxen baxen force-pushed the baxen/asyn_put_lists branch from 5c2ce89 to 12684e9 Compare July 26, 2021 16:49
@martindurant martindurant merged commit ee22435 into fsspec:master Jul 26, 2021
baxen added a commit to baxen/gcsfs that referenced this pull request Jul 26, 2021
There is currently a bug that this would have caught from
fsspec/filesystem_spec#712

Will fail until fsspec/filesystem_spec#713 is available.
martindurant pushed a commit to fsspec/gcsfs that referenced this pull request Jul 30, 2021
There is currently a bug that this would have caught from
fsspec/filesystem_spec#712

Will fail until fsspec/filesystem_spec#713 is available.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Filesystem put fails if inputs and outputs are lists of individual files

2 participants