Warn when creating driver that points to an empty directory #77
Shall we base this branch onto #72? There are some common changes in both, so it might be safer to see how the combined changes work together.
8dca81c
to
dcc60a5
Compare
I have rebased, and tests for both branches pass. So we can merge #72 first, and then take care of this one @AlirezaSohofi @AlpAribal.
test store creation with empty path and non-empty path
create directory when inserting shards
Update squirrel/store/squirrel_store.py
Co-authored-by: Alireza Sohofi <a.sohofi@gmail.com>
Update test/test_driver/test_sq_store.py
Co-authored-by: Alp Arıbal <AlpAribal@users.noreply.github.com>
move dir-exist check out of set operation
add set operation in test; adapt dir exist flag, because dir doesnt exist after clean
remove created_dir_if_not_exist function
bump minor version
throw error when instantiating a driver that points to an empty or invalid url
refactor warning into FilePathGenerator
6086900
to
f71a756
Compare
squirrel/iterstream/source.py
Outdated
if not self._returned_url:
    self._returned_url = True
We need the same check for the nested case as well. I think, right now, we will get an empty warning for a dir containing only dirs but no files.
Do we want to cover the nested case though? We are not reading in a nested way in the first place.
FilePathGenerator is not only meant to be used in SquirrelStore, so it should be general and able to handle nested cases too.
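To make the nested case concrete, here is a minimal sketch of a path generator that warns only when no file at all is found, whether it looks at one level or recurses. It uses only the standard library; `iter_file_paths` and its `nested` flag are hypothetical names for illustration and are not the actual FilePathGenerator API in squirrel.

```python
import os
import warnings
from typing import Iterator


def iter_file_paths(root: str, nested: bool = False) -> Iterator[str]:
    """Yield file paths under root and warn if none is found (hypothetical helper)."""
    returned_file = False
    if os.path.isdir(root):
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in sorted(filenames):
                # Only real files count; directories never set the flag.
                returned_file = True
                yield os.path.join(dirpath, name)
            if not nested:
                # Flat mode: inspect the top level only and do not descend.
                break
    if not returned_file:
        # Reached for a missing root, an empty directory, and a directory
        # that contains only (possibly nested) empty directories.
        warnings.warn(f"No files found under '{root}'.", UserWarning)
```

Because this is a generator, the warning is emitted only once the iteration has been exhausted, which matches warning on iteration rather than on construction.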
test/test_driver/test_msgpack.py
Outdated
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    _ = MessagepackDriver(url=tmp_dir).get_iter().collect()
Why do we catch/filter warnings here?
This is to check that when there are items in the store, no warnings are printed. This .simplefilter asserts that no warning is shown:
https://stackoverflow.com/questions/45671803/how-to-use-pytest-to-assert-no-warning-is-raised
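For reference, a minimal standard-library sketch of a "no warning is raised" check. In the `warnings` module it is the "error" action that turns a warning into an exception, so the check fails if one is emitted, whereas "ignore" only suppresses warnings. `load_items` is a hypothetical stand-in for the driver call and is not squirrel code.

```python
import warnings


def load_items(items: list) -> list:
    """Hypothetical stand-in for a driver read; warns when nothing is found."""
    if not items:
        warnings.warn("Store is empty.", UserWarning)
    return list(items)


def emits_no_warning(items: list) -> bool:
    """Return True if load_items(items) completes without emitting a warning."""
    with warnings.catch_warnings():
        # "error" escalates every warning to an exception inside this block.
        warnings.simplefilter("error")
        try:
            load_items(items)
        except Warning:
            return False
    return True
```

In a pytest test, the body of the `with` block could be used directly: an escalated warning then fails the test without any explicit assertion.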
But it is in fact unintuitive, since a directory containing empty directories will not yield warnings.
test/test_driver/test_msgpack.py
Outdated
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
Why do we catch/filter warnings here?
test/test_driver/test_msgpack.py
Outdated
@@ -10,6 +11,34 @@
from squirrel.store import SquirrelStore


def test_invalid_url(local_msgpack_url: URL) -> None:
I would rename this test; it does not only use invalid URLs.
We need to add nested cases here, perhaps via parametrize.
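One way the flat and nested cases could be tabulated is sketched below. It is stdlib-only so that it runs anywhere; with pytest, the same `CASES` list could be passed to `pytest.mark.parametrize`. The helper `has_any_file`, the directory layouts, and the file names are assumptions for illustration and are not the tests in this PR.

```python
import os
import tempfile

# Each case: (files to create, empty dirs to create, whether a file is expected).
CASES = [
    ([], [], False),               # empty directory
    ([], ["a", "a/b"], False),     # only nested empty directories
    (["x.gz"], [], True),          # flat layout with one file
    (["a/b/x.gz"], [], True),      # nested layout with one file
]


def has_any_file(root: str) -> bool:
    """Return True if at least one file exists anywhere under root."""
    return any(filenames for _, _, filenames in os.walk(root))


def run_case(files: list, dirs: list, expected: bool) -> bool:
    """Build the layout in a temporary directory and compare with expected."""
    with tempfile.TemporaryDirectory() as tmp:
        for d in dirs:
            os.makedirs(os.path.join(tmp, d), exist_ok=True)
        for f in files:
            path = os.path.join(tmp, f)
            os.makedirs(os.path.dirname(path), exist_ok=True)
            open(path, "w").close()
        return has_any_file(tmp) == expected
```

Keeping the layouts in one table makes it cheap to add further cases, such as a directory that mixes files and empty subdirectories.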
squirrel/iterstream/source.py
Outdated
yield f"{self.protocol}{url}"
if not self._returned_url:
    self._returned_url = True
yield f"{self.protocol}{url}"
yield f"{self.protocol}{url}"
yield f"{self.protocol}{url}"
if not self._returned_file_url and not self.fs.isdir(path=url):
    self._returned_file_url = True
if not self._returned_file_url and not self.fs.isdir(path=url):
    self._returned_file_url = True
if not self._returned_file_url:
    self._returned_file_url = True
Why do we need not self.fs.isdir(path=url)?
Description
A warning is shown when we iterate over a driver that points to an empty or non-existent directory.