-
Notifications
You must be signed in to change notification settings - Fork 65
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
extend capabilities of read_raw_data #84
Conversation
* possibility to read part of the file, with offset and partial_read * choice of row/column major order
Codecov Report
@@ Coverage Diff @@
## master #84 +/- ##
==========================================
+ Coverage 91.65% 91.77% +0.11%
==========================================
Files 4 4
Lines 635 644 +9
Branches 140 143 +3
==========================================
+ Hits 582 591 +9
Misses 33 33
Partials 20 20
Continue to review full report at Codecov.
|
This looks like a nice simple extension of the read_raw_data function. I see how it will be useful in the future refactoring. Thanks! The new options need a test, however. The current test of You can either extend that test or add a new test function that covers the new options. |
working on it. I have found some case where the error message is not very informative. |
* function will warn user if trying to pass inconsistent args * function checks byte offset < file size
'(expected %g, found %g)' % | ||
(datafile, | ||
expected_number_of_bytes, | ||
actual_number_of_bytes)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E501 line too long (80 > 79 characters)
raise IOError('File `%s` does not have the correct size ' | ||
'(expected %g, found %g)' % | ||
(datafile, | ||
expected_number_of_bytes, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E501 line too long (80 > 79 characters)
xmitgcm/test/test_mds_store.py
Outdated
assert isinstance(mdata, np.memmap) | ||
|
||
# test it breaks when it should | ||
with pytest.raises(IOError): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E501 line too long (82 > 79 characters)
xmitgcm/test/test_mds_store.py
Outdated
|
||
# a meta test | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
F841 local variable '_' is assigned to but never used
xmitgcm/test/test_mds_store.py
Outdated
|
||
# a meta test | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E501 line too long (93 > 79 characters)
xmitgcm/test/test_mds_store.py
Outdated
offset=offset, partial_read=True, use_mmap=True) | ||
assert isinstance(mdata, np.memmap) | ||
|
||
# test it breaks when it should |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E501 line too long (82 > 79 characters)
xmitgcm/test/test_mds_store.py
Outdated
|
||
# a meta test | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E501 line too long (93 > 79 characters)
yeah finally! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great. Thanks Raphael!
There are a few minor changes you could make, but I'm generally happy to merge as is.
xmitgcm/test/test_mds_store.py
Outdated
|
||
# a meta test | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe remove these blank lines
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
xmitgcm/test/test_mds_store.py
Outdated
shape[0]*shape[1]*shape[2]*dtype.itemsize), partial_read=True) | ||
_ = read_raw_data(fname, dtype, shape, offset=( | ||
shape[0]*shape[1]*shape[2]*dtype.itemsize), partial_read=True, | ||
use_mmap=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great! 👍
xmitgcm/utils.py
Outdated
d.shape = shape | ||
return d | ||
pass | ||
assert(offset < actual_number_of_bytes), 'offset greater than filesize' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we raise an error here instead of assert?
assert should be used only for internal consistency checks. Is that what this is?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's what I did originally but I couldn't get codecov to pass.
I guess when I put a condition that is not realized much then that ruins the coverage of the statement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way to resolve this is to put it back as an exception but add a test function that the error is raised.
Just do what you think is best and then merge.
xmitgcm/test/test_mds_store.py
Outdated
# test optional functionalities | ||
shape = (5, 15, 10) | ||
shape_subset = (15, 10) | ||
for dtype in [np.dtype('f8'), np.dtype('f4'), np.dtype('i4')]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, one comment about pytest:
rather than doing a for loop within the test function, you could parameterize this dtype. Before the function, you could add
@pytest.mark.parametrize("dtype", [np.dtype('f8'), np.dtype('f4'), np.dtype('i4')])
def test_read_raw_data(tmpdir, dtype):
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cool! I didn't know you could do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just saw one small thing that you could do to improve the test function using parameterization.
* remove blank lines * iteration on dtype cleaner
xmitgcm/utils.py
Outdated
d.shape = shape | ||
return d | ||
pass | ||
assert(offset < actual_number_of_bytes), 'offset greater than filesize' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way to resolve this is to put it back as an exception but add a test function that the error is raised.
Just do what you think is best and then merge.
xmitgcm/test/test_mds_store.py
Outdated
_ = read_raw_data(fname, dtype, shape, offset=( | ||
shape[0]*shape[1]*shape[2]*dtype.itemsize), partial_read=True) | ||
_ = read_raw_data(fname, dtype, shape, offset=( | ||
shape[0]*shape[1]*shape[2]*dtype.itemsize), partial_read=True, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E999 IndentationError: unindent does not match any outer indentation level
I think this is ready to merge, despite the codecov complaints. Please do not make any new pull requests other than those related to the release. We need to make a release to mark the current state of xmitgcm, before making major changes. That is the whole point of having versions. |
pass | ||
else: | ||
raise ValueError('bytes offset %g is greater than file size %g' % | ||
(offset, actual_number_of_bytes)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on the codecov report, it looks like this is the error that is not getting tested:
https://codecov.io/gh/xgcm/xmitgcm/pull/84/src/xmitgcm/utils.py?before=xmitgcm/utils.py#L247
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
weird! because that's what I trying to test here :
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah! Now I understand why it doesn't work.
You need a separate with
block for each exception you want to catch. The block exits as soon as it finds the exception. So only the first of the four read_raw_data
calls in your with pytest.raises(ValueError):
block is actually getting run.
Sorry I didn't catch that in my review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks! Ok that make sense now!
Great! |
* extend capabilities of read_raw_data * possibility to read part of the file, with offset and partial_read * choice of row/column major order * testing + better error handling * function will warn user if trying to pass inconsistent args * function checks byte offset < file size * Fixing style errors. * get it shorter * completing tests for codecov * Fixing style errors. * fix line length * fix line length * try to improve coverage * finalize request * remove blank lines * iteration on dtype cleaner * replace assert with raise error * try to fool code cov * fix indentation * codecov didn't bite the bait * fix error in testing
This will allow the refactor of read_mds by putting the np.fromfile and np.memmap
calls into one generic function