extend capabilities of read_raw_data #84

raphaeldussin · 2018-06-22T15:32:51Z

possibility to read part of the file, with offset and partial_read
choice of row/column major order

This will allow the refactor of read_mds by putting the np.fromfile and np.memmap
calls into one generic function
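
For readers following along, here is a minimal sketch of what a single generic reader with these options could look like. It is not the code from this PR: `read_block` and its argument names are hypothetical, and only `np.fromfile`, `np.memmap`, and `os.path.getsize` are real library calls.

```python
import os
import tempfile

import numpy as np


def read_block(path, dtype, shape, offset=0, order='C', use_mmap=False):
    """Hypothetical helper: read one array of `shape` starting at byte `offset`."""
    dtype = np.dtype(dtype)
    count = int(np.prod(shape))
    file_size = os.path.getsize(path)
    # Refuse requests that would run past the end of the file.
    if offset + count * dtype.itemsize > file_size:
        raise ValueError('block at offset %d does not fit in file of size %d'
                         % (offset, file_size))
    if use_mmap:
        # Lazy access: the array is backed by the file on disk.
        return np.memmap(path, dtype=dtype, mode='r', offset=offset,
                         shape=tuple(shape), order=order)
    with open(path, 'rb') as f:
        f.seek(offset)  # skip the bytes before the requested block
        data = np.fromfile(f, dtype=dtype, count=count)
    return data.reshape(shape, order=order)


# Demo: write a (5, 15, 10) array, then read back only the second level.
full = np.arange(5 * 15 * 10, dtype='f4').reshape(5, 15, 10)
demo_path = os.path.join(tempfile.mkdtemp(), 'demo.data')
full.tofile(demo_path)

level_bytes = 15 * 10 * full.dtype.itemsize
level_1 = read_block(demo_path, 'f4', (15, 10), offset=level_bytes)
level_1_mmap = read_block(demo_path, 'f4', (15, 10), offset=level_bytes,
                          use_mmap=True)
# Same bytes as level 0, but interpreted in column-major order.
level_0_f = read_block(demo_path, 'f4', (15, 10), order='F')
```

Seeking on an open file handle before calling `np.fromfile` is used here instead of an `offset` keyword so the sketch does not depend on a recent NumPy version.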

* possibility to read part of the file, with offset and partial_read
* choice of row/column major order

codecov-io · 2018-06-22T15:39:02Z

Codecov Report

Merging #84 into master will increase coverage by 0.11%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master      #84      +/-   ##
==========================================
+ Coverage   91.65%   91.77%   +0.11%     
==========================================
  Files           4        4              
  Lines         635      644       +9     
  Branches      140      143       +3     
==========================================
+ Hits          582      591       +9     
  Misses         33       33              
  Partials       20       20

Impacted Files	Coverage Δ
xmitgcm/utils.py	`91.47% <100%> (+0.35%)`	⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 14c7338...91c1a70. Read the comment docs.

rabernat · 2018-06-22T15:40:10Z

This looks like a nice simple extension of the read_raw_data function. I see how it will be useful in the future refactoring. Thanks!

The new options need a test, however. The current test of read_raw_data is here:
https://github.com/xgcm/xmitgcm/blob/master/xmitgcm/test/test_mds_store.py#L202-L221

You can either extend that test or add a new test function that covers the new options.

raphaeldussin · 2018-06-22T19:46:24Z

Working on it. I have found some cases where the error message is not very informative.
I will catch those exceptions, set proper error messages, and then resubmit the PR with tests.

* function will warn user if trying to pass inconsistent args
* function checks byte offset < file size

stickler-ci · 2018-06-22T20:38:42Z

xmitgcm/utils.py

+                          '(expected %g, found %g)' %
+                          (datafile,
+                           expected_number_of_bytes,
+                           actual_number_of_bytes))


E501 line too long (80 > 79 characters)

stickler-ci · 2018-06-22T20:39:17Z

xmitgcm/utils.py

+            raise IOError('File `%s` does not have the correct size '
+                          '(expected %g, found %g)' %
+                          (datafile,
+                           expected_number_of_bytes,


E501 line too long (80 > 79 characters)

stickler-ci · 2018-06-25T15:48:51Z

xmitgcm/test/test_mds_store.py

+            assert isinstance(mdata, np.memmap)
+
+        # test it breaks when it should
+        with pytest.raises(IOError):


E501 line too long (82 > 79 characters)

stickler-ci · 2018-06-25T15:48:51Z

xmitgcm/test/test_mds_store.py


 # a meta test
+


F841 local variable '_' is assigned to but never used

stickler-ci · 2018-06-25T15:48:52Z

xmitgcm/test/test_mds_store.py


 # a meta test
+
+


E501 line too long (93 > 79 characters)

stickler-ci · 2018-06-25T15:49:38Z

xmitgcm/test/test_mds_store.py

+                                  offset=offset, partial_read=True, use_mmap=True)
+            assert isinstance(mdata, np.memmap)
+
+        # test it breaks when it should


E501 line too long (82 > 79 characters)

stickler-ci · 2018-06-25T15:49:38Z

xmitgcm/test/test_mds_store.py


 # a meta test
+


E501 line too long (93 > 79 characters)

… into dev_read_mds

raphaeldussin · 2018-06-25T16:37:40Z

yeah finally!

rabernat

This looks great. Thanks Raphael!

There are a few minor changes you could make, but I'm generally happy to merge as is.

rabernat · 2018-06-26T16:23:27Z

xmitgcm/test/test_mds_store.py


 # a meta test
+
+


maybe remove these blank lines

rabernat · 2018-06-26T16:23:30Z

xmitgcm/test/test_mds_store.py

+                shape[0]*shape[1]*shape[2]*dtype.itemsize), partial_read=True)
+            _ = read_raw_data(fname, dtype, shape, offset=(
+                shape[0]*shape[1]*shape[2]*dtype.itemsize), partial_read=True,
+                use_mmap=True)


This looks great! 👍

rabernat · 2018-06-26T16:24:31Z

xmitgcm/utils.py

-    d.shape = shape
-    return d
+        pass
+    assert(offset < actual_number_of_bytes), 'offset greater than filesize'


should we raise an error here instead of assert?

assert should be used only for internal consistency checks. Is that what this is?

That's what I did originally, but I couldn't get codecov to pass.
I guess that when I add a condition that is rarely met, the branch never runs in the tests, and that ruins the coverage of the statement.

The way to resolve this is to put it back as an exception but add a test function that checks that the error is raised.

Just do what you think is best and then merge.
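
As an illustration of the suggestion above, here is a minimal sketch of an explicit exception together with a test that exercises it. The helper name `check_offset` is hypothetical; the error message mirrors the one that appears later in this PR. One general reason to prefer an exception for input validation is that `assert` statements are skipped entirely when Python runs with `-O`.

```python
import pytest


def check_offset(offset, file_size):
    """Hypothetical validator: reject byte offsets at or beyond the file size."""
    if offset >= file_size:
        raise ValueError('bytes offset %g is greater than file size %g' %
                         (offset, file_size))
    return offset


def test_check_offset_rejects_large_offset():
    # The error branch is now executed by the test suite,
    # so a coverage tool can count it as covered.
    with pytest.raises(ValueError):
        check_offset(offset=4000, file_size=3000)
```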

rabernat · 2018-06-26T16:28:19Z

xmitgcm/test/test_mds_store.py

+    # test optional functionalities
+    shape = (5, 15, 10)
+    shape_subset = (15, 10)
+    for dtype in [np.dtype('f8'), np.dtype('f4'), np.dtype('i4')]:


ok, one comment about pytest:
rather than doing a for loop within the test function, you could parameterize this dtype. Before the function, you could add

@pytest.mark.parametrize("dtype", [np.dtype('f8'), np.dtype('f4'), np.dtype('i4')])
def test_read_raw_data(tmpdir, dtype):

cool! I didn't know you could do that.
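
A self-contained sketch of the parametrization pattern, for reference. It does not call `read_raw_data`; the test name `test_binary_roundtrip` and the plain `np.fromfile` round trip are stand-ins chosen so the example runs without xmitgcm. pytest generates one test case per entry in the list, so a failure reports which dtype broke.

```python
import numpy as np
import pytest

DTYPES = [np.dtype('f8'), np.dtype('f4'), np.dtype('i4')]


@pytest.mark.parametrize("dtype", DTYPES)
def test_binary_roundtrip(tmp_path, dtype):
    # tmp_path is pytest's built-in per-test temporary directory fixture.
    shape = (5, 15, 10)
    data = np.arange(np.prod(shape)).astype(dtype).reshape(shape)
    fname = str(tmp_path / 'testfile.data')
    data.tofile(fname)

    loaded = np.fromfile(fname, dtype=dtype).reshape(shape)
    np.testing.assert_array_equal(loaded, data)
```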

rabernat

I just saw one small thing that you could do to improve the test function using parameterization.

* remove blank lines
* iteration on dtype cleaner

rabernat · 2018-07-10T16:13:37Z

xmitgcm/utils.py

-    d.shape = shape
-    return d
+        pass
+    assert(offset < actual_number_of_bytes), 'offset greater than filesize'


The way to resolve this is to put it back as an exception but add a test function that checks that the error is raised.

Just do what you think is best and then merge.

stickler-ci · 2018-07-10T19:33:21Z

xmitgcm/test/test_mds_store.py

+                _ = read_raw_data(fname, dtype, shape, offset=(
+                    shape[0]*shape[1]*shape[2]*dtype.itemsize), partial_read=True)
+                _ = read_raw_data(fname, dtype, shape, offset=(
+                    shape[0]*shape[1]*shape[2]*dtype.itemsize), partial_read=True,


E999 IndentationError: unindent does not match any outer indentation level

rabernat · 2018-07-16T14:46:53Z

I think this is ready to merge, despite the codecov complaints.

Please do not make any new pull requests other than those related to the release. We need to make a release to mark the current state of xmitgcm, before making major changes. That is the whole point of having versions.

rabernat · 2018-07-16T14:50:11Z

xmitgcm/utils.py

+        pass
+    else:
+        raise ValueError('bytes offset %g is greater than file size %g' %
+                         (offset, actual_number_of_bytes))


Based on the codecov report, it looks like this is the error that is not getting tested:
https://codecov.io/gh/xgcm/xmitgcm/pull/84/src/xmitgcm/utils.py?before=xmitgcm/utils.py#L247

Weird! Because that's what I am trying to test here:

https://github.com/raphaeldussin/xmitgcm/blob/fe85c122d3f46c90b04ba054d7eba60cc6f08fc4/xmitgcm/test/test_mds_store.py#L266

Ah! Now I understand why it doesn't work.

You need a separate with block for each exception you want to catch. The block exits as soon as it finds the exception. So only the first of the four read_raw_data calls in your with pytest.raises(ValueError): block is actually getting run.

Sorry I didn't catch that in my review.

Thanks! OK, that makes sense now!
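
The behaviour described in this exchange can be reproduced with a small sketch. The function `always_fails` and the `calls` list are hypothetical, added only to record which statements actually execute inside each `pytest.raises` block.

```python
import pytest

calls = []


def always_fails(tag):
    """Record the call, then raise, like an invalid read would."""
    calls.append(tag)
    raise ValueError(tag)


def test_one_block_runs_only_first_call():
    with pytest.raises(ValueError):
        always_fails('first')
        always_fails('second')  # never reached: the block exits on the first error


def test_separate_blocks_run_every_call():
    with pytest.raises(ValueError):
        always_fails('first')
    with pytest.raises(ValueError):
        always_fails('second')
```

Both tests pass, but only the second one executes every call, which is why the untested error path showed up as uncovered in the coverage report.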

rabernat · 2018-07-16T21:58:33Z

Great!

* extend capabilities of read_raw_data
* possibility to read part of the file, with offset and partial_read
* choice of row/column major order
* testing + better error handling
* function will warn user if trying to pass inconsistent args
* function checks byte offset < file size
* Fixing style errors.
* get it shorter
* completing tests for codecov
* Fixing style errors.
* fix line length
* fix line length
* try to improve coverage
* finalize request
* remove blank lines
* iteration on dtype cleaner
* replace assert with raise error
* try to fool code cov
* fix indentation
* codecov didn't bite the bait
* fix error in testing

extend capabilities of read_raw_data

712aa1e

* possibility to read part of the file, with offset and partial_read
* choice of row/column major order

raphaeldussin and others added 2 commits June 22, 2018 16:36

testing + better error handling

919a1e7

* function will warn user if trying to pass inconsistent args
* function checks byte offset < file size

Fixing style errors.

b37a209

stickler-ci reviewed Jun 22, 2018

raphaeldussin and others added 3 commits June 22, 2018 16:44

get it shorter

d7d0446

completing tests for codecov

fcba8b5

Fixing style errors.

5017bf4

stickler-ci reviewed Jun 25, 2018

raphaeldussin added 4 commits June 25, 2018 11:51

fix line length

bada2ee

Merge branch 'dev_read_mds' of https://github.com/raphaeldussin/xmitgcm…

cd27cb6

… into dev_read_mds

fix line length

af1b05e

try to improve coverage

522bf5f

rabernat approved these changes Jun 26, 2018

rabernat reviewed Jun 26, 2018

rabernat requested changes Jun 26, 2018

finalize request

4019019

* remove blank lines
* iteration on dtype cleaner

rabernat approved these changes Jul 10, 2018

raphaeldussin added 2 commits July 10, 2018 15:14

replace assert with raise error

e23a8d6

try to fool code cov

e8ae40b

stickler-ci reviewed Jul 10, 2018

raphaeldussin added 2 commits July 10, 2018 15:35

fix indentation

ae0d15b

codecov didn't bite the bait

fe85c12

rabernat reviewed Jul 16, 2018

fix error in testing

91c1a70

rabernat merged commit 6b24a3c into MITgcm:master Jul 16, 2018

raphaeldussin deleted the dev_read_mds branch July 17, 2018 19:35
