Unit tests and some bug fixes #110

antnieszka · 2016-12-05T20:46:17Z

Implements #99 (Profile.__new__() is somewhat bugged when running simple tests using Python 2.7)
Improves #19 (Add unit tests)
Fixes #33 (Make beprof working with numpy 1.8), #117 (Remove scipy from appveyor build), #111 (Test inclusion in package), #116 (Curve object creation), #113 (Provide some better error handling for non-float data), #118 (Travis 'docs' started to fail?), #115 (Investigate print(Profile object) error)

grzanka · 2016-12-05T20:48:43Z

beprof/curve.py

@@ -113,7 +113,8 @@ def change_domain(self, domain):
            raise ValueError('in change_domain():' 'the old domain does not include the new one')

        y = np.interp(domain, self.x, self.y)
-        obj = self.__class__(np.stack((domain, y), axis=1), **self.__dict__['metadata'])
+        # np.dstack(...)[0] is used to extract nested array (previously used np.stack which behaved different)
+        obj = self.__class__(np.dstack((domain, y))[0], **self.__dict__['metadata'])


Add a comment explaining why dstack is used here. Why is it needed, and what does it do?

Like "stack is available from numpy 1.10 and to preserve compatibility we use an older function which does the same thing as np.stack((arrays), axis=1)"?

Not really, I had in mind a description for a fellow programmer who takes over the project and is trying to figure out what is going on in these few lines of code, something like:

We need to ... domain and y because .... We pass it as argument to ... because it takes .... dstack is ... and this is the reason why we use [0]
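
For a reader picking this up later, here is a minimal sketch of what the `np.dstack(...)[0]` expression produces. The input values are made up for illustration and are not taken from beprof; `np.column_stack` is used only as a cross-check, since it is also available in older numpy releases.

```python
import numpy as np

# Illustrative inputs (assumed, not from beprof): x positions and their y values.
domain = np.array([0.0, 1.0, 2.0])
y = np.array([10.0, 20.0, 30.0])

# dstack treats each 1-D array of length N as shape (1, N, 1) and joins them
# along the third axis, which gives shape (1, N, 2).
stacked = np.dstack((domain, y))

# Taking [0] drops the leading axis and leaves N rows of (x, y) pairs.
pairs = stacked[0]

# Cross-check: column_stack builds the same N x 2 layout directly.
same_as_column_stack = np.array_equal(pairs, np.column_stack((domain, y)))

print(stacked.shape, pairs.shape, same_as_column_stack)
```

The resulting N x 2 layout is the same one `np.stack((domain, y), axis=1)` gives on numpy 1.10 and later.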

antnieszka · 2016-12-05T21:12:41Z

I'm writing missing test for normalize and this came out:

>>> a
Profile([[ 1,  1],
       [ 2, 20],
       [ 3, 40]])
>>> a.normalize(0)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/ant6/.virtualenvs/in_beprof3/lib/python3.4/site-packages/beprof/profile.py", line 150, in normalize
    self.y /= ave
TypeError: ufunc 'true_divide' output (typecode 'd') could not be coerced to provided output parameter (typecode 'l') according to the casting rule ''same_kind''
>>> ave = np.average(a.y[np.fabs(a.x) <= 0])
>>> ave
nan
>>> a /= ave
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: ufunc 'true_divide' output (typecode 'd') could not be coerced to provided output parameter (typecode 'l') according to the casting rule ''same_kind''

Our method looks like this:

    def normalize(self, dt):
        """
        Normalize to 1 over [-dt, +dt] area
        :param dt:
        :return:
        """
        logger.info('Running {0}.normalize(dt={1})'.format(self.__class__, dt))
        try:
            ave = np.average(self.y[np.fabs(self.x) <= dt])
        except RuntimeWarning as e:
            logger.error('in normalize(). self class is {0}, dt={1}'.format(self.__class__, dt))
            raise Exception("Scaling factor error:\n" + str(e))
        self.y /= ave  # <-- extra try-except?

So it's not idiot-proof enough.
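
A side note on the `nan` above: Python does not raise warnings as exceptions by default, so the `except RuntimeWarning` branch only runs when warnings have been escalated to errors. A minimal sketch of an explicit check instead, where the helper name `window_average` and the error message are hypothetical and not part of beprof:

```python
import numpy as np


def window_average(x, y, dt):
    """Average y over the points with |x| <= dt (hypothetical helper)."""
    selected = y[np.fabs(x) <= dt]
    if selected.size == 0:
        # Without this check np.average returns nan and only emits a warning.
        raise ValueError("no data points with |x| <= {0}".format(dt))
    return np.average(selected)


# Same numbers as in the session above, stored as floats.
x = np.array([1.0, 2.0, 3.0])
y = np.array([1.0, 20.0, 40.0])
print(window_average(x, y, 10))
```

Calling `window_average(x, y, 0)` with this data selects nothing and raises `ValueError` instead of propagating `nan` into the division.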

antnieszka · 2016-12-05T21:17:35Z

It's something with division...

>>> a
Profile([[ 1,  1],
       [ 2, 20],
       [ 3, 40]])
>>> ave = np.average(a.y[np.fabs(a.x) <= 10])
>>> ave
20.333333333333332
>>> a.y /= ave
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: ufunc 'true_divide' output (typecode 'd') could not be coerced to provided output parameter (typecode 'l') according to the casting rule ''same_kind''

(in_beprof3){16-12-05 22:18} ~ ant6% ➜ python -V
Python 3.4.5
(in_beprof3){16-12-05 22:18} ~ ant6% ➜ pip freeze
beprof==0.post0.dev1801480862596
numpy==1.11.2

antnieszka · 2016-12-05T21:22:39Z

@grzanka that's sad:

>>> import numpy as np
>>> a = np.array([1,2,3])
>>> b = 2.0
>>> a /= b
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: ufunc 'true_divide' output (typecode 'd') could not be coerced to provided output parameter (typecode 'l') according to the casting rule ''same_kind''

More here: dipy/dipy#730 and numpy/numpy#6464
And this works:

>>> a = np.array([1.0, 2.0, 3.0])
>>> a /= 2.0
>>> a
array([ 0.5,  1. ,  1.5])

when a is filled with floats...

antnieszka · 2016-12-05T21:27:44Z

In other words - Profile.y data has to be floats or it errors...
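
A small sketch of the behaviour described above and one way around it, assuming an explicit cast to float is acceptable for the caller. The variable names are illustrative only.

```python
import numpy as np

counts = np.array([1, 2, 3])  # integer dtype

try:
    counts /= 2.0  # the float result cannot be stored in the integer buffer
    in_place_ok = True
except TypeError:
    in_place_ok = False

# An explicit copy with a floating-point dtype makes in-place division possible.
as_float = counts.astype(float)
as_float /= 2.0

print(in_place_ok, counts, as_float)
```

The failed in-place division leaves `counts` untouched, which matches the sessions shown in this thread.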

codecov-io · 2016-12-05T22:23:35Z

Current coverage is 67.96% (diff: 98.21%)

Merging #110 into master will increase coverage by 25.93%

@@             master       #110   diff @@
==========================================
  Files             5          7     +2   
  Lines           502        768   +266   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits            211        522   +311   
+ Misses          291        246    -45   
  Partials          0          0

Powered by Codecov. Last update 298c6d6...2315c9e

grzanka · 2016-12-06T07:54:05Z

Typically we will work with floats. But you are right - we should make a separate issue for y data stored as ints and for methods which use division.
There should be better, user-friendly error handling, and we can also provide some optional casting to float in such methods.

grzanka · 2016-12-06T13:20:18Z

tests/test_profile.py

-        assert np.isnan(p.x_at_y(4.9999999))
-        assert np.isnan(p.x_at_y(20.0000001))
-        assert np.isnan(p.x_at_y(7.5, reverse=True))
+        assert np.isnan(self.p.x_at_y(-1.0))


Why assert instead of self.assertTrue ?

Probably inspired by numpy docs... I'll change it in the upcoming commit.
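
For reference, a minimal sketch of the `self.assertTrue` style inside a `unittest.TestCase`. The `lookup` function is a hypothetical stand-in for a method that returns nan when nothing is found; it is not beprof's `x_at_y`.

```python
import unittest

import numpy as np


def lookup(value):
    """Hypothetical stand-in: returns nan for values outside the valid range."""
    return np.nan if value < 0 else float(value)


class LookupTest(unittest.TestCase):
    def test_out_of_range_gives_nan(self):
        # assertTrue reports through unittest and is not stripped by `python -O`,
        # unlike a bare assert statement.
        self.assertTrue(np.isnan(lookup(-1.0)))

    def test_in_range_gives_value(self):
        self.assertEqual(lookup(2), 2.0)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(LookupTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```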

antnieszka · 2016-12-06T13:54:04Z

@grzanka I think we can now change the required version of numpy in setup.py to 1.8.0?

grzanka · 2016-12-06T14:05:22Z

Yes, change it please

grzanka · 2016-12-06T14:08:16Z

beprof/curve.py

@@ -113,7 +113,13 @@ def change_domain(self, domain):
            raise ValueError('in change_domain():' 'the old domain does not include the new one')

        y = np.interp(domain, self.x, self.y)
-        obj = self.__class__(np.stack((domain, y), axis=1), **self.__dict__['metadata'])
+        # We need to join together domain and y because we are recreating 2 dimensional Curve object
+        # (we pass it as argument to self.__class__ to do so and it takes 2 or 3 dimensional arrays as argument.


We should have a discussion on curve with 3-D data. Check if it is implemented correctly, test methods etc... - separate issue.

grzanka · 2016-12-06T14:08:37Z

beprof/profile.py

@@ -144,11 +146,14 @@ def normalize(self, dt):
        except RuntimeWarning as e:
            logger.error('in normalize(). self class is {0}, dt={1}'.format(self.__class__, dt))
            raise Exception("Scaling factor error:\n" + str(e))
-        self.y /= ave
+        self.y = self.y / ave


Why was this change made?

Fixing error mentioned above in comments:

>>> c = np.array([1,2,3])
>>> c
array([1, 2, 3])
>>> c /= 2.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: ufunc 'divide' output (typecode 'd') could not be coerced to provided output parameter (typecode 'l') according to the casting rule ''same_kind''
>>> c
array([1, 2, 3])
>>> c = c / 2.
>>> c
array([ 0.5,  1. ,  1.5])

When we have only integers - should I roll back this one?

You can keep this change, but add info to the docstring about what happens if you have an array of integers, explaining what happens when you:

normalize with an integer factor

normalize with a floating-point factor

But before proceeding further, consider these two snippets: self.y /= ave and self.y = self.y / ave. Which one requires additional memory allocation?

What if self.y is a large array of floats (say 1e6 elements) and we normalize by a factor of 0.5? Which of the above-mentioned snippets will run faster?

Reading https://docs.scipy.org/doc/numpy/reference/internals.html might help.
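
A quick way to see the allocation difference between the two snippets, as a sketch with an illustrative array of ones:

```python
import numpy as np

y = np.ones(10**6)
alias = y                 # second name for the same array object

y /= 0.5                  # in place: writes into the existing buffer
same_object = alias is y  # still the very same array

z = y / 0.5               # out of place: allocates a new array for the result
shares = np.shares_memory(y, z)

print(same_object, shares)
```

The in-place form reuses the existing buffer, while the second form has to allocate a full-size result array first, which is the extra cost the question above points at.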

Ok, I have two ideas for rescale and normalize divisions:

    def rescale(self, factor=1.0, allow_cast=True):
        if allow_cast:
            # will work in 99.9% cases
            try:
                self.y /= factor
            # there is this case, where "TypeError: ufunc 'true_divide' output (typecode 'd') could not be coerced
            # to provided output parameter (typecode 'l') according to the casting rule ''same_kind''" is thrown
            except TypeError as e:
                logger.info("Division in place is impossible, casting...\n%s" % e)
                self.y = self.y / factor
        elif not np.issubdtype(self.y.dtype.type, type(factor)):
            raise TypeError("Type mismatch - array dtype: %s, factor type: %s\nallow_cast flag set to True should help" % (self.y.dtype.type, type(factor)))
        else:
            try:
                self.y /= factor
            except TypeError as e:
                raise TypeError("%s\nallow_cast flag set to True should help" % e)

and simpler(?) one:

    def rescale(self, factor=1.0, allow_cast=True):
        try:
            self.y /= factor
        except TypeError as e:
            logger.info("Division in place is impossible.\n%s" % e)
            if allow_cast:
                self.y = self.y / factor
            else:
                raise TypeError("%s\nallow_cast flag set to True should help" % e)

The second one is better, but please also add a proper docstring with doctests (see e.g. the change_domain and rebinned methods).

Read also this: http://www.ianbicking.org/blog/2007/09/re-raising-exceptions.html
https://blog.ionelmc.ro/2014/08/03/the-most-underrated-feature-in-python-3/

Sure, the second came to my mind when I was writing this reply, so...

What about the div zero case? Should we somehow catch it, or just let it error or whatever?

Print nice error message and let numpy exception be thrown

Well, the above is not yet ready...

>a
Curve([[ 0, 0], [ 5, 5], [10, 10]])
>a.rescale(3.7)
>a
Curve([[ 0, 0], [ 5, 1], [10, 2]])
>a.rescale(22)
>a
Curve([[ 0, 0], [ 5, 0], [10, 0]])
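
A possible explanation for the truncated values, assuming y is stored as a column of an integer 2-D array (not verified against the beprof source): the division itself produces floats, but writing them back into an integer array truncates them. A sketch with a plain ndarray:

```python
import numpy as np

# Integer data with the same values as in the session above.
data = np.array([[0, 0], [5, 5], [10, 10]])

truncated = data.copy()
# The division yields floats, but storing them into an integer column truncates.
truncated[:, 1] = truncated[:, 1] / 3.7

# Casting the whole array to float first keeps the fractional part.
as_float = data.astype(float)
as_float[:, 1] = as_float[:, 1] / 3.7

print(truncated[:, 1], as_float[:, 1])
```

Under that assumption, the fallback branch would need to convert the whole array to a float dtype rather than only reassigning the y column.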

grzanka · 2016-12-06T14:09:25Z

beprof/profile.py

+        # todo: investigate why super().__str__() errors in python 2.7
+        #       after change of implementation of __new__
+        ret = curve.Curve.__str__(self)
+        # ret = super(curve.Curve).__str__()


Why did you add the second commented line?

It was the previous version, left for discussion, but since we're using git it's pointless, yes.

Remove it. Also, the "investigate" TODO can go directly into a new issue instead of a dirty comment.

It already has issue: #99

grzanka · 2016-12-06T14:10:19Z

tests/test_profile.py

+
+        # case - less or equal to 0
+        p1 = Profile([[1, 1], [2, 20], [3, 40]])
+        p1.normalize(-1)


Does it make sense to normalize this way ?

No, but we probably should inform user about it. Maybe raise ValueError()?

Good idea. Such exception is OK, but add also some message in it, guiding user what type of input is expected.

Probably "Expected positive input" would be enough?
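
A minimal sketch of such a check. The helper name `check_dt` and the exact message wording are suggestions only, and treating `dt == 0` as invalid is an assumption based on the "less or equal to 0" test case above.

```python
def check_dt(dt):
    """Hypothetical validation helper for the dt argument of normalize."""
    if dt <= 0:
        # Tell the user what kind of input is expected, not only that it failed.
        raise ValueError("Expected positive input for dt, got {0!r}".format(dt))
    return dt


print(check_dt(1.5))
```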

grzanka · 2016-12-09T07:54:55Z

beprof/curve.py

+            if allow_cast:
+                self.y = self.y / factor
+            else:
+                logger.error("allow_cast flag set to True should help")


Does code execution continue, or does it stop with some exception being rethrown?

It stops, we could go for re-raise at the end - after calculating with self.y = self.y / factor or ignore TypeErrors thrown by self.y /= factor or... just log it?

    try:
        self.y /= factor
    except TypeError as e:
        logger.error("Division in place is impossible")
        if allow_cast:
            self.y = self.y / factor
        else:
            logger.error("allow_cast flag set to True should help")
        raise e

Ignore error case (clean, but...):

>>> a
Curve([[1, 5], [2, 6], [3, 7], [4, 8]])
>>> a.rescale(2)
Division in place is impossible
>>> a
Curve([[1, 2], [2, 3], [3, 3], [4, 4]])
>>> a.rescale(2, allow_cast=False)
Division in place is impossible
allow_cast flag set to True should help
>>> a
Curve([[1, 2], [2, 3], [3, 3], [4, 4]])

Works like this for code below:

    def rescale(self, factor=1.0, allow_cast=True):
        try:
            self.y /= factor
        except TypeError as e:
            logger.error("Division in place is impossible")
            if allow_cast:
                self.y = self.y / factor
            else:
                logger.error("allow_cast flag set to True should help")

The thing is - at least for me, if something throws an exception, it probably did not finish its job (and possibly rolled back). So raising an exception after a successful division... could be misinterpreted?

grzanka · 2016-12-09T19:36:44Z

beprof/profile.py

+                self.y = self.y / ave
+            else:
+                logger.error("Division in place impossible - allow_cast flag set to True should help")
+                raise e


Is this the correct way of re-throwing an exception?
See e.g. http://blog.bstpierre.org/python-exception-handling-cleanup-and-reraise

Not really. Another try-catch would be nice and reporting original exception - like reraiser4()?

Edit: some of these examples are no longer supported, e.g. in Python 3.4

For:

    try:
        self.y /= factor
    except TypeError as e:
        # (_, _, traceback) = sys.exc_info()
        logger.warning("Division in place is impossible: {0}".format(e))
        if allow_cast:
            self.y = self.y / factor
        else:
            logger.error("allow_cast flag set to True should help")
        raise

result:

>>> a.rescale(0, False)
Division in place is impossible: ufunc 'true_divide' output (typecode 'd') could not be coerced to provided output parameter (typecode 'l') according to the casting rule ''same_kind''
allow_cast flag set to True should help
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "/home/ant6/PycharmProjects/beprof/beprof/curve.py", line 111, in rescale
    self.y /= factor
TypeError: ufunc 'true_divide' output (typecode 'd') could not be coerced to provided output parameter (typecode 'l') according to the casting rule ''same_kind''

result2

>>> a.rescale(0, True)
Division in place is impossible: ufunc 'true_divide' output (typecode 'd') could not be coerced to provided output parameter (typecode 'l') according to the casting rule ''same_kind''
/home/ant6/PycharmProjects/beprof/beprof/curve.py:116: RuntimeWarning: divide by zero encountered in true_divide
  self.y = self.y / factor
/home/ant6/PycharmProjects/beprof/beprof/curve.py:116: RuntimeWarning: invalid value encountered in true_divide
  self.y = self.y / factor
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "/home/ant6/PycharmProjects/beprof/beprof/curve.py", line 111, in rescale
    self.y /= factor
TypeError: ufunc 'true_divide' output (typecode 'd') could not be coerced to provided output parameter (typecode 'l') according to the casting rule ''same_kind''

Both exceptions are reported
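
One way to avoid raising after a successful fallback is to re-raise only in the branch where no cast is allowed. The sketch below is a hypothetical free-function version (`rescale_values` is not a beprof name) that returns the result instead of assigning to `self.y`; the log messages are copied from the thread.

```python
import logging

import numpy as np

logger = logging.getLogger(__name__)


def rescale_values(values, factor=1.0, allow_cast=True):
    """Divide values by factor, in place when the dtype allows it."""
    try:
        values /= factor  # fast path: no new allocation
        return values
    except TypeError as exc:
        logger.warning("Division in place is impossible: %s", exc)
        if not allow_cast:
            logger.error("allow_cast flag set to True should help")
            raise  # bare raise re-raises the active exception with its traceback
    # Fallback: a new array whose dtype can hold the result.
    return values / factor


ints = np.array([1, 2, 3])
print(rescale_values(ints, 2.0))
```

With this layout, `allow_cast=True` finishes quietly after the fallback, while `allow_cast=False` propagates the original `TypeError`.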

antnieszka · 2016-12-10T10:52:13Z

@grzanka ready for re-review

antnieszka added 7 commits December 5, 2016 14:24

Changed test directory

8b6ca98

Some basic test for Profile class

a803cef

Improving Profile.__new__ call

9bc6575

Added missing view test

ab7d7ad

FWHM test

3edbdcb

Resolves DataMedSci#33 - np.dstack used instead of np.stack

8179a99

Code reformat and extra comments

1ec3023

antnieszka added this to the 0.1.1 milestone Dec 5, 2016

antnieszka assigned antnieszka and grzanka Dec 5, 2016

grzanka reviewed Dec 5, 2016

Changed implementation of profile.normalize and added simple test

e4b4781

Improved test_profile a bit, fixed windows specific error and pep8

50dd584

antnieszka mentioned this pull request Dec 6, 2016

Use pytest as test runner? #112

Closed

grzanka reviewed Dec 6, 2016

Extra comments and tests code reformat'

fa2277e

grzanka reviewed Dec 6, 2016

antnieszka added 2 commits December 6, 2016 15:54

Added additional exception raising, removed some old comments, change…

8b7f632

…d minimal version of numpy to 1.8

Added wercker run with numpy < 1.9

d41b934

antnieszka added 4 commits December 9, 2016 00:03

Improved Curve.__new__ DataMedSci#116 and tests

b7cc180

Attempt to fix wercker pipeline for numpy 1.8

d22284b

Moved tests to dir beprof

ecc5c13

Fixed apveyor build

eb5bba4

grzanka reviewed Dec 9, 2016

antnieszka added 5 commits December 9, 2016 15:21

Removed scipy from appveyor cofig (fixed DataMedSci#117)

0986388

Changed decreased number of sphinx jobs in tox:docs

3532a1f

Another try to fix DataMedSci#118

55cc6fa

Fixed version of docutils DataMedSci#118

6fcd583

Improved implementation of rescale, normalize and tests

23dc8f3

grzanka reviewed Dec 9, 2016

antnieszka added 3 commits December 9, 2016 20:58

Fix DataMedSci#115 curve.__str__

ede8556

Refactored tests a bit

652be68

Added simple doctest in profile.normalize

2315c9e

grzanka approved these changes Dec 10, 2016

antnieszka merged commit 773045f into DataMedSci:master Dec 10, 2016

antnieszka deleted the feature/19-unit-testing branch December 13, 2016 16:58
