Exposing the lower and upper limits V.2 #41

eric-valente · 2018-04-16T22:25:43Z

Trying to get this PR merged. I forked and fixed the remaining issues.

"either input an array or a single value of lower and upper limits
check that lower_limits <= 0 and upper_limits >= 0"

stephen-hoover

Thank you for reviving this PR! I noticed some things which the previous review failed to catch. The comments apply to both logistic and linear algorithms.

stephen-hoover · 2018-04-16T22:45:23Z

glmnet/linear.py

+        if any(self.lower_limits) > 0 if isinstance(self.lower_limits, np.ndarray) else self.lower_limits > 0:
+            raise ValueError("lower_limits must be non-positive")
+
+        if any(self.upper_limits) < 0 if isinstance(self.upper_limits, np.ndarray) else self.upper_limits < 0:


The closing parenthesis of the any call needs to come after the 0, here and above.

any(self.lower_limits > 0)

Another suggestion: Cast the inputs with an np.asarray if they fail an np.isscalar check; that will allow lists or tuples to be valid inputs. There should also be a ValueError raised if there's an array input with the wrong number of elements.

Please also put these checks in fit, maybe in their own helper function. The scikit-learn API requires that __init__ do nothing other than set the input parameters.
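
A minimal sketch of the precedence problem described above, using made-up limit values rather than anything from the PR. With the parenthesis in the wrong place, any(...) collapses the array to a single bool before the comparison, so the lower-limit check rejects valid input and the upper-limit check never fires:

```python
import numpy as np

valid_lower = np.repeat(-1.0, 3)            # all non-positive, so valid
invalid_upper = np.array([1.0, -0.5, 2.0])  # -0.5 is negative, so invalid

# Buggy form: any(...) collapses the array to one bool, and that bool is
# then compared with 0.
buggy_rejects_valid_lower = any(valid_lower) > 0      # True > 0
buggy_catches_invalid_upper = any(invalid_upper) < 0  # True < 0

# Fixed form: compare elementwise first, then reduce.
fixed_rejects_valid_lower = bool(np.any(valid_lower > 0))
fixed_catches_invalid_upper = bool(np.any(invalid_upper < 0))

print(buggy_rejects_valid_lower, buggy_catches_invalid_upper)  # True False
print(fixed_rejects_valid_lower, fixed_catches_invalid_upper)  # False True
```

This is also why an array of -1 lower limits is a useful test input: the buggy check raises on it even though it is valid.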

stephen-hoover · 2018-04-16T22:49:37Z

glmnet/tests/test_linear.py

+            m = ElasticNet(lower_limits=lower_limits, upper_limits=upper_limits, random_state=5934)
+            m = m.fit(x, y)
+            assert(np.all(m.coef_) >= 0)
+            assert(np.all(m.coef_) <= 1)            


As long as all coefficients are non-zero, these tests reduce to assert True >= 0 and assert True <= 1. The closing parenthesis on the np.all needs to be after the integer.
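
To illustrate, here is a small sketch with hypothetical coefficients that clearly violate the [0, 1] bounds. The misplaced parenthesis makes both comparisons hold anyway, while the intended form reports the violation:

```python
import numpy as np

coef = np.array([-3.0, 5.0, 0.2])  # hypothetical coefficients outside [0, 1]

# Misplaced parenthesis: np.all(coef) is a single bool, so both comparisons
# hold no matter what the coefficients are.
vacuous_lower = bool(np.all(coef) >= 0)  # True >= 0
vacuous_upper = bool(np.all(coef) <= 1)  # True <= 1

# Intended checks: compare elementwise, then reduce.
real_lower = bool(np.all(coef >= 0))
real_upper = bool(np.all(coef <= 1))

print(vacuous_lower, vacuous_upper)  # True True
print(real_lower, real_upper)        # False False
```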

stephen-hoover · 2018-04-16T22:50:18Z

glmnet/tests/test_linear.py

+    def test_coef_limits(self):
+            x, y = self.inputs[0]
+            lower_limits = 0
+            upper_limits = np.repeat(1,x.shape[1])


Since this test fails to catch the bug in the bounds checking, how about you try re-writing it with lower limits of np.repeat(-1, x.shape[1]) and upper limits of 0.

eric-valente · 2018-04-17T00:30:33Z

Think I covered all your points except this: There should also be a ValueError raised if there's an array input with the wrong number of elements.

Can you give me a little guidance? Should a ValueError be raised whenever the array has more elements than the number of features?

stephen-hoover · 2018-04-17T13:30:45Z

The limits must be either a scalar, or else a list or array with length equal to the number of features. In the if not np.isscalar(self.lower_limits) blocks, you can put a check like if len(...) != X.shape[1]: raise ValueError. You should move the X, y = check_X_y(X, y, accept_sparse='csr', ensure_min_samples=2) line to the top of the function so that you can be sure that X is an array at that point.
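
The checks described in this thread could be gathered into one helper along the following lines. This is only a sketch, not the code that was merged: the name check_limits, its signature, and the choice to broadcast a scalar to one value per feature are all assumptions made for illustration. n_features stands in for X.shape[1] after X has been validated.

```python
import numpy as np


def check_limits(lower_limits, upper_limits, n_features):
    """Hypothetical helper: validate coefficient limits, return them as arrays."""
    checked = []
    for name, limits in (("lower_limits", lower_limits),
                         ("upper_limits", upper_limits)):
        if np.isscalar(limits):
            # Assumption of this sketch: broadcast a scalar to every feature.
            limits = np.repeat(float(limits), n_features)
        else:
            # Accept lists and tuples as well as arrays.
            limits = np.asarray(limits, dtype=float)
            if limits.shape != (n_features,):
                raise ValueError(f"{name} must have one element per feature")
        checked.append(limits)

    lower, upper = checked
    if np.any(lower > 0):
        raise ValueError("lower_limits must be non-positive")
    if np.any(upper < 0):
        raise ValueError("upper_limits must be non-negative")
    return lower, upper


lower, upper = check_limits(-1, [0, 1, 2], n_features=3)
print(lower, upper)
```

Calling such a helper from fit, after the check_X_y line, keeps __init__ free of validation logic as the scikit-learn API expects.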

eric-valente · 2018-04-17T14:08:03Z

Okay, made the requested changes; will check whether it passes CI.

eric-valente · 2018-04-17T15:01:32Z

AssertionError on the logistic tests, not linear. Thoughts?

Traceback (most recent call last):
  File "/src/python-glmnet/glmnet/tests/test_logistic.py", line 101, in test_coef_limits
    assert(np.all(m.coef_ >= 0))
AssertionError

stephen-hoover · 2018-04-17T16:29:37Z

I'm bothered by the fact that the linear tests don't fail. In each test, you've set lower limits of -1 and upper limits of 0. The tests are testing that the coefficients are between 0 and 1. The tests should be changed to check that coefficients are between -1 and 0. If the linear tests are passing, then it must be that all coefficients are 0.

Try setting alpha = 0 in both the ElasticNet and LogitNet constructors in the new tests. It defaults to 1, which is lasso. That will push coefficients to equal 0. If you make that change in the linear tests before switching the limits, then that test should start failing.
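
The reasoning can be checked with a short sketch. The coefficient vectors below are hypothetical and do not come from a fitted model. An all-zero vector satisfies both the [0, 1] and the [-1, 0] range checks, so a test that passes on it says nothing about whether the limits were applied:

```python
import numpy as np


def in_range(coef, low, high):
    """Return True when every coefficient lies in [low, high]."""
    return bool(np.all(coef >= low) and np.all(coef <= high))


all_zero = np.zeros(4)                         # every coefficient shrunk to 0
negative = np.array([-0.3, 0.0, -0.8, -0.1])   # hypothetical fit within [-1, 0]

print(in_range(all_zero, 0, 1), in_range(all_zero, -1, 0))  # True True
print(in_range(negative, 0, 1), in_range(negative, -1, 0))  # False True
```

Only coefficients that actually move away from zero let the [0, 1] assertion fail, which is why a passing linear test was suspicious.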

eric-valente · 2018-04-17T16:38:53Z

No luck w/ alpha=0

======================================================================
FAIL: test_coef_limits (test_linear.TestElasticNet)

Traceback (most recent call last):
  File "/src/python-glmnet/glmnet/tests/test_linear.py", line 111, in test_coef_limits
    assert(np.all(m.coef_ >= 0))
AssertionError

======================================================================
FAIL: test_coef_limits (test_logistic.TestLogitNet)

Traceback (most recent call last):
  File "/src/python-glmnet/glmnet/tests/test_logistic.py", line 101, in test_coef_limits
    assert(np.all(m.coef_ >= 0))
AssertionError

eric-valente · 2018-04-17T16:43:04Z

I think the code is okay. We are setting lower_limits to -1, so it makes sense that this test fails: coefficients may now be negative, which trips assert(np.all(m.coef_ >= 0)).

Would this make more sense?

        assert(np.all(m.coef_ >= -1))
        assert(np.all(m.coef_ <= 0))

stephen-hoover · 2018-04-17T16:45:01Z

Yes, I'm happy to see that both tests are now failing when you set alpha=0. That's good. If you switch the asserts like you suggest, then the tests should pass.

eric-valente · 2018-04-17T16:47:52Z

Changed the bounds of the test, but it does the same thing. It passed. Anything else?

        x, y = self.inputs[0]
        lower_limits = 0
        upper_limits = np.repeat(1, x.shape[1])
        m = ElasticNet(lower_limits=lower_limits, upper_limits=upper_limits, random_state=5934, alpha=0)
        m = m.fit(x, y)
        assert(np.all(m.coef_ >= 0))
        assert(np.all(m.coef_ <= 1))

stephen-hoover · 2018-04-17T16:51:24Z

I'd previously requested that you change the limits to -1 to 0 because the original tests, with limits from 0 to 1, failed to expose a bug in the bounds checking. I'd prefer that these tests use a lower limit of -1 and an upper limit of 0.

eric-valente · 2018-04-17T16:57:56Z

Okay, put it back and changed this:

        assert(np.all(m.coef_ >= -1))
        assert(np.all(m.coef_ <= 0))

eric-valente · 2018-04-17T17:03:41Z

Okay looks good! Let me know.

stephen-hoover

LGTM! Thank you for your contribution!

Eric Valente added 3 commits April 16, 2018 18:12

Added upper/lower coefficient limits

12b455b

Fixed spacing

7733348

Fixed some spacing

0131d8f

stephen-hoover suggested changes Apr 16, 2018

Updated error handling and tests

d94cc13

Eric Valente added 3 commits April 16, 2018 20:50

Typo fixed

636c911

Another typo

10f9016

Typo

fbeee37

Added additional checks for upper/lower limits

9f91d08

Fixed some whitespaces

33425e9

Added alpha=0 to test constructors

0ebb172

Changing test coef limits

28559b6

Changed coef test to -1 to 0

88003c5

stephen-hoover approved these changes Apr 17, 2018

stephen-hoover merged commit de9b92b into civisanalytics:master Apr 17, 2018

stephen-hoover mentioned this pull request Apr 17, 2018

Force positive coefficient #40

Closed

ericjster mentioned this pull request Mar 2, 2019

Create new release for PIP past 2.0 #49

Closed
