ENH: special: Add the softmax function #8872
Conversation
A couple of minor, optional changes.
@@ -640,7 +641,7 @@
 from ._ufuncs import *

 from .basic import *
-from ._logsumexp import logsumexp
+from ._logsumexp import logsumexp, softmax
Is it possible to rename this to `_expfuncs.py`?
Since it is a private module, I'm not too worried about the name. And once you realize that the softmax function is the gradient of `logsumexp`, it isn't so bad leaving the name as `_logsumexp.py`. 😃 That also suggests that a better name for `softmax` is something like `logsumexp_grad`, but `softmax` is what people will be looking for.
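Here is a quick numerical sketch of that identity (not part of the PR; it assumes `softmax` is importable from `scipy.special` once this lands):

import numpy as np
from scipy.special import logsumexp, softmax

# Central finite differences: the gradient of logsumexp should match softmax.
x = np.array([1.0, 2.0, 3.0])
eps = 1e-6
grad = np.array([(logsumexp(x + eps*e) - logsumexp(x - eps*e)) / (2*eps)
                 for e in np.eye(x.size)])
print(np.allclose(grad, softmax(x)))  # True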
> And once you realize that the softmax function is the gradient of `logsumexp`
This seems like an interesting enough observation that it could go into the notes section of the docs
> This seems like an interesting enough observation that it could go into the notes section of the docs
Done (if somewhat tersely).
scipy/special/_logsumexp.py
Outdated
----------
x : array_like
    Input array.
axis : int, optional
Can be a tuple of `int`s as well.
Done, thanks.
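For example, a reduction over multiple axes might look like this (an illustrative sketch, not from the PR; the shape and axes are arbitrary):

import numpy as np
from scipy.special import softmax

x = np.arange(24.0).reshape(2, 3, 4)
s = softmax(x, axis=(0, 1))    # normalize over a tuple of axes
print(s.sum(axis=(0, 1)))      # each reduced slice sums to 1: [1. 1. 1. 1.]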
Force-pushed from 5907f8a to 161d72f
Over in #8556, @person142 was against adding this to `scipy.special`.
Since the majority are in favor, I would hardly want to hold this back.
Some more minor comments.
scipy/special/tests/test_softmax.py
Outdated
@@ -0,0 +1,59 @@
from __future__ import division, print_function, absolute_import
It'd be nice if this could be merged into `test_logsumexp.py`; since the source file is the same, the test file should be the same. It's nice if the tests mirror the structure of the actual code.
Agreed. I was this close to making this change earlier anyway.
Done.
scipy/special/_logsumexp.py
Outdated
    # compute in log space for numerical stability
    sigma = np.exp(x - logsumexp(x, axis=axis, keepdims=True))
    return sigma
Nit by @eric-wieser that I agree with: merge these two lines.
I don't have a strong preference here, but since folks are commenting on it in a pull request (and I seem to recall seeing it in another one), I'd like to get some clarification. @hameerabbasi, @eric-wieser, can you give explicit reasons for making this change? Is it for better performance? I see about a 5 ns improvement between
def func1(x):
y = 2*x
return y
and
def func2(x):
return 2*x
The argument against the change is that assigning the return value to a variable, and then returning that variable in a separate `return` statement, eases future debugging. One can set a breakpoint at the return statement (or add a print statement before it) to inspect the result before it is returned. Indeed, this pattern is taught in some courses on programming style.
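For reference, a rough way to reproduce that comparison (a sketch; absolute numbers will vary by machine and interpreter):

import timeit

setup = """
def func1(x):
    y = 2*x
    return y

def func2(x):
    return 2*x
"""
n = 10**6
# per-call time in seconds; the difference is on the order of nanoseconds
print(timeit.timeit("func1(3)", setup=setup, number=n) / n)
print(timeit.timeit("func2(3)", setup=setup, number=n) / n)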
Like I said, it's a nit. Feel free to ignore. I tend to only store a variable if it's not a one-liner.
Done.
The argument was only ever one of brevity, not of performance. Also, it's weird to see:
def thing_one():
thing_two = ...
return thing_two
because it suggests the author wasn't really sure what to call their function.
> to inspect the result before it is returned
Note that pdb provides `__return__` for this purpose. Typing `return` (execute and run the return statement) followed by `p __return__` does exactly this.
> Indeed, this pattern is taught in some courses on programming style.
These courses would do better to teach how to use the debugger usefully.
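A minimal sketch of that workflow (the function here is just a stand-in, and the session commands are typed at the `(Pdb)` prompt):

import pdb

def double(x):
    return 2*x

# At the (Pdb) prompt: `step` into double, then `return` to run until the
# function is about to return, then `p __return__` to print the pending
# return value; no temporary variable needed.
pdb.run("double(21)")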
Adds the softmax function, commonly used in machine learning and statistics, to scipy.special.
* Update the docstring:
  * Copy-edit a bit.
  * Added "Examples" section.
  * Move LaTeX notation to the Notes section.
  * Add versionadded annotation.
* Tests:
  * More explicit tests of "normal" cases and extreme cases.
  * More tests using the `axis` argument.
  * Use `assert_allclose` instead of `assert_almost_equal`.
Force-pushed from 161d72f to 12dd12e
Force-pushed from 0d9f6f9 to 607be34
LGTM. Thanks for the patience, @WarrenWeckesser!
""" | ||
|
||
# compute in log space for numerical stability | ||
return np.exp(x - logsumexp(x, axis=axis, keepdims=True)) |
Are we sure that this is as stable as the one suggested here?
def stablesoftmax(x, axis=None):
"""Compute the softmax of vector x in a numerically stable way."""
shiftx = x - np.max(x, axis=axis, keepdims=True)
exps = np.exp(shiftx)
return exps / np.sum(exps, axis=axis, keepdims=True)
That implementation has the advantage that translating the input (without precision loss) results in exactly the same result, not just a to-within-floating-point-accuracy result.
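A small experiment comparing the two (a sketch; the shift of 100.0 is arbitrary but exactly representable, so the translation itself loses no precision):

import numpy as np
from scipy.special import logsumexp

def softmax_log(x, axis=None):
    # the implementation under review: subtract logsumexp in log space
    return np.exp(x - logsumexp(x, axis=axis, keepdims=True))

def softmax_shift(x, axis=None):
    # the alternative: shift by the max, then normalize
    e = np.exp(x - np.max(x, axis=axis, keepdims=True))
    return e / np.sum(e, axis=axis, keepdims=True)

x = np.array([1.0, 2.0, 3.0])
for f in (softmax_log, softmax_shift):
    # exact translation invariance would print 0.0; the log-domain version
    # may be off by a rounding error or two
    print(f.__name__, np.abs(f(x) - f(x + 100.0)).max())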
I tend to disagree... There will be floating-point inaccuracies incurred by division, `np.exp`, and `np.sum`. The key point here is that even subtraction has floating-point inaccuracies if the exponents of the two operands aren't exactly equal.

This is just the softmax done in the log domain. Agreed that `logsumexp` could introduce some floating-point errors, but the other implementation doesn't necessarily produce the same result as just the last line. It could be that the errors cancel out, but I'm not well-versed enough in floating-point arithmetic to see how that'd work.
Can you add a test that all of the following give […]

I'm not sure there's a clear definition for what cases with repeated infinities like […]

@eric-wieser I don't understand why you're saying you expect […]

Either way, that's about […] Thanks all!

@rgommers I think he meant […]

Ah, that makes more sense. That's not the case though for either the current implementation or the […]

Of course. In one case, you're doing […]
This pull request is a continuation of #8556.