Reference implementations for softmax, log_softmax, logsumexp #79423
Conversation
Dr. CI: ✅ No failures (0 pending) as of commit 8893024. 💚 Looks good so far! There are no failures yet. (This comment was automatically generated by Dr. CI.)
What are the errors with the aten executor?

For …
torch/_refs/__init__.py (Outdated)

) -> TensorLikeType:
    result_dtype = dtype or a.dtype
    computation_dtype = utils.get_computation_dtype(a.dtype)
    a = prims.convert_element_type(a, computation_dtype)
Let's make this conditional on the conversion being required -- we actually have a function for this if you'd prefer, but it's in an awkward place (it could be moved to utils if you'd rather use it vs. a custom conditional)
pytorch/torch/_prims/wrappers.py, line 20 in 38e717d:

    def _maybe_convert_to_dtype(
Thanks! _maybe_convert_to_dtype is already used in this file, so we can keep it defined in the awkward place for now 🙂
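As a side note for readers, here is a minimal sketch of the conditional-conversion idea in plain PyTorch (a hypothetical helper; the actual `_maybe_convert_to_dtype` in torch/_prims/wrappers.py handles more cases):

```python
import torch

def maybe_convert_to_dtype(a: torch.Tensor, dtype: torch.dtype) -> torch.Tensor:
    # Only emit a conversion when the input is not already in the target dtype,
    # so traces of the reference don't contain no-op casts.
    if a.dtype != dtype:
        return a.to(dtype)  # stand-in for prims.convert_element_type
    return a
```

In the refs, both the upcast to the computation dtype and the final cast back to result_dtype can go through a guard like this.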
torch/_refs/__init__.py (Outdated)

    computation_dtype = utils.get_computation_dtype(a.dtype)
    a = prims.convert_element_type(a, computation_dtype)
    a_max = amax(a, dim, keepdim=True)
    shifted = a - a_max
Nit: a comment explaining why the shift occurs would be nice
Oh, the shift is actually not required here because stabilized logsumexp is used. I will remove it.
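A minimal plain-PyTorch sketch of that point (not the _refs code itself): once logsumexp is already stabilized internally, log_softmax needs no extra shift:

```python
import torch

def log_softmax_sketch(a: torch.Tensor, dim: int) -> torch.Tensor:
    # torch.logsumexp already subtracts the per-slice max internally,
    # so an explicit `a - a.amax(dim, keepdim=True)` shift is redundant here.
    return a - torch.logsumexp(a, dim, keepdim=True)

x = torch.randn(4, 5)
assert torch.allclose(log_softmax_sketch(x, -1), torch.log_softmax(x, -1))
```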
torch/_refs/__init__.py (Outdated)

    a_max = amax(a, dim, keepdim=True)
    shifted = a - a_max
    shifted_logsumexp = logsumexp(shifted, dim, keepdim=True)
    return prims.convert_element_type(
Let's also make this conditional on the conversion being required
torch/_refs/__init__.py (Outdated)

    )


def _squeeze_multiple(a: TensorLikeType, dims: DimsSequenceType) -> TensorLikeType:
Unify this with prims.squeeze -- see
pytorch/torch/_prims/__init__.py, line 1636 in 38e717d:

    squeeze = _make_prim(
I think we only need one? But maybe this implementation is better for the prim?
I didn't notice that prims.squeeze works for multiple specified dimensions. Both implementations are fine, so I will not touch prims.squeeze at this time.
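For illustration, a small standalone sketch of squeezing several dimensions in plain PyTorch (a hypothetical helper named `squeeze_multiple` here, not the code in this PR):

```python
import torch

def squeeze_multiple(a: torch.Tensor, dims) -> torch.Tensor:
    # Canonicalize negative dims, then squeeze from the highest index down so
    # earlier squeezes don't shift the positions of the remaining dims.
    for d in sorted((d % a.ndim for d in dims), reverse=True):
        if a.shape[d] == 1:
            a = a.squeeze(d)
    return a

x = torch.randn(1, 3, 1, 4)
print(squeeze_multiple(x, (0, -2)).shape)  # torch.Size([3, 4])
```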
torch/_refs/__init__.py (Outdated)

@out_wrapper
def logsumexp(
    a: TensorLikeType,
    dims: DimsType,
Right, it should be dim to match the torch namespace. I think I was confused because it also accepts and works for several dimensions: dim (int or tuple of python:ints).

    import torch
    a = torch.ones(3, 3)
    torch.logsumexp(a, (0, 1))
    # tensor(3.1972)
    keepdim: bool = False,
) -> TensorLikeType:
    dims = utils.canonicalize_dims(a.ndim, dims)
    # ATen specifies int[1] type dims which expands integers to tuples of length 1
Nice comment
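As a small illustration of what that comment refers to (a hypothetical standalone helper, not utils.canonicalize_dims itself): a bare integer dim behaves like a one-element tuple, mirroring ATen's int[1] type, and negative dims are wrapped:

```python
def canonicalize_dims_sketch(ndim: int, dims):
    # Mirror ATen's int[1] behaviour: a bare integer becomes a 1-tuple,
    # and negative dims are mapped into range(ndim).
    if isinstance(dims, int):
        dims = (dims,)
    return tuple(d % ndim for d in dims)

print(canonicalize_dims_sketch(3, -1))       # (2,)
print(canonicalize_dims_sketch(3, (0, -1)))  # (0, 2)
```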
torch/_refs/__init__.py (Outdated)

        a_max_squeezed = _squeeze_multiple(a_max, dims) if not keepdim else a_max
        result = log(sum(exp(a - a_max), dims, keepdim=keepdim)) + a_max_squeezed
    else:
        result = log(sum(exp(a), dims, keepdim=keepdim))
Add a comment for what this case covers (integer and boolean dtypes)
        dims = (dims,)
    if utils.is_float_dtype(a.dtype) or utils.is_complex_dtype(a.dtype):
        a_max = amax(a, dims, keepdim=True)
        a_max = where(abs(a_max) == float("inf"), 0.0, a_max)
really elegant code here
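Putting those pieces together, here is a plain-PyTorch sketch of the same stabilization (hypothetical, not the _refs implementation; complex inputs are omitted): the per-slice max is subtracted before exponentiating, an infinite max is zeroed out so that inf - inf does not produce NaN, and the non-float branch skips the shift entirely:

```python
import torch

def logsumexp_sketch(a: torch.Tensor, dims, keepdim: bool = False) -> torch.Tensor:
    if a.is_floating_point():
        # Shift by the max so exp() cannot overflow; replace an infinite max
        # with 0, otherwise `a - a_max` would yield inf - inf = nan.
        a_max = a.amax(dims, keepdim=True)
        a_max = torch.where(a_max.abs() == float("inf"), torch.zeros_like(a_max), a_max)
        out = (a - a_max).exp().sum(dims, keepdim=True).log() + a_max
        return out if keepdim else out.squeeze(dims)  # tuple-dim squeeze needs PyTorch >= 2.0
    # Integer and boolean inputs: exp() promotes them to float, and the
    # max-shift stabilization is skipped for this case.
    return a.exp().sum(dims, keepdim=keepdim).log()

x = torch.randn(2, 3, 4)
assert torch.allclose(logsumexp_sketch(x, (1, 2)), torch.logsumexp(x, (1, 2)))
```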
torch/_refs/__init__.py (Outdated)

) -> TensorLikeType:
    result_dtype = dtype or a.dtype
    computation_dtype = utils.get_computation_dtype(a.dtype)
    a = prims.convert_element_type(a, computation_dtype)
Similar comments here as with log_softmax re: conditional conversions
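The same pattern for softmax, as a hedged plain-PyTorch sketch (hypothetical; the real ref goes through prims and utils.get_computation_dtype): upcast only when the computation dtype differs, compute, then cast back to the requested result dtype only if needed:

```python
from typing import Optional
import torch

def softmax_sketch(a: torch.Tensor, dim: int, dtype: Optional[torch.dtype] = None) -> torch.Tensor:
    result_dtype = dtype or a.dtype
    # Assumed upcast rule: accumulate low-precision floats in float32.
    computation_dtype = torch.float32 if a.dtype in (torch.float16, torch.bfloat16) else a.dtype
    if a.dtype != computation_dtype:  # conditional conversion, as suggested above
        a = a.to(computation_dtype)
    a_exp = (a - a.amax(dim, keepdim=True)).exp()
    result = a_exp / a_exp.sum(dim, keepdim=True)
    return result if result.dtype == result_dtype else result.to(result_dtype)

x = torch.randn(4, 5)
print(torch.allclose(softmax_sketch(x, -1), torch.softmax(x, -1)))  # True
```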
This is awesome, @IvanYashchuk! I made some inline comments for your review, nothing major, and the lint job needs to be fixed, but approving this for velocity because I'm sure you'll sort out the review and the jobs.
Forward AD test fails with new sample input that reduces over multiple dims
@pytorchbot merge -g

@pytorchbot successfully started a merge job. Check the current status here.

Hey @IvanYashchuk.
Reference implementations for softmax, log_softmax, logsumexp (#79423)

Summary: This PR adds references for:
- `torch.softmax`
- `torch.log_softmax`
- `torch.logsumexp`

Unfortunately, none of them currently pass `test_python_ref_executor` even with the `"aten"` executor.

Pull Request resolved: #79423
Approved by: https://github.com/mruberry
Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/4fc7832d72926d65d773ae4e4ae0ed7fc573f0c7
Reviewed By: malfet
Differential Revision: D37156829
fbshipit-source-id: 88a1ed3d42fda30b880d8a2fe48f385ebdb98d22
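As a rough usage sketch (assuming the module path torch._refs and the dim keyword discussed above; the real coverage comes from the OpInfo-based test_python_ref tests), each reference is expected to agree numerically with its ATen counterpart:

```python
import torch
import torch._refs as refs

x = torch.randn(3, 4)

# Each reference should match the eager ATen implementation.
torch.testing.assert_close(refs.softmax(x, dim=-1), torch.softmax(x, dim=-1))
torch.testing.assert_close(refs.log_softmax(x, dim=-1), torch.log_softmax(x, dim=-1))
torch.testing.assert_close(refs.logsumexp(x, dim=-1), torch.logsumexp(x, dim=-1))
```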