
Rename symbols to support torch.einsum #41

Merged
merged 3 commits into dgasmith:master on Aug 18, 2018

Conversation

@fritzo (Contributor) commented Aug 17, 2018

Description

This renames symbols so the torch backend works with PyTorch 0.4.1, whose einsum accepts only the symbols a-z. This is useful when the overall computation has more than 26 dimensions but any single contraction involves 26 or fewer dimensions.
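
For illustration (a standalone sketch, not the PR's code), a contraction whose symbols fall outside a-z can be remapped onto lowercase letters before calling torch.einsum, as long as it uses 26 or fewer distinct indices:

import string

def rename_for_torch(equation):
    # Hypothetical helper: remap each distinct index symbol onto a-z.
    # The PR itself uses opt_einsum's get_symbol helper instead of
    # string.ascii_lowercase, so it keeps working if torch later allows more symbols.
    symbols = sorted(set(equation) - set(',->'))
    rename = {s: string.ascii_lowercase[i] for i, s in enumerate(symbols)}
    return ''.join(rename.get(c, c) for c in equation)

print(rename_for_torch('AB,BC->AC'))  # prints 'ab,bc->ac'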

Questions

  • Do other backends work? I have only tested PyTorch. If other backends don't work, I'll simply limit the test to the PyTorch backend.

Status

  • Ready to merge as soon as tests pass

Tested

@dgasmith (Owner) left a comment

Overall LGTM. It looks like CI missed a trigger (very odd); can you make another commit to trigger CI when you get the chance?

@@ -31,6 +31,12 @@ def transpose(a, axes):
def einsum(equation, *operands):
"""Variadic version of torch.einsum to match numpy api.
"""
# rename symbols to support PyTorch 0.4.1 and earlier,
# which allow only symbols a-z.
symbols = sorted(set(equation) - set(',->'))
@dgasmith (Owner):
Extreme, but can we throw if len(symbols) > 26?

@fritzo (Contributor, Author) commented Aug 18, 2018:

The advantage of not throwing is that this will continue to work when PyTorch fixes their einsum to allow more symbols. I'd rather let their current implementation throw.

@dgasmith (Owner):
Makes sense. I am not intimately familiar with the new torch symbol set, but would opt_einsum.parser.convert_to_valid_einsum_chars work here?

# which allow only symbols a-z.
symbols = sorted(set(equation) - set(',->'))
rename = {s: get_symbol(i) for i, s in enumerate(symbols)}
equation = ''.join(rename.get(s, s) for s in equation)
@dgasmith (Owner):
@jcmgray We do this for einsum as well I believe. Is it time to have a single expression which converts from global to local symbols?

@@ -39,8 +45,6 @@ def tensordot(x, y, axes=2):
"""Simple translation of tensordot syntax to einsum.
"""
# XXX: tensordot should be directly implemented in torch soon
torch, _ = _get_torch_and_device()
@dgasmith (Owner):
@jcmgray Can you look over this as well?

@fritzo (Contributor, Author):
(Note: the only reason I've removed this line is so I can reuse the einsum() wrapper in this file, rather than calling torch.einsum in two different places.)
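
(As a side note, a rough sketch of how an integer-axes tensordot call translates into an einsum equation; this is a hypothetical illustration, not the code in this file:)

import string

def tensordot_equation(ndim_x, ndim_y, axes=2):
    # Hypothetical illustration: the last `axes` dimensions of x are contracted
    # against the first `axes` dimensions of y; all other dimensions are kept.
    syms = string.ascii_lowercase
    x_idx = syms[:ndim_x]
    y_idx = x_idx[ndim_x - axes:] + syms[ndim_x:ndim_x + ndim_y - axes]
    out_idx = x_idx[:ndim_x - axes] + y_idx[axes:]
    return x_idx + ',' + y_idx + '->' + out_idx

print(tensordot_equation(3, 3, axes=2))  # 'abc,bcd->ad'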

@dgasmith dgasmith requested a review from jcmgray August 18, 2018 00:25
@jcmgray (Collaborator) commented Aug 18, 2018

Do other backends work? I have only tested PyTorch. If other backends don't work, I'll simply limit the test to the PyTorch backend.

einsum in numpy, cupy, dask and tensorflow (only recently for tensorflow) all support upper case letters, so tests should be fine. The ideal situation would be for pytorch to add support for upper case, but the implementation is C++ so it's not super obvious to me how easy that would be (@t-vi?). Probably easier to add it here anyway!


It might be cleaner, and more useful for potential future backends, to modify the existing machinery:

  • opt_einsum.parser.convert_to_valid_einsum_chars
  • opt_einsum.parser.has_valid_einsum_chars_only
  • opt_einsum.parser.is_valid_einsum_char

probably with an allow_uppercase keyword, which could just switch is_valid_einsum_char between checking against:

import string
string.ascii_lowercase  # or
string.ascii_letters

Also, for what it's worth, your strategy of just replacing all characters might be faster than replacing just the invalid ones, in which case do update convert_to_valid_einsum_chars with your snippet.
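
A minimal sketch of what that keyword could look like (hypothetical; the actual opt_einsum.parser functions may differ in signature and details):

import string

def is_valid_einsum_char(x, allow_uppercase=True):
    # Hypothetical sketch: restrict to a-z for backends such as torch 0.4.1
    # whose einsum only accepts lowercase index symbols.
    letters = string.ascii_letters if allow_uppercase else string.ascii_lowercase
    return (x in letters) or (x in ',->.')

def has_valid_einsum_chars_only(equation, allow_uppercase=True):
    # True only if every character of the equation is acceptable to the backend.
    return all(is_valid_einsum_char(c, allow_uppercase) for c in equation)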

@t-vi commented Aug 18, 2018 via email

@fritzo (Contributor, Author) commented Aug 18, 2018

@t-vi lowercase a-z is all anyone ever uses in the wild

We're using opt_einsum.contract() for hundreds of variables in Pyro. Each local contraction requires only a few variables (hence this PR), but we have one variable per time step in a Hidden Markov Model.

@jcmgray (Collaborator) commented Aug 18, 2018

Yes just to clarify, opt_einsum already maps pairwise contractions into the [a-zA-Z] range if necessary (since large contractions can have thousands of indices). The real problem is if you want 26+ indices in a single pairwise contraction - i.e. a tensor with 26+ dimensions. Sounds unusual but this is actually quite a likely/necessary situation if, for example, you are simulating large quantum circuits.

Now, opt_einsum tries to call tensordot as much as possible, so will probably avoid this niche case (torch backend + non-tensordot-able contractions + >26 dimensions), but it's certainly possible!

@t-vi commented Aug 18, 2018 via email

@dgasmith dgasmith closed this Aug 18, 2018
@dgasmith dgasmith reopened this Aug 18, 2018
@dgasmith (Owner):

Closed/reopened to trigger Travis. Once that passes and opt_einsum.parser.convert_to_valid_einsum_chars is either used or updated, this is ready to go.

@codecov-io commented Aug 18, 2018

Codecov Report

Merging #41 into master will decrease coverage by 0.01%.
The diff coverage is 100%.

@fritzo (Contributor, Author) commented Aug 18, 2018

Thanks for the quick review @dgasmith! I've moved the convert-all-chars implementation up to convert_to_valid_einsum_chars and used it in the torch backend.
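
(For reference, a rough sketch of what that moved implementation might look like, with a minimal stand-in for opt_einsum's get_symbol; the real code in opt_einsum.parser may differ:)

import string

def get_symbol(i):
    # Minimal stand-in for opt_einsum's get_symbol: the i-th ASCII letter
    # (the real helper falls back to further unicode characters beyond 52).
    return string.ascii_letters[i]

def convert_to_valid_einsum_chars(einsum_str):
    # Remap every index symbol onto get_symbol(0), get_symbol(1), ... so the
    # pairwise equation uses only characters the backend accepts.
    symbols = sorted(set(einsum_str) - set(',->'))
    replacer = {s: get_symbol(i) for i, s in enumerate(symbols)}
    return ''.join(replacer.get(c, c) for c in einsum_str)

print(convert_to_valid_einsum_chars('αβ,βγ->αγ'))  # 'ab,bc->ac'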

@jcmgray (Collaborator) left a comment

Thanks for this update, all looks good to me!

@dgasmith (Owner):

@fritzo Thanks for the PR, everything looks great and we will get this into the next release.

@dgasmith dgasmith merged commit 49e2a91 into dgasmith:master Aug 18, 2018
@dgasmith dgasmith mentioned this pull request Aug 25, 2018
@dgasmith dgasmith modified the milestones: v2.1, v2.2 Aug 25, 2018