einops.einsum #73
Comments
I miss that sometimes as well, but so far I've refrained from wrapping existing functionality just for the sake of this feature. Let me keep this issue open to collect cases where people really need this reimplemented in einops |
+1 to this! My code right now is a mix of einops and torch.einsum, but I think it would be very nice/consistent if it was all the same syntax through einops. As the name suggests, I think a wrapper could start off as something that takes a descriptive pattern, like the one sketched below, and simply parses it into the usual einsum notation (for each corresponding backend):
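(For illustration, a minimal sketch of such a translation, assuming a torch backend; `einsum_named` and its argument order are hypothetical, not einops API:)

```python
import torch

def einsum_named(pattern: str, *tensors: torch.Tensor) -> torch.Tensor:
    # Hypothetical sketch: map descriptive axis names to single letters,
    # then delegate to the backend's einsum (torch here, for concreteness).
    lhs, rhs = pattern.split("->")
    names = sorted({name for part in pattern.replace("->", ",").split(",")
                    for name in part.split()})
    letters = {name: chr(ord("a") + i) for i, name in enumerate(names)}  # assumes <= 26 axis names

    def translate(part: str) -> str:
        return "".join(letters[name] for name in part.split())

    equation = ",".join(translate(p) for p in lhs.split(",")) + "->" + translate(rhs)
    return torch.einsum(equation, *tensors)

x = torch.randn(4, 16, 16)
filters = torch.randn(16, 16, 3)
out = einsum_named("batch height width, height width channel -> batch channel", x, filters)
print(out.shape)  # torch.Size([4, 3])
```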
Maybe eventually there could be cool features to consider related to the other syntax innovations in this awesome package. Cheers! |
One useful feature not found in existing libraries could be a combined reshape->einsum represented as a single einops expression, along the lines sketched below. This would first rearrange the input and then apply einsum. |
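(A hypothetical illustration of that combined idea, assuming torch; the fused pattern shown in the comment is not implemented syntax, and the two-step version below is simply its plain equivalent:)

```python
import torch
from einops import rearrange

x = torch.randn(4, 6 * 5, 3)      # batch, flattened (h w), channels
filters = torch.randn(6, 5, 3)    # h, w, channels

# Hypothetical fused pattern (not implemented):
#   result = einsum(x, filters, "batch (h w) c, h w c -> batch")
# Equivalent today as two explicit steps: decompose the axis, then contract.
x_grid = rearrange(x, "batch (h w) c -> batch h w c", h=6)
result = torch.einsum("bhwc,hwc->b", x_grid, filters)
print(result.shape)  # torch.Size([4])
```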
@MilesCranmer this last thing is already in the plans, you can read RFC #71. @cgarciae I think it makes sense to look at that RFC, as to my mind it covers a large fraction of use cases for einsum in deep learning. |
Hey @arogozhnikov, I'd already seen that RFC. |
I would also find einsum with more descriptive names useful. Jamming everything together into single letters, often with no whitespace, is unpleasant once there are 3+ indices to consider. I was writing out the multi-head attention from the transformer paper to practice einops and ran into this when forced to use regular einsum. Just noted that http://einops.rocks/pytorch-examples.html has an example of exactly this. |
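(To make the pain point concrete, a small sketch assuming PyTorch; the descriptive-name version is the hypothetical syntax being requested here, not existing API:)

```python
import torch

q = torch.randn(2, 8, 16, 64)  # batch, heads, query positions, head dim
k = torch.randn(2, 8, 16, 64)  # batch, heads, key positions, head dim

# Standard einsum: compact but cryptic once several indices are involved.
scores = torch.einsum("bhqd,bhkd->bhqk", q, k)  # shape: (2, 8, 16, 16)

# The requested style (hypothetical at this point in the thread):
#   scores = einsum(q, k, "batch head query d, batch head key d -> batch head query key")
```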
Would you be open to a functional version @arogozhnikov? I can help add it but I want to confirm your approval first. |
@MilesCranmer https://gist.github.com/rockt/a3191f517728ea9a136a204f578d27c8 I just want to discuss some issues before they appear: parsing should likely be cached, and the standard backend-guessing should be applied. Torch's scripting doesn't play well with caching, so a layer would likely be required as well. |
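(A minimal sketch of the caching point, reusing the same hypothetical pattern translation as the earlier sketch; `functools.lru_cache` is the obvious mechanism, though as noted it does not interact well with torch scripting:)

```python
import functools

@functools.lru_cache(maxsize=1024)
def parse_pattern(pattern: str) -> str:
    # Memoize the pattern -> single-letter-equation translation, since the
    # same pattern string is typically passed on every forward call.
    lhs, rhs = pattern.split("->")
    names = sorted({name for part in pattern.replace("->", ",").split(",")
                    for name in part.split()})
    letters = {name: chr(ord("a") + i) for i, name in enumerate(names)}

    def translate(part: str) -> str:
        return "".join(letters[name] for name in part.split())

    return ",".join(translate(p) for p in lhs.split(",")) + "->" + translate(rhs)
```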
I think that syntax would be the desired style. It would also be nice to have einops.einsum("batch (height width), height width channel -> batch channel", x, filter), but maybe for this you would rather have the user split it into two separate operations, one rearrange and one einsum. Cheers, |
There are several sides to this. In total, I'd rather have a simple version without compositions/decompositions, plus a corresponding layer (sketched below), and leave extended support for the future |
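(A hypothetical sketch of what such a layer might look like, assuming PyTorch and the `parse_pattern` helper from the earlier sketch; einops did not ship this as written, it only illustrates how a layer sidesteps the caching issue by parsing once at construction:)

```python
import torch

class EinsumLayer(torch.nn.Module):
    # Hypothetical: parse the descriptive pattern eagerly in __init__, so
    # the forward pass is a plain backend call with no parsing in the hot path.
    def __init__(self, pattern: str):
        super().__init__()
        self.equation = parse_pattern(pattern)  # hypothetical helper from above

    def forward(self, *tensors: torch.Tensor) -> torch.Tensor:
        return torch.einsum(self.equation, *tensors)
```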
PR ready for comments! #197 |
Any opinions on syntax? We are discussing the following options in the PR. The following compares each option against the equivalent reduce call.

Option 1 (pattern first):

```python
y = reduce(x, "i j -> i")
y = einsum("i j -> i", x)  # same as above
y2 = einsum("i j, i j -> i", x, x)
```

Option 2 (tensors first):

```python
y = reduce(x, "i j -> i")
y = einsum(x, "i j -> i")
y2 = einsum(x, x, "i j, i j -> i")
```

Option 3 (reversed arrow):

```python
y = reduce(x, "i j -> i")
y = einsum("i <- i j", x)
y2 = einsum("i <- i j, i j", x, x)
```

(1) is technically the same as the existing einsum implementations. (2) is my preference, although the potential issue is that you would have a single mixed-type argument list (tensors plus a string). (3) is another option, putting the tensors at the end but keeping the indices on the same side as the tensor (x is on the right of the pattern, and has indices on the right; likewise for y). |
I like 2 but if it causes issues 1 is fine. 3 looks odd |
Pinging this thread - let me know of any other opinions. I'll adapt the PR to use (2) otherwise. |
Option 2 looks better, but can't work with type hints, and that's a strong argument against. |
Up to you, I am happy to implement any option. To get 2 working with type hints, would the following be an option?

```python
@typing.overload
def einsum(tensor: Tensor, pattern: str) -> Tensor: ...
@typing.overload
def einsum(tensor1: Tensor, tensor2: Tensor, pattern: str) -> Tensor: ...
@typing.overload
def einsum(tensor1: Tensor, tensor2: Tensor, tensor3: Tensor, pattern: str) -> Tensor: ...
@typing.overload
def einsum(tensor1: Tensor, tensor2: Tensor, tensor3: Tensor, tensor4: Tensor, pattern: str) -> Tensor: ...

def einsum(*tensors_and_pattern) -> Tensor:
    ...  # Actual function
```

This seems to give me the correct hints. If you are thinking about using the type hints for debugging with mypy, would the following modification give the type checker enough information?

```python
def f(*args):
    args[:-1]: List[Tensor]
    args[-1]: str
    ...
```
|
This will correctly work with IDEs (PyCharm / VS Code), and should pass mypy (though I did not check the latter). There is a second part where typing matters: scripting and optimization. Even if we resolve all problems with dynamic dispatching (e.g. by array-api), pytorch and others still won't be able to script such a function because of the variable number of arguments. Optimization (a still-anticipated feature of CPython 3.12), as I understand it, would use similar mechanics to compile some functions based on type hinting. |
Looking at examples:

```python
# your example
result = einsum(batched_images, filters,
                "batch h w, h w channel -> batch channel")

# my example
result = einsum(activations, activations,
                "b h w head c, b h2 w2 head c -> b h w h2 w2 head")
```

I believe it should be a good habit to write the pattern and tensors on different lines, and in this case the position of the pattern (first or second line = first or last) shouldn't play a big role. |
Good point, I didn't realize this aspect about how compilation libraries work. Will think about this more... |
A fourth option for syntax would be the following:

```python
result = einsum([activations, activations],
                "b h w head c, b h2 w2 head c -> b h w h2 w2 head")
```

i.e., the first argument is a list of tensors. This gives type stability, since each positional argument then has a single type. I tried scripting this in PyTorch, but a few other things broke. |
Actually, even torch.einsum itself can't be scripted:

```python
>>> torch.jit.script(torch.einsum)
NotSupportedError: Compiled functions can't take variable number of arguments or use keyword-only arguments with defaults
```

Maybe the syntax (2) is fine as-is? I see the PyTorch version is given as def einsum(*args) anyway. So perhaps we could use syntax (2), with a similar variadic signature.

Edit: It looks like you can get torch.einsum to script when it is called with concrete arguments inside a scripted function.

Edit 2: In PyTorch, they dispatch to this internal library called "_VariableFunctions": https://github.com/pytorch/pytorch/blob/1022443168b5fad55bbd03d087abf574c9d2e9df/torch/_VF.py. Anyway, I think with a variable number of arguments (in any part of einops), you may have to do tracing rather than scripting. |
Pinging this thread. I'm fine to go ahead with option (2). |
Lists are reserved for a semantically different thing (e.g. see indexing or 'stacking' with rearrange). To weigh pattern-first against pattern-last I want to go through a set of larger examples where the op is used in context. Sorry, I still didn't get to it, but I will try today. |
And the winner is ... <pretends he didn't read the contents of the envelope> pattern-last order of arguments 🎉 With pattern-last it is easier to track the flow of data (which variables define which), while reading/analysis of the pattern can be delayed until the general flow of the program is clear. I believe this outweighs the various technical downsides. Will merge tomorrow. Comment about torch.einsum: it is scriptable in context (i.e., if you pass arguments and patterns, specifying the operation concretely, it is scriptable; see the sketch below). |
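(A minimal sketch of that scriptability point, assuming PyTorch: scripting succeeds once the equation and operands are fixed at the call site, even though torch.einsum itself cannot be scripted as a bare function:)

```python
import torch

@torch.jit.script
def pairwise_scores(q: torch.Tensor, k: torch.Tensor) -> torch.Tensor:
    # torch.einsum is scriptable here because this call site fixes
    # both the equation string and the number of operands.
    return torch.einsum("bqd,bkd->bqk", q, k)

print(pairwise_scores(torch.randn(2, 5, 8), torch.randn(2, 7, 8)).shape)
# torch.Size([2, 5, 7])
```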
You can also add the other argument orders under different names, e.g. a pattern-first variant as a separate function. |
einops.einsum is live since 0.5.0, closing. |
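(For reference, a small usage example of the released API, assuming einops >= 0.5.0 with a torch backend; pattern-last, with space-separated descriptive axis names:)

```python
import torch
from einops import einsum

x = torch.randn(4, 16, 16, 3)     # batch, height, width, channel
filters = torch.randn(16, 16, 3)  # height, width, channel

# Tensors first, pattern last — the order settled on in this thread.
result = einsum(x, filters, "batch h w c, h w c -> batch")
print(result.shape)  # torch.Size([4])
```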
Hey! Loving einops, so much so that now I feel a bit sad about standard einsum not being able to use descriptive names for dimensions. It would be amazing if einops implemented einsum with the same conveniences.