
Add einsum, closes #124 #363

Merged: 36 commits merged into mratsim:master on Jun 30, 2019

Conversation

Vindaar (Collaborator) commented Jun 17, 2019

This PR adds Einstein summation to Arraymancer via the einsum macro.

It's still a WIP, but all features described here:
https://rockt.github.io/2018/04/30/einsum
are implemented and covered by tests (as a bonus, taking the diagonal of a tensor also works, which according to the article does not work in PyTorch ;) ).
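
(For illustration only, not one of the actual test cases: with the explicit form described below, the diagonal would presumably be obtained by keeping the repeated index on the left-hand side, roughly

let diag = einsum(a):
  diag[i] = a[i, i]

for a square matrix a, with diag an arbitrarily chosen result name.)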

The major thing still missing is actually taking the type of the input tensor into account. At the moment I just map to float tensors.

The code still has to be cleaned up and refactored in parts. Documentation also has to be added.

In principle the macro works as follows. Assuming we have two tensors a, b (a single tensor, or more than two, also works in theory), we can use Einstein summation as follows:

let res = einsum(a, b):
  a[i,j] * b[j,k]

to express a matrix multiplication implicitly or:

let res = einsum(a, b):
  res[i,k] = a[i,j] * b[j,k]

to assign the resulting indices explicitly. In the former case we automatically determine the indices to be contracted and lay out the result along the axes given by the order in which the non-contracting indices appear. The explicit usage also allows transposing the result directly (e.g. exchanging i, k -> k, i on the res LHS above).

In some cases the explicit form is strictly necessary, e.g. if one wants to only iterate over (not contract) an axis, but keep it in the result. All tests are done in both implicit and explicit form. If no implicit form is tested, it means the test case does not make sense without the constraints from the explicit LHS (or would yield a different result). Take the Hadamard product as an example; in a hypothetical implicit form:

let res = einsum(a, b):
  a[i,j] * b[i,j]

normal Einstein summation demands a full contraction of the tensors. Only by explicitly stating that both axes are to be kept:

let res = einsum(a, b):
  res[i,j] = a[i,j] * b[i,j]

can it be expressed. This means that the implicit form favors actual Einstein summation, i.e. an index appearing more than once is summed over.

Note that the identifier used on the LHS of the statement in the explicit case does not actually matter. It can be the same as what you assign to, but it can be something else entirely too.
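
For example, at this stage of the PR the following (with an arbitrarily chosen LHS identifier foo) is equivalent to the explicit matrix multiplication above:

let res = einsum(a, b):
  foo[i, k] = a[i, j] * b[j, k]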

The implementation simply writes nested for loops over the non-contracting axes and, within those, over the contracting axes, and assigns the accumulated result.
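
For the matrix multiplication example above, a simplified sketch of what that generated code amounts to (assuming float tensors, as per the current float-only mapping; the actual expansion is shown further down in this thread):

# simplified sketch of the loops einsum emits for a[i,j] * b[j,k]
var tmp = newTensor[float](@[a.shape[0], b.shape[1]])
for i in 0 ..< a.shape[0]:       # non-contracting axis of a
  for k in 0 ..< b.shape[1]:     # non-contracting axis of b
    var res: float               # accumulator for the contraction
    for j in 0 ..< a.shape[1]:   # contracting axis j
      res += a[i, j] * b[j, k]
    tmp[i, k] = res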

I'm opening the PR already because I'm unsure about the file structure for this:

  • Where should the test go? Is the tensors subdirectory correct?
  • How do I run the test from the nimble file? Is every test_ file in the tests directory run automatically?

mratsim (Owner) commented Jun 18, 2019

Excellent.

The major thing still missing is actually taking the type of the input tensor into account. At the moment I just map to float tensors.

Feel free to use getSubType from:

macro getSubType*(TT: typedesc): untyped =
  # Get the subtype T of an AnyTensor[T] input
  getTypeInst(TT)[1][1]
I don't mind supporting float only at first; working with types in macros is tricky (nim-lang/RFCs#44), but I usually manage to find a way to make it work, so if you struggle just use floats at first.
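
For reference, in the generated code this is used simply as a type expression, e.g. (mirroring the expansion quoted later in this thread, with illustrative names):

# given some Tensor `a`, alias its element type inside the macro output
type T = getSubType(type(a))   # e.g. float for a Tensor[float]
var accum: T                   # accumulator with the matching element type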

If no implicit form is tested, it means the test case does not make sense without the constraints from the explicit LHS (or would yield a different result, e.g. look at the Hadamard product) in a hypothetical implicit form

We should probably have an assert, a compile-time error, or at the very least a documentation paragraph that says "Please use the explicit form of Einstein summation, for example C[i, j] = A[i, j] * B[i, j] instead of just A[i, j] * B[i, j]" where we can.
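
(A sketch of what such a check could look like, using the error proc from Nim's macros module; the proc name and the exact condition here are hypothetical, not from the PR:)

import macros

# Hypothetical helper: reject a bare implicit expression where the
# explicit form `res[...] = ...` is required.
proc checkExplicit(stmt: NimNode) =
  if stmt.kind != nnkAsgn:
    error("Please use the explicit form of Einstein summation, for example " &
          "C[i, j] = A[i, j] * B[i, j] instead of just A[i, j] * B[i, j]", stmt)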

Note that the identifier used on the LHS of the statement in the explicit case does not actually matter. It can be the same as what you assign to, but it can be something else entirely too.

No problem, we can also consider this syntax:

einsum(res, a, b):
  let res[i,k] = a[i,j] * b[j,k]

instead of

let res = einsum(a, b):
  res[i,k] = a[i,j] * b[j,k]

Where should the test go? Is the tensors subdirectory correct?

Yes, it's correct.

How do I run the test from the nimble file? Is every test_ file run in the tests directory?

Add it to:

Vindaar (Collaborator, Author) commented Jun 23, 2019

Ok, finally did some refactoring.

Regarding the explicit / implicit syntax: unfortunately, the syntax you proposed:

einsum(res, a, b):
  let res[i,k] = a[i,j] * b[j,k]

is not valid Nim syntax (it throws Error: ':' or '=' expected, but got '[').

So right now I just introduced dual behavior: the implicit form maps to just the block and the user must assign it to a variable, while the explicit form creates a let variable with the identifier the user chose.
That way, unfortunately, we can't have the macro produce a var variable. :/

I'm not sure in general whether the rather subtle, yet profound, difference in how einsum behaves between these two cases is desirable. Maybe put them under different macro names?

edit: I also started adding a file that contains examples that are supposed to fail. I know that testament supports failing tests, but unittest does not, does it? Is it worth expanding on this?

The types of all tensors given as arguments to `einsum` must match. Thus we check whether they are all the same; if they are, we use the type of the first tensor. In both cases `einsum` again just returns a tensor.

Instead of working with the given tensors, we now create local copies, which are made contiguous and (if required) converted to row major order. This way our iteration should be more efficient in case column major tensors are present.

This way the rightmost indices of the accessor will be the innermost loops. Since we force row major ordering, those indices will be closest together in memory.
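
A minimal sketch of that idea, using asContiguous as it appears in the expansion quoted further down (float element type assumed here):

# force a contiguous row-major copy, so that the rightmost index of aCont
# varies fastest in memory and can sit in the innermost loop
let aCont = asContiguous[float](a, layout = rowMajor, force = true)
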
Vindaar (Collaborator, Author) commented Jun 26, 2019

There's still some stuff I would consider changing (I'm not happy with the main macro code, to be honest, but refactoring further seemed to amount to writing procs with tons of arguments :/) and I'm not sure if the documentation will come out fine (I have almost no RST experience).

Aside from that I consider this PR to be mostly done for now.

We now create local tensors with asContiguous, forcing row major order.

edit:
Oh, one thing I just remembered: I didn't do any proper name mangling yet, because I had in mind that you had a way to do that in the library, but I couldn't find it before.
If I'm mistaken, I'd just replace T0Mangle, tmp, and res by calls to genSym.
edit_end

For a complex example, the generated code currently looks like this:

let b = einsum(m, n):
  b[p,s,t,u,v] = m[p,q,r,s] * n[t,u,q,v,r]
# will expand to
let b = block:
  type
    T0Mangle = getSubType(type(m))
  when T0Mangle isnot getSubType(type(m)):
    {.error: "All tensors must be of the same type! " & $"m" & " is of " &
        "type " & $typeName(getSubType(type(m))) & " while " & $"m" &
        " is of type " & $typeName(getSubType(type(m))) & "!".}
  elif T0Mangle isnot getSubType(type(n)):
    {.error: "All tensors must be of the same type! " & $"m" & " is of " &
        "type " & $typeName(getSubType(type(m))) & " while " & $"n" &
        " is of type " & $typeName(getSubType(type(n))) & "!".}
  let mCont = asContiguous[getSubType(type(m))](m, layout = rowMajor, force = true)
  let nCont = asContiguous[getSubType(type(n))](n, layout = rowMajor, force = true)
  doAssert nCont.rank == 5
  var shapes = newSeq[int](5)
  []=(shapes, 0, mCont.shape[0])
  []=(shapes, 1, mCont.shape[3])
  []=(shapes, 2, nCont.shape[0])
  []=(shapes, 3, nCont.shape[1])
  []=(shapes, 4, nCont.shape[3])
  var shapesContr = newSeq[int](2)
  []=(shapesContr, 0, nCont.shape[2])
  []=(shapesContr, 1, nCont.shape[4])
  var tmp = newTensor[getSubType(type(m))](shapes)
  for p in 0 ..< shapes[0]:
    for s in 0 ..< shapes[1]:
      for t in 0 ..< shapes[2]:
        for u in 0 ..< shapes[3]:
          for v in 0 ..< shapes[4]:
            var res: getSubType(type(m))
            for q in 0 ..< shapesContr[0]:
              for r in 0 ..< shapesContr[1]:
                res += mCont[p, q, r, s] * nCont[t, u, q, v, r]
            tmp[p, s, t, u, v] = res
  tmp

Vindaar marked this pull request as ready for review on June 26, 2019, 22:20
mratsim (Owner) commented Jun 27, 2019

Regarding name mangling, you are probably mixing that up with laser: https://github.com/numforge/laser/blob/bf751f4bbec3d178cd3a80da73e446658d0f8dff/laser/openmp.nim#L13-L25

var mangling_rng {.compileTime.} = initRand(0x1337DEADBEEF)
var current_suffix {.compileTime.} = ""

proc omp_suffix*(genNew: static bool = false): string {.compileTime.} =
  ## genNew:
  ##   if false, return the last suffix
  ##   else return a fresh one
  # This is exported because you cannot bind the symbol early enough
  # for exportc

  if genNew:
    current_suffix = mangling_rng.rand(high(uint32)).toHex
  result = current_suffix

I don't even mangle the layers in the neural network DSL:

proc hash*(x: NimNode): Hash =
  assert x.kind == nnkIdent
  result = hash($x)

Vindaar (Collaborator, Author) commented Jun 27, 2019

Oh, then I saw it in laser.

I now use genSym for all variables. I added a few more comments to the documentation and cleaned up a few things here and there.
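
For illustration, the kind of call this boils down to (a minimal standalone sketch using genSym from Nim's standard macros module, not the PR's actual code):

import macros

macro sketch(): untyped =
  # request a unique symbol for a generated temporary instead of
  # hard-coding an identifier like `tmp` that could collide with user code
  let tmpId = genSym(nskVar, "tmp")
  result = quote do:
    block:
      var `tmpId` = 0.0
      `tmpId` += 1.0
      `tmpId`

echo sketch()   # prints 1.0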

@mratsim mratsim merged commit 686836e into mratsim:master Jun 30, 2019