
[Relay][Training] Additional gradients #8307

Merged: 3 commits into apache:main on Jun 24, 2021

Conversation

@altanh (Contributor) commented on Jun 22, 2021

New gradients:

  • cast_like
  • not_equal
  • strided_slice
  • one_hot

Also simplified the log_softmax gradient.
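
For context, Relay gradients are registered with the `(orig, grad)` pattern in `python/tvm/relay/op/_tensor_grad.py`. A minimal sketch of what the `cast_like` gradient could look like (illustrative, not the verbatim patch; re-registering under the real op name would conflict with the built-in gradient in a TVM build that already has it):

```python
from tvm import relay
from tvm.relay.op import register_gradient


@register_gradient("cast_like")
def cast_like_grad(orig, grad):
    # cast_like(x, like): cast the incoming gradient back to x's dtype;
    # `like` only supplies a dtype, so its gradient is zero.
    x, like = orig.args
    return [relay.cast_like(grad, x), relay.zeros_like(like)]
```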

For strided_slice, I wasn't sure how to add support for the recently introduced axes argument (which, if I understand correctly, aims to circumvent limitations in the type system for dynamic shape inference), so I just added a check. In my mind, strided_slice (as opposed to dyn.strided_slice) should be used when everything is concrete, but I think it now allows limited shape dynamism, which at the moment isn't a good idea for training anyway. cc @masahi for confirmation
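
A usage sketch, assuming the standard first-order AD entry point `relay.transform.gradient`, exercising the new gradient on a fully static slice:

```python
import tvm
from tvm import relay

# Fully static slice: concrete input shape, constant begin/end, no axes argument.
x = relay.var("x", shape=(4, 4), dtype="float32")
fwd = relay.Function([x], relay.strided_slice(x, begin=[0, 1], end=[2, 3]))

mod = relay.transform.InferType()(tvm.IRModule.from_expr(fwd))
# Returns a function computing (forward output, gradients w.r.t. the inputs).
bwd = relay.transform.gradient(mod["main"], mode="first_order")
```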

@masahi (Member) commented on Jun 22, 2021

Yes, strided_slice with static begin, end, etc. does support dynamic input shapes. If some dims of the input shape are dynamic, the corresponding output dims will also be dynamic, regardless of begin, end, and strides.

```c++
// Output shape for axis i: if the input dim is a constant, compute the
// concrete slice size; otherwise the output dim stays symbolic.
if (ishape[axes[i]]->IsInstance<tvm::IntImmNode>()) {
  const int64_t dim_i = GetConstInt(ishape[axes[i]]);
  ICHECK(begin_canonicalized[i]->IsInstance<tvm::IntImmNode>());
  int64_t begin_i = GetConstInt(begin_canonicalized[i]);
  int64_t end_i = CanonicalizeIndex(end[i], dim_i, strides[i]);
  int interval = std::abs(end_i - begin_i);
  int slice_size =
      static_cast<int>((interval + std::abs(strides[i]) - 1) / std::abs(strides[i]));
  ICHECK(strides[i] < 0 ? (end_i <= begin_i) : (begin_i <= end_i))
      << ": Input [Begin=" << begin[i] << ", End=" << end[i] << "] is invalid for axis=" << i;
  out_shape.Set(axes[i], cast(out_shape[i].dtype(), PrimExpr(slice_size)));
} else if (use_any) {
  out_shape.Set(axes[i], tvm::tir::Any());
} else {
  out_shape.Set(axes[i], tvm::tir::Var("dim", out_shape[i]->dtype));
}
```

```c++
// Canonicalized begin for axis i: constant-fold when the dim is known,
// otherwise clamp the (possibly negative) begin symbolically.
if (ishape[axes[i]]->IsInstance<tvm::IntImmNode>()) {
  int64_t dim_i = GetConstInt(ishape[axes[i]]);
  int64_t begin_i = CanonicalizeIndex(begin[i], dim_i, strides[i]);
  begin_expr.push_back(make_const(dtype, begin_i));
} else {
  auto idim = ishape[axes[i]];
  auto b_expr = make_const(dtype, begin[i]);
  PrimExpr b = begin[i] < 0 ? b_expr + idim : b_expr;
  auto s = strides[i];
  if (s < 0) {
    b = tvm::min(b, idim - 1);
  } else {
    b = tvm::if_then_else(b < 0, 0, b);
  }
  begin_expr.push_back(b);
}
```
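
To make that concrete, a small type-inference-only sketch of how a dynamic input dim propagates through a static strided_slice (the exact printed type may vary by TVM build):

```python
import tvm
from tvm import relay

# dim 0 is dynamic (Any), dim 1 is static; begin/end are all constants.
x = relay.var("x", shape=(relay.Any(), 8), dtype="float32")
y = relay.strided_slice(x, begin=[0, 2], end=[4, 6])

mod = relay.transform.InferType()(tvm.IRModule.from_expr(relay.Function([x], y)))
# dim 0 of the result stays dynamic; dim 1 gets the concrete slice size 4.
print(mod["main"].ret_type)
```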

@manupak (Contributor) commented on Jun 23, 2021

This looks interesting. Do we have an RFC or a tracking issue that indicates where the training support of TVM is going?

@altanh (Contributor, Author) commented on Jun 23, 2021

> This looks interesting. Do we have an RFC or a tracking issue that indicates where the training support of TVM is going?

There's no RFC up at the moment, but we've had some discussion threads (see https://discuss.tvm.apache.org/t/two-missing-pieces-for-training/10037 for example). I'm currently in the process of upstreaming general improvements (gradients, AD bug fixes and improvements, new training ops, and other supporting changes like contrib library support), and longer term I'm hoping to open-source a proof-of-concept TVM training framework around Q3 (developed at OctoML). Would love to hear your thoughts, and perhaps it would be worth opening a long-term tracking/discussion thread for training-related topics!

cc @tqchen

@tqchen merged commit b9d2899 into apache:main on Jun 24, 2021