Overall, I find this approach a bit too intrusive to the framework: it injects kernel-level control from the RunOptions. I think this breaks the original design principle that an op is self-descriptive and stateless; input plus attributes alone should determine an op's behavior. If this flag is introduced, the op's behavior would depend on configuration set at run time. For Dropout, we already have a "ratio" input that determines whether it runs in training mode. To make these nodes operate in eval mode, we can override the inputs of those nodes; an initializer of a node can still be overridden with graph feeds.
I agree with Sherlock, and he summarized the design quite well. There is a relatively small number of operators that need different behaviors in training versus inferencing/evaluation mode. We should favor controlling those behaviors with operator inputs, and if we need to control it per
This reverts commit a1c801c.
add dropout ratio node to graph input
Do not forget to rebase and target the master branch.
Description:
Fix evaluation issues
Motivation and Context
Override the dropout ratio input with 0 during evaluation so that Dropout behaves as an Identity node and evaluation results are correct.
Added unit tests for the Python frontend.
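To illustrate why overriding the ratio with 0 works, here is a minimal NumPy sketch of inverted dropout (an illustrative stand-in, not the actual runtime kernel): when the ratio input is 0, the op degenerates to an identity function, which is exactly the behavior wanted at evaluation time.

```python
import numpy as np

def dropout(x, ratio=0.5, training=True, rng=None):
    """Minimal inverted-dropout sketch; ratio == 0 yields identity."""
    if not training or ratio == 0.0:
        # Evaluation path: with ratio 0 the op passes input through unchanged.
        return x
    rng = rng or np.random.default_rng()
    mask = rng.random(x.shape) >= ratio          # keep mask
    return x * mask / (1.0 - ratio)              # rescale kept activations

x = np.arange(4, dtype=np.float32)
# Overriding the ratio feed with 0 makes dropout behave as Identity.
np.testing.assert_array_equal(dropout(x, ratio=0.0), x)
```

Overriding the graph feed for the ratio input achieves this without any per-run kernel flag, which is consistent with the input-driven design discussed above.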