Feature/evaluator #5331
## Evaluator Design

### The Problem

During training or serving, we provide evaluation functions to measure model performance, e.g., accuracy and precision. In an operator-based framework design, data flows through the network pipeline batch by batch, so inside an operator we can only compute the metrics of a single mini-batch. We need a mechanism that aggregates these metrics over every N passes/batches the user wants.
### Evaluator Design

Currently, every operation is expressed in the graph, so we divide the evaluation process into three steps:

1. Initialize the necessary metric state and add it to the block.

2. Compute the statistics of the metric state for every mini-batch. A single operator is only responsible for computing the statistics of one mini-batch; for example, the accuracy operator only processes one mini-batch of data per run.
3. Merge the mini-batch statistics to form the evaluation result over multiple mini-batches. For distributed or multi-GPU training, aggregate the values from the different devices.
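The three steps above can be sketched as a plain-Python accumulator (a hypothetical illustration, independent of the operator framework; the class and method names are not part of the proposal):

```python
class AccuracyAccumulator:
    """Illustrates the init / per-batch-update / merge life cycle."""

    def __init__(self):
        # Step 1: initialize the metric state.
        self.correct = 0
        self.total = 0

    def update(self, num_correct, batch_size):
        # Step 2: fold in the statistics of one mini-batch.
        self.correct += num_correct
        self.total += batch_size

    def merge(self, other):
        # Step 3: aggregate state from another device/worker.
        self.correct += other.correct
        self.total += other.total

    def result(self):
        return self.correct / self.total if self.total else 0.0


# Two workers each process their own mini-batches, then merge.
w0, w1 = AccuracyAccumulator(), AccuracyAccumulator()
w0.update(3, 4)
w1.update(1, 2)
w0.merge(w1)
print(w0.result())  # 4 correct out of 6 samples
```

In the real design, `update` corresponds to the per-mini-batch operators added to the block, and `merge` to the cross-device aggregation.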
### Implementation

This design is shown through the Python API. There is an abstract Python interface with one subclass for each evaluation method.
```python
class Evaluator(object):
    """
    Evaluator base class.
    """

    def __init__(self):
        """
        Create the metric states and append them to the block.
        """
        pass

    def _clear_state(self):
        """
        Clear the metric states at the beginning of each pass.
        """
        pass

    def _append_evaluator_op(self):
        """
        Add the mini-batch calculation operators to the block.
        Add an increment operator to accumulate the metric state.
        """
        pass

    def _merge(self):
        """
        Merge the mini-batch statistics to form the evaluation result
        over multiple mini-batches.
        """
        pass

    def evaluate(self):
        """
        The only exported interface; the user calls it to obtain the result.
        """
        pass
```
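As a usage illustration, a concrete metric would subclass this interface. The sketch below is hypothetical (not part of the proposal): it fills the hooks with plain-Python state instead of graph operators, just to show how the pieces fit together.

```python
class Evaluator(object):
    """Minimal stand-in for the abstract base class above."""

    def evaluate(self):
        # The single exported interface: merge, then return the result.
        self._merge()
        return self._result


class AccuracyEvaluator(Evaluator):
    """Hypothetical subclass tracking accuracy with plain-Python state."""

    def __init__(self):
        # Metric state that the real design would store in the block.
        self._correct = 0
        self._total = 0
        self._result = 0.0

    def _clear_state(self):
        # Reset at the beginning of each pass.
        self._correct = 0
        self._total = 0

    def _append_evaluator_op(self, num_correct, batch_size):
        # Stand-in for the per-mini-batch operator plus the increment op.
        self._correct += num_correct
        self._total += batch_size

    def _merge(self):
        self._result = self._correct / self._total if self._total else 0.0


evaluator = AccuracyEvaluator()
for num_correct, batch_size in [(3, 4), (1, 2)]:
    evaluator._append_evaluator_op(num_correct, batch_size)
print(evaluator.evaluate())  # 4 correct out of 6 samples
```

In the actual framework the private hooks would append operators to the block rather than mutate Python attributes; only `evaluate` is exposed to users.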
The pull request also modifies the evaluator unit-test file, inserting an `exit(0)` near the top, apparently to disable the test temporarily:

```python
import unittest
import op_test
import numpy as np
exit(0)


class TestEvaluator(unittest.TestCase):
```