
Feature/evaluator #5331
Merged: 21 commits into PaddlePaddle:develop, Nov 14, 2017

Conversation

dzhwinter (Contributor)

fix #3808
The previous evaluator doesn't separate compile time from runtime, and we also need to create some metric states for each evaluator.

@helinwang self-requested a review November 2, 2017 23:07
### Evaluator Design
Currently, every operation is expressed in the graph. We divide the evaluator process into three steps.

1. Initialize the metric state necessary and add it into the block.

Contributor:
metric state necessary -> metric state

Contributor Author:
Done


1. Initialize the metric state necessary and add it into the block.

2. Calculate the statistics of the metric state for every mini-batch. A single operator is only responsible for calculating the necessary statistics of one mini-batch. For example, the accuracy operator only computes the accuracy of one mini-batch per run.\

Contributor:
remove the trailing "\"

Contributor Author:
Done.

"""
pass

def _clear_state(self):

Contributor:
Maybe this should not be a part of Python.

"""
pass

def _append_evalutor_op(self):

Contributor (@helinwang, Nov 2, 2017):

Perhaps we just need to specify what the interface looks like to the user.

I think a very important general rule is: don't make a decision unless we absolutely have to. Deferring the decision about which private methods to use for as long as possible gives us more information to make a good decision.

2. Calculate the statistics of the metric state for every mini-batch. A single operator is only responsible for calculating the necessary statistics of one mini-batch. For example, the accuracy operator only computes the accuracy of one mini-batch per run.


3. Merge the mini-batch statistics to form the evaluation result for multiple mini-batches. When it comes to distributed or multi-GPU training, aggregate the values from the different devices.

Contributor:

Looking at the code below, it seems we need a more detailed description of how to implement evaluator operators in C++, how to save state in C++, and how to update it on the Python side.

Contributor Author:

Sure, I will add the details.
Currently, we have two options.
Option 1: as TensorFlow does, compose low-level operators to compute the metric. If performance turns out to be a real bottleneck, rewrite them on the C++ side as a new operator.
Option 2: use a C++ operator to calculate the metric statistics for every mini-batch and maintain some states on the Python side.
I'm not sure which is better yet. I implemented option 2; what do you think?
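
For reference, a minimal sketch of what option 2 could look like (illustrative only; the class and method names below are assumptions, not the PR's final code). A C++ operator computes the statistic for each mini-batch, while a Python-side object owns the accumulated state and knows how to reset and merge it:

```python
class Evaluator(object):
    """Option-2 sketch: the per-batch statistic comes from a C++ op,
    while this Python object owns the persistent metric states."""

    def __init__(self, name, main_program=None):
        self._name = name
        self._main_program = main_program
        # metric name -> state variable, accumulated across mini-batches
        self._states = {}

    def _update_ops(self, *args, **kwargs):
        """Append the per-mini-batch statistic op (e.g. the accuracy op)
        plus the ops that accumulate its output into self._states."""
        raise NotImplementedError()

    def reset(self, executor, reset_program=None):
        """Zero every state variable before a new evaluation pass."""
        raise NotImplementedError()

    def eval(self, executor, eval_program=None):
        """Merge the accumulated statistics into the final metric."""
        raise NotImplementedError()
```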

Contributor Author:

Done.

else:
reset_program = program
for k, var in self._states.iteritems():
zeros = helper.create_tmp_variable(dtype=var.data_type)

Collaborator:
helper can only create vars and ops in main_program and startup_program. However, here we need to create them in the reset_program, so it's not correct to use helper here.

Contributor Author:

Done.

# """
# raise NotImplementedError()

def reset(self, executor, program=None):

Collaborator:
Do not expose reset_program to the upper level. It should be created, run, and destroyed inside the reset function.

Contributor Author:

Done.
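
A sketch of what the suggested encapsulation could look like, assuming the fluid API of that era (Program(), global_block(), append_op, and a fill_constant op; treat the exact names and signatures as assumptions). reset_program is created, run, and dropped entirely inside the function, and the zeroing vars/ops are created directly in that program rather than through helper:

```python
def reset(self, executor, reset_program=None):
    # Build a throwaway program unless the caller supplies one; it never
    # leaks to the upper level.
    if reset_program is None:
        reset_program = Program()
    block = reset_program.global_block()
    for k, var in self._states.iteritems():
        # Create the variable directly in reset_program instead of going
        # through `helper`, which only writes to main/startup programs.
        g_var = block.create_var(
            name=var.name, dtype=var.data_type, persistable=True)
        block.append_op(
            type="fill_constant",
            outputs={"Out": [g_var]},
            attrs={"shape": var.shape, "value": 0.0})
    executor.run(reset_program)
```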

"""
raise NotImplementedError()

def reset(self, executor, program=None):

Collaborator:
reset_program=None

Contributor Author:

Done


return acc_out

def eval(self, executor, program=None):

Collaborator:
eval_program=None

Contributor Author:

Done.
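
Symmetrically, eval could merge the accumulated states in a local eval_program. A sketch for the Accuracy case, assuming an elementwise_div operator and two hypothetical state variables named "total_correct" and "total_samples" (in practice the state variables may need to be cloned into the new program's block first):

```python
def eval(self, executor, eval_program=None):
    # Merge step: total correct count divided by total sample count,
    # computed in a local program so the main program stays untouched.
    if eval_program is None:
        eval_program = Program()
    block = eval_program.global_block()
    eval_out = block.create_var(
        name=self._name + "_result", dtype="float32")
    block.append_op(
        type="elementwise_div",
        inputs={"X": [self._states["total_correct"]],
                "Y": [self._states["total_samples"]]},
        outputs={"Out": [eval_out]})
    return executor.run(eval_program, fetch_list=[eval_out])[0]
```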

@@ -31,6 +32,8 @@
main_program=main_program,
startup_program=startup_program)

accuracy = evaluator.Accuracy(input=y_predict, label=y)

Collaborator:
evaluator.accuracy()

Contributor Author:

removed
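
For context, this is roughly how an interface like this reads to a user (a sketch; the exact spelling of evaluator.Accuracy vs. evaluator.accuracy() was still under discussion above, and exe, feeder, train_reader, and num_passes are placeholder names):

```python
accuracy = evaluator.Accuracy(input=y_predict, label=y)

for pass_id in range(num_passes):
    accuracy.reset(exe)                # zero the metric states for this pass
    for data in train_reader():
        # the accuracy op updates the states on every mini-batch
        exe.run(main_program, feed=feeder.feed(data))
    pass_acc = accuracy.eval(exe)      # merge the per-batch statistics
    print("pass %d, accuracy %f" % (pass_id, pass_acc))
```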

JiayiFeng previously approved these changes Nov 10, 2017
Contributor (@helinwang) left a comment:

Approved just because @Canpio approved.

@helinwang helinwang merged commit d1d2100 into PaddlePaddle:develop Nov 14, 2017
Successfully merging this pull request may close these issues:
Evaluators need to store variable(tensor) in an op