
[Memory]More memory optimization policy #8690

Merged: 20 commits into PaddlePaddle:develop on Mar 12, 2018

Conversation

QiJune (Member) commented on Mar 1, 2018

After adding a more aggressive optimization level, the memory usage of the image_classification demo dropped from 93024256 to 92807168 bytes, which is only a small benefit.

There are still many dead variables that are not reused; most of them are gradient variables. After the SGD update, these gradients can be released. We may have to delete them with a DeleteOperator.
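As a rough, hypothetical sketch of that idea (plain Python over a made-up op-list structure, not the actual PaddlePaddle transpiler code), a pass could append a delete op for each gradient variable right after its last use:

```python
def insert_delete_ops(ops, grad_var_names):
    """Append a 'delete' op right after the last op that touches each
    gradient variable, so its memory can be released early.

    `ops` is a list of dicts like {"type": ..., "inputs": [...], "outputs": [...]},
    and `grad_var_names` is the set of gradient variable names (hypothetical structures).
    """
    # Find the index of the last op that reads or writes each gradient variable.
    last_use = {}
    for idx, op in enumerate(ops):
        for name in op["inputs"] + op["outputs"]:
            if name in grad_var_names:
                last_use[name] = idx

    # Group the variables by the op index after which they become dead.
    deletions = {}
    for name, idx in last_use.items():
        deletions.setdefault(idx, []).append(name)

    # Rebuild the op list with delete ops inserted at the right positions.
    new_ops = []
    for idx, op in enumerate(ops):
        new_ops.append(op)
        if idx in deletions:
            new_ops.append({"type": "delete", "inputs": deletions[idx], "outputs": []})
    return new_ops
```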

I added another release-memory policy based on a DeleteOp and tested it on the ResNet model:

| Model  | No optimization | Reuse memory              | Release memory            | Forward-only memory |
|--------|-----------------|---------------------------|---------------------------|---------------------|
| ResNet | 170590208       | 92995584 (45.5% reduction) | 78004224 (54.3% reduction) | 77488128            |

The release-memory policy has almost reached the upper limit (the forward-only memory). If we want to reduce memory occupation further, there are two ways:

  • Examine the forward pass carefully, fuse some small operators, and try to eliminate some intermediate results.
  • Use a re-computation policy: discard some results in the forward pass and re-compute them in the backward pass (see the sketch below).
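For reference, a toy sketch of the re-computation idea (plain Python with a hypothetical layer interface, not PaddlePaddle API): keep only each layer's input as a checkpoint, and recompute the discarded activations when the backward pass needs them.

```python
def forward_checkpointed(x, layers):
    """Forward pass that keeps only each layer's input (a checkpoint)
    and discards intermediate activations to lower peak memory."""
    checkpoints = []
    for layer in layers:
        checkpoints.append(x)          # cheap to keep: one input tensor per layer
        x = layer.forward(x)
    return x, checkpoints


def backward_with_recompute(grad_out, layers, checkpoints):
    """Backward pass that recomputes each layer's activation from its
    checkpoint just before it is needed, trading compute for memory."""
    for layer, x in zip(reversed(layers), reversed(checkpoints)):
        y = layer.forward(x)           # recompute the activation that was discarded
        grad_out = layer.backward(x, y, grad_out)
    return grad_out
```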

@QiJune changed the title from "More level memory optimization level" to "More memory optimization level" on Mar 1, 2018
```diff
@@ -118,7 +118,7 @@ def _find_var(self, block_desc, var_name, is_forward):
         else:
             return block_desc.find_var_recursive(str(var_name))

-    def memory_optimize(self):
+    def memory_optimize(self, level=0):
```
A Collaborator commented on this diff:

The code style says that a function name should be a verb-object phrase, like optimize_memory instead of memory_optimize.

Also, it seems that we cannot optimize the memory itself; what we can do is optimize the usage of the memory.

In this case, does it mean reuse_memory, and should we rename level to reuse_tensor_with_the_same_size?

QiJune (Member, Author) replied:

Yes, you are right; it is mainly about reusing memory. Level 0 means we can only reuse a cached tensor of exactly the same size. Level 1 means we can reuse a cached tensor as long as the current tensor's size is the same as, or smaller than, the cached tensor's size.

I will refine the code accordingly. Thanks!
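For illustration, a minimal sketch of how such a level flag could gate the reuse check (the function below is hypothetical, not the actual transpiler code):

```python
import numpy as np

def can_reuse(needed_shape, cached_shape, level=0):
    """Decide whether a dead tensor in the cache pool can back a new tensor.

    level 0: only reuse a cached tensor with exactly the same number of elements.
    level 1: also reuse a larger cached tensor (the new tensor may be smaller).
    """
    needed = int(np.prod(needed_shape))
    cached = int(np.prod(cached_shape))
    if level == 0:
        return needed == cached
    return needed <= cached
```

For example, a (32, 128) tensor can reuse a cached (64, 64) tensor at level 0, since both hold 4096 elements, while a (16, 64) tensor can reuse that cached tensor only at level 1.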

@dzhwinter added this to Doing in Performance Tuning on Mar 6, 2018
@dzhwinter changed the title from "More memory optimization level" to "[Memory]More memory optimization level" on Mar 6, 2018
@QiJune changed the title from "[Memory]More memory optimization level" to "[Memory]More memory optimization policy" on Mar 8, 2018
@dzhwinter (Contributor) left a review comment:

This PR provides an aggressive policy for reusing memory: it performs a stream synchronize after each operator is launched, so that once the op is no longer running we can safely delete all of its dead variables.

@QiJune We will merge this PR since the Image mission deadline is looming. Please provide some experimental details about the effect on speed and complete the issue description. Thanks!
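To make that concrete, here is a hypothetical pseudo-Python sketch of the runtime behaviour described above (op.run, place.synchronize, and scope.erase are assumed names, not the actual executor API):

```python
def run_with_eager_free(ops, scope, last_use, place):
    """Run ops one by one; after each op finishes, release every variable
    whose last use has passed.

    `last_use[name]` is the index of the last op that touches variable `name`
    (hypothetical structures, for illustration only)."""
    for idx, op in enumerate(ops):
        op.run(scope, place)
        place.synchronize()          # wait for the kernel: the op is no longer running
        for name, last in last_use.items():
            if last == idx:
                scope.erase(name)    # safe to free, nothing later reads it
```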

@QiJune merged commit f7e9fe5 into PaddlePaddle:develop on Mar 12, 2018
Performance Tuning automation moved this from Doing to Done on Mar 12, 2018