Polish fleet API to support cuda collective mode and nccl2 mode. #18966

Merged
gongweibao merged 14 commits into PaddlePaddle:develop on Aug 12, 2019

gongweibao
Contributor

Polish fleet API to support cuda collective mode and nccl2 mode.

  • Polish DistributedStrategy members and support dist_fc and local_sgd
  • Polish minimize function to support compiler function.


startup_program = startup_program if startup_program else \
    fluid.framework.default_startup_program
_check(main_program, self._optimizer, self._strategy)
Member

Could you specify what kind of check this function does? Maybe a more descriptive function name is needed.
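
To illustrate the naming suggestion, here is a minimal sketch, not the code from this PR. `StrategySketch`, `check_strategy_mode_conflicts`, the `mode` and `use_local_sgd` fields, and the rule that local SGD requires collective mode are all hypothetical, chosen only to show how a name can state what is being validated.

```python
from dataclasses import dataclass


@dataclass
class StrategySketch:
    """Hypothetical stand-in for a distributed strategy object."""
    mode: str = "nccl2"          # assumed values: "nccl2" or "collective"
    use_local_sgd: bool = False  # assumed flag name


def check_strategy_mode_conflicts(strategy):
    """Raise ValueError when the (assumed) strategy settings conflict."""
    if strategy.mode not in ("nccl2", "collective"):
        raise ValueError("unknown mode: %r" % strategy.mode)
    # Assumed rule for illustration only: local SGD requires collective mode.
    if strategy.use_local_sgd and strategy.mode != "collective":
        raise ValueError("use_local_sgd requires mode='collective'")
```

A caller reading `check_strategy_mode_conflicts(strategy)` can tell what is validated without opening the function body, which a bare `_check(...)` does not convey.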

Contributor Author

Done.

@gongweibao gongweibao changed the title [WIP]Polish fleet API to support cuda collective mode and nccl2 mode. Polish fleet API to support cuda collective mode and nccl2 mode. Aug 7, 2019
io.save_persistables(self._executor, dirname, main_program, None)
io.save_persistables(executor, dirname, main_program, None)

def node_num(self):
Member

I think this function is not in fleet_base.py. We have a convention that all subclass implementations should follow the interface design in fleet_base.py.
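
The convention described here can be sketched in plain Python as follows. This is an illustration under assumptions, not the contents of fleet_base.py: `FleetBaseSketch`, `CollectiveSketch`, `IncompleteSketch`, and `extra_public_methods` are hypothetical names, and the two abstract methods are placeholders for whatever the real base class declares.

```python
from abc import ABC, abstractmethod


class FleetBaseSketch(ABC):
    """Hypothetical base interface that every subclass must implement."""

    @abstractmethod
    def init_worker(self):
        ...

    @abstractmethod
    def save_persistables(self, executor, dirname, main_program=None):
        ...


class CollectiveSketch(FleetBaseSketch):
    """Implements the full interface, plus one method the base lacks."""

    def init_worker(self):
        return "worker ready"

    def save_persistables(self, executor, dirname, main_program=None):
        return dirname

    def node_num(self):  # not declared on the base class
        return 1


class IncompleteSketch(FleetBaseSketch):
    """Omits save_persistables, so it cannot be instantiated."""

    def init_worker(self):
        return "worker ready"


def extra_public_methods(subclass, base):
    """Names of public callables defined on subclass but absent from base."""
    return sorted(
        name
        for name, value in vars(subclass).items()
        if callable(value) and not name.startswith("_") and not hasattr(base, name)
    )
```

With these definitions, `extra_public_methods(CollectiveSketch, FleetBaseSketch)` reports `node_num` as outside the base interface, and instantiating `IncompleteSketch` raises `TypeError` because an abstract method is left unimplemented.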

Contributor Author

Done.

Collaborator

@gavin1332 gavin1332 left a comment

LGTM

Contributor

luotao1 commented Aug 12, 2019

[05:35:17]	[Step 1/1] + echo 'You must have one RD (XiaoguangHu01,chengduoZH,Xreki,luotao1,sneaxiy,tensor-tang) approval for the api change! python/paddle/fluid/framework.py for the management reason of the underlying code for fluid.'

Contributor

@luotao1 luotao1 left a comment

LGTM for the update on framework.py

@sandyhouse sandyhouse left a comment

LGTM

@gongweibao gongweibao merged commit 29d8781 into PaddlePaddle:develop Aug 12, 2019
@gongweibao gongweibao deleted the polishfleet branch August 12, 2019 09:16
5 participants