Add skeleton of refactor notes #3705

Closed

Collaborator

@reyoung reyoung commented Aug 27, 2017

This is not intended as a global design doc for the refactoring; it only lists
the problems we should be concerned with.

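This is the first step toward an overall design doc for the refactoring. It mainly describes the problems the refactoring needs to address. This PR only lists the outline. You can use this mind map to review it.

If any topic is unnecessary or missing, please comment. Each sub-question will be fleshed out in a separate follow-up PR.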

## Problems to solve

* [Why introduce Op](notes/why_use_op.md)?
Member

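A few specific questions come to mind; I think we need an FAQ aimed at developers.

  1. What does an Op's InferShape mainly do, and why does it need to run twice?
  2. Why is compute stateless, and why shouldn't a Var be created during compute?
  3. What data types does a Var mainly hold? Must it be created in a scope, and why?
  4. For a computation graph, how do we decide which part to run: by explicitly splitting the net into segments, or by inferring it automatically from the dependencies and a target?
  5. If there is a target, what should it be, an op or a var, and why?
  6. What is in-place, and in which scenarios does it occur (e.g. parameter update)?
  7. Does Paddle distinguish a compile time from a run time when building a model? How are the two told apart, and where is the boundary? (Per yesterday's discussion there are two possibilities: VarDesc ==> scope, or tensors with memory versus tensors without memory.) Why split it this way, and what are the benefits?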
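
To make questions 4 and 5 concrete, here is a minimal sketch of the second option: inferring the ops to run from the dependencies and a target. It assumes the target is a variable name and that the ops are already listed in execution order. `Op` and `prune` are hypothetical names used only for illustration; they are not Paddle APIs.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Op:
    """Hypothetical op record: a type plus input/output variable names."""
    type: str
    inputs: Tuple[str, ...]
    outputs: Tuple[str, ...]


def prune(ops: List[Op], targets: List[str]) -> List[Op]:
    """Keep only the ops needed to compute `targets`.

    Assumes `ops` is already in topological (execution) order.
    """
    needed = set(targets)
    kept = []
    # Walk backwards: an op is needed if it produces a needed variable,
    # and then all of its inputs become needed as well.
    for op in reversed(ops):
        if needed.intersection(op.outputs):
            kept.append(op)
            needed.update(op.inputs)
    kept.reverse()  # restore execution order
    return kept


net = [
    Op("mul", ("x", "w"), ("xw",)),
    Op("add", ("xw", "b"), ("y",)),
    Op("softmax", ("y",), ("prob",)),
    Op("accuracy", ("prob", "label"), ("acc",)),
]
to_run = [op.type for op in prune(net, ["y"])]
```

With a var as the target, asking for `y` runs only `mul` and `add`. If the target were an op instead, the same backward walk would start from that op's inputs.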

Member

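On choosing a compute library:
Why use Eigen? Which scenarios suit Eigen, which do not, and what are the considerations?
If GPU kernels are hand-written, what conventions and requirements apply?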

Collaborator Author

OK, got it.

@@ -0,0 +1,13 @@
Why introduce Op

Contributor

@qingqing01 qingqing01 Aug 28, 2017

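I think the following questions need to be explained:

  1. How are the implementations of one Op for different devices (CPU, GPU) and different data types (float, double) organized?
  2. Why are there two contexts, InferShapeContext and ExecutionContext?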

Member

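Yes. So far only float kernels have been considered; double/fp16/int8 and so on have not. There is also the question of how the matching kernel is chosen: does the user specify it when configuring the network, or is it inferred from the type of the user's input data and selected at run time?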
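
As a rough illustration of the two questions above, the sketch below keys kernels by (op type, device, data type) and picks one at run time from the type of the input data. Everything here (`KERNELS`, `register_kernel`, `run`, the `scale` op) is a hypothetical stand-in written in plain Python; it does not reflect how Paddle actually registers or selects kernels.

```python
from typing import Callable, Dict, Tuple

# Hypothetical registry: (op type, device, dtype name) -> kernel function.
KERNELS: Dict[Tuple[str, str, str], Callable] = {}


def register_kernel(op_type: str, device: str, dtype: str):
    """Decorator that files a kernel under its (op, device, dtype) key."""
    def decorator(fn):
        KERNELS[(op_type, device, dtype)] = fn
        return fn
    return decorator


@register_kernel("scale", "cpu", "float")
def scale_cpu_float(xs, factor):
    return [float(x) * factor for x in xs]


@register_kernel("scale", "cpu", "int")
def scale_cpu_int(xs, factor):
    return [int(x * factor) for x in xs]


def run(op_type, device, xs, factor):
    # Run-time selection: infer the dtype from the input data itself.
    dtype = type(xs[0]).__name__
    try:
        kernel = KERNELS[(op_type, device, dtype)]
    except KeyError:
        raise NotImplementedError(f"no {op_type} kernel for {device}/{dtype}")
    return kernel(xs, factor)


floats = run("scale", "cpu", [1.0, 2.0], 2)
ints = run("scale", "cpu", [1, 2], 3)
```

The alternative raised in the comment, having the user specify the kernel when configuring the network, would amount to passing `dtype` in explicitly instead of inferring it inside `run`.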

Collaborator Author

How are the implementations of one Op for different devices (CPU, GPU) and different data types (float, double) organized?
Why are there two contexts, InferShapeContext and ExecutionContext?

OK.

## Problems to solve

* [Why introduce Op](notes/why_use_op.md)?
* [Explicit Backward graph](notes/explicit_backward_in_topology.md)
Member

1 What is a computation graph? What are its nodes, and what are its edges? What kind of graph representation do we need: a control flow graph, a data flow graph, or something else?


* [Why introduce Op](notes/why_use_op.md)?
* [Explicit Backward graph](notes/explicit_backward_in_topology.md)
* [Memory and GPU computation optimization](notes/optimization_for_memory_and_gpu_kernel.md)
Member

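2 How should we currently approach GPU memory optimization and compute optimization? Paddle's earlier design, based on coarse-grained layers, relied on many manual optimization strategies. Now that the design is op-based, which optimization strategy should we choose? We should give a roadmap, for example: rely mainly on manual optimization for now (coarse-grained ops), and later do automatic optimization based on the computation graph.
3 GPU memory optimization strategies
4 GPU compute optimization strategies, including kernel fusion, multi-stream, etc.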
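
As one possible shape for the "automatic optimization based on the computation graph" mentioned in point 2, the sketch below reuses buffers once a variable's last use has passed. It is only an illustration under strong assumptions: all buffers have the same size, ops run strictly in list order, and final outputs do not need to be kept alive. `plan_buffers` is a hypothetical helper, not part of Paddle.

```python
def plan_buffers(ops):
    """Assign a buffer id to every op output, reusing freed buffers.

    `ops` is a list of (inputs, outputs) tuples of variable names in
    execution order. Variables never produced by an op (external inputs)
    get no buffer here. Assumes every buffer has the same size.
    """
    # Index of the last op that reads or writes each variable.
    last_use = {}
    for i, (ins, outs) in enumerate(ops):
        for v in list(ins) + list(outs):
            last_use[v] = i

    free, assignment, next_id = [], {}, 0
    for i, (ins, outs) in enumerate(ops):
        # Allocate outputs first, so an op never writes into the
        # buffer of one of its own inputs (no implicit in-place).
        for v in outs:
            if v not in assignment:
                if free:
                    assignment[v] = free.pop()
                else:
                    assignment[v] = next_id
                    next_id += 1
        # Release buffers of variables whose last use is this op.
        for v in sorted(set(ins) | set(outs)):
            if last_use[v] == i and v in assignment:
                free.append(assignment[v])
    return assignment


chain = [(("x",), ("a",)), (("a",), ("b",)), (("b",), ("c",)), (("c",), ("d",))]
plan = plan_buffers(chain)
```

For this four-op chain the plan needs two buffers instead of four, because `c` can reuse the buffer of `a` and `d` the buffer of `b`.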

* [Explicit Backward graph](notes/explicit_backward_in_topology.md)
* [Memory and GPU computation optimization](notes/optimization_for_memory_and_gpu_kernel.md)
* [Better error messages](notes/better_error_message.md)
* [A thinner Python implementation](notes/thin_python_implementation.md)
Member

I think what this item really means is a more flexible and easier-to-use user API.

Contributor

luotao1 commented Feb 1, 2019

Thanks for contributing documentation to PaddlePaddle! The documentation has moved to the FluidDoc repo, so we are closing this PR. You are welcome to contribute documentation to the FluidDoc repo.

@luotao1 luotao1 closed this Feb 1, 2019