
init Inference top APIs #10549

Merged · 2 commits into PaddlePaddle:develop on May 10, 2018

Conversation

Superjomn (Contributor):

Adds a README.md with a description and plan of how to use the APIs.

@Superjomn Superjomn requested review from luotao1 and Xreki May 10, 2018 03:19
@@ -0,0 +1,27 @@
# Embed Paddle Inference in Your Application

Paddle inference offers the APIs in `C` and `C++` languages.
Contributor:

Is it necessary to split this into C and C++? Right now there is only a C++ API; can we document just the C++ API first?

Contributor Author:

Yes; a C API will be added as well, probably in another PR.

Contributor:

If the C API isn't needed for now, leave it out.


Paddle inference offers the APIs in `C` and `C++` languages.

One can easily deploy a model trained by Paddle following the steps as below:
Contributor:

Paddle->PaddlePaddle


## Optimize the native Fluid Model

The native model obtained from the training phase needs to be optimized before deployment.
Contributor:

We take the save_inference_model output from the training phase, which inserts the feed and fetch ops and applies some pruning optimization. If you take the training-phase model directly, it has no feed and fetch ops and cannot run.

The strategies 1, 2, and 3 mentioned here should already be applied by save_inference_model. Should this section only cover additional optimization strategies, such as third-party engines, operator fusion, and so on?

Contributor Author:

Right, this is only meant to explain why the tool is necessary.

const std::vector<std::vector<int>>& input_shapes,
const std::vector<std::vector<int>>& output_shapes,
const std::vector<std::vector<float>>& input_data,
std::vector<std::vector<float>>* output_data);
Contributor:

This interface no longer works for NLP. Consider using LoDTensor directly in the interface: since users' data formats vary endlessly, it is more reasonable to let users convert to LoDTensor themselves. We can also provide some conversion tools or functions, but the Run interface should keep using LoDTensor.

bool Run(const std::vector<LoDTensor>& input, 
         std::vector<LoDTensor>* output);

The inputs and outputs arguments are not needed; the feed and fetch ops already contain them.

void TestInference(const std::string& dirname,
                   const std::vector<paddle::framework::LoDTensor*>& cpu_feeds,
                   const std::vector<paddle::framework::LoDTensor*>& cpu_fetchs,
                   const int repeat = 1, const bool is_combined = false) {

The unit tests already wrap this fairly cleanly.
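
For reference, a minimal sketch of the LoDTensor-centric interface proposed above; the include path and the placement inside a Predictor class are assumptions extrapolated from the snippets in this thread, not the merged API:

#include <vector>
#include "paddle/fluid/framework/lod_tensor.h"

// Hypothetical sketch of the proposed LoDTensor-based interface. The feed and
// fetch ops saved with the model already record input/output targets, so only
// the tensors themselves need to cross the API boundary.
class Predictor {
 public:
  bool Run(const std::vector<paddle::framework::LoDTensor>& inputs,
           std::vector<paddle::framework::LoDTensor>* outputs);
};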

Contributor:

We also need to consider multi-threaded inference here; a `const int thread_nums` parameter should be added.

Contributor Author:

There is no multi-threading inside the library; multi-threading means the application's own threads calling into the inference library.
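
As an illustration of that threading model, a hedged sketch in which the application owns the threads and each thread drives its own predictor instance (the Predictor stub here is a placeholder, not the real API):

#include <thread>
#include <vector>

// Placeholder standing in for the inference library's predictor; the real
// type is what this PR is designing. One instance per thread, nothing shared.
struct Predictor {
  bool Run(const std::vector<float>& in, std::vector<float>* out) {
    *out = in;  // stand-in for actual inference
    return true;
  }
};

int main() {
  std::vector<std::thread> workers;
  for (int i = 0; i < 4; ++i) {
    workers.emplace_back([] {
      Predictor predictor;  // each application thread owns its own predictor
      std::vector<float> input{1.0f}, output;
      predictor.Run(input, &output);
    });
  }
  for (auto& t : workers) t.join();
  return 0;
}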


class Predictor {
 public:
  struct Attr;
Contributor:

Attr -> Network?

Contributor Author:

It is not a Network; it is an attribute.
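
To make the distinction concrete, a hedged sketch of Attr as a plain configuration record rather than a network description; all field names here are illustrative assumptions:

#include <string>

class Predictor {
 public:
  // Hypothetical attribute/config struct; fields are assumptions for
  // illustration, not the merged definition.
  struct Attr {
    std::string model_dir;  // path to the optimized model
    bool use_gpu = false;   // device selection
    int device_id = 0;
  };
  explicit Predictor(const Attr& attr);
};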

kAnakin, // Use Anakin for inference.
kTensorRT, // Use TensorRT for inference.
kAutoMixedAnakin, // Automatically mix Fluid with Anakin.
kAutoMixedTensorRT, // Automatically mix Fluid with TensorRT.
Contributor:

  • kAutoMixedAnakin and kAutoMixedTensorRT can be dropped; kAnakin should already cover kAutoMixedAnakin
  • kNone should be further split into a CPU mode and a GPU mode
  • Does MKLDNN fall under kNone, or does it get its own entry?

Contributor Author:

It does not: here kTensorRT means using TensorRT for the whole graph, while the subgraph case is the separate switch kAutoMixedTensorRT.

Contributor:

For users, the subgraph vs. whole-graph concepts are a bit complex. Choosing TensorRT should simply mean "optimize with TensorRT"; whether to optimize a subgraph or the whole graph (and whole-graph is just a special case of subgraph optimization) should be an internal implementation detail.
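
Under that simplification, the engine switch might collapse to something like the following sketch (names are assumptions, capturing the suggestion rather than the merged code):

// Sketch of the simplified switch suggested above: the user only picks an
// engine, and subgraph vs. whole-graph mixing is an internal detail.
enum class EngineKind {
  kNativeCPU,  // native Fluid executor on CPU
  kNativeGPU,  // native Fluid executor on GPU
  kAnakin,     // let Anakin optimize; mixing is handled internally
  kTensorRT,   // let TensorRT optimize; mixing is handled internally
};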

Contributor Author (@Superjomn, May 10, 2018):

Some of the features listed are not supported yet; they are included here only so that downstream teams know we are working on them.

- Memory reuse for native Fluid executor;
- Translate the model storage format to some third-party engine's, so that the inference API can utilize the engine for acceleration;

We have an official tool to do the optimization; call `paddle_inference_optimize --help` for more information.
Contributor:

Is paddle_inference_optimize a binary or a Python script? For example, `python paddle_inference_optimize src_model_dir dst_model_dir --inference_optimize_method=2` would mean using the second optimization strategy.

Contributor Author:

Either a binary or a script.

panyx0718 (Contributor) left a comment:

Let's kick off this thing. It's in contrib, just for experimentation for now.

@Superjomn merged commit 6d371e4 into PaddlePaddle:develop May 10, 2018
@Superjomn deleted the feature/inference_api branch May 10, 2018 12:04
@Xreki added the 预测 (formerly Inference; covers C-API inference issues, etc.) label May 16, 2018
@Xreki added this to Basic Usage (DONE) in Inference Framework May 21, 2018