ctc init #63

Closed

Superjomn
Contributor

GPU memory ends up being allocated multiple times.

@Superjomn requested a review from qingqing01 June 2, 2017 07:16
ctc/model.py Outdated
stride_x=1,
stride_y=1,
block_x=1,
block_y=3)
Collaborator

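According to PaddlePaddle/Paddle#2296, the features output by conv_features here have c=128, h=11, w=3, so block_y should be 11. This setting is probably wrong, which would explain both the GPU memory problem and the problem reported in issue #2296.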
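To make the arithmetic concrete, here is a minimal sketch of how an im2col-style block expansion turns a feature map into a sequence. It assumes no padding and a floor-based output size; `block_expand_shape` is a hypothetical helper for illustration, not a PaddlePaddle API.

```python
def block_expand_shape(c, h, w, block_x, block_y, stride_x=1, stride_y=1):
    """Return (num_steps, step_size) for an im2col-style block expansion.

    Assumes no padding; this is an illustration, not PaddlePaddle's code.
    """
    if block_x > w or block_y > h:
        raise ValueError("block must fit inside the feature map")
    steps_x = (w - block_x) // stride_x + 1  # block positions along the width
    steps_y = (h - block_y) // stride_y + 1  # block positions along the height
    return steps_x * steps_y, c * block_x * block_y


# Feature map from the comment above: c=128, h=11, w=3.
print(block_expand_shape(128, 11, 3, block_x=1, block_y=11))  # (3, 1408)
print(block_expand_shape(128, 11, 3, block_x=1, block_y=3))   # (27, 384)
```

Under these assumptions, block_y=11 makes each column of the feature map one time step (3 steps), while block_y=3 also slides the block vertically and yields 27 steps.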

@Superjomn
Contributor Author

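The training config is written, so this is basically ready for a PR.
I plan to add the training results and the generation config in the next pull request. @qingqing01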

Collaborator

@qingqing01 left a comment

Also, an inference procedure is needed.

ctc/README.md Outdated
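# CTC (Connectionist Temporal Classification) Model: CRNN Tutorial
## Background

Real-world sequence learning tasks need to predict the corresponding label sequence from a continuous input sequence,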
Collaborator

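"Sequence learning tasks need to predict the corresponding label sequence from a continuous input sequence"

Is "continuous input" accurate? The input to machine translation is not continuous, is it?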

Contributor Author

done

ctc/README.md Outdated
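## Background

Real-world sequence learning tasks need to predict the corresponding label sequence from a continuous input sequence,
for example, a speech recognition task produces the corresponding text sequence from continuous speech, similar to a seq2seq task;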
Collaborator

If "seq2seq task" is an established term, it needs a link.

ctc/README.md Outdated
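CTC-related models are a class of algorithms for this kind of seq2seq task. Specifically, a CTC model performs one classification per time step of the input sequence and outputs one label (the origin of "Classification" in CTC),
and finally processes the resulting label sequence into the corresponding output sequence (see below for the algorithm).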
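
The post-processing step can be sketched as follows. This is a minimal illustration of the standard CTC collapsing rule (merge repeated labels, then remove the blank label); the function name and the choice of 0 as the blank index are assumptions, not taken from this PR.

```python
from itertools import groupby


def ctc_greedy_collapse(step_labels, blank=0):
    """Collapse per-time-step labels: merge repeats, then drop blanks."""
    merged = [label for label, _ in groupby(step_labels)]
    return [label for label in merged if label != blank]


# 0 is the blank; the two runs of 3 are separated by a blank, so both survive.
print(ctc_greedy_collapse([0, 3, 3, 0, 3, 5, 5, 0]))  # [3, 3, 5]
```

Merging repeats before removing blanks is what lets a blank separate two genuinely repeated characters.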

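CTC algorithms are used in many fields, such as handwritten digit recognition, speech recognition, gesture recognition, and continuous image text recognition. Apart from the domain knowledge specific to each task,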
Collaborator

"Continuous image text recognition" is not accurate enough.

ctc/README.md Outdated
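CTC algorithms are used in many fields, such as handwritten digit recognition, speech recognition, gesture recognition, and continuous image text recognition. Apart from the domain knowledge specific to each task,
all of these tasks take a continuous sequence as input and produce a label sequence as output.

This article targets the **Scene Text Recognition (STR)** task and demonstrates how to use PaddlePaddle to implement a one-stop CTC model, **CRNN (Convolutional Recurrent Neural Network)**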
Collaborator

"one-stop" (一站式) -> "end-to-end" (端到端)

ctc/README.md Outdated
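CTC algorithms are used in many fields, such as handwritten digit recognition, speech recognition, gesture recognition, and continuous image text recognition. Apart from the domain knowledge specific to each task,
all of these tasks take a continuous sequence as input and produce a label sequence as output.

This article targets the **Scene Text Recognition (STR)** task and demonstrates how to use PaddlePaddle to implement a one-stop CTC model, **CRNN (Convolutional Recurrent Neural Network)**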
Collaborator

Should OCR be mentioned here, with an explanation of how it differs from STR?



if __name__ == '__main__':
image_file_list = '/home/disk1/yanchunwei/90kDICT32px/train_all.txt'
Collaborator

This script needs a step that downloads the data; this path can then be replaced with the path of the downloaded data.
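
One way to drop the hard-coded path is sketched below. It assumes the dataset has already been downloaded and unpacked under some local directory; `resolve_image_list` and the `data_dir` argument are hypothetical names, and no download URL is assumed here.

```python
import os


def resolve_image_list(data_dir, list_name="train_all.txt"):
    """Build the list-file path under a downloaded dataset directory.

    `data_dir` is wherever the data was unpacked (an assumption of this sketch).
    """
    path = os.path.join(data_dir, "90kDICT32px", list_name)
    if not os.path.isfile(path):
        raise IOError("file list not found, download the data first: " + path)
    return path
```

The script could then call `resolve_image_list(data_dir)` with a directory taken from a command-line argument instead of a user-specific home path.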

ctc/README.md Outdated
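### Image data and processing
This task uses the dataset \[[4](#参考文献)\], which contains image data and the corresponding target text. The target text to be predicted needs to be converted into a one-dimensional list of IDs, which we implement with the following class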
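
As a rough illustration of the text-to-ID conversion described above, here is a minimal sketch. The class name `CharDict`, its methods, and the lowercase-letter vocabulary are all assumptions made for the example, not the class used in this PR.

```python
class CharDict(object):
    """Map characters to integer IDs (hypothetical name and layout)."""

    def __init__(self, chars):
        # IDs follow the order of `chars`; the vocabulary is an assumption.
        self.char2id = {ch: idx for idx, ch in enumerate(chars)}

    def word2ids(self, word):
        """Convert a label string into a flat list of integer IDs."""
        return [self.char2id[ch] for ch in word]

    def size(self):
        """Number of distinct characters in the vocabulary."""
        return len(self.char2id)


dic = CharDict("abcdefghijklmnopqrstuvwxyz")
print(dic.word2ids("cab"))  # [2, 0, 1]
```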

```python
Collaborator

There is no need to paste this much code here; just tell the user which script and function it lives in.

if self.fixed_shape:
image = cv2.resize(
image, self.fixed_shape, interpolation=cv2.INTER_CUBIC)
# image = to_chw(image)
Collaborator

@qingqing01 Jul 4, 2017

image = to_chw(image)

Remove the commented-out code.

ctr/model.py Outdated
@@ -0,0 +1,76 @@
#!/usr/bin/env python
Collaborator

Can the CTR and DSSM code and docs be removed from this PR?

ctc/train.py Outdated

trainer.train(
reader=paddle.batch(
paddle.reader.shuffle(dataset.train, buf_size=100),
Collaborator

buf_size could be increased a bit to widen the shuffle range.
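
For context, a buffered shuffle only mixes samples that sit in the same buffer, so a small buf_size leaves the data close to its original order. The sketch below illustrates that idea; `buffered_shuffle` is a hypothetical stand-in and is not the implementation of `paddle.reader.shuffle`.

```python
import random


def buffered_shuffle(reader, buf_size, seed=None):
    """Yield items from `reader()` shuffled within windows of `buf_size`."""
    rng = random.Random(seed)

    def shuffled_reader():
        buf = []
        for item in reader():
            buf.append(item)
            if len(buf) >= buf_size:
                rng.shuffle(buf)
                for x in buf:
                    yield x
                buf = []
        if buf:  # flush the final partial window
            rng.shuffle(buf)
            for x in buf:
                yield x

    return shuffled_reader


def numbers():
    """Toy reader creator that yields 0..9 in order."""
    return iter(range(10))


# With buf_size=3 an item never leaves its window of 3 consecutive samples.
print(list(buffered_shuffle(numbers, buf_size=3, seed=0)()))
```

In this sketch, a buffer at least as large as the dataset gives a full shuffle, at the cost of holding every sample in memory.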


2 participants