Stack LSTM Net for Paddle Book6 #5503

Merged 11 commits on Nov 13, 2017

QiJune (Member) commented Nov 8, 2017

Fix #5504

'isReverse': is_reverse,
'gateActivation': gate_activation,
'cellActivation': cell_activation,
'candidateActivation': candidate_activation
Contributor

These attr names have been changed to snake_case; please update.
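
The rename requested above can be illustrated with a small helper. This is only a sketch: `camel_to_snake` and `rename_attrs` are hypothetical names, not part of Paddle, and it assumes the new attr names are the mechanical snake_case form of the old ones. The operator definition remains the authoritative source for the actual names.

```python
import re


def camel_to_snake(name):
    """Convert a camelCase identifier to snake_case (hypothetical helper)."""
    # Insert an underscore before every uppercase letter except a leading one.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


def rename_attrs(attrs):
    """Return a copy of an attrs dict with all keys converted to snake_case."""
    return {camel_to_snake(key): value for key, value in attrs.items()}


# Placeholder values standing in for the variables used in the diff.
old_attrs = {
    "isReverse": False,
    "gateActivation": "sigmoid",
    "cellActivation": "tanh",
    "candidateActivation": "tanh",
}
new_attrs = rename_attrs(old_attrs)
```

In the PR itself the keys would simply be edited by hand; the helper only shows the mapping.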

'cellActivation': cell_activation,
'candidateActivation': candidate_activation
})
return hidden
Contributor

The Cell is also an output of the op, not only the hidden state.

inputs = [fc1, lstm1]

for i in range(2, stacked_num + 1):
    fc = layers.fc(input=inputs, size=hid_dim)
Contributor

This differs from the book: https://github.com/PaddlePaddle/book/blob/develop/06.understand_sentiment/train.py#L58

This fc has two inputs and therefore two sets of weights. Regarding the initialization of each weight, note in particular that the weight for the lstm input is initialized to 0:

fc_para_attr = paddle.attr.Param(learning_rate=1e-3)
lstm_para_attr = paddle.attr.Param(initial_std=0., learning_rate=1.)
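
To make the point about two weight sets concrete, here is a minimal pure-Python sketch of a fully connected layer with two inputs. It is not Paddle code: `matvec` and `fc_two_inputs` are hypothetical helpers, and the zero-initialized `w_lstm` assumes that `initial_std=0.` with a zero mean yields all-zero weights.

```python
def matvec(x, w):
    """Multiply a vector x (length in_dim) by a matrix w (in_dim x out_dim)."""
    out_dim = len(w[0])
    return [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(out_dim)]


def fc_two_inputs(x_fc, x_lstm, w_fc, w_lstm, bias):
    """An fc layer over two inputs: each input has its own weight matrix."""
    from_fc = matvec(x_fc, w_fc)
    from_lstm = matvec(x_lstm, w_lstm)
    return [a + b + c for a, b, c in zip(from_fc, from_lstm, bias)]


in_dim, out_dim = 3, 2
w_fc = [[0.1, -0.2], [0.3, 0.4], [-0.5, 0.6]]        # ordinary non-zero init
w_lstm = [[0.0] * out_dim for _ in range(in_dim)]    # zero init for the lstm input
bias = [0.0, 0.0]

x_fc = [1.0, 2.0, 3.0]
out_a = fc_two_inputs(x_fc, [5.0, 5.0, 5.0], w_fc, w_lstm, bias)
out_b = fc_two_inputs(x_fc, [-7.0, 0.0, 9.0], w_fc, w_lstm, bias)
```

With `w_lstm` at zero, the layer's initial output depends only on the fc input, so the lstm branch starts out contributing nothing and is learned during training.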

for i in range(2, stacked_num + 1):
    fc = layers.fc(input=inputs, size=hid_dim)
    lstm = layers.dynamic_lstm(
        input=fc, size=hid_dim, is_reverse=(i % 2) == 0)
Contributor

Another difference from the book: in the book, the lstm's candidate_activation (the `act` argument there) is relu.

https://github.com/PaddlePaddle/book/blob/develop/06.understand_sentiment/train.py#L80
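
To show where the candidate activation enters the computation, here is a sketch of a single-unit LSTM step in plain Python. It follows the textbook LSTM equations without peepholes, not Paddle's kernel; `lstm_step` and the `ACTIVATIONS` table are hypothetical, and the four pre-activations are assumed to be already computed from the input and the previous hidden state.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def relu(x):
    return max(0.0, x)


ACTIVATIONS = {"sigmoid": sigmoid, "tanh": math.tanh, "relu": relu}


def lstm_step(pre_i, pre_f, pre_c, pre_o, c_prev,
              gate_activation="sigmoid",
              cell_activation="tanh",
              candidate_activation="tanh"):
    """One LSTM step for a single unit; returns (hidden, cell)."""
    gate = ACTIVATIONS[gate_activation]
    cell_act = ACTIVATIONS[cell_activation]
    cand_act = ACTIVATIONS[candidate_activation]

    i = gate(pre_i)              # input gate
    f = gate(pre_f)              # forget gate
    o = gate(pre_o)              # output gate
    c_tilde = cand_act(pre_c)    # candidate cell value
    c = f * c_prev + i * c_tilde
    h = o * cell_act(c)
    return h, c


h_tanh, c_tanh = lstm_step(0.0, 0.0, -1.0, 0.0, c_prev=0.0)
h_relu, c_relu = lstm_step(0.0, 0.0, -1.0, 0.0, c_prev=0.0,
                           candidate_activation="relu")
```

With a negative candidate pre-activation, tanh writes a negative value into the cell while relu writes zero, so the two settings can train differently. The step returns both the hidden state and the cell, in line with the earlier comment that the Cell is also an output.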

prediction = layers.fc(input=[fc_last, lstm_last],
                       size=class_dim,
                       act='softmax')
cost = layers.cross_entropy(input=prediction, label=label)
Contributor

For numerical stability we have softmax_with_cross_entropy_op. I suggest replacing softmax + cross_entropy in the demo with softmax_with_cross_entropy_op.
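
The stability concern can be sketched in plain Python. This is not the Paddle operator's implementation; both function names are hypothetical. The fused form works on logits and uses the log-sum-exp trick, so it never exponentiates a large positive number or takes the log of zero.

```python
import math


def naive_softmax_cross_entropy(logits, label):
    """Softmax followed by cross entropy, computed as two separate steps."""
    exps = [math.exp(z) for z in logits]   # overflows for large logits
    total = sum(exps)
    probs = [e / total for e in exps]
    return -math.log(probs[label])         # fails if probs[label] underflows to 0


def fused_softmax_cross_entropy(logits, label):
    """Cross entropy computed directly from logits via log-sum-exp."""
    m = max(logits)
    # Subtracting the max keeps every exponent <= 0, so exp() cannot overflow.
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


moderate = [1.0, 2.0, 3.0]
extreme = [1000.0, 0.0]

loss_naive = naive_softmax_cross_entropy(moderate, 0)
loss_fused = fused_softmax_cross_entropy(moderate, 0)
loss_extreme = fused_softmax_cross_entropy(extreme, 1)

try:
    naive_softmax_cross_entropy(extreme, 1)
    naive_failed = False
except OverflowError:
    naive_failed = True
```

For moderate logits the two forms agree; for extreme logits only the fused form returns a finite loss. In array libraries the naive form typically yields `inf` or `nan` rather than raising.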

    paddle.reader.shuffle(
        paddle.dataset.imdb.train(word_dict), buf_size=1000),
    batch_size=BATCH_SIZE)
place = core.CPUPlace()
qingqing01 (Contributor), Nov 10, 2017

Should we add a GPU example?

# place = core.GPUPlace(0)

outs = exe.run(g_main_program,
               feed={"words": tensor_words,
                     "label": tensor_label},
               fetch_list=[cost, acc])
Contributor

Will this be used as a demo later? If so, shouldn't it also be evaluated on the test set? (A TODO could be added instead, to be handled in a follow-up PR.)
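
A test-set pass along the lines suggested here could look like the sketch below. It is framework-agnostic: `evaluate` is a hypothetical helper, and `run_batch` is an assumed callable that would wrap something like the `exe.run(...)` call above and return the mean cost and accuracy for one batch.

```python
def evaluate(batches, run_batch):
    """Average cost and accuracy over all batches, weighted by batch size."""
    total_cost = 0.0
    total_correct = 0.0
    num_samples = 0
    for batch in batches:
        cost, acc = run_batch(batch)   # per-batch mean cost and accuracy
        size = len(batch)
        total_cost += cost * size
        total_correct += acc * size
        num_samples += size
    if num_samples == 0:
        raise ValueError("no samples to evaluate")
    return total_cost / num_samples, total_correct / num_samples


# Stand-in for a real forward pass: pretend batches of 3 are classified
# perfectly and all other batches are classified wrongly.
def fake_run_batch(batch):
    return (1.0, 1.0) if len(batch) == 3 else (3.0, 0.0)


test_cost, test_acc = evaluate([[1, 2, 3], [4]], fake_run_batch)
```

Weighting by batch size matters because the last batch of a dataset is often smaller than the rest; a plain mean over batches would over-weight it.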

qingqing01 (Contributor) left a comment

I approve this PR, but some of the review comments above still need to be addressed later; I created issue #5591 to track them.

2 participants