
[question] Usage of nce_layer #1388

Closed
pkuyym opened this issue Feb 20, 2017 · 5 comments
pkuyym (Contributor) commented Feb 20, 2017

The documentation says:

Noise-contrastive estimation. Implements the method in the following paper: A fast and simple algorithm for training neural probabilistic language models.

cost = nce_layer(input=layer1, label=layer2, weight=layer3,
                 num_classes=3, neg_distribution=[0.1, 0.3, 0.6])

My understanding: during training, negative samples are drawn from the specified distribution for each update, so the model's quality depends heavily on both the sampling distribution and the number of negative samples.

Question 1:
Is this understanding correct? Is there any practical experience on how much nce_layer affects model convergence compared with a full softmax plus negative log loss?

Question 2:
This layer outputs a cost. How can I obtain the probability of each class at prediction time?
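For reference, the binary-classification form of NCE from the Mnih & Teh paper cited above can be sketched in plain NumPy (the function and variable names below are illustrative, not Paddle API): each true word is scored against k noise samples drawn from the noise distribution Pn, and the classification logit is s(w) - log(k · Pn(w)).

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def nce_loss(score_pos, scores_neg, pn_pos, pn_neg, k):
    """NCE loss for one true sample and k noise samples (sketch).

    score_pos / scores_neg: unnormalized model scores s(w)
    pn_pos / pn_neg: noise-distribution probabilities Pn(w)
    """
    # Logit for the "data vs. noise" classifier: s(w) - log(k * Pn(w))
    logit_pos = score_pos - np.log(k * pn_pos)
    logits_neg = scores_neg - np.log(k * pn_neg)
    # The true word should be classified as data, the samples as noise.
    return (-np.log(sigmoid(logit_pos))
            - np.sum(np.log(1.0 - sigmoid(logits_neg))))
```

Because only the true class and the k sampled classes are touched, each update costs O(k) instead of O(num_classes), which is the whole appeal over a full softmax.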

pkuyym (Contributor, Author) commented Feb 20, 2017

When using nce_layer, the following error occurs:

[INFO 2017-02-20 16:57:45,611 dataprovider.py:20] dict len : 1972305
I0220 16:57:45.611878 28999 GradientMachine.cpp:135] Initing parameters..
I0220 16:58:14.458161 28999 GradientMachine.cpp:142] Init parameters done.
I0220 16:58:15.707015 31345 ThreadLocal.cpp:40] thread use undeterministic rand seed:31346
Thread [140038641317632] Forwarding nce_layer_0, fc_layer_1, fc_layer_0, last_seq_0, simple_gru2_0, __simple_gru2_0___transform, embedding_0, label, bidword_seq,
*** Aborted at 1487581099 (unix time) try "date -d @1487581099" if you are using GNU date ***
PC: @ 0x7f69e3625764 __log
*** SIGFPE (@0x7f69e3625764) received by PID 28999 (TID 0x7f5d49787700) from PID 18446744073229457252; stack trace: ***
@ 0x7f69e4238160 (unknown)
@ 0x7f69e3625764 __log
@ 0x5fd4d5 paddle::NCELayer::forward()
@ 0x6ecb40 paddle::NeuralNetwork::forward()
@ 0x6e15e9 paddle::TrainerThread::forward()
@ 0x6e3bd5 paddle::TrainerThread::computeThread()
@ 0x7f69e39b28a0 execute_native_thread_routine
@ 0x7f69e42301c3 start_thread
@ 0x7f69e312312d __clone
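One plausible reading of the trace above (an assumption, not confirmed in this thread): the SIGFPE lands inside `__log` called from `NCELayer::forward()`, which suggests `log()` received an invalid argument, for example a zero probability in the noise distribution. A hypothetical guard in plain Python illustrates the idea:

```python
import math

def safe_log_probs(probs, eps=1e-12):
    """Normalize a distribution and clamp entries before taking log.

    Hypothetical helper, not PaddlePaddle API: a zero probability in the
    noise distribution would make log() undefined.
    """
    total = sum(probs)
    return [math.log(max(p / total, eps)) for p in probs]
```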

luotao1 (Contributor) commented Feb 20, 2017

Please post your configuration. Also, please note that nce_layer cannot run on GPU.

pkuyym (Contributor, Author) commented Feb 20, 2017

@luotao1 Thanks for the reminder. The configuration is as follows:
bidword_seq = data_layer(name='bidword_seq', size=dict_size)
label = data_layer(name='label', size=dict_size)

###################### Algorithm Configuration ######################
settings(
    batch_size=2000,
    learning_rate=1e-7,
    learning_method=MomentumOptimizer(momentum=0.95),
    regularization=L2Regularization(1e-5)
)

###################### Network Configuration ########################
embed = embedding_layer(input=bidword_seq, size=embed_dim)
gru = simple_gru2(input=embed, size=rnn_dim)
fw_seq = last_seq(input=gru)
hid_layer1 = fc_layer(input=fw_seq, size=200, act=ReluActivation())
hid_layer2 = fc_layer(input=hid_layer1, size=100, act=ReluActivation())
outputs(nce_layer(input=[hid_layer2], label=label, num_classes=dict_size))
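The config above passes no neg_distribution. If one wants to supply it explicitly, as in the documentation snippet, a common choice in NCE is a smoothed unigram distribution built from corpus word counts. A minimal helper (illustrative only, not part of the Paddle config):

```python
def unigram_noise_distribution(counts, power=0.75):
    """Smoothed unigram noise distribution for NCE sampling.

    counts: per-class occurrence counts from the training corpus.
    power: flattening exponent; 0.75 is a common heuristic choice.
    """
    weights = [c ** power for c in counts]
    total = sum(weights)
    return [w / total for w in weights]
```

The resulting list could then be passed as `neg_distribution=...`; its length must match num_classes and it must sum to 1.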

dqcao commented Apr 6, 2017

I've run into the same problem. Has it been solved? If so, how? @luotao1 @pkuyym

@lcy-seso lcy-seso self-assigned this Apr 7, 2017
pkuyym (Contributor, Author) commented Jul 31, 2017

@dqcao Please refer to the nce_cost example for how to use NCE; it includes the logic for both training and prediction with nce_cost.
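On the prediction question earlier in the thread: a common approach (a sketch, under the assumption that the trained output-layer weight matrix W and bias b can be extracted from the model) is to drop the NCE sampling at inference time and score every class with a full softmax over the same parameters:

```python
import numpy as np

def predict_softmax(hidden, W, b):
    """Per-class probabilities at prediction time (sketch).

    hidden: final hidden vector (hidden_dim,)
    W: output weights (num_classes, hidden_dim), b: bias (num_classes,)
    These are assumed to be exported from the trained NCE layer.
    """
    logits = W @ hidden + b
    logits -= logits.max()          # subtract max for numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()
```

This is O(num_classes) per query, but the NCE speedup only matters during training, where the softmax normalization would otherwise dominate every update.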
