Question about the Title input #28

Closed
ga2006084851 opened this issue Feb 19, 2020 · 2 comments

Comments

@ga2006084851

ga2006084851 commented Feb 19, 2020

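Hello, and thank you for sharing this work. I have two questions.
Question 1:
Regarding the Title x: do you take the whole string "(10)(a)牛仔外套女2019春秋装新款宽松学生韩版bf原宿风外套牛仔衣潮" as x, convert it to embeddings, add the embedding of the attribute (10)(a), and feed the sum into the encoder?
Or do you take only the plain title "牛仔外套女2019春秋装新款宽松学生韩版bf原宿风外套牛仔衣潮" as x,
and add the embedding of the attribute (10)(a) to that before feeding it in?
Question 2:
Regarding the final generated personalized product description: is the number of generated characters random?
Is there a way to set a length limit? Or is the length determined by the description lengths in the training set?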

@qibinc
Collaborator

qibinc commented Feb 19, 2020

Hi @ga2006084851,

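  1. The latter: only the plain title text is x, and the (10)(a) attribute embedding is added to it.
  2. In this work and its code there is no way to limit the number of decoded characters, but the output length generally ends up close to the description lengths in the training set. You can force the end-of-sequence token at a chosen timestep during beam search decoding, but that will hurt quality. I believe some papers have tried to address this problem, but the gains are probably limited, since the target length cannot differ too much from the lengths in the training set. If most training descriptions are around 100 characters and you need 30, you could train a summarization model on other data and use it to post-process generated descriptions that are too long. Alternatively, if you have enough training data, I would suggest simply dropping the overly long training samples.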
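For point 2, here is a minimal sketch of two of the options mentioned: forcing end-of-sequence at a fixed timestep, and dropping overly long training samples. It is not code from this repository. The names `EOS`, `greedy_decode_with_cap`, `drop_long_samples`, and `toy_step` are hypothetical, the model is replaced by a toy scoring function, and greedy decoding (beam width 1) stands in for beam search to keep the example short.

```python
from typing import Callable, Dict, List, Tuple

EOS = "<eos>"  # hypothetical name for the end-of-sequence token


def greedy_decode_with_cap(
    step_fn: Callable[[List[str]], Dict[str, float]],
    max_len: int,
) -> List[str]:
    """Greedy decoding that forces EOS once max_len tokens were emitted.

    step_fn stands in for the model: given the tokens generated so far,
    it returns a score for every candidate next token.
    """
    tokens: List[str] = []
    while True:
        if len(tokens) >= max_len:
            tokens.append(EOS)  # forced stop, regardless of model scores
            break
        scores = step_fn(tokens)
        best = max(scores, key=scores.get)
        tokens.append(best)
        if best == EOS:  # the model chose to stop on its own
            break
    return tokens


def drop_long_samples(
    pairs: List[Tuple[str, str]], max_target_len: int
) -> List[Tuple[str, str]]:
    """Keep only (title, description) pairs whose description is short enough."""
    return [(src, tgt) for src, tgt in pairs if len(tgt) <= max_target_len]


def toy_step(prefix: List[str]) -> Dict[str, float]:
    """Toy model: prefers the token "a" for five steps, then prefers EOS."""
    if len(prefix) < 5:
        return {"a": 0.9, EOS: 0.1}
    return {"a": 0.1, EOS: 0.9}


capped = greedy_decode_with_cap(toy_step, max_len=3)
uncapped = greedy_decode_with_cap(toy_step, max_len=10)
filtered = drop_long_samples([("t1", "short"), ("t2", "x" * 100)], max_target_len=30)
```

With the cap at 3 the toy model is cut off before it would have stopped by itself, which is the kind of truncation that can hurt quality on a real model. Filtering the training set instead lets the model learn shorter targets directly.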

Hope this helps!
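To illustrate point 1, a minimal sketch of how the encoder input could be composed: the title tokens alone are embedded, and the attribute embedding is added to them. It is not code from this repository. The function `encoder_input`, the toy embedding tables, the two-token split of the title, and the choice to add the attribute vector at every title position are all assumptions made for illustration, and "(10)(a)" is treated as a single lookup key.

```python
from typing import Dict, List

Vector = List[float]

# Toy embedding tables with made-up values (hypothetical, dimension 2).
token_emb: Dict[str, Vector] = {"牛仔": [1.0, 2.0], "外套": [3.0, 4.0]}
attr_emb: Dict[str, Vector] = {"(10)(a)": [0.5, -0.5]}


def encoder_input(
    title_tokens: List[str],
    attribute: str,
    token_table: Dict[str, Vector],
    attr_table: Dict[str, Vector],
) -> List[Vector]:
    """Embed the plain title tokens and add the attribute embedding to each.

    The attribute marker is NOT part of title_tokens; it only enters
    through its own embedding vector.
    """
    attr_vec = attr_table[attribute]
    return [
        [t + a for t, a in zip(token_table[tok], attr_vec)]
        for tok in title_tokens
    ]


x = encoder_input(["牛仔", "外套"], "(10)(a)", token_emb, attr_emb)
```

Here `x` holds one summed vector per title token, and the string "(10)(a)" never appears in the token sequence itself.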

@ga2006084851
Author

I understand now. Thank you very much for your help!

@qibinc qibinc closed this as completed Feb 19, 2020

2 participants