
How to prepare the data for a text generation task? Thank you very much. #2

Open
guotong1988 opened this issue Mar 25, 2019 · 3 comments

Comments


guotong1988 commented Mar 25, 2019

First, I'm not sure whether the model contains an encoder during training.

(EOS means end-of-sentence. The encoder and decoder are the two parts of the Transformer network.)

Without an encoder, at training time:

target: [E, F, G, H, EOS]
decoder input: [0, E, F, G, H]

Without an encoder, at testing time:

decoder input: [0]

With an encoder, at training time:

encoder input: [A, B, C, D]
target: [E, F, G, H, EOS]
decoder input: [0, E, F, G, H]

With an encoder, at testing time:

encoder input: [A, B, C, D]
decoder input: [0]

Is this exactly right?

I know this is beyond the topic of this project, but I hope you can help.
Thank you very much.
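For reference, a minimal sketch of the without-encoder case described above, in Python. The token ids, the EOS id, and the use of 0 as the start token are placeholders for illustration, not values taken from this project.

```python
# Minimal sketch: build (decoder_input, target) pairs for a decoder-only
# language model. EOS and the 0 start token are placeholder ids.

EOS = 50256  # hypothetical end-of-sentence / end-of-text token id
BOS = 0      # hypothetical start token, the "0" in the example above

def make_lm_pair(token_ids):
    """Shift the sequence right for the decoder input; append EOS to the target."""
    decoder_input = [BOS] + token_ids   # [0, E, F, G, H]
    target = token_ids + [EOS]          # [E, F, G, H, EOS]
    return decoder_input, target

# At training time the whole pair is fed at once (with causal masking).
# At testing time generation starts from just [BOS] and tokens are appended
# one at a time until EOS is produced.
```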


pkumdl commented Mar 25, 2019

I have the same problem. For translation, I know what the input and target are, but I am not sure what the input and target are for the language model.
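In case it helps, a minimal sketch of the difference, assuming the usual GPT-2-style next-token convention (a general illustration, not code from this repository):

```python
# Translation: input and target come from two different sequences.
#   input  = source-language tokens
#   target = target-language tokens
#
# Language model: input and target come from the SAME sequence,
# offset by one position (next-token prediction).

def lm_input_target(token_ids):
    model_input = token_ids[:-1]   # every token except the last
    target = token_ids[1:]         # the same tokens shifted left by one
    return model_input, target

# e.g. token_ids = ["A", "B", "C", "D", "<eos>"]
# input  = ["A", "B", "C", "D"]
# target = ["B", "C", "D", "<eos>"]
```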

@Astraithious

I would also love to know how to properly prepare data for the model. I am quite new to this, but it is very exciting to work with.


dadaas commented Jul 18, 2019

I am also wondering what to do if we have different types of data, for example lyrics and recipes. Do we just fine-tune the 345M model on all of the data and run it, or should we keep the datasets separate and use only the lyrics data when we want lyrics and only the recipe data when we want to generate recipes?

And if that is the case, how do we do it in a simple way?
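One common way to handle this with a single model is to prefix each training example with a domain tag, fine-tune one model on the combined data, and then start the prompt with the tag for the domain you want. This is a hedged sketch: the tag strings below are invented for illustration and are not part of GPT-2 or this project (only "<|endoftext|>" is GPT-2's standard document separator). The alternative is simply to fine-tune two separate copies of the 345M model, one per dataset, and load whichever matches the task.

```python
# Hedged sketch: tag each example with its domain before fine-tuning a single
# model on the combined data. "<|lyrics|>" and "<|recipe|>" are made-up tags.

lyrics_texts = ["...some lyric text..."]   # replace with your lyrics dataset
recipe_texts = ["...some recipe text..."]  # replace with your recipes dataset

def tag_example(text, domain):
    return f"<|{domain}|>\n{text}\n<|endoftext|>\n"

training_text = (
    "".join(tag_example(t, "lyrics") for t in lyrics_texts)
    + "".join(tag_example(t, "recipe") for t in recipe_texts)
)

# At generation time, start the prompt with the tag of the domain you want:
#   prompt = "<|lyrics|>\n"
```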
