Why do the value calculations in generate and learn use different masks? #14

Nightbringers · 2023-01-03T07:03:16Z

I'm confused about how the value is calculated: why are different masks used? In the generate method, the mask includes the prompt, but during training in the learn method, the mask excludes the prompt.
This is in the learn method:
    action_masks = ~prompt_masks & masks
    action_logits, values = self.actor_critic(
        sequences,
        mask = action_masks
    )
And this is in the generate method:
    mask = None
    if exists(eos_token):
        mask = ((sequence == eos_token).cumsum(dim = -1) == 0)
        mask = F.pad(mask, (1, -1), value = True) # include eos token
    action_logits, value = self.forward(
        sequence,
        mask = mask,
        return_values = return_values
    )
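
To make the mismatch concrete, here is a minimal sketch that rebuilds both masks for a single sequence, using plain Python lists in place of tensors. The helper names (generation_mask, action_mask), the toy token ids, and the prompt length are hypothetical and not from the repository. It also assumes that masks in learn is built the same way as the mask in generate, and that prompt_masks is True exactly on the prompt positions.

```python
from itertools import accumulate

def generation_mask(sequence, eos_token):
    """List version of the generate-side mask: True up to and including the first EOS."""
    # Running count of EOS tokens seen so far (the cumsum step).
    eos_seen = list(accumulate(int(tok == eos_token) for tok in sequence))
    before_eos = [count == 0 for count in eos_seen]
    # Shift right by one: prepend True, drop the last entry.
    # This mimics F.pad(mask, (1, -1), value=True) and keeps the EOS position.
    return [True] + before_eos[:-1]

def action_mask(sequence, prompt_len, eos_token):
    """List version of the learn-side mask: ~prompt_masks & masks."""
    masks = generation_mask(sequence, eos_token)  # assumption: same construction as in generate
    prompt_masks = [i < prompt_len for i in range(len(sequence))]  # assumption: True on prompt tokens
    return [m and not p for m, p in zip(masks, prompt_masks)]

EOS = 0  # hypothetical EOS token id
# 3 prompt tokens, 2 response tokens, EOS, then 2 tokens after EOS
sequence = [5, 6, 7, 8, 9, EOS, 3, 3]

gen = generation_mask(sequence, EOS)
act = action_mask(sequence, prompt_len=3, eos_token=EOS)

# Positions where the two masks disagree.
mismatch = [i for i, (g, a) in enumerate(zip(gen, act)) if g != a]
print(gen)
print(act)
print(mismatch)
```

Under these assumptions the two masks disagree exactly on the prompt positions: generate keeps them visible, while learn hides them, so the values would be computed from different inputs in the two code paths.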

lucidrains · 2023-01-03T08:53:59Z

@Nightbringers yes you are correct! thank you for catching this! a0b9774

lucidrains closed this as completed Jan 3, 2023
