Draft: Initial draft implementation of CFG for LLM #996

This is a draft implementation of classifier-free guidance (CFG), based on the paper: Sanchez, Guillaume, et al. "Stay on topic with Classifier-Free Guidance." arXiv preprint arXiv:2306.17806 (2023). It is only for sharing internally and might very well be completely wrong. It is debatable whether we should expose such a feature as a flag to the network or make it a separate classifier instance (or a mixin). In the past we were very much against special (potentially short-lived) feature flags, and it was much nicer to have such features implemented as an addon/callback. We might need to do something similar here as well.

- Makes it possible to set the gamma parameter
- Setting it to `None` disables the functionality completely
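
For readers who have not seen the paper, the role of gamma can be sketched as follows. This is a minimal illustration of the guidance formula from Sanchez et al., not the code in this PR: the names `apply_cfg` and `log_softmax` are hypothetical, and plain Python lists stand in for the logit tensors. It assumes the model has already been run twice, once with the prompt (conditional) and once without it (unconditional).

```python
import math
from typing import List, Optional, Sequence


def log_softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable log-softmax over a single vocabulary axis."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum_exp for x in logits]


def apply_cfg(
    cond_logits: Sequence[float],
    uncond_logits: Sequence[float],
    gamma: Optional[float],
) -> List[float]:
    """Combine conditional and unconditional logits (hypothetical helper).

    Follows the formula from the paper:
        guided = uncond + gamma * (cond - uncond)
    computed on log-probabilities. gamma=None disables guidance and
    returns the conditional logits unchanged.
    """
    if gamma is None:
        return list(cond_logits)
    cond = log_softmax(cond_logits)
    uncond = log_softmax(uncond_logits)
    return [u + gamma * (c - u) for c, u in zip(cond, uncond)]


# Example: gamma > 1 pushes the scores further towards the conditional model.
guided = apply_cfg([2.0, 0.5, -1.0], [1.0, 1.0, 1.0], gamma=1.5)
```

With `gamma=1.0` the result equals the conditional log-probabilities, and with `gamma=0.0` it equals the unconditional ones. The result is an unnormalized score vector, so a sampler would normalize it again (for example with a softmax) before drawing a token.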

- `label_id` was misleading, since it is actually a list of token ids related to a label and not a scalar value. Also, the general process of generating logits is not related to labels at all but rather just to tokens.
- `kwargs` was named to be similar to the transformers `generate` convention, but it is meant to be passed to `generate` and is therefore, in the context of `generate_logits`, a model input. This should help the reader distinguish between expected input (`token_ids`) and model input (`model_input`).

Commits on Jul 19, 2023

Commits on Jul 26, 2023
