Draft: Initial draft implementation of CFG for LLM #996

Open: wants to merge 3 commits into base: master

Commits on Jul 19, 2023

  1. Initial draft implementation of CFG for LLM

    Based on the paper
    
            Sanchez, Guillaume, et al.
            "Stay on topic with Classifier-Free Guidance."
            arXiv preprint arXiv:2306.17806 (2023).
    
    this is a draft implementation of classifier-free guidance (CFG).
    
    This is only for sharing internally and might very well be
    completely wrong. It is debatable whether we should expose such
    a feature as a flag on the network or make it a separate
    classifier instance (or a mixin). In the past we were
    very much against special (potentially short-lived) feature
    flags and found it much nicer to implement such features as
    an add-on/callback. We might need to do something similar
    here as well.
    ottonemo committed Jul 19, 2023
    26b7e7b
  2. Use cfg_gamma instead of use_cfg boolean flag

    - Makes it possible to set the gamma parameter
    - Setting it to `None` disables the functionality completely
    ottonemo committed Jul 19, 2023
    1c34aca
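
For readers unfamiliar with the paper, the role of `cfg_gamma` can be sketched as follows. This is a minimal illustration and not the code in this PR: the helper names `log_softmax` and `apply_cfg` are hypothetical, and the update rule is the one given in the cited paper, where the guided log-probabilities are the unconditional ones plus gamma times the difference between conditional and unconditional log-probabilities. Treating `None` as "guidance off" follows the commit message above.

```python
import math


def log_softmax(logits):
    """Turn raw logits into log-probabilities (numerically stable)."""
    m = max(logits)
    log_norm = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_norm for x in logits]


def apply_cfg(cond_logits, uncond_logits, cfg_gamma=None):
    """Hypothetical helper: combine prompt-conditioned and unconditional logits.

    cond_logits   -- next-token logits computed with the full prompt
    uncond_logits -- next-token logits computed without the prompt
    cfg_gamma     -- guidance strength; None disables guidance completely
    """
    cond = log_softmax(cond_logits)
    if cfg_gamma is None:
        # Guidance disabled: fall back to the plain conditional distribution.
        return cond
    uncond = log_softmax(uncond_logits)
    # Move from the unconditional towards the conditional log-probabilities,
    # scaled by gamma. The result is unnormalized for gamma != 1.
    return [u + cfg_gamma * (c - u) for c, u in zip(cond, uncond)]


guided = apply_cfg([2.0, 0.5, -1.0], [0.0, 0.0, 0.0], cfg_gamma=1.5)
```

With `cfg_gamma=1` the result equals the conditional log-probabilities, so only values other than 1 change the ranking or the margins between tokens; values above 1 weight the prompt more strongly.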

Commits on Jul 26, 2023

  1. Rename label_id and kwargs

    - `label_id` was misleading since it is actually a list of token ids
      related to a label, not a scalar value. Also, the general process
      of generating logits is not related to labels at all, only
      to tokens.
    
    - `kwargs` was named after the transformers `generate` convention,
      but since it is meant to be passed on to `generate`, it is, in the
      context of `generate_logits`, a model input. The new names should help
      the reader distinguish between expected input (`token_ids`) and
      model input (`model_input`).
    ottonemo committed Jul 26, 2023
    da062b0