
Encapsulate generation options in a config struct #193

Merged · 2 commits into main · Apr 6, 2023

Conversation

jonatanklosko
Member

Closes #87.

Now whenever we generate text, we also load a generation config (sourced from either generation_config.json or config.json).

# Load the config with model defaults
{:ok, generation_config} = Bumblebee.load_generation_config({:hf, "gpt2"})
# Further customize the options
generation_config = Bumblebee.configure(generation_config, max_new_tokens: 10)

This also adds a generation strategy option, which is a map like %{type: :contrastive_search, top_k: 4, penalty_alpha: 0.6}. Since there are very few strategy-specific options, I think the nesting is fine and we can require all of them. This way we avoid conflicting option names and avoid listing options that are not relevant for the configured strategy.
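For reference, configuring the strategy could look like the sketch below (assuming the option is exposed as :strategy, matching the map shape above):

```elixir
{:ok, generation_config} = Bumblebee.load_generation_config({:hf, "gpt2"})

# A sketch, assuming the option name is :strategy. The whole map is
# required, so strategy-specific options are never left out or mixed
# between strategies.
generation_config =
  Bumblebee.configure(generation_config,
    strategy: %{type: :contrastive_search, top_k: 4, penalty_alpha: 0.6}
  )
```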

Comment on lines +231 to +242
# TODO: Whisper generation_config.json doesn't have task-specific
# tokens and those are instead added on the fly before generation.
# We need to add support for model-specific configuration like
# language and task and update configuration based on that
data =
case data do
%{"forced_decoder_ids" => [[1, nil], [2, 50359]]} ->
put_in(data["forced_decoder_ids"], [[1, 50259], [2, 50359], [3, 50363]])

data ->
data
end
Member Author


In other words, this basically maintains the current behaviour before we add language/task options. These options are Whisper-specific, but we need a way to pass them to the serving.


@alissonfpmorais · Apr 6, 2023


I'm a complete beginner in AI, so I'm not sure if this helps or even whether it is good practice, but I ended up with this approach:

defp serving do
  {:ok, whisper} = Bumblebee.load_model({:hf, "openai/whisper-small"})
  {:ok, featurizer} = Bumblebee.load_featurizer({:hf, "openai/whisper-small"})
  {:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "openai/whisper-small"})

  Bumblebee.Audio.speech_to_text(whisper, featurizer, tokenizer,
    max_new_tokens: 100,
    defn_options: [compiler: EXLA],
    forced_token_ids: [
      {1, 50267}, # %{"languages" => [{50259, en}, {50262, es}, {50267, pt}]}
      {2, 50359}, # %{"modes" => [{50358, translate}, {50359, transcribe}]}
      {3, 50363}  # %{"opts" => [{50363, notimestamps}]}
    ]
  )
end

And here's my first attempt:

defp serving do
  {:ok, whisper} = Bumblebee.load_model({:hf, "openai/whisper-small"})
  {:ok, featurizer} = Bumblebee.load_featurizer({:hf, "openai/whisper-small"})
  {:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "openai/whisper-small"})

  spec = Bumblebee.configure(whisper.spec, forced_token_ids: [{1, 50267}, {2, 50359}, {3, 50363}])
  whisper = %{whisper | spec: spec}

  Bumblebee.Audio.speech_to_text(whisper, featurizer, tokenizer,
    max_new_tokens: 100,
    defn_options: [compiler: EXLA]
  )
end

While testing with Livebook, both worked fine for me.


Maybe something like this could work for this issue:

defp serving do
  {:ok, whisper} = Bumblebee.load_model({:hf, "openai/whisper-small"})
  {:ok, featurizer} = Bumblebee.load_featurizer({:hf, "openai/whisper-small"})
  {:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "openai/whisper-small"})

  Bumblebee.Audio.speech_to_text(whisper, featurizer, tokenizer,
    max_new_tokens: 100,
    defn_options: [compiler: EXLA],
    # translate these options inside `Bumblebee.Audio.speech_to_text`
    lang: :pt,
    mode: :translate,
    timestamps: true
  )
end

What do you think? It would need some standard, since the Bumblebee.Audio module is model-agnostic, but I believe that at least language is an acceptable option. The module could also have an internal mechanism to translate these generic options into model-specific ones.
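For illustration, that translation could be a simple lookup from generic options to Whisper token ids. A hypothetical sketch (the SpeechOptions module is mine, and the hardcoded ids come from the comments in my snippet above):

```elixir
defmodule SpeechOptions do
  # Whisper-specific token ids from the tokenizer vocabulary
  @languages %{en: 50259, es: 50262, pt: 50267}
  @modes %{translate: 50358, transcribe: 50359}
  @notimestamps 50363

  # Translate generic options into forced token ids for generation
  def to_forced_token_ids(opts) do
    lang_id = Map.fetch!(@languages, Keyword.fetch!(opts, :lang))
    mode_id = Map.fetch!(@modes, Keyword.get(opts, :mode, :transcribe))

    ids = [{1, lang_id}, {2, mode_id}]

    if Keyword.get(opts, :timestamps, false) do
      ids
    else
      ids ++ [{3, @notimestamps}]
    end
  end
end

SpeechOptions.to_forced_token_ids(lang: :pt, mode: :transcribe)
#=> [{1, 50267}, {2, 50359}, {3, 50363}]
```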

Member Author


@alissonfpmorais thanks for the feedback! After these changes you could do:

{:ok, generation_config} = Bumblebee.load_generation_config({:hf, "openai/whisper-small"})

generation_config =
  Bumblebee.configure(generation_config, forced_token_ids: [{1, 50267}, {2, 50359}, {3, 50363}])

I think eventually the end-user API could be that the serving accepts additional options, like :lang in this case, and the generation behaviour (model module) handles those options and alters the generation config. However, we may need to reconfigure the tokenizer as well. Perhaps we can also figure out making some of the options (like language) configurable at run time.

Basically, there are a couple of considerations that are not strictly coupled with this change, so I put them in a separate issue :)


Thanks for the quick reply!

Oh I see, I'll update my code after the changes.
Also, making options configurable at runtime would be awesome!

Thanks for the amazing job on the lib :))

@jonatanklosko jonatanklosko mentioned this pull request Apr 6, 2023
3 tasks
@jonatanklosko
Member Author

CI fails because we try to run too many slow tests and run out of resources (likely disk space). Everything passes locally, so I'm merging :)

@jonatanklosko jonatanklosko merged commit 17c6d60 into main Apr 6, 2023
@jonatanklosko jonatanklosko deleted the jk-generation-config branch April 6, 2023 21:47
Successfully merging this pull request may close these issues.

Encapsulate generation config
3 participants