from_pretrained should just work for DecisionGPT2LMHeadModel #2

Open
thejaminator opened this issue Feb 26, 2023 · 0 comments
Labels
good first issue Good for newcomers

thejaminator commented Feb 26, 2023

Currently, to create a model that accepts the scalar reward, you need to do this if you want to use pretrained gpt2:

from transformers import GPT2LMHeadModel
from conditionme import DecisionGPT2LMHeadModel

loaded_model = GPT2LMHeadModel.from_pretrained("gpt2")
decision_model = DecisionGPT2LMHeadModel.from_loaded_pretrained_model(loaded_model)

rather than just

decision_model = DecisionGPT2LMHeadModel.from_pretrained("gpt2")

We should allow users to just pass "gpt2". Under the hood, we'll detect that it's a pretrained gpt2 model.
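One way to support both call styles is to accept either a model name string or an already-loaded model, mirroring how transformers' `from_pretrained` takes a name. This is a hedged, self-contained sketch, not the actual conditionme code: `LoadedModel` and `DecisionModel` are stand-ins for `GPT2LMHeadModel` and `DecisionGPT2LMHeadModel`.

```python
from typing import Union


class LoadedModel:
    """Stand-in for a loaded GPT2LMHeadModel in this illustration."""

    def __init__(self, name: str):
        self.name = name


class DecisionModel:
    """Stand-in for DecisionGPT2LMHeadModel."""

    def __init__(self, base: LoadedModel):
        self.base = base

    @classmethod
    def from_pretrained(cls, model_or_name: Union[str, LoadedModel]) -> "DecisionModel":
        # A plain string is resolved to a loaded model first; the real code
        # would call GPT2LMHeadModel.from_pretrained(model_or_name) here.
        if isinstance(model_or_name, str):
            model_or_name = LoadedModel(model_or_name)
        return cls(model_or_name)
```

With this shape, both `DecisionModel.from_pretrained("gpt2")` and `DecisionModel.from_pretrained(already_loaded)` work, so the existing two-step flow keeps working while the one-liner becomes possible.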

Case 1 -
It's a DecisionGPT2LMHeadModel that got saved.
We should retain the original behavior and load all weights.

Case 2 -
It's a gpt2 pretrained model that doesn't have our embed_return layer.
We'll log that we detected that, and we'll randomly initialize our self.embed_return layer.
Write tests for this behavior :)

Case 3 -
Neither of these scenarios applies.
We should throw an exception here.
Write tests for this behavior :)
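The three cases above can be sketched as a small classification step over the checkpoint's state-dict keys. This is an assumption about how the detection could work, not existing conditionme code: `embed_return` is the layer name from this issue, and the `wte` heuristic (gpt2's token embedding) and the function name `classify_checkpoint` are illustrative only.

```python
import logging
from typing import Any, Dict

logger = logging.getLogger(__name__)


def classify_checkpoint(state_dict: Dict[str, Any]) -> str:
    """Decide which of the three cases a loaded state dict falls into.

    Returns "decision" (case 1) or "plain_gpt2" (case 2);
    raises ValueError for case 3.
    """
    keys = set(state_dict)
    has_embed_return = any(k.startswith("embed_return") for k in keys)
    looks_like_gpt2 = any("wte" in k for k in keys)  # gpt2 token embedding

    if has_embed_return:
        # Case 1: a saved DecisionGPT2LMHeadModel -- load all weights as-is.
        return "decision"
    if looks_like_gpt2:
        # Case 2: plain pretrained gpt2 -- log it; embed_return gets
        # randomly initialized by the caller.
        logger.info("Detected a plain gpt2 checkpoint; embed_return will be "
                    "randomly initialized.")
        return "plain_gpt2"
    # Case 3: neither -- fail loudly rather than silently loading garbage.
    raise ValueError("Checkpoint is neither a DecisionGPT2LMHeadModel "
                     "nor a pretrained gpt2 model.")
```

Tests for cases 2 and 3 (as the issue asks) then reduce to feeding this function state dicts with and without the expected keys.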

@thejaminator thejaminator added the good first issue Good for newcomers label Feb 26, 2023