from_tensor_slices() method not working with GPT-2 model in gpt2_causal_lm.py example #849

@Neeshamraghav012

Description

Hi team,

I was experimenting with the GPT-2 causal LM and started by running the examples given in the documentation (link to documentation).
However, this example did not work:

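import keras_nlp
import tensorflow as tf
from tensorflow import keras

features = [
    "I don't listen to music while coding.",
    "But I watch youtube while coding!",
]

# Wrap the raw strings in a tf.data.Dataset (one string per element).
ds = tf.data.Dataset.from_tensor_slices(features)

gpt2_lm = keras_nlp.models.GPT2CausalLM.from_preset(
    "gpt2_base_en",
)
gpt2_lm.compile(
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
)
# This call fails: the dataset is passed together with batch_size.
gpt2_lm.fit(ds, batch_size=2)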

Output: [image: CI]

But when I passed the features directly, it worked. Here is the working code:

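import keras_nlp
import tensorflow as tf
from tensorflow import keras

features = [
    "I don't listen to music while coding.",
    "But I watch youtube while coding!",
]

# Created as before, but not passed to fit() below.
ds = tf.data.Dataset.from_tensor_slices(features)

gpt2_lm = keras_nlp.models.GPT2CausalLM.from_preset(
    "gpt2_base_en",
)
gpt2_lm.compile(
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
)
# This call works: the raw list of strings is passed with batch_size.
gpt2_lm.fit(features, batch_size=2)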

It seems that a dataset built with from_tensor_slices() does not work with the GPT-2 model in this example. I wanted to report this and bring it to the attention of the KerasNLP community. I am very interested in helping to fix bugs like this in the existing pre-trained models.
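One difference between the two calls is that the list is batched by fit() via batch_size, while a dataset from from_tensor_slices() yields one string per element and is not batched. The sketch below is a plain-Python stand-in for that difference, not TensorFlow code: the helpers `slices` and `batched` are hypothetical names introduced only for illustration. Whether this difference explains the failure above is an assumption, since the error output is not reproduced here.

```python
from itertools import islice


def slices(items):
    """Yield one item at a time, like an unbatched dataset (hypothetical helper)."""
    for item in items:
        yield item


def batched(elements, batch_size):
    """Group consecutive elements into lists of up to batch_size (hypothetical helper)."""
    iterator = iter(elements)
    while True:
        chunk = list(islice(iterator, batch_size))
        if not chunk:
            return
        yield chunk


features = [
    "I don't listen to music while coding.",
    "But I watch youtube while coding!",
]

# Unbatched: two elements, each a single string.
unbatched = list(slices(features))

# Batched with size 2: one element holding both strings.
batches = list(batched(slices(features), 2))

print(len(unbatched), len(batches))
```

In tf.data the corresponding step is `ds.batch(2)`. Keras `fit()` expects dataset inputs to be batched already and does not take `batch_size` for them, so a variant worth trying is `gpt2_lm.fit(ds.batch(2))`.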

Metadata

Labels

type:Bug (Something isn't working)
