Merged

30 commits
704ccec
Update f_net_backbone.py
ADITYADAS1999 Mar 19, 2023
4052381
Update f_net_classifier.py
ADITYADAS1999 Mar 19, 2023
ee496e4
Update f_net_masked_lm.py
ADITYADAS1999 Mar 19, 2023
c6942d8
Update f_net_masked_lm_preprocessor.py
ADITYADAS1999 Mar 19, 2023
bbc1794
Update f_net_preprocessor.py
ADITYADAS1999 Mar 19, 2023
1443679
Update f_net_tokenizer.py
ADITYADAS1999 Mar 19, 2023
a85e7db
Update code format f_net_backbone
ADITYADAS1999 Mar 20, 2023
41c1dc0
Update code format f_net_classifier
ADITYADAS1999 Mar 20, 2023
a405f32
Update code format f_net_masked_lm_preprocessor
ADITYADAS1999 Mar 20, 2023
7139387
Update code format f_net_preprocessor
ADITYADAS1999 Mar 20, 2023
9c56a28
Update code format f_net_tokenizer
ADITYADAS1999 Mar 20, 2023
0fe924d
Add some necessary changes
ADITYADAS1999 Mar 20, 2023
728012f
Add some necessary changes
ADITYADAS1999 Mar 20, 2023
1e1f2d0
Update f_net_classifier.py
ADITYADAS1999 Mar 20, 2023
1fefb8e
Add newline before this heading
ADITYADAS1999 Mar 21, 2023
d873c30
Add newline before this heading
ADITYADAS1999 Mar 21, 2023
b4743f5
minor fixes
ADITYADAS1999 Mar 21, 2023
bc613cc
minor fixes
ADITYADAS1999 Mar 22, 2023
ed9bdc6
Merge branch 'keras-team:master' into my_third_branch
ADITYADAS1999 Mar 22, 2023
f37e7e8
Merge branch 'keras-team:master' into my_third_branch
ADITYADAS1999 Mar 28, 2023
3aa7fdb
Fixes the wrong presets
ADITYADAS1999 Mar 28, 2023
3e05cf8
minor fixes
ADITYADAS1999 Mar 28, 2023
facbbc4
Remove custom vocab example
ADITYADAS1999 Mar 28, 2023
fdd7934
Merge branch 'keras-team:master' into my_third_branch
ADITYADAS1999 Mar 31, 2023
0f2cf90
update code format
ADITYADAS1999 Mar 31, 2023
a53de58
update code format
ADITYADAS1999 Mar 31, 2023
5fe0282
Merge branch 'keras-team:master' into my_third_branch
ADITYADAS1999 Apr 5, 2023
00dd693
update wrong presets
ADITYADAS1999 Apr 5, 2023
438c41d
Remove custom vocab
ADITYADAS1999 Apr 6, 2023
0686945
Fixes
mattdangerw Apr 12, 2023
24 changes: 14 additions & 10 deletions keras_nlp/models/f_net/f_net_backbone.py
@@ -37,16 +37,16 @@ def f_net_bias_initializer(stddev=0.02):

@keras_nlp_export("keras_nlp.models.FNetBackbone")
class FNetBackbone(Backbone):
"""FNet encoder network.
"""A FNet encoder network.

This class implements a bi-directional Fourier Transform-based encoder as
described in ["FNet: Mixing Tokens with Fourier Transforms"](https://arxiv.org/abs/2105.03824).
It includes the embedding lookups and `keras_nlp.layers.FNetEncoder` layers,
but not the masked language model or next sentence prediction heads.

The default constructor gives a fully customizable, randomly initialized FNet
encoder with any number of layers and embedding dimensions. To load
preset architectures and weights, use the `from_preset` constructor.
The default constructor gives a fully customizable, randomly initialized
FNet encoder with any number of layers and embedding dimensions. To
load preset architectures and weights, use the `from_preset()` constructor.

Note: unlike other models, FNet does not take in a `"padding_mask"` input,
the `"<pad>"` token is handled equivalently to all other tokens in the input
@@ -78,15 +78,19 @@ class FNetBackbone(Backbone):
),
}

# Randomly initialized FNet encoder with a custom config
# Pretrained FNet encoder.
model = keras_nlp.models.FNetBackbone.from_preset("f_net_base_en")
model(input_data)

# Randomly initialized FNet encoder with a custom config.
model = keras_nlp.models.FNetBackbone(
vocabulary_size=32000,
num_layers=12,
hidden_dim=768,
intermediate_dim=3072,
max_sequence_length=12,
num_layers=4,
hidden_dim=256,
intermediate_dim=512,
max_sequence_length=128,
)
output = model(input_data)
model(input_data)
```
"""

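For context, the note above about the missing `"padding_mask"` input can be made concrete. Below is a minimal sketch of a full backbone call, assuming the constructor arguments shown in this diff; it is an illustration, not part of the change itself.

```python
import tensorflow as tf
import keras_nlp

# Small, randomly initialized FNet backbone (sizes are illustrative).
backbone = keras_nlp.models.FNetBackbone(
    vocabulary_size=32000,
    num_layers=4,
    hidden_dim=256,
    intermediate_dim=512,
    max_sequence_length=128,
)

# Note: no "padding_mask" key. FNet treats "<pad>" like any other token.
input_data = {
    "token_ids": tf.ones(shape=(1, 12), dtype="int64"),
    "segment_ids": tf.constant([[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0]]),
}
outputs = backbone(input_data)
```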
63 changes: 36 additions & 27 deletions keras_nlp/models/f_net/f_net_classifier.py
@@ -33,9 +33,9 @@ class FNetClassifier(Task):
"""An end-to-end f_net model for classification tasks.

This model attaches a classification head to a
`keras_nlp.model.FNetBackbone` model, mapping from the backbone
outputs to logit output suitable for a classification task. For usage of
this model with pre-trained weights, see the `from_preset()` method.
`keras_nlp.model.FNetBackbone` instance, mapping from the backbone outputs
to logits suitable for a classification task. For usage of this model with
pre-trained weights, use the `from_preset()` constructor.

This model can optionally be configured with a `preprocessor` layer, in
which case it will automatically apply preprocessing to raw inputs during
@@ -55,41 +55,50 @@ class FNetClassifier(Task):
`None`, this model will not apply preprocessing, and inputs should
be preprocessed before calling the model.

Example usage:
Member: This looks like it is missing most of the content on the BERT classifier, may be worth another look.

Examples:

Raw string data.
```python
features = ["The quick brown fox jumped.", "I forgot my homework."]
labels = [0, 3]

# Pretrained classifier.
classifier = keras_nlp.models.FNetClassifier.from_preset(
"f_net_base_en",
num_classes=4,
)
classifier.fit(x=features, y=labels, batch_size=2)
classifier.predict(x=features, batch_size=2)

# Re-compile (e.g., with a new learning rate).
classifier.compile(
loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
optimizer=keras.optimizers.Adam(5e-5),
jit_compile=True,
)
# Access backbone programmatically (e.g., to change `trainable`).
classifier.backbone.trainable = False
# Fit again.
classifier.fit(x=features, y=labels, batch_size=2)
```

Preprocessed integer data.
```python
preprocessed_features = {
features = {
"token_ids": tf.ones(shape=(2, 12), dtype=tf.int64),
"segment_ids": tf.constant(
[[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0]] * 2, shape=(2, 12)
),
"padding_mask": tf.constant(
[[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0]] * 2, shape=(2, 12)
),
}
labels = [0, 3]

# Randomly initialize a FNet backbone.
backbone = keras_nlp.models.FNetBackbone(
vocabulary_size=32000,
num_layers=12,
hidden_dim=768,
intermediate_dim=3072,
max_sequence_length=12,
)

# Create a FNet classifier and fit your data.
classifier = keras_nlp.models.FNetClassifier(
backbone,
# Pretrained classifier without preprocessing.
classifier = keras_nlp.models.FNetClassifier.from_preset(
"f_net_base_en",
num_classes=4,
preprocessor=None,
)
classifier.compile(
loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
)
classifier.fit(x=preprocessed_features, y=labels, batch_size=2)

# Access backbone programmatically (e.g., to change `trainable`).
classifier.backbone.trainable = False
classifier.fit(x=features, y=labels, batch_size=2)
```
"""

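Since the classifier maps backbone outputs to logits (hence `from_logits=True` when compiling above), a short sketch of turning predictions into class ids may help. This is not part of the diff and assumes the same `"f_net_base_en"` preset.

```python
import tensorflow as tf
import keras_nlp

features = ["The quick brown fox jumped.", "I forgot my homework."]

classifier = keras_nlp.models.FNetClassifier.from_preset(
    "f_net_base_en",
    num_classes=4,
)
# predict() returns raw logits of shape (batch_size, num_classes).
logits = classifier.predict(features, batch_size=2)
# Argmax over the class axis gives the predicted label ids.
predicted_labels = tf.argmax(logits, axis=-1)
```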
39 changes: 17 additions & 22 deletions keras_nlp/models/f_net/f_net_masked_lm.py
@@ -35,7 +35,7 @@ class FNetMaskedLM(Task):
This model will train FNet on a masked language modeling task.
The model will predict labels for a number of masked tokens in the
input data. For usage of this model with pre-trained weights, see the
`from_preset()` method.
`from_preset()` constructor.

This model can optionally be configured with a `preprocessor` layer, in
which case inputs can be raw string features during `fit()`, `predict()`,
@@ -54,26 +54,33 @@ class FNetMaskedLM(Task):

Example usage:

Raw string inputs and pretrained backbone.
Raw string data.
```python
# Create a dataset with raw string features. Labels are inferred.

features = ["The quick brown fox jumped.", "I forgot my homework."]

# Create a FNetMaskedLM with a pretrained backbone and further train
# on an MLM task.
# Pretrained language model.
masked_lm = keras_nlp.models.FNetMaskedLM.from_preset(
"f_net_base_en",
)
masked_lm.fit(x=features, batch_size=2)

# Re-compile (e.g., with a new learning rate).
masked_lm.compile(
loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
optimizer=keras.optimizers.Adam(5e-5),
jit_compile=True,
)
# Access backbone programmatically (e.g., to change `trainable`).
masked_lm.backbone.trainable = False
# Fit again.
masked_lm.fit(x=features, batch_size=2)
```

Preprocessed inputs and custom backbone.
Preprocessed integer data.
```python
# Create a preprocessed dataset where 0 is the mask token.
preprocessed_features = {
features = {
"token_ids": tf.constant(
[[1, 2, 0, 4, 0, 6, 7, 8]] * 2, shape=(2, 8)
),
@@ -85,23 +92,11 @@ class FNetMaskedLM(Task):
# Labels are the original masked values.
labels = [[3, 5]] * 2

# Randomly initialize a FNet encoder
backbone = keras_nlp.models.FNetBackbone(
vocabulary_size=50265,
num_layers=12,
hidden_dim=768,
intermediate_dim=3072,
max_sequence_length=12
)
# Create a FNet masked_lm and fit the data.
masked_lm = keras_nlp.models.FNetMaskedLM(
backbone,
masked_lm = keras_nlp.models.FNetMaskedLM.from_preset(
"f_net_base_en",
preprocessor=None,
)
masked_lm.compile(
loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
)
masked_lm.fit(x=preprocessed_features, y=labels, batch_size=2)
masked_lm.fit(x=features, y=labels, batch_size=2)
```
"""

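The diff collapses part of the preprocessed features dict above. As a sketch of the full input contract: the `"segment_ids"` and `"mask_positions"` entries below are assumptions inferred from the backbone's inputs and the `[[3, 5]]` labels, not lines confirmed by this diff.

```python
import tensorflow as tf
import keras_nlp

# 0 stands in for the mask token id in this illustration.
features = {
    "token_ids": tf.constant([[1, 2, 0, 4, 0, 6, 7, 8]] * 2, shape=(2, 8)),
    "segment_ids": tf.constant([[0, 0, 0, 0, 0, 0, 0, 0]] * 2, shape=(2, 8)),
    # Assumed key: positions 2 and 4 are the masked slots.
    "mask_positions": tf.constant([[2, 4]] * 2, shape=(2, 2)),
}
# Labels are the original ids of the masked tokens.
labels = [[3, 5]] * 2

masked_lm = keras_nlp.models.FNetMaskedLM.from_preset(
    "f_net_base_en",
    preprocessor=None,
)
masked_lm.fit(x=features, y=labels, batch_size=2)
```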
67 changes: 33 additions & 34 deletions keras_nlp/models/f_net/f_net_masked_lm_preprocessor.py
@@ -28,14 +28,14 @@ class FNetMaskedLMPreprocessor(FNetPreprocessor):
`keras_nlp.models.FNetMaskedLM` task model. Preprocessing will occur in
multiple steps.

- Tokenize any number of input segments using the `tokenizer`.
- Pack the inputs together with the appropriate `"<s>"`, `"</s>"` and
1. Tokenize any number of input segments using the `tokenizer`.
2. Pack the inputs together with the appropriate `"<s>"`, `"</s>"` and
`"<pad>"` tokens, i.e., adding a single `"<s>"` at the start of the
entire sequence, `"</s></s>"` between each segment,
and a `"</s>"` at the end of the entire sequence.
- Randomly select non-special tokens to mask, controlled by
3. Randomly select non-special tokens to mask, controlled by
`mask_selection_rate`.
- Construct a `(x, y, sample_weight)` tuple suitable for training with a
4. Construct a `(x, y, sample_weight)` tuple suitable for training with a
`keras_nlp.models.FNetMaskedLM` task model.

Args:
@@ -66,54 +66,53 @@ class FNetMaskedLMPreprocessor(FNetPreprocessor):
out of budget. It supports an arbitrary number of segments.

Examples:

Directly calling the layer on data.
```python
# Load the preprocessor from a preset.
preprocessor = keras_nlp.models.FNetMaskedLMPreprocessor.from_preset(
"f_net_base_en"
)

# Tokenize and mask a single sentence.
sentence = tf.constant("The quick brown fox jumped.")
preprocessor(sentence)
preprocessor("The quick brown fox jumped.")

# Tokenize and mask a batch of sentences.
sentences = tf.constant(
["The quick brown fox jumped.", "Call me Ishmael."]
)
preprocessor(sentences)
# Tokenize and mask a batch of single sentences.
preprocessor(["The quick brown fox jumped.", "Call me Ishmael."])

# Tokenize and mask a dataset of sentences.
features = tf.constant(
["The quick brown fox jumped.", "Call me Ishmael."]
# Tokenize and mask sentence pairs.
# In this case, always convert input to tensors before calling the layer.
first = tf.constant(["The quick brown fox jumped.", "Call me Ishmael."])
second = tf.constant(["The fox tripped.", "Oh look, a whale."])
preprocessor((first, second))
```

Mapping with `tf.data.Dataset`.
Member: newline before this heading

```python
preprocessor = keras_nlp.models.FNetMaskedLMPreprocessor.from_preset(
"f_net_base_en"
)
ds = tf.data.Dataset.from_tensor_slices((features))

first = tf.constant(["The quick brown fox jumped.", "Call me Ishmael."])
second = tf.constant(["The fox tripped.", "Oh look, a whale."])

# Map single sentences.
ds = tf.data.Dataset.from_tensor_slices(first)
ds = ds.map(preprocessor, num_parallel_calls=tf.data.AUTOTUNE)

# Alternatively, you can create a preprocessor from your own vocabulary.
vocab_data = tf.data.Dataset.from_tensor_slices(
["the quick brown fox", "the earth is round"]
)

# Creating sentencepiece tokenizer for FNet LM preprocessor
bytes_io = io.BytesIO()
sentencepiece.SentencePieceTrainer.train(
sentence_iterator=vocab_data.as_numpy_iterator(),
model_writer=bytes_io,
vocab_size=12,
model_type="WORD",
pad_id=0,
bos_id=1,
eos_id=2,
unk_id=3,
pad_piece="<pad>",
unk_piece="<unk>",
bos_piece="[CLS]",
eos_piece="[SEP]",
user_defined_symbols="[MASK]",
# Map sentence pairs.
ds = tf.data.Dataset.from_tensor_slices((first, second))
# Watch out for tf.data's default unpacking of tuples here!
Contributor: Not by this PR - I think it is worth calling out that `first` and `second` will be concatenated when calling the preprocessor this way. Right now the comment just says "watch out" without showing the output. Maybe we can add "sentence pairs are automatically packed before tokenization"? @mattdangerw thoughts on this?

Member: Ah, that is not quite the issue here.

The fact that the outputs are concatenated is not that surprising. The fact that tf.data handles tuples specially is! Basically, if you just called `ds = ds.map(preprocessor)` here, you would see your second input being passed as a label and not a feature. It's an annoying gotcha, but not ours to solve, I think.

It stems from the fact that these two calls are handled differently:

tf.data.Dataset.from_tensor_slices([[1, 2, 3], [1, 2, 3]]).map(lambda x: x)  # OK
tf.data.Dataset.from_tensor_slices(([1, 2, 3], [1, 2, 3])).map(lambda x: x)  # ERROR

We can update this comment if we want, but I would not do it on this PR. I would do it in a separate PR, for all the models at once (so we don't forget to update this elsewhere).

Contributor Author: I believe this should be fine if we open a separate PR for all the models at once.

Member: Yeah, the comment above was meant as an explainer. Let's stick to the language we have been using in other PRs verbatim for this PR.

# Best to invoke the `preprocessor` directly in this case.
ds = ds.map(
lambda first, second: preprocessor(x=(first, second)),
num_parallel_calls=tf.data.AUTOTUNE,
)
proto = bytes_io.getvalue()
tokenizer = keras_nlp.models.FNetTokenizer(proto=proto)
preprocessor = keras_nlp.models.FNetMaskedLMPreprocessor(tokenizer=tokenizer)
```
"""

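Step 4 of the preprocessing description says the layer builds an `(x, y, sample_weight)` tuple; a sketch of inspecting it directly follows. The exact keys of `x` are an assumption consistent with the masked LM task model's inputs, not confirmed by this diff.

```python
import keras_nlp

preprocessor = keras_nlp.models.FNetMaskedLMPreprocessor.from_preset(
    "f_net_base_en"
)

# The layer returns an (x, y, sample_weight) tuple ready for fit():
# x: dict with "token_ids", "segment_ids", "mask_positions" (assumed keys)
# y: the original ids of the masked tokens
# sample_weight: 1 for real mask slots, 0 for padded mask slots
x, y, sample_weight = preprocessor("The quick brown fox jumped.")
print(sorted(x.keys()))
```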