
Conversation

Cyber-Machine
Contributor

Partially fixes: #867 (Rework model docstrings for progressive disclosure of complexity)

  • Make sure to update any "custom vocabulary" examples to match the model's actual vocabulary type and special token requirements (these vary per model).
  • Test out all docstring snippets!
    Gist of all docstring snippets.
  • Make sure to follow our code style guidelines regarding indentation, etc.

Member

@mattdangerw mattdangerw left a comment


This looks great! I see a few spots to fix, but can do that as I merge this. Thanks

replaced with a random token from the vocabulary. A selected token
will be left as is with probability
`1 - mask_token_rate - random_token_rate`.
Call arguments:
Member


add newline
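
For context, the selection behavior documented in the hunk above can be exercised directly with the masking layer. A minimal sketch, assuming the `keras_nlp.layers.MaskedLMMaskGenerator` API; the vocabulary size, token ids, and rates here are illustrative:

```python
import keras_nlp

# Each token is selected for masking with probability `mask_selection_rate`.
# A selected token becomes the mask token with probability `mask_token_rate`,
# a random vocabulary token with probability `random_token_rate`, and is left
# as is with probability `1 - mask_token_rate - random_token_rate`.
masker = keras_nlp.layers.MaskedLMMaskGenerator(
    vocabulary_size=100,
    mask_selection_rate=0.5,
    mask_token_id=0,
    mask_token_rate=0.8,
    random_token_rate=0.1,
)
outputs = masker([[11, 12, 13, 14, 15]])
```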

left-to-right manner and fills up the buckets until we run
out of budget. It supports an arbitrary number of segments.
Call arguments:
Member


decrease indent
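
For reference, the left-to-right bucket filling described in the hunk above is the "waterfall" truncation strategy. A minimal sketch, assuming the `keras_nlp.layers.MultiSegmentPacker` API; the sequence length, token ids, and special-token values are placeholders:

```python
import keras_nlp

# With `truncate="waterfall"`, the budget left after special tokens is spent
# on the first segment first; later segments absorb any truncation.
packer = keras_nlp.layers.MultiSegmentPacker(
    sequence_length=8,
    start_value=101,  # e.g. the [CLS] id
    end_value=102,    # e.g. the [SEP] id
    truncate="waterfall",
)
token_ids, segment_ids = packer((
    [1, 2, 3, 4, 5],  # kept in full if it fits within the budget
    [6, 7, 8, 9],     # truncated to whatever budget remains
))
```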

# Load the preprocessor from a preset.
preprocessor = keras_nlp.models.DistilBertPreprocessor.from_preset("distil_bert_base_en_uncased")
preprocessor = keras_nlp.models.DistilBertPreprocessor.from_preset(
"distil_bert_base_en_uncased"
)
Member


indent
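
Once the continuation line is indented and the call closed, the split form behaves exactly like the one-liner. A short usage sketch; the input sentence is illustrative:

```python
import keras_nlp

preprocessor = keras_nlp.models.DistilBertPreprocessor.from_preset(
    "distil_bert_base_en_uncased"
)
# Maps a raw string to the dict of dense tensors the backbone expects
# (token ids plus a padding mask).
features = preprocessor("The quick brown fox jumped.")
```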

Example usage:
Raw string inputs and pretrained backbone.
Raw string data.
Member


This still needs some updates to match the new style.
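
For reference, the "raw string inputs and pretrained backbone" pattern the docstring is moving toward looks roughly like this. A sketch assuming the `DistilBertClassifier.from_preset` API; `num_classes`, the features, and the labels are made up for illustration:

```python
import keras_nlp

features = ["The quick brown fox jumped.", "I forgot my homework."]
labels = [0, 1]

# The preset bundles a matching preprocessor, so raw strings can be
# passed to `fit()` directly.
classifier = keras_nlp.models.DistilBertClassifier.from_preset(
    "distil_bert_base_en_uncased",
    num_classes=2,
)
classifier.fit(x=features, y=labels, batch_size=2)
```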

@mattdangerw mattdangerw merged commit 620d86e into keras-team:master Mar 21, 2023
kanpuriyanawab pushed a commit to kanpuriyanawab/keras-nlp that referenced this pull request Mar 26, 2023 (keras-team#881)

* Reworked distil_bert docstrings.

* Fixed Typos.

* Fixed typo in DistilBERT MaskedLM Preprocessor

* Updated distil_bert_classifier.py

* Added DistilBertPreprocessor to docs.

* Formatted using black.

* A few edits

* Another fix

---------

Co-authored-by: Matt Watson <mattdangerw@gmail.com>