Conversation

@abheesht17 abheesht17 commented Mar 10, 2023

Very rudimentary guide (😭) for adding backbone models

Resolves #505

@abheesht17 abheesht17 marked this pull request as ready for review March 10, 2023 12:26
@mattdangerw mattdangerw left a comment

I think it might help if we start with a clear list of steps with checklists somewhere near the top. Something like

PR 1: Backbone & Colab

  • A xx/xx_backbone.py file in the style outlined below. [Example link].
  • A xx/xx_backbone_test.py file with a full set of unit tests for the backbone. [Example link].
  • In the PR description, a colab demonstrating we can use the backbone to match outputs with a known source. [Example link].

PR 2: Tokenizer & Colab

  • ... (similar to above)

PR 3: Presets & Conversion Script

  • A xx/xx_presets.py file with links to weights uploaded to a personal GCP bucket.
  • A xx/xx_presets_test.py file with runnable tests using each preset.
  • A tools/checkpoint_conversion/xx... reusable script for converting checkpoints.

PRs 4+: Preprocessors, Tasks, etc.

  • todo

Something like this at the top could be a good chance to really spell out specifically what we are looking for and make sure we get back quality PRs.


In this guide, we will walk you through the steps one needs to take in order to contribute
a backbone model. For illustration purposes, let's assume that you want to
contribute the DistilBERT model. Before we dive in, we encourage you to go through

mattdangerw (Member):

maybe say to read both the getting started guide and the CONTRIBUTING.md guide first, and link to both

@@ -0,0 +1,204 @@
# Model Contribution Guide

mattdangerw (Member):

Let's move this into the base of the repo; we don't really want to distribute random markdown with our package.


A model is typically split into three or four sections. We recommend you compare this side by side with the
[`keras_nlp.layers.DistilBERTBackbone` source code](https://github.com/keras-team/keras-nlp/blob/v0.4.1/keras_nlp/models/distil_bert/distil_bert_backbone.py#L108-L114)!

mattdangerw (Member):

maybe just link to the file on the master branch with no line linkages (we want to match the latest style, not the style at the 0.4.1 release!)


abheesht17 (Collaborator, Author):

Cool, made changes. However, in certain cases, I've kept the line linkages. I guess the DistilBERT code in the master branch is fairly stable.

<br/>

The standard layers provided in Keras and KerasNLP are generally enough for
99% of the use cases and it is recommended to do a thorough search

mattdangerw (Member):

99% -> most. I think sadly 99 would be a little overblown, new models love to mess with random bits of the arch.

If the model introduces a paradigm shift, such as using relative attention instead
of vanilla attention, the contributor will have to implement complete custom layers. A case
in point is `keras_nlp.models.DebertaV3Backbone` where we had to [implement layers
from scratch](https://github.com/keras-team/keras-nlp/tree/v0.4.1/keras_nlp/models/distil_bert).

mattdangerw (Member):

did you mean to link to deberta here?

3) "Add `DistilBertBackbone` Presets"


### Open an issue/Claim an open issue

mattdangerw (Member):

Claim -> find

Given how many people we have looking for issues, we should discourage a "stake your claim" vibe :)


abheesht17 (Collaborator, Author):

:P

us know that you're interested in working on it. This helps us keep track of who
is working on what and avoid duplicated effort.

If you open a new issue, please follow this template:

mattdangerw (Member):

Might cut this template below for now.


abheesht17 (Collaborator, Author):

As in, remove?


abheesht17 (Collaborator, Author):

Or maybe we can just call it an example instead of a template.

"A sample issue is given below: "

you can furnish as much detail as possible to enable us to help you with the
contribution! 🙂

Occasionally, if the model requires, say a variant of the original self-attention,

mattdangerw (Member):

This feels kind of repetitive with what is above, I wonder if we can restructure the flow a bit.


abheesht17 (Collaborator, Author):

Shifted it to the section above.

@abheesht17 abheesht17 requested a review from mattdangerw March 16, 2023 07:28
@mattdangerw mattdangerw left a comment


Ok, one more overall structure idea. What if the checklist and the detailed guide below had the same structure? Like this...

Checklist

Step 1: Find an issue.

  • Find or open an issue on KerasNLP for work on a new model architecture.

Step 2: Add a backbone.

  • A xx/xx_backbone.py file...
  • Other bullets...

Detailed instructions

Step 1: Find an issue.

Your description below about finding an issue.

Step 2: Add a backbone.

Your discussion below of backbone.

etc.

In this guide, we will walk you through the steps one needs to take in order to contribute
a backbone model. For illustration purposes, let's assume that you want to
contribute the DistilBERT model. Before we dive in, we encourage you to go through
[our guide](https://keras.io/guides/keras_nlp/getting_started/) on getting started

mattdangerw (Member):

[our getting started guide](...) for an introduction to the library, and our [contribution guide](...) for an overview of our contribution process.

cut last sentence

- [ ] A `xx/xx_backbone_test.py` file which has unit tests for the backbone [[Example Link](https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/distil_bert/distil_bert_backbone_test.py)].
- [ ] A Colab notebook link in the PR description which matches the outputs of the implemented backbone model with the original source [[Example Link](https://colab.research.google.com/drive/1SeZWJorKWmwWJax8ORSdxKrxE25BfhHa?usp=sharing)].

### PR #2: Add XXTokenizer and XXPreprocessor

mattdangerw (Member):

Often our preprocessors seem to come with a lot of extra discussion (as they are higher level), and for gpt, bart, t5 and whisper we have started with just the tokenizers. Maybe we should just formalize this in our process? And recommend just the tokenizer here.


abheesht17 (Collaborator, Author):

Cool, added a small note.

### PR #3: Add XX Presets
- [ ] A `xx/xx_presets.py` file with links to weights uploaded to a personal GCP bucket/Google Drive [[Example Link](https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/distil_bert/distil_bert_presets.py)].
- [ ] A `xx/xx_presets_test.py` file with runnable tests for each preset [[Example Link](https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/distil_bert/distil_bert_presets_test.py)].
- [ ] A `tools/checkpoint_conversion/convert_xx_checkpoints.py` which is a reusable script for converting checkpoints [[Example Link](https://github.com/keras-team/keras-nlp/blob/master/tools/checkpoint_conversion/convert_distilbert_checkpoints.py)].

mattdangerw (Member):

I would also recommend asking for a colab showing an end-to-end task here. E.g. write a classifier for distilbert using the low level backbone and tokenizer (as a preview for step 4 and validation for our presets).


abheesht17 (Collaborator, Author):

Do we have an example Colab for this?



abheesht17 (Collaborator, Author):

Thanks!

- [ ] A `xx/xx_presets_test.py` file with runnable tests for each preset [[Example Link](https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/distil_bert/distil_bert_presets_test.py)].
- [ ] A `tools/checkpoint_conversion/convert_xx_checkpoints.py` which is a reusable script for converting checkpoints [[Example Link](https://github.com/keras-team/keras-nlp/blob/master/tools/checkpoint_conversion/convert_distilbert_checkpoints.py)].

### (Optional) PR #4: Add XX Tasks

mattdangerw (Member):

I would write this as PR #4 and beyond: Add Tasks and Preprocessors.

This PR is optional and is for adding tasks like classifier, masked LM, etc. [[Example Link](https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/distil_bert/distil_bert_classifier.py)]

## How is a Backbone model structured in KerasNLP?
To keep the code simple and readable, we follow

mattdangerw (Member):

Generally keep one empty newline after headings.


abheesht17 (Collaborator, Author):

Yeah, I don't do that because I leave lines before headings...and then, it gets confusing if I leave lines before and after headings :P


mattdangerw (Member):

I think the linter I have dislikes this, which is probably this -> https://github.com/markdownlint/markdownlint

But doing a single space before and after would also fit with other docs in the library, e.g. https://github.com/keras-team/keras-nlp/blob/master/STYLE_GUIDE.md

Can we switch?


abheesht17 (Collaborator, Author):

Gotcha, done!


A model is typically split into three or four sections. We recommend you compare this side by side with the
[`keras_nlp.layers.DistilBERTBackbone` source code](https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/distil_bert/distil_bert_backbone.py)!

mattdangerw (Member):

DistilBertBackbone (fix case)

### Inputs to the model

Generally, the standard inputs to any text model are:
- `token_ids`: tokenised inputs (IDs in the form of text).

mattdangerw (Member):

"IDs in the form of text" is a little confusing. "An integer representation of the text sequence"?


abheesht17 (Collaborator, Author):

Wow, I have no clue why I wrote that. That's a disgusting description. Maybe, I was sleepy, lol
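To make "an integer representation of the text sequence" concrete, here is a toy sketch of how raw text becomes `token_ids` plus a `padding_mask`. The vocabulary is made up for illustration; real backbones use a learned subword vocabulary and the library's tokenizer layers.

```python
# Toy example of producing `token_ids` and a `padding_mask`.
# The word-level vocab below is hypothetical; real models use subword vocabs.
vocab = {"[PAD]": 0, "the": 1, "cat": 2, "sat": 3}
sequence_length = 6

def encode(text):
    ids = [vocab[word] for word in text.split()]
    padding = sequence_length - len(ids)
    token_ids = ids + [vocab["[PAD]"]] * padding
    # 1 marks real tokens, 0 marks padding positions.
    padding_mask = [1] * len(ids) + [0] * padding
    return token_ids, padding_mask

token_ids, padding_mask = encode("the cat sat")
print(token_ids)     # [1, 2, 3, 0, 0, 0]
print(padding_mask)  # [1, 1, 1, 0, 0, 0]
```

The backbone consumes both inputs, using the mask so attention ignores the padded positions.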

@abheesht17 abheesht17 requested a review from mattdangerw March 22, 2023 09:10
@mattdangerw mattdangerw self-assigned this Mar 22, 2023
@mattdangerw mattdangerw left a comment


Still should probably do a closer, careful read for typos, but here are some quick comments! Overall looking good.

@@ -0,0 +1,218 @@
# Model Contribution Guide

mattdangerw (Member):

silly nit, but we have CONTRIBUTING.md, should we make this CONTRIBUTING_MODELS.md or something to keep the filename parallel?


This PR is optional and is for adding tasks like classifier, masked LM, etc. [[Example Link](https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/distil_bert/distil_bert_classifier.py)]


## How is a Backbone model structured in KerasNLP?

mattdangerw (Member):

I think this whole section could still go into the "detailed instructions" for Step 2 (creating a backbone).

## Detailed Instructions
This section discusses, in detail, every necessary step.

### Open an issue/Find an open issue

mattdangerw (Member):

Step 1: ...


@abheesht17 abheesht17 requested a review from mattdangerw March 23, 2023 02:24
@mattdangerw mattdangerw left a comment


LGTM! Gave a last pass with some copy edits and other comments.

@@ -0,0 +1,235 @@
# Model Contribution Guide

KerasNLP has a plethora of cutting-edge "backbone" model ranging from BERT and

mattdangerw (Member):

cutting-edge "backbone" model -> pre-trained large language models

# Model Contribution Guide

KerasNLP has a plethora of cutting-edge "backbone" model ranging from BERT and
RoBERTa to FNet and DeBERTa. We are always looking for more models and are always

mattdangerw (Member):

BERT and RoBERTa to FNet and DeBERTa -> BERT to OPT

open to contributions!

In this guide, we will walk you through the steps one needs to take in order to contribute
a backbone model. For illustration purposes, let's assume that you want to

mattdangerw (Member):

a backbone model -> a new pre-trained model to KerasNLP

(we are talking about more than just "backbones" below)


- [ ] An `xx/xx_backbone.py` file which has the model graph [[Example Link](https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/distil_bert/distil_bert_backbone.py)].
- [ ] An `xx/xx_backbone_test.py` file which has unit tests for the backbone [[Example Link](https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/distil_bert/distil_bert_backbone_test.py)].
- [ ] A Colab notebook link in the PR description which matches the outputs of the implemented backbone model with the original source [[Example Link](https://colab.research.google.com/drive/1SeZWJorKWmwWJax8ORSdxKrxE25BfhHa?usp=sharing)].

mattdangerw (Member):

Maybe say to include this in the PR description? Same for other colabs.


abheesht17 (Collaborator, Author):

I have. "A Colab notebook link in the PR description..."
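The "Colab matching outputs" requirement boils down to a tolerance-based numerical comparison. A minimal sketch, with toy arrays standing in for the real model outputs (in the actual Colab you would run identical inputs through the reference checkpoint and the KerasNLP backbone):

```python
import numpy as np

# Toy stand-ins for real outputs: `reference_output` would come from the
# original implementation, `our_output` from the new KerasNLP backbone.
reference_output = np.array([[0.1234567, -0.7654321], [0.5, -0.5]])
our_output = reference_output + 1e-7  # tiny float drift is expected

# Float32 forward passes rarely match bit-for-bit, so compare with a
# tolerance rather than exact equality.
np.testing.assert_allclose(our_output, reference_output, atol=1e-5)
```

If the assertion passes at a tight tolerance on several inputs, the port is considered numerically matched.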

- [ ] An `xx/xx_backbone_test.py` file which has unit tests for the backbone [[Example Link](https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/distil_bert/distil_bert_backbone_test.py)].
- [ ] A Colab notebook link in the PR description which matches the outputs of the implemented backbone model with the original source [[Example Link](https://colab.research.google.com/drive/1SeZWJorKWmwWJax8ORSdxKrxE25BfhHa?usp=sharing)].

### Step 3: PR #2 - Add XXTokenizer and XXPreprocessor [The preprocessor is optional at this stage]

mattdangerw (Member):

I still think we could keep this simple and just list XXTokenizer, leave preprocessor as a follow up.


For a full list of the tokenizers KerasNLP offers, please visit [this link](https://keras.io/api/keras_nlp/tokenizers/) and make use of the tokenizer your model uses!

#### Preprocessor

mattdangerw (Member):

If we ditch preprocessor from step 3 entirely, we could probably rework this whole section to be a bit shorter and move it to the final step

#### Tokenizer

Most text models nowadays use subword tokenizers such as WordPiece, SentencePiece
and BPE Tokenizer. Hence, it becomes easy to implement the tokenizer layer for

mattdangerw (Member):

Hence, it becomes easy to implement the tokenizer layer for the model; the tokenizer class inherits from one of these base tokenizer classes. -> Since KerasNLP has implementations of most popular subword tokenizers, usually the tokenizer class inherits from a base tokenizer class.

mask token, pad token, etc. These have to be
[added as member attributes](https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/distil_bert/distil_bert_tokenizer.py#L91-L105)
to the tokenizer class. These member attributes are then accessed by the
preprocessor class.

mattdangerw (Member):

accessed by the preprocessor class. -> accessed by preprocessor layers.


### Step 4: PR #3 - Add XX Presets

Once the two PR have been merged, you can open a PR for adding presets. For every

mattdangerw (Member):

two PR -> backbone and tokenizer have been merged


After wrapping up the preset configuration file, you need to
add the `from_preset` function to all three classes, i.e., `DistilBertBackbone`,
`DistilBertTokenizer` and `DistilBertPreprocessor`,. Here is an

mattdangerw (Member):

remove trailing comma, and potentially DistilBertPreprocessor as mentioned above.
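Since the thread references the preset configuration file, here is a hypothetical sketch of the shape one entry in an `xx/xx_presets.py` dictionary might take. All field names, values, and the bucket URL below are illustrative; the exact schema should be copied from an existing `*_presets.py` file in the repo.

```python
# Hypothetical preset entry; copy the real field names from an existing
# xx_presets.py file rather than from this sketch.
backbone_presets = {
    "xx_base_en": {
        "config": {
            "vocabulary_size": 30522,
            "num_layers": 6,
            "num_heads": 12,
            "hidden_dim": 768,
        },
        # Weights uploaded to a personal GCP bucket (placeholder URL).
        "weights_url": "https://storage.googleapis.com/your-bucket/model.h5",
        "weights_hash": "abc123",  # checksum used to validate the download
    },
}

# `from_preset` would look the name up, build the model from `config`,
# then download and load the weights.
config = backbone_presets["xx_base_en"]["config"]
print(config["hidden_dim"])  # 768
```

Sharing one presets dictionary is what lets `from_preset` on the backbone, tokenizer, and preprocessor all resolve the same preset name.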

@mattdangerw mattdangerw mentioned this pull request Mar 28, 2023
@mattdangerw mattdangerw left a comment


Awesome work! We might be able to take this for a spin with the XLNet PR.

@mattdangerw mattdangerw merged commit 4d6d4b0 into keras-team:master Mar 28, 2023
Successfully merging this pull request may close these issues.

Add a Basic README for Conversion Scripts