[FLAX] Question Answering Example #13649

kamalkraj · 2021-09-20T10:29:14Z

What does this PR do?

Flax Question Answering Example

Fixes # (issue)

Before submitting

* This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
* Did you read the contributor guideline, Pull Request section?
* Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
* Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
* Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@patrickvonplaten @patil-suraj @sgugger

sgugger

Thanks for adding that example!

sgugger · 2021-09-20T13:44:34Z

examples/flax/question-answering/README.md

+```
+huggingface-cli repo create bert-qa-squad-test
+```
+
+Next we clone the model repository to add the tokenizer and model files.
+
+```
+git clone https://huggingface.co/<your-username>/bert-qa-squad-test
+```


Not linked to this PR per se, but these steps are unnecessary when using the Repository API. All Flax examples should be updated to use it, and these instructions removed.

sgugger · 2021-09-20T13:45:13Z

examples/flax/question-answering/README.md

+
+
+### Usage notes
+Note that when contexts are long they may be split into multiple training cases, not all of which may contain


Suggested change

Note that when contexts are long they may be split into multiple training cases, not all of which may contain

Note that when contexts are long they may be split into multiple training cases, not all of which may contain
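
As an illustration of the usage note quoted above (which is truncated in this thread), here is a minimal sketch of how a long context can be split into overlapping windows. It assumes the note refers to windows that do not contain the answer span. The function names, the `max_len` and `stride` parameters, and the toy token list are hypothetical and are not taken from `run_qa.py` or `utils_qa.py`.

```python
from typing import List, Tuple


def split_context(num_tokens: int, max_len: int, stride: int) -> List[Tuple[int, int]]:
    """Return (start, end) token windows; consecutive windows overlap by `stride` tokens."""
    if not 0 <= stride < max_len:
        raise ValueError("stride must satisfy 0 <= stride < max_len")
    windows = []
    start = 0
    while True:
        end = min(start + max_len, num_tokens)
        windows.append((start, end))
        if end == num_tokens:
            break
        # Advance by the non-overlapping part of the window.
        start += max_len - stride
    return windows


def contains_answer(window: Tuple[int, int], answer_start: int, answer_end: int) -> bool:
    """True if the answer span [answer_start, answer_end) lies fully inside the window."""
    start, end = window
    return start <= answer_start and answer_end <= end


# Toy example: a 10-token context, windows of 4 tokens, overlap of 2 tokens.
windows = split_context(num_tokens=10, max_len=4, stride=2)
# Hypothetical answer occupying tokens 5 and 6.
flags = [contains_answer(w, answer_start=5, answer_end=7) for w in windows]
```

In this toy case only one of the four windows fully contains the answer, so the other three would become training cases without an answer span.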

examples/flax/question-answering/utils_qa.py

patil-suraj

Looks really nice! Thanks for adding this example.

examples/flax/question-answering/run_qa.py

patil-suraj · 2021-09-21T08:40:19Z

examples/flax/question-answering/run_qa.py

+        revision=model_args.model_revision,
+        use_auth_token=True if model_args.use_auth_token else None,


We should pass dtype here for mixed-precision training, and seed for reproducibility.

For reference:

transformers/examples/flax/language-modeling/run_clm_flax.py

Lines 365 to 367 in 48fa42e

    model = FlaxAutoModelForCausalLM.from_pretrained(
        model_args.model_name_or_path, config=config, seed=training_args.seed, dtype=getattr(jnp, model_args.dtype)
    )
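
A rough sketch of how a script could expose the two values mentioned in this review comment. It uses plain `argparse` so that it runs without JAX. The argument names, the defaults, and the `build_load_kwargs` helper are assumptions for illustration only, not the code that was added to `run_qa.py`.

```python
import argparse

# Assumed set of dtype names; each is expected to be an attribute of jax.numpy.
ALLOWED_DTYPES = ("float32", "float16", "bfloat16")


def parse_args(argv):
    parser = argparse.ArgumentParser()
    # Computation dtype, kept as a string on the command line.
    parser.add_argument("--dtype", default="float32", choices=ALLOWED_DTYPES)
    # Seed forwarded to model loading for reproducible initialization.
    parser.add_argument("--seed", type=int, default=42)
    return parser.parse_args(argv)


def build_load_kwargs(args):
    # In a real script the string would be resolved with getattr(jnp, args.dtype),
    # as in the referenced run_clm_flax.py lines; here the name is kept as-is.
    return {"seed": args.seed, "dtype_name": args.dtype}


args = parse_args(["--dtype", "bfloat16", "--seed", "0"])
load_kwargs = build_load_kwargs(args)
```

Restricting `--dtype` to a fixed list of names means an invalid value is rejected at argument parsing time rather than failing later in the `getattr` lookup.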

@patil-suraj
Why is there no dynamic loss scaling for dtype float16?

https://flax.readthedocs.io/en/latest/_autosummary/flax.optim.DynamicScale.html#flax.optim.DynamicScale

cc @patrickvonplaten

After #13098, dtype specifies only the dtype of the computation, so just the forward pass runs in half precision and the grads are still computed in fp32. Loss scaling is therefore not required in this case.

True mixed-precision training for the Flax examples is planned and will include loss scaling. I will post more about it soon :)

Also, note that we don't use Flax optimizers here; we are using optax.
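
For background on the question above, here is a small standard-library sketch of why loss scaling exists when gradients are stored in float16. It is not taken from the Flax examples or from `flax.optim.DynamicScale`. The gradient value and the scale factor are made-up numbers, and `to_fp16` is a hypothetical helper that rounds a Python float to half precision via `struct`.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to IEEE half precision and convert it back."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


tiny_grad = 1e-8      # assumed gradient value, below the float16 subnormal range
scale = 2.0 ** 16     # assumed static loss scale

# Stored directly in float16, the gradient underflows to zero.
unscaled = to_fp16(tiny_grad)

# Scaling the loss scales the gradient too, which keeps it representable.
scaled = to_fp16(tiny_grad * scale)

# Dividing by the scale in higher precision recovers the gradient approximately.
recovered = scaled / scale
```

When the gradients stay in fp32, as described in the reply above, this underflow does not arise for values of this size, which is consistent with loss scaling not being required there.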

Thanks for the explanation

examples/flax/question-answering/run_qa.py

1. Copyright Year updated
2. added dtype arg
3. passing seed and dtype to load model
4. Check eval flag before running eval

kamalkraj · 2021-09-21T12:53:22Z

@patil-suraj
I have made the changes according to your review.

* flax qa example
* Updated README: Added Large model
* added utils_qa.py FULL_COPIES
* Updates:
  1. Copyright Year updated
  2. added dtype arg
  3. passing seed and dtype to load model
  4. Check eval flag before running eval
* updated README
* updated code comment

flax qa example

8a5dade

sgugger approved these changes Sep 20, 2021


kamalkraj added 2 commits September 20, 2021 07:24

Updated README: Added Large model

d3ac3f8

added utils_qa.py FULL_COPIES

827e0c9

patil-suraj approved these changes Sep 21, 2021


patil-suraj reviewed Sep 21, 2021


kamalkraj added 3 commits September 21, 2021 05:28

Updates:

03a84d3

1. Copyright Year updated
2. added dtype arg
3. passing seed and dtype to load model
4. Check eval flag before running eval

updated README

36b3c66

updated code comment

0d6e7c3

patil-suraj merged commit 78807d8 into huggingface:master Sep 21, 2021

kamalkraj deleted the flax-qa branch September 21, 2021 13:21
