[FLAX] Question Answering Example #13649
Conversation
Thanks for adding that example!
```
huggingface-cli repo create bert-qa-squad-test
```

Next we clone the model repository to add the tokenizer and model files.

```
git clone https://huggingface.co/<your-username>/bert-qa-squad-test
```
Not linked to this PR per se, but this is not necessary when using the Repository API; all Flax examples should be updated to use it, removing these instructions.
### Usage notes
Note that when contexts are long they may be split into multiple training cases, not all of which may contain the answer.
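This splitting behaves like a sliding window over the tokenized context. Here is a minimal sketch of the idea (the function name and parameters are illustrative, not the example script's actual API; the real script relies on the tokenizer's `return_overflowing_tokens` / `stride` options):

```python
def split_context(tokens, max_len, stride):
    """Split a long token sequence into chunks of at most max_len tokens,
    where consecutive chunks overlap by `stride` tokens."""
    chunks = []
    start = 0
    while True:
        chunks.append(tokens[start:start + max_len])
        if start + max_len >= len(tokens):
            break
        start += max_len - stride
    return chunks

# A 10-token context with max_len=4 and stride=2 yields overlapping chunks;
# an answer span may fall entirely outside some of them.
print(split_context(list(range(10)), max_len=4, stride=2))
# → [[0, 1, 2, 3], [2, 3, 4, 5], [4, 5, 6, 7], [6, 7, 8, 9]]
```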
Looks really nice! Thanks for adding this example.
```
revision=model_args.model_revision,
use_auth_token=True if model_args.use_auth_token else None,
```
We should pass `dtype` here for mixed-precision training, and `seed` for reproducibility. For reference, `transformers/examples/flax/language-modeling/run_clm_flax.py`, lines 365 to 367 in 48fa42e:

```
model = FlaxAutoModelForCausalLM.from_pretrained(
    model_args.model_name_or_path, config=config, seed=training_args.seed, dtype=getattr(jnp, model_args.dtype)
)
```
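The `getattr(jnp, model_args.dtype)` idiom above just maps the dtype string from the command-line args to the matching attribute on `jax.numpy`. A minimal sketch of the same pattern, using a stand-in namespace so it runs without JAX (the namespace and its attribute values are illustrative):

```python
from types import SimpleNamespace

# Stand-in for jax.numpy; the real script would use `import jax.numpy as jnp`.
jnp = SimpleNamespace(float32="float32", float16="float16", bfloat16="bfloat16")

def resolve_dtype(dtype_name):
    # Same idiom as dtype=getattr(jnp, model_args.dtype) in run_clm_flax.py
    return getattr(jnp, dtype_name)

print(resolve_dtype("bfloat16"))  # → bfloat16
```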
done.
@patil-suraj Why is there no dynamic loss scaling for dtype `float16`?
After #13098, `dtype` specifies only the dtype of the computation, so just the forward pass runs in half precision while grads are still computed in fp32; in this case loss scaling is not required.
True mixed-precision training for the Flax examples is planned and will include loss scaling; will post more about it soon :)
Also, note that we don't use Flax optimizers here, we use optax.
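A quick numeric illustration of why fp32 grads sidestep the need for loss scaling: gradients small enough to underflow in float16 are still representable in float32 (a NumPy sketch; the gradient magnitude is just an example value):

```python
import numpy as np

# float16's smallest subnormal is about 6e-8, so a 1e-8 gradient underflows
# to zero, while float32 represents it fine. Keeping grads in fp32 therefore
# avoids the underflow that loss scaling exists to prevent.
tiny_grad = 1e-8
print(np.float16(tiny_grad))  # → 0.0 (underflow)
print(np.float32(tiny_grad))  # → 1e-08
```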
Thanks for the explanation @patil-suraj
* flax qa example
* Updated README: Added Large model
* added utils_qa.py FULL_COPIES
* Updates: 1. Copyright Year updated 2. added dtype arg 3. passing seed and dtype to load model 4. Check eval flag before running eval
* updated README
* updated code comment
What does this PR do?
Flax Question Answering Example
Fixes # (issue)
Before submitting
- Did you read the contributor guideline, Pull Request section?
- Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
- Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
@patrickvonplaten @patil-suraj @sgugger