Add compilation defaults for the Fnet MaskedLM task model #834
Conversation
Code looks good. Have we tried this out with a few different presets to make sure we are converging generally?
Thanks for the PR! 🥂 @mattdangerw Matt, I feel that for future PRs adding compilation defaults, we may want to require fitting results to justify the value we choose. It will add a bit of overhead to such PRs, but I think it's important for us to ensure correctness.
Agreed! I think this can be kind of ad hoc for now, as we didn't really have an established process for classification defaults either. But long term, we should definitely invest in infra to experiment with multiple learning rates for multiple checkpoints (and potentially even multiple datasets!), to make sure we have good defaults. Obviously we are limited by compute and will have to make some tradeoffs. For now, let's just sanity check that our constants are somewhat reasonable.
@mattdangerw yeah, the compute budget is a pain for us right now. I think for now we can ask users for a sanity check on the small/tiny presets, since those fit in a Colab free-tier GPU. But since we don't have a guideline right now, we can merge this PR without asking for that.
@mattdangerw I opened an issue to track. |
Hi @chenmoneygithub, I think we can consider using Kaggle Notebooks as an alternative to the Colab free tier (Colab GPUs are pretty slow compared to Kaggle's, see this). For #833, following your example script, I am experimenting with 1e-5, 2e-5, 5e-5, and 1e-4 as learning rates for the three models. The script is still running, but it demonstrates how to brute-force check which learning rate performs best. Do you think it would be a meaningful contribution toward #850? On a side note, I don't remember if Colab still has that 1 hr limit or so, but Kaggle doesn't. I think we can use Kaggle (a private notebook if needed) to do TPU testing for new backbones we add. Thanks. PS. The Kaggle notebook I linked is still running (another cool feature of Kaggle, which lets you save your work once your script completes); once it's done you'll be able to see it. I'll ping here.
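The brute-force sweep described above can be sketched as a small harness. This is a hypothetical illustration, not the actual script from #833: `fit_and_evaluate` stands in for fine-tuning a preset at a given learning rate and returning validation accuracy, and the scores it returns are made up for the example.

```python
# Hypothetical brute-force learning-rate sweep. fit_and_evaluate() is a
# stand-in for fine-tuning a model preset at `lr` and returning its
# validation accuracy; the scores below are fabricated for illustration.

CANDIDATE_LRS = [1e-5, 2e-5, 5e-5, 1e-4]


def fit_and_evaluate(lr):
    # Placeholder: a real sweep would call model.fit(...) here and
    # evaluate on held-out data. Fake accuracies for demonstration only.
    fake_scores = {1e-5: 0.81, 2e-5: 0.84, 5e-5: 0.86, 1e-4: 0.85}
    return fake_scores[lr]


def best_learning_rate(lrs):
    # Train once per candidate and keep the rate with the best score.
    results = {lr: fit_and_evaluate(lr) for lr in lrs}
    best = max(results, key=results.get)
    return best, results


best, results = best_learning_rate(CANDIDATE_LRS)
print(best)
```

In a real run, each `fit_and_evaluate` call would fine-tune from scratch, so ideally the sweep would also average over a few seeds per learning rate before picking a winner.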
@shivance It's good to know! thanks! |
@mattdangerw @chenmoneygithub I have tested that this performs best at learning rate 1e-4. I updated the value accordingly.
@soma2000-lang thanks! Do you have a way to share the experiments you did? Going off of @shivance's testing on #833, it seems like 5e-5 might be a good general-purpose learning rate here, and a lower number will be a little more conservative. Let's just stick with that for now for simplicity, but we can always update.
…#834)

* Adding
* Update f_net_masked_lm.py
* Update f_net_masked_lm.py
* Update f_net_masked_lm.py
* Use 5e-5

---------

Co-authored-by: Matt Watson <1389937+mattdangerw@users.noreply.github.com>
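The "compilation defaults" pattern this PR adds can be sketched without any framework dependency. This is a simplified, hypothetical mock of the task class, not KerasNLP's actual implementation: `compile()` accepts `"auto"` sentinels and resolves them to task-appropriate defaults, including the Adam optimizer at the 5e-5 learning rate the thread settled on.

```python
# Minimal sketch of compile() defaults, assuming the "auto" sentinel
# pattern. FNetMaskedLMSketch is a hypothetical stand-in for the real
# task class; the optimizer tuple stands in for keras.optimizers.Adam.

class FNetMaskedLMSketch:
    def compile(self, optimizer="auto", loss="auto", **kwargs):
        if optimizer == "auto":
            # Default chosen in this PR: Adam with learning rate 5e-5.
            optimizer = ("adam", 5e-5)
        if loss == "auto":
            # Masked LM tasks predict token ids, so a sparse
            # cross-entropy loss is the natural default.
            loss = "sparse_categorical_crossentropy"
        self.optimizer = optimizer
        self.loss = loss


task = FNetMaskedLMSketch()
task.compile()  # users get sensible defaults with no arguments
```

The upside of the sentinel approach is that `model.compile()` works out of the box for newcomers, while anyone tuning can still pass an explicit optimizer or loss to override the defaults.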
Partially fixes #830