requirements for bert-large? #51

rush86999 · 2019-06-21T12:01:48Z

What issues, if any, would come up if bert-large were used? For example, what would the GPU requirements and training time be, and would it be too costly? Is there a reason bert-base was used instead of bert-large?

jihun-hong · 2019-08-23T09:15:04Z

My guess is that Yang Liu used bert-base instead of bert-large because bert-large would require more GPU resources, memory, and training time. It's also possible that bert-large wouldn't improve performance much, but I don't think the original paper discusses that. There are no ablation studies on this in particular, so this is just my guess.
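
For a rough sense of scale, here is a minimal sketch that estimates the encoder parameter count of the two standard BERT sizes from their published configurations (12 layers with hidden size 768 for bert-base, 24 layers with hidden size 1024 for bert-large). The function names are made up for illustration. The vocabulary size of 30522 (the uncased English checkpoints), the 512 positions, and the 4x feed-forward width are assumptions, and the memory figure assumes plain fp32 Adam. It ignores activations, batch size, and any summarization layers this repo adds on top of BERT, so treat it as a lower bound rather than an actual GPU requirement.

```python
def bert_param_count(hidden, layers, vocab=30522, max_pos=512, type_vocab=2):
    """Approximate parameter count of a BERT encoder (embeddings + layers + pooler)."""
    layer_norm = 2 * hidden  # scale and bias
    # Token, position, and segment embeddings, followed by one LayerNorm.
    embeddings = (vocab + max_pos + type_vocab) * hidden + layer_norm
    # Query, key, value, and output projections (weights + biases), then LayerNorm.
    attention = 4 * (hidden * hidden + hidden) + layer_norm
    # Two dense layers with a 4x wider intermediate size, then LayerNorm.
    ffn_size = 4 * hidden
    feed_forward = (hidden * ffn_size + ffn_size) + (ffn_size * hidden + hidden) + layer_norm
    # Dense pooler applied to the first token.
    pooler = hidden * hidden + hidden
    return embeddings + layers * (attention + feed_forward) + pooler


def adam_fp32_gib(params):
    """GiB for weights, gradients, and two Adam moments at 4 bytes each (assumed setup)."""
    return params * 16 / 2**30


base = bert_param_count(hidden=768, layers=12)
large = bert_param_count(hidden=1024, layers=24)

for name, count in (("bert-base", base), ("bert-large", large)):
    print(f"{name}: {count / 1e6:.0f}M parameters, "
          f"about {adam_fp32_gib(count):.1f} GiB for weights + Adam state")
print(f"bert-large / bert-base parameter ratio: {large / base:.2f}")
```

Under these assumptions bert-large has roughly three times the parameters of bert-base (about 335M versus 109M), so the weights and optimizer state alone take roughly three times the memory before any activations are counted. That is consistent with the guess above, but it says nothing about whether the extra capacity would improve results.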
