Add cosine scheduler, datacollator, update clip #4826

Merged 3 commits into PaddlePaddle:develop from bt3 on Feb 16, 2023

w5688414
Contributor

…rWholeWordMask, Update CLIPVisionTransformer

PR types

  • New features

PR changes

  • APIs

Description

  • DataCollatorForLanguageModeling
  • DataCollatorForWholeWordMask
  • forward_pre & forward_post
  • get_polynomial_decay_schedule_with_warmup
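
For readers unfamiliar with the two schedules named in this PR (the cosine scheduler in the title and `get_polynomial_decay_schedule_with_warmup` in the list above), here is a minimal sketch of the usual formulas: linear warmup followed by either cosine or polynomial decay. It is not the code added in this PR. The function names, parameter names, and defaults below are hypothetical and chosen only for illustration; the actual PaddleNLP signatures may differ.

```python
import math


def cosine_with_warmup(step, base_lr, num_warmup_steps, num_training_steps, num_cycles=0.5):
    """Hypothetical helper: learning rate at `step` with linear warmup, then cosine decay."""
    if step < num_warmup_steps:
        # Linear ramp from 0 up to base_lr over the warmup steps.
        return base_lr * step / max(1, num_warmup_steps)
    # Fraction of the post-warmup phase that has elapsed, in [0, 1].
    progress = (step - num_warmup_steps) / max(1, num_training_steps - num_warmup_steps)
    # With num_cycles=0.5 this is half a cosine wave: base_lr down to 0.
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * 2.0 * num_cycles * progress)))


def polynomial_decay_with_warmup(step, base_lr, num_warmup_steps, num_training_steps,
                                 lr_end=1e-7, power=1.0):
    """Hypothetical helper: linear warmup, then polynomial decay from base_lr to lr_end."""
    if step < num_warmup_steps:
        return base_lr * step / max(1, num_warmup_steps)
    if step > num_training_steps:
        # Past the end of training the rate stays at its floor.
        return lr_end
    decay_steps = max(1, num_training_steps - num_warmup_steps)
    pct_remaining = 1.0 - (step - num_warmup_steps) / decay_steps
    # power=1.0 gives a straight line; larger powers decay faster early on.
    return (base_lr - lr_end) * pct_remaining ** power + lr_end
```

Both helpers return the learning rate itself rather than a multiplier, which keeps the sketch independent of any particular optimizer or scheduler class.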

@paddle-bot

paddle-bot bot commented Feb 16, 2023

Thanks for your contribution!

Collaborator

@sijunhe sijunhe left a comment

The collator logic is fairly complex. Please add unit tests, using https://github.com/huggingface/transformers/blob/main/tests/trainer/test_data_collator.py as a reference.

@w5688414
Contributor Author

> The collator logic is fairly complex. Please add unit tests, using https://github.com/huggingface/transformers/blob/main/tests/trainer/test_data_collator.py as a reference.

Added.
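
To illustrate the kind of property such a collator test can check, here is a minimal, self-contained sketch. It assumes the conventional BERT-style masking recipe (select about 15% of non-special tokens; of those, 80% become the mask token, 10% a random token, 10% stay unchanged; unselected positions get the label -100). Whether the `DataCollatorForLanguageModeling` added in this PR follows exactly this recipe is an assumption, and `mask_tokens`, its parameters, and the token ids used are hypothetical, not PaddleNLP APIs.

```python
import random

IGNORE_INDEX = -100  # label value conventionally ignored by the loss


def mask_tokens(input_ids, special_ids, mask_id, vocab_size, mlm_probability=0.15, rng=None):
    """Hypothetical helper: return (masked_inputs, labels) for one token sequence."""
    rng = rng or random.Random()
    inputs = list(input_ids)
    labels = [IGNORE_INDEX] * len(inputs)
    for i, tok in enumerate(input_ids):
        # Special tokens are never selected; others are selected with mlm_probability.
        if tok in special_ids or rng.random() >= mlm_probability:
            continue
        labels[i] = tok  # the model must predict the original token here
        roll = rng.random()
        if roll < 0.8:
            inputs[i] = mask_id  # 80%: replace with the mask token
        elif roll < 0.9:
            inputs[i] = rng.randrange(vocab_size)  # 10%: replace with a random token
        # remaining 10%: leave the token unchanged
    return inputs, labels


def test_mask_tokens():
    # Illustrative ids only: 101/102 stand in for special tokens, 103 for the mask token.
    ids = [101] + list(range(1000, 1200)) + [102]
    special = {101, 102}

    inputs, labels = mask_tokens(ids, special, mask_id=103, vocab_size=30522,
                                 rng=random.Random(0))
    assert len(inputs) == len(labels) == len(ids)
    # Special tokens are never masked and never contribute to the loss.
    assert inputs[0] == 101 and inputs[-1] == 102
    assert labels[0] == IGNORE_INDEX and labels[-1] == IGNORE_INDEX
    for orig, new, lab in zip(ids, inputs, labels):
        if lab == IGNORE_INDEX:
            assert new == orig  # unselected positions are untouched
        else:
            assert lab == orig  # selected positions keep the original id as label

    # Edge cases: probability 0 selects nothing, probability 1 selects every non-special token.
    inputs, labels = mask_tokens(ids, special, 103, 30522, mlm_probability=0.0)
    assert inputs == ids and all(lab == IGNORE_INDEX for lab in labels)
    _, labels = mask_tokens(ids, special, 103, 30522, mlm_probability=1.0)
    assert labels[1:-1] == ids[1:-1]


if __name__ == "__main__":
    test_mask_tokens()
```

The assertions check invariants that hold for any random seed, so the test does not depend on which particular positions happen to be selected.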

@codecov

codecov bot commented Feb 16, 2023

Codecov Report

Merging #4826 (49584b0) into develop (3ae3ea5) will decrease coverage by 0.17%.
The diff coverage is 12.17%.

@@             Coverage Diff             @@
##           develop    #4826      +/-   ##
===========================================
- Coverage    44.52%   44.36%   -0.17%     
===========================================
  Files          445      446       +1     
  Lines        64042    64332     +290     
===========================================
+ Hits         28515    28540      +25     
- Misses       35527    35792     +265     
| Impacted Files | Coverage Δ |
| --- | --- |
| paddlenlp/data/data_collator.py | 19.32% <11.25%> (-14.89%) ⬇️ |
| paddlenlp/trainer/trainer_utils.py | 40.67% <12.50%> (-1.34%) ⬇️ |
| paddlenlp/transformers/clip/modeling.py | 57.92% <16.66%> (-0.96%) ⬇️ |
| paddlenlp/transformers/bert/tokenizer.py | 95.53% <66.66%> (ø) |
| paddlenlp/utils/downloader.py | 64.60% <0.00%> (-0.89%) ⬇️ |
| paddlenlp/transformers/mt5/modeling.py | 84.67% <0.00%> (ø) |
| ...dlenlp/experimental/autonlp/text_classification.py | 98.30% <0.00%> (ø) |
| paddlenlp/transformers/mt5/converter.py | 0.00% <0.00%> (ø) |

Collaborator

@sijunhe sijunhe left a comment

The PR looks good. Just one small change: tests/trainer/test_data_collator.py should be moved to tests/data/test_data_collator.py so that it mirrors the directory layout under paddlenlp.

@sijunhe sijunhe merged commit b7ae090 into PaddlePaddle:develop Feb 16, 2023
@w5688414 w5688414 deleted the bt3 branch June 7, 2023 09:26