
TODO LIST #15

Open · dhkim0225 opened this issue Sep 7, 2020 · 1 comment

dhkim0225 (Owner) commented Sep 7, 2020

prompt

Calibrate Before Use: Improving Few-Shot Performance of Language Models (https://arxiv.org/abs/2102.09690), see the calibration sketch after this list
Prompt tuning: The Power of Scale for Parameter-Efficient Prompt Tuning (https://arxiv.org/abs/2104.08691)
Do Prompt-Based Models Really Understand the Meaning of their Prompts? (https://arxiv.org/abs/2109.01247)
An Empirical Study on Few-shot Knowledge Probing for Pretrained Language Models (https://arxiv.org/pdf/2109.02772.pdf)
FLAN (https://arxiv.org/pdf/2109.01652.pdf)
Text Style Transfer (https://arxiv.org/abs/2109.03910)
NMT via generated prompts (https://arxiv.org/abs/2110.05448)
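
Since Calibrate Before Use is on the list, here is a minimal sketch of its contextual-calibration idea, assuming you already have label probabilities from the LM for a content-free input (e.g. "N/A") and for a test example; the numbers below are made up for illustration.

```python
# Contextual calibration sketch (Calibrate Before Use, arXiv:2102.09690):
# estimate the LM's bias on a content-free input, then apply an affine
# correction W = diag(p_cf)^-1 (b = 0) to the test-time label probabilities.
import numpy as np

def calibrate(p_content_free: np.ndarray, p_test: np.ndarray) -> np.ndarray:
    W = np.diag(1.0 / p_content_free)   # inverse of the content-free distribution
    q = W @ p_test                      # correct the biased prediction
    return q / q.sum()                  # renormalize to a probability vector

# Made-up numbers: the LM leans toward label 0 ("negative") on "N/A",
# so calibration flips a borderline test prediction.
p_cf = np.array([0.7, 0.3])    # label probs for the content-free prompt
p_hat = np.array([0.6, 0.4])   # label probs for an actual test example
print(calibrate(p_cf, p_hat))  # ~[0.39, 0.61]: argmax moves to label 1
```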

LM

BART (https://arxiv.org/abs/1910.13461)
Primer (https://arxiv.org/abs/2109.08668)
NormFormer (https://arxiv.org/abs/2110.09456)
HTLM (https://arxiv.org/abs/2107.06955)

KIE Pretraining

LayoutLM (https://arxiv.org/abs/1912.13318)
LayoutLMv2 (https://arxiv.org/abs/2012.14740)
StructuralLM (https://arxiv.org/abs/2105.11210)
MarkupLM (https://arxiv.org/abs/2110.08518)

dhkim0225 (Owner, Author) commented Jan 4, 2022

PyramidTNT: Improved Transformer-in-Transformer Baselines with Pyramid Architecture https://arxiv.org/pdf/2201.00978.pdf

ELSA: Enhanced Local Self-Attention for Vision Transformer https://arxiv.org/pdf/2112.12786v1.pdf

UNICORN (Crossing the Format Boundary of Text and Boxes: Towards Unified Vision-Language Modeling)

DeepSpeed-MoE

15-second NeRF (Instant Neural Graphics Primitives with a Multiresolution Hash Encoding)

True Few-Shot Learning with Language Models
https://arxiv.org/pdf/2105.11447.pdf

FLAN
https://arxiv.org/pdf/2109.01652.pdf

T0
https://arxiv.org/pdf/2110.08207.pdf

ZeroPrompt: Scaling Prompt-Based Pretraining to 1,000 Tasks Improves Zero-Shot Generalization https://arxiv.org/abs/2201.06910

Transformer Quality in Linear Time
https://arxiv.org/abs/2202.10447

Hyperparameter search

Warm Starting CMA-ES for Hyperparameter Optimization
https://arxiv.org/abs/2012.06932

SSL

Understanding Dimensional Collapse in Contrastive Self-Supervised Learning
https://arxiv.org/pdf/2110.09348.pdf

OCR

STKM
https://openaccess.thecvf.com/content/CVPR2021/papers/Wan_Self-Attention_Based_Text_Knowledge_Mining_for_Text_Detection_CVPR_2021_paper.pdf

Early phase of training?

On Warm-Starting Neural Network Training
https://arxiv.org/pdf/1910.08475.pdf

The Break-Even Point on Optimization Trajectories of Deep Neural Networks
https://arxiv.org/abs/2002.09572

Catastrophic Fisher Explosion: Early Phase Fisher Matrix Impacts Generalization
https://arxiv.org/pdf/2012.14193.pdf

On the Origin of Implicit Regularization in Stochastic Gradient Descent
https://arxiv.org/pdf/2101.12176.pdf

loss landscape

Sharpness-Aware Minimization for Efficiently Improving Generalization (a minimal SAM sketch follows this list)
https://arxiv.org/pdf/2010.01412.pdf

ASAM: Adaptive Sharpness-Aware Minimization for Scale-Invariant Learning of Deep Neural Networks
https://arxiv.org/abs/2102.11600

Towards Efficient and Scalable Sharpness-Aware Minimization (LookSAM)
https://arxiv.org/pdf/2203.02714v1.pdf

A Loss Curvature Perspective on Training Instability in Deep Learning
https://arxiv.org/pdf/2110.04369.pdf

Surrogate Gap Minimization Improves Sharpness-Aware Training
https://arxiv.org/abs/2203.08065
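
Since several of the entries above are SAM variants, here is a minimal sketch of one vanilla SAM update step in PyTorch (first-order, single device), just to pin down what these papers optimize; `model`, `loss_fn`, `base_opt`, and `rho` are placeholder names, and the official implementations should be used for real experiments.

```python
# One Sharpness-Aware Minimization (SAM) step, arXiv:2010.01412 (sketch).
import torch

def sam_step(model, loss_fn, inputs, targets, base_opt, rho=0.05):
    # 1) First pass: gradients at the current weights w.
    loss = loss_fn(model(inputs), targets)
    loss.backward()

    # 2) Move to the approximate worst-case neighbor: w + rho * g / ||g||.
    grads = [p.grad for p in model.parameters() if p.grad is not None]
    grad_norm = torch.norm(torch.stack([g.norm(p=2) for g in grads]), p=2)
    perturbs = []
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is None:
                perturbs.append(None)
                continue
            e = rho * p.grad / (grad_norm + 1e-12)
            p.add_(e)
            perturbs.append(e)

    # 3) Second pass: gradients at the perturbed weights.
    base_opt.zero_grad()
    loss_fn(model(inputs), targets).backward()

    # 4) Restore the original weights and step with the perturbed-point gradients.
    with torch.no_grad():
        for p, e in zip(model.parameters(), perturbs):
            if e is not None:
                p.sub_(e)
    base_opt.step()
    base_opt.zero_grad()
    return loss.item()
```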

What is the role of augmentation?

When vision transformers outperform resnets without pretraining or strong data augmentations
https://arxiv.org/abs/2106.01548

ICLR Oral

https://openreview.net/group?id=ICLR.cc/2022/Conference
