TODO LIST #15
PyramidTNT: Improved Transformer-in-Transformer Baselines with Pyramid Architecture (https://arxiv.org/pdf/2201.00978.pdf)
ELSA: Enhanced Local Self-Attention for Vision Transformer (https://arxiv.org/pdf/2112.12786v1.pdf)
Unicorn (Crossing the Format Boundary of Text and Boxes: Towards Unified Vision-Language Modeling)
DeepSpeed-MoE
15-second NeRF (Instant Neural Graphics Primitives with a Multiresolution Hash Encoding)
True Few-Shot Learning with Language Models
FLAN
T0
ZeroPrompt: Scaling Prompt-Based Pretraining to 1,000 Tasks Improves Zero-Shot Generalization (https://arxiv.org/abs/2201.06910)
Transformer Quality in Linear Time
Hyperparameter search
Warm Starting CMA-ES for Hyperparameter Optimization
SSL
Understanding Dimensional Collapse in Contrastive Self-Supervised Learning
Early phase of (OCR) training?
On Warm-Starting Neural Network Training
The Break-Even Point on Optimization Trajectories of Deep Neural Networks
Catastrophic Fisher Explosion: Early Phase Fisher Matrix Impacts Generalization
On the Origin of Implicit Regularization in Stochastic Gradient Descent
Loss landscape
Sharpness-Aware Minimization for Efficiently Improving Generalization (minimal SAM sketch below)
ASAM: Adaptive Sharpness-Aware Minimization for Scale-Invariant Learning of Deep Neural Networks
Towards Efficient and Scalable Sharpness-Aware Minimization (LookSAM)
A Loss Curvature Perspective on Training Instability in Deep Learning
Surrogate Gap Minimization Improves Sharpness-Aware Training
What is the role of augmentation?
When Vision Transformers Outperform ResNets Without Pre-training or Strong Data Augmentations (ICLR Oral)
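Several of the loss-landscape papers above build on SAM, so as a reference point here is a minimal sketch of the SAM two-step update in plain PyTorch. The function name `sam_step`, the default `rho=0.05`, and the training-loop wiring are illustrative assumptions, not code from any of the listed papers.

```python
import torch

def sam_step(model, loss_fn, batch, base_optimizer, rho=0.05):
    """One SAM update: perturb weights toward higher loss, take the gradient there."""
    inputs, targets = batch

    # First pass: gradients at the current weights w
    loss_fn(model(inputs), targets).backward()

    # Perturb each weight by epsilon = rho * g / ||g|| (ascent direction)
    grad_norm = torch.norm(torch.stack(
        [p.grad.norm(p=2) for p in model.parameters() if p.grad is not None]), p=2)
    eps = {}
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is None:
                continue
            e = rho * p.grad / (grad_norm + 1e-12)
            p.add_(e)
            eps[p] = e

    # Second pass: gradients at the perturbed weights w + epsilon
    model.zero_grad()
    loss_fn(model(inputs), targets).backward()

    # Restore the original weights, then step with the sharpness-aware gradients
    with torch.no_grad():
        for p, e in eps.items():
            p.sub_(e)
    base_optimizer.step()
    base_optimizer.zero_grad()
```

Note the two forward/backward passes per update; LookSAM (listed above) is one attempt to amortize that cost.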
prompt
Calibrate Before Use: Improving Few-Shot Performance of Language Models (https://arxiv.org/abs/2102.09690); see the calibration sketch after this list
Prompt tuning (The Power of Scale for Parameter-Efficient Prompt Tuning, https://arxiv.org/abs/2104.08691)
Do Prompt-Based Models Really Understand the Meaning of their Prompts? (https://arxiv.org/abs/2109.01247)
An Empirical Study on Few-shot Knowledge Probing for Pretrained Language Models (https://arxiv.org/pdf/2109.02772.pdf)
FLAN (https://arxiv.org/pdf/2109.01652.pdf)
Text Style Transfer (https://arxiv.org/abs/2109.03910)
Prompt generation for NMT (https://arxiv.org/abs/2110.05448)
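Calibrate Before Use (first item in this list) proposes contextual calibration: estimate the model's label bias from a content-free input (e.g. "N/A") and divide it out before predicting. A minimal NumPy sketch of that correction; the function name and the example probabilities are made up for illustration.

```python
import numpy as np

def calibrate(label_probs, content_free_probs):
    """Rescale label probabilities by the content-free baseline (W = diag(p_cf)^-1, b = 0)."""
    calibrated = np.asarray(label_probs) / np.asarray(content_free_probs)
    return calibrated / calibrated.sum()

# Example: the model prefers "positive" even on a content-free prompt.
p_cf = np.array([0.7, 0.3])   # P(label | content-free input)
p_x  = np.array([0.6, 0.4])   # P(label | real input), biased toward "positive"
print(calibrate(p_x, p_cf))   # after calibration "negative" wins: ~[0.39, 0.61]
```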
LM
BART (https://arxiv.org/abs/1910.13461)
Primer (https://arxiv.org/abs/2109.08668)
NormFormer (https://arxiv.org/abs/2110.09456)
HTLM (https://arxiv.org/abs/2107.06955)
KIE Pretraining
LayoutLM (https://arxiv.org/abs/1912.13318)
LayoutLMv2 (https://arxiv.org/abs/2012.14740)
StructuralLM (https://arxiv.org/abs/2105.11210)
MarkupLM (https://arxiv.org/abs/2110.08518)