# NLP, CV, Deep Learning Paper Reading Table

This repository collects papers related to NLP and Deep Learning that I have read, ranging from basic to advanced. 😊 You can also check my Korean paper reviews by clicking the links in the tables. 😉

You can find more paper reviews, code implementations, and mathematical write-ups on my blog: https://cartinoe5930.tistory.com/

## My Insight 🧐

I have written several articles explaining some Deep Learning technologies in detail; you can find them in the table below.

| Title | Blog Link |
|---|---|
| How has scaling law developed in NLP? 🤔 | https://cartinoe5930.tistory.com/entry/How-has-scaling-law-developed-in-NLP-%F0%9F%A4%94-NLP%EC%97%90%EC%84%9C-scaling-law%EB%8A%94-%EC%96%B4%EB%96%BB%EA%B2%8C-%EB%B0%9C%EC%A0%84%EB%90%98%EC%97%88%EC%9D%84%EA%B9%8C |
| Closed-source 🔒? Open-source 🔓? What is that? 🤨🤔 | https://cartinoe5930.tistory.com/entry/The-hopes-of-researchers-Open-source-%F0%9F%A4%97-%EC%97%B0%EA%B5%AC%EC%9E%90%EB%93%A4%EC%9D%98-%ED%9D%AC%EB%A7%9D-Open-source-%F0%9F%A4%97 |
| Context window of LM: should it be long or short? 📏🤨 | https://cartinoe5930.tistory.com/entry/LM%EC%9D%98-context-window-%EA%B8%B8%EC%96%B4%EC%95%BC-%ED%95%A0%EA%B9%8C-%EC%A7%A7%EC%95%84%EC%95%BC-%ED%95%A0%EA%B9%8C-%F0%9F%93%8F%F0%9F%A4%A8 |
| What is the optimal way to evaluate an LM? 😎 | https://cartinoe5930.tistory.com/entry/LM%EC%9D%84-%EA%B0%80%EC%9E%A5-%EC%B5%9C%EC%A0%81%EC%9C%BC%EB%A1%9C-%ED%8F%89%EA%B0%80%ED%95%A0-%EC%88%98-%EC%9E%88%EB%8A%94-%EB%B0%A9%EB%B2%95%EC%9D%80-%EB%AC%B4%EC%97%87%EC%9D%BC%EA%B9%8C-%F0%9F%98%8E |
| Is ChatGPT's performance getting worse?! 😲😲 | https://cartinoe5930.tistory.com/entry/ChatGPT%EC%9D%98-%EC%84%B1%EB%8A%A5%EC%9D%B4-%EC%95%88-%EC%A2%8B%EC%95%84%EC%A7%80%EA%B3%A0-%EC%9E%88%EB%8B%A4%EA%B5%AC-%F0%9F%98%B2%F0%9F%98%B2 |
| You can fine-tune too, with PEFT 🤗 | https://cartinoe5930.tistory.com/entry/%EB%8B%B9%EC%8B%A0%EB%8F%84-Fine-tuning-%ED%95%A0-%EC%88%98-%EC%9E%88%EC%8A%B5%EB%8B%88%EB%8B%A4-with-PEFT-%F0%9F%A4%97 |
| Let's think step by step like humans! 🧠🤔 | https://cartinoe5930.tistory.com/entry/%ED%95%9C-%EB%8B%A8%EA%B3%84-%ED%95%9C-%EB%8B%A8%EA%B3%84%EC%94%A9-%EC%9D%B8%EA%B0%84%EC%B2%98%EB%9F%BC-%EC%83%9D%EA%B0%81%ED%95%B4%EB%B3%B4%EC%9E%90-%F0%9F%A7%A0%F0%9F%A4%94 |
| The evolution of fine-tuning methods: from fine-tuning to RLHF 🦖➡️🧑 | https://cartinoe5930.tistory.com/entry/Fine-tuning-method%EC%9D%98-%EC%A7%84%ED%99%94-%EA%B3%BC%EC%A0%95-%F0%9F%A6%96%E2%9E%A1%EF%B8%8F%F0%9F%A7%91 |
| It's time to fine-tune ChatGPT! ⏰ | https://cartinoe5930.tistory.com/entry/%EC%9D%B4%EC%A0%9C%EB%8A%94-ChatGPT%EB%A5%BC-fine-tuning-%ED%95%A0-%EC%8B%9C%EA%B0%84-%E2%8F%B0 |
| Noise makes LLMs better! - NEFTune 😉 | https://cartinoe5930.tistory.com/entry/Noise-makes-LLM-better-NEFTune-%F0%9F%98%89 |

## Natural Language Processing

### Word Embedding & Neural Networks

| Paper Title | Paper or Reference Site Link | Paper Review |
|---|---|---|
| Embedding Matrix | https://wikidocs.net/book/2155 | https://cartinoe5930.tistory.com/entry/Embedding-Matrix-%ED%95%99%EC%8A%B5 |
| LSTM: Long Short-Term Memory | https://colah.github.io/posts/2015-08-Understanding-LSTMs/ | https://cartinoe5930.tistory.com/entry/%EC%95%8C%EA%B8%B0-%EC%89%BD%EA%B2%8C-LSTM-networks-%EC%9D%B4%ED%95%B4%ED%95%98%EA%B8%B0 |
| GRU: Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation | https://arxiv.org/abs/1406.1078 | https://cartinoe5930.tistory.com/entry/GRU-Empirical-Evaluation-of-Gated-Recurrent-Neural-Networks-on-Sequence-Modeling-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| LSTM vs. GRU: Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling | https://arxiv.org/abs/1412.3555 | https://cartinoe5930.tistory.com/entry/LSTM-vs-GRU-%EB%AD%90%EA%B0%80-%EB%8D%94-%EB%82%98%EC%9D%84%EA%B9%8C-Empirical-Evaluation-of-Gated-Recurrent-Neural-Networks-on-Sequence-Modeling-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |

### Language Models 🤖

#### Basic 📖

| Paper Title | Paper or Reference Site Link | Paper Review |
|---|---|---|
| Transformer: Attention Is All You Need | https://arxiv.org/abs/1706.03762 | https://cartinoe5930.tistory.com/entry/Transformer-Attention-Is-All-You-Need-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| ELMo: Deep contextualized word representations | https://arxiv.org/abs/1802.05365 | https://cartinoe5930.tistory.com/entry/Pre-trained-Language-Modeling-paper-reading1-ELMo-Deep-contextualized-word-representations |
| BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding | https://arxiv.org/abs/1810.04805 | https://cartinoe5930.tistory.com/entry/Pre-trained-Language-Modeling-paper-reading2-BERT-Pre-training-of-Deep-Bidirectional-Transformers-for-Language-Understanding |
| GPT-1: Improving Language Understanding by Generative Pre-Training | https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf | https://cartinoe5930.tistory.com/entry/Pre-trained-Language-Modeling-paper-reading3-GPT-1-Improving-Language-Understanding-by-Generative-Pre-Training |
| GPT-2: Language Models are Unsupervised Multitask Learners | https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf | https://cartinoe5930.tistory.com/entry/GPT-2-Language-Models-are-Unsupervised-Multitask-Learners-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| GPT-3: Language Models are Few-Shot Learners | https://arxiv.org/abs/2005.14165 | https://cartinoe5930.tistory.com/entry/GPT-3-Language-Models-are-Few-Shot-Learners-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context | https://arxiv.org/abs/1901.02860 | https://cartinoe5930.tistory.com/entry/Transformer-XL-Attentive-Language-Models-Beyond-a-Fixed-Length-Context-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| Sparse Transformers: Generating Long Sequences with Sparse Transformers | https://arxiv.org/abs/1904.10509 | https://cartinoe5930.tistory.com/entry/Sparse-Transformers-Generating-Long-Sequence-with-Sparse-Transformers-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| XLNet: Generalized Autoregressive Pretraining for Language Understanding | https://arxiv.org/abs/1906.08237 | https://cartinoe5930.tistory.com/entry/XLNet-Generalized-Autoregressive-Pretraining-for-Language-Understanding-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| SpanBERT: Improving Pre-training by Representing and Predicting Spans | https://arxiv.org/abs/1907.10529 | https://cartinoe5930.tistory.com/entry/SpanBERT-Improving-Pre-training-by-Representing-and-Predicting-Spans-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| RoBERTa: A Robustly Optimized BERT Pre-training Approach | https://arxiv.org/abs/1907.11692 | https://cartinoe5930.tistory.com/entry/RoBERTa-A-Robustly-Optimized-BERT-Pretraining-Approach-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks | https://arxiv.org/abs/1908.10084 | https://cartinoe5930.tistory.com/entry/Sentence-BERT-Sentence-Embeddings-using-Siamese-BERT-Networks-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| ALBERT: A Lite BERT for Self-supervised Learning of Language Representations | https://arxiv.org/abs/1909.11942 | https://cartinoe5930.tistory.com/entry/ALBERT-A-Lite-BERT-for-Self-supervised-Learning-of-Language-Representations-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension | https://arxiv.org/abs/1910.13461 | https://cartinoe5930.tistory.com/entry/BART-Denoising-Sequence-to-Sequence-Pre-training-for-Natural-Language-Generation-Translation-and-Comprehension-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| Pre-LN Transformer: On Layer Normalization in the Transformer Architecture | https://arxiv.org/abs/2002.04745 | https://cartinoe5930.tistory.com/entry/Pre-LN-Transformer-On-Layer-Normalization-in-the-Transformer-Architecture-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| ELECTRA: Pre-training Text Encoders as Discriminators rather than Generators | https://arxiv.org/abs/2003.10555 | https://cartinoe5930.tistory.com/entry/ELECTRA-Pre-training-Text-Encoders-as-Discriminators-rather-than-Generators |
| Longformer: The Long-Document Transformer | https://arxiv.org/abs/2004.05150 | https://cartinoe5930.tistory.com/entry/Longformer-The-Long-Document-Transformer-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| BigBird: Transformers for Longer Sequences | https://arxiv.org/abs/2007.14062 | https://cartinoe5930.tistory.com/entry/BigBird-Transformers-for-Longer-Sequences-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| WebGPT: Browser-assisted question-answering with human feedback | https://arxiv.org/abs/2112.09332 | https://cartinoe5930.tistory.com/entry/WebGPT-Browser-assisted-question-answering-with-human-feedback-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| OPT: Open Pre-trained Transformer Language Models | https://arxiv.org/abs/2205.01068 | https://cartinoe5930.tistory.com/entry/OPT-Open-Pre-trained-Transformer-Language-Models-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| Mamba: Linear-Time Sequence Modeling with Selective State Spaces | https://arxiv.org/abs/2312.00752 | No plan! |
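
Most models in this table build on the Transformer's attention mechanism. As a quick reference, here is a minimal PyTorch sketch of the scaled dot-product attention from "Attention Is All You Need", `softmax(QK^T / sqrt(d_k)) V`; the shapes are illustrative:

```python
import math
import torch

def scaled_dot_product_attention(q, k, v):
    """q, k, v: (batch, seq_len, d_k) tensors."""
    d_k = q.size(-1)
    # Similarity scores, scaled by sqrt(d_k) to keep the softmax well-behaved.
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)
    weights = torch.softmax(scores, dim=-1)
    return weights @ v

q = k = v = torch.randn(2, 8, 64)
out = scaled_dot_product_attention(q, k, v)  # shape: (2, 8, 64)
```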

#### Efficient Models 💸

| Paper Title | Paper or Reference Site Link | Paper Review |
|---|---|---|
| TinyBERT: Distilling BERT for Natural Language Understanding | https://arxiv.org/abs/1909.10351 | https://cartinoe5930.tistory.com/entry/TinyBERT-Distilling-BERT-for-Natural-Language-Understanding-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| DistilBERT: a distilled version of BERT | https://arxiv.org/abs/1910.01108 | https://cartinoe5930.tistory.com/entry/DistilBERT-a-distilled-version-of-BERT-smaller-faster-cheaper-and-lighter-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| It's Not Just Size That Matters: Small Language Models are Also Few-Shot Learners (an application of PET) | https://arxiv.org/abs/2009.07118 | https://cartinoe5930.tistory.com/entry/Its-Not-Just-Size-That-Matters-Small-Language-Models-Are-Also-Few-Shot-Learners-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |

#### Open-source Language Models (Scaling Law) 🤗

| Paper Title | Paper or Reference Site Link | Paper Review |
|---|---|---|
| Chinchilla: Training Compute-Optimal Large Language Models | https://arxiv.org/abs/2203.15556 | https://cartinoe5930.tistory.com/entry/%EC%A7%80%EA%B8%88-%EA%B9%8C%EC%A7%80%EC%9D%98-LM-Scaling-Law%EC%97%90%EB%8A%94-%EB%AC%B8%EC%A0%9C%EC%A0%90%EC%9D%B4-%EC%9E%88%EB%8B%A4-%F0%9F%98%B6%E2%80%8D%F0%9F%8C%AB%EF%B8%8F-Chinchilla-Training-Compute-Optimal-Large-Language-Models-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling | https://arxiv.org/abs/2304.01373 | No plan! |
| LIMA: Less Is More for Alignment | https://arxiv.org/abs/2305.11206 | https://cartinoe5930.tistory.com/entry/LIMA-Less-Is-More-for-Alignment-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| LLaMA: Open and Efficient Foundation Language Models | https://arxiv.org/abs/2302.13971 | https://cartinoe5930.tistory.com/entry/LLaMA-Open-and-Efficient-Foundation-Language-Models-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| WizardLM: Empowering Large Language Models to Follow Complex Instructions | https://arxiv.org/abs/2304.12244 | https://cartinoe5930.tistory.com/entry/Open-domain-instruction%EC%9D%98-%ED%9A%A8%EA%B3%BC-%F0%9F%AA%84-WizardLM-Empowering-Large-Language-Models-to-Follow-Complex-Instructions-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| WizardCoder: Empowering Code Large Language Models with Evol-Instruct | https://arxiv.org/abs/2306.08568 | https://huggingface.co/WizardLM/WizardCoder-15B-V1.0 |
| WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct | https://arxiv.org/abs/2308.09583 | https://huggingface.co/WizardLM/WizardMath-70B-V1.0 |
| Alpaca: A Strong, Replicable Instruction-Following Model | https://crfm.stanford.edu/2023/03/13/alpaca.html | https://cartinoe5930.tistory.com/entry/Alpaca-A-Strong-Replicable-Instruction-Following-Model-%EB%A6%AC%EB%B7%B0 |
| Vicuna: An Open-Source Chatbot Impressing GPT-4 | https://lmsys.org/blog/2023-03-30-vicuna/ | https://cartinoe5930.tistory.com/entry/Vicuna-An-Open-Source-Chatbot-Impressing-GPT-4-%EB%A6%AC%EB%B7%B0 |
| Koala: A Dialogue Model for Academic Research | https://bair.berkeley.edu/blog/2023/04/03/koala/ | https://cartinoe5930.tistory.com/entry/%EC%A4%91%EC%9A%94%ED%95%9C-%EA%B1%B4-%EA%BA%BE%EC%9D%B4%EC%A7%80-%EC%95%8A%EB%8A%94-high-quality-data-Koala%F0%9F%90%A8-A-Dialogue-Model-for-Academic-Researc |
| Baize: An Open-Source Chat Model with Parameter-Efficient Tuning on Self-Chat Data | https://arxiv.org/abs/2304.01196 | https://cartinoe5930.tistory.com/entry/%F0%9F%90%B2Baize-An-Open-Source-Chat-Model-with-Parameter-Efficient-Tuning-on-Self-Chat-Data-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| Scaling Data-Constrained Language Models | https://arxiv.org/abs/2305.16264 | https://www.youtube.com/watch?v=TK0-sitkCMw&pp=ygUgaHR0cHM6Ly9hcnhpdi5vcmcvYWJzLzIzMDUuMTYyNjQ%3D |
| Falcon & RefinedWeb | https://arxiv.org/abs/2306.01116 | https://cartinoe5930.tistory.com/entry/Open-LLM-Leaderboard%EB%A5%BC-%ED%9C%A9%EC%93%B4-Falcon%F0%9F%A6%85-LLM-Falcon-RefinedWeb |
| Orca: Progressive Learning from Complex Explanation Traces of GPT-4 | https://arxiv.org/abs/2306.02707 | https://cartinoe5930.tistory.com/entry/%F0%9F%90%ACOrca-Progressive-Learning-from-Complex-Explanation-Traces-of-GPT-4-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| phi-1: Textbooks Are All You Need | https://arxiv.org/abs/2306.11644 | https://cartinoe5930.tistory.com/entry/%ED%95%84%EC%9A%94%ED%95%9C-%EA%B1%B4-%EC%98%A4%EC%A7%81-%EA%B5%90%EA%B3%BC%EC%84%9C-%EC%88%98%EC%A4%80%EC%9D%98-%EB%8D%B0%EC%9D%B4%ED%84%B0%EB%BF%90-%F0%9F%93%96-phi-1-Textbooks-Are-All-You-Need-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| AlpaGasus: Training a Better Alpaca with Fewer Data | https://arxiv.org/abs/2307.08701 | Will be uploaded later! |
| Llama 2: Open Foundation and Fine-Tuned Chat Models | https://arxiv.org/abs/2307.09288 | https://cartinoe5930.tistory.com/entry/The-hopes-of-researchers-Open-source-%F0%9F%A4%97-%EC%97%B0%EA%B5%AC%EC%9E%90%EB%93%A4%EC%9D%98-%ED%9D%AC%EB%A7%9D-Open-source-%F0%9F%A4%97 |
| Platypus: Quick, Cheap, and Powerful Refinement of LLMs | https://arxiv.org/abs/2308.07317 | Will be uploaded later! |
| Code Llama: Open Foundation Models for Code | https://arxiv.org/abs/2308.12950 | No plan! |
| FLM-101B: An Open LLM and How to Train It with $100K Budget | https://arxiv.org/abs/2309.03852 | No plan! |
| Textbooks Are All You Need II: phi-1.5 technical report | https://arxiv.org/abs/2309.05463 | https://huggingface.co/microsoft/phi-1_5 |
| OpenChat: Advancing Open-Source Language Models with Mixed-Quality Data | https://arxiv.org/abs/2309.11235 | https://github.com/imoneoi/openchat |
| Mistral 7B | https://arxiv.org/abs/2310.06825 | https://mistral.ai/news/announcing-mistral-7b/ |
| Prometheus: Inducing Fine-grained Evaluation Capability in Language Models | https://arxiv.org/abs/2310.08491 | https://huggingface.co/papers/2310.08491#652a8e7f30355beba68c1be6 |
| Zephyr: Direct Distillation of LM Alignment | https://arxiv.org/abs/2310.16944 | https://www.youtube.com/watch?v=TkZBg3mKsIo |
| Orca 2: Teaching Small Language Models How to Reason | https://arxiv.org/abs/2311.11045 | https://www.microsoft.com/en-us/research/blog/orca-2-teaching-small-language-models-how-to-reason/ |
| The Falcon Series of Open Language Models | https://arxiv.org/abs/2311.16867 | No plan! |
| SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling | https://arxiv.org/abs/2312.15166 | No plan! |

#### Large Language Models (LLMs) 💣

| Paper Title | Paper or Reference Site Link | Paper Review |
|---|---|---|
| LaMDA: Language Models for Dialog Applications | blog: https://ai.googleblog.com/2022/01/lamda-towards-safe-grounded-and-high.html, paper: https://arxiv.org/abs/2201.08239 | https://cartinoe5930.tistory.com/entry/%EA%B5%AC%EA%B8%80%EC%9D%98-%EC%B5%9C%EA%B0%95-%EC%B1%97%EB%B4%87-LaMDA%EC%97%90-%EB%8C%80%ED%95%B4-%EC%95%8C%EC%95%84%EB%B3%B4%EC%9E%90-Language-Models-for-Dialog-Applications-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| PaLM: Scaling Language Modeling with Pathways | blog: https://ai.googleblog.com/2022/04/pathways-language-model-palm-scaling-to.html, paper: https://arxiv.org/abs/2204.02311 | 1: https://cartinoe5930.tistory.com/entry/LaMDA%EC%9D%98-%EB%92%A4%EB%A5%BC-%EC%9E%87%EB%8A%94-Pathways%EB%A5%BC-%ED%99%9C%EC%9A%A9%ED%95%9C-%EC%B4%88%EA%B1%B0%EB%8C%80-%EC%96%B8%EC%96%B4-%EB%AA%A8%EB%8D%B8-PaLM-%EB%A6%AC%EB%B7%B0, 2: https://cartinoe5930.tistory.com/entry/LaMDA%EC%9D%98-%EB%92%A4%EB%A5%BC-%EC%9E%87%EB%8A%94-Pathways%EB%A5%BC-%EC%82%AC%EC%9A%A9%ED%95%9C-%EC%B4%88%EA%B1%B0%EB%8C%80-%EC%96%B8%EC%96%B4-%EB%AA%A8%EB%8D%B8-PaLM-%EB%A6%AC%EB%B7%B02 |
| GPT-4 Technical Report | blog: https://openai.com/research/gpt-4, paper: https://arxiv.org/abs/2303.08774 | https://cartinoe5930.tistory.com/entry/GPT-4-Techinal-Report-Review |
| Gemini: A Family of Highly Capable Multimodal Models | https://arxiv.org/abs/2312.11805 | No plan! |
| AlphaCode 2 Technical Report | https://storage.googleapis.com/deepmind-media/AlphaCode2/AlphaCode2_Tech_Report.pdf | No plan! |

### Fine-tuning

#### Instruction-tuning 🧑‍🏫

| Paper Title | Paper or Reference Site Link | Paper Review |
|---|---|---|
| FLAN: Fine-tuned Language Models are Zero-shot Learners | https://arxiv.org/abs/2109.01652 | https://cartinoe5930.tistory.com/entry/FLAN-Fine-tuned-Language-Models-are-Zero-shot-Learners-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| T0: Multitask Prompted Training Enables Zero-shot Task Generalization | https://arxiv.org/abs/2110.08207 | https://cartinoe5930.tistory.com/entry/T0-Multitask-Prompted-Training-Enables-Zero-shot-Task-Generalization-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| Super-Natural Instructions: Generalization via Declarative Instructions on 1600+ NLP Tasks | https://arxiv.org/abs/2204.07705 | https://cartinoe5930.tistory.com/entry/Super-Natural-Instructions-Generalization-via-Declarative-Instructions-on-1600-NLP-Tasks-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| Unnatural Instructions: Tuning Language Models with (Almost) No Human Labor | https://arxiv.org/abs/2212.09689 | Will be uploaded later! |
| Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-shot Learners | https://arxiv.org/abs/2210.02969 | https://cartinoe5930.tistory.com/entry/Guess-the-Instruction-Flipped-Learning-Makes-Language-Models-Stronger-Zero-shot-Learners-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| Scaling Instruction-Finetuned Language Models | https://arxiv.org/abs/2210.11416 | https://cartinoe5930.tistory.com/entry/Scaling-Instruction-Finetuned-Language-Models-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| Exploring the Benefits of Training Expert Language Models over Instruction Tuning | https://arxiv.org/abs/2302.03202 | https://cartinoe5930.tistory.com/entry/Exploring-the-Benefits-of-Training-Expert-Language-Models-over-Instruction-Tuning-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| ICIL: In-Context Instruction Learning | https://arxiv.org/abs/2302.14691 | https://cartinoe5930.tistory.com/entry/ICIL-In-Context-Instruction-Learning-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| Instruction Tuning with GPT-4 | https://arxiv.org/abs/2304.03277 | https://cartinoe5930.tistory.com/entry/Instruction-Tuning-with-GPT-4-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| FIP: Fixed Input Parameterization for Efficient Prompting | https://aclanthology.org/2023.findings-acl.533.pdf | Will be uploaded later! |
| FlaCuna: Unleashing the Problem-Solving Power of Vicuna using FLAN Fine-tuning | https://arxiv.org/abs/2307.02053 | Will be uploaded later! |
| Maybe Only 0.5% Data Is Needed: A Preliminary Exploration of Low Training Data Instruction Tuning | https://arxiv.org/abs/2305.09246 | Will be uploaded later! |
| Becoming self-instruct: introducing early stopping criteria for minimal instruct tuning | https://arxiv.org/abs/2307.03692 | Will be uploaded later! |
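
To make the instruction-tuning setup above concrete, here is a toy training record in the Alpaca-style instruction/input/output format; the exact template wording varies between projects, so treat this as an illustrative assumption rather than a fixed standard:

```python
# One illustrative instruction-tuning record (Alpaca-style format).
example = {
    "instruction": "Summarize the following paragraph in one sentence.",
    "input": "Instruction tuning fine-tunes a language model on (instruction, response) pairs...",
    "output": "Instruction tuning teaches a model to follow natural-language task descriptions.",
}

# The prompt the model sees; it is trained to generate example["output"] after it.
prompt = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request.\n\n"
    f"### Instruction:\n{example['instruction']}\n\n"
    f"### Input:\n{example['input']}\n\n"
    "### Response:\n"
)
```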

#### Reinforcement Learning from Human Feedback (RLHF) 👥

| Paper Title | Paper or Reference Site Link | Paper Review |
|---|---|---|
| RLHF (Reinforcement Learning from Human Feedback) | https://huggingface.co/blog/rlhf | https://cartinoe5930.tistory.com/entry/%EC%82%AC%EB%9E%8C%EC%9D%98-%ED%94%BC%EB%93%9C%EB%B0%B1%EC%9D%84-%ED%86%B5%ED%95%9C-%EA%B0%95%ED%99%94%ED%95%99%EC%8A%B5-Reinforcement-Learning-from-Human-Feedback-RLHF |
| Red Teaming Language Models with Language Models | https://arxiv.org/abs/2202.03286 | https://cartinoe5930.tistory.com/entry/Red-Teaming-Language-Models-with-Language-Models-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| InstructGPT: Training language models to follow instructions with human feedback | https://arxiv.org/abs/2203.02155 | https://cartinoe5930.tistory.com/entry/InstructGPT-Training-language-models-to-follow-instructions-with-human-feedback-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| Training a helpful and harmless assistant with reinforcement learning from human feedback | https://arxiv.org/abs/2204.05862 | https://cartinoe5930.tistory.com/entry/Training-a-helpful-and-harmless-assistant-with-reinforcement-learning-from-human-feedback-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| AlpacaFarm: A Simulation Framework for Methods that Learn from Human Feedback | https://arxiv.org/abs/2305.14387 | Will be uploaded later! |
| ALMoST: Aligning Large Language Models through Synthetic Feedback | https://arxiv.org/abs/2305.13735 | https://cartinoe5930.tistory.com/entry/Aligning-Large-Language-Models-through-Synthetic-Feedback-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback | https://arxiv.org/abs/2307.15217 | Will be uploaded later! |
| RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback | https://arxiv.org/abs/2309.00267 | No plan! |
| SteerLM: Attribute Conditioned SFT as an (User-Steerable) Alternative to RLHF | https://arxiv.org/abs/2310.05344 | No plan! |
| HelpSteer: Multi-attribute Helpfulness Dataset for SteerLM | https://arxiv.org/abs/2311.09528 | No plan! |
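
A core ingredient shared by several papers above (e.g., InstructGPT) is a reward model trained on pairwise human preferences. A minimal sketch of that pairwise (Bradley-Terry) loss, assuming PyTorch and scalar rewards already produced by the model:

```python
import torch
import torch.nn.functional as F

def reward_model_loss(r_chosen: torch.Tensor, r_rejected: torch.Tensor) -> torch.Tensor:
    """r_chosen, r_rejected: (batch,) scalar rewards for preferred / rejected responses."""
    # -log sigmoid(r_chosen - r_rejected): pushes the preferred response's reward higher.
    return -F.logsigmoid(r_chosen - r_rejected).mean()

loss = reward_model_loss(torch.tensor([1.2, 0.3]), torch.tensor([0.4, 0.9]))
```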

#### Efficient-tuning ✨

| Paper Title | Paper or Reference Site Link | Paper Review |
|---|---|---|
| Adapter: Parameter-Efficient Transfer Learning for NLP | https://arxiv.org/abs/1902.00751 | https://cartinoe5930.tistory.com/entry/%EB%8B%B9%EC%8B%A0%EB%8F%84-Fine-tuning-%ED%95%A0-%EC%88%98-%EC%9E%88%EC%8A%B5%EB%8B%88%EB%8B%A4-with-PEFT-%F0%9F%A4%97 |
| Prefix-Tuning: Optimizing Continuous Prompts for Generation | https://arxiv.org/abs/2101.00190 | https://cartinoe5930.tistory.com/entry/%EB%8B%B9%EC%8B%A0%EB%8F%84-Fine-tuning-%ED%95%A0-%EC%88%98-%EC%9E%88%EC%8A%B5%EB%8B%88%EB%8B%A4-with-PEFT-%F0%9F%A4%97 |
| LoRA: Low-Rank Adaptation of Large Language Models | https://arxiv.org/abs/2106.09685 | https://cartinoe5930.tistory.com/entry/%EB%8B%B9%EC%8B%A0%EB%8F%84-Fine-tuning-%ED%95%A0-%EC%88%98-%EC%9E%88%EC%8A%B5%EB%8B%88%EB%8B%A4-with-PEFT-%F0%9F%A4%97 |
| Towards a Unified View of Parameter-Efficient Transfer Learning | https://arxiv.org/abs/2110.04366 | Will be uploaded later! |
| UniPELT: A Unified Framework for Parameter-Efficient Language Model Tuning | https://arxiv.org/abs/2110.07577 | Will be uploaded later! |
| (IA)^3: Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning | https://arxiv.org/abs/2205.05638 | Will be uploaded later! |
| QLoRA: Efficient Fine-tuning of Quantized LLMs | https://arxiv.org/abs/2305.14314 | Will be uploaded later! |
| Stack More Layers Differently: High-Rank Training Through Low-Rank Updates | https://arxiv.org/abs/2307.05695 | Will be uploaded later! |
| LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition | https://arxiv.org/abs/2307.13269 | Will be uploaded later! |
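
For intuition about how methods like LoRA above work, here is a minimal sketch, assuming PyTorch: the pretrained weight is frozen and only a low-rank update `(alpha / r) * B @ A` is trained. Per the LoRA paper, `A` is randomly initialized and `B` starts at zero, so training begins exactly at the base model:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)  # freeze the pretrained weight (and bias)
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init
        self.scaling = alpha / r

    def forward(self, x):
        # Frozen path plus trainable low-rank update.
        return self.base(x) + self.scaling * (x @ self.lora_A.T @ self.lora_B.T)

layer = LoRALinear(nn.Linear(768, 768))
out = layer(torch.randn(4, 768))  # only lora_A and lora_B receive gradients
```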

### Dataset 💫

| Paper Title | Paper or Reference Site Link | Paper Review |
|---|---|---|
| Instruction Mining: High-quality Instruction Data Selection for Large Language Models | https://arxiv.org/abs/2307.06290 | No plan! |
| SODA: Million-scale Dialogue Distillation with Social Commonsense Contextualization | https://arxiv.org/abs/2212.10465 | No plan! |
| MoDS: Model-oriented Data Selection for Instruction Tuning | https://arxiv.org/abs/2311.15653 | No plan! |
| Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models | https://arxiv.org/abs/2312.06585 | No plan! |
| Magicoder: Source Code Is All You Need | https://arxiv.org/abs/2312.02120 | No plan! |
| WaveCoder: Widespread and Versatile Enhanced Instruction Tuning with Refined Data Generation | https://arxiv.org/abs/2312.14187 | No plan! |
| What Makes Good Data for Alignment: A Comprehensive Study of Automatic Data Selection in Instruction Tuning | https://arxiv.org/abs/2312.15685 | No plan! |

### Prompt Engineering 👨‍🔧

| Paper Title | Paper or Reference Site Link | Paper Review |
|---|---|---|
| What is 'Prompt Engineering'? | See my blog! | https://cartinoe5930.tistory.com/entry/Prompt-Engineering%EC%9D%B4-%EB%AC%B4%EC%97%87%EC%9D%BC%EA%B9%8C |
| CoT: Chain-of-Thought Prompting Elicits Reasoning in Large Language Models | blog: https://ai.googleblog.com/2022/05/language-models-perform-reasoning-via.html, paper: https://arxiv.org/abs/2201.11903 | https://cartinoe5930.tistory.com/entry/LM%EC%9D%B4-%EC%82%AC%EB%9E%8C%EA%B3%BC-%EC%9C%A0%EC%82%AC%ED%95%9C-%EC%83%9D%EA%B0%81-%ED%94%84%EB%A1%9C%EC%84%B8%EC%8A%A4%EB%A5%BC-%EA%B0%80%EC%A7%80%EA%B2%8C-%EB%90%9C%EB%8B%A4%EB%A9%B4-Chain-of-Thought-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| Zero-shot CoT: Large Language Models Are Zero-shot Reasoners | https://arxiv.org/abs/2205.11916 | https://cartinoe5930.tistory.com/entry/Large-Language-Models-are-Zero-Shot-Reasoners-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| Language Models are Multilingual Chain-of-Thought Reasoners | https://arxiv.org/abs/2210.03057 | Will be uploaded later! |
| Auto-CoT: Automatic Chain of Thought Prompting in Large Language Models | https://arxiv.org/abs/2210.03493 | Will be uploaded later! |
| CoT KD: Teaching Small Language Models to Reason | https://arxiv.org/abs/2212.08410 | Will be uploaded later! |
| ToT: Tree of Thoughts: Deliberate Problem Solving with Large Language Models | https://arxiv.org/abs/2305.10601 | https://cartinoe5930.tistory.com/entry/Tree-of-Thoughts-Deliberate-Problem-Solving-with-Large-Language-Models-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning | https://arxiv.org/abs/2305.14045 | https://cartinoe5930.tistory.com/entry/CoT-Collection-Improving-Zero-shot-and-Few-shot-Learning-of-Language-Models-via-Chain-of-Thought-Fine-tuning-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| Let's Verify Step by Step | https://arxiv.org/abs/2305.20050 | https://cartinoe5930.tistory.com/entry/Lets-verify-step-by-step-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| Measuring Faithfulness in Chain-of-Thought Reasoning | https://arxiv.org/abs/2307.13702 | Will be uploaded later! |
| SoT: Skeleton-of-Thought: Large Language Models Can Do Parallel Decoding | https://arxiv.org/abs/2307.15337 | Will be uploaded later! |
| Graph of Thoughts: Solving Elaborate Problems with Large Language Models | https://arxiv.org/abs/2308.09687 | Will be uploaded later! |
| From Sparse to Dense: GPT-4 Summarization with Chain of Density Prompting | https://arxiv.org/abs/2309.04269 | No plan! |
| Chain-of-Verification Reduces Hallucination in Large Language Models | https://arxiv.org/abs/2309.11495 | https://www.youtube.com/watch?v=l0zFjwRegog&pp=ygUgaHR0cHM6Ly9hcnhpdi5vcmcvYWJzLzIzMDkuMTE0OTU%3D |
| Contrastive Chain-of-Thought Prompting | https://arxiv.org/abs/2311.09277 | No plan! |
| Thread of Thought Unraveling Chaotic Contexts | https://arxiv.org/abs/2311.08734 | No plan! |
| System 2 Attention (Is Something You Might Need Too) | https://arxiv.org/abs/2311.11829 | No plan! |
| Chain of Code: Reasoning with a Language Model-Augmented Code Emulator | https://arxiv.org/abs/2312.04474 | No plan! |
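
As a concrete illustration of the zero-shot CoT row above: simply appending "Let's think step by step." to the question elicits intermediate reasoning. A toy sketch, where `query_llm` is a hypothetical client function, not a real API:

```python
def build_zero_shot_cot_prompt(question: str) -> str:
    # The trigger phrase from "Large Language Models Are Zero-shot Reasoners".
    return f"Q: {question}\nA: Let's think step by step."

prompt = build_zero_shot_cot_prompt(
    "Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
    "Each can has 3 tennis balls. How many tennis balls does he have now?"
)
# answer = query_llm(prompt)  # hypothetical call; the expected answer is 11 (5 + 2 * 3)
print(prompt)
```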

### Model Efficiency 🧰

| Paper Title | Paper | Paper Review |
|---|---|---|
| FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness | https://arxiv.org/abs/2205.14135 | https://gordicaleksa.medium.com/eli5-flash-attention-5c44017022ad |
| Exponentially Faster Language Modeling | https://arxiv.org/abs/2311.10770 | No plan! |
| LLM in a flash: Efficient Large Language Model Inference with Limited Memory | https://arxiv.org/abs/2312.11514 | No plan! |

### Method 📝

| Paper Title | Paper or Reference Site Link | Paper Review |
|---|---|---|
| Data Augmentation in NLP | blogs: https://neptune.ai/blog/data-augmentation-nlp, https://amitness.com/2020/05/data-augmentation-for-nlp/ | https://cartinoe5930.tistory.com/entry/Data-Augmentation-methods-in-NLP |
| PET: Exploiting Cloze Questions for Few Shot Text Classification and Natural Language Inference | https://arxiv.org/abs/2001.07676 | https://cartinoe5930.tistory.com/entry/PET-Exploiting-Cloze-Questions-for-Few-Shot-Text-Classification-and-Natural-Language-Inference-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| Pathways | https://blog.google/technology/ai/introducing-pathways-next-generation-ai-architecture/ | https://cartinoe5930.tistory.com/entry/%EB%A7%8C%EC%95%BD-%EB%AA%A8%EB%8D%B8%EC%9D%B4-%EC%97%AC%EB%9F%AC-%EA%B0%90%EA%B0%81%EC%9D%84-%EB%8A%90%EB%82%84-%EC%88%98-%EC%9E%88%EA%B2%8C-%EB%90%9C%EB%8B%A4%EB%A9%B4-Pathways-%EB%A6%AC%EB%B7%B0 |
| LMSI: Large Language Models Can Self-Improve | https://arxiv.org/abs/2210.11610 | https://cartinoe5930.tistory.com/entry/LMSI-Large-Language-Models-can-Self-Improve-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| Self-Instruct: Aligning Language Models with Self-Generated Instructions | https://arxiv.org/abs/2212.10560 | https://cartinoe5930.tistory.com/entry/Self-Instruct-Aligning-Language-Model-with-Self-Generated-Instructions-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| Reflexion: Language Agents with Verbal Reinforcement Learning | https://arxiv.org/abs/2303.11366 | https://cartinoe5930.tistory.com/entry/Reflexion-Language-Agents-with-Verbal-Reinforcement-Learning-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| Self-Refine: Iterative Refinement with Self-Feedback | https://arxiv.org/abs/2303.17651 | https://cartinoe5930.tistory.com/entry/Self-Refine-Iterative-Refinement-with-Self-Feedback-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| REFINER: Reasoning Feedback on Intermediate Representations | https://arxiv.org/abs/2304.01904 | No plan! |
| SelFee: Iterative Self-Revising LLM Empowered by Self-Feedback Generation | https://kaistai.github.io/SelFee/ | https://cartinoe5930.tistory.com/entry/SelFee-Iterative-Self-Revising-LLM-Expowered-by-Self-Feedback-Generation-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints | https://arxiv.org/abs/2305.13245 | https://aliissa99.medium.com/-a596e4d86f79 |
| Shepherd: A Critic for Language Model Generation | https://arxiv.org/abs/2308.04592 | Will be uploaded later! |
| Self-Alignment with Instruction Backtranslation | https://arxiv.org/abs/2308.06259 | Will be uploaded later! |
| SCREWS: A Modular Framework for Reasoning with Revisions | https://arxiv.org/abs/2309.13075 | No plan! |
| NEFTune: Noisy Embeddings Improve Instruction Finetuning | https://arxiv.org/abs/2310.05914 | https://cartinoe5930.tistory.com/entry/Noise-makes-LLM-better-NEFTune-%F0%9F%98%89 |
| Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch | https://arxiv.org/abs/2311.03099 | No plan! |
| LoRAMoE: Revolutionizing Mixture of Experts for Maintaining World Knowledge in Language Model Alignment | https://arxiv.org/abs/2312.09979 | No plan! |
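
The NEFTune row above has a particularly compact core idea. A hedged sketch, assuming PyTorch: during fine-tuning, uniform noise scaled by `alpha / sqrt(L * d)` is added to the token embeddings (the paper uses `alpha` around 5-15); at inference time the embeddings are left untouched:

```python
import math
import torch

def neftune_embeddings(embeds: torch.Tensor, alpha: float = 5.0) -> torch.Tensor:
    """embeds: (batch, L, d) token embeddings; apply only during training."""
    L, d = embeds.shape[-2], embeds.shape[-1]
    scale = alpha / math.sqrt(L * d)
    noise = torch.empty_like(embeds).uniform_(-1.0, 1.0)  # Uniform(-1, 1) noise
    return embeds + scale * noise

noisy = neftune_embeddings(torch.randn(2, 128, 4096))
```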

### Retrieval Augmented Generation (RAG) 📚

| Paper Title | Paper or Reference Site Link | Paper Review |
|---|---|---|
| Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks | https://arxiv.org/abs/2005.11401 | No plan! |
| Self-RAG: Learning to Retrieve, Generate, and Critique Through Self-Reflection | https://arxiv.org/abs/2310.11511 | No plan! |
| InstructRetro: Instruction Tuning Post Retrieval-Augmented Pretraining | https://arxiv.org/abs/2310.07713 | No plan! |
| Retrieval-Augmented Generation for Large Language Models: A Survey | https://arxiv.org/abs/2312.10997 | No plan! |
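
The common skeleton behind the RAG papers above is: embed the query, retrieve the most similar documents, and condition generation on them. A minimal sketch under stated assumptions: the bag-of-words `embed` and the commented-out `generate` call are illustrative stand-ins for a learned embedder and an LLM:

```python
from collections import Counter

def embed(text: str) -> Counter:
    # Toy embedding: bag-of-words counts (a real system uses a learned embedder).
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm_a = sum(v * v for v in a.values()) ** 0.5
    norm_b = sum(v * v for v in b.values()) ** 0.5
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

corpus = [
    "LoRA freezes the base model and trains low-rank adapter matrices.",
    "NEFTune adds uniform noise to embeddings during instruction tuning.",
]

def retrieve(query: str, k: int = 1) -> list:
    q = embed(query)
    return sorted(corpus, key=lambda doc: cosine(q, embed(doc)), reverse=True)[:k]

question = "How does LoRA fine-tune a model?"
context = "\n".join(retrieve(question))
prompt = f"Answer using the context below.\n\nContext:\n{context}\n\nQuestion: {question}"
# answer = generate(prompt)  # hypothetical LLM call
```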

### Benchmarks 🏆 & Evaluation Metrics ⚔️

| Paper Title | Paper or Reference Site Link | Paper Review |
|---|---|---|
| BIG-Bench Hard: Challenging BIG-Bench tasks and whether chain-of-thought can solve them | https://arxiv.org/abs/2210.09261 | Will be uploaded later! |
| Large Language Models are not Fair Evaluators | https://arxiv.org/abs/2305.17926 | Will be uploaded later! |
| MT-Bench: Judging LLM-as-a-Judge with MT-Bench | https://arxiv.org/abs/2306.05685 | Will be uploaded later! |
| InstructEval: Towards Holistic Evaluation of Instruction-Tuned Large Language Models | https://arxiv.org/abs/2306.04757 | Will be uploaded later! |
| FLASK: Fine-grained Language Model Evaluation based on Alignment Skill Sets | https://arxiv.org/abs/2307.10928 | Will be uploaded later! |
| GAIA: A Benchmark for General AI Assistants | https://arxiv.org/abs/2311.12983 | No plan! |

### Context of LLM 📜

| Paper Title | Paper or Reference Site Link | Paper Review |
|---|---|---|
| A Length-Extrapolatable Transformer | https://arxiv.org/abs/2212.10554 | No plan! |
| Extending Context Window of Large Language Models via Positional Interpolation | https://arxiv.org/abs/2306.15595 | https://cartinoe5930.tistory.com/entry/LM%EC%9D%98-context-window-%EA%B8%B8%EC%96%B4%EC%95%BC-%ED%95%A0%EA%B9%8C-%EC%A7%A7%EC%95%84%EC%95%BC-%ED%95%A0%EA%B9%8C-%F0%9F%93%8F%F0%9F%A4%A8 |
| LongNet: Scaling Transformers to 1,000,000,000 Tokens | https://arxiv.org/abs/2307.02486 | https://cartinoe5930.tistory.com/entry/LM%EC%9D%98-context-window-%EA%B8%B8%EC%96%B4%EC%95%BC-%ED%95%A0%EA%B9%8C-%EC%A7%A7%EC%95%84%EC%95%BC-%ED%95%A0%EA%B9%8C-%F0%9F%93%8F%F0%9F%A4%A8 |
| Lost in the Middle: How Language Models Use Long Contexts | https://arxiv.org/abs/2307.03172 | https://cartinoe5930.tistory.com/entry/LM%EC%9D%98-context-window-%EA%B8%B8%EC%96%B4%EC%95%BC-%ED%95%A0%EA%B9%8C-%EC%A7%A7%EC%95%84%EC%95%BC-%ED%95%A0%EA%B9%8C-%F0%9F%93%8F%F0%9F%A4%A8 |
| YaRN: Efficient Context Window Extension of Large Language Models | https://arxiv.org/abs/2309.00071 | No plan! |
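
To make the positional-interpolation idea above concrete: rather than extrapolating rotary position embeddings to positions never seen in training, positions are rescaled into the trained range (`m -> m * L_train / L_target`) before computing the rotary angles. A minimal sketch, assuming PyTorch:

```python
import torch

def interpolated_rope_angles(seq_len: int, dim: int,
                             trained_len: int = 2048, base: float = 10000.0) -> torch.Tensor:
    # Shrink positions into [0, trained_len) when the sequence exceeds the trained length.
    scale = min(1.0, trained_len / seq_len)
    positions = torch.arange(seq_len, dtype=torch.float32) * scale
    inv_freq = base ** (-torch.arange(0, dim, 2, dtype=torch.float32) / dim)
    return torch.outer(positions, inv_freq)  # (seq_len, dim // 2) rotation angles

angles = interpolated_rope_angles(seq_len=8192, dim=128)  # 8192 positions squeezed into [0, 2048)
```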

### Analysis 🔬

| Paper Title | Paper or Reference Site Link | Paper Review |
|---|---|---|
| Why can GPT learn in-context? | https://arxiv.org/abs/2212.10559 | https://cartinoe5930.tistory.com/entry/Why-can-GPT-learn-in-context-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| Sparks of Artificial General Intelligence: Early experiments with GPT-4 | paper: https://arxiv.org/abs/2303.12712, youtube: https://www.youtube.com/watch?v=Mqg3aTGNxZ0 | https://cartinoe5930.tistory.com/entry/Sparks-of-Artificial-General-Intelligence-Early-experiments-with-GPT-4-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| The False Promise of Imitating Proprietary LLMs | https://arxiv.org/abs/2305.15717 | https://cartinoe5930.tistory.com/entry/%EA%B8%B0%EC%A1%B4-imitation-model%EC%9D%80-%EC%9E%98%EB%AA%BB-%ED%95%99%EC%8A%B5%EB%90%98%EA%B3%A0-%EC%9E%88%EB%8B%A4-%F0%9F%AB%A2-The-False-Promise-of-Imitating-Proprietary-L |
| TULU: How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources | https://arxiv.org/abs/2306.04751 | Will be uploaded later! |
| How Is ChatGPT's Behavior Changing over Time? | https://arxiv.org/abs/2307.09009 | https://cartinoe5930.tistory.com/entry/ChatGPT%EC%9D%98-%EC%84%B1%EB%8A%A5%EC%9D%B4-%EC%95%88-%EC%A2%8B%EC%95%84%EC%A7%80%EA%B3%A0-%EC%9E%88%EB%8B%A4%EA%B5%AC-%F0%9F%98%B2%F0%9F%98%B2 |
| Large Language Models Cannot Self-Correct Reasoning Yet | https://arxiv.org/abs/2310.01798 | |
| How Far Are Large Language Models from Agents with Theory-of-Mind? | https://arxiv.org/abs/2310.03051 | No plan! |
| Can LLMs Follow Simple Rules? | https://arxiv.org/abs/2311.04235 | https://www.youtube.com/watch?v=CY6o43037OY |
| Camels in a Changing Climate: Enhancing LM Adaptation with Tulu 2 | https://arxiv.org/abs/2311.10702 | No plan! |
| ChatGPT's One-year Anniversary: Are Open-Source Large Language Models Catching Up? | https://arxiv.org/abs/2311.15653 | No plan! |
| An In-depth Look at Gemini's Language Abilities | https://arxiv.org/abs/2312.11444 | No plan! |

### Interesting 🫣

| Paper Title | Paper or Reference Site Link | Paper Review |
|---|---|---|
| DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature | https://arxiv.org/abs/2301.11305 | https://cartinoe5930.tistory.com/entry/%EC%9D%B4-%EA%B8%80%EC%9D%B4-LM%EC%9D%B4-%EB%A7%8C%EB%93%A4%EC%96%B4%EB%82%B8-%EA%B8%80%EC%9D%BC%EA%B9%8C-%EB%8F%84%EC%99%80%EC%A4%98-DetectGPT-DetectGPT-Zero-Shot-Machine-Generated-Text-Detection-using-Probability-Curvature-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback | https://arxiv.org/abs/2302.12813 | https://cartinoe5930.tistory.com/entry/ChatGPT%EC%9D%98-hallucination-%EC%96%B4%EB%96%BB%EA%B2%8C-%ED%95%B4%EA%B2%B0%ED%95%B4%EC%95%BC-%ED%95%A0%EA%B9%8C-Check-Your-Facts-and-Try-Again-Improving-Large-Language-Models-with-External-Knowledge-and-Automated-Feedback |
| RecurrentGPT: Interactive Generation of (Arbitrarily) Long Text | https://arxiv.org/abs/2305.13304 | https://cartinoe5930.tistory.com/entry/ChatGPT%EC%97%90-%EB%B0%98%EB%B3%B5-%EB%A9%94%EC%BB%A4%EB%8B%88%EC%A6%98LSTM%EC%9D%84-%EC%82%AC%EC%9A%A9%ED%95%9C%EB%8B%A4%EB%A9%B4-RecurrentGPT-Interactive-Generation-of-Arbitrarily-Long-Text-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| Large Language Models as Tool Makers | https://arxiv.org/abs/2305.17126 | https://cartinoe5930.tistory.com/entry/LM%EC%9D%B4-%EB%8F%84%EA%B5%AC%EB%A5%BC-%EC%82%AC%EC%9A%A9%ED%95%98%EA%B2%8C-%EB%90%9C%EB%8B%A4%EB%A9%B4-%F0%9F%94%AC-Large-Language-Models-as-Tool-Makers-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| LLM-Blender: Ensembling Large Language Models with Pairwise Ranking and Generative Fusion | https://arxiv.org/abs/2306.02561 | No plan! |
| Knowledge Distillation of Large Language Models | https://arxiv.org/abs/2306.08543 | https://cartinoe5930.tistory.com/entry/KD%EC%97%90-%EC%82%B4%EC%A7%9D%EC%9D%98-%EB%B3%80%ED%99%94%EB%A5%BC-%EC%A4%98%EB%B3%B4%EC%9E%90-%F0%9F%98%9C-Knowledge-Distillation-of-Large-Language-Models-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| Scaling Relationship on Learning Mathematical Reasoning with Large Language Models | https://arxiv.org/abs/2308.01825 | Will be uploaded later! |
| ToolLLM: Facilitating Large Language Models to Master 16000+ Real-World APIs | https://arxiv.org/abs/2307.16789 | Will be uploaded later! |
| SelfCheck: Using LLMs to Zero-shot Check Their Own Step-by-Step Reasoning | https://arxiv.org/abs/2308.00436 | Will be uploaded later! |
| Solving Challenging Math Word Problems Using GPT-4 Code Interpreter with Code-based Self-Verification | https://arxiv.org/abs/2308.07921 | Will be uploaded later! |
| Large Language Models as Optimizers | https://arxiv.org/abs/2309.03409 | No plan! |
| FIAT: Fusing Learning Paradigms with Instruction-Accelerated Tuning | https://arxiv.org/abs/2309.04663 | https://www.youtube.com/watch?v=EZsZEcRDte0&pp=ygUgaHR0cHM6Ly9hcnhpdi5vcmcvYWJzLzIzMDkuMDQ2NjM%3D |
| Contrastive Decoding Improves Reasoning in Large Language Models | https://arxiv.org/abs/2309.09117 | https://www.youtube.com/watch?v=nMR56TkwC1Q&pp=ygUgaHR0cHM6Ly9hcnhpdi5vcmcvYWJzLzIzMDkuMDkxMTc%3D |
| Think before you speak: Training Language Models with Pause Tokens | https://arxiv.org/abs/2310.02226 | https://www.youtube.com/watch?v=MtJ1jacr_yI |
| Large Language Models Can Learn Rules | https://arxiv.org/abs/2310.07064 | No plan! |
| In-context Pretraining: Language Modeling Beyond Document Boundaries | https://arxiv.org/abs/2310.10638 | https://www.youtube.com/watch?v=GI-0lAaILrU |
| Learning From Mistakes Makes LLM Better Reasoner | https://arxiv.org/abs/2310.20689 | No plan! |
| Language Models can be Logical Solvers | https://arxiv.org/abs/2311.06158 | No plan! |
| MART: Improving LLM Safety with Multi-round Automatic Red-Teaming | https://arxiv.org/abs/2311.07689 | No plan! |
| Fine-tuning Language Models for Factuality | https://arxiv.org/abs/2311.08401 | No plan! |
| Positional Description Matters for Transformers Arithmetic | https://arxiv.org/abs/2311.14737 | No plan! |
| Weak-to-Strong Generalization: Eliciting Strong Capabilities with Weak Supervision | https://arxiv.org/abs/2312.09390 | https://openai.com/research/weak-to-strong-generalization |
| TinyGSM: achieving >80% on GSM8K with small language models | https://arxiv.org/abs/2312.09241 | No plan! |

### Korean LM 🇰🇷

| Paper Title | Paper or Reference Site Link | Paper Review |
|---|---|---|
| Morpheme-aware Subword Tokenizer: An Empirical Study of Tokenization Strategies for Various Korean NLP Tasks | https://arxiv.org/abs/2010.02534 | Will be uploaded later! |
| What Changes Can Large-scale Language Models Bring? Intensive Study on HyperCLOVA: Billions-scale Korean Generative Pretrained Transformers | https://arxiv.org/abs/2109.04650 | Will be uploaded later! |

## Computer Vision

| Paper Title | Paper or Reference Site Link | Paper Review |
|---|---|---|
| History of CNN | LeNet, AlexNet, VGGNet, GoogLeNet, ResNet, ResNeXt, Xception, MobileNet, DenseNet, EfficientNet, ConvNeXt | https://cartinoe5930.tistory.com/entry/CNN-network%EC%9D%98-%EC%97%AD%EC%82%AC |
| ViT: An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale | https://arxiv.org/abs/2010.11929 | https://cartinoe5930.tistory.com/entry/ViT-An-Image-Worth-16-x-16-Words-Transformers-for-Image-Recognition-at-Scale |
| Swin Transformer: Hierarchical Vision Transformer using Shifted Windows | https://arxiv.org/abs/2103.14030 | https://cartinoe5930.tistory.com/entry/Swin-Transformer-Hierarchical-Vision-Transformer-using-Shifted-Windows-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| CLIP: Learning Transferable Visual Models From Natural Language Supervision | https://arxiv.org/abs/2103.00020 | https://cartinoe5930.tistory.com/entry/CLIP-Learning-Transferable-Visual-Models-From-Natural-Language-Supervision-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
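
The "image worth 16x16 words" idea in the ViT row above reduces to a strided convolution that cuts the image into patches and linearly projects each one into a token. A minimal sketch, assuming PyTorch:

```python
import torch
import torch.nn as nn

# 16x16 patches, each projected to a 768-dim token (ViT-Base settings).
patch_embed = nn.Conv2d(in_channels=3, out_channels=768, kernel_size=16, stride=16)

img = torch.randn(1, 3, 224, 224)            # one 224x224 RGB image
tokens = patch_embed(img)                    # (1, 768, 14, 14)
tokens = tokens.flatten(2).transpose(1, 2)   # (1, 196, 768): 196 patch tokens
```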

## Multi-modal Models

| Paper Title | Paper or Reference Site Link | Paper Review |
|---|---|---|
| Let's learn about VLMs (Vision-Language Models) | https://huggingface.co/blog/vision_language_pretraining#supporting-vision-language-models-in-%F0%9F%A4%97-transformers | https://cartinoe5930.tistory.com/entry/VLMVision-Language-Model%EC%97%90-%EB%8C%80%ED%95%B4-%EC%95%8C%EC%95%84%EB%B3%B4%EC%9E%90 |
| VisualBERT: A Simple and Performant Baseline for Vision and Language | https://arxiv.org/abs/1908.03557 | https://cartinoe5930.tistory.com/entry/VisualBERT-A-Simple-and-Performant-Baseline-for-Vision-and-Language-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks | https://arxiv.org/abs/1908.02265 | https://cartinoe5930.tistory.com/entry/ViLBERT-Pretraining-Task-Agnostic-Visiolinguistic-Representations-for-Visual-and-Language-Tasks |
| LXMERT: Learning Cross-Modality Encoder Representations from Transformers | https://arxiv.org/abs/1908.07490 | https://cartinoe5930.tistory.com/entry/LXMERT-Learning-Cross-Modality-Encoder-Representations-from-Transformers-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| VL-BERT: Pre-training of Generic Visual-Linguistic Representations | https://arxiv.org/abs/1908.08530 | https://cartinoe5930.tistory.com/entry/VL-BERT-Pre-training-of-Generic-Visual-Linguistic-Representations-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| VLP: Unified Vision-Language Pre-Training for Image Captioning and VQA | https://arxiv.org/abs/1909.11059 | https://cartinoe5930.tistory.com/entry/VLP-Unified-Vision-Language-Pre-Traning-for-Image-Captioning-and-VQA-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks | https://arxiv.org/abs/2004.06165 | https://cartinoe5930.tistory.com/entry/Oscar-Object-Semantics-Aligned-Pre-training-for-Vision-Language-Tasks-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| VinVL: Revisiting Visual Representations in Vision-Language Models | https://arxiv.org/abs/2101.00529 | https://cartinoe5930.tistory.com/entry/VinVL-Revisiting-Visual-Representations-in-Vision-Language-Models-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision | https://arxiv.org/abs/2102.03334 | https://cartinoe5930.tistory.com/entry/ViLT-Vision-and-Language-Transformer-Without-Convolution-or-Region-Supervision-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| ALIGN: Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision | https://arxiv.org/abs/2102.05918 | https://cartinoe5930.tistory.com/entry/ALIGN-Scaling-up-Visual-and-Vision-Language-Representation-with-Noisy-Text-Supervision-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| ALBEF: Vision and Language Representation Learning with Momentum Distillation | https://arxiv.org/abs/2107.07651 | https://cartinoe5930.tistory.com/entry/ALBEF-Vision-and-Language-Representation-Learning-with-Momentum-Distillation-%EB%85%BC%EB%AC%B8 |
| SimVLM: Simple Visual Language Model Pretraining with Weak Supervision | https://arxiv.org/abs/2108.10904 | https://cartinoe5930.tistory.com/entry/SimVLM-Simple-Visual-Language-Model-Pre-training-with-Weak-Supervision-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| VLMo: Unified Vision-Language Pre-training with Mixture-of-Modality-Experts | https://arxiv.org/abs/2111.02358 | https://cartinoe5930.tistory.com/entry/VLMo-Unified-Vision-Language-Pre-training-with-Mixture-of-Modality-Experts-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| LiT🔥: Zero-Shot Transfer with Locked-image text Tuning | https://arxiv.org/abs/2111.07991 | https://cartinoe5930.tistory.com/entry/LiT%F0%9F%94%A5-Zero-Shot-Transfer-with-Locked-image-text-Tuning-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| FLAVA: A Foundational Language And Vision Alignment Model | https://arxiv.org/abs/2112.04482 | https://cartinoe5930.tistory.com/entry/FLAVA-A-Foundational-Language-And-Vision-Alignment-Model-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation | https://arxiv.org/abs/2201.12086 | https://cartinoe5930.tistory.com/entry/BLIP-Bootstrapping-Language-Image-Pre-training-fro-Unified-Vision-Language-Understanding-and-Generation-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |

## Deep Learning Concept

| Paper or Posting Title | Reference Site Link | Review |
|---|---|---|
| Knowledge Distillation: Distilling the Knowledge in a Neural Network | https://arxiv.org/abs/1503.02531 | https://cartinoe5930.tistory.com/entry/Distilling-the-Knowledge-in-a-Neural-Network-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0 |
| What is Zero-shot, One-shot, Few-shot Learning? | See my blog! | https://cartinoe5930.tistory.com/entry/Zero-shot-One-shot-Few-shot-Learning%EC%9D%B4-%EB%AC%B4%EC%97%87%EC%9D%BC%EA%B9%8C |
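
The knowledge-distillation row above has a simple core loss worth writing out: the student matches the teacher's temperature-softened distribution (scaled by T^2 to keep gradient magnitudes comparable) alongside the usual cross-entropy on the hard labels. A minimal sketch, assuming PyTorch:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T: float = 2.0, lam: float = 0.5):
    # KL between temperature-softened teacher and student distributions.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, labels)  # standard hard-label loss
    return lam * soft + (1.0 - lam) * hard

loss = distillation_loss(torch.randn(4, 10), torch.randn(4, 10), torch.randint(0, 10, (4,)))
```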
