[20230326] Weekly AI ArXiv Chat Season 2 - Episode 11 #77

scene-the-ella · 2023-03-24T09:35:13Z

No description provided.

jungwoo-ha · 2023-03-24T15:26:15Z

News

Conferences
- Great work, everyone, on the ICML 2023 and ACL 2023 rebuttals.
- The ICML author-reviewer discussion period is under way (through the 26th).
- NeurIPS 2023 Datasets & Benchmarks Track: June 1 (abstract) // June 7 (full paper)
ChatGPT Plugin
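- The big picture: a ChatGPT ecosystem is taking shape (text-to-any app action)
- ChatGPT can now call APIs, so it can connect to a wide range of apps on its own (ChatGPT gets hands, feet, eyes, and ears)
- Through these apps it reaches personal (app usage) data, and NLP-to-any-action becomes possible through ChatGPT...
- Gives third-party plugin developers new application opportunities; the LM itself is turning into an OS
- The NL-to-action data accumulated this way feeds back into improving ChatGPT...
- Provides a browser (well beyond WebGPT) and a Python interpreter
- The retrieval plugin is released as open source
- Third-party partners
- A case showing that the gap between research and product in AI is now extremely narrow, which makes another AI winter much less likely
- The dawn of a great Age of AI that surpasses the Age of Discovery (an era in which every user and service is hyper-connected through AI)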
[The growing influence of industry in AI research]
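- Industry's influence on AI research keeps growing.
- More professors are moving from academia to industry (partly because of startups, partly because of joint appointments).
- Large-scale AI is overwhelmingly industry-led.
- In the end, the formula "research in academia, application in industry" is broken. We need a setup that does research and application together.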
GPT4-ChatGPT vs. Bing vs. Bard by The Verge
Bill Gates's The Age of AI has begun
- Korean translation by HyperCLOVA

ArXiv

Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video Generators
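- Generates text-to-video zero-shot with Stable Diffusion, without any additional training on video data
- Reportedly builds temporal dynamics with the DDPM forward process; for frame-to-frame consistency, it adds a cross-frame attention layer between the conv and cross-attn layers, using the first frame as key and value and the remaining frames as queries
- The main experiments use 512x512, 8 frames
- Can also be attached to ControlNet, and applications such as Instruct Video2Video are reportedly possible
- Code: https://github.com/Picsart-AI-Research/Text2Video-Zero (code not released yet)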
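
The cross-frame attention idea (queries from each frame, keys and values from the first frame) can be sketched roughly as follows. This is a minimal pure-Python illustration, not the authors' code: the function names are hypothetical, and the learned query/key/value projections of a real attention layer are assumed to be the identity to keep it short.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    # Scaled dot-product attention over lists of token vectors.
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[c] for w, v in zip(weights, values))
                    for c in range(len(values[0]))])
    return out

def cross_frame_attention(frames):
    # Every frame queries the FIRST frame: keys and values come from frames[0].
    # (Assumption: projections are omitted; frames[0] also attends to itself.)
    first = frames[0]
    return [attention(frame, first, first) for frame in frames]

# Toy input: 3 frames, 2 tokens per frame, 2 channels per token.
frames = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.5, 0.5], [2.0, -1.0]],
    [[0.0, 0.0], [1.0, 1.0]],
]
attended = cross_frame_attention(frames)
```

Because every output token is a weighted average of first-frame tokens, later frames inherit the appearance of the first frame, which is the intuition behind the consistency claim.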

veritas9872 · 2023-03-25T11:25:24Z

News:
ChatGPT Plugins
Blog: https://openai.com/blog/chatgpt-plugins

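A plugin API has been released for ChatGPT, enabling internet access and connections to external apps.
Because of safety concerns and similar issues, it is currently open only to a small number of users, and you need to sign up for the waiting list.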

Scaling in the service of reasoning & model-based ML
Blog: https://yoshuabengio.org/2023/03/21/scaling-in-the-service-of-reasoning-model-based-ml

Yoshua Bengio discusses the limitations of current LLMs and the directions to pursue so that large language models can perform System 2 cognition.

Hello Dolly: Democratizing the magic of ChatGPT with open models
Blog: https://www.databricks.com/blog/2023/03/24/hello-dolly-democratizing-magic-chatgpt-open-models.html

The Databricks team ran instruction fine-tuning on the GPT-J 6B model with a small fine-tuning dataset and, in just 30 minutes(!), got it to respond to instructions much like ChatGPT. Given that GPT-J 6B was released two years ago, this shows that answering instructions depends not only on the capacity of a huge model but also, to a large degree, on the fine-tuning process.

Software:
RAFT: Reusable Accelerated Functions and Tools

GitHub: https://github.com/rapidsai/raft
Docs: https://docs.rapids.ai/api/raft/stable
Blog: https://developer.nvidia.com/blog/reusable-computational-patterns-for-machine-learning-and-data-analytics-with-rapids-raft

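NVIDIA has released a primitives library for data science and machine learning.
Deep learning has surged over the past 10 years, but recommender systems and tabular data still often rely on classical machine learning algorithms such as clustering. NVIDIA's low-level optimized library makes GPU acceleration convenient for these cases. In particular, RAFT offers more low-level composability than the existing cuML, so it should be more convenient when developing new clustering algorithms and the like.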

Research:
When and Why Vision-Language Models Behave like Bags-Of-Words, and What to Do About It?
OpenReview: https://openreview.net/forum?id=KRLUvxh8uaX
GitHub: https://github.com/mertyg/vision-language-models-are-bows
Blog: https://towardsdatascience.com/your-vision-language-model-might-be-a-bag-of-words-30b1beaef7f8

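It is a little old, but I'm sharing it because it was selected for an ICLR 2023 oral session.
The study shows that existing vision-language models (VLMs) such as CLIP behave like bag-of-words models that are insensitive to word order, and that applying word-order-sensitive hard negative mining to address this lets the model learn to understand the meaning of a sentence.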
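
A word-order hard negative can be illustrated with a toy example like the one below. This is only a sketch under simple assumptions: the helper `word_order_negative` is hypothetical and swaps two arbitrary words, whereas the paper's actual mining procedure is more careful about which parts of the caption it perturbs.

```python
import random
from collections import Counter

def word_order_negative(caption, rng):
    # Swap two different words so the caption keeps the same bag of words
    # but (usually) changes meaning. Returns None if no such swap exists.
    words = caption.split()
    pairs = [(i, j) for i in range(len(words)) for j in range(i + 1, len(words))
             if words[i] != words[j]]
    if not pairs:
        return None
    i, j = rng.choice(pairs)
    words[i], words[j] = words[j], words[i]
    return " ".join(words)

rng = random.Random(0)
caption = "the horse is eating the grass"
negative = word_order_negative(caption, rng)

# A pure bag-of-words representation cannot tell the two captions apart.
same_bag = Counter(caption.split()) == Counter(negative.split())
```

Training with such pairs as negatives forces the text encoder to use word order, since the positive and the negative are identical as bags of words.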

Erasing Concepts from Diffusion Models
ArXiv: https://arxiv.org/abs/2303.07345
GitHub: https://github.com/rohitgandikota/erasing

Recent work from Prof. David Bau's lab at Northeastern University (with a co-author at MIT).
Text-to-image diffusion models can generate undesirable outputs, a problem that has recently come to the fore; this work proposes a way to remove unwanted concepts from the model weights themselves.

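Using one frozen Stable Diffusion (SD) model and one Erased Stable Diffusion (ESD) model, it trains the ESD model so that its output when given a prompt for the unwanted concept matches the output when no such prompt is given. Afterwards, even a prompt that would, for example, infringe copyright yields output as if the concept had been deleted from the model.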
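
The training target can be sketched as follows. This is a hedged pure-Python illustration on plain lists, not the released code: the names `esd_target` and `mse_loss` are hypothetical, and the negative-guidance term scaled by `eta` is how I understand the paper's objective, so treat it as an assumption. With `eta = 0` the target reduces to the plain "match the unconditioned output" described above.

```python
def esd_target(frozen_uncond, frozen_cond, eta):
    # Target noise for the ESD model when it is conditioned on the concept:
    # start from the frozen model's unconditioned prediction and push away
    # from the concept direction (frozen_cond - frozen_uncond).
    return [u - eta * (c - u) for u, c in zip(frozen_uncond, frozen_cond)]

def mse_loss(prediction, target):
    # Mean squared error between the ESD prediction and the target.
    return sum((p - t) ** 2 for p, t in zip(prediction, target)) / len(target)

# Toy noise predictions from the frozen SD model (made-up numbers).
frozen_uncond = [1.0, 2.0]
frozen_cond = [2.0, 4.0]

target_match = esd_target(frozen_uncond, frozen_cond, eta=0.0)   # plain matching
target_guided = esd_target(frozen_uncond, frozen_cond, eta=1.0)  # negative guidance

# The ESD model's concept-conditioned prediction is trained toward the target.
esd_prediction = [1.5, 3.0]
loss = mse_loss(esd_prediction, target_guided)
```

Only the ESD model receives gradients; the frozen copy just supplies the two reference predictions.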

CoLT5: Faster Long-Range Transformers with Conditional Computation

ArXiv: https://arxiv.org/abs/2303.09752

Google Research proposes a way to reduce computation on long-sequence inputs.
Transformers suffer very high latency on long sequences, not only because of the $O(N^2)$ attention computation but also because of the memory-bound FFN computation. To address this, the model is split into light and heavy networks, distinguishing the important parts of the text from the unimportant ones and applying conditional attention.
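
The light/heavy split can be sketched on a feedforward layer as below. This is a toy illustration, not the paper's implementation: tokens are scalars, `light_fn` and `heavy_fn` stand in for the two networks, and the routing scores are given rather than learned. The assumption (from my reading of the paper) is that every token goes through the light branch while only the top-k scored tokens also go through the heavy branch, weighted by their score.

```python
def conditional_layer(tokens, scores, light_fn, heavy_fn, k):
    # Indices of the k tokens with the highest routing score.
    top = set(sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)[:k])
    out = []
    for i, x in enumerate(tokens):
        y = x + light_fn(x)                   # cheap branch: applied to every token
        if i in top:
            y = y + scores[i] * heavy_fn(x)   # expensive branch: routed tokens only
        out.append(y)
    return out

# Toy setup: 4 scalar "tokens", made-up scores, linear stand-ins for the FFNs.
tokens = [1.0, 2.0, 3.0, 4.0]
scores = [0.1, 0.9, 0.2, 0.8]
result = conditional_layer(tokens, scores,
                           light_fn=lambda x: 0.5 * x,
                           heavy_fn=lambda x: 10.0 * x,
                           k=2)
```

With k much smaller than the sequence length, the heavy branch costs O(k) rather than O(N) per layer, which is where the savings on long inputs come from.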

Meet in the Middle: A New Pre-training Paradigm

ArXiv: https://arxiv.org/abs/2303.07295
GitHub: https://github.com/microsoft/Meet-in-the-Middle

Research from Microsoft Azure AI that proposes a new LLM pre-training task.
Current NLG LLMs are mostly pre-trained to predict the next token. This process is compute-intensive, but it has the advantage of eliciting emergent behavior such as in-context learning.
Instead, this work improves training efficiency and performance by pairing a model that generates left to right with a model that generates right to left.
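
As a rough picture of what the two directions see, the sketch below builds next-token examples from one sequence read left to right and right to left. It only illustrates the data view; the function names are hypothetical, and how the paper couples the two models during training is not shown here.

```python
def next_token_pairs(tokens):
    # (context, target) pairs for standard next-token prediction.
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]

def paired_examples(tokens):
    # Left-to-right examples from the sequence as is,
    # right-to-left examples from the reversed sequence.
    forward = next_token_pairs(tokens)
    backward = next_token_pairs(tokens[::-1])
    return forward, backward

forward, backward = paired_examples(["a", "b", "c"])
```

Each position is thus predicted twice, once from its prefix and once from its suffix, so the same data provides supervision from both sides.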

jwlee-neubla · 2023-03-26T12:04:24Z

Arxiv

The effectiveness of MAE pre-pretraining for billion-scale pretraining

Meta AI
Uses MAE to improve the pretrain-then-finetune paradigm that is standard in computer vision
pre-pretraining: SSL with MAE (1 epoch) -> WSP (Weakly Supervised Pretraining) with weak and noisy labels
MAE scales not only with model size but also with data size
Achieves SOTA on iNaturalist-18 (91.3%), 1-shot ImageNet-1k (62.1%), and zero-shot transfer on Food-101 (96.0%)

SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models

MIT + Nvidia
LLM quantization is hard because the activations contain outliers
LLM.int8() handles the outliers in FP16, which saves memory but is slow
Idea: shift part of the quantization difficulty caused by activation outliers onto the weights
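
Shifting the difficulty from activations to weights can be sketched with a per-channel rescaling that leaves the layer output unchanged. This is a minimal pure-Python illustration, not the authors' code: the function names and toy matrices are hypothetical, and the scale s_j = max|X_j|^alpha / max|W_j|^(1-alpha) is the formula as I understand it from the paper (it assumes no channel is all zeros).

```python
def matmul(A, B):
    # Plain list-of-lists matrix product.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def smooth(X, W, alpha=0.5):
    # X: tokens x in_features, W: in_features x out_features.
    n_in = len(W)
    act_max = [max(abs(row[j]) for row in X) for j in range(n_in)]
    w_max = [max(abs(v) for v in W[j]) for j in range(n_in)]
    # Per-input-channel scale (assumes act_max and w_max are non-zero).
    s = [act_max[j] ** alpha / w_max[j] ** (1 - alpha) for j in range(n_in)]
    X_s = [[row[j] / s[j] for j in range(n_in)] for row in X]  # activations shrink
    W_s = [[s[j] * v for v in W[j]] for j in range(n_in)]      # weights absorb the scale
    return X_s, W_s, s

# Toy data: the second activation channel has a large outlier.
X = [[1.0, 100.0], [2.0, -80.0]]
W = [[0.5, -0.2], [0.1, 0.3]]
X_s, W_s, s = smooth(X, W)

Y = matmul(X, W)        # original output
Y_s = matmul(X_s, W_s)  # same output, but X_s has a much smaller dynamic range
```

Since (X / s)(s * W) = XW, the rescaling is mathematically free; the benefit is that both X_s and W_s fall in ranges that are easier to quantize to INT8.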

nick-jhlee · 2023-03-26T12:23:58Z

For my small comment:

Efficient fair PCA for fair representation learning (AISTATS 2023)

from Amazon AWS
efficient ver of fair PCA
(see Fig 5)

cf:

Null it out: Guarding protected attributes by iterative nullspace projection (ACL 2020)
Linear adversarial concept erasure (ICML 2022)

nick-jhlee · 2023-03-26T12:43:54Z

For next time:

Sparks of Artificial General Intelligence: Early experiments with GPT-4 (Microsoft Research)

Lesson: strip the comments out of your LaTeX!
https://twitter.com/dv2559106965076/status/1638769434763608064?s=46&t=llKohaNYR1IR_yaWlq40TA

“Given the breadth and depth of GPT-4's capabilities, we believe that it could reasonably be viewed as an early (yet still incomplete) version of an artificial general intelligence (AGI) system.”

Theoretical analyses of language models!!!

A Kernel-Based View of Language Model Fine-Tuning (~~rejected from ICLR 2023~~)

Why Can GPT Learn In-Context? Language Models Secretly Perform Gradient Descent as Meta-Optimizers (arXiv 2022)

The Learnability of In-Context Learning (arXiv 2023)

A Theory of Emergent In-Context Learning as Implicit Structure Induction (arXiv 2023)
