[20210620] Weekly AI ArXiv Chat (만담) #14

jungwoo-ha · 2021-06-19T05:50:08Z

News
- CVPR 2021 has started!
  - Schedule for NAVER's 14 presentations: https://europe.naverlabs.com/updates/ieee-conference-on-computer-vision-and-pattern-recognition-cvpr-2021/
- ICCV 2021 rebuttals: well done, everyone.
- AI Hub Data Dam opened to the public
  - https://news.naver.com/main/read.nhn?mode=LSD&mid=sec&sid1=105&oid=421&aid=0005423958
  - https://aihub.or.kr/
- June 24, 10:00: 3rd AI Future Forum webinar, "Expectations and Reality of AI, Part 2"
ArXiv
- BEiT: BERT Pre-Training of Image Transformers
  - BERT-style (MLM-style) pre-training rather than ViT-style training (MSR)
  - https://github.com/microsoft/unilm/tree/master/beit (no telling when the code will be uploaded, though)
  - By the way, are the authors into Dogecoin...?
- Keep CALM and Improve Visual Feature Attribution
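  - NAVER AI Lab + University of Tübingen.
  - A principled method that mitigates CAM's interpretability problem: in CAM, what is used at training time differs from what is used at inference time.
  - Given an image x, the class y and the location z of the cue the model looks at to decide the class are trained jointly, and inference relies on the same probability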
  - https://github.com/naver-ai/calm
- Learning to See by Looking at Noise
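  - From MIT
  - Real data is hard to obtain, biased, and raises privacy issues, so can self-supervised learning work with noise-based generated images alone?
  - So they experiment with a variety of synthetic data (MoCo-v2, evaluated on ImageNet-100). It works somewhat better than expected.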
  - https://mbaradad.github.io/learning_with_noise/
- LoRA: Low-Rank Adaptation of Large Language Models
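  - Uses low-rank structure to maximize the efficiency of GPT-3 fine-tuning
  - Full fine-tuning of GPT-3 is inefficient because the model is so large. LoRA adds a small number of extra parameters and freezes the main model.
  - Outperforms prefix tuning.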
  - https://github.com/microsoft/LoRA
- WSRGlow: A Glow-based Waveform Generative Model for Audio Super-Resolution
  - Glow-based audio super-resolution (12 kHz --> 48 kHz) (Interspeech 2021)
  - Samples: https://zkx06111.github.io/wsrglow/
- Watching Too Much Television is Good: Self-Supervised Audio-Visual Representation Learning from Movies and TV Shows
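  - Clicked because of the title, and it turns out to be Netflix research, lol
  - When building negative samples for contrastive learning, they account for the nature of movies and TV shows (because of the story, frames are not iid along the time axis). When pairing audio and image frames, the strategy distinguishes pairs from the same snippet, from different snippets of the same content, and from different content.
  - 3.6K movies (105 minutes on average) and 9.2K TV show episodes (42 minutes on average), with no curation.
  - Fine-tuning on UCF101, HMDB51, ESC50
  - A large amount of uncurated movie and TV show data is enough to get good performance. Beating curated data is not easy, of course, but considering the grunt work that curation takes...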
- The Oxford Road Boundaries Dataset
  - https://oxford-robotics-institute.github.io/road-boundaries-dataset/
- An Empirical Study on Hyperparameter Optimization for Fine-Tuning Pre-trained Language Models
  - A grind-it-out experimental study of hyperparameter optimization for PLM fine-tuning
  - Options: learning rate, warmup, attention dropout, FC dropout, weight decay, batch size, epochs, for ELECTRA and RoBERTa on GLUE
  - https://github.com/microsoft/FLAML/tree/main/flaml/nlp/
- X-FACT: A New Benchmark Dataset for Multilingual Fact Checking
  - Large-scale fact-checking dataset in 25 languages (ACL 2021)
  - https://github.com/utahnlp/x-fact/
- A Random CNN Sees Objects: One Inductive Bias of CNN and Its Applications
  - Uses a randomly initialized CNN in data-augmentation-based contrastive SSL
  - Even a random CNN manages a basic foreground/background separation, so they use it for data augmentation
  - Compared by adding it on top of SimCLR and MoCo
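
For the LoRA entry above, a minimal pure-Python sketch of the idea may help: the pretrained weight `W` stays frozen and only two small matrices `A` and `B` are trainable, so the layer computes `W x + (alpha / r) * B A x`. The class name `LoRALinear`, the toy 4x4 weight, and the hyperparameter values are assumptions for illustration, not the code in microsoft/LoRA.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class LoRALinear:
    """Toy linear layer with a frozen weight and a trainable low-rank update."""

    def __init__(self, weight, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(weight), len(weight[0])
        self.weight = weight  # frozen pretrained weight, never updated
        # A starts small and random, B starts at zero, so the update B @ A
        # is exactly zero before any training step.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]
        self.scale = alpha / rank

    def forward(self, x):
        base = matvec(self.weight, x)
        delta = matvec(self.B, matvec(self.A, x))
        return [b + self.scale * d for b, d in zip(base, delta)]

    def num_trainable(self):
        return sum(len(row) for row in self.A) + sum(len(row) for row in self.B)


# Made-up 4x4 "pretrained" weight and input, for illustration only.
weight = [
    [1.0, 2.0, 3.0, 4.0],
    [0.0, 1.0, 0.0, 1.0],
    [2.0, 0.0, 2.0, 0.0],
    [1.0, 1.0, 1.0, 1.0],
]
layer = LoRALinear(weight, rank=1)
x = [1.0, 0.0, -1.0, 2.0]
output = layer.forward(x)
```

With rank 1 the adapter here has 8 trainable values against 16 frozen ones; the gap grows quickly with layer width, which is where the efficiency in the summary comes from.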

nick-jhlee · 2021-06-20T03:18:58Z

Very Deep Graph Neural Networks Via Noise Regularisation
- DeepMind (with Petar Veličković, a rising star in graph representation, on board...!)
- Stacking GNN layers very deep causes several problems! -> oversmoothing (Chen et al., AAAI'20), bottleneck from over-squashing (Alon & Yahav, ICLR'21)
  - Stacking too deep causes plenty of problems in ANNs too -> vanishing/exploding gradients, slow training, etc.
- Proposes noisy node regularisation:
  - Perturbs the input graph (node, edge, graph attributes)
  - For graph property prediction, adds an autoencoding loss (~ denoising autoencoder, DAE)
  - Quite similar to Gaussian noise injection in ANNs
- "Our results show this regularisation method allows the model to monotonically improve in performance with increased message-passing steps."
- "We demonstrate that graph networks of large depth outperform shallower models even on small graphs, such as QM9, and can lead to substantial gains on very difficult datasets such as OC20."
- Here too, working out mathematically exactly what regularisation effect this has could be future work...?
  - In fact, for Gaussian noise injection in ANNs, it has been shown to penalize high frequencies in the Fourier domain (Camuto et al., NeurIPS'20)
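
The noisy-node recipe above can be sketched roughly as follows, assuming Gaussian noise on node features and a mean-squared denoising term added to the main loss. The function names, the noise scale `sigma`, the loss weight, and the toy features are all illustrative assumptions, not values from the paper.

```python
import random


def perturb_nodes(features, sigma, rng):
    """Return a copy of the node features with Gaussian noise added."""
    return [[x + rng.gauss(0.0, sigma) for x in node] for node in features]


def mse(pred, target):
    """Mean squared error over all node feature entries."""
    count = sum(len(row) for row in target)
    total = sum(
        (p - t) ** 2
        for pred_row, target_row in zip(pred, target)
        for p, t in zip(pred_row, target_row)
    )
    return total / count


def total_loss(primary_loss, reconstructed, clean, weight):
    """Primary task loss plus a weighted denoising (autoencoding) term."""
    return primary_loss + weight * mse(reconstructed, clean)


rng = random.Random(0)
clean = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]]  # made-up node features
noisy = perturb_nodes(clean, sigma=0.1, rng=rng)

# A model that perfectly denoises pays no extra penalty...
loss_if_perfect = total_loss(0.3, clean, clean, weight=0.1)
# ...while one that just copies its noisy input does.
loss_if_identity = total_loss(0.3, noisy, clean, weight=0.1)
```

In a real GNN the `reconstructed` features would come from a decoder head on the final node embeddings; here they are passed in directly to keep the sketch self-contained.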

Do Transformers Really Perform Bad for Graph Representation?
- Dalian University of Technology, Princeton, Peking, Microsoft Research Asia
- Transformers have not reached SOTA on graphs as readily as expected... (~~being good at everything would be cheating anyway~~)
- So, enter Graphormer
- "Centrality Encoding", "Spatial Encoding"
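
As a rough illustration of the two encodings named above: in Graphormer, centrality encoding is indexed by node degree and spatial encoding by the shortest-path distance between node pairs. The sketch below only computes those integer indices on a made-up 4-node path graph; the learnable embeddings and attention biases they would look up are omitted, and the function names are hypothetical.

```python
from collections import deque


def degrees(adj):
    """Node degree, the index used for a degree-based centrality embedding."""
    return {node: len(neighbors) for node, neighbors in adj.items()}


def shortest_path_lengths(adj, source):
    """Breadth-first search distances from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


# Made-up undirected path graph: 0 - 1 - 2 - 3
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

centrality_index = degrees(adj)
spatial_index = {u: shortest_path_lengths(adj, u) for u in adj}
```

The point of both indices is to give plain self-attention, which otherwise sees nodes as an unordered set, some access to graph structure.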

Does Knowledge Distillation Really Work?
- NYU (with Prof. AGW, a rising star in Bayesian deep learning, on board...!), Google Research
- Everyone knows KD works well. BUT does it really work the way we think it does??
- Separates student-teacher agreement from good generalization of the student!
- Is "the student learns the teacher" actually right...?
  - NOPE! Agreement between student and teacher is very low, to the point of being no different from an independently trained model...!
- Is it that the student lacks the capacity to learn the teacher?
  - NOPE! Capacity has nothing to do with it, and for self-distillation to succeed the student actually has to be a rebel
- (and so on)
- Conclusion:

FedBABU: Towards Enhanced Representation for Federated Image Classification
- KAIST (with Prof. Se-Young Yun, ~~who admitted me~~, on board...!)
- FL has two directions: a model that works decently for everyone, or models that are personalized for each individual
- But looking at FedAvg, the two conflict...! (better global => no improvement in personalization)
- Why?? The problem comes from each client training the head (classifier) differently!!
- Solution: never train the head and leave it at its random initialization. In other words, update only the body!
  - Each client trains only up to the layer before the classifier, so in effect only the representation changes!
  - BOIL was proposed at ICLR'21 with a similar direction in representation learning...! (Of course, BOIL updates the classifier in the outer loop, whereas in FedBABU it is almost fixed)
- "The advantage is that, when clients have different data compositions, representation learning risks diverging from client to client, but here training has a fixed reference point, so it can learn more effectively."
- "Extensive experiments show consistent performance improvements and an efficient personalization of FedBABU."
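
A toy sketch of the body-only update described above, assuming a FedAvg-style server that averages client bodies while every client keeps the same randomly initialized head. The parameter sizes, learning rate, and the `client_grads` values are made up for illustration; this is not the authors' implementation.

```python
import random


def init_model(rng, body_size=3, head_size=2):
    """Random initialization; the head keeps these values throughout."""
    return {
        "body": [rng.gauss(0.0, 1.0) for _ in range(body_size)],
        "head": [rng.gauss(0.0, 1.0) for _ in range(head_size)],
    }


def local_update(model, body_grad, lr=0.1):
    """One client step: gradient descent on the body, head left untouched."""
    new_body = [w - lr * g for w, g in zip(model["body"], body_grad)]
    return {"body": new_body, "head": list(model["head"])}


def aggregate(client_models):
    """Server step: average the bodies; the shared head is simply carried over."""
    n = len(client_models)
    bodies = [m["body"] for m in client_models]
    avg_body = [sum(ws) / n for ws in zip(*bodies)]
    return {"body": avg_body, "head": list(client_models[0]["head"])}


rng = random.Random(0)
global_model = init_model(rng)

# Hypothetical per-client body gradients standing in for local training.
client_grads = [[1.0, 0.0, -1.0], [0.0, 2.0, 1.0]]
clients = [local_update(global_model, grad) for grad in client_grads]
new_global = aggregate(clients)
```

Because the head never moves, every client's body is pulled toward the same fixed classifier, which is the "fixed reference point" mentioned in the quote above.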

veritas9872 · 2021-06-20T10:38:27Z

Knowledge distillation: A good teacher is patient and consistent
https://arxiv.org/abs/2106.05237
An experimental paper from Google Research analyzing which factors matter when using knowledge distillation.
Knowledge distillation is widely used, so knowing what makes it work well should be very helpful.

Efficient Deep Learning: A Survey on Making Deep Learning Models Smaller, Faster, and Better
https://arxiv.org/abs/2106.08962
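As the title says, this is a review paper from Google Research that gathers methods for training deep learning models and running inference more efficiently in one place.
Most of the content will be familiar, but I'm sharing it because having it all in one place is helpful.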

hollobit · 2021-06-20T11:24:16Z

In the era of the 4th Industrial Revolution, standardization in AI and big data is urgent - Korean Agency for Technology and Standards (KATS)

https://www.kats.go.kr/content.do?cmsid=240&mode=view&page=&cid=22436

Healthcare Data and AI Innovation Strategy (2021-2025) established - Ministry of Health and Welfare (6/3)

http://www.mohw.go.kr/react/al/sal0301vw.jsp?PAR_MENU_ID=04&MENU_ID=0403&page=1&CONT_SEQ=365938

Summary of the G7 Health Ministers' Meeting statement (6/4)

https://www.facebook.com/1biit/posts/10159892660191410

AI can now convincingly mimic cybersecurity experts and medical researchers

https://theconversation.com/study-shows-ai-generated-fake-reports-fool-experts-160909

Legislative public hearing on the bill on fostering AI and establishing a foundation of trust (6/18)

hollobit · 2021-06-20T13:28:12Z

Data-Centric AI Competition
https://https-deeplearning-ai.github.io/data-centric-comp/

jungwoo-ha closed this as completed Jul 10, 2021
