What makes ImageNet good for transfer learning? #104

chullhwan-song · 2019-02-28T02:09:31Z

https://arxiv.org/abs/1608.08614
https://github.com/minyoungg/wmigftl

chullhwan-song · 2019-02-28T02:13:05Z

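What? The aim is to find out what determines how much transfer learning (fine-tuning) improves results.

In practice, the question seems to be whether the ImageNet data itself is a good choice.
Note: everything here concerns the pre-trained model that will later be used for fine-tuning.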

Experiment 1: How many pre-training ImageNet examples are sufficient for transfer learning? Pre-training with fewer instances per class

Number of images per class in ImageNet
- The left and right y-axes are what matter.
  - The left y-axis is for ImageNet itself rather than fine-tuning, so presumably it shows the image classification task?
- For PASCAL-DET, the mean average precision (mAP) for CNNs with 1000, 500 and 250 images/class is found to be 58.3, 57.0 and 54.6.
  - Fewer images per class during pre-training lowers performance. This shows up in ImageNet accuracy itself, but the drop on the transfer learning tasks is comparatively gentle.
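
To make the "fewer instances per class" setup concrete, here is a minimal sketch of subsampling a labeled dataset to a fixed number of images per class. The function name `subsample_per_class`, the `(image_id, label)` pair format, and the toy data are all hypothetical; the paper's actual sampling code may differ.

```python
import random
from collections import defaultdict

def subsample_per_class(samples, images_per_class, seed=0):
    """Keep at most `images_per_class` randomly chosen samples for each label.

    `samples` is a list of (image_id, label) pairs (a hypothetical format).
    """
    by_label = defaultdict(list)
    for image_id, label in samples:
        by_label[label].append(image_id)

    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    subset = []
    for label in sorted(by_label):
        ids = by_label[label]
        if len(ids) > images_per_class:
            ids = rng.sample(ids, images_per_class)
        subset.extend((image_id, label) for image_id in ids)
    return subset

# Toy data: 3 classes with 10 images each, reduced to 4 images per class.
toy = [(f"img_{c}_{i}", c) for c in range(3) for i in range(10)]
small = subsample_per_class(toy, images_per_class=4)
```

Running the same function with 1000, 500 and 250 would give the three pre-training sets compared in the PASCAL-DET numbers above.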

2. How many pre-training ImageNet classes are sufficient for transfer learning? Pre-training with fewer classes

Number of ImageNet classes

3. How important is fine-grained recognition for learning good features for transfer learning?

Build the pre-trained model from a smaller label set than ImageNet's 1000 classes (127 here).
More Classes vs More Examples Per Class: 127 classes vs 1000 classes
- Not much difference. For transfer learning, reducing 1000 classes to 127 still gives similar performance.
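
One way to picture the 1000-to-127 reduction is relabeling each fine class with a coarser ancestor in a class hierarchy. The sketch below assumes a child-to-parent map and a fixed cut depth; the toy hierarchy and the helper names `ancestors` and `coarsen` are hypothetical, and the paper's actual procedure for building its 127 classes may differ.

```python
def ancestors(node, parent):
    """Return the path from the root down to `node` using a child->parent map."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path[::-1]

def coarsen(label, parent, depth):
    """Map a label to its ancestor at `depth` (the root is depth 0).

    Labels that sit shallower than `depth` are returned unchanged.
    """
    path = ancestors(label, parent)
    return path[min(depth, len(path) - 1)]

# Hypothetical toy hierarchy (child -> parent).
parent = {
    "husky": "dog", "beagle": "dog", "dog": "animal",
    "tabby": "cat", "cat": "animal",
    "sedan": "vehicle",
    "animal": "entity", "vehicle": "entity",
}

fine_labels = ["husky", "beagle", "tabby", "sedan"]
coarse_labels = [coarsen(label, parent, depth=1) for label in fine_labels]
```

With `depth=1` the four fine labels collapse to two coarse ones (`animal`, `vehicle`); a deeper cut keeps more of the fine-grained distinctions.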

4. Does pre-training on coarse classes produce features capable of fine-grained recognition (and vice versa) on ImageNet itself?

How good are the embeddings at nearest neighbor tasks?
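127 classes vs 1000 classes
Experiment on queries whose class is not among the training classes
- Retrieval: above the dotted line, results for queries whose class was included during fine-tuning vs results for queries whose class was not.
  - Note: the term "induction accuracy" appears. It means measuring the results with a model trained on data that does not contain the query's class.
  - Measured while varying the set of categories the model was trained on.
- Even when the query's class is not among the model's training classes, as long as the number of training classes is reasonably large (more seems better), the neighbors are still found without much trouble.
  - baseline vs induction comparison
  - top 1 vs top 5
  - "The results show that features learnt by pre-training on just 127 classes still lead to fairly good induction." Given this statement, and although it is hard to see from the graph, the point seems to be that with roughly 8x fewer classes (1000 to 127) and an induction measurement on top of that, about 10 vs 7.5 is not a bad result.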
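
The top-1 vs top-5 nearest-neighbor measurement can be sketched as follows: a query counts as a hit if its label appears among its k most similar gallery embeddings. This is a minimal illustration with made-up 2-D embeddings and cosine similarity, which is an assumption; the paper's exact protocol and distance may differ. For induction accuracy, the embeddings would come from a model that never saw the query's class during pre-training.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def topk_nn_accuracy(queries, gallery, k):
    """Fraction of queries whose label is among the k nearest gallery items.

    Both arguments are lists of (embedding, label) pairs.
    """
    hits = 0
    for q_emb, q_label in queries:
        ranked = sorted(gallery, key=lambda item: cosine(q_emb, item[0]), reverse=True)
        if any(label == q_label for _, label in ranked[:k]):
            hits += 1
    return hits / len(queries)

# Made-up embeddings purely for illustration.
gallery = [([1.0, 0.0], "dog"), ([0.9, 0.1], "dog"),
           ([0.0, 1.0], "cat"), ([0.1, 0.9], "cat")]
queries = [([0.8, 0.2], "dog"), ([0.6, 0.5], "cat")]

top1 = topk_nn_accuracy(queries, gallery, k=1)
top3 = topk_nn_accuracy(queries, gallery, k=3)
```

In this toy case the second query's nearest neighbor is a `dog` embedding, so it misses at k=1 but is recovered at k=3, which is the same kind of gap the top 1 vs top 5 comparison exposes.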

5. Given the same budget of pre-training images, should we have more classes or more images per class? > Is having more classes actually better?

6. Is more data always helpful?

Pre-training on the 771 ImageNet classes (out of 1000) that exclude all PASCAL VOC classes achieves almost the same PASCAL-DET performance as pre-training on the original ImageNet. > Does this mean much of the data is redundant?
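
A rough sketch of how such an exclusion could be done: drop every pre-training class that is, or descends from, one of the target-task categories. The hierarchy, the class names, and the helper `is_excluded` are hypothetical, and this is an assumption about the filtering step rather than the paper's exact procedure.

```python
def is_excluded(label, parent, excluded):
    """True if `label` or any of its ancestors is in `excluded`."""
    while True:
        if label in excluded:
            return True
        if label not in parent:
            return False
        label = parent[label]

# Hypothetical toy hierarchy (child -> parent) and target-task categories.
parent = {"husky": "dog", "beagle": "dog", "dog": "animal",
          "zebra": "animal", "sedan": "car", "teapot": "artifact"}
target_categories = {"dog", "car"}

pretrain_classes = ["husky", "beagle", "zebra", "sedan", "teapot"]
kept = [c for c in pretrain_classes if not is_excluded(c, parent, target_categories)]
```

Here only `zebra` and `teapot` survive, since the other classes fall under `dog` or `car`.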

Conclusion

● We know that we should use at least 500k images and at least 127 classes
● It will probably work well to skip unrelated classes.
● We also know that labeled pretraining seems to outperform other methods.

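What I actually need is guidance on how to compose the fine-tuning training set once a pre-trained ImageNet model already exists (and is held fixed). The paper says nothing about that and focuses instead on the ImageNet pre-training dataset.
My case of interest, imbalanced data (some classes large, some small) with many training classes and a large overall dataset, is not covered.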

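Summary

Data size: is it really standard practice to use all of ImageNet (1.2 million labeled images)?
- No, half of it is enough for fine-tuning.
Number of classes: is it standard practice to simply use all 1000?
- No, half of the classes are enough.
Class content: if the goal is to recognize fine-grained dog breeds, does pre-training need a similarly fine-grained composition?
- No, images that are visually different and not dogs (other kinds of images) are also sufficient.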

chullhwan-song added Deep Learning Transfer Learning labels Feb 28, 2019
