Awesome Fine-Grained Image Classification

I tried to condense the (main) contributions (or the used methodology) from each paper into a line or two to observe trends across years.

I also made a companion website on GitHub Pages with summaries of all papers for each year, plus 1-slide summaries of close to 200 surveyed papers.

Paper scraping description in link.

If you have any problems, suggestions, or improvements, please submit an issue or PR.

Surveys

Fine-Grained Image Analysis With Deep Learning: A Survey. [Paper]
A survey on deep learning-based fine-grained object classification and semantic segmentation. [Paper]

Papers

2024

SaSPA: Advancing Fine-Grained Classification by Structure and Subject Preserving Augmentation. Michaeli E / Fried O. Reichman U, IL. arXiv 24/06. [Paper] [Project Page] [Code]
- Class-consistent data augmentations through a pipeline consisting of GPT-4 prompts and ControlNet + BLIP-Diffusion

2023

Fine-Grained Visual Classification via Internal Ensemble Learning Transformer. Xu Q / Luo B. Anhui University, CN. Transactions on Multimedia 2023. [Paper]
- Select intermediate tokens based on head-wise attention voting average + gaussian kernel -> multi-layer refinement, dynamic ratio of intermediate layers contributions for refinement modules
Dual Transformer with Multi-Grained Assembly for Fine-Grained Visual Classification. Ji RY / Wu YJ. Chinese Academy of Sciences, CN. TCSVT 23. [Paper]
- Early crop based on 1st layer attention, attention to select tokens from intermediate features, cross-attention for interactions between CLS token of global and crops and features of other branch
Fine-grained Classification of Solder Joints with α-skew Jensen-Shannon Divergence. Ulger F / Gokcen D. TCPMT 23. [Paper]
- Maximize entropy to penalize overconfidence
Shape-Aware Fine-Grained Classification of Erythroid Cells. Wang Y / Zhou Y. JLU, CN. Applied Intelligence 23. [Paper]
- Dataset and method for fine-grained erythroid cell classification
Test-Time Amendment with a Coarse Classifier for Fine-Grained Classification. Jain K / Gandhi V. IIIT Hyderabad, IN. arXiv 2023/02. [Paper]
- Hierarchical prediction by taking into account predictions from coarser levels (multiplication of scores)
Semantic Feature Integration network for Fine-grained Visual Classification. Wang H / Luo HC. Jiangnan U, CN. arXiv 23/02. [Paper]
- Intermediate predictions classifiers + loss (similar to SAC arXiv 22 and PIM arXiv 22) + sequence of modules to refine most discriminative intermediate features
Learning Common Rationale to Improve Self-Supervised Representation for Fine-Grained Visual Recognition Problems. Shu YY / Hengel AVD / Liu LQ. U of Adelaide, AU. arXiv 23/03. [Paper]
- Extends SAM (ECCV 22) for the self-supervised setting (add GradCAM branch trained with KD loss to predict discriminative regions)
Fine-grained Visual Classification with High-temperature Refinement and Background Suppression. Chou PY / Lin CH. National Taiwan Normal U, TW. arXiv 23/03. [Paper]
- Extends PIM (arXiv 22) with loss to suppress background (predict -1 for background regions) + KD loss between two inter classifiers
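
Several entries in this list combine predictions across hierarchy levels. As a rough illustration of the "multiplication of scores" idea noted for the test-time amendment entry above, the sketch below rescales each fine-class probability by the probability of its parent coarse class and renormalizes. The function name `amend_with_coarse`, the toy probabilities, and the parent mapping are assumptions made for illustration, not the paper's implementation.

```python
def amend_with_coarse(fine_probs, coarse_probs, parent_of):
    """Rescale fine-class probabilities by their parent coarse-class probability.

    fine_probs:   list of probabilities, one per fine class
    coarse_probs: list of probabilities, one per coarse class
    parent_of:    parent_of[i] is the coarse-class index of fine class i
    """
    scores = [p * coarse_probs[parent_of[i]] for i, p in enumerate(fine_probs)]
    total = sum(scores)
    # Renormalize so the amended scores form a distribution again.
    return [s / total for s in scores]


# Toy example (hypothetical numbers): 4 fine classes under 2 coarse classes.
fine = [0.40, 0.10, 0.35, 0.15]
coarse = [0.30, 0.70]
parents = [0, 0, 1, 1]

amended = amend_with_coarse(fine, coarse, parents)
best_before = max(range(len(fine)), key=fine.__getitem__)
best_after = max(range(len(amended)), key=amended.__getitem__)
```

In this toy case the fine classifier alone prefers class 0, while the amended scores prefer class 2, because the coarse classifier favors the second coarse class.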

2022

MetaFormer: A Unified Meta Framework for Fine-Grained Recognition. Diao QS / Yuan Z. ByteDance, CN. arXiv 22/03. [Paper]
- Incorporate multimodality data as extra information (date, location, text, attributes, etc)
Dynamic MLP for Fine-Grained Image Classification by Leveraging Geographical and Temporal Information. Yang LF / Yang J. Nanjing U of S&T, CN. CVPR 2022. [Paper]
- Incorporate metadata (date/loc)
Dual Cross-Attention Learning for Fine-Grained Visual Categorization and Object Re-Identification. Zhu HW / Shan Y. AMD, CN. CVPR 22. [Paper]
- Cross-attention between selected queries and all keys/values for refinement + cross-attention for regularization (mix queries/keys/values from two images)
SIM-Trans: Structure Information Modeling Transformer for Fine-grained Visual Categorization. Sun HB / Peng YX. Peking U, CN. ACM MM 22. [Paper]
- Refine attention selected tokens using GCN & polar coordinates + contrastive loss for last 3 layers
A Novel Plug-in Module for Fine-Grained Visual Classification. Chou PY / Kao WC. National Taiwan Normal U, TW. arXiv 22/02. [Paper]
- Intermediate classifier distribution sharpness as metric to select intermediate features + GCN to combine
ViT-NeT: Interpretable Vision Transformers with Neural Tree Decoder. Kim SW / Ko BC. Keimyung U, SK. ICML 22. [Paper]
- Binary tree with differentiable routing and refinement at each node/leaf
Fine-Grained Object Classification via Self-Supervised Pose Alignment. Yang XH / Tian YH. Peng Cheng Lab, CN. CVPR 22. [Paper]
- Intermediate features classifiers with different label smoothing levels and graph matching to align parts for contrastive learning
On the Eigenvalues of Global Covariance Pooling for Fine-grained Visual Recognition. Song Y / Wang W. U of Trento, IT. TPAMI 22. [Paper]
- Second order methods (B-CNN) weaknesses: small eigenvalues so propose scaling factor to magnify
Improving Fine-Grained Visual Recognition in Low Data Regimes via Self-Boosting Attention Mechanism. Shu YY / Liu LQ. U of Adelaide, AU. ECCV 22. [Paper]
- KL divergence between CAMs and convolutional projection as auxiliary task
SR-GNN: Spatial Relation-aware Graph Neural Network for Fine-Grained Image Categorization. Bera A / Behera A. BITS, IN / Edge Hill U, UK. TIP 22. [Paper]
- Divide into regions, refinement with GNN and SA
Cross-Part Learning for Fine-Grained Image Classification. Liu M / Zhao Y. Beijing Jiaotong University, CN. TIP 2022. [Paper]
- Multi-stage processing and localization (object -> parts) + refinement
Convolutional Fine-Grained Classification With Self-Supervised Target Relation Regularization. Liu KJ / Jia K. South China U of Technology, CN / Peng Cheng Lab, CN. arXiv 22/08. [Paper]
- Class center + distance between graphs as self-supervised loss
R2-Trans: Fine-Grained Visual Categorization with Redundancy Reduction. Wang Y / You XG. Huazhong U, CN. arXiv 22/04. [Paper]
- Mask tokens based on attention + information theory inspired loss
Knowledge Mining with Scene Text for Fine-Grained Recognition. Wang H / Liu WY. Huazhong U of Science and Technology, CN / Tencent, CN. CVPR 22. [Paper]
- Incorporate Wikipedia knowledge from scene text as additional data
Fine-Grained Visual Classification using Self Assessment Classifier. Do T / Nguyen A. AIOZ, SG / U of Liverpool, UK. arXiv 22/05. [Paper]
- Predict once, augment top-k predictions with class text names to predict again
Exploiting Web Images for Fine-Grained Visual Recognition via Dynamic Loss Correction and Global Sample Selection. Liu HF / Wei XS / Tang ZM. Nanjing U of S&T, CN. TMM 2022. [Paper]
- Web images for fine-grained recognition
Cross-layer Attention Network for Fine-grained Visual Categorization. Huang RR / Yang HZ. Tsinghua U, CN. arXiv 22/10 / CVPR 22 FGVC8 Workshop. [Paper]
- Refine intermediate features with top-level and top-level with intermediate features
Anime Character Recognition using Intermediate Features Aggregation. Rios EA / Lai BC. National Yang Ming Chiao Tung U, TW. ISCAS 22. [Paper]
- Concatenate ViT intermediate CLS tokens and forward through fully connected layer to aggregate intermediate features + incorporate tag information as additional data.
Fine-grained visual classification with multi-scale features based on self-supervised attention filtering mechanism. Chen H / Ling W. Guangdong U of T, CN. Applied Intelligence 2022. [Paper]
- Attention map filtering and multi-scale
Bridge the Gap between Supervised and Unsupervised Learning for Fine-Grained Classification. Wang JB / Wei XS / Zhang R. Army Engineering U of PLA, CN / Nanjing U, CN. arXiv 22/03. [Paper]
- Study on unsupervised fine-grained (no labels, clustering-based)
PEDTrans: A fine-grained visual classification model for self-attention patch enhancement and dropout. Lin XH / Chen YF. China Agricultural U, CN. ACCV 22. [Paper]
- Patch dropping based on similarity (outer product/bilinear pooling) + refinement of patches before transformer
Iterative Self Knowledge Distillation -- from Pothole Classification to Fine-Grained and Covid Recognition. Peng KC. Mitsubishi MERL, US. ICASSP 22. [Paper]
- Use student from previous iteration as teacher, recursively
Fine-grain Inference on Out-of-Distribution Data with Hierarchical Classification. Linderman R / Chen Y. Duke U, US. NeurIPS 22 Workshop. [Paper]
- Hierarchical OOD fine-grained with inference stopping criterion
Semantic Guided Level-Category Hybrid Prediction Network for Hierarchical Image Classification. Wang P / Qian YT. Zhejiang University, CN. arXiv 2022/11. [Paper]
- Hierarchical prediction taking into account “quality” (noise, occlusion, blur or low resolution) to decide classification level
Data Augmentation Vision Transformer for Fine-grained Image Classification. Hu C / Wu WJ. Unknown affiliation. arXiv 22/11. [Paper]
- Crops based on single-layer (5th) attention + TransFG’s PSM module between 2 layers (recursive matrix-matrix attention)
Medical applications (COVID, kidney pathology, renal and ocular disease):
- Self-supervision and Multi-task Learning: Challenges in Fine-Grained COVID-19 Multi-class Classification from Chest X-rays. Ridzuan M / Yaqub M. MBZUAI, AE. MIUA 22. [Paper]
- Automatic Fine-grained Glomerular Lesion Recognition in Kidney Pathology. Nan Y / Yang G. Imperial College London, UK. Pattern Recognition 22. [Paper]
- Holistic Fine-grained GGS Characterization: From Detection to Unbalanced Classification. Lu YZ / Huo YK. Vanderbilt U, US. Journal Medical Imaging 2022. [Paper]
- CDNet: Contrastive Disentangled Network for Fine-Grained Image Categorization of Ocular B-Scan Ultrasound. Dan RL / Wang YQ. Hangzhou Dianzi U, CN. arXiv 22/06. [Paper]
Snake competition methodologies:
- Solutions for Fine-grained and Long-tailed Snake Species Recognition in SnakeCLEF 2022. Zou C / Cheng Y. Ant Group, CN. Conference and Labs of the Evaluation Forum 2022. [Paper]
- Explored An Effective Methodology for Fine-Grained Snake Recognition. Huang Y / Feng JH. Huazhong U of Science and T, CN / Alibaba, CN. CLEF 22. [Paper]

2021

First ViTs for FGIR:
- TransFG: A Transformer Architecture for Fine-Grained Recognition. He J / Wang CH. Johns Hopkins U / ByteDance. arXiv 21/03 / AAAI 22. [Paper]
  - First to apply ViT for FGIR: overlapping patchifier convolution, recursive layer-wise matrix-matrix multiplication to aggregate attention and select features from last layer, contrastive loss
- Feature Fusion Vision Transformer for Fine-Grained Visual Categorization. Wang J / Gao YS. U of Warwick, UK / Griffith U, AU. BMVC 21. [Paper]
  - ViT for FGIR, select intermediate tokens based on layer-wise attention
- RAMS-Trans: Recurrent Attention Multi-scale Transformer for Fine-grained Image Recognition. Hu YQ / Xue H. Zhejiang U / Alibaba, CN. ACM MM 21. [Paper]
  - ViT for FGIR, select regions to crop based on recursive layer-wise attention matrix-matrix multiplication + individual CLS token for crops
- Transformer with peak suppression and knowledge guidance for fine-grained image recognition. Liu XD / Han XG. Beihang U, CN. Neurocomputing 22. [Paper]
  - ViT for FGIR, mask tokens of top attention to prevent overconfident predictions, learnable class matrix to augment output
- A free lunch from ViT: adaptive attention multi-scale fusion Transformer for fine-grained visual recognition. Zhang Y / Chen WQ. Peking U / Alibaba, CN. arXiv 21/08 / ICASSP 22. [Paper]
  - ViT for FGIR, crops based on head-wise element-wise multiplications of attention heads and aggregating through SE-like mechanism to reweight different layers' attentions
- Exploring Vision Transformers for Fine-grained Classification. Conde MV / Turgutlu K. U of Valladolid, ES. CVPR Workshop 21. [Paper]
  - ViT for FGIR, attention rollout + morphological operations for recursive cropping / masking
- Complemental Attention Multi-Feature Fusion Network for Fine-Grained Classification. Miao Z / Li H. Army Eng U of PLA, CN. Signal Proc Letters 21. [Paper]
  - Reweight Swin features based on importance and divide into two branches (discriminative and not)
- Part-Guided Relational Transformers for Fine-Grained Visual Recognition. Zhao YF / Tian YH. Beihang U, CN. TIP 21. [Paper]
  - Transformer with positional embeddings from CNN features to refine global and part features
- A Multi-Stage Vision Transformer for Fine-grained Image Classification. Huang Z / Zhang HB. Huaqiao U, CN. ITME 21. [Paper]
  - ViT for FGIR with pooling layer to build multiple stages in transformer
AP-CNN: Weakly Supervised Attention Pyramid Convolutional Neural Network for Fine-Grained Visual Classification. Ding YF / Ma ZY / Ling HB. Beijing U of Posts & Telecomms, CN. TIP 21. [Paper]
- FPN with top-down & bottom-up paths + merged ROI cropping + ROI masking
Counterfactual Attention Learning for Fine-Grained Visual Categorization and Re-identification. Rao YM / Zhou J. Tsinghua U, CN. ICCV 21. [Paper]
- Builds on WS-DAN (attention crop & mask) by making predictions with counterfactual (fake) attention maps to learn better attention maps
Neural Prototype Trees for Interpretable Fine-grained Image Recognition. Nauta M / Seifert C. University of Twente, NL. CVPR 21. [Paper]
- Binary trees based on similarity to prototypes + pruning
SnapMix: Semantically Proportional Mixing for Augmenting Fine-grained Data. Huang SL / Tao DC. U of Sydney, AU. AAAI 21. [Paper]
- CutMix (cut part from one image into another as data aug) with asymmetric crops + assign labels based on CAMs
Intra-class Part Swapping for Fine-Grained Image Classification. Zhang LB / Huang SL / Liu W. U of Technology Sydney, AU. WACV 2021. [Paper]
- CutMix images from same class only + affine transform guided by CAMs for mixing
Stochastic Partial Swap: Enhanced Model Generalization and Interpretability for Fine-grained Recognition. Huang SL / Tao DC. The University of Sydney, AU. ICCV 21. [Paper]
- Intermediate classifiers + changing features of one image with another randomly to inject noise
Enhancing Mixture-of-Experts by Leveraging Attention for Fine-Grained Recognition. Zhang LB / Huang SL / Liu W. U of Technology Sydney / U of Sydney, AU. TMM 21. [Paper]
- CutMix based on activations from last conv layer, same class only, crops also based on activations from last conv
Multiresolution Discriminative Mixup Network for Fine-Grained Visual Categorization. Xu KR / Li YS. Xidian U, CN. TNNLS 21. [Paper]
- Mixup based on CAM attention + distillation from multiple high resolution crops to single low resolution crop
Context-aware Attentional Pooling (CAP) for Fine-grained Visual Classification. Behera A / Bera A. Edge Hill U, UK. AAAI 21. [Paper]
- Combine cross-regions features with attention + LSTM + learnable pooling
A Realistic Evaluation of Semi-Supervised Learning for Fine-Grained Classification. Su JC / Maji S. U of Massachusetts Amherst, US. CVPR 21. [Paper]
- In-depth study on fine-grained semi-supervised learning
MaskCOV: A random mask covariance network for ultra-fine-grained visual categorization. Yu XH / Xiong SW. Griffith U, AU / Wuhan U of T, CN. Pattern Recognition 21. [Paper]
- Masking and shuffling of patches as data aug, predict covariance as auxiliary task
Benchmark Platform for Ultra-Fine-Grained Visual Categorization Beyond Human Performance. Yu XH / Xiong SW. Griffith U, AU / Wuhan U of T, CN. ICCV 21. [Paper]
- Ultra fine-grained recognition of leaves dataset
Human Attention in Fine-grained Classification. Rong Y / Kasneci E. University of Tübingen, DE. BMVC 21. [Paper]
- Human attention/gaze for crops/extra modality data
Fair Comparison: Quantifying Variance in Results for Fine-grained Visual Categorization. Gwilliam M / Farrell R. Brigham Young U, US / U of Maryland, US. WACV 21. [Paper]
- Study on the failure of single top-1 accuracy as metric for FGIR, suggest using class variance and standard deviation and mean of multiple experiments with different random seeds
Learning Canonical 3D Object Representation for Fine-Grained Recognition. Joung SH / Sohn KH. Yonsei U, KR. ICCV 21. [Paper]
- Learn 3D representations as auxiliary task for fine grained recognition
Multi-branch and Multi-scale Attention Learning for Fine-Grained Visual Categorization. Zhang F / Liu YZ. China U of Mining and T, CN. MMM 21. [Paper]
- Feature maps of multiple layers (instead of one) to guide cropping
CLIP-Art: Contrastive Pre-training for Fine-Grained Art Classification. Conde MV / Turgutlu K. U of Valladolid, ES. CVPR Workshop 21. [Paper]
- Applies CLIP for fine-grained art recognition
Graph-based High-Order Relation Discovery for Fine-grained Recognition. Zhao YF / Li J. Beihang University, CN. CVPR 21. [Paper]
- Extend on bi/trilinear pooling + GCN for refining features
Progressive Learning of Category-Consistent Multi-Granularity Features for Fine-Grained Visual Classification. Du RY / Ma ZY / Guo J. Beijing U of Posts and Telecomms, CN. TPAMI 21. [Paper]
- Extended journal version of PMG (ECCV20): progressive training with block-based processing + pair category consistency loss between same class images
Webly Supervised Fine-Grained Recognition: Benchmark Datasets and An Approach. Sun ZR / Wei XS / Shen HT. Nanjing U of S&T / Nanjing U, CN. ICCV 21. [Paper]
- Dataset for fine-grained recognition with noisy web labels and method to train with noisy labels
Re-rank Coarse Classification with Local Region Enhanced Features for Fine-Grained Image Recognition. Yang SK / Liu S / Wang CH. ByteDance, CN. arXiv 21/02. [Paper]
- Automatic hierarchy based on clustering, triplet loss to guide crops, similarity to class database to re-classify images (compared to coarse classifier)
Progressive Co-Attention Network for Fine-grained Visual Classification. Zhang T / Ma ZY / Guo J. Beijing U of Posts and Telecomms, CN. VCIP 21. [Paper]
- Interaction between pairs of images using bilinear pooling
Subtler mixed attention network on fine-grained image classification. Liu C / Zhang WF. Ocean U of China, CN. Applied Intelligence 21. [Paper]
- Spatial and channel attention on parts
Dynamic Position-aware Network for Fine-grained Image Recognition. Wang SJ / Li HJ / Ouyang WL. Dalian U of T, CN. AAAI 21. [Paper]
- Horizontal and vertical pooling + learnable sin/cos positional embeddings + GCN for crops
Learning Scale-Consistent Attention Part Network for Fine-Grained Image Recognition. Liu HB / Lin WY. Shanghai Jiaotong U, CN. TMM 21. [Paper]
- SE-like + Gumbel softmax trick + scale-consistency for parts detection + self-attention for parts relations
Multi-branch Channel-wise Enhancement Network for Fine-grained Visual Recognition. Li GJ / Zhu FT. University of Shanghai for Science and Technology, CN. ACM MM 21. [Paper]
- Multi-size spatial shuffling (similar to DCL (CVPR19) but with multiple sizes of shuffling)
Fine-Grained Categorization From RGB-D Images. Tan YH / Lu K. Chinese Academy of Sciences, CN. TMM 21. [Paper]
- Dataset and network for incorporating RGB and depth images
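
Many of the ViT-based entries above select tokens or crops by aggregating attention across layers with repeated matrix multiplication. The sketch below shows the generic attention rollout recipe under simplifying assumptions: attention is already averaged over heads, each matrix is row-stochastic, and the residual connection is modeled by averaging with the identity. The listed papers differ in their details, so treat this as an illustration and not any one paper's method; the helper names and toy matrices are made up.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def attention_rollout(layer_attentions):
    """Aggregate head-averaged attention matrices across layers.

    layer_attentions: list of (tokens x tokens) row-stochastic matrices,
    ordered from the first layer to the last.
    """
    n = len(layer_attentions[0])
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    rollout = identity
    for attn in layer_attentions:
        # Average with the identity to model the residual connection;
        # rows still sum to 1 when the rows of attn do.
        mixed = [[0.5 * attn[i][j] + 0.5 * identity[i][j] for j in range(n)]
                 for i in range(n)]
        rollout = matmul(mixed, rollout)
    return rollout


# Toy example: 3 tokens (CLS + 2 patches), 2 layers, hypothetical values.
layer1 = [[0.2, 0.5, 0.3], [0.1, 0.8, 0.1], [0.3, 0.3, 0.4]]
layer2 = [[0.6, 0.1, 0.3], [0.2, 0.7, 0.1], [0.25, 0.25, 0.5]]
rollout = attention_rollout([layer1, layer2])
cls_to_patches = rollout[0][1:]  # aggregated attention from CLS to each patch
```

The CLS row of the result can then be ranked or thresholded to pick tokens or crop regions.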

2020

The Devil is in the Channels: Mutual-Channel Loss for Fine-Grained Image Classification. Chang DL / Song YZ. Beijing U of Posts and Telecomms, CN. TIP 20. [Paper]
- Channel groups loss to make each channel group discriminative and focus on different spatial regions
Learning Attentive Pairwise Interaction for Fine-Grained Classification. Zhuang PQ / Qiao Y. Chinese Acad. of Sciences, CN. AAAI 20. [Paper]
- Pairwise interactions between pairs of images from same/different class
Fine-Grained Visual Classification via Progressive Multi-Granularity Training of Jigsaw Patches. Du RY / Guo J. Beijing U of Posts and Telecomms, CN. ECCV 20. [Paper]
- Jigsaw puzzle for data augmentation of different network stages, training each stage progressively and classifier for each stage
Channel Interaction Networks for Fine-Grained Image Categorization. Gao Y / Scott M. Malong Technologies, CN. AAAI 20. [Paper]
- Trilinear pooling + contrastive loss to pull images from same class together and push images from different class apart
ELoPE: Fine-Grained Visual Classification with Efficient Localization, Pooling and Embedding. Hanselmann H / Ney H. RWTH Aachen U, DE. WACV 20. [Paper]
- Small CNN to predict crops + embedding loss w/ class centers
Fine-Grained Visual Classification with Efficient End-to-end Localization. Hanselmann H / Ney H. arXiv 20/05. [Paper]
- End-to-end train of small CNN + STN
Attentional Kernel Encoding Networks for Fine-Grained Visual Categorization. Hu YT / Zhen XT. Beihang U, CN. TCSVT 20. [Paper]
- Cascaded attention + fourier/cosine kernel (cos of input)
Bi-Modal Progressive Mask Attention for Fine-Grained Recognition. Song KT / Wei XS / Lu JF. Nanjing U of S&T, CN. TIP 20. [Paper]
- Multi-stage fusion of vision (CNN) & text (LSTM) with vision-/language-only attention & cross-modality attention and intermediate classifiers
Hierarchical Image Classification using Entailment Cone Embeddings. Dhall A / Krause A. ETH Zurich, CH. CVPR Workshop 20. [Paper]
- Comparison on losses and embeddings for hierarchical classification
Learning Semantically Enhanced Feature for Fine-Grained Image Classification. Luo W / Wei XS. IEEE, US. Signal Processing Letters 20. [Paper]
- Group feature channels based on semantics and KD from global features to groups
An Adversarial Domain Adaptation Network For Cross-Domain Fine-Grained Recognition. Wang YM / Wei XS / Zhang LJ. Nanjing U, CN / Megvii. WACV 20. [Paper]
- Adversarial loss to distinguish domains + loss to pull features from same class together + attention binary mask for removing BG
Group Based Deep Shared Feature Learning for Fine-grained Image Classification. Li XL / Monga V. Pennsylvania State University, US. BMVC 20. [Paper]
- Autoencoder with class/shared center loss to divide features into class and not
Beyond the Attention: Distinguish the Discriminative and Confusable Features For Fine-grained Image Classification. Shi XR / Liu W. Beijing U of Posts and Telecomms, CN. ACM MM 20. [Paper]
- Divide into discriminative/confusing regions w/ SE to refine features, intermediate losses for classification, pulling features of images with same label closer (L1) and maximizing entropy of confusing features (pseudolabel of 1 to all classes -> background)
Fine-Grained Classification via Categorical Memory Networks. Deng WJ / Zheng L. Australian National U, AU. arXiv 20/12 / TIP 22. [Paper]
- Augment feature with class-specific memory module (learned average based on previous samples and how similar / how it reacts to new samples)
Interpretable and Accurate Fine-grained Recognition via Region Grouping. Huang ZX / Li Y. U of Wisconsin-Madison, US. CVPR 20. [Paper]
- Part assignment, feature refinement and weighted classification
Filtration and Distillation: Enhancing Region Attention for Fine-Grained Visual Categorization. Liu CB / Zhang YD. U of S&T of China, CN. AAAI 20. [Paper]
- RPN with losses for consistency between proposals from RPN and main feature extractor + KD between object and parts
Graph-Propagation Based Correlation Learning for Weakly Supervised Fine-Grained Image Classification. Wang ZH / Li HJ / Li JJ. Dalian U of T, CN. AAAI 20. [Paper]
- GCN for graph propagation for discriminative feature selection (crops) + losses for cropping
Weakly Supervised Fine-grained Image Classification via Gaussian Mixture Model Oriented Discriminative Learning. Wang ZH / Li HJ / Li ZZ. Dalian U of T, CN. CVPR 20. [Paper]
- Gaussian mixture model to learn low rank feature maps for selecting crops
Category-specific Semantic Coherency Learning for Fine-grained Image Recognition. Wang SJ / Li HJ / Ouyang WL. Dalian U of T, CN. ACM MM 20. [Paper]
- Latent attributes prediction, alignment, reordering and patch-wise attention for selecting crops
Multi-Objective Matrix Normalization for Fine-Grained Visual Recognition. Min SB / Zhang YD. U of S&T of China, CN. TIP 20. [Paper]
- Matrix normalization for bilinear pooling
Power Normalizations in Fine-Grained Image, Few-Shot Image and Graph Classification. Koniusz P / Zhang HG. Australian National U, AU. TPAMI 20. [Paper]
- Study on normalizations for B-CNN
Fine-grained Image Classification and Retrieval by Combining Visual and Locally Pooled Textual Features. Mafla A / Karatzas D. UAB, ES. WACV 20. [Paper]
- Extract and incorporate text in images for FGIR
Multi-Modal Reasoning Graph for Scene-Text Based Fine-Grained Image Classification and Retrieval. Mafla A / Karatzas D. UAB, ES. arXiv 20/09 / WACV 21. [Paper]
- Expands on previous by encoding multimodality with GCN
Focus Longer to See Better: Recursively Refined Attention for Fine-Grained Image Classification. Shroff P / Wang ZY. Texas A&M U, US. CVPR Workshop 20. [Paper]
- Recursive LSTM for encoding cropped features
Fine-Grained Visual Categorization by Localizing Object Parts With Single Image. Zheng XT / Lu XQ. Chinese Acad of Sciences, CN. TMM 20. [Paper]
- Cluster feature maps of multiple layers
Microscopic Fine-Grained Instance Classification Through Deep Attention. Fan MR / Rittscher J. U of Oxford, UK. MICCAI 20. [Paper]
- Attention crops for microscopic applications

2019

Destruction and Construction Learning for Fine-Grained Image Recognition. Chen Y / Mei T. JD AI Research, CN. CVPR 19. [Paper]
- Shuffling local regions in an image (destruction) + learning to predict original locations (construction) + adversarial loss to distinguish shuffled from not
Looking for the Devil in the Details: Learning Trilinear Attention Sampling Network for Fine-Grained Image Recognition. Zheng HL / Luo JB. U of S&T of CN, CN. CVPR 19. [Paper]
- Trilinear attention (X X^T X) for crops + KD loss between crops & original
Weakly Supervised Complementary Parts Models for Fine-Grained Image Classification From the Bottom Up. Ge WF / Yu YZ. U of Hong Kong, HK. CVPR 19. [Paper]
- Weakly supervised detection/segmentation with Mask R-CNN, CAMs & CRFs + LSTM for aggregating features from original and crops
See Better Before Looking Closer: Weakly Supervised Data Augmentation Network for Fine-Grained Visual Classification. Hu T / Lu Y. Chinese Academy of Sciences, CN / Microsoft. arXiv 19. [Paper]
- Attention masking (and cropping) + moving average center loss to guide attention maps
Selective Sparse Sampling for Fine-grained Image Recognition. Ding Y / Jiao JB. U of Chinese Academy of Sciences, CN. ICCV 19. [Paper]
- CAMs peaks + Gaussians based on classification entropy (confidence) for resampling images (cropping with convs)
Cross-X Learning for Fine-Grained Visual Categorization. Luo W / Lim S. South China Agricultural University, CN / FB. ICCV 19. [Paper]
- Multiple excitations (OSME with loss to distinguish excitations) with intermediate features (FPN + KD loss between intermediate predictions)
P-CNN: Part-Based Convolutional Neural Networks for Fine-Grained Visual Categorization. Han JW / Xu D. Northwestern Polytechnical U, CN / U of Sydney, AU. TPAMI 19. [Paper]
- Cluster peak channel responses using K-means as part detectors
Learning Rich Part Hierarchies With Progressive Attention Networks for Fine-Grained Image Recognition. Zheng HL / Luo JB / Mei T. Microsoft, CN. TIP 19. [Paper]
- Journal MA-CNN w/ refinement module and iterative training
Bidirectional Attention-Recognition Model for Fine-Grained Object Classification. Liu CB / Zhang YD. U of S&T of China, CN. TMM 19. [Paper]
- RPN for proposals with feedback (NTS-Net like) + multiple random erasing data augmentation
Deep Fuzzy Tree for Large-Scale Hierarchical Visual Classification. Wang Y / Li XQ. Tianjin U, CN. Trans. Fuzzy Systems 19. [Paper]
- Fuzzy tree based on inter-class similarity for hierarchical classification
Part-Aware Fine-grained Object Categorization using Weakly Supervised Part Detection Network. Zhang YB / Wang ZX. South China U of Technology, CN. TMM 19. [Paper]
- RPN proposals based on channel-wise peaks + self-supervised part labeling
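
Several entries in this list (for example the destruction-and-construction and jigsaw-patch approaches) shuffle local image regions so the network must rely on fine local detail. The following is a minimal sketch of that idea, assuming a plain uniform permutation of grid patches on a 2D list-of-rows image; published methods typically add constraints (such as limiting how far a patch may move) that are not modeled here. The function name `shuffle_patches` and its parameters are hypothetical.

```python
import random


def shuffle_patches(image, grid, seed=None):
    """Split a 2D image (list of rows) into grid x grid patches and permute them.

    Assumes height and width are divisible by `grid`.
    Returns the shuffled image and the permutation used, where
    order[dst] is the index of the source patch placed at position dst.
    """
    h, w = len(image), len(image[0])
    ph, pw = h // grid, w // grid  # patch height and width
    rng = random.Random(seed)
    order = list(range(grid * grid))
    rng.shuffle(order)
    out = [row[:] for row in image]
    for dst, src in enumerate(order):
        dr, dc = divmod(dst, grid)
        sr, sc = divmod(src, grid)
        for y in range(ph):
            for x in range(pw):
                out[dr * ph + y][dc * pw + x] = image[sr * ph + y][sc * pw + x]
    return out, order


# Toy 4x4 "image" whose pixel values are their own flat indices.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
shuffled, order = shuffle_patches(image, grid=2, seed=0)
```

Returning the permutation alongside the image makes it possible to supervise an auxiliary task that predicts the original patch locations.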

2018

Learning to Navigate for Fine-grained Classification. Yang Z / Wang LW. Peking U, CN. ECCV 2018. [Paper]
- Feedback between networks, shared feature extractor between modules, RPN (Faster-RCNN) for part proposal
Large Scale Fine-Grained Categorization and Domain-Specific Transfer Learning. Cui Y / Belongie S. Cornell University, US. CVPR 18. [Paper]
- Importance of resolution, a training strategy for long-tailed data, and a distance measure that captures domain similarity between datasets, enabling better transfer learning by training on sources similar to the target
Multi-Attention Multi-Class Constraint for Fine-grained Image Recognition. Sun M / Ding ER. Baidu, CN. ECCV 18. [Paper]
- Multi-excitation (squeeze-and-excitation) for feature maps + loss to pull features from same excitation closer and push features from different excitations away
Hierarchical Bilinear Pooling for Fine-Grained Visual Recognition. Yu CJ / You XG. Huazhong U of S&T, CN. ECCV 18. [Paper]
- Combine intermediate features by element-wise multiplications + concatenation of bilinearly pooled outputs
Deep Attention-Based Spatially Recursive Networks for Fine-Grained Visual Recognition. Wu L / Wang Y. U of Queensland, AU. Trans. Cybernetics 18. [Paper]
- Bilinear pooling w/o sum (outer product only, not matrix mult) + FC + softmax for attention + 2D spatial LSTM with neighborhood to aggregate features
Mask-CNN: Localizing Parts and Selecting Descriptors for Fine-Grained Image Recognition. Wei XS / Wu JX. Nanjing University, CN. arXiv 16/05 (Submitted to NIPS16) / Pattern Recognition 2018/04. [Paper]
- FCN for segmentation of parts + descriptor selection for GAP/GMP
Maximum-Entropy Fine-Grained Classification. Dubey A / Naik N. Massachusetts Institute of Technology, US. NIPS 18. [Paper]
- Prevent overconfidence with maximum-entropy loss + definition of fine-grained based on diversity
Fine-Grained Image Classification Using Modified DCNNs Trained by Cascaded Softmax and Generalized Large-Margin Losses. Shi WW. Xian Jiaotong U, CN. TNNLS 18. [Paper]
- Multi objective classification with cascaded FC classifiers for each hierarchy level + loss to bring same fine-grained class together and same coarse class closer than different coarse
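
A recurring theme in this list is penalizing overconfident predictions by rewarding high-entropy outputs. As a small sketch of that idea, the code below subtracts a weighted entropy term from the usual cross-entropy. The weight `gamma`, the function names, and the example logits are assumptions for illustration and are not taken from any listed paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def max_entropy_loss(logits, target, gamma=0.1):
    """Cross-entropy minus gamma times the prediction entropy."""
    probs = softmax(logits)
    cross_entropy = -math.log(probs[target])
    return cross_entropy - gamma * entropy(probs)


# Hypothetical logits for a 3-class problem with target class 0.
logits = [2.0, 0.5, -1.0]
plain = max_entropy_loss(logits, target=0, gamma=0.0)  # ordinary cross-entropy
regularized = max_entropy_loss(logits, target=0, gamma=0.1)
```

Because entropy is non-negative, the regularized loss is never larger than the plain cross-entropy for the same logits; minimizing it trades some confidence on the target class for a flatter output distribution.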

2017

Look Closer to See Better: Recurrent Attention Convolutional Neural Network for Fine-grained Image Recognition. Fu JF / Zheng HL / Mei T. Microsoft / U of S&T of China, CN. CVPR 17. [Paper]
- Recurrent CNN with intra-scale classification loss and inter-scale pairwise ranking loss to enforce finer-scale to generate more confident predictions
Learning Multi-Attention Convolutional Neural Network for Fine-Grained Image Recognition. Zheng HL / Mei T / Luo JB. U of S&T of China, CN / Microsoft. ICCV 17. [Paper]
- Channel grouping module to select multiple parts from CNN feature maps + loss for compact distribution and diversity with geometric constraints
Object-Part Attention Model for Fine-Grained Image Classification. Peng YX / Zhao JJ. Peking U, CN. arXiv 17/04 / TIP 18. [Paper]
- Automatic object localization via saliency extraction (CAM), object-part spatial constraints, and clustering of parts based on clustered intermediate CNN filters
Low-Rank Bilinear Pooling for Fine-Grained Classification. Kong S / Fowlkes C. University of California Irvine, US. CVPR 17. [Paper]
- Bilinear pooling with low-dimensionality projection (extra FC layer)
Pairwise Confusion for Fine-Grained Visual Classification. Dubey A / Naik N. MIT, US. arXiv 17/05 / ECCV 18. [Paper]
- Euclidean Distance loss which “confuses” network by adding a regularization term which minimizes distance between two images in mini-batch
Bilinear Convolutional Neural Networks for Fine-Grained Visual Recognition. Lin T / Maji S. University of Massachusetts Amherst, US. TPAMI 2017. [Paper]
- Extension and analysis of bilinear pooling
Fine-grained Image Classification via Combining Vision and Language. He XT / Peng YX. Peking U, CN. CVPR 17. [Paper]
- Vision (GoogleNet) & language (CNN-RNN) two-stream network
Higher-Order Integration of Hierarchical Convolutional Activations for Fine-Grained Visual Categorization. Cai SJ / Zhang L. HK Polytechnic University, HK. ICCV 17. [Paper]
- Bilinear pooling for multiple layers using 1x1 Convs and concatenating intermediate outputs
The Devil is in the Tails: Fine-grained Classification in the Wild. Van Horn G / Perona P. Caltech, US. arXiv 17/09. [Paper]
- Discussion on challenges related to long-tailed fine-grained classification
BoxCars: Improving Fine-Grained Recognition of Vehicles using 3D Bounding Boxes in Traffic Surveillance. Sochor J / Herout A. Brno U of T, CZ. Transactions on ITS 17. [Paper]
- Automatic 3D BBox estimation for car recognition
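
A minimal sketch of the Euclidean confusion idea from the Pairwise Confusion entry above, in plain Python. It assumes the batch is split into two halves that are paired index by index, that the penalty is the squared Euclidean distance between the two predicted probability vectors, and that the penalty is only applied when the paired labels differ. The function names and the weight `lam` are hypothetical and not taken from the paper's code.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pairwise_confusion_loss(logits_batch, labels, lam=0.1):
    """Mean cross-entropy plus a Euclidean confusion penalty (illustrative).

    logits_batch: list of per-sample logit lists.
    labels: list of integer class indices, same length as logits_batch.
    lam: hypothetical weight of the confusion term (an assumption).
    """
    if not logits_batch or len(logits_batch) != len(labels):
        raise ValueError("need a non-empty batch with one label per sample")

    probs = [softmax(z) for z in logits_batch]

    # Standard classification term, averaged over the batch.
    ce = sum(-math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)

    # Pair sample i with sample i + half; an odd leftover sample stays unpaired.
    half = len(probs) // 2
    confusion = 0.0
    for i in range(half):
        j = i + half
        if labels[i] != labels[j]:  # assumption: only penalize different-class pairs
            confusion += sum((a - b) ** 2 for a, b in zip(probs[i], probs[j]))
    confusion /= max(half, 1)

    return ce + lam * confusion
```

When a pair produces identical predictions the penalty vanishes and the loss reduces to plain cross-entropy; confident predictions that disagree are penalized the most.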

2016

Diversified Visual Attention Networks for Fine-Grained Object Classification. Zhao B / Yan SC. Southwest Jiaotong U, CN. arXiv 16/06 / TMM 17. [Paper]
- Multi-scale canvas for CNN extractor + LSTM to refine CNN predictions across time steps
Learning a Discriminative Filter Bank Within a CNN for Fine-Grained Recognition. Wang YM / Davis LS. University of Maryland, US. arXiv 16/11 / CVPR 18. [Paper]
- Two stream head: global (original) and part with 1x1 Conv, spatial global max pooling, and filter grouping/pooling to focus on most discriminative parts
Picking Deep Filter Responses for Fine-grained Image Recognition. Zhang XP / Tian Q. Shanghai Jiao Tong U, CN. CVPR 16. [Paper]
- Selecting deep filters which react to parts + spatial-weighting of Fisher Vector
BoxCars: 3D Boxes as CNN Input for Improved Fine-Grained Vehicle Recognition. Sochor J / Havel J. Brno U of T, CZ. CVPR 16. [Paper]
- 3D BBox, vehicle orientation, and shape as extra data
Weakly Supervised Fine-Grained Categorization With Part-Based Image Representation. Zhang Y / Do M. A*STAR, SN. TIP 16. [Paper]
- Convolutional filters for part proposal + Fisher Vector clusters for selecting useful parts + normalized FV concatenation from different scale parts
Mining Discriminative Triplets of Patches for Fine-Grained Classification. Wang YM / Davis LS. U of Maryland, US. CVPR 16. [Paper]
- Triplets of patches with geometric constraints to aid mid-level representations
Fully Convolutional Attention Networks for Fine-Grained Recognition. Liu X / Lin YQ. Baidu, CN. arXiv 16/03. [Paper]
- Simultaneously compute parts without recursion using reinforcement learning

2015

Bilinear CNN Models for Fine-grained Visual Recognition. Lin TY / Maji S. U of Massachusetts, US. ICCV 15. [Paper]
- Outer product of (two) CNN feature maps (bilinear vector) as input to classifier
Fine-Grained Recognition Without Part Annotations. Krause J / Fei-Fei L. Stanford U, US. CVPR 15. [Paper]
- Alignment by segmentation and pose graphs based on neighbors (highest cosine similarity of CNN features) to generate parts
Part-Stacked CNN for Fine-Grained Visual Categorization. Huang SL / Zhang Y. U of Technology Sydney, AU. arXiv 15/12 / CVPR 16. [Paper]
- Directly perform part-based classification on detected part locations from output feature maps using FCN, shared features, two stage training
Deep LAC: Deep localization, alignment and classification for fine-grained recognition. Lin D / Jia JY. Chinese U of Hong Kong, HK. CVPR 15. [Paper]
- Backprop-able localization + alignment based on templates (clustered from train set)
Fine-Grained Categorization and Dataset Bootstrapping Using Deep Metric Learning with Humans in the Loop. Cui Y / Belongie S. Cornell U, US. arXiv 15/12 / CVPR 16. [Paper]
- Triplet loss with sampling strategy for hard negatives and utilizing web data (CNN recognition-based + human verified)
Multiple Granularity Descriptors for Fine-Grained Categorization. Wang DQ / Zhang Z. Fudan U, CN. ICCV 15. [Paper]
- Detectors and classifiers for each level of class granularity / hierarchy
Hyper-class Augmented and Regularized Deep Learning for Fine-grained Image Classification. Xie SN / Lin YQ. UC San Diego, US. CVPR 15. [Paper]
- Web data and other datasets based on hyperclasses (dogs & orientation of cars) + auxiliary loss to predict hyperclasses
Fine-Grained Image Classification by Exploring Bipartite-Graph Labels. Zhou F / Lin YQ. NEC Labs, US. arXiv 15/12 / CVPR 16. [Paper]
- Jointly model fine-grained classes with pre-defined coarse classes (attributes / tags such as ingredients or macro-categories)
A Fine-Grained Image Categorization System by Cellet-Encoded Spatial Pyramid Modeling. Zhang LM / Li XL. National U of Singapore, SN. TIE 15. [Paper]
- Traditional encoding
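
A rough sketch of the bilinear pooling step summarized in the Bilinear CNN entry above (outer product of two CNN feature maps, pooled into one vector for the classifier). Assumptions: both feature maps are already flattened to a list of spatial locations with one channel vector each, pooling is a sum over locations, and the signed square root plus L2 normalization is included as a commonly used post-processing step. `bilinear_pool` is a hypothetical helper name.

```python
import math


def bilinear_pool(feats_a, feats_b):
    """Sum-pooled outer product of two feature maps (illustrative).

    feats_a: list of L locations, each a list of Ca floats.
    feats_b: list of L locations, each a list of Cb floats.
    Returns a flat list of Ca * Cb floats.
    """
    if not feats_a or len(feats_a) != len(feats_b):
        raise ValueError("both feature maps need the same, non-zero number of locations")

    ca, cb = len(feats_a[0]), len(feats_b[0])
    pooled = [0.0] * (ca * cb)

    # Accumulate the outer product of the two channel vectors at every location.
    for fa, fb in zip(feats_a, feats_b):
        for i, a in enumerate(fa):
            for j, b in enumerate(fb):
                pooled[i * cb + j] += a * b

    # Signed square root, then L2 normalization (assumed post-processing).
    pooled = [math.copysign(math.sqrt(abs(v)), v) for v in pooled]
    norm = math.sqrt(sum(v * v for v in pooled))
    return [v / norm for v in pooled] if norm > 0 else pooled
```

The output length grows as Ca * Cb, which is the cost that the low-rank and compact variants listed in later years try to reduce.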

2014

DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition. Donahue J / Darrell T. UC Berkeley, US. ICML 2014. [Paper]
- CNN Features + Part Localization
Part-based R-CNNs for Fine-grained Category Detection. Zhang N / Darrell T. UC Berkeley, US. CVPR 14. [Paper]
- Extends R-CNN to Detect Parts with Geometric Constraints
Evaluation of Output Embeddings for Fine-Grained Image Classification. Akata Z / Schiele B. Max Planck Institute for Informatics, DE. arXiv 14/09 / CVPR 15. [Paper]
- Learning from Web Text + Text-Based Zero-Shot Classification
The application of two-level attention models in deep convolutional neural network for fine-grained image classification. Xiao TJ / Zhang Z. Peking U, CN. arXiv 14/11 / CVPR 15. [Paper]
- Bounding-Box Free Cropping (Weakly Supervised) via Multi-Stage Architecture
Bird Species Categorization Using Pose Normalized Deep Convolutional Nets. Branson S / Perona P. Caltech, US. BMVC 14. [Paper]
- Pose-Normalized CNN + Fine-tuning
Attention for Fine-Grained Categorization. Sermanet P / Real E. Google. arXiv 14/12 / ICLR 15 Workshop. [Paper]
- Large-Scale Pretraining on CNN + RNN Attention for Weakly Supervised Crops
Learning Category-Specific Dictionary and Shared Dictionary for Fine-Grained Image Categorization. Gao SH / Ma Y. Advanced Digital Sciences, SN. TIP 14. [Paper]
- Category Specific and Shared Codebooks
Jointly Optimizing 3D Model Fitting and Fine-Grained Classification. Lin YL / Davis LS. National Taiwan U, TW. ECCV 14. [Paper]
- 3D Model Fitting as Auxiliary Task
Fine-grained visual categorization via multi-stage metric learning. Qian Q / Lin YQ. Michigan State U, US. arXiv 14/02 / CVPR 15. [Paper]
- Multi-Stage Distance Metric (Pull Positive Pairs and Push Negative Pairs, Contrastive-like) + KNN Classifier
Revisiting the Fisher vector for fine-grained classification. Gosselin PH / Jegou H / Perronnin F. ETIS ENSEA / Inria, FR. Pattern Recognition Letters 2014. [Paper]
- Fisher Vector Scaling for FGIR
Learning Features and Parts for Fine-Grained Recognition. Krause J / Fei-Fei L. Stanford U, US. CVPR 14. [Paper]
- CNN + Unsupervised Part Discovery for Focusing on CNN Regions (No multi-stage)
Nonparametric Part Transfer for Fine-Grained Recognition. Goring C / Denzler J. University Jena, DE. CVPR 14. [Paper]
- Find training images with a similar shape to the current image, then transfer their part annotations

2013

POOF: Part-Based One-vs.-One Features for Fine-Grained Categorization, Face Verification, and Attribute Estimation. Berg T / Belhumeur P. Columbia U, US. CVPR 13. [Paper]
- Align two images, divide into small patches, classify and distinguish between patches, select most discriminative then classify again
Fine-Grained Crowdsourcing for Fine-Grained Recognition. Deng J / Fei-Fei L. Stanford U, US. CVPR 13. [Paper]
- Crowdsource discriminative regions and algorithm to make use of them
Symbiotic Segmentation and Part Localization for Fine-Grained Categorization. Chai YN / Zisserman A. U of Oxford, UK. ICCV 13. [Paper]
- Joint loss for parts + foreground / background segmentation
Deformable Part Descriptors for Fine-Grained Recognition and Attribute Prediction. Zhang N / Darrell T. UC Berkeley, US. ICCV 13. [Paper]
- Part localization + pose normalization
Efficient Object Detection and Segmentation for Fine-Grained Recognition. Angelova A / Zhu SH. NEC Labs America, US. CVPR 13. [Paper]
- Detect and segment object then crop
Fine-Grained Categorization by Alignments. Gavves E / Tuytelaars T. U of Amsterdam, NL. ICCV 13. [Paper]
- Align images then predict parts based on similar images in train set
Style Finder: Fine-Grained Clothing Style Recognition and Retrieval. Di W / Sundaresan N. UC San Diego, US. CVPR Workshop 13. [Paper]
- Clothing dataset
Hierarchical Part Matching for Fine-Grained Visual Categorization. Xie LX / Zhang B. Tsinghua U, CN. ICCV 13. [Paper]
- Segmentation into semantic parts + combining mid-level features
Multi-level Discriminative Dictionary Learning towards Hierarchical Visual Categorization. Shen L / Huang QM. U of Chinese Academy of Sciences, CN. CVPR 13. [Paper]
- Hierarchical classification
Vantage Feature Frames for Fine-Grained Categorization. Sfar A / Geman D. INRIA Saclay, FR. CVPR 13. [Paper]
- Find points and orientations from which to distinguish fine-grained details (inspired by the experts' approach)
Con-text: text detection using background connectivity for fine-grained object classification. Karaoglu S / Gevers T. U of Amsterdam, NL. ACM MM 13. [Paper]
- Text detection (foreground) by reconstructing the background using morphology, then subtracting the background

2012

Discovering localized attributes for fine-grained recognition. Duan K / Grauman K. Indiana U, US. CVPR 12. [Paper]
- Detection of human interpretable attributes
Unsupervised Template Learning for Fine-Grained Object Recognition. Yang SL / Shapiro L. U of Washington, US. NIPS 12. [Paper]
- Detect templates and use them to align images
A codebook-free and annotation-free approach for fine-grained image categorization. Yao BP / Fei-Fei L. Stanford U, US. CVPR 12. [Paper]
- Template-based similarity matching between random templates

2011

Combining randomization and discrimination for fine-grained image categorization. Yao BP / Fei-Fei L. Stanford U, US. CVPR 11. [Paper]
- Random forest + discriminative trees
Fisher Vectors for Fine-Grained Visual Categorization. Sanchez J / Akata Z. Xerox. FGVC Workshop in CVPR 11. [Paper]
- Fisher vectors

Datasets

SkinCon: A skin disease dataset densely annotated by domain experts for fine-grained model debugging and analysis [Paper]
GrainSpace: A Large-scale Dataset for Fine-grained and Domain-adaptive Recognition of Cereal Grains [Paper]
FAIR1M: A Benchmark Dataset for Fine-grained Object Recognition in High-Resolution Remote Sensing Imagery [Paper]
Yoga-82: A New Dataset for Fine-grained Classification of Human Poses [Paper]
Building a bird recognition app and large scale dataset with citizen scientists: The fine print in fine-grained dataset collection [Paper]
A large-scale car dataset for fine-grained categorization and verification [Paper]
Birdsnap: Large-Scale Fine-Grained Visual Categorization of Birds [Paper]
3D Object Representations for Fine-Grained Categorization [Paper]
Fine-Grained Visual Classification of Aircraft [Paper]
Novel Dataset for Fine-Grained Image Categorization: Stanford Dogs [Paper]

FGIR-OSI

Acknowledgement

Thanks Awesome-Crowd-Counting for the template.