I tried to condense the (main) contributions (or the used methodology) from each paper into a line or two to observe trends across years.
I also made a companion website on GitHub Pages with per-year summaries of all papers, plus 1-slide summaries of close to 200 surveyed papers.
Paper scraping description in link.
If you have any problems, suggestions or improvements, please submit an issue or PR.
-
Fine-Grained Image Analysis With Deep Learning: A Survey. [Paper]
-
A survey on deep learning-based fine-grained object classification and semantic segmentation. [Paper]
-
SaSPA: Advancing Fine-Grained Classification by Structure and Subject Preserving Augmentation. Michaeli E / Fried O. Reichman U, IL. arXiv 24/06. [Paper] [Project Page] [Code]
- Class-consistent data augmentations through a pipeline consisting of GPT-4 prompts and ControlNet + BLIP-Diffusion
-
Fine-Grained Visual Classification via Internal Ensemble Learning Transformer. Xu Q / Luo B. Anhui University, CN. Transactions on Multimedia 2023. [Paper]
- Select intermediate tokens based on head-wise attention voting average + Gaussian kernel -> multi-layer refinement, dynamic ratio of intermediate layers' contributions for refinement modules
-
Dual Transformer with Multi-Grained Assembly for Fine-Grained Visual Classification. Ji RY / Wu YJ. Chinese Academy of Sciences, CN. TCSVT 23. [Paper]
- Early crop based on 1st layer attention, attention to select tokens from intermediate features, cross-attention for interactions between CLS token of global and crops and features of other branch
-
Fine-grained Classification of Solder Joints with α-skew Jensen-Shannon Divergence. Ulger F / Gokcen D. TCPMT 23. [Paper]
- Maximize entropy to penalize overconfidence
-
Shape-Aware Fine-Grained Classification of Erythroid Cells. Wang Y / Zhou Y. JLU, CN. Applied Intelligence 23. [Paper]
- Dataset and method for fine-grained erythroid cell classification
-
Test-Time Amendment with a Coarse Classifier for Fine-Grained Classification. Jain K / Gandhi V. IIIT Hyderabad, IN. arXiv 2023/02. [Paper]
- Hierarchical prediction by taking into account predictions from coarser levels (multiplication of scores)
-
Semantic Feature Integration network for Fine-grained Visual Classification. Wang H / Luo HC. Jiangnan U, CN. arXiv 23/02. [Paper]
- Intermediate predictions classifiers + loss (similar to SAC arXiv 22 and PIM arXiv 22) + sequence of modules to refine most discriminative intermediate features
-
Learning Common Rationale to Improve Self-Supervised Representation for Fine-Grained Visual Recognition Problems. Shu YY / Hengel AVD / Liu LQ. U of Adelaide, AU. arXiv 23/03. [Paper]
- Extends SAM (ECCV 22) for the self-supervised setting (add GradCAM branch trained with KD loss to predict discriminative regions)
-
Fine-grained Visual Classification with High-temperature Refinement and Background Suppression. Chou PY / Lin CH. National Taiwan Normal U, TW. arXiv 23/03. [Paper]
- Extends PIM (arXiv 22) with loss to suppress background (predict -1 for background regions) + KD loss between two inter classifiers
-
MetaFormer: A Unified Meta Framework for Fine-Grained Recognition. Diao QS / Yuan Z. ByteDance, CN. arXiv 22/03. [Paper]
- Incorporate multimodality data as extra information (date, location, text, attributes, etc)
-
Dynamic MLP for Fine-Grained Image Classification by Leveraging Geographical and Temporal Information. Yang LF / Yang J. Nanjing U of S&T, CN. CVPR 2022. [Paper]
- Incorporate metadata (date/loc)
-
Dual Cross-Attention Learning for Fine-Grained Visual Categorization and Object Re-Identification. Zhu HW / Shan Y. AMD, CN. CVPR 22. [Paper]
- Cross-attention between selected queries and all keys/values for refinement + cross-attention for regularization (mix queries/keys/values from two images)
-
SIM-Trans: Structure Information Modeling Transformer for Fine-grained Visual Categorization. Sun HB / Peng YX. Peking U, CN. ACM MM 22. [Paper]
- Refine attention selected tokens using GCN & polar coordinates + contrastive loss for last 3 layers
-
A Novel Plug-in Module for Fine-Grained Visual Classification. Chou PY / Kao WC. National Taiwan Normal U, TW. arXiv 22/02. [Paper]
- Intermediate classifier distribution sharpness as metric to select intermediate features + GCN to combine
-
ViT-NeT: Interpretable Vision Transformers with Neural Tree Decoder. Kim SW / Ko BC. Keimyung U, KR. ICML 22. [Paper]
- Binary tree with differentiable routing and refinement at each node/leaf
-
Fine-Grained Object Classification via Self-Supervised Pose Alignment. Yang XH / Tian YH. Peng Cheng Lab, CN. CVPR 22. [Paper]
- Intermediate features classifiers with different label smoothing levels and graph matching to align parts for contrastive learning
-
On the Eigenvalues of Global Covariance Pooling for Fine-grained Visual Recognition. Song Y / Wang W. U of Trento, IT. TPAMI 22. [Paper]
- Second order methods (B-CNN) weaknesses: small eigenvalues so propose scaling factor to magnify
-
Improving Fine-Grained Visual Recognition in Low Data Regimes via Self-Boosting Attention Mechanism. Shu YY / Liu LQ. U of Adelaide, AU. ECCV 22. [Paper]
- KL divergence between CAMs and convolutional projection as auxiliary task
-
SR-GNN: Spatial Relation-aware Graph Neural Network for Fine-Grained Image Categorization. Bera A / Behera A. BITS, IN / Edge Hill U, UK. TIP 22. [Paper]
- Divide into regions, refinement with GNN and SA
-
Cross-Part Learning for Fine-Grained Image Classification. Liu M / Zhao Y. Beijing Jiaotong University, CN. TIP 2022. [Paper]
- Multi-stage processing and localization (object -> parts) + refinement
-
Convolutional Fine-Grained Classification With Self-Supervised Target Relation Regularization. Liu KJ / Jia K. South China U of Technology, CN / Peng Cheng Lab, CN. arXiv 22/08. [Paper]
- Class center + distance between graphs as self-supervised loss
-
R2-Trans: Fine-Grained Visual Categorization with Redundancy Reduction. Wang Y / You XG. Huazhong U, CN. arXiv 22/04. [Paper]
- Mask tokens based on attention + information theory inspired loss
-
Knowledge Mining with Scene Text for Fine-Grained Recognition. Wang H / Liu WY. Huazhong U of Science and Technology, CN / Tencent, CN. CVPR 22. [Paper]
- Incorporate Wikipedia knowledge from scene text as additional data
-
Fine-Grained Visual Classification using Self Assessment Classifier. Do T / Nguyen A. AIOZ, SG / U of Liverpool, UK. arXiv 22/05. [Paper]
- Predict once, augment top-k predictions with class text names to predict again
-
Exploiting Web Images for Fine-Grained Visual Recognition via Dynamic Loss Correction and Global Sample Selection. Liu HF / Xiu WS / Tang ZM. Nanjing U of S&T, CN. TMM 2022. [Paper]
- Web images for fine-grained recognition
-
Cross-layer Attention Network for Fine-grained Visual Categorization. Huang RR / Yang HZ. Tsinghua U, CN. arXiv 22/10 / CVPR 22 FGVC8 Workshop. [Paper]
- Refine intermediate features with top-level and top-level with intermediate features
-
Anime Character Recognition using Intermediate Features Aggregation. Rios EA / Lai BC. National Yang Ming Chiao Tung U, TW. ISCAS 22. [Paper]
- Concatenate ViT intermediate CLS tokens and forward through fully connected layer to aggregate intermediate features + incorporate tag information as additional data
-
Fine-grained visual classification with multi-scale features based on self-supervised attention filtering mechanism. Chen H / Ling W. Guangdong U of T, CN. Applied Intelligence 2022. [Paper]
- Attention map filtering and multi-scale
-
Bridge the Gap between Supervised and Unsupervised Learning for Fine-Grained Classification. Wang JB / Wei XS / Zhang R. Army Engineering U of PLA, CN / Nanjing U, CN. arXiv 22/03. [Paper]
- Study on unsupervised fine-grained (no labels, clustering-based)
-
PEDTrans: A fine-grained visual classification model for self-attention patch enhancement and dropout. Lin XH / Chen YF. China Agricultural U, CN. ACCV 22. [Paper]
- Patch dropping based on similarity (outer product/bilinear pooling) + refinement of patches before transformer
-
Iterative Self Knowledge Distillation -- from Pothole Classification to Fine-Grained and Covid Recognition. Peng KC. Mitsubishi MERL, US. ICASSP 22. [Paper]
- Use student from previous iteration as teacher, recursively
-
Fine-grain Inference on Out-of-Distribution Data with Hierarchical Classification. Linderman R / Chen Y. Duke U, US. NeurIPS 22 Workshop. [Paper]
- Hierarchical OOD fine-grained with inference stopping criterion
-
Semantic Guided Level-Category Hybrid Prediction Network for Hierarchical Image Classification. Wang P / Qian YT. Zhejiang University, CN. arXiv 2022/11. [Paper]
- Hierarchical prediction taking into account “quality” (noise, occlusion, blur or low resolution) to decide classification level
-
Data Augmentation Vision Transformer for Fine-grained Image Classification. Hu C / Wu WJ. Unknown affiliation. arXiv 22/11. [Paper]
- Crops based on single-layer (5th) attention + TransFG’s PSM module between 2 layers (recursive matrix-matrix attention)
-
Medical applications (COVID, kidney pathology, renal and ocular disease):
-
Self-supervision and Multi-task Learning: Challenges in Fine-Grained COVID-19 Multi-class Classification from Chest X-rays. Ridzuan M / Yaqub M. MBZUAI, AE. MIUA 22. [Paper]
-
Automatic Fine-grained Glomerular Lesion Recognition in Kidney Pathology. Nan Y / Yang G. Imperial College London, UK. Pattern Recognition 22. [Paper]
-
Holistic Fine-grained GGS Characterization: From Detection to Unbalanced Classification. Lu YZ / Huo YK. Vanderbilt U, US. Journal Medical Imaging 2022. [Paper]
-
CDNet: Contrastive Disentangled Network for Fine-Grained Image Categorization of Ocular B-Scan Ultrasound. Dan RL / Wang YQ. Hangzhou Dianzi U, CN. arXiv 22/06. [Paper]
-
-
Snake competition methodologies:
-
Solutions for Fine-grained and Long-tailed Snake Species Recognition in SnakeCLEF 2022. Zou C / Cheng Y. Ant Group, CN. Conference and Labs of the Evaluation Forum 2022. [Paper]
-
Explored An Effective Methodology for Fine-Grained Snake Recognition. Huang Y / Feng JH. Huazhong U of Science and T, CN / Alibaba, CN. CLEF 22. [Paper]
-
-
First ViTs for FGIR:
-
TransFG: A Transformer Architecture for Fine-Grained Recognition. He J / Wang CH. Johns Hopkins U / ByteDance. arXiv 21/03 / AAAI 22. [Paper]
- First to apply ViT for FGIR: overlapping patchifier convolution, recursive layer-wise matrix-matrix multiplication to aggregate attention and select features from last layer, contrastive loss
-
Feature Fusion Vision Transformer for Fine-Grained Visual Categorization. Wang J / Gao YS. U of Warwick, UK / Griffith U, AU. BMVC 21. [Paper]
- ViT for FGIR, select intermediate tokens based on layer-wise attention
-
RAMS-Trans: Recurrent Attention Multi-scale Transformer for Fine-grained Image Recognition. Hu YQ / Xue H. Zhejiang U / Alibaba, CN. ACM MM 21. [Paper]
- ViT for FGIR, select regions to crop based on recursive layer-wise attention matrix-matrix multiplication + individual CLS token for crops
-
Transformer with peak suppression and knowledge guidance for fine-grained image recognition. Liu XD / Han XG. Beihang U, CN. Neurocomputing 22. [Paper]
- ViT for FGIR, mask tokens of top attention to prevent overconfident predictions, learnable class matrix to augment output
-
A free lunch from ViT: adaptive attention multi-scale fusion Transformer for fine-grained visual recognition. Zhang Y / Chen WQ. Peking U / Alibaba, CN. arXiv 21/08 / ICASSP 22. [Paper]
- ViT for FGIR, crops based on head-wise element-wise multiplications of attention heads and aggregating through SE-like mechanism to reweight different layers' attentions
-
Exploring Vision Transformers for Fine-grained Classification. Conde MV / Turgutlu K. U of Valladolid, ES. CVPR Workshop 21. [Paper]
- ViT for FGIR, attention rollout + morphological operations for recursive cropping / masking
-
Complemental Attention Multi-Feature Fusion Network for Fine-Grained Classification. Miao Z / Li H. Army Eng U of PLA, CN. Signal Proc Letters 21. [Paper]
- Reweight Swin features based on importance and divide into two branches (discriminative and not)
-
Part-Guided Relational Transformers for Fine-Grained Visual Recognition. Zhao YF / Tian YH. Beihang U, CN. TIP 21. [Paper]
- Transformer with positional embeddings from CNN features to refine global and part features
-
A Multi-Stage Vision Transformer for Fine-grained Image Classification. Huang Z / Zhang HB. Huaqiao U, CN. ITME 21. [Paper]
- ViT for FGIR with pooling layer to build multiple stages in transformer
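Many of the ViT entries above select tokens or crops by multiplying attention matrices across layers. A minimal sketch of attention rollout in pure Python, assuming head-averaged, row-normalized attention matrices with the CLS token at index 0. The 0.5/0.5 identity mixing follows the usual rollout formulation and is an assumption here, since individual papers differ in these details (some multiply the raw attention matrices directly):

```python
def matmul(a, b):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def attention_rollout(layer_attentions):
    """Aggregate attention over layers (first layer first).

    Each element is an [n x n] head-averaged attention matrix whose rows sum to 1.
    """
    n = len(layer_attentions[0])
    rollout = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for attn in layer_attentions:
        # Mix in the identity to account for the residual connection.
        mixed = [[0.5 * attn[i][j] + (0.5 if i == j else 0.0) for j in range(n)]
                 for i in range(n)]
        # Renormalize rows so each one is still a distribution over tokens.
        mixed = [[v / sum(row) for v in row] for row in mixed]
        # Recursive layer-wise matrix-matrix multiplication.
        rollout = matmul(mixed, rollout)
    return rollout


# Toy example: 3 tokens (CLS + 2 patches), 2 layers, made-up attention values.
layer1 = [[0.2, 0.7, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8]]
layer2 = [[0.4, 0.5, 0.1], [0.2, 0.6, 0.2], [0.3, 0.2, 0.5]]

rollout = attention_rollout([layer1, layer2])
cls_to_patches = rollout[0][1:]  # how much the CLS token attends to each patch
top_patch = max(range(len(cls_to_patches)), key=cls_to_patches.__getitem__)
print(cls_to_patches, top_patch)
```

The CLS row of the result can then be thresholded or ranked to pick tokens, crops or masks, which is where the listed methods diverge.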
-
-
AP-CNN: Weakly Supervised Attention Pyramid Convolutional Neural Network for Fine-Grained Visual Classification. Ding YF / Ma ZY / Ling HB. Beijing U of Posts & Telecomms, CN. TIP 21. [Paper]
- FPN with top-down & bottom-up paths + merged ROI cropping + ROI masking
-
Counterfactual Attention Learning for Fine-Grained Visual Categorization and Re-identification. Rao YM / Zhou J. Tsinghua U, CN. ICCV 21. [Paper]
- Builds on WS-DAN (attention crop & mask) by making predictions with counterfactual (fake) attention maps to learn better attention maps
-
Neural Prototype Trees for Interpretable Fine-grained Image Recognition. Nauta M / Seifert C. University of Twente, NL. CVPR 21. [Paper]
- Binary trees based on similarity to prototypes + pruning
-
SnapMix: Semantically Proportional Mixing for Augmenting Fine-grained Data. Huang SL / Tao DC. U of Sydney, AU. AAAI 21. [Paper]
- CutMix (cut part from one image into another as data aug) with asymmetric crops + assign labels based on CAMs
-
Intra-class Part Swapping for Fine-Grained Image Classification. Zhang LB / Huang SL / Liu W. U of Technology Sydney, AU. WACV 2021. [Paper]
- CutMix images from same class only + affine transform guided by CAMs for mixing
-
Stochastic Partial Swap: Enhanced Model Generalization and Interpretability for Fine-grained Recognition. Huang SL / Tao DC. The University of Sydney, AU. ICCV 21. [Paper]
- Intermediate classifiers + changing features of one image with another randomly to inject noise
-
Enhancing Mixture-of-Experts by Leveraging Attention for Fine-Grained Recognition. Zhang LB / Huang SL / Liu W. U of Technology Sydney / U of Sydney, AU. TMM 21. [Paper]
- CutMix based on activations from last conv layer, same class only, crops also based on activations from last conv
-
Multiresolution Discriminative Mixup Network for Fine-Grained Visual Categorization. Xu KR / Li YS. Xidian U, CN. TNNLS 21. [Paper]
- Mixup based on CAM attention + distillation from multiple high resolution crops to single low resolution crop
-
Context-aware Attentional Pooling (CAP) for Fine-grained Visual Classification. Behera A / Bera A. Edge Hill U, UK. AAAI 21. [Paper]
- Combine cross-regions features with attention + LSTM + learnable pooling
-
A Realistic Evaluation of Semi-Supervised Learning for Fine-Grained Classification. Su JC / Maji S. U of Massachusetts Amherst, US. CVPR 21. [Paper]
- In-depth study on fine-grained semi-supervised learning
-
MaskCOV: A random mask covariance network for ultra-fine-grained visual categorization. Yu XH / Xiong SW. Griffith U, AU / Wuhan U of T, CN. Pattern Recognition 21. [Paper]
- Masking and shuffling of patches as data aug, predict covariance as auxiliary task
-
Benchmark Platform for Ultra-Fine-Grained Visual Categorization Beyond Human Performance. Yu XH / Xiong SW. Griffith U, AU / Wuhan U of T, CN. ICCV 21. [Paper]
- Dataset for ultra-fine-grained recognition of leaves
-
Human Attention in Fine-grained Classification. Rong Y / Kasneci E. University of Tübingen, DE. BMVC 21. [Paper]
- Human attention/gaze for crops/extra modality data
-
Fair Comparison: Quantifying Variance in Results for Fine-grained Visual Categorization. Gwilliam M / Farrell R. Brigham Young U, US / U of Maryland, US. WACV 21. [Paper]
- Study on the failure of single top-1 accuracy as metric for FGIR, suggest using class variance and standard deviation and mean of multiple experiments with different random seeds
-
Learning Canonical 3D Object Representation for Fine-Grained Recognition. Joung SH / Sohn KH. Yonsei U, KR. ICCV 21. [Paper]
- Learn 3D representations as auxiliary task for fine-grained recognition
-
Multi-branch and Multi-scale Attention Learning for Fine-Grained Visual Categorization. Zhang F / Liu YZ. China U of Mining and T, CN. MMM 21. [Paper]
- Feature maps of multiple layers (instead of one) to guide cropping
-
CLIP-Art: Contrastive Pre-training for Fine-Grained Art Classification. Conde MV / Turgutlu K. U of Valladolid, ES. CVPR Workshop 21. [Paper]
- Applies CLIP for fine-grained art recognition
-
Graph-based High-Order Relation Discovery for Fine-grained Recognition. Zhao YF / Li J. Beihang University, CN. CVPR 21. [Paper]
- Extend on bi/trilinear pooling + GCN for refining features
-
Progressive Learning of Category-Consistent Multi-Granularity Features for Fine-Grained Visual Classification. Du RY / Ma ZY / Guo J. Beijing U of Posts and Telecomms, CN. TPAMI 21. [Paper]
- Extended journal version of PMG (ECCV 20): progressive training with block-based processing + pair category consistency loss between same class images
-
Webly Supervised Fine-Grained Recognition: Benchmark Datasets and An Approach. Sun ZR / Wei XS / Shen HT. Nanjing U of S&T / Nanjing U, CN. ICCV 21. [Paper]
- Dataset for fine-grained recognition with noisy web labels and method to train with noisy labels
-
Re-rank Coarse Classification with Local Region Enhanced Features for Fine-Grained Image Recognition. Yang SK / Liu S / Wang CH. ByteDance, CN. arXiv 21/02. [Paper]
- Automatic hierarchy based on clustering, triplet loss to guide crops, similarity to class database to re-classify images (compared to coarse classifier)
-
Progressive Co-Attention Network for Fine-grained Visual Classification. Zhang T / Ma ZY / Guo J. Beijing U of Posts and Telecomms, CN. VCIP 21. [Paper]
- Interaction between pairs of images using bilinear pooling
-
Subtler mixed attention network on fine-grained image classification. Liu C / Zhang WF. Ocean U of China, CN. Applied Intelligence 21. [Paper]
- Spatial and channel attention on parts
-
Dynamic Position-aware Network for Fine-grained Image Recognition. Wang SJ / Li HJ / Ouyang WL. Dalian U of T, CN. AAAI 21. [Paper]
- Horizontal and vertical pooling + learnable sin/cos positional embeddings + GCN for crops
-
Learning Scale-Consistent Attention Part Network for Fine-Grained Image Recognition. Liu HB / Lin WY. Shanghai Jiaotong U, CN. TMM 21. [Paper]
- SE-like + Gumbel softmax trick + scale-consistency for parts detection + self-attention for parts relations
-
Multi-branch Channel-wise Enhancement Network for Fine-grained Visual Recognition. Li GJ / Zhu FT. University of Shanghai for Science and Technology, CN. ACM MM 21. [Paper]
- Multi-size spatial shuffling (similar to DCL (CVPR 19) but with multiple sizes of shuffling)
-
Fine-Grained Categorization From RGB-D Images. Tan YH / Lu K. Chinese Academy of Sciences, CN. TMM 21. [Paper]
- Dataset and network for incorporating RGB and depth images
-
The Devil is in the Channels: Mutual-Channel Loss for Fine-Grained Image Classification. Chang DL / Song YZ. U of Posts and Telecomms, CN. TIP 20. [Paper]
- Channel groups loss to make each channel group discriminative and focus on different spatial regions
-
Learning Attentive Pairwise Interaction for Fine-Grained Classification. Zhuang PQ / Qiao Y. Chinese Acad. of Sciences, CN. AAAI 20. [Paper]
- Pairwise interactions between pairs of images from same/different class
-
Fine-Grained Visual Classification via Progressive Multi-Granularity Training of Jigsaw Patches. Du RY / Guo J. U of Posts and Telecomms, CN. ECCV 20. [Paper]
- Jigsaw puzzle for data augmentation of different network stages, training each stage progressively and classifier for each stage
-
Channel Interaction Networks for Fine-Grained Image Categorization. Gao Y / Scott M. Malong Technologies, CN. AAAI 20. [Paper]
- Trilinear pooling + contrastive loss to pull images from same class together and push images from different class apart
-
ELoPE: Fine-Grained Visual Classification with Efficient Localization, Pooling and Embedding. Hanselmann H / Ney H. RWTH Aachen U, DE. WACV 20. [Paper]
- Small CNN to predict crops + embedding loss w/ class centers
-
Fine-Grained Visual Classification with Efficient End-to-end Localization. Hanselmann H / Ney H. arXiv 20/05. [Paper]
- End-to-end training of small CNN + STN
-
Attentional Kernel Encoding Networks for Fine-Grained Visual Categorization. Hu YT / Zhen XT. Beihang U, CN. TCSVT 20. [Paper]
- Cascaded attention + Fourier/cosine kernel (cos of input)
-
Bi-Modal Progressive Mask Attention for Fine-Grained Recognition. Song KT / Wei XS / Lu JF. Nanjing U of S&T, CN. TIP 20. [Paper]
- Multi-stage fusion of vision (CNN) & text (LSTM) with vision-/language-only attention & cross-modality attention and intermediate classifiers
-
Hierarchical Image Classification using Entailment Cone Embeddings. Dhall A / Krause A. ETH Zurich, CH. CVPR Workshop 20. [Paper]
- Comparison on losses and embeddings for hierarchical classification
-
Learning Semantically Enhanced Feature for Fine-Grained Image Classification. Luo W / Wei XS. IEEE, US. Signal Processing Letters 20. [Paper]
- Group feature channels based on semantics and KD from global features to groups
-
An Adversarial Domain Adaptation Network For Cross-Domain Fine-Grained Recognition. Wang YM / Wei XS / Zhang LJ. Nanjing U, CN / Megvii. WACV 20. [Paper]
- Adversarial loss to distinguish domains + loss to pull features from same class together + attention binary mask for removing BG
-
Group Based Deep Shared Feature Learning for Fine-grained Image Classification. Li XL / Monga V. Pennsylvania State University, US. BMVC 20. [Paper]
- Autoencoder with class/shared center loss to divide features into class-specific and shared
-
Beyond the Attention: Distinguish the Discriminative and Confusable Features For Fine-grained Image Classification. Shi XR / Liu W. Beijing U of Posts and Telecomm, CN. ACM MM 20. [Paper]
- Divide into discriminative/confusing regions w/ SE to refine features, intermediate losses for classification, pulling features of images with same label closer (L1) and maximizing entropy of confusing features (pseudolabel of 1 to all classes -> background)
-
Fine-Grained Classification via Categorical Memory Networks. Deng WJ / Zheng L. Australian National U, AU. arXiv 20/12 / TIP 22. [Paper]
- Augment feature with class-specific memory module (learned average based on previous samples and how similar / how it reacts to new samples)
-
Interpretable and Accurate Fine-grained Recognition via Region Grouping. Huang ZX / Li Y. U of Wisconsin-Madison, US. CVPR 20. [Paper]
- Part assignment, feature refinement and weighted classification
-
Filtration and Distillation: Enhancing Region Attention for Fine-Grained Visual Categorization. Liu CB / Zhang YD. U of S&T of China, CN. AAAI 20. [Paper]
- RPN with losses for consistency between proposals from RPN and main feature extractor + KD between object and parts
-
Graph-Propagation Based Correlation Learning for Weakly Supervised Fine-Grained Image Classification. Wang ZH / Li HJ / Li JJ. Dalian U of T, CN. AAAI 20. [Paper]
- GCN for graph propagation for discriminative feature selection (crops) + losses for cropping
-
Weakly Supervised Fine-grained Image Classification via Gaussian Mixture Model Oriented Discriminative Learning. Wang ZH / Li HJ / Li ZZ. Dalian U of T, CN. CVPR 20. [Paper]
- Gaussian mixture model to learn low rank feature maps for selecting crops
-
Category-specific Semantic Coherency Learning for Fine-grained Image Recognition. Wang SJ / Li HJ / Ouyang WL. Dalian U of T, CN. ACM MM 20. [Paper]
- Latent attributes prediction, alignment, reordering and patch-wise attention for selecting crops
-
Multi-Objective Matrix Normalization for Fine-Grained Visual Recognition. Min SB / Zhang YD. U of S&T of China, CN. TIP 20. [Paper]
- Matrix normalization for bilinear pooling
-
Power Normalizations in Fine-Grained Image, Few-Shot Image and Graph Classification. Koniusz P / Zhang HG. Australian National U, AU. TPAMI 20. [Paper]
- Study on normalizations for B-CNN
-
Fine-grained Image Classification and Retrieval by Combining Visual and Locally Pooled Textual Features. Mafla A / Karatzas D. UAB, ES. WACV 20. [Paper]
- Extract and incorporate text in images for FGIR
-
Multi-Modal Reasoning Graph for Scene-Text Based Fine-Grained Image Classification and Retrieval. Mafla A / Karatzas D. UAB, ES. arXiv 20/09 / WACV 21. [Paper]
- Expands on previous by encoding multimodality with GCN
-
Focus Longer to See Better: Recursively Refined Attention for Fine-Grained Image Classification. Shroff P / Wang ZY. Texas A&M U, US. CVPR Workshop 20. [Paper]
- Recursive LSTM for encoding cropped features
-
Fine-Grained Visual Categorization by Localizing Object Parts With Single Image. Zheng XT / Lu XQ. Chinese Acad of Sciences, CN. TMM 20. [Paper]
- Cluster feature maps of multiple layers
-
Microscopic Fine-Grained Instance Classification Through Deep Attention. Fan MR / Rittscher J. U of Oxford, UK. MICCAI 20. [Paper]
- Attention crops for microscopic applications
-
Destruction and Construction Learning for Fine-Grained Image Recognition. Chen Y / Mei T. JD AI Research, CN. CVPR 19. [Paper]
- Shuffling local regions in an image (destruction) + learning to predict original locations (construction) + adversarial loss to distinguish shuffled from not
-
Looking for the Devil in the Details: Learning Trilinear Attention Sampling Network for Fine-Grained Image Recognition. Zheng HL / Luo JB. U of S&T of China, CN. CVPR 19. [Paper]
- Trilinear attention (X Xᵀ X) for crops + KD loss between crops & original
-
Weakly Supervised Complementary Parts Models for Fine-Grained Image Classification From the Bottom Up. Ge WF / Yu YZ. U of Hong Kong, HK. CVPR 19. [Paper]
- Weakly supervised detection/segmentation with Mask R-CNN, CAMs & CRFs + LSTM for aggregating features from original and crops
-
See Better Before Looking Closer: Weakly Supervised Data Augmentation Network for Fine-Grained Visual Classification. Hu T / Lu Y. Chinese Academy of Sciences, CN / Microsoft. arXiv 19. [Paper]
- Attention masking (and cropping) + moving average center loss to guide attention maps
-
Selective Sparse Sampling for Fine-grained Image Recognition. Ding Y / Jiao JB. U of Chinese Academy of Sciences, CN. ICCV 19. [Paper]
- CAMs peaks + Gaussians based on classification entropy (confidence) for resampling images (cropping with convs)
-
Cross-X Learning for Fine-Grained Visual Categorization. Luo W / Lim S. South China Agricultural University, CN / FB. ICCV 19. [Paper]
- Multiple excitations (OSME with loss to distinguish excitations) with intermediate features (FPN + KD loss between intermediate predictions)
-
P-CNN: Part-Based Convolutional Neural Networks for Fine-Grained Visual Categorization. Han JW / Xu D. Northwestern Polytechnical U, CN / U of Sydney, AU. TPAMI 19. [Paper]
- Cluster peak channel responses using K-means as part detectors
-
Learning Rich Part Hierarchies With Progressive Attention Networks for Fine-Grained Image Recognition. Zheng HL / Luo JB / Mei T. Microsoft, CN. TIP 19. [Paper]
- Journal MA-CNN w/ refinement module and iterative training
-
Bidirectional Attention-Recognition Model for Fine-Grained Object Classification. Liu CB / Zhang YD. U of S&T of China, CN. TMM 19. [Paper]
- RPN for proposals with feedback (NTS-Net like) + multiple random erasing data augmentation
-
Deep Fuzzy Tree for Large-Scale Hierarchical Visual Classification. Wang Y / Li XQ. Tianjin U, CN. Trans. Fuzzy Systems 19. [Paper]
- Fuzzy tree based on interclass similarity for hierarchical classification
-
Part-Aware Fine-grained Object Categorization using Weakly Supervised Part Detection Network. Zhang YB / Wang ZX. South China U of Technology, CN. TMM 19. [Paper]
- RPN proposals based on channel-wise peaks + self-supervised part labeling
-
Learning to Navigate for Fine-grained Classification. Yang Z / Wang LW. Peking U, CN. ECCV 2018. [Paper]
- Feedback between networks, shared feature extractor between modules, RPN (Faster-RCNN) for part proposal
-
Large Scale Fine-Grained Categorization and Domain-Specific Transfer Learning. Cui Y / Belongie S. Cornell University, US. CVPR 18. [Paper]
- Importance of resolution, training strategy for long-tailed data, and a distance measure to capture domain similarity between datasets, for better transfer learning by training on sources similar to the target
-
Multi-Attention Multi-Class Constraint for Fine-grained Image Recognition. Sun M / Ding ER. Baidu, CN. ECCV 18. [Paper]
- Multi-excitation (squeeze-and-excitation) for feature maps + loss to pull features from same excitation closer and push features from different excitations away
-
Hierarchical Bilinear Pooling for Fine-Grained Visual Recognition. Yu CJ / You XG. Huazhong U of S&T, CN. ECCV 18. [Paper]
- Combine intermediate features by element-wise multiplications + concatenation of bilinearly pooled outputs
-
Deep Attention-Based Spatially Recursive Networks for Fine-Grained Visual Recognition. Wu L / Wang Y. U of Queensland, AU. Trans. Cybernetics 18. [Paper]
- Bilinear pooling w/o sum (outer product only, not matrix mult) + FC + softmax for attention + 2D spatial LSTM with neighborhood to aggregate features
-
Mask-CNN: Localizing Parts and Selecting Descriptors for Fine-Grained Image Recognition. Wei XS / Wu JX. Nanjing University, CN. arXiv 16/05 (Submitted to NIPS16) / Pattern Recognition 2018/04. [Paper]
- FCN for segmentation of parts + descriptor selection for GAP/GMP
-
Maximum-Entropy Fine-Grained Classification. Dubey A / Naik N. Massachusetts Institute of Technology, US. NIPS 18. [Paper]
- Prevent overconfidence with maximum-entropy loss + definition of fine-grained based on diversity
-
Fine-Grained Image Classification Using Modified DCNNs Trained by Cascaded Softmax and Generalized Large-Margin Losses. Shi WW. Xian Jiaotong U, CN. TNNLS 18. [Paper]
- Multi-objective classification with cascaded FC classifiers for each hierarchy level + loss to bring same fine-grained class together and same coarse class closer than different coarse
-
Look Closer to See Better: Recurrent Attention Convolutional Neural Network for Fine-grained Image Recognition. Fu JF / Zheng HL / Mei T. Microsoft / U of S&T of China, CN. CVPR 17. [Paper]
- Recurrent CNN with intra-scale classification loss and inter-scale pairwise ranking loss to enforce finer scales to generate more confident predictions
-
Learning Multi-Attention Convolutional Neural Network for Fine-Grained Image Recognition. Zheng HL / Mei T / Luo JB. U of S&T of China, CN / Microsoft. ICCV 17. [Paper]
- Channel grouping module to select multiple parts from CNN feature maps + loss for compact distribution and diversity with geometric constraints
-
Object-Part Attention Model for Fine-Grained Image Classification. Peng YX / Zhao JJ. Peking U, CN. arXiv 17/04 / TIP 18. [Paper]
- Automatic object localization via saliency extraction (CAM), object-part spatial constraints and clustering of parts based on clustered intermediate CNN filters
-
Low-Rank Bilinear Pooling for Fine-Grained Classification. Kong S / Fowlkes C. University of California Irvine, US. CVPR 17. [Paper]
- Bilinear pooling with low-dimensionality projection (extra FC layer)
-
Pairwise Confusion for Fine-Grained Visual Classification. Dubey A / Naik N. MIT, US. arXiv 17/05 / ECCV 18. [Paper]
- Euclidean distance loss which “confuses” the network by adding a regularization term that minimizes the distance between the predictions of two images in a mini-batch
-
Bilinear Convolutional Neural Networks for Fine-Grained Visual Recognition. Lin T / Maji S. University of Massachusetts Amherst, US. TPAMI 2017. [Paper]
- Extension and analysis of bilinear pooling
-
Fine-grained Image Classification via Combining Vision and Language. He XT / Peng YX. Peking U, CN. CVPR 17. [Paper]
- Vision (GoogLeNet) & language (CNN-RNN) two-stream network
-
Higher-Order Integration of Hierarchical Convolutional Activations for Fine-Grained Visual Categorization. Cai SJ / Zhang L. HK Polytechnic University, HK. ICCV 17. [Paper]
- Bilinear pooling for multiple layers using 1x1 Convs and concatenating intermediate outputs
-
The Devil is in the Tails: Fine-grained Classification in the Wild. Van Horn G / Perona P. Caltech, US. arXiv 17/09. [Paper]
- Discussion on challenges related to long-tailed fine-grained classification
-
BoxCars: Improving Fine-Grained Recognition of Vehicles using 3D Bounding Boxes in Traffic Surveillance. Sochor J / Herout A. Brno U of T, CZ. Transactions on ITS 17. [Paper]
- Automatic 3D BBox estimation for car recognition
-
Diversified Visual Attention Networks for Fine-Grained Object Classification. Zhao B / Yan SC. Southwest Jiaotong U, CN. arXiv 16/06 / TMM 17. [Paper]
- Multi-scale canvas for CNN extractor + LSTM to refine CNN predictions across time steps
-
Learning a Discriminative Filter Bank Within a CNN for Fine-Grained Recognition. Wang YM / Davis LS. University of Maryland, US. arXiv 16/11 / CVPR 18. [Paper]
- Two-stream head: global (original) and part with 1x1 Conv, spatial global max pooling, and filter grouping/pooling to focus on most discriminative parts
-
Picking Deep Filter Responses for Fine-grained Image Recognition. Zhang XP / Tian Q. Shanghai Jiao Tong U, CN. CVPR 16. [Paper]
- Selecting deep filters which react to parts + spatial-weighting of Fisher Vector
-
BoxCars: 3D Boxes as CNN Input for Improved Fine-Grained Vehicle Recognition. Sochor J / Havel J. Brno U of T, CZ. CVPR 16. [Paper]
- 3D BBox, vehicle orientation, and shape as extra data
-
Weakly Supervised Fine-Grained Categorization With Part-Based Image Representation. Zhang Y / Do M. A*STAR, SG. TIP 16. [Paper]
- Convolutional filters for part proposal + Fisher Vector clusters for selecting useful parts + normalized FV concatenation from different scale parts
-
Mining Discriminative Triplets of Patches for Fine-Grained Classification. Wang YM / Davis LS. U of Maryland, US. CVPR 16. [Paper]
- Triplets of patches with geometric constraints to aid mid-level representations
-
Fully Convolutional Attention Networks for Fine-Grained Recognition. Liu X / Lin YQ. Baidu, CN. arXiv 16/03. [Paper]
- Simultaneously compute parts without recursion using reinforcement learning
-
Bilinear CNN Models for Fine-grained Visual Recognition. Lin TY / Maji S. U of Massachusetts, US. ICCV 15. [Paper]
- Outer product of (two) CNN feature maps (bilinear vector) as input to classifier
-
Fine-Grained Recognition Without Part Annotations. Krause J / Fei-Fei L. Stanford U, US. CVPR 15. [Paper]
- Alignment by segmentation and pose graphs based on neighbors (highest cosine similarity of CNN features) to generate parts
-
Part-Stacked CNN for Fine-Grained Visual Categorization. Huang SL / Zhang Y. U of Technology Sydney, AU. arXiv 15/12 / CVPR 16. [Paper]
- Directly perform part-based classification on detected part locations from output feature maps using FCN, shared features, two-stage training
-
Deep LAC: Deep localization, alignment and classification for fine-grained recognition. Lin D / Jia JY. Chinese U of Hong Kong, HK. CVPR 15. [Paper]
- Backprop-able localization + alignment based on templates (clustered from train set)
-
Fine-Grained Categorization and Dataset Bootstrapping Using Deep Metric Learning with Humans in the Loop. Cui Y / Belongie S. Cornell U, US. arXiv 15/12 / CVPR 16. [Paper]
- Triplet loss with sampling strategy for hard negatives and utilizing web data (CNN recognition-based + human verified)
-
Multiple Granularity Descriptors for Fine-Grained Categorization. Wang DQ / Zhang Z. Fudan U, CN. ICCV 15. [Paper]
- Detectors and classifiers for each level of class granularity / hierarchy
-
Hyper-class Augmented and Regularized Deep Learning for Fine-grained Image Classification. Xie SN / Lin YQ. UC San Diego, US. CVPR 15. [Paper]
- Web data and other datasets based on hyperclasses (dogs & orientation of cars) + auxiliary loss to predict hyperclasses
-
Fine-Grained Image Classification by Exploring Bipartite-Graph Labels. Zhou F / Lin YQ. NEC Labs, US. arXiv 15/12 / CVPR 16. [Paper]
- Jointly model fine-grained classes with pre-defined coarse classes (attributes / tags such as ingredients or macro-categories)
-
A Fine-Grained Image Categorization System by Cellet-Encoded Spatial Pyramid Modeling. Zhang LM / Li XL. National U of Singapore, SG. TIE 15. [Paper]
- Traditional encoding
-
DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition. Donahue J / Darrell T. UC Berkeley, US. ICML 2014. [Paper]
- CNN Features + Part Localization
-
Part-based R-CNNs for Fine-grained Category Detection. Zhang N / Darrell T. UC Berkeley, US. CVPR 14. [Paper]
- Extends R-CNN to Detect Parts with Geometric Constraints
-
Evaluation of Output Embeddings for Fine-Grained Image Classification. Akata Z / Schiele B. Max Planck Institute for Informatics, DE. arXiv 14/09 / CVPR 15. [Paper]
- Learning from Web Text + Text-Based Zero-Shot Classification
-
The application of two-level attention models in deep convolutional neural network for fine-grained image classification. Xiao TJ / Zhang Z. Peking U, CN. arXiv 14/11 / CVPR 15. [Paper]
- Bounding-Box Free Cropping (Weakly Supervised) via Multi-Stage Architecture
-
Bird Species Categorization Using Pose Normalized Deep Convolutional Nets. Branson S / Perona P. Caltech, US. BMVC 14. [Paper]
- Pose-Normalized CNN + Fine-tuning
-
Attention for Fine-Grained Categorization. Sermanet P / Real E. Google. arXiv 14/12 / ICLR 15 Workshop. [Paper]
- Large-Scale Pretraining on CNN + RNN Attention for Weakly Supervised Crops
-
Learning Category-Specific Dictionary and Shared Dictionary for Fine-Grained Image Categorization. Gao SH / Ma Y. Advanced Digital Sciences, SG. TIP 14. [Paper]
- Category Specific and Shared Codebooks
-
Jointly Optimizing 3D Model Fitting and Fine-Grained Classification. Lin YL / Davis LS. National Taiwan U, TW. ECCV 14. [Paper]
- 3D Model Fitting as Auxiliary Task
-
Fine-grained visual categorization via multi-stage metric learning. Qian Q / Lin YQ. Michigan State U, US. arXiv 14/02 / CVPR 15. [Paper]
- Multi-Stage Distance Metric (Pull Positive Pairs and Push Negative Pairs, Contrastive-like) + KNN Classifier
-
Revisiting the Fisher vector for fine-grained classification. Gosselin PH / Jegou H / Perronnin F. ETIS ENSEA / Inria, FR. Pattern Recognition Letters 2014. [Paper]
- Fisher Vector Scaling for FGIR
-
Learning Features and Parts for Fine-Grained Recognition. Krause J / Fei-Fei L. Stanford U, US. CVPR 14. [Paper]
- CNN + Unsupervised Part Discovery for Focusing on CNN Regions (No multi-stage)
-
Nonparametric Part Transfer for Fine-Grained Recognition. Göring C / Denzler J. U of Jena, DE. CVPR 14. [Paper]
- Find training images with a similar shape to the current image, then transfer their part annotations
-
POOF: Part-Based One-vs.-One Features for Fine-Grained Categorization, Face Verification, and Attribute Estimation. Berg T / Belhumeur P. Columbia U, US. CVPR 13. [Paper]
- Align two images, divide into small patches, classify and distinguish between patches, select most discriminative then classify again
-
Fine-Grained Crowdsourcing for Fine-Grained Recognition. Deng J / Fei-Fei L. CVPR 13. [Paper]
- Crowdsource discriminative regions and algorithm to make use of them
-
Symbiotic Segmentation and Part Localization for Fine-Grained Categorization. Chai YN / Zisserman A. U of Oxford, UK. ICCV 13. [Paper]
- Joint loss for parts + foreground / background segmentation
-
Deformable Part Descriptors for Fine-Grained Recognition and Attribute Prediction. Zhang N / Darrell T. UC Berkeley, US. ICCV 13. [Paper]
- Part localization + pose normalization
-
Efficient Object Detection and Segmentation for Fine-Grained Recognition. Angelova A / Zhu SH. NEC Labs America, US. CVPR 13. [Paper]
- Detect and segment object then crop
-
Fine-Grained Categorization by Alignments. Gavves E / Tuytelaars T. U of Amsterdam, NL. ICCV 13. [Paper]
- Align images then predict parts based on similar images in train set
-
Style Finder: Fine-Grained Clothing Style Recognition and Retrieval. Di W / Sundaresan N. UC San Diego, US. CVPR Workshop 13. [Paper]
- Clothing dataset
-
Hierarchical Part Matching for Fine-Grained Visual Categorization. Xie LX / Zhang B. Tsinghua U, CN. ICCV 13. [Paper]
- Segmentation into semantic parts + combining mid-level features
-
Multi-level Discriminative Dictionary Learning towards Hierarchical Visual Categorization. Shen L / Huang QM. U of Chinese Academy of Sciences, CN. CVPR 13. [Paper]
- Hierarchical classification
-
Vantage Feature Frames for Fine-Grained Categorization. Sfar A / Geman D. INRIA Saclay, FR. CVPR 13. [Paper]
- Find points and orientations from which to distinguish fine-grained details (inspired by how experts approach the task)
-
Con-text: text detection using background connectivity for fine-grained object classification. Karaoglu S / Gevers T. U of Amsterdam, NL. ACM MM 13. [Paper]
- Text detection (foreground) by reconstructing the background using morphology, then subtracting it
-
Discovering localized attributes for fine-grained recognition. Duan K / Grauman K. Indiana U, US. CVPR 12. [Paper]
- Detection of human interpretable attributes
-
Unsupervised Template Learning for Fine-Grained Object Recognition. Yang SL / Shapiro L. U of Washington, US. NIPS 12. [Paper]
- Detect templates and use them to align images
-
A codebook-free and annotation-free approach for fine-grained image categorization. Yao BP / Fei-Fei L. Stanford U, US. CVPR 12. [Paper]
- Template-based similarity matching between random templates
-
Combining randomization and discrimination for fine-grained image categorization. Yao BP / Fei-Fei L. Stanford U, US. CVPR 11. [Paper]
- Random forest + discriminative trees
-
Fisher Vectors for Fine-Grained Visual Categorization. Sanchez J / Akata Z. Xerox. FGVC Workshop in CVPR 11. [Paper]
- Fisher vectors
-
SkinCon: A skin disease dataset densely annotated by domain experts for fine-grained model debugging and analysis [Paper]
-
GrainSpace: A Large-scale Dataset for Fine-grained and Domain-adaptive Recognition of Cereal Grains [Paper]
-
FAIR1M: A Benchmark Dataset for Fine-grained Object Recognition in High-Resolution Remote Sensing Imagery [Paper]
-
Yoga-82: A New Dataset for Fine-grained Classification of Human Poses [Paper]
-
Building a bird recognition app and large scale dataset with citizen scientists: The fine print in fine-grained dataset collection [Paper]
-
A large-scale car dataset for fine-grained categorization and verification [Paper]
-
Birdsnap: Large-Scale Fine-Grained Visual Categorization of Birds [Paper]
-
3D Object Representations for Fine-Grained Categorization [Paper]
-
Fine-Grained Visual Classification of Aircraft [Paper]
-
Novel Dataset for Fine-Grained Image Categorization: Stanford Dogs [Paper]
Thanks to Awesome-Crowd-Counting for the template.