Reading list on deep learning.
- AlexNet: Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "Imagenet classification with deep convolutional neural networks." Advances in neural information processing systems. 2012. ⭐⭐⭐⭐⭐
- Dropout: Srivastava, Nitish, et al. "Dropout: a simple way to prevent neural networks from overfitting." Journal of Machine Learning Research 15.1 (2014): 1929-1958. ⭐⭐⭐⭐
- VGG: Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional networks for large-scale image recognition." arXiv preprint arXiv:1409.1556 (2014). ⭐⭐⭐⭐⭐
- GoogLeNet: Szegedy, Christian, et al. "Going deeper with convolutions." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015. ⭐⭐⭐⭐⭐
- Batch Normalization: Ioffe, Sergey, and Christian Szegedy. "Batch normalization: Accelerating deep network training by reducing internal covariate shift." arXiv preprint arXiv:1502.03167 (2015). [Inception v2] ⭐⭐⭐⭐⭐
- PReLU & MSRA Initialization: He, Kaiming, et al. "Delving deep into rectifiers: Surpassing human-level performance on imagenet classification." Proceedings of the IEEE international conference on computer vision. 2015. ⭐⭐⭐⭐⭐
- InceptionV3: Szegedy, Christian, et al. "Rethinking the inception architecture for computer vision." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016. ⭐⭐⭐⭐
- ResNet: He, Kaiming, et al. "Deep residual learning for image recognition." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016. ⭐⭐⭐⭐⭐
- Identity ResNet: He, Kaiming, et al. "Identity mappings in deep residual networks." European Conference on Computer Vision. Springer International Publishing, 2016. ⭐⭐⭐⭐⭐
- CReLU: Shang, Wenling, et al. "Understanding and improving convolutional neural networks via concatenated rectified linear units." Proceedings of the International Conference on Machine Learning (ICML). 2016. ⭐⭐⭐
- InceptionV4 & Inception-ResNet: Szegedy, Christian, et al. "Inception-v4, inception-resnet and the impact of residual connections on learning." arXiv preprint arXiv:1602.07261 (2016). ⭐⭐⭐⭐
- ResNeXt: Xie, Saining, et al. "Aggregated residual transformations for deep neural networks." arXiv preprint arXiv:1611.05431 (2016). ⭐⭐⭐⭐
- Batch Renormalization: Ioffe, Sergey. "Batch Renormalization: Towards Reducing Minibatch Dependence in Batch-Normalized Models." arXiv preprint arXiv:1702.03275 (2017). ⭐⭐⭐⭐
- Xception: Chollet, François. "Xception: Deep Learning with Depthwise Separable Convolutions." arXiv preprint arXiv:1610.02357 (2016). ⭐⭐⭐
- MobileNets: Howard, Andrew G., et al. "Mobilenets: Efficient convolutional neural networks for mobile vision applications." arXiv preprint arXiv:1704.04861 (2017). ⭐⭐⭐
- DenseNet: Huang, Gao, et al. "Densely connected convolutional networks." arXiv preprint arXiv:1608.06993 (2016). ⭐⭐⭐⭐⭐
- PolyNet: Zhang, Xingcheng, et al. "Polynet: A pursuit of structural diversity in very deep networks." arXiv preprint arXiv:1611.05725 (2016). Slides ⭐⭐⭐⭐
- IRNN: Le, Quoc V., Navdeep Jaitly, and Geoffrey E. Hinton. "A simple way to initialize recurrent networks of rectified linear units." arXiv preprint arXiv:1504.00941 (2015). ⭐⭐⭐
- Overfeat: Sermanet, Pierre, et al. "Overfeat: Integrated recognition, localization and detection using convolutional networks." arXiv preprint arXiv:1312.6229 (2013). ⭐⭐⭐⭐
- RCNN: Girshick, Ross, et al. "Rich feature hierarchies for accurate object detection and semantic segmentation." Proceedings of the IEEE conference on computer vision and pattern recognition. 2014. ⭐⭐⭐⭐⭐
- SPP: He, Kaiming, et al. "Spatial pyramid pooling in deep convolutional networks for visual recognition." European Conference on Computer Vision. Springer International Publishing, 2014. ⭐⭐⭐⭐⭐
- Fast RCNN: Girshick, Ross. "Fast r-cnn." Proceedings of the IEEE International Conference on Computer Vision. 2015. ⭐⭐⭐⭐⭐
- Faster RCNN: Ren, Shaoqing, et al. "Faster r-cnn: Towards real-time object detection with region proposal networks." Advances in neural information processing systems. 2015. ⭐⭐⭐⭐⭐
- R-CNN minus R: Lenc, Karel, and Andrea Vedaldi. "R-cnn minus r." arXiv preprint arXiv:1506.06981 (2015). ⭐
- End-to-end people detection in crowded scenes: Stewart, Russell, Mykhaylo Andriluka, and Andrew Y. Ng. "End-to-end people detection in crowded scenes." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016. ⭐⭐
- YOLO: Redmon, Joseph, et al. "You only look once: Unified, real-time object detection." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016. ⭐⭐⭐⭐⭐
- ION: Bell, Sean, et al. "Inside-outside net: Detecting objects in context with skip pooling and recurrent neural networks." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016. ⭐⭐⭐⭐
- MultiPath: Zagoruyko, Sergey, et al. "A multipath network for object detection." arXiv preprint arXiv:1604.02135 (2016). ⭐⭐⭐
- SSD: Liu, Wei, et al. "SSD: Single shot multibox detector." European Conference on Computer Vision. Springer International Publishing, 2016. ⭐⭐⭐⭐⭐
- OHEM: Shrivastava, Abhinav, Abhinav Gupta, and Ross Girshick. "Training region-based object detectors with online hard example mining." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016. ⭐⭐⭐⭐⭐
- HyperNet: Kong, Tao, et al. "HyperNet: towards accurate region proposal generation and joint object detection." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016. ⭐⭐⭐⭐
- SDP: Yang, Fan, Wongun Choi, and Yuanqing Lin. "Exploit all the layers: Fast and accurate cnn object detector with scale dependent pooling and cascaded rejection classifiers." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016. ⭐⭐⭐⭐
- SubCNN: Xiang, Yu, et al. "Subcategory-aware convolutional neural networks for object proposals and detection." Applications of Computer Vision (WACV), 2017 IEEE Winter Conference on. IEEE, 2017. ⭐⭐⭐
- MSCNN: Cai, Zhaowei, et al. "A unified multi-scale deep convolutional neural network for fast object detection." European Conference on Computer Vision. Springer International Publishing, 2016. ⭐⭐⭐⭐
- RFCN: Dai, Jifeng, et al. "R-fcn: Object detection via region-based fully convolutional networks." Advances in Neural Information Processing Systems. 2016. ⭐⭐⭐⭐⭐
- Shallow Network: Ashraf, Khalid, et al. "Shallow networks for high-accuracy road object-detection." arXiv preprint arXiv:1606.01561 (2016). ⭐⭐
- Is Faster R-CNN Doing Well for Pedestrian Detection: Zhang, Liliang, et al. "Is Faster R-CNN Doing Well for Pedestrian Detection?." European Conference on Computer Vision. Springer International Publishing, 2016. ⭐⭐
- GCNN: Najibi, Mahyar, Mohammad Rastegari, and Larry S. Davis. "G-cnn: an iterative grid based object detector." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016. ⭐⭐⭐
- LocNet: Gidaris, Spyros, and Nikos Komodakis. "Locnet: Improving localization accuracy for object detection." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016. ⭐⭐⭐
- PVANet: Kim, Kye-Hyeon, et al. "PVANET: Deep but Lightweight Neural Networks for Real-time Object Detection." arXiv preprint arXiv:1608.08021 (2016). ⭐⭐⭐⭐
- FPN: Lin, Tsung-Yi, et al. "Feature Pyramid Networks for Object Detection." arXiv preprint arXiv:1612.03144 (2016). ⭐⭐⭐⭐⭐
- TDM: Shrivastava, Abhinav, et al. "Beyond Skip Connections: Top-Down Modulation for Object Detection." arXiv preprint arXiv:1612.06851 (2016). ⭐⭐⭐⭐
- YOLO9000: Redmon, Joseph, and Ali Farhadi. "YOLO9000: Better, Faster, Stronger." arXiv preprint arXiv:1612.08242 (2016). ⭐⭐⭐⭐
- Speed/accuracy trade-offs for modern convolutional object detectors: Huang, Jonathan, et al. "Speed/accuracy trade-offs for modern convolutional object detectors." arXiv preprint arXiv:1611.10012 (2016). ⭐⭐
- GBD-Net: Zeng, Xingyu, et al. "Crafting GBD-Net for Object Detection." arXiv preprint arXiv:1610.02579 (2016). Slides ⭐⭐⭐⭐
- WRInception: Lee, Youngwan, et al. "Wide-Residual-Inception Networks for Real-time Object Detection." arXiv preprint arXiv:1702.01243 (2017). ⭐
- DSSD: Fu, Cheng-Yang, et al. "DSSD: Deconvolutional Single Shot Detector." arXiv preprint arXiv:1701.06659 (2017). ⭐⭐⭐⭐
- A-Fast-RCNN (Hard positive generation): Wang, Xiaolong, Abhinav Shrivastava, and Abhinav Gupta. "A-fast-rcnn: Hard positive generation via adversary for object detection." arXiv preprint arXiv:1704.03414 (2017). ⭐⭐⭐ code
- RRC: Ren, Jimmy, et al. "Accurate Single Stage Detector Using Recurrent Rolling Convolution." arXiv preprint arXiv:1704.05776 (2017). ⭐⭐⭐
- Deformable ConvNets: Dai, Jifeng, et al. "Deformable Convolutional Networks." arXiv preprint arXiv:1703.06211 (2017). ⭐⭐⭐⭐
- RSSD: Jeong, Jisoo, Hyojin Park, and Nojun Kwak. "Enhancement of SSD by concatenating feature maps for object detection." arXiv preprint arXiv:1705.09587 (2017). ⭐⭐
- Perceptual GAN: Li, Jianan, et al. "Perceptual Generative Adversarial Networks for Small Object Detection." arXiv preprint arXiv:1706.05274 (2017). ⭐⭐⭐
- FCN: Long, Jonathan, Evan Shelhamer, and Trevor Darrell. "Fully convolutional networks for semantic segmentation." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015. ⭐⭐⭐⭐⭐
- Deconvolution Network for Segmentation: Noh, Hyeonwoo, Seunghoon Hong, and Bohyung Han. "Learning deconvolution network for semantic segmentation." Proceedings of the IEEE International Conference on Computer Vision. 2015. ⭐⭐⭐
- MNC: Dai, Jifeng, Kaiming He, and Jian Sun. "Instance-aware semantic segmentation via multi-task network cascades." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016. ⭐⭐⭐⭐⭐
- InstanceFCN: Dai, Jifeng, et al. "Instance-sensitive fully convolutional networks." arXiv preprint arXiv:1603.08678 (2016). ⭐⭐⭐⭐
- FCIS: Li, Yi, et al. "Fully convolutional instance-aware semantic segmentation." arXiv preprint arXiv:1611.07709 (2016). ⭐⭐⭐⭐⭐
- PSPNet: Zhao, Hengshuang, et al. "Pyramid scene parsing network." arXiv preprint arXiv:1612.01105 (2016). ⭐⭐⭐
- Mask R-CNN: He, Kaiming, et al. "Mask r-cnn." arXiv preprint arXiv:1703.06870 (2017). ⭐⭐⭐⭐⭐
- Weakly Supervised Object Localization with Multi-fold Multiple Instance Learning: Cinbis, Ramazan Gokberk, Jakob Verbeek, and Cordelia Schmid. "Weakly supervised object localization with multi-fold multiple instance learning." IEEE transactions on pattern analysis and machine intelligence 39.1 (2017): 189-203. ⭐⭐⭐
- Weakly Supervised Deep Detection Networks: Bilen, Hakan, and Andrea Vedaldi. "Weakly supervised deep detection networks." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016. ⭐⭐⭐⭐
- Weakly- and Semi-Supervised Learning: Papandreou, George, et al. "Weakly-and semi-supervised learning of a deep convolutional network for semantic image segmentation." Proceedings of the IEEE International Conference on Computer Vision. 2015. ⭐⭐⭐⭐
- Image-level to pixel-level labeling: Pinheiro, Pedro O., and Ronan Collobert. "From image-level to pixel-level labeling with convolutional networks." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015.
- Weakly Supervised Localization using Deep Feature Maps: Bency, Archith J., et al. "Weakly supervised localization using deep feature maps." arXiv preprint arXiv:1603.00489 (2016).
- WELDON: Durand, Thibaut, Nicolas Thome, and Matthieu Cord. "Weldon: Weakly supervised learning of deep convolutional neural networks." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016.
- WILDCAT: Durand, Thibaut, et al. "WILDCAT: Weakly Supervised Learning of Deep ConvNets for Image Classification, Pointwise Localization and Segmentation." The IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2017.
- SGDL: Lai, Baisheng, and Xiaojin Gong. "Saliency guided dictionary learning for weakly-supervised image parsing." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016.
- Learning Features by Watching Objects Move: Pathak, Deepak, et al. "Learning Features by Watching Objects Move." arXiv preprint arXiv:1612.06370 (2016). ⭐⭐⭐⭐⭐
- SimGAN: Shrivastava, Ashish, et al. "Learning from simulated and unsupervised images through adversarial training." arXiv preprint arXiv:1612.07828 (2016). ⭐⭐⭐
- OPN: Lee, Hsin-Ying, et al. "Unsupervised Representation Learning by Sorting Sequences." arXiv preprint arXiv:1708.01246 (2017). ⭐⭐⭐
- Transitive Invariance for Self-supervised Visual Representation Learning: Wang, Xiaolong, et al. "Transitive Invariance for Self-supervised Visual Representation Learning." Proceedings of the IEEE International Conference on Computer Vision. 2017. ⭐⭐⭐ code
- DHSNet: Liu, Nian, and Junwei Han. "Dhsnet: Deep hierarchical saliency network for salient object detection." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016. ⭐⭐⭐⭐
- RFCN: Wang, Linzhao, et al. "Saliency detection with recurrent fully convolutional networks." European Conference on Computer Vision. Springer International Publishing, 2016. ⭐⭐⭐⭐
- RACDNN: Kuen, Jason, Zhenhua Wang, and Gang Wang. "Recurrent attentional networks for saliency detection." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016. ⭐⭐⭐⭐
- NLDF: Luo, Zhiming, et al. "Non-Local Deep Features for Salient Object Detection." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017. ⭐⭐⭐
- DSS: Hou, Qibin, et al. "Deeply supervised salient object detection with short connections." arXiv preprint arXiv:1611.04849 (2016). ⭐⭐⭐⭐
- MSRNet: Li, Guanbin, et al. "Instance-Level Salient Object Segmentation." arXiv preprint arXiv:1704.03604 (2017). ⭐⭐⭐⭐
- Amulet: Zhang, Pingping, et al. "Amulet: Aggregating Multi-level Convolutional Features for Salient Object Detection." arXiv preprint arXiv:1708.02001 (2017). ⭐⭐⭐⭐
- UCF: Zhang, Pingping, et al. "Learning Uncertain Convolutional Features for Accurate Saliency Detection." arXiv preprint arXiv:1708.02031 (2017). ⭐⭐⭐⭐
- SRM: Wang, Tiantian, et al. "A Stagewise Refinement Model for Detecting Salient Objects in Images." Proceedings of the IEEE International Conference on Computer Vision. 2017. ⭐⭐⭐⭐
- SRN: Zhu, Feng, et al. "Learning Spatial Regularization with Image-level Supervisions for Multi-label Image Classification." arXiv preprint arXiv:1702.05891 (2017). ⭐⭐⭐⭐
- Zoom-in-Net: Wang, Zhe, et al. "Zoom-in-Net: Deep Mining Lesions for Diabetic Retinopathy Detection." arXiv preprint arXiv:1706.04372 (2017). ⭐⭐⭐⭐
- Multi-context attention: Chu, Xiao, et al. "Multi-context attention for human pose estimation." arXiv preprint arXiv:1702.07432 (2017). ⭐⭐⭐
- DeshadowNet: Qu, Liangqiong, et al. "DeshadowNet: A Multi-context Embedding Deep Network for Shadow Removal." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017. ⭐⭐⭐
- scGAN: Nguyen, Vu, et al. "Shadow Detection with Conditional Generative Adversarial Networks." Proceedings of the IEEE International Conference on Computer Vision. 2017. ⭐⭐
- Fast Shadow Detection from a Single Image Using a Patched Convolutional Neural Network: Hosseinzadeh, Sepideh, Moein Shakeri, and Hong Zhang. "Fast Shadow Detection from a Single Image Using a Patched Convolutional Neural Network." arXiv preprint arXiv:1709.09283 (2017). ⭐
- G-RMI: Google. (Object Detection) slides
- 2017 CVPR Tutorial: video and slides
- 1st ImageNet and COCO Visual Recognition Challenges Joint Workshop: 2015. link
- 2nd ImageNet and COCO Visual Recognition Challenges Joint Workshop: 2016. link