vision-tutorial
is a tutorial for anyone studying basic computer vision architectures
using PyTorch and Keras. Most of the vision models are implemented in fewer than 100 lines of code (excluding comments and blank lines). The list of papers follows the recommendations of Professor Sung Kim.
The data is used only to overfit each model so that learning can be shown with a simple setup: a single image of a cat or a dog.
The accuracy of the model is not important in this project because it depends on the data. I recommend that you **focus on the structure of the model, the number of parameters, the learning process, and the detailed implementation of each paper.**
How to handle images in PyTorch and Keras
- Image Resizing, Cropping
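The arithmetic behind resizing and center cropping can be sketched without any framework. This is a minimal, library-agnostic illustration, not the tutorial's own code: the function names `resize_shorter_side` and `center_crop_box` are hypothetical, and the sizes 256 and 224 are assumed values chosen only for the example. The crop box uses the `(left, upper, right, lower)` order that Pillow's `Image.crop` expects.

```python
def resize_shorter_side(width, height, target):
    """Return (new_width, new_height) so that the shorter side equals
    `target` while the aspect ratio is preserved (hypothetical helper)."""
    scale = target / min(width, height)
    return round(width * scale), round(height * scale)


def center_crop_box(width, height, crop_w, crop_h):
    """Return (left, upper, right, lower) for a crop centered in the image."""
    if crop_w > width or crop_h > height:
        raise ValueError("crop is larger than the image")
    left = (width - crop_w) // 2
    upper = (height - crop_h) // 2
    return left, upper, left + crop_w, upper + crop_h


# Assumed example: a 500x375 image, resized so the shorter side is 256,
# then center-cropped to 224x224.
new_w, new_h = resize_shorter_side(500, 375, 256)
box = center_crop_box(new_w, new_h, 224, 224)
print((new_w, new_h), box)
```

In PyTorch the same two steps are commonly expressed with `torchvision.transforms.Resize` and `torchvision.transforms.CenterCrop`; the sketch above only shows the size computation those steps imply.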
Introduction to CNNs (Convolutional Neural Networks) in PyTorch and Keras
How do the number of channels, filter size (= kernel size), grid, and padding affect convolution?
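One way to explore this question is to compute the output size and parameter count of a single 2D convolution by hand. The sketch below is an illustration under stated assumptions: the helper names are hypothetical, the kernel is square, dilation is 1, and `stride` is a parameter added for the example that the heading above does not name. The output-size formula is the standard one, `floor((in + 2 * padding - kernel) / stride) + 1`.

```python
def conv2d_output_size(in_size, kernel_size, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension
    (assumes dilation 1; hypothetical helper)."""
    return (in_size + 2 * padding - kernel_size) // stride + 1


def conv2d_param_count(in_channels, out_channels, kernel_size, bias=True):
    """Number of learnable parameters of a square-kernel 2D convolution."""
    weights = out_channels * in_channels * kernel_size * kernel_size
    return weights + (out_channels if bias else 0)


# A 3x3 kernel with padding 1 and stride 1 keeps a 224-pixel side unchanged.
same_size = conv2d_output_size(224, kernel_size=3, stride=1, padding=1)

# A larger kernel with a larger stride shrinks the feature map quickly.
shrunk = conv2d_output_size(224, kernel_size=11, stride=4, padding=2)

# Parameters depend on channels and kernel size, not on the input resolution.
params = conv2d_param_count(in_channels=3, out_channels=64, kernel_size=3)

print(same_size, shrunk, params)
```

Padding and stride change only the spatial size of the output, while the channel counts and kernel size determine how many parameters the layer has.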
AlexNet(2012.09)
ZFNet(2013.11)
VGG16(2014.09)
Inception.v1(a.k.a GoogLeNet)(2014.09)
- Paper : Going Deeper with Convolutions
Inception.v2, v3(2015.12)
ResNet(2015.12)
- Paper : Deep Residual Learning for Image Recognition
- Model
Inception.v4(2016.02)
DenseNet(2016.08)
- Paper : Densely Connected Convolutional Networks
- Model
Xception(2016.10)
MobileNet(2017.04)
SENet(2017.09)
- Paper : Squeeze-and-Excitation Networks
- FCN(2014.11) : Fully Convolutional Networks for Semantic Segmentation
- U-Net(2015.05) : U-Net: Convolutional Networks for Biomedical Image Segmentation
- SegNet(2015.11) : SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation
- DeepLab(2016.06) : DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs
- ENet(2016.07) : ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation
- PSPNet(2016.12) : Pyramid Scene Parsing Network
- ICNet(2017.04) : ICNet for Real-Time Semantic Segmentation on High-Resolution Images
- GAN(2014.06) : Generative Adversarial Networks
- DCGAN(2015.11) : Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks
- Pix2Pix(2016.11) : Image-to-Image Translation with Conditional Adversarial Networks
- WGAN(2017.01) : Wasserstein GAN
- CycleGAN(2017.05) : Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks
- RCNN(2013.11) : Rich feature hierarchies for accurate object detection and semantic segmentation
- Fast-RCNN(2015.04) : Fast R-CNN
- Faster-RCNN(2015.06) : Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
- YOLO(2015.06) : You Only Look Once: Unified, Real-Time Object Detection
- SSD(2015.12) : SSD: Single Shot MultiBox Detector
- YOLO9000(2016.12) : YOLO9000: Better, Faster, Stronger
- Mask R-CNN(2017.05) : Mask R-CNN
- RetinaNet(2017.08) : Focal Loss for Dense Object Detection
- Tae Hwan Jung (Jeff Jung) @graykode
- Author Email : nlkey2022@gmail.com