Not Suitable for Work (NSFW) image classification. Classification using the VGG16 CNN for large-scale image recognition implemented with Keras, TensorFlow.js, and PyTorch 👀

Salty Wet Man 💙

Not Suitable for Work (NSFW) image classification using Keras, TensorFlow.js, and PyTorch. Warning: this repo contains abstract nudity and may be unsuitable for the workplace.



Table_of_Contents 💙

Introduction 💙

  • Defining NSFW material is subjective, and identifying such images is a non-trivial task

  • Salty-Wet-Man treats identification as a binary classification problem:

    • [SFW] positive class: neutral images that are safe for work

    • [NSFW] negative class: pornographic images with sexually explicit content

Convolutional_Neural_Networks 💙

Image Datasets

  • In theory, a CNN suits image data well because of its large learning capacity and controllable complexity
  • Stationarity of statistics
  • Locality of pixel dependencies

NSFW Images

  • Static images
  • Uncontrolled backgrounds
  • Multiple people and partial figures
  • Different camera angles

GPU Implementation

  • Heavy computation required - the size of the CNN is limited by the GPU memory available
  • Highly optimized implementation of 2D convolutions
  • Solution: spread the network over multiple GPUs and process in parallel

Object_Recognition 💙

Deep Learning's Impact on Computer Vision

deep learning impact

Labeled Image-Training Datasets

  • Small image datasets (order of tens of thousands of images) - e.g. MNIST digit recognition, where the best error rates are very low
  • Large image datasets (hundreds of thousands to millions of images) - e.g. ImageNet

ImageNet used for Large Scale Object Recognition

  • Dataset over 15 million labeled images
  • Variable-resolution images, typically down-sampled to a fixed size (e.g. 256x256) before training
  • Training, validation, and testing images
  • Benchmark - ImageNet Large-Scale Visual Recognition Challenge (ILSVRC)

NSFW_Object_Recognition:_Content-Based_Retrieval_via_Localization 💙

Image Location with Large Areas of Skin-colored Regions

  • Skin region properties - image, color, and texture

  • Input RGB values (skin pixels) are converted to a log-opponent representation

    • L(x) = 105 * log10(x + 1 + n), where n is a small random noise term
    • I = L(G)
    • Rg = L(R) - L(G)
    • By = L(B) - (L(G) + L(R))/2
  • The intensity image is smoothed with a median filter, and the result is subtracted from the original image to measure texture

  • Query By Image Content (QBIC)

    • Abstraction of an image to search for colored, textured regions

    • Uses image decomposition, pattern matching, and clustering algorithms

    • Find a set of images similar to a query image

      Image retrieval algorithm
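The log-opponent formulas listed above can be sketched in a few lines of Python. This is only an illustration and not the repo's code: the function names are hypothetical, and drawing n uniformly from [0, 1) per channel is an assumption about how the noise term is sampled.

```python
import math
import random


def log_transform(x, noise=0.0):
    """L(x) = 105 * log10(x + 1 + n) for a channel value x in [0, 255]."""
    return 105.0 * math.log10(x + 1.0 + noise)


def log_opponent(r, g, b, rng=None):
    """Map one RGB pixel to (I, Rg, By).

    rng is an optional random.Random. When given, n is drawn uniformly
    from [0, 1) for each channel (an assumption); otherwise n = 0.
    """
    def sample_noise():
        return rng.random() if rng is not None else 0.0

    lr = log_transform(r, sample_noise())
    lg = log_transform(g, sample_noise())
    lb = log_transform(b, sample_noise())
    intensity = lg                 # I  = L(G)
    rg = lr - lg                   # Rg = L(R) - L(G)
    by = lb - (lg + lr) / 2.0      # By = L(B) - (L(G) + L(R)) / 2
    return intensity, rg, by


# A grey pixel has no colour opponency, so Rg and By are both zero.
print(log_opponent(9, 9, 9))
```

With the noise switched off, any grey pixel (R = G = B) gives Rg = By = 0, which is a quick sanity check on the signs in the formulas.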

Elongated Regions Grouping

  • Group 2D and 3D constraints on body/skin regions
  • Model the human body as cylindrical parts arranged within a skeleton geometry
  • Identify region outline

Classify Regions into Human Limbs

  • Geometric grouping algorithms - match a view against a collection of images of an object

  • Hypothesize that the object is present and estimate its appearance via a feature vector computed from the compressed image

  • Minimum-distance classifier to match feature vectors
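A minimum-distance classifier can be sketched as follows. This is a generic nearest-class-mean version using Euclidean distance, and it is an assumption that this matches the variant meant here. The labels and two-dimensional feature vectors are made up for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def class_means(samples):
    """samples maps label -> list of feature vectors; returns label -> mean vector."""
    means = {}
    for label, vectors in samples.items():
        dim = len(vectors[0])
        means[label] = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
    return means


def classify(vector, means):
    """Return the label whose mean vector is closest to the given vector."""
    return min(means, key=lambda label: euclidean(vector, means[label]))


# Hypothetical training data: (elongation, width) style features.
training = {
    "limb": [[4.0, 1.0], [5.0, 1.2]],
    "background": [[1.0, 1.0], [1.2, 0.8]],
}
means = class_means(training)
print(classify([4.4, 1.1], means))  # limb
```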


NSFW_Object_Recognition:_Detection,_and_Segmentation 💙

  • Object Image Segmentation

    • Group together skin pixels
    • Normalized cut
  • Label each pixel of the input image with a category

    • For every pixel - check whether the pixel is [skin] or [not-skin]
  • If at least 30% of the image area is skin, the image is identified as passing the skin filter

  • Training data for this is very expensive - every pixel of every image needs a label
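The 30% rule above reduces to counting labeled pixels. A minimal sketch, assuming the per-pixel skin/not-skin decision has already been made and is supplied as a 2D boolean mask (the function names and the mask format are hypothetical):

```python
def skin_fraction(mask):
    """Fraction of pixels marked as skin in a 2D mask (rows of booleans)."""
    total = sum(len(row) for row in mask)
    if total == 0:
        return 0.0
    skin = sum(1 for row in mask for is_skin in row if is_skin)
    return skin / total


def passes_skin_filter(mask, threshold=0.30):
    """True when at least `threshold` of the image area is skin."""
    return skin_fraction(mask) >= threshold


# 4 of 10 pixels are skin, so this toy image passes the filter.
mask = [
    [True, True, False, False, False],
    [True, True, False, False, False],
]
print(skin_fraction(mask), passes_skin_filter(mask))
```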


NSFW_Object_Recognition_Image_Cropping 💙

Neural_Network_Classifier_VGG16_Model 💙

  • VGG16 is a CNN for large-scale image recognition
  • Model achieves 92.7% top-5 test accuracy on ImageNet
  • Implemented with Keras and a TensorFlow backend in this project

Image of VGG16 architecture

VGG16 Architecture

  • Fixed input of 224 x 224 RGB image
  • Three fully-connected (FC) layers
    • 4096, 4096, and 1000 channels respectively
  • Max pooling layers
  • Hidden layers use ReLU (rectified linear unit) activations
  • Final layer is a softmax layer
  • 16 weight layers in total (13 convolutional and 3 fully connected)
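To make the layer count concrete, the sketch below walks the standard VGG16 layout (3x3 convolutions, 2x2 max pooling, then the three FC layers) and tallies weight layers and parameters. It is plain arithmetic and not the Keras implementation; the constant and function names are hypothetical.

```python
# Numbers are conv output channels; "M" marks a 2x2 max-pooling layer.
VGG16_CONV = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
              512, 512, 512, "M", 512, 512, 512, "M"]
FC_SIZES = [4096, 4096, 1000]


def count_vgg16(input_size=224, in_channels=3):
    """Return (number of weight layers, number of parameters)."""
    params = 0
    weight_layers = 0
    size, channels = input_size, in_channels
    for item in VGG16_CONV:
        if item == "M":
            size //= 2  # pooling halves the spatial resolution
        else:
            params += 3 * 3 * channels * item + item  # 3x3 kernels plus biases
            channels = item
            weight_layers += 1
    features = size * size * channels  # flattened 7 x 7 x 512 feature map
    for width in FC_SIZES:
        params += features * width + width
        features = width
        weight_layers += 1
    return weight_layers, params


layers, params = count_vgg16()
print(layers, params)  # 16 138357544
print(round(params * 4 / 2**20), "MiB as float32")
```

Stored as 32-bit floats, the weights alone come to roughly 528 MiB, which is in the same range as the model size quoted in this README.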

VGG16 Disadvantages

  • Slow to train - can take weeks
  • Large network architecture in terms of disk space and bandwidth, at over 533MB
  • Consider the VGG19 variant as an alternative classifier

VGG16 Keras Implementation

keras.applications.vgg16.VGG16(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)

Full Keras VGG Code

Neural_Network_Errors_and_Overfitting 💙

Data Augmentation

  • Label-preserving transformations

  • RGB channel intensities

    • Add a perturbation derived from the covariance matrix of RGB pixel values to each image pixel
    • Object identity is invariant to changes in the intensity/colour of images
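Two label-preserving transformations are sketched below: a horizontal flip and a random per-channel intensity shift. The shift is a simplified stand-in for the covariance-based RGB perturbation mentioned above, not that method itself. The image format (rows of (r, g, b) tuples), the function names, and the max_shift parameter are all assumptions for illustration.

```python
import random


def horizontal_flip(image):
    """Mirror an image given as rows of (r, g, b) tuples."""
    return [list(reversed(row)) for row in image]


def shift_channels(image, offsets):
    """Add one offset per RGB channel and clamp the result to [0, 255]."""
    def clamp(value):
        return max(0, min(255, int(round(value))))

    return [[tuple(clamp(c + o) for c, o in zip(pixel, offsets)) for pixel in row]
            for row in image]


def augment(image, rng, max_shift=10):
    """Randomly flip the image, then shift each channel by a random offset."""
    flipped = horizontal_flip(image) if rng.random() < 0.5 else [list(r) for r in image]
    offsets = [rng.uniform(-max_shift, max_shift) for _ in range(3)]
    return shift_channels(flipped, offsets)


image = [[(250, 0, 100), (10, 20, 30)]]
print(augment(image, random.Random(0)))
```

Neither operation changes what the image depicts, so the SFW/NSFW label of the original carries over to the augmented copy.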

Dropout Rates

  • ReLU neurons
  • Dropout is used for first two fully-connected (FC) layers (4096 and 4096)
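Dropout on a layer's activations can be sketched as below. This is the common "inverted dropout" formulation, which rescales the kept units during training so that nothing changes at inference time. Whether the project uses this exact variant is an assumption, and the function name and the 0.5 rate are illustrative only.

```python
import random


def dropout(activations, rate, rng, training=True):
    """Zero each activation with probability `rate` and rescale the rest.

    At inference time (training=False) the activations pass through unchanged.
    """
    if not training or rate == 0.0:
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(42)
hidden = [0.5, 1.0, 2.0, 0.0, 3.0]
print(dropout(hidden, rate=0.5, rng=rng))
print(dropout(hidden, rate=0.5, rng=rng, training=False))
```

Because a different random subset of units is silenced on each training step, the FC layers cannot rely on any single unit, which is what reduces overfitting.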

Technical_Installations 💙

Requires heavy computation

  1. Install Python dependencies and packages (Keras, TensorFlow, and TensorFlow.js) - best to run from virtualenv

  2. Download and convert the VGG16 model to TensorFlow.js format

  3. Launch Node.js script to load converted model and compute maximally-activating input images for convnet's filters using gradient ascent in the input space. Save image files under dist/filters directory

  4. Launch Node.js script to calculate internal convolutional layers' activations and gradient-based Class Activation Map (CAM). Save image files under dist/activation directory

  5. Compile. Launch web view at

Technical_Visualizations 💙

yarn visualize

Increase the number of filters to visualize per convolutional layer from the default of 8 to a larger value (e.g. 18):

yarn visualize --gpu --filters 18

The default image used for internal-activation and CAM visualization is "nsfw.jpg". Switch to another image with the "--image" flag, e.g. "--image waifu-pic.jpeg" 👀

yarn visualize --image waifu-pic.jpeg

References 💙
