
Deep Learning - Visual Representation Learning by solving Jigsaw puzzles using Deep Reinforcement Learning


harpribot/harpreif


harpreif

Visual Representation Learning by solving Jigsaw puzzles using Deep Reinforcement Learning

Contents

Dataset

[Figure: dataset]

We take 240 object categories and randomly choose 80 images from each. These are split 50/10/20 into training/validation/test sets respectively. To test transfer learning, we take 30 images from the remaining 16 object categories and use them as the transfer test set.
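The per-category sampling and split described above can be sketched as follows. This is a minimal illustration, not the repository's code: the function name `split_category`, the `seed` parameter, and the assumption that the 50/10/20 split is applied within each category are all hypothetical.

```python
import random

def split_category(image_paths, seed=0, per_category=80, sizes=(50, 10, 20)):
    """Sample `per_category` images from one category and split them.

    `sizes` is (train, validation, test); the defaults follow the README.
    """
    if sum(sizes) != per_category:
        raise ValueError("split sizes must add up to per_category")
    if len(image_paths) < per_category:
        raise ValueError("not enough images in this category")
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    chosen = rng.sample(list(image_paths), per_category)
    n_train, n_val, _ = sizes
    return {
        "train": chosen[:n_train],
        "val": chosen[n_train:n_train + n_val],
        "test": chosen[n_train + n_val:],
    }
```

Calling this once per object category and concatenating the results would give the three sets; the transfer test set would be sampled separately from the held-out categories.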

Input Construction

To construct the input, a windowed HOG (histogram of oriented gradients) descriptor across 8 directions is computed for the image and then discretized. This gives the state representation, as shown below:

[Figure: input state representation]
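As a rough illustration of this kind of input construction, the sketch below computes a magnitude-weighted histogram of gradient directions per window and quantizes it. The window size, the number of quantization levels, the use of signed directions over a full circle, and the peak-normalized quantization are all assumptions made for the example; the README does not specify them.

```python
import math

def windowed_hog(image, window=4, bins=8, levels=4):
    """image: 2D list of grayscale values.

    Returns a grid of per-window histograms, each with `bins` integer
    entries in the range [0, levels - 1].
    """
    h, w = len(image), len(image[0])
    rows, cols = h // window, w // window
    hist = [[[0.0] * bins for _ in range(cols)] for _ in range(rows)]
    for y in range(rows * window):
        for x in range(cols * window):
            # Central differences, clamped at the image border.
            gx = image[y][min(x + 1, w - 1)] - image[y][max(x - 1, 0)]
            gy = image[min(y + 1, h - 1)][x] - image[max(y - 1, 0)][x]
            mag = math.hypot(gx, gy)
            if mag == 0.0:
                continue
            # Map the gradient direction onto one of `bins` directions.
            angle = math.atan2(gy, gx) % (2 * math.pi)
            b = int(angle / (2 * math.pi) * bins) % bins
            hist[y // window][x // window][b] += mag
    # Discretize: scale each window by its peak bin, then quantize.
    state = []
    for row in hist:
        out_row = []
        for cell in row:
            peak = max(cell)
            out_row.append([
                0 if peak == 0 else min(levels - 1, int(v / peak * levels))
                for v in cell
            ])
        state.append(out_row)
    return state
```

For an image whose intensity increases only from left to right, every window puts all of its weight in the first direction bin.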

Deep Q Network

The Deep Q-Network (DQN) serves as the evaluation (action-value) function for reinforcement learning. The network is shown below:

[Figure: Deep Q-Network architecture]
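For readers unfamiliar with how a Q-network acts as an evaluation function, the sketch below shows the standard Q-learning quantities: greedy action selection and the one-step temporal-difference target. It is a generic illustration, not the repository's training code; the function names and the discount factor `gamma=0.99` are assumptions, and the Q-values are plain lists standing in for network outputs.

```python
def greedy_action(q_values):
    """Index of the action with the highest estimated value."""
    return max(range(len(q_values)), key=lambda a: q_values[a])

def td_target(reward, next_q_values, gamma=0.99, done=False):
    """One-step Q-learning target: r + gamma * max_a' Q(s', a').

    On a terminal transition there is no next state to bootstrap from,
    so the target is the reward alone.
    """
    if done:
        return reward
    return reward + gamma * max(next_q_values)
```

In a DQN, the network would be trained to bring its prediction for the chosen action closer to this target.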

Experimental Results

Test Images

t-SNE plots of the image features (penultimate-layer activations, FC3 layer) for the test images are shown across training iterations. The results show that the RL agent learns features that form clusters as training proceeds.

[Figure: t-SNE plot, test images]

20 neighbors

[Figure: plot for test images, 20 neighbors]

100 neighbors

[Figure: plot for test images, 100 neighbors]

Transfer Learning Test Images

t-SNE plots of the image features (penultimate-layer activations, FC3 layer) for the transfer test images are shown across training iterations. The results show that the RL agent learns features that form clusters as training proceeds. These images were not used for training, so the result demonstrates transfer learning.

[Figure: t-SNE plot, transfer test images]

20 neighbors

[Figure: plot for transfer test images, 20 neighbors]
