Deep Image Retrieval

Image Retrieval using Deep Feature

My Experiments

Notes

My personal view is that methods such as multi-scale input images, multiple backbone networks, and query expansion (QE) are not practical and are just tricks to boost performance. In these experiments, I avoid those methods as much as possible and try to reach SOTA in a way that stays as close to the basics as possible.
These results show only the best score for each evaluation set (Oxford5k, Paris6k, Holidays) among the models saved during training. That is, a row is not necessarily the result of a single model, although one model may of course yield the best result on every evaluation set.
In case of npair loss, normalization is not performed in the last layer. The reason was not learned.
Experiments use TensorFlow (tf) and PyTorch (pt).
The GeM pooling exponent p was fixed at 3. In the future, I plan to keep tuning it around 3. > hint
PyTorch GeM [1-10][2-2]: code for reproducing finetuned-gem, with some code modified.
[2] shows results using finetuned-gem's trained model.
[2-3] is based on this paper.
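
The GeM exponent mentioned in the notes can be illustrated with a small sketch. This is not the repository's code: it is a minimal pure-Python version of generalized-mean pooling, assuming a non-negative C x H x W activation map given as nested lists. The names `gem_pool`, `l2_normalize`, and `fmap` are hypothetical. With p = 1 the pooling reduces to average pooling (spoc), and as p grows it approaches max pooling (mac), which is why p is worth tuning.

```python
import math

def gem_pool(feature_map, p=3.0, eps=1e-6):
    """Generalized-mean (GeM) pooling over each channel of a C x H x W map.

    feature_map: list of channels, each a list of rows of non-negative floats.
    Returns one pooled value per channel.
    """
    pooled = []
    for channel in feature_map:
        # Flatten the spatial grid and clamp to eps so tiny or zero
        # activations do not collapse the power mean.
        values = [max(v, eps) for row in channel for v in row]
        mean_of_powers = sum(v ** p for v in values) / len(values)
        pooled.append(mean_of_powers ** (1.0 / p))
    return pooled

def l2_normalize(vec):
    """Scale a vector to unit L2 norm (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)

# Hypothetical 2-channel, 2x2 activation map (illustration only).
fmap = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.5, 0.5], [0.5, 0.5]],
]
spoc = gem_pool(fmap, p=1.0)       # p = 1 is average pooling
gem3 = gem_pool(fmap, p=3.0)       # p = 3, the value fixed in these notes
near_mac = gem_pool(fmap, p=50.0)  # large p approaches max pooling
descriptor = l2_normalize(gem3)    # unit-norm global descriptor
```

For the first channel the pooled value rises from the average toward the maximum as p increases, while a constant channel is unaffected by p.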

update : 2020-04-09 (Currently in progress)

NO	net	feat	Holidays	Paris6k	Oxf5k	dim	loss	trainset	pre-trained	lib
[1-1]	alexnet	fc6	0.789		0.557	128	cls	nc	imgnet
[1-2]	res152	gem	0.9026	0.8927	0.7808	1024	npairs	nc	imgnet	tf
[1-3]	res152	gem:single	0.9001	0.8927	0.7507	1024	npairs	nc	imgnet	tf
[1-4]	res152	mac	0.8983	0.8779	0.7732	1024	npairs	nc	imgnet	tf
[1-5]	res152	mac:single	0.8983	0.8702	0.7636	1024	npairs	nc	imgnet	tf
[1-6]	res152	spoc	0.8845	0.8322	0.7184	1024	npairs	nc	imgnet	tf
[1-7]	res152	spoc:single	0.8813	0.8306	0.7184	1024	npairs	nc	imgnet	tf
[1-8]	res101	r-mac	0.8527	0.9104	0.8018	2048	triplet	nc	imgnet	pt
[1-9]	res152	r-mac	0.8468	0.935	0.808	2048	triplet	nc	imgnet	pt
[1-10]	res101	gem		0.8487	0.7339	2048	contrastive	SfM	imgnet	pt
[2]	res101	gem		0.829	0.782	2048	contrastive	SfM	imgnet	pt
[2-3]	res101	gem		0.9323	0.9185	512	arcface	GDV1	imgnet	pt

update : 2020-04-09 (Currently in progress)

NO	net	feat	rox_e	rox_m	rox_h	rpa_e	rpa_m	rpa_h	dim	loss	trainset	pre-trained	lib
[2]	res101	gem	0.7389	0.539	0.247	0.8467	0.659	0.388	2048	contrastive	SfM	imgnet	pt
[2-1]	res101	r-mac	0.6058	0.4156	0.1421	0.828	0.6759	0.4418	2048	triplet	nc	imgnet	pt
[2-2]	res101	gem	0.706	0.495	0.19	0.849	0.6757	0.4183	2048	contrastive	SfM	imgnet	pt
[2-3]	res101	gem	0.8512	0.7097	0.4665	0.9157	0.8061	0.6319	512	arcface	GDV1	imgnet	pt

Reference for [1-1]: Neural Codes for Image Retrieval : [paper][review]
nc: Neural Codes clean dataset
SfM: retrieval-SfM-120k
GDV1: Google Landmarks V1
GDV2: Google Landmarks V2
rox: revisitop_oxford
rpa: revisitop_rparis
e: easy, m: medium, h: hard
triplet: triplet loss
imgnet: ImageNet
tf: TensorFlow
pt: PyTorch
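
The scores in the tables above are assumed here to be mean average precision (mAP), the usual metric on these benchmarks; the README does not state this explicitly. The following is a minimal sketch of the plain, non-interpolated form of AP and mAP. It omits the "junk" image handling of the official Oxford/Paris protocols, so it is not a drop-in replacement for the official evaluation code. All function names and the toy data are hypothetical.

```python
def average_precision(ranked_ids, relevant_ids):
    """Plain average precision for one query.

    ranked_ids: database image ids sorted from most to least similar.
    relevant_ids: ids that are correct matches for the query.
    """
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, image_id in enumerate(ranked_ids, start=1):
        if image_id in relevant:
            hits += 1
            # Precision measured at the rank of each relevant hit.
            precision_sum += hits / rank
    return precision_sum / len(relevant)

def mean_average_precision(rankings, ground_truth):
    """Mean of the per-query AP values over all queries."""
    aps = [average_precision(rankings[q], ground_truth[q]) for q in rankings]
    return sum(aps) / len(aps)

# Toy example with two queries (illustration only).
rankings = {
    "q1": ["a", "b", "c", "d"],
    "q2": ["x", "y", "z"],
}
ground_truth = {
    "q1": ["a", "c"],  # hits at ranks 1 and 3
    "q2": ["z"],       # single hit at rank 3
}
ap_q1 = average_precision(rankings["q1"], ground_truth["q1"])
ap_q2 = average_precision(rankings["q2"], ground_truth["q2"])
map_score = mean_average_precision(rankings, ground_truth)
```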

Instance benchmark dataset

NO	Title	Category	link	classes	query	all	Notes
1	Oxford5k	landmark	link	16	55	5,062
2	Paris6k	landmark	link	11	55	6,412
3	Holidays	landmark	link	500	500	1,491
4	Google-Landmarks_V1	landmark	link	12,894	100,000	1,060,709	textbysearch
4	Google-Landmarks_V2	landmark	link	203,094	117,577	5,012,248	textbysearch
5	UKBench	landmark	link	2,550	10,200	10,200
6	FlickrLogos-32	logo	link	32	500	8,240
7	FlickrLogos-47	logo	link	47	?	?
8	INSTRE	Instance	link	200	N/A	28,543
9	ZuBuD	landmark	link	200	115	1,005
10	SMVS	covers	link	1,200	3,300	1,200
11	DupImage	Instance	link	33	108	1,104
12	Neural Codes	landmark	link	672		213,678	textbysearch

Evaluation methods

Learnable (fine-tuning using targeted datasets)

Results obtained with QE are excluded.
GD : Global Descriptor
LD : Local Descriptor
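
As a rough illustration of how a global descriptor (GD) is used at query time, the sketch below ranks database images by cosine similarity between one vector per image. It is an assumption-laden toy, not code from any of the listed papers: the descriptors are made-up 3-dimensional vectors and the names `rank_by_cosine`, `l2_normalize`, and `database` are hypothetical. Local descriptors (LD) need a separate matching step and are not covered here.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit L2 norm (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)

def rank_by_cosine(query, database):
    """Return database image ids sorted by cosine similarity to the query.

    query: global descriptor of the query image.
    database: dict mapping image id to its global descriptor.
    """
    q = l2_normalize(query)
    scores = {}
    for image_id, desc in database.items():
        d = l2_normalize(desc)
        # Dot product of unit vectors equals cosine similarity.
        scores[image_id] = sum(a * b for a, b in zip(q, d))
    return sorted(scores, key=scores.get, reverse=True)

# Made-up 3-D descriptors (real ones have e.g. 512 or 2048 dimensions).
database = {
    "img_a": [1.0, 0.0, 0.0],
    "img_b": [0.9, 0.1, 0.0],
    "img_c": [0.0, 1.0, 0.0],
}
ranking = rank_by_cosine([2.0, 0.0, 0.0], database)
```

Because both sides are L2-normalized, the query's scale does not matter and only its direction determines the ranking.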

Paper	Oxf5k	Par6k	Oxf105k	Par106k	Holidays	descriptor	Notes
SOTA	86.1	94.5	82.8	90.6	90.3/94.8
[1]	86.1	94.5	82.8	90.6	90.3/94.8	GD	DIR, triplet, R-MAC
[2]	83.8	85.0	82.6	81.7		LD	DELF, softmax
[3]	79.7	83.8	73.9	76.4	82.5	GD	siamese, R-MAC

[1] End-to-end Learning of Deep Visual Representations for Image Retrieval : [paper][review]
[2] Large-Scale Image Retrieval with Attentive Deep Local Features : [paper][review]
- DELF is a case that reached SOTA through its combination with QE+DIR.
[3] CNN Image Retrieval Learns from BoW: Unsupervised Fine-Tuning with Hard Examples

Not Learnable (only trained on ImageNet)

Paper	Oxf5k	Par6k	Oxf105k	Par106k	Holidays	Sculp6k	UKB	descriptor	Notes
SOTA	71.2	80.5	67.2	73.3
[1]	71.2	80.5	67.2	73.3				GD	CAM
[2]	53.3	67.0	48.9		71.6	37.7	84.2	GD	MAC (first paper), max pooling + L1 distance

[1] Class-Weighted Convolutional Features for Visual Instance Search
[2] Visual Instance Retrieval with Deep Convolutional Networks


chullhwan-song/OLD-Deep-Image-Retrieval
