matankleiner/Identify-Known-Sites-in-Photo-Album

Identify Known Sites in Photo Album

Introduction:

This is a university project based on the Google Landmark Recognition 2020 Kaggle competition.

The goal of this project is to correctly classify images of known sites from around the world, given a large and challenging train set to learn from and a test set that contains mainly out-of-domain images.

Given these challenging properties of the data set, we proposed and implemented two possible solutions using machine learning techniques.

The first solution, a baseline, is a simple, straightforward approach: training a CNN (EfficientNet with the RAdam optimizer) and using it as a classifier. This solution fails to overcome the challenging aspects of the data set and yields poor results.

The second solution is a retrieval-based solution that draws inspiration from other teams' solutions to this competition.

This solution consists of two steps. The first is to clean the test set of out-of-domain images using object detection (we used the YOLO Darknet implementation). Object detection examples:

[Images: two object detection examples]
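The cleaning step can be sketched as follows. This is a minimal illustration, not the project's code: it assumes the detector output has already been parsed into (label, confidence) pairs per image. The label set `LANDMARK_LABELS`, the threshold `MIN_CONFIDENCE` and the function `keep_in_domain` are hypothetical names chosen for this example.

```python
# Hypothetical label set: detector classes treated as evidence of a landmark.
LANDMARK_LABELS = {"Building", "Tower", "Castle", "Sculpture"}
MIN_CONFIDENCE = 0.5  # assumed threshold, not taken from the project


def keep_in_domain(detections, labels=LANDMARK_LABELS, min_conf=MIN_CONFIDENCE):
    """Return the ids of images with at least one confident landmark-like detection.

    `detections` maps an image id to a list of (label, confidence) pairs,
    i.e. detector output that was already parsed (an assumption of this sketch).
    """
    kept = []
    for image_id, boxes in detections.items():
        if any(label in labels and conf >= min_conf for label, conf in boxes):
            kept.append(image_id)
    return kept


example = {
    "img_001": [("Tower", 0.91), ("Person", 0.40)],
    "img_002": [("Person", 0.88), ("Dog", 0.75)],  # no landmark-like object
    "img_003": [("Building", 0.30)],               # below the threshold
}
in_domain = keep_in_domain(example)
print(in_domain)
```

Only the images returned by `keep_in_domain` would be passed on to the classification step; the rest are treated as out of domain.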

The second step is classification using the nearest neighbor algorithm on the images' feature vectors.

The power of using feature vectors and K-NN (the test set image is on the left; next to it are its 5 nearest neighbors from the train set. A small red X in the lower right corner of an image means that its class differs from that of the test set image):
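A minimal sketch of nearest-neighbor classification over feature vectors is shown here. It is an illustration under stated assumptions rather than the project's implementation: cosine similarity as the distance measure and a majority vote among the neighbors are choices made for this example, and `knn_classify` and the toy labels are hypothetical.

```python
from collections import Counter

import numpy as np


def knn_classify(train_features, train_labels, query, k=5):
    """Predict a label for `query` by majority vote over its k nearest train vectors."""
    train = np.asarray(train_features, dtype=float)
    q = np.asarray(query, dtype=float)
    # L2-normalize so that the dot product equals cosine similarity (assumed metric).
    train = train / np.linalg.norm(train, axis=1, keepdims=True)
    q = q / np.linalg.norm(q)
    similarities = train @ q
    # Indices of the k most similar train vectors, most similar first.
    nearest = np.argsort(-similarities)[:k]
    votes = Counter(train_labels[i] for i in nearest)
    label, _ = votes.most_common(1)[0]
    return label, nearest.tolist()


# Toy 2-D "feature vectors"; real ones would come from a feature extractor.
train_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9], [0.8, 0.2]]
train_labels = ["landmark_a", "landmark_a", "landmark_b", "landmark_b", "landmark_a"]
label, neighbors = knn_classify(train_features, train_labels, [1.0, 0.05], k=3)
print(label, neighbors)
```

Returning the neighbor indices along with the label makes it possible to display the retrieved train images next to the test image.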

[Images: three examples of a test set image with its 5 nearest neighbors]

Even when the classification is not successful, the nearest neighbors still bear some resemblance to the test set image:

[Image: an unsuccessful classification and its nearest neighbors]

This solution is built to address the challenging features of the data set, and although the results it yields are far from great, they are much better than the baseline's results.

Prerequisites

To run all the code in this project, one needs the following libraries (in the specified version or higher):

Library Version
Python 3.6
torch 1.8.0
torchvision 0.9.0
pandas 1.25.0
numpy 1.19.0
opencv 4.2.0
matplotlib 3.2.1
seaborn 0.11.0
efficientnet_pytorch 0.7.0
torch_optimizer 0.1.0
sklearn 0.21.3
Pillow 6.1.0
tqdm 4.55.0
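The check below is a small sketch for verifying an environment against the table above. It is not part of the project. It assumes Python 3.8 or higher (for `importlib.metadata`), and the dictionary keys are assumed pip distribution names (for example `opencv-python` and `scikit-learn`), which may differ from how the packages were actually installed. The helper names are hypothetical.

```python
from importlib import metadata  # standard library since Python 3.8

# Minimum versions as listed in the table; keys are assumed pip distribution names.
MINIMUM_VERSIONS = {
    "torch": "1.8.0",
    "torchvision": "0.9.0",
    "pandas": "1.25.0",
    "numpy": "1.19.0",
    "opencv-python": "4.2.0",
    "matplotlib": "3.2.1",
    "seaborn": "0.11.0",
    "efficientnet_pytorch": "0.7.0",
    "torch_optimizer": "0.1.0",
    "scikit-learn": "0.21.3",
    "Pillow": "6.1.0",
    "tqdm": "4.55.0",
}


def parse_version(text):
    """Turn '1.8.0+cu111' into (1, 8, 0); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, required):
    """Compare two version strings numerically, padding the shorter with zeros."""
    a, b = parse_version(installed), parse_version(required)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_requirements(minimums=MINIMUM_VERSIONS):
    """Return {name: problem} for every package that is missing or too old."""
    problems = {}
    for name, required in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems[name] = "not installed"
            continue
        if not meets_minimum(installed, required):
            problems[name] = f"found {installed}, need >= {required}"
    return problems
```

Calling `check_requirements()` returns an empty dictionary when every listed package is present in a sufficient version. The version comparison is deliberately simple and ignores pre-release suffixes.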

In this project we also used the YOLO Darknet implementation as an object detector. We used the version 3 and version 4 networks, pre-trained on the Open Images Dataset and the COCO Dataset, respectively.

Much of the code in this project is in Jupyter notebooks. Unfortunately, GitHub does not render all of the notebooks successfully, so one can download them and run them locally or via Colab, or view them with nbviewer using the links in the nbviewer directory.

Code and Repository Organization

The code we wrote for this project is organized in sub-directories, one for each part of the project. Each sub-directory contains the relevant code files (.py or .ipynb) and may contain CSV files or images.

Sub-Directory Content
\baseline directory containing implementation of the baseline, results and evaluation
\data directory containing GLDv2 dataset analysis
\feature_extraction directory containing implementation of feature extraction and K-NN classifier
\images directory containing images used in this repository
\landmark_classifier directory containing pre-processing of the data as input to the YOLO Darknet implementation, and analysis and evaluation of its results
\nbviewer directory containing nbviewer links for the Jupyter notebooks in this repository
\poster directory containing the project poster
\results_and_evaluation directory containing the classification results and evaluation

We tried to keep the code organized and well documented.

Team:

Matan Kleiner

Yuval Snir

Supervised by Ori Linial

References

[1] T. Weyand, A. Araujo, B. Cao and J. Sim. "Google Landmarks Dataset v2 - A Large-Scale Benchmark for Instance-Level Recognition and Retrieval", Proc. CVPR'20.

[2] K. Chen et al. "2nd Place and 2nd Place Solution to Kaggle Landmark Recognition and Retrieval Competition 2019", arXiv:1906.03990 [cs.CV], Jun. 2019.

[3] J. Redmon and A. Farhadi. "YOLOv3: An Incremental Improvement", arXiv:1804.02767v1 [cs.CV], Apr. 2018.

[4] A. Krizhevsky, I. Sutskever, and G. E. Hinton. "ImageNet classification with deep convolutional neural networks", In Proceedings of NIPS, pages 1106–1114, 2012.