<img src="https://1.bp.blogspot.com/-5pULKUERnIc/Wpc7qPnUCuI/AAAAAAAACao/4YOtEQb_1gEweHRf8-drmi7KBEa1BmBTgCLcBGAs/s1600/image2.png">

# About the Competition🚩
<p style="font-size:15px">Welcome to the fourth Landmark Recognition competition! This year, we introduce a lot more diversity in the challenge’s test images in order to measure global landmark recognition performance in a fairer manner. And following last year’s success, we set this up as a code competition.
<br></p>
<p>
Have you ever gone through your vacation photos and asked yourself: What is the name of this temple I visited in China? Who created this monument I saw in France? Landmark recognition can help! This technology can predict landmark labels directly from image pixels, to help people better understand and organize their photo collections. This competition challenges Kagglers to build models that recognize the correct landmark (if any) in a dataset of challenging test images.<br></p>
<p>
Many Kagglers are familiar with image classification challenges like the ImageNet Large Scale Visual Recognition Challenge (ILSVRC), which aims to recognize 1K general object categories. Landmark recognition is a little different: it involves a much larger number of classes (more than 81K in this challenge), and the number of training examples per class may be small. Landmark recognition is challenging in its own way.
<br>
</p>

# Data Description
<div style="font-size:15px">
 The dataset has two folders, train and test, which contain images in JPG format, and two CSV files:
<ul>
    <li><code>train:</code> contains the training images in JPG format
</li>
    <li><code>test:</code> contains the test images in JPG format</li>
    <li><code>train.csv:</code> labels of train images</li>
    <li><code>sample_submission.csv:</code> a sample submission file in the correct format
</li>
</ul>    
</div>

# Evaluation
Submissions are evaluated using Global Average Precision (GAP) at $k$, where $k=1$. This metric is also known as micro Average Precision ($\mu AP$), as per references 1 and 2 below. It works as follows:

For each test image, you will predict one landmark label and a corresponding confidence score. The evaluation treats each prediction as an individual data point in a long list of predictions, sorted in descending order by confidence scores, and computes the Average Precision based on this list.
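
The ranking procedure described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the official scoring code: the function name `global_average_precision` and the convention of using `None` for "no landmark" or "no prediction" are assumptions made for this example. It sums the precision at every rank where the prediction is correct and divides by the number of test images that actually contain a landmark.

```python
def global_average_precision(y_true, y_pred, confidences):
    """GAP@1 sketch: one (label, confidence) prediction per image.

    y_true: true landmark id per image, or None if the image shows no landmark.
    y_pred: predicted landmark id per image, or None if no prediction is made.
    confidences: confidence score of each prediction.
    """
    # Only images that really contain a landmark count in the denominator.
    num_landmark_images = sum(t is not None for t in y_true)
    if num_landmark_images == 0:
        return 0.0

    # Keep the images with a prediction and sort them by descending confidence.
    predictions = [(c, p == t)
                   for t, p, c in zip(y_true, y_pred, confidences)
                   if p is not None]
    predictions.sort(key=lambda item: item[0], reverse=True)

    correct = 0
    total = 0.0
    for rank, (_, is_correct) in enumerate(predictions, start=1):
        if is_correct:
            correct += 1
            total += correct / rank  # precision at this rank
    return total / num_landmark_images


# Toy example: the third image has no landmark, so any prediction on it is wrong.
gap = global_average_precision(
    y_true=[1, 2, None, 3],
    y_pred=[1, 5, 7, 3],
    confidences=[0.9, 0.8, 0.7, 0.6],
)
print(gap)  # 0.5
```

Because all predictions share one ranked list, a confident wrong prediction on a non-landmark image lowers the precision of every correct prediction ranked below it.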

References:

1) F. Perronnin, Y. Liu, and J.-M. Renders, "A Family of Contextual Measures of Similarity between Distributions with Application to Image Retrieval," Proc. CVPR'09

2) T. Weyand, A. Araujo, B. Cao and J. Sim, "Google Landmarks Dataset v2 - A Large-Scale Benchmark for Instance-Level Recognition and Retrieval," Proc. CVPR'20

In [None]:
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
import seaborn as sns
import plotly.express as px
import tensorflow as tf
import keras
import keras.layers as L
import math
import cv2
from keras.utils import Sequence
from keras.preprocessing import image
from random import shuffle
from sklearn.model_selection import train_test_split

In [None]:
train_labels = pd.read_csv('../input/landmark-recognition-2021/train.csv')
sample_submission = pd.read_csv('../input/landmark-recognition-2021/sample_submission.csv')
counts = train_labels.landmark_id.value_counts()
counts = counts[counts >= 50].index  # keep only classes with at least 50 samples
train_labels = train_labels.loc[train_labels.landmark_id.isin(counts)].copy()  # copy so new columns can be added safely
num_classes = counts.shape[0] 
print(num_classes)

In [None]:
def id2path(idx, is_train=True):
    # Images are sharded into sub-folders named after the first three characters of the id.
    root = '../input/landmark-recognition-2021'
    split = 'train' if is_train else 'test'
    return f'{root}/{split}/{idx[0]}/{idx[1]}/{idx[2]}/{idx}.jpg'
train_labels['file_path'] = train_labels['id'].apply(id2path)
# Pass is_train by keyword: the second positional argument of Series.apply is not forwarded to id2path.
sample_submission['file_path'] = sample_submission['id'].apply(id2path, is_train=False)

In [None]:
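def read_image(path):
    image = cv2.imread(path)  # OpenCV loads images in BGR channel order
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)  # convert to RGB for matplotlib and the ImageNet weights
    image = cv2.resize(image, (256, 256))
    return image / 255.  # scale pixel values to [0, 1]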

In [None]:
def plot_images(landmark_id=27):  # plot up to 25 images of one landmark_id
    landmark = train_labels[train_labels['landmark_id']==landmark_id].head(25)
    imgs = [read_image(x) for x in landmark['file_path']]
    _, axs = plt.subplots(5,5, figsize=(12, 12))
    axs = axs.flatten()
    for i, (img, ax) in enumerate(zip(imgs, axs)):
        ax.title.set_text(str(landmark['id'].iloc[i]))
        ax.imshow(img)
        ax.axis('off')
    plt.show()

In [None]:
plot_images()

In [None]:
plot_images(136)

In [None]:
plot_images(139)

In [None]:
plot_images(203071)

In [None]:
class Dataset(Sequence):
    def __init__(self,idx,y=None,batch_size=32,shuffle=True):
        # Store paths and labels as NumPy arrays so they can be sliced and permuted together.
        self.idx = np.asarray(idx)
        self.y = None if y is None else np.asarray(y)
        self.batch_size = batch_size
        self.shuffle = shuffle
        self.is_train = y is not None
    def __len__(self):
        return math.ceil(len(self.idx)/self.batch_size)
    def __getitem__(self,ids):
        batch = slice(ids * self.batch_size, (ids + 1) * self.batch_size)
        batch_X = np.stack([read_image(x) for x in self.idx[batch]])
        if self.is_train:
            return batch_X, self.y[batch]
        return batch_X

    def on_epoch_end(self):
        # Apply the same permutation to paths and labels so they stay aligned.
        if self.shuffle and self.is_train:
            order = np.random.permutation(len(self.idx))
            self.idx, self.y = self.idx[order], self.y[order]

In [None]:
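train_idx = train_labels['file_path'].values
# Map raw landmark ids to contiguous class indices 0..num_classes-1, as the sparse cross-entropy loss expects.
landmark_to_index = {landmark_id: i for i, landmark_id in enumerate(counts)}
y = train_labels['landmark_id'].map(landmark_to_index).values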
test_idx = sample_submission['file_path'].values

In [None]:
x_train,x_valid,y_train,y_valid = train_test_split(train_idx,y,test_size=0.05,random_state=42)

In [None]:
train_dataset = Dataset(x_train,y_train)
valid_dataset = Dataset(x_valid,y_valid)
test_dataset = Dataset(test_idx)

In [None]:
!pip install ../input/keras-efficientnet-whl/Keras_Applications-1.0.8-py3-none-any.whl
!pip install ../input/keras-efficientnet-whl/efficientnet-1.1.1-py3-none-any.whl

In [None]:
import efficientnet.keras as efn

In [None]:
model = tf.keras.Sequential([
        efn.EfficientNetB0(include_top=False,input_shape=(256,256,3),
                           weights='../input/efficientnet-keras-weights-b0b5/efficientnet-b0_imagenet_1000_notop.h5'),
        L.GlobalAveragePooling2D(),
        L.Dense(32,activation='relu'),
        # softmax yields one probability distribution over classes, matching SparseCategoricalCrossentropy
        L.Dense(num_classes, activation='softmax')])
model.summary()
model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.001),
              loss=keras.losses.SparseCategoricalCrossentropy(), metrics=[keras.metrics.SparseCategoricalAccuracy()])

In [None]:
#model.fit(train_dataset,epochs=1,validation_data=valid_dataset)

In [None]:
#preds = model.predict(test_dataset)
#preds = preds.reshape(-1)
sample_submission = pd.read_csv('../input/landmark-recognition-2021/sample_submission.csv')
sample_submission.to_csv('submission.csv',index=False)
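
Once the model is trained, its per-class probabilities have to be turned into one label and one confidence per test image, as the Evaluation section requires. The sketch below shows one way to do that in plain Python. The helper `format_predictions`, the `index_to_landmark` mapping (class index back to the original landmark id), and the `"landmark_id confidence"` string layout are assumptions for illustration; check them against `sample_submission.csv` before relying on them.

```python
def format_predictions(image_ids, probabilities, index_to_landmark):
    """Return (image_id, "landmark_id confidence") pairs, one per image.

    probabilities: one row of class probabilities per image
                   (nested lists or a 2-D NumPy array both work).
    index_to_landmark: hypothetical mapping from class index to landmark id.
    """
    rows = []
    for image_id, probs in zip(image_ids, probabilities):
        # Index of the most probable class for this image.
        best = max(range(len(probs)), key=lambda i: probs[i])
        confidence = float(probs[best])
        rows.append((image_id, f"{index_to_landmark[best]} {confidence:.6f}"))
    return rows


# Toy example with two images and two classes (landmark ids 27 and 136).
rows = format_predictions(
    image_ids=["abc", "def"],
    probabilities=[[0.1, 0.9], [0.7, 0.3]],
    index_to_landmark={0: 27, 1: 136},
)
print(rows)  # [('abc', '136 0.900000'), ('def', '27 0.700000')]
```

The second element of each pair could then be written to the prediction column of the submission frame in place of the sample values.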

<h2><center>If you found this notebook useful please upvote</center></h2>

<h2><center>Work in Progress ... ⏳</center></h2>