
# Vision and Cognitive Systems - Project


<a href="https://colab.research.google.com/github/GianmarcoLattaruolo/Vision_Project/blob/main/Vision_Notebook.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Preliminaries


## Setting up the working space

In this first cell we check whether the notebook is running in Colab. If it is, some additional work is needed to set up the environment properly, and we also need to mount our Drive. On a local machine, we instead need to add the GeoEstimation folder of the paper to the paths where Python searches for libraries.

In [4]:
# with this line we can check if we are in colab or not
import sys
in_colab = 'google.colab' in sys.modules
print("are we in Colab?:",in_colab)
if in_colab:
    !pip install -q condacolab
    import condacolab
    condacolab.install()
else:
    import os
    current_wd = os.getcwd()
    # os.path handles both Windows and POSIX path separators
    if os.path.basename(current_wd) == 'Vision_Project':
        os.chdir('GeoEstimation')
    sys.path.append(os.path.join(current_wd, 'GeoEstimation'))

are we in Colab?: False
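
The local branch only needs to know where the GeoEstimation folder sits relative to the working directory. Below is a minimal sketch of that lookup with `pathlib`, which handles Windows and POSIX separators alike. The helper name `geoestimation_dir` and the example paths are hypothetical, and the sketch assumes GeoEstimation is a direct subfolder of the project folder.

```python
from pathlib import PurePath, PurePosixPath, PureWindowsPath


def geoestimation_dir(cwd: PurePath) -> PurePath:
    """Return the GeoEstimation folder, assuming it sits directly inside cwd."""
    if cwd.name == "GeoEstimation":
        return cwd                    # already inside the folder
    return cwd / "GeoEstimation"      # e.g. started from Vision_Project


# Hypothetical locations, one per operating system
print(geoestimation_dir(PureWindowsPath(r"C:\work\Vision_Project")))
print(geoestimation_dir(PurePosixPath("/home/user/Vision_Project")))
```

In the notebook this could be used as `sys.path.append(str(geoestimation_dir(Path.cwd())))`, with `Path` imported from `pathlib`.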


In [5]:
# this cell takes a lot of time on colab!
import sys
in_colab = 'google.colab' in sys.modules
if in_colab:
    import condacolab
    condacolab.check()
    from google.colab import drive
    drive.mount('/content/drive')
    import os
    os.chdir(r'/content/drive/MyDrive/GeoEstimation')
    print(os.getcwd())
    !conda env update -n base -f environment.yml
    # Reinstalling torchtext looks clumsy, but it seems to work
    # (-y skips the confirmation prompt, which would otherwise block the cell)
    !pip uninstall -y torchtext
    !pip install torchtext==0.7

To match the original environment in which the paper's results were obtained, we should in principle install the following packages at these specific versions:
```
  - python=3.8
  - msgpack-python=1.0.0
  - pandas=1.1.5
  - yaml=0.2.5
  - tqdm=4.50
  - cudatoolkit=10.2
  - pytorch=1.6
  - torchvision=0.7
  - pytorch-lightning=1.0.1
  - pip
  - pip:
    - s2sphere==0.2.5
```
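
To see how far the current environment is from these pins, the installed versions can be compared with the expected ones. The sketch below uses only the standard library. The dictionary `PINNED` and the helper `check_versions` are hypothetical names, and the pip distribution names are an assumption (for example `torch` where conda says `pytorch`).

```python
from importlib import metadata

# Pins copied from the list above; pip names may differ from the conda names.
PINNED = {
    "pandas": "1.1.5",
    "tqdm": "4.50",
    "torch": "1.6",
    "torchvision": "0.7",
    "pytorch-lightning": "1.0.1",
    "s2sphere": "0.2.5",
}


def check_versions(pinned, lookup=metadata.version):
    """Return {package: (expected, installed)} for missing or mismatching packages."""
    mismatches = {}
    for name, expected in pinned.items():
        try:
            installed = lookup(name)
        except metadata.PackageNotFoundError:
            mismatches[name] = (expected, None)
            continue
        want = expected.split(".")
        # Drop local build tags such as "+cu102" and compare only the pinned components
        have = installed.split("+")[0].split(".")[: len(want)]
        if have != want:
            mismatches[name] = (expected, installed)
    return mismatches


print(check_versions(PINNED))
```

An empty dictionary means every pinned package is installed at a compatible version.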

## Cells to download the new dataset
Run these cells once, when you have time to download some images (set `num_photos` if needed).

In [None]:
import os
import urllib.request

# our default working directory is GeoEstimation
DATA_DIR = r'/content/drive/MyDrive/GeoEstimation/resources/images/new_data10k'

def download_image(url, file_path, file_name, size = 'z'):
    # swap the one-letter size code that precedes the extension (e.g. ..._m.jpg -> ..._z.jpg)
    url = url[:-5]+size+url[-4:]
    full_path = os.path.join(file_path, str(file_name) + '.jpg')
    try:
        urllib.request.urlretrieve(url, full_path)
        return 'ok'
    except Exception:
        print(f'the url {url} does not work')
        return ''


def download_from_dataframe(df, num_photos=250):
    # download at most num_photos images that are not yet in DATA_DIR;
    # the working directory is never changed, so nothing has to be restored
    existing = set(os.listdir(DATA_DIR))
    if len(existing)>10006:
        print('Dataset already downloaded')
        return
    count = 0
    # zip avoids label-based lookups, which break when the index is not 0..n-1
    for url, photo_id in zip(df['url'], df['photo_id']):
        if count >= num_photos:
            break
        photo_id = str(photo_id)
        # missing urls are read by pandas as float NaN, so we keep only strings
        if photo_id+'.jpg' not in existing and isinstance(url, str):
            if download_image(url, DATA_DIR, photo_id) == 'ok':
                count += 1
    return


In [None]:
import pandas as pd

new_data = pd.read_csv(r'/content/drive/MyDrive/GeoEstimation/resources/images/new_data10k/final_dataset.csv', sep = ';', index_col = 0)
new_data.head(2)
print('Previous number of Photos:',len(os.listdir(r'/content/drive/MyDrive/GeoEstimation/resources/images/new_data10k'))-1)
download_from_dataframe(new_data)
print('New number of Photos:',len(os.listdir(r'/content/drive/MyDrive/GeoEstimation/resources/images/new_data10k'))-1)
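
The `size` argument of `download_image` works by pure string slicing: the character right before the four-character extension is replaced. A small sketch of that step follows, assuming every URL ends with a one-letter size code followed by an extension such as `.jpg`. The helper name `with_size_suffix` and the example URL are hypothetical.

```python
def with_size_suffix(url: str, size: str = "z") -> str:
    """Replace the single character that precedes the 4-character extension."""
    return url[:-5] + size + url[-4:]


# Hypothetical URL, used only to show the string manipulation
example = "https://example.com/photos/12345_abcdef_m.jpg"
print(with_size_suffix(example))  # https://example.com/photos/12345_abcdef_z.jpg
```

URLs that do not follow this pattern would be silently corrupted, which is one reason failed downloads are reported rather than raised.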

## A transfer learning example: the strength of pytorch-lightning

Here we show, in a nutshell, the transfer learning approach from a pretrained model using both standard PyTorch code and pytorch-lightning, to highlight the differences. We load the same pretrained model (ResNet50) that the authors used as the backbone of their model. For the sake of simplicity, we retrain it on CIFAR10. First let's see the classic torch approach:

In [None]:
#you may need to install pytorch-lightning-bolts
#!pip install pytorch-lightning-bolts==

#libraries
from torchvision import models
from torchvision.datasets import CIFAR10
from torchvision import transforms
import torch
from torch.utils.data import DataLoader
from torch.nn.functional import softmax, cross_entropy
from torch.optim import Adam

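#download the pretrained model
backbone = models.resnet50(pretrained = True)
#the backbone stays frozen, so we switch it to eval mode (batch-norm statistics are not updated)
backbone.eval()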

#download and normalize the CIFAR10 dataset
normalize = transforms.Normalize(mean=[x/255.0 for x in [125.3, 123.0, 113.9]],
                                 std=[x/255.0 for x in [63.0, 62.1, 66.7]])
cf10_transforms = transforms.Compose([
    transforms.ToTensor(),
    normalize
])
cifar_10 = CIFAR10('.',train=True, download = True, transform=cf10_transforms) 


#prepare the batches
train_loader = DataLoader(cifar_10, batch_size=32, shuffle=True)

# We append a fully connected layer after the last layer to match our number of classes (=10).
# We treat the outputs of resnet as high level features (we could use them with any classifier instead of a FC)
finetune_layer = torch.nn.Linear(backbone.fc.out_features, 10) 
# To REPLACE the last layer instead: finetune_layer = torch.nn.Linear(backbone.fc.in_features, 10)

#define the optimizer
optimizer = Adam(finetune_layer.parameters(), lr = 1e-4)

#training
for epoch in range(10):
    for batch in train_loader:
        x, y = batch
        #we do not waste memory recording the gradient on the backbone
        with torch.no_grad():
            #(b, 3, 32, 32) -> (b, 1000)
            features = backbone(x)

        # (b, 1000) -> (b, 10)
        preds = finetune_layer(features)
        loss = cross_entropy(preds, y)

        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
        print(loss.item())
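
The cell above mentions two ways to attach the new classifier: appending a layer after the backbone output, or replacing the backbone's last layer. The sketch below contrasts them on a tiny stand-in network so that it runs without downloading ResNet50. `TinyBackbone` and its sizes are hypothetical; only the `fc` attribute name mirrors torchvision's ResNet.

```python
import torch
from torch import nn


class TinyBackbone(nn.Module):
    """Hypothetical stand-in for ResNet50: a feature extractor followed by `fc`."""

    def __init__(self, feat_dim=16, n_out=8):
        super().__init__()
        self.body = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, feat_dim), nn.ReLU())
        self.fc = nn.Linear(feat_dim, n_out)

    def forward(self, x):
        return self.fc(self.body(x))


x = torch.randn(4, 3, 32, 32)

# Option 1: append a new head after the frozen backbone output
stacked = TinyBackbone().eval()
head = nn.Linear(stacked.fc.out_features, 10)
with torch.no_grad():
    feats = stacked(x)            # (4, 8), no gradient recorded
logits_stacked = head(feats)      # (4, 10)

# Option 2: freeze everything, then replace the last layer
replaced = TinyBackbone()
for p in replaced.parameters():
    p.requires_grad = False
replaced.fc = nn.Linear(replaced.fc.in_features, 10)   # new layers are trainable by default
logits_replaced = replaced(x)     # (4, 10)

trainable = [n for n, p in replaced.named_parameters() if p.requires_grad]
print(trainable)
```

With the second option only the parameters of the new `fc` layer remain trainable, so an optimizer built on `replaced.parameters()` leaves the rest of the network untouched.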

        


In [5]:
import pytorch_lightning as pl
from pytorch_lightning.metrics.functional import accuracy

class ImageClassifier(pl.LightningModule):
    def __init__(self, num_classes=10 , lr = 1e-3):
        super().__init__()
        #this call saves us from defining an attribute for each hyperparameter --> self.hparams.<parameter>
        self.save_hyperparameters() #Pytorch-lightning trick!
        self.backbone = models.resnet50(pretrained = True)
        #use the module's own backbone instead of the global variable from the previous cell
        self.finetune_layer = torch.nn.Linear(self.backbone.fc.out_features, num_classes)

    def training_step(self, batch, batch_idx): #these methods are standard methods in LightningModule
        x, y = batch

        #we decide whether to freeze the backbone or not based on the current epoch
        if self.trainer.current_epoch < 10:
            with torch.no_grad():
                #(b, 3, 32, 32) -> (b, 1000)
                features = self.backbone(x)
        else:
            features = self.backbone(x)

        # (b, 1000) -> (b, 10)
        preds = self.finetune_layer(features)
        loss = cross_entropy(preds, y)
        #loss.backward(), optimizer.step() and optimizer.zero_grad() are no longer needed: Lightning calls them
        self.log('train_loss', loss) # we will see later this method of LightningModule
        self.log('train_acc', accuracy(preds, y)) #separate key, so the accuracy does not clash with the loss
        return loss
    
    def validation_step(self, batch, batch_idx):
        x, y = batch

        features = self.backbone(x)

        # (b, 1000) -> (b, 10)
        preds = self.finetune_layer(features)
        loss = cross_entropy(preds, y)
        #no backward pass or optimizer step happens during validation
        self.log('val_loss', loss) # we will see later this method of LightningModule
        self.log('val_acc', accuracy(preds, y)) #separate key, so the accuracy does not clash with the loss
        return loss

    def configure_optimizers(self):
        optimizer = Adam(self.parameters(), lr =self.hparams.lr) 
        #we can safely pass all the parameters since in the backbone we are not computing the gradient
        return optimizer
        

At this point we have a very handy object.

In [7]:
#from pl_bolts.datamodules import CIFAR10DataModule

#Bolts saves us from doing the train/val/test split and from building a separate torch DataLoader for each of them
#dm = CIFAR10DataModule('.') 

classifier = ImageClassifier()
logger = pl.loggers.TensorBoardLogger(name = 'pretrained model 1', save_dir = 'lightning_logs')
trainer = pl.Trainer(
    max_epochs = 2, # number of epochs (the default is 1000)
    progress_bar_refresh_rate = 20, 
    logger = logger,
    #gpus=1, 
    limit_train_batches = 50,
    #limit_val_batches = 2,
    #check_val_every_n_epoch = 5
    #fast_dev_run=True # add this for a fast check for bugs
    ) 
trainer.fit(classifier, train_loader)#dm) #we can use the normal train_loader we defined previously



GPU available: False, used: False
TPU available: False, using: 0 TPU cores

  | Name           | Type   | Params
------------------------------------------
0 | backbone       | ResNet | 25 M  
1 | finetune_layer | Linear | 10 K  


Epoch 1:  80%|████████  | 40/50 [00:26<00:06,  1.48it/s, loss=2.094, v_num=0]


1

We can use this very nice [tool](https://pytorch-lightning.readthedocs.io/en/stable/api/pytorch_lightning.loggers.tensorboard.html#module-pytorch_lightning.loggers.tensorboard) from PyTorch Lightning to inspect the logs.

In [None]:
# start tensorboard
%reload_ext tensorboard
%tensorboard --logdir lightning_logs/

### Self-supervised transfer learning with Lightning

Here we use the PyTorch Lightning implementation of SwAV, adapted from the official implementation (see the [paper](https://arxiv.org/abs/2006.09882)), whose authors used the same pretrained model (ResNet50 trained on ImageNet). We can simply import this model from Lightning Bolts and define a class for our classifier that is very similar to the previous one.

In [None]:
from pl_bolts.models.self_supervised import SwAV

#weight_path = 'https://pl-bolts-weights.s3.us-east-2.amazonaws.com/swav/bolts_swav_imagenet/swav_imagenet.ckpt'
weight_path = 'https://pl-bolts-weights.s3.us-east-2.amazonaws.com/swav/swav_imagenet/swav_imagenet.pth.tar'
swav = SwAV.load_from_checkpoint(weight_path, strict=False)

class SSLImageClassifier(pl.LightningModule):
    def __init__(self, num_classes=10 , lr = 1e-3):
        super().__init__()
        self.save_hyperparameters() 
        self.backbone = swav.model #model pretrained on ImageNet without labels
        self.finetune_layer = torch.nn.Linear(3000, num_classes)

    def training_step(self, batch, batch_idx): #these methods are standard methods in LightningModule
        x, y = batch

        #we decide whether to freeze the backbone or not based on the current epoch
        if self.trainer.current_epoch < 10:
            with torch.no_grad():
                #(b, 3, 32, 32) -> second output of shape (b, 3000)
                (f1, f2) = self.backbone(x)
                features = f2
        else:
            (f1, f2) = self.backbone(x)
            features = f2

        # (b, 3000) -> (b, 10)
        preds = self.finetune_layer(features)
        loss = cross_entropy(preds, y)
        #loss.backward(), optimizer.step() and optimizer.zero_grad() are no longer needed: Lightning calls them
        self.log('train_loss', loss) # we will see later this method of LightningModule
        self.log('train_acc', accuracy(preds, y)) #separate key, so the accuracy does not clash with the loss
        return loss

    def validation_step(self, batch, batch_idx): #these methods are standard methods in LightningModule
        x, y = batch
        
        (f1, f2) = self.backbone(x)
        features = f2

        # (b, 3000) -> (b, 10)
        preds = self.finetune_layer(features)
        loss = cross_entropy(preds, y)
        #no backward pass or optimizer step happens during validation
        self.log('val_loss', loss) # we will see later this method of LightningModule
        self.log('val_acc', accuracy(preds, y)) #separate key, so the accuracy does not clash with the loss
        return loss

    def configure_optimizers(self):
        optimizer = Adam(self.parameters(), lr =self.hparams.lr) 
        #we can safely pass all the parameters since in the backbone we are not computing the gradient
        return optimizer
        
ssl_classifier = SSLImageClassifier()
logger = pl.loggers.TensorBoardLogger(name = 'pretrained model 2 (self supervised)', save_dir = 'lightning_logs')
trainer = pl.Trainer(
    # the number of epochs can be set with max_epochs, as in the previous Trainer
    progress_bar_refresh_rate = 20, 
    logger = logger, # without this the logger defined above would not be used
    gpus=1, 
    limit_train_batches = 50#, 
    #fast_dev_run=True # add this for a fast check for bugs
    ) 
#fit the self-supervised classifier (not the previous one) on the train_loader defined earlier; dm is commented out above
trainer.fit(ssl_classifier, train_loader)

In [None]:
# start tensorboard
%reload_ext tensorboard
%tensorboard --logdir lightning_logs/

## Reproduce paper results


To begin we try to reproduce the paper results on their test set.

In [9]:
from pathlib import Path
from math import ceil

import pandas as pd
import torch
import pytorch_lightning as pl

from classification.train_base import MultiPartitioningClassifier # class defining our model
from classification.dataset import FiveCropImageDataset # class for preparing the images before giving them to the NN

## Load the model

In [10]:
# where model's params and hyperparams are saved
checkpoint = "models/base_M/epoch=014-val_loss=18.4833.ckpt"
hparams = "models/base_M/hparams.yaml"

In [11]:
# load_from_checkpoint is a class method from pytorch lightning, inherited by MultiPartitioningClassifier
# it loads a previously saved model from a checkpoint file and a file with its hyperparameters
# MultiPartitioningClassifier is the class defining our model
model = MultiPartitioningClassifier.load_from_checkpoint(
    checkpoint_path=checkpoint,
    hparams_file=hparams,
    map_location=None
)

In [13]:
type(pl.LightningModule)

abc.ABCMeta

In [8]:
#to allow GPU
want_gpu = True
if want_gpu and torch.cuda.is_available():
    gpu = 1
else:
    gpu = None

# the Trainer class from pytorch lightning is the one responsible for training a deep NN
# it can initialize the model, run forward and backward passes, optimize, print stats, early stop...
wanted_precision = 32 #16 for half precision (number of bits per value)
trainer = pl.Trainer(gpus=gpu, precision=wanted_precision)

GPU available: True, used: True
TPU available: False, using: 0 TPU cores
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


## Load and initialize the images

In [6]:
# where images are saved
image_dir = "resources/images/im2gps"
meta_csv = "resources/images/im2gps_places365.csv"

In [7]:
import pandas as pd
first_csv = pd.read_csv(meta_csv)

In [10]:
#FiveCropImageDataset is the class for preparing the images before giving them to the NN
# in particular, it creates five different crops for every image
dataset = FiveCropImageDataset(meta_csv, image_dir)

Read resources/images/im2gps_places365.csv


In [11]:
batch_size = 64
dataloader = torch.utils.data.DataLoader(
                    dataset,
                    batch_size=ceil(batch_size / 5),  #divide by 5 because each image yields 5 different crops
                    shuffle=False,
                    num_workers=4 #number of worker subprocesses used to load the data in parallel
                )
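
The division by 5 keeps the number of crops per step close to the nominal batch size: 13 images of 5 crops each give 65 crops. The sketch below shows how such a batch can be folded into an ordinary image batch and how the per-crop outputs can be merged into one prediction per image. The tensor sizes and `toy_model` are hypothetical, and averaging over crops is only one possible aggregation; the paper's code may merge crops differently.

```python
from math import ceil

import torch

batch_size = 64
n_crops = 5
images_per_batch = ceil(batch_size / n_crops)   # 13 images -> 65 crops per step

# Hypothetical batch with layout (images, crops, channels, height, width)
batch = torch.randn(images_per_batch, n_crops, 3, 8, 8)

# Fold the crop dimension into the batch dimension so the network sees plain images
flat = batch.reshape(-1, *batch.shape[2:])      # (65, 3, 8, 8)

# Stand-in classifier with 4 classes
toy_model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 8 * 8, 4))
with torch.no_grad():
    probs = torch.softmax(toy_model(flat), dim=1)   # (65, 4)

# Unfold again and average over the crops: one distribution per image
per_image = probs.reshape(images_per_batch, n_crops, -1).mean(dim=1)   # (13, 4)
print(per_image.shape)
```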

## Run the model on the test set

In [12]:
results = trainer.test(model, test_dataloaders=dataloader, verbose=False)







## Look at the results

In [13]:
# formatting results into a pandas dataframe
df = pd.DataFrame(results[0]).T
#df["dataset"] = image_dir
df["partitioning"] = df.index
df["partitioning"] = df["partitioning"].apply(lambda x: x.split("/")[-1])
df.set_index(keys=["partitioning"], inplace=True) #keys=["dataset", "partitioning"] in case
print(df)

                  1         25        200       750       2500
partitioning                                                  
coarse        0.092827  0.316456  0.497890  0.670886  0.789030
middle        0.139241  0.345992  0.481013  0.683544  0.793249
fine          0.156118  0.392405  0.489451  0.658228  0.784810
hierarchy     0.147679  0.375527  0.489451  0.683544  0.789030
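
The column headers of the table (1, 25, 200, 750, 2500) look like distance thresholds in km. Assuming so, each value is the fraction of test images whose predicted location lies within that great-circle distance of the true one. The sketch below computes such a metric from scratch. The function names and the sample errors are hypothetical, and this is not the evaluation code of the paper.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (latitude, longitude) points in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def accuracy_at_thresholds(errors_km, thresholds=(1, 25, 200, 750, 2500)):
    """Fraction of errors that are at most each threshold (in km)."""
    return {t: sum(e <= t for e in errors_km) / len(errors_km) for t in thresholds}


# Hypothetical localization errors in km for four images
print(accuracy_at_thresholds([0.5, 10, 100, 3000]))
# {1: 0.25, 25: 0.5, 200: 0.75, 750: 0.75, 2500: 0.75}
```

Read this way, the first row of the table would mean that about 9% of the images are located within 1 km by the coarse partitioning and about 79% within 2500 km.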


In [None]:
# to save the dataframe on a csv file
fout = 'test_results.csv'
df.to_csv(fout)

In [None]:
os.chdir(r'/content/drive/MyDrive/GeoEstimation/resources/images/im2gps')
print(len(os.listdir()))
os.chdir(r'/content/drive/MyDrive/GeoEstimation')
print(os.getcwd())
import torch
# True if PyTorch can use a GPU, False otherwise
print(torch.cuda.is_available())

# number of visible GPUs and name of the first one
print(torch.cuda.device_count())
print(torch.cuda.get_device_name(0))

237
/content/drive/MyDrive/GeoEstimation
True
1
Tesla T4


In [None]:
#@title
#libraries to import
#known
import pandas as pd
import numpy as np
import os
import re
import torchvision
import torch
import PIL
from PIL import Image
from PIL import ImageFile
import sys
import time
from math import ceil



#Unknown
from typing import Union
from io import BytesIO
import random
from argparse import Namespace, ArgumentParser
from pathlib import Path
from multiprocessing import Pool
from functools import partial
import requests
import logging
import json
import yaml
from tqdm.auto import tqdm
#from classification.train_base import MultiPartitioningClassifier
#from classification.dataset import FiveCropImageDataset

#to divide
from classification import utils_global
from classification.s2_utils import Partitioning, Hierarchy
from classification.dataset import MsgPackIterableDatasetMultiTargetWithDynLabels


The main repository for the paper we follow is [this](https://github.com/TIBHannover/GeoEstimation), and the pretrained models are available [here](https://github.com/TIBHannover/GeoEstimation/releases/).

Davide found this, which may be better: [kaggle](https://www.kaggle.com/code/habedi/inspect-the-dataset/data)