### Setup Environment:

In [1]:
from src.embeddings import get_embeddings_df
import pandas as pd

## Embeddings Generation

* **Batch Size:** Number of images converted to embeddings per batch (adjust to fit your available memory)

* **Path:** Path to the images

* **Output Directory:** Directory to save the embeddings

* **Backbone:** Select a backbone from the list of possible backbones:
    * 'dinov2_small'
    * 'dinov2_base'
    * 'dinov2_large'
    * 'dinov2_giant'
    * 'sam_base'
    * 'sam_large'
    * 'sam_huge'
    * 'clip_base'
    * 'clip_large'
    * 'convnextv2_tiny'
    * 'convnextv2_base'
    * 'convnextv2_large'
    * 'convnext_tiny'
    * 'convnext_small'
    * 'convnext_base'
    * 'convnext_large'
    * 'swin_tiny'
    * 'swin_small'
    * 'swin_base'
    * 'vit_base'
    * 'vit_large'
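
To make the role of the batch size concrete, here is a minimal sketch of how a list of image file names can be split into batches. `iter_batches` and the file names are hypothetical and exist only for illustration; the batching inside `get_embeddings_df` may be implemented differently.

```python
from typing import Iterator, List, Sequence


def iter_batches(items: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items`, each with at most `batch_size` entries."""
    if batch_size <= 0:
        raise ValueError("batch_size must be a positive integer")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


# Hypothetical file names, used only to show the batch sizes.
image_names = [f"image_{i}.jpg" for i in range(70)]
batches = list(iter_batches(image_names, batch_size=32))
print([len(batch) for batch in batches])  # [32, 32, 6]
```

A smaller batch size lowers peak memory use at the cost of more batches; the last batch is simply shorter when the number of images is not a multiple of the batch size.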

In [2]:
# Foundational Models
dino_backbone = ['dinov2_small', 'dinov2_base', 'dinov2_large', 'dinov2_giant']

sam_backbone = ['sam_base', 'sam_large', 'sam_huge']

clip_backbone = ['clip_base', 'clip_large']

# ImageNet:

### Convnext
convnext_backbone = ['convnextv2_tiny', 'convnextv2_base', 'convnextv2_large'] + ['convnext_tiny', 'convnext_small', 'convnext_base', 'convnext_large']

### Swin Transformer
swin_transformer_backbone = ['swin_tiny', 'swin_small', 'swin_base']

### ViT
vit_backbone = ['vit_base', 'vit_large']

backbones = dino_backbone + clip_backbone + sam_backbone + convnext_backbone + swin_transformer_backbone + vit_backbone

backbones

['dinov2_small',
 'dinov2_base',
 'dinov2_large',
 'dinov2_giant',
 'clip_base',
 'clip_large',
 'sam_base',
 'sam_large',
 'sam_huge',
 'convnextv2_tiny',
 'convnextv2_base',
 'convnextv2_large',
 'convnext_tiny',
 'convnext_small',
 'convnext_base',
 'convnext_large',
 'swin_tiny',
 'swin_small',
 'swin_base',
 'vit_base',
 'vit_large']
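
The cells below run a single backbone at a time. If embeddings are needed for several backbones, the same call can be wrapped in a loop. The sketch below assumes only the keyword arguments that appear in this notebook (`batch_size`, `path`, `dataset_name`, `backbone`, `directory`); `embed_with_backbones` is a hypothetical helper, not part of `src.embeddings`. The embedding function is passed in as an argument so that a failure in one backbone does not stop the others.

```python
from typing import Callable, Dict, Iterable


def embed_with_backbones(
    embed_fn: Callable[..., object],
    backbones: Iterable[str],
    **common_kwargs,
) -> Dict[str, str]:
    """Call `embed_fn` once per backbone and record whether each call succeeded."""
    status: Dict[str, str] = {}
    for name in backbones:
        try:
            embed_fn(backbone=name, **common_kwargs)
            status[name] = "ok"
        except Exception as exc:  # keep going if one backbone fails
            status[name] = f"failed: {exc}"
    return status


# Example call, assuming `get_embeddings_df` and `backbones` are defined as above:
# status = embed_with_backbones(
#     get_embeddings_df,
#     backbones,
#     batch_size=32,
#     path='datasets/daquar/images',
#     dataset_name='daquar',
#     directory='Embeddings',
# )
```

The returned dictionary maps each backbone name to `"ok"` or to a short failure message, which makes it easy to rerun only the backbones that failed.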

### Image Embeddings

* **[DAQUAR Dataset](https://www.mpi-inf.mpg.de/departments/computer-vision-and-machine-learning/research/vision-and-language/visual-turing-challenge#c7057)**:

The DAQUAR (DAtaset for QUestion Answering on Real-world images) dataset was created to advance research in visual question answering (VQA). It consists of indoor scene images, each accompanied by a set of questions about the scene's content, and serves as a benchmark for training and evaluating models that must understand an image and answer questions about it.

We'll use the function `get_embeddings_df` to generate embeddings for the images in `datasets/daquar/images` and store them in `Embeddings/daquar/Embeddings_Backbone.csv`.

In [4]:
batch_size = 32
path = 'datasets/daquar/images'
dataset = 'daquar'
backbone = 'dinov2_base'
out_dir = 'Embeddings'

get_embeddings_df(batch_size=batch_size, path=path, dataset_name=dataset, backbone=backbone, directory=out_dir)

##################################################  dinov2_base  ##################################################


Using cache found in /home/datascience/.cache/torch/hub/facebookresearch_dinov2_main


Processed batch number: 10
Processed batch number: 20
Processed batch number: 30
Processed batch number: 40
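
Once a run has finished, the stored embeddings can be loaded back for inspection. The sketch below is built on an assumption: that `Backbone` in the output file name is replaced by the backbone identifier (for example `Embeddings_dinov2_base.csv`). Check the actual file name on disk before relying on it. `embeddings_csv_path` and `load_embeddings` are hypothetical helpers, not part of `src.embeddings`.

```python
from pathlib import Path


def embeddings_csv_path(directory: str, dataset_name: str, backbone: str) -> Path:
    """Build the expected location of an embeddings file.

    Assumption: the file is named `Embeddings_<backbone>.csv`.
    """
    return Path(directory) / dataset_name / f"Embeddings_{backbone}.csv"


def load_embeddings(directory: str, dataset_name: str, backbone: str):
    """Return the embeddings as a DataFrame, or None if the file does not exist."""
    csv_path = embeddings_csv_path(directory, dataset_name, backbone)
    if not csv_path.exists():
        return None
    import pandas as pd  # imported lazily so the path helper works without pandas

    return pd.read_csv(csv_path)


df = load_embeddings("Embeddings", "daquar", "dinov2_base")
if df is None:
    print("No embeddings file found; check the output directory and file name.")
else:
    print(df.shape)
    print(df.head())
```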


* **[COCO-QA Dataset](https://www.cs.toronto.edu/~mren/research/imageqa/data/cocoqa/)**:

The COCO-QA (COCO Question-Answering) dataset is designed for visual question answering. It is a subset of COCO (Common Objects in Context), a large-scale dataset of images with object annotations, extended with questions and answers: each image in COCO-QA is accompanied by a set of questions and their corresponding answers.

We'll use the function `get_embeddings_df` to generate embeddings for the images in `datasets/coco-qa/images` and store them in `Embeddings/coco-qa/Embeddings_Backbone.csv`.

In [5]:
batch_size = 32
path = 'datasets/coco-qa/images'
dataset = 'coco-qa'
backbone = 'dinov2_base'
out_dir = 'Embeddings'

get_embeddings_df(batch_size=batch_size, path=path, dataset_name=dataset, backbone=backbone, directory=out_dir)

##################################################  dinov2_base  ##################################################


Using cache found in /home/datascience/.cache/torch/hub/facebookresearch_dinov2_main


Processed batch number: 10
Processed batch number: 20
Processed batch number: 30
Processed batch number: 40
Processed batch number: 50
Processed batch number: 60
Processed batch number: 70
Processed batch number: 80
Processed batch number: 90
Processed batch number: 100
Processed batch number: 110
Processed batch number: 120
Processed batch number: 130
Processed batch number: 140
Processed batch number: 150
Processed batch number: 160
Processed batch number: 170
Processed batch number: 180
Processed batch number: 190
Processed batch number: 200
Processed batch number: 210
Processed batch number: 220
Processed batch number: 230
Processed batch number: 240
Processed batch number: 250
Processed batch number: 260
Processed batch number: 270
Processed batch number: 280
Processed batch number: 290
Processed batch number: 300
Processed batch number: 310
Processed batch number: 320
Processed batch number: 330
Processed batch number: 340
Processed batch number: 350
Processed batch number: 360


* **[Fakeddit Dataset](https://fakeddit.netlify.app/)**:

Fakeddit is a large-scale multimodal dataset for fine-grained fake news detection. It consists of over 1 million samples from multiple categories of fake news, including satire, misinformation, and fabricated news. The dataset includes text, images, metadata, and comment data, making it a rich resource for developing and evaluating fake news detection models.

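We'll use the function `get_embeddings_df` to generate embeddings for the images in `datasets/fakeddit/images` and store them in `Embeddings/fakeddit/Embeddings_Backbone.csv`. Only the images whose IDs are listed in `datasets/fakeddit/labels.csv` are passed to the function through `image_files`.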

In [2]:
batch_size = 32
path = 'datasets/fakeddit/images'
dataset = 'fakeddit'
backbone = 'dinov2_base'
out_dir = 'Embeddings'
image_files = pd.read_csv('datasets/fakeddit/labels.csv')['id'].tolist()

get_embeddings_df(batch_size=batch_size, path=path, dataset_name=dataset, backbone=backbone, directory=out_dir, image_files=image_files)

##################################################  dinov2_base  ##################################################
Skipping dhiibiw.jpg due to error
Skipping 8mp3d1.jpg due to error
Skipping d1vwukd.jpg due to error
Skipping cjxmb4c.jpg due to error
Skipping esi0vrr.jpg due to error
Skipping dzp14sm.jpg due to error
Skipping cmctgu7.jpg due to error
Skipping dij5nzu.jpg due to error
Skipping d5evm0v.jpg due to error
Skipping cqhowgl.jpg due to error
Skipping dq4x5ki.jpg due to error
Skipping c8knklu.jpg due to error
Skipping c87n4gu.jpg due to error
Skipping dltekd1.jpg due to error
Skipping ckk18va.jpg due to error
Skipping cqgkmgf.jpg due to error
Skipping cea9g9y.jpg due to error
Skipping cb2fi1s.jpg due to error
Skipping ccfviky.jpg due to error
Skipping c8fjng3.jpg due to error
Skipping dn4ln5a.jpg due to error
Skipping c9zml8c.jpg due to error
Skipping chmlqm2.jpg due to error
Skipping c3jgymh.jpg due to error
Skipping dn8r5le.jpg due to error
Skipping cid7s91.jpg due to error

Using cache found in /home/datascience/.cache/torch/hub/facebookresearch_dinov2_main


Processed batch number: 10
Processed batch number: 20
Processed batch number: 30
Processed batch number: 40
Processed batch number: 50
Processed batch number: 60
Processed batch number: 70
Processed batch number: 80
Processed batch number: 90
Processed batch number: 100
Processed batch number: 110
Processed batch number: 120
Processed batch number: 130
Processed batch number: 140
Processed batch number: 150
Processed batch number: 160
Processed batch number: 170
Processed batch number: 180
Processed batch number: 190
Processed batch number: 200
Processed batch number: 210
Processed batch number: 220
Processed batch number: 230
Processed batch number: 240
Processed batch number: 250
Processed batch number: 260
Processed batch number: 270
Processed batch number: 280
Processed batch number: 290
Processed batch number: 300
Processed batch number: 310
Processed batch number: 320
Processed batch number: 330
Processed batch number: 340
Processed batch number: 350
Processed batch number: 360
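
The log above shows that some Fakeddit images are skipped, without stating the cause. One way to narrow this down before a long run is to check which IDs have a usable file on disk. The sketch below assumes that each ID maps to `<id>.jpg` inside the image directory, as the file names in the skip messages suggest. `split_by_availability` is a hypothetical helper, and it only detects missing or empty files, not corrupted images.

```python
from pathlib import Path
from typing import Iterable, List, Tuple


def split_by_availability(
    image_ids: Iterable[str],
    image_dir: str,
    extension: str = ".jpg",  # assumption: files are named <id>.jpg
) -> Tuple[List[str], List[str]]:
    """Split IDs into those with a non-empty file on disk and those without."""
    root = Path(image_dir)
    present: List[str] = []
    missing: List[str] = []
    for image_id in image_ids:
        candidate = root / f"{image_id}{extension}"
        # A file counts as present only if it exists and is not empty.
        if candidate.is_file() and candidate.stat().st_size > 0:
            present.append(image_id)
        else:
            missing.append(image_id)
    return present, missing


# Two IDs taken from the skip messages above, used as sample input.
present, missing = split_by_availability(
    ["dhiibiw", "8mp3d1"], "datasets/fakeddit/images"
)
print(f"{len(present)} present, {len(missing)} missing or empty")
```

Passing only the `present` IDs as `image_files` would keep missing files out of the run; images that exist but cannot be decoded would still need to be handled by the embedding function itself.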

* **[Recipes5k Dataset](http://www.ub.edu/cvub/recipes5k/)**:

The Recipes5k dataset comprises 4,826 recipes, each with images and a corresponding ingredient list. The recipes were extracted from Yummly, with 50 recipes per food type, and offer a diverse collection of alternative preparations for each of the 101 food types from Food101. The dataset contains 3,213 unique ingredients, which are simplified to 1,014 by removing overly descriptive particles. It is balanced across training, validation, and test splits and is designed to address intra- and inter-class variability.


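We'll use the function `get_embeddings_df` to generate embeddings for the images in `datasets/Recipes5k/images` and store them in `Embeddings/Recipes5k/Embeddings_Backbone.csv`. The list of images to process is read from the `image` column of `datasets/Recipes5k/labels.csv`.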

In [5]:
batch_size = 32
path = 'datasets/Recipes5k/images'
dataset = 'Recipes5k'
backbone = 'dinov2_base'
out_dir = 'Embeddings'
image_files = pd.read_csv('datasets/Recipes5k/labels.csv')['image'].tolist()

get_embeddings_df(batch_size=batch_size, path=path, dataset_name=dataset, backbone=backbone, directory=out_dir, image_files=image_files)

##################################################  dinov2_base  ##################################################


Using cache found in /home/datascience/.cache/torch/hub/facebookresearch_dinov2_main


Processed batch number: 10
Processed batch number: 20
Processed batch number: 30
Processed batch number: 40
Processed batch number: 50
Processed batch number: 60
Processed batch number: 70
Processed batch number: 80
Processed batch number: 90
Processed batch number: 100
Processed batch number: 110
Processed batch number: 120
Processed batch number: 130
Processed batch number: 140
Processed batch number: 150
