[![image](https://raw.githubusercontent.com/visual-layer/visuallayer/main/imgs/vl_horizontal_logo.png)](https://www.visual-layer.com)

# Generating Image Captions With fastdup

[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/visual-layer/fastdup/blob/main/examples/caption_generation.ipynb)
[![Open in Kaggle](https://kaggle.com/static/images/open-in-kaggle.svg)](https://kaggle.com/kernels/welcome?src=https://github.com/visual-layer/fastdup/blob/main/examples/caption_generation.ipynb)


This notebook shows how you can use [fastdup](https://github.com/visual-layer/fastdup) to generate image captions.

## Install fastdup

First, install fastdup and verify the installation.

In [3]:
!pip install fastdup -Uq

Now, test the installation. If there's no error message, we are ready to go.

In [5]:
import fastdup
fastdup.__version__

'1.39'
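
If you would rather check the installation without importing the package (for example, from a setup script), the standard library can report the installed version. The snippet below is a minimal sketch, not part of fastdup; `installed_version` is a hypothetical helper name introduced here for illustration.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("fastdup")
    print(found if found is not None else "fastdup is not installed")
```

A `None` result means `pip` did not install the package into the environment that is running the notebook kernel.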

In [10]:
import pandas as pd
from IPython.display import HTML

## Run fastdup

To run fastdup, simply point `input_dir` to the folder containing images from the dataset. 

In [6]:
fd = fastdup.create(input_dir='/Users/guysinger/Desktop/Large Files/coco10/')
fd.run(overwrite=True)

FastDup Software, (C) copyright 2022 Dr. Amir Alush and Dr. Danny Bickson.
2023-09-11 10:03:40 [INFO] Going to loop over dir /Users/guysinger/Desktop/Large Files/coco10
2023-09-11 10:03:40 [INFO] Found total 10 images to run on, 10 train, 0 test, name list 10, counter 10 
2023-09-11 10:03:41 [INFO] Found total 10 images to run onEstimated: 0 Minutes
2023-09-11 10:03:41 [INFO] 4) Finished write_index() NN model
2023-09-11 10:03:41 [INFO] Stored nn model index file work_dir/nnf.index
2023-09-11 10:03:41 [INFO] Total time took 1011 ms
2023-09-11 10:03:41 [INFO] Found a total of 0 fully identical images (d>0.990), which are 0.00 % of total graph edges
2023-09-11 10:03:41 [INFO] Found a total of 0 nearly identical images(d>0.980), which are 0.00 % of total graph edges
2023-09-11 10:03:41 [INFO] Found a total of 0 above threshold images (d>0.900), which are 0.00 % of total graph edges
2023-09-11 10:03:41 [INFO] Found a total of 1 outlier images         (d<0.050), which are 5.00 % of total gr



0
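
The log reports how many images fastdup found (10 here). To cross-check that number before a long run, you could count the image files under `input_dir` yourself. This is a rough sketch under stated assumptions: `count_images` is a hypothetical helper, and the extension set is chosen for illustration and may not match the formats fastdup actually reads.

```python
from collections import Counter
from pathlib import Path

# Assumed set of extensions for illustration; fastdup's own list of supported formats may differ.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".gif", ".tif", ".tiff", ".webp"}


def count_images(input_dir: str) -> Counter:
    """Recursively count files under `input_dir`, keyed by lower-cased image extension."""
    counts = Counter()
    for path in Path(input_dir).rglob("*"):
        suffix = path.suffix.lower()
        if path.is_file() and suffix in IMAGE_EXTENSIONS:
            counts[suffix] += 1
    return counts
```

Calling `sum(count_images(input_dir).values())` on the same folder gives a total to compare against the "Found total ... images" line in the log.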

## Generate Captions

Available models for captioning are:
- ViT-GPT2: `'vitgpt2'`
- BLIP-2: `'blip2'`
- BLIP: `'blip'`

Available models for VQA are:
- ViLT-b32: `'indoors_outdoors'` (currently only available for indoor/outdoor VQA)
- ViT-Age: `'age'` (currently only available for person age VQA)
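
Since each model is selected by a plain string, a typo only surfaces once the call is made. One way to catch it earlier is a small lookup that mirrors the lists above. This is a sketch, not fastdup API: `SUPPORTED_MODELS` and `resolve_model` are hypothetical names, and the task labels are descriptive only.

```python
# Model identifiers copied from the lists above; the task labels are descriptive only.
SUPPORTED_MODELS = {
    "vitgpt2": "captioning",
    "blip2": "captioning",
    "blip": "captioning",
    "indoors_outdoors": "vqa",
    "age": "vqa",
}


def resolve_model(name: str, task: str = "captioning") -> str:
    """Return the normalized model identifier, or raise ValueError if it does not fit `task`."""
    key = name.strip().lower()
    if SUPPORTED_MODELS.get(key) != task:
        valid = sorted(k for k, t in SUPPORTED_MODELS.items() if t == task)
        raise ValueError(f"Unknown {task} model {name!r}; expected one of {valid}")
    return key
```

The returned string can then be passed on, for instance `fd.caption(resolve_model('vitgpt2'))`.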

In [7]:
captions_df = fd.caption('vitgpt2')

  from .autonotebook import tqdm as notebook_tqdm
Could not find image processor class in the image processor config or the model config. Loading based on pattern matching with the model's feature extractor configuration.
We strongly recommend passing in an `attention_mask` since your input_ids may be padded. See https://huggingface.co/docs/transformers/troubleshooting#incorrect-output-when-padding-tokens-arent-masked.
100%|██████████| 10/10 [00:04<00:00,  2.18it/s]


In [8]:
captions_df

| | filename | index | error_code | is_valid | fd_index | caption |
|---|---|---|---|---|---|---|
| 0 | /Users/guysinger/Desktop/Large Files/coco10/test/000000032812.jpg | 0 | VALID | True | 0 | a woman swinging a tennis racket at a ball |
| 1 | /Users/guysinger/Desktop/Large Files/coco10/test/000000044135.jpg | 1 | VALID | True | 1 | a man wearing a blue shirt and a blue tie |
| 2 | /Users/guysinger/Desktop/Large Files/coco10/test/000000060472.jpg | 2 | VALID | True | 2 | a man with a keyboard and a mouse |
| 3 | /Users/guysinger/Desktop/Large Files/coco10/test/000000087018.jpg | 3 | VALID | True | 3 | a pile of apples and oranges sitting on a table |
| 4 | /Users/guysinger/Desktop/Large Files/coco10/test/000000126027.jpg | 4 | VALID | True | 4 | a woman on skis standing on a snow covered slope |
| 5 | /Users/guysinger/Desktop/Large Files/coco10/test/000000150127.jpg | 5 | VALID | True | 5 | a bedroom with a desk and a laptop on it |
| 6 | /Users/guysinger/Desktop/Large Files/coco10/test/000000198786.jpg | 6 | VALID | True | 6 | a bathroom with a toilet, sink, and mirror |
| 7 | /Users/guysinger/Desktop/Large Files/coco10/test/000000238892.jpg | 7 | VALID | True | 7 | a woman wearing a blue shirt and a pink tie |
| 8 | /Users/guysinger/Desktop/Large Files/coco10/test/000000269429.jpg | 8 | VALID | True | 8 | a living room with a couch, chair, and fireplace |
| 9 | /Users/guysinger/Desktop/Large Files/coco10/test/000000290827.jpg | 9 | VALID | True | 9 | a yellow and blue train is on the tracks |
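
Once captions exist, they can be searched like any other text column. The sketch below works on plain records; in the notebook these could come from `captions_df[['filename', 'caption']].to_dict('records')`. `find_by_keyword` is a hypothetical helper, and the sample rows are copied from the output above with paths shortened to the file name.

```python
import re
from typing import Dict, Iterable, List


def find_by_keyword(records: Iterable[Dict[str, str]], keyword: str) -> List[str]:
    """Return filenames whose caption contains `keyword` as a whole word (case-insensitive)."""
    pattern = re.compile(rf"\b{re.escape(keyword)}\b", re.IGNORECASE)
    return [r["filename"] for r in records if pattern.search(r["caption"])]


# Three rows copied from the captions output above (paths shortened to the file name).
sample = [
    {"filename": "000000032812.jpg", "caption": "a woman swinging a tennis racket at a ball"},
    {"filename": "000000044135.jpg", "caption": "a man wearing a blue shirt and a blue tie"},
    {"filename": "000000238892.jpg", "caption": "a woman wearing a blue shirt and a pink tie"},
]

print(find_by_keyword(sample, "woman"))  # ['000000032812.jpg', '000000238892.jpg']
print(find_by_keyword(sample, "man"))    # ['000000044135.jpg']
```

Whole-word matching is a deliberate choice here: a plain substring test for "man" would also match every caption containing "woman".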


## Visualize Results

Use fastdup's built-in gallery methods to visualize the captioned images.
Alternatively, captions can be generated directly for a gallery by setting the `label_col` argument to one of the model names listed above.

In [14]:
# Pair each image with itself (distance 0) so the gallery shows its caption as the label.
visualization_df = pd.DataFrame({'from': captions_df['filename'], 'to': captions_df['filename'], 'label': captions_df['caption'], 'distance': [0] * len(captions_df)})
fastdup.create_outliers_gallery(visualization_df, save_path='.', num_images=10)

100%|██████████| 10/10 [00:00<00:00, 9841.16it/s]

Stored outliers visual view in  ./outliers.html





0

In [17]:
HTML('outliers.html')

| Distance | Path | label |
|---|---|---|
| 0 | /Users/guysinger/Desktop/Large Files/coco10/test/000000032812.jpg | a woman swinging a tennis racket at a ball |
| 0 | /Users/guysinger/Desktop/Large Files/coco10/test/000000044135.jpg | a man wearing a blue shirt and a blue tie |
| 0 | /Users/guysinger/Desktop/Large Files/coco10/test/000000060472.jpg | a man with a keyboard and a mouse |
| 0 | /Users/guysinger/Desktop/Large Files/coco10/test/000000087018.jpg | a pile of apples and oranges sitting on a table |
| 0 | /Users/guysinger/Desktop/Large Files/coco10/test/000000126027.jpg | a woman on skis standing on a snow covered slope |
| 0 | /Users/guysinger/Desktop/Large Files/coco10/test/000000150127.jpg | a bedroom with a desk and a laptop on it |
| 0 | /Users/guysinger/Desktop/Large Files/coco10/test/000000198786.jpg | a bathroom with a toilet, sink, and mirror |
| 0 | /Users/guysinger/Desktop/Large Files/coco10/test/000000238892.jpg | a woman wearing a blue shirt and a pink tie |
| 0 | /Users/guysinger/Desktop/Large Files/coco10/test/000000269429.jpg | a living room with a couch, chair, and fireplace |
| 0 | /Users/guysinger/Desktop/Large Files/coco10/test/000000290827.jpg | a yellow and blue train is on the tracks |


# Wrap Up

Next, feel free to check out other tutorials:

+ ⚡ [**Quickstart**](https://nbviewer.org/github/visual-layer/fastdup/blob/main/examples/quick-dataset-analysis.ipynb): Learn how to install fastdup, load a dataset and analyze it for potential issues such as duplicates/near-duplicates, broken images, outliers, dark/bright/blurry images, and view visually similar image clusters. If you're new, start here!
+ 🧹 [**Clean Image Folder**](https://nbviewer.org/github/visual-layer/fastdup/blob/main/examples/cleaning-image-dataset.ipynb): Learn how to analyze and clean a folder of images from potential issues and export a list of problematic files for further action. If you have an unorganized folder of images, this is a good place to start.
+ 🖼 [**Analyze Image Classification Dataset**](https://nbviewer.org/github/visual-layer/fastdup/blob/main/examples/analyzing-image-classification-dataset.ipynb): Learn how to load a labeled image classification dataset and analyze it for potential issues. If you have a labeled, ImageNet-style folder structure, have a go!
+ 🎁 [**Analyze Object Detection Dataset**](https://nbviewer.org/github/visual-layer/fastdup/blob/main/examples/analyzing-object-detection-dataset.ipynb): Learn how to load bounding box annotations for object detection and analyze for potential issues. If you have a COCO-style labeled object detection dataset, give this example a try. 


## VL Profiler
If you prefer a no-code platform to inspect and visualize your dataset, [**try our free cloud product VL Profiler**](https://app.visual-layer.com). VL Profiler is our first no-code commercial product that lets you visualize and inspect your dataset in your browser.

[Sign up](https://app.visual-layer.com) now; it's free.

[![image](https://raw.githubusercontent.com/visual-layer/fastdup/main/gallery/vl_profiler_promo.svg)](https://app.visual-layer.com)

As usual, feedback is welcome! 

Questions? Drop by our [Slack channel](https://visualdatabase.slack.com/join/shared_invite/zt-19jaydbjn-lNDEDkgvSI1QwbTXSY6dlA#/shared-invite/email) or open an issue on [GitHub](https://github.com/visual-layer/fastdup/issues).