diff --git a/README.md b/README.md
index be1f38e4..a470a379 100644
--- a/README.md
+++ b/README.md
@@ -18,10 +18,10 @@
[pypi-shield]: https://img.shields.io/badge/Python-3.7%20|%203.8%20|%203.9%20|%203.10-blue?style=for-the-badge
[pypi-url]: https://pypi.org/project/fastdup/
-[pypiversion-shield]: https://img.shields.io/pypi/v/fastdup?style=for-the-badge
+[pypiversion-shield]: https://img.shields.io/pypi/v/fastdup?style=for-the-badge&color=success
[downloads-shield]: https://img.shields.io/badge/dynamic/json?style=for-the-badge&label=downloads&query=%24.total_downloads&url=https%3A%2F%2Fapi.pepy.tech%2Fapi%2Fv2%2Fprojects%2Ffastdup&color=lightblue
[downloads-url]: https://pypi.org/project/fastdup/
-[contributors-shield]: https://img.shields.io/github/contributors/visual-layer/fastdup?style=for-the-badge
+[contributors-shield]: https://img.shields.io/github/contributors/visual-layer/fastdup?style=for-the-badge&color=orange
[contributors-url]: https://github.com/othneildrew/Best-README-Template/graphs/contributors
[license-shield]: https://img.shields.io/badge/License-CC%20BY--NC--ND%204.0-purple.svg?style=for-the-badge
[license-url]: https://github.com/visual-layer/fastdup/blob/main/LICENSE
@@ -40,27 +40,28 @@
Manage, Clean & Curate Visual Data - Fast and at Scale.
An unsupervised and free tool for image and video dataset analysis.
+
Explore the docs »
- Features
+ Features
·
Report Bug
·
- Read Blog
+ Blog
·
Quickstart
·
Enterprise Edition
·
- About us
+ About us
-
+
@@ -84,9 +85,10 @@
🚀 Introducing VL Profiler! 🚀
We're excited to announce our new cloud product, VL Profiler. It's designed to help you gain deeper insights and enhance your productivity while using fastdup. With VL Profiler, you can visualize your data, track changes over time, and much more.
+
👉 Check out VL Profiler here 👈
-Note: VL Profiler is a separate commercial product developed by the same team behind fastdup. Our goal with VL Profiler is to provide additional value to our users while continuing to support and maintain fastdup as a free, open-source project. We'd love for you to give VL Profiler a try and share your feedback with us! [Sign-up](https://cutt.ly/9wyxhZAI) now, it's free.
+📝 Note: VL Profiler is a separate commercial product developed by the same team behind fastdup. Our goal with VL Profiler is to provide additional value to our users while continuing to support and maintain fastdup as a free, open-source project. We'd love for you to give VL Profiler a try and share your feedback with us! [Sign-up](https://cutt.ly/9wyxhZAI) now, it's free.
@@ -103,9 +105,15 @@ concerns while providing extra functionalities.
## Why fastdup?
-- **Quality**: Find and remove anomalies and outliers from your dataset, including duplicates and similar images and videos at a large scale.
-- **Cost**: Reduce data operation costs by intelligently sampling high-quality or novel datasets before labeling and assessing labeled data quality.
-- **Scale**: fastdup's C++ graph engine is highly efficient and can handle up to 400M images on a single CPU machine.
+With a plethora of data visualization/profiling tools available, what sets fastdup apart?
+Here are the top benefits of fastdup:
+
++ **Quality**: High-quality analysis to remove duplicates/near-duplicates, anomalies, mislabels, broken images, and poor-quality images.
++ **Scale**: Handles 400M images on a single CPU machine. Enterprise version scales to billions of images.
++ **Speed**: Highly optimized C++ engine runs efficiently even on low-resource CPU machines.
++ **Privacy**: Runs locally or on your cloud infrastructure. Your data stays where it is.
++ **Ease of use**: Works on labeled or unlabeled datasets, images, or videos. Get started with just [3 lines of code](#getting-started).
+
## Setting up
@@ -185,486 +193,482 @@ fd.vis.similarity_gallery() # create a gallery of similar images
View the API docs [here](https://visual-layer.readme.io/docs/v1-api).
+
## Learn from Examples
+Learn the basics of fastdup through interactive examples. View the notebooks on GitHub or nbviewer. Even better, run them on Google Colab or Kaggle, for free.
+
+
-
+
-
-
-
- |
+
+
+
+
- Quick Dataset Analysis: In this example, learn how to quickly analyze a dataset for potential issues. Identify duplicates, outliers, dark/bright/blurry images, and cluster similar images with only a few lines of code. If you're new, start here.
+ ⚡ Quickstart: Learn how to install fastdup, load a dataset and analyze it for potential issues such as duplicates/near-duplicates, broken images, outliers, dark/bright/blurry images, and view visually similar image clusters. If you're new, start here!
+
+
+ 📌 Dataset: Oxford-IIIT Pet.
|
-
-
-
+
+
+
|
-
-
+
+
-
-
-
+
+
+
|
-
-
+
+
-
-
-
+
+
+
|
-
-
+
+
-
-
-
+
+
+
|
-
-
-
-
-
+
+
+
-
-
-
- |
+
+
+
+
- DINOv2 Embeddings: In this example, learn how to use DINOv2 models to visualize image embeddings of your dataset. Runs on CPU!
+ 🧹 Clean Image Folder: Learn how to analyze a folder of images for potential issues, clean it, and export a list of problematic files for further action. If you have an unorganized folder of images, this is a good place to start.
+
+
+ 📌 Dataset: Food-101.
|
-
-
-
+
+
+
|
-
-
+
+
-
-
-
+
+
+
|
-
-
+
+
-
-
-
+
+
+
|
-
-
+
+
-
-
-
+
+
+
|
-
-
-
-
-
+
+
+
-
-
-
- |
+
+
+
+
- Cleaning Image Dataset: In this tutorial, learn how to clean a dataset from broken images, duplicates, outliers, and identify dark/bright/blurry images.
+ 🖼 Analyze Image Classification Dataset: Learn how to load a labeled image classification dataset and analyze it for potential issues. If your labeled dataset follows an ImageNet-style folder structure, have a go!
+
+
+ 📌 Dataset: Imagenette.
|
-
-
-
+
+
+
|
-
-
+
+
-
-
-
+
+
+
|
-
-
+
+
-
-
-
+
+
+
|
-
-
+
+
-
-
-
+
+
+
|
-
-
-
-
-
+
+
+
-
-
-
- |
+
+
+
+
- Analyzing Labeled Image Classification Dataset: In this tutorial, learn how to analyze a labeled image classification dataset for potential issues. We use the Imagenette dataset, a 10-class, 13k image subset of ImageNet as a working example.
+ 🎁 Analyze Object Detection Dataset: Learn how to load bounding box annotations for object detection and analyze them for potential issues. If you have a COCO-style labeled object detection dataset, give this example a try.
+
+
+ 📌 Dataset: COCO.
|
-
-
-
+
+
+
|
-
-
+
+
-
-
-
+
+
+
|
-
-
+
+
-
-
-
+
+
+
|
-
-
+
+
-
-
-
+
+
+
|
-
+
+
+
-
-
+
+## Exciting New Features
+
+> **Note**: We're happy to announce that these new features are out of beta testing and now available to the public, completely free of charge! We invite you to try them out and share your [feedback](https://visualdatabase.slack.com/join/shared_invite/zt-19jaydbjn-lNDEDkgvSI1QwbTXSY6dlA#/shared-invite/email)!
+
+
+
-
-
-
- |
+
+
+
+
- Analyzing Labeled Object Detection Dataset: In this tutorial learn how to load and analyze an object detection dataset with labeled bounding boxes and classes. We use the mini-coco dataset as a working example. Learn how to discover duplicates, outliers, and possible mislabeled bounding boxes.
+ 🤗 Analyze Hugging Face Datasets: Load and analyze datasets from Hugging Face Datasets. Perfect if you already have a dataset hosted on Hugging Face hub.
|
-
-
-
+
+
+
|
-
-
+
+
-
-
-
+
+
+
|
-
-
+
+
-
-
-
+
+
+
|
-
-
+
+
-
-
-
+
+
+
|
-
-
-
-
-
+
+
+
-
-
-
- |
+
+
+
+
- Analyzing Hugging Face Datasets: In this tutorial learn how to load and analyze datasets from Hugging Face Datasets.
+ 🦖 DINOv2 Embeddings: Extract feature vectors from your images using a DINOv2 model. Runs on CPU.
|
-
-
-
+
+
+
|
-
-
+
+
-
-
-
+
+
+
|
-
-
+
+
-
-
-
+
+
+
|
-
-
+
+
-
-
-
+
+
+
|
-
-
-
-
-
-
-## Advanced Features
-
-The following are advanced functionalities of fastdup which are still in the beta testing phase.
-Sign up for free to be a beta tester and get early access. Drop us an email at info@visual-layer.com .
-
-
-
+
+
+
-
-
-
- |
+
+
+
+
- Face Detection Video Analysis: In this tutorial, learn how to use fastdup with a face detection model to detect and crop from videos. Following that we analyze the cropped faces for issues such as duplicates, near-duplicates, outliers, bright/dark/blurry faces.
+ ➡️ Use Your Own Feature Vectors: Read fastdup generated feature vectors in Python and use them for downstream processing, or run fastdup on your feature vectors.
|
-
-
-
+
+
+
|
-
-
+
+
-
-
-
+
+
+
|
-
-
+
+
-
-
-
+
+
+
|
-
-
+
+
-
-
-
+
+
+
|
-
-
-
-
-
+
+
+
-
-
-
- |
+
+
+
+
- YOLOv5 Object Detection Video Analysis: In this tutorial, learn how to use fastdup with a pre-trained yolov5 object detection model to detect and crop from videos. Following that we analyze the cropped objects for issues such as duplicates, near-duplicates, outliers, bright/dark/blurry objects.
+ 😗 Face Detection in Videos: Use fastdup with a face detection model to detect faces from videos and analyze the cropped faces for potential issues such as duplicates, near-duplicates, outliers, bright/dark/blurry faces.
|
-
-
-
+
+
+
|
-
-
+
+
-
-
-
+
+
+
|
-
-
+
+
-
-
-
+
+
+
|
-
-
+
+
-
-
-
+
+
+
|
-
-
+
-
-
+
-
-
-
- |
+
+
+
+
- Satellite Image Analysis: In this tutorial, learn how to use fastdup to load 16-bit grayscale satellite image, work with rotated bounding boxes, understand your dataset, find issues with the data and check the quality of annotations.
+ 🤖 Object Detection in Videos: Use fastdup with a pre-trained YOLOv5 model to detect and analyze objects for potential issues such as duplicates, near-duplicates, outliers, bright/dark/blurry objects.
|
-
-
-
+
+
+
|
-
-
+
+
-
-
-
+
+
+
|
-
-
+
+
-
-
-
+
+
+
|
-
-
+
+
-
-
-
+
+
+
|
-
-
+
-
-
+
-
-
-
+
+
+
|
- Surveillance Camera Analysis: In this tutorial, learn how to use fastdup to analyze surveillance camera videos, caption the activity inside the videos and detect indoor/ outdoor.
+ 🔢 Optical Character Recognition: Enrich your dataset by detecting multilingual text with PaddleOCR.
|
-
-
-
+
+
+
|
-
-
+
+
-
-
-
+
+
+
|
-
-
+
+
-
-
-
+
+
+
|
-
-
+
+
-
-
-
+
+
+
|
-
-
-
-
-
+
+
+
-
-
-
+
+
+
|
- Image Search: In this tutorial, learn how to use fastdup to search through large image datasets for duplicates/similar images using a query image. Runs on CPU!
+ 📑 Captioning with BLIP: Enrich your dataset by captioning its images using BLIP.
|
-
-
-
+
+
+
|
-
-
+
+
-
-
-
+
+
+
|
-
-
+
+
-
-
-
+
+
+
|
-
-
+
+
-
-
-
+
+
+
|
-
-
-
-
+
+
+
-
-
-
+
+
+
|
- Feature vectors: In this tutorial, learn how to read fastdup generated feature vectors in Python and use them for downstream processing, or run fastdup on your calculated feature vectors.
+ 🔍 Image Search: Search through large image datasets for duplicates/near-duplicates using a query image. Runs on CPU!
|
-
-
-
+
+
+
|
-
-
+
+
-
-
-
+
+
+
|
-
-
+
+
-
-
-
+
+
+
|
-
-
+
+
-
-
-
+
+
+
|
-
-
-
+
+
+
## Getting Help
Get help from the fastdup team or community members via the following channels -
+ [Slack](https://visualdatabase.slack.com/join/shared_invite/zt-19jaydbjn-lNDEDkgvSI1QwbTXSY6dlA#/shared-invite/email).
diff --git a/examples/Copy_of_mafat_final.ipynb b/examples/Copy_of_mafat_final.ipynb
deleted file mode 100644
index de7fef64..00000000
--- a/examples/Copy_of_mafat_final.ipynb
+++ /dev/null
@@ -1,7824 +0,0 @@
-{
- "cells": [
- {
- "cell_type": "markdown",
- "id": "2d3a2ba6-3ba0-4770-b025-c88adf5b292e",
- "metadata": {
- "id": "2d3a2ba6-3ba0-4770-b025-c88adf5b292e"
- },
- "source": [
- "# Fastdup for Satellite Imagery\n",
- "In this notebook we load satellite data from the Mafat Competition https://mafatchallenge.mod.gov.il/, which consists of 16-bit grayscale images with rotated bounding boxes.\n",
- "\n",
- "We show how to work with this dataset using fastdup. It takes 140 seconds to process 18,000 bounding boxes and find all similarities.\n",
- "\n",
- "We use the components gallery to view bounding boxes that are highly suspected to be wrong, as well as correct bounding boxes.\n"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "b2cc8c20-4069-4183-a247-0dc28788b158",
- "metadata": {
- "id": "b2cc8c20-4069-4183-a247-0dc28788b158",
- "outputId": "4418adc7-191f-4c40-8a3f-98cb104017ed"
- },
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "\u001b[33mDEPRECATION: Configuring installation scheme with distutils config files is deprecated and will no longer work in the near future. If you are using a Homebrew or Linuxbrew Python, please see discussion at https://github.com/Homebrew/homebrew-core/issues/76621\u001b[0m\u001b[33m\n",
- "\u001b[0mCollecting fastdup\n",
- " Using cached fastdup-0.904-cp38-cp38-macosx_11_0_arm64.whl (32.8 MB)\n",
- "Collecting requests==2.28.1\n",
- " Using cached requests-2.28.1-py3-none-any.whl (62 kB)\n",
- "Collecting certifi\n",
- " Using cached certifi-2022.12.7-py3-none-any.whl (155 kB)\n",
- "Collecting pandas\n",
- " Using cached pandas-1.5.3-cp38-cp38-macosx_11_0_arm64.whl (10.8 MB)\n",
- "Collecting sentry-sdk\n",
- " Using cached sentry_sdk-1.16.0-py2.py3-none-any.whl (184 kB)\n",
- "Collecting numpy\n",
- " Using cached numpy-1.24.2-cp38-cp38-macosx_11_0_arm64.whl (13.8 MB)\n",
- "Collecting opencv-python-headless<=4.5.5.64\n",
- " Using cached opencv_python_headless-4.5.5.64-cp37-abi3-macosx_11_0_arm64.whl (29.9 MB)\n",
- "Collecting tqdm\n",
- " Using cached tqdm-4.65.0-py3-none-any.whl (77 kB)\n",
- "Collecting packaging\n",
- " Using cached packaging-23.0-py3-none-any.whl (42 kB)\n",
- "Collecting pillow\n",
- " Using cached Pillow-9.4.0-cp38-cp38-macosx_11_0_arm64.whl (3.0 MB)\n",
- "Collecting pyyaml\n",
- " Using cached PyYAML-6.0-cp38-cp38-macosx_12_0_arm64.whl\n",
- "Collecting charset-normalizer<3,>=2\n",
- " Using cached charset_normalizer-2.1.1-py3-none-any.whl (39 kB)\n",
- "Collecting urllib3<1.27,>=1.21.1\n",
- " Using cached urllib3-1.26.15-py2.py3-none-any.whl (140 kB)\n",
- "Collecting idna<4,>=2.5\n",
- " Using cached idna-3.4-py3-none-any.whl (61 kB)\n",
- "Collecting pytz>=2020.1\n",
- " Using cached pytz-2022.7.1-py2.py3-none-any.whl (499 kB)\n",
- "Collecting python-dateutil>=2.8.1\n",
- " Using cached python_dateutil-2.8.2-py2.py3-none-any.whl (247 kB)\n",
- "Collecting six>=1.5\n",
- " Using cached six-1.16.0-py2.py3-none-any.whl (11 kB)\n",
- "Installing collected packages: pytz, urllib3, tqdm, six, pyyaml, pillow, packaging, numpy, idna, charset-normalizer, certifi, sentry-sdk, requests, python-dateutil, opencv-python-headless, pandas, fastdup\n",
- " Attempting uninstall: pytz\n",
- " Found existing installation: pytz 2022.7.1\n",
- " Uninstalling pytz-2022.7.1:\n",
- " Successfully uninstalled pytz-2022.7.1\n",
- "\u001b[33m DEPRECATION: Configuring installation scheme with distutils config files is deprecated and will no longer work in the near future. If you are using a Homebrew or Linuxbrew Python, please see discussion at https://github.com/Homebrew/homebrew-core/issues/76621\u001b[0m\u001b[33m\n",
- "\u001b[0m Attempting uninstall: urllib3\n",
- " Found existing installation: urllib3 1.26.15\n",
- " Uninstalling urllib3-1.26.15:\n",
- " Successfully uninstalled urllib3-1.26.15\n",
- "\u001b[33m DEPRECATION: Configuring installation scheme with distutils config files is deprecated and will no longer work in the near future. If you are using a Homebrew or Linuxbrew Python, please see discussion at https://github.com/Homebrew/homebrew-core/issues/76621\u001b[0m\u001b[33m\n",
- "\u001b[0m Attempting uninstall: tqdm\n",
- " Found existing installation: tqdm 4.65.0\n",
- " Uninstalling tqdm-4.65.0:\n",
- " Successfully uninstalled tqdm-4.65.0\n",
- "\u001b[33m DEPRECATION: Configuring installation scheme with distutils config files is deprecated and will no longer work in the near future. If you are using a Homebrew or Linuxbrew Python, please see discussion at https://github.com/Homebrew/homebrew-core/issues/76621\u001b[0m\u001b[33m\n",
- "\u001b[0m Attempting uninstall: six\n",
- " Found existing installation: six 1.16.0\n",
- " Uninstalling six-1.16.0:\n",
- " Successfully uninstalled six-1.16.0\n",
- "\u001b[33m DEPRECATION: Configuring installation scheme with distutils config files is deprecated and will no longer work in the near future. If you are using a Homebrew or Linuxbrew Python, please see discussion at https://github.com/Homebrew/homebrew-core/issues/76621\u001b[0m\u001b[33m\n",
- "\u001b[0m Attempting uninstall: pyyaml\n",
- " Found existing installation: PyYAML 6.0\n",
- " Uninstalling PyYAML-6.0:\n",
- " Successfully uninstalled PyYAML-6.0\n",
- "\u001b[33m DEPRECATION: Configuring installation scheme with distutils config files is deprecated and will no longer work in the near future. If you are using a Homebrew or Linuxbrew Python, please see discussion at https://github.com/Homebrew/homebrew-core/issues/76621\u001b[0m\u001b[33m\n",
- "\u001b[0m Attempting uninstall: pillow\n",
- " Found existing installation: Pillow 9.4.0\n",
- " Uninstalling Pillow-9.4.0:\n",
- " Successfully uninstalled Pillow-9.4.0\n",
- "\u001b[33m DEPRECATION: Configuring installation scheme with distutils config files is deprecated and will no longer work in the near future. If you are using a Homebrew or Linuxbrew Python, please see discussion at https://github.com/Homebrew/homebrew-core/issues/76621\u001b[0m\u001b[33m\n",
- "\u001b[0m Attempting uninstall: packaging\n",
- " Found existing installation: packaging 23.0\n",
- " Uninstalling packaging-23.0:\n",
- " Successfully uninstalled packaging-23.0\n",
- "\u001b[33m DEPRECATION: Configuring installation scheme with distutils config files is deprecated and will no longer work in the near future. If you are using a Homebrew or Linuxbrew Python, please see discussion at https://github.com/Homebrew/homebrew-core/issues/76621\u001b[0m\u001b[33m\n",
- "\u001b[0m Attempting uninstall: numpy\n",
- " Found existing installation: numpy 1.24.2\n",
- " Uninstalling numpy-1.24.2:\n",
- " Successfully uninstalled numpy-1.24.2\n",
- "\u001b[33m DEPRECATION: Configuring installation scheme with distutils config files is deprecated and will no longer work in the near future. If you are using a Homebrew or Linuxbrew Python, please see discussion at https://github.com/Homebrew/homebrew-core/issues/76621\u001b[0m\u001b[33m\n",
- "\u001b[0m Attempting uninstall: idna\n",
- " Found existing installation: idna 3.4\n",
- " Uninstalling idna-3.4:\n",
- " Successfully uninstalled idna-3.4\n",
- "\u001b[33m DEPRECATION: Configuring installation scheme with distutils config files is deprecated and will no longer work in the near future. If you are using a Homebrew or Linuxbrew Python, please see discussion at https://github.com/Homebrew/homebrew-core/issues/76621\u001b[0m\u001b[33m\n",
- "\u001b[0m Attempting uninstall: charset-normalizer\n",
- " Found existing installation: charset-normalizer 2.1.1\n",
- " Uninstalling charset-normalizer-2.1.1:\n",
- " Successfully uninstalled charset-normalizer-2.1.1\n",
- "\u001b[33m DEPRECATION: Configuring installation scheme with distutils config files is deprecated and will no longer work in the near future. If you are using a Homebrew or Linuxbrew Python, please see discussion at https://github.com/Homebrew/homebrew-core/issues/76621\u001b[0m\u001b[33m\n",
- "\u001b[0m Attempting uninstall: certifi\n",
- " Found existing installation: certifi 2022.12.7\n",
- " Uninstalling certifi-2022.12.7:\n",
- " Successfully uninstalled certifi-2022.12.7\n",
- "\u001b[33m DEPRECATION: Configuring installation scheme with distutils config files is deprecated and will no longer work in the near future. If you are using a Homebrew or Linuxbrew Python, please see discussion at https://github.com/Homebrew/homebrew-core/issues/76621\u001b[0m\u001b[33m\n",
- "\u001b[0m Attempting uninstall: sentry-sdk\n",
- " Found existing installation: sentry-sdk 1.16.0\n",
- " Uninstalling sentry-sdk-1.16.0:\n",
- " Successfully uninstalled sentry-sdk-1.16.0\n",
- "\u001b[33m DEPRECATION: Configuring installation scheme with distutils config files is deprecated and will no longer work in the near future. If you are using a Homebrew or Linuxbrew Python, please see discussion at https://github.com/Homebrew/homebrew-core/issues/76621\u001b[0m\u001b[33m\n",
- "\u001b[0m Attempting uninstall: requests\n",
- " Found existing installation: requests 2.28.1\n",
- " Uninstalling requests-2.28.1:\n",
- " Successfully uninstalled requests-2.28.1\n",
- "\u001b[33m DEPRECATION: Configuring installation scheme with distutils config files is deprecated and will no longer work in the near future. If you are using a Homebrew or Linuxbrew Python, please see discussion at https://github.com/Homebrew/homebrew-core/issues/76621\u001b[0m\u001b[33m\n",
- "\u001b[0m Attempting uninstall: python-dateutil\n",
- " Found existing installation: python-dateutil 2.8.2\n",
- " Uninstalling python-dateutil-2.8.2:\n",
- " Successfully uninstalled python-dateutil-2.8.2\n",
- "\u001b[33m DEPRECATION: Configuring installation scheme with distutils config files is deprecated and will no longer work in the near future. If you are using a Homebrew or Linuxbrew Python, please see discussion at https://github.com/Homebrew/homebrew-core/issues/76621\u001b[0m\u001b[33m\n",
- "\u001b[0m Attempting uninstall: opencv-python-headless\n",
- " Found existing installation: opencv-python-headless 4.5.5.64\n",
- " Uninstalling opencv-python-headless-4.5.5.64:\n",
- " Successfully uninstalled opencv-python-headless-4.5.5.64\n",
- "\u001b[33m DEPRECATION: Configuring installation scheme with distutils config files is deprecated and will no longer work in the near future. If you are using a Homebrew or Linuxbrew Python, please see discussion at https://github.com/Homebrew/homebrew-core/issues/76621\u001b[0m\u001b[33m\n",
- "\u001b[0m Attempting uninstall: pandas\n",
- " Found existing installation: pandas 1.5.3\n",
- " Uninstalling pandas-1.5.3:\n",
- " Successfully uninstalled pandas-1.5.3\n",
- "\u001b[33m DEPRECATION: Configuring installation scheme with distutils config files is deprecated and will no longer work in the near future. If you are using a Homebrew or Linuxbrew Python, please see discussion at https://github.com/Homebrew/homebrew-core/issues/76621\u001b[0m\u001b[33m\n",
- "\u001b[0m Attempting uninstall: fastdup\n",
- " Found existing installation: fastdup 0.904\n",
- " Uninstalling fastdup-0.904:\n",
- " Successfully uninstalled fastdup-0.904\n",
- "\u001b[33m DEPRECATION: Configuring installation scheme with distutils config files is deprecated and will no longer work in the near future. If you are using a Homebrew or Linuxbrew Python, please see discussion at https://github.com/Homebrew/homebrew-core/issues/76621\u001b[0m\u001b[33m\n",
- "\u001b[0m\u001b[33mDEPRECATION: Configuring installation scheme with distutils config files is deprecated and will no longer work in the near future. If you are using a Homebrew or Linuxbrew Python, please see discussion at https://github.com/Homebrew/homebrew-core/issues/76621\u001b[0m\u001b[33m\n",
- "\u001b[0m\u001b[31mERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.\n",
- "awscli 1.27.39 requires botocore==1.29.39, but you have botocore 1.29.79 which is incompatible.\n",
- "awscli 1.27.39 requires PyYAML<5.5,>=3.10, but you have pyyaml 6.0 which is incompatible.\u001b[0m\u001b[31m\n",
- "\u001b[0mSuccessfully installed certifi-2022.12.7 charset-normalizer-2.1.1 fastdup-0.904 idna-3.4 numpy-1.24.2 opencv-python-headless-4.5.5.64 packaging-23.0 pandas-1.5.3 pillow-9.4.0 python-dateutil-2.8.2 pytz-2022.7.1 pyyaml-6.0 requests-2.28.1 sentry-sdk-1.16.0 six-1.16.0 tqdm-4.65.0 urllib3-1.26.15\n",
- "Note: you may need to restart the kernel to use updated packages.\n"
- ]
- }
- ],
- "source": [
- "# install the latest fastdup (requires 0.904 or later)\n",
- "%pip install fastdup -U --force-reinstall"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "62c0ac2e-cd8d-428e-b5ff-1b75c917f9e3",
- "metadata": {
- "id": "62c0ac2e-cd8d-428e-b5ff-1b75c917f9e3"
- },
- "outputs": [],
- "source": [
- "# Download the Mafat training data, extract the zip file, and place this notebook in the directory that contains the images/ folder"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "538d2699-4678-4f0b-a570-412d4a97c7ae",
- "metadata": {
- "id": "538d2699-4678-4f0b-a570-412d4a97c7ae"
- },
- "source": [
- "# Prepare annotation for fastdup format"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "f2fa9853-0765-4d0a-a474-1eb703ea0a66",
- "metadata": {
- "id": "f2fa9853-0765-4d0a-a474-1eb703ea0a66"
- },
- "outputs": [],
- "source": [
- "# Here we read the data as given in the competition, one annotation file per image, and combine all files into a single flat table"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "8e6087e1-9a59-4958-9110-a199c35c10f6",
- "metadata": {
- "id": "8e6087e1-9a59-4958-9110-a199c35c10f6"
- },
- "outputs": [],
- "source": [
- "import os\n",
- "files=!ls labelTxt\n",
- "files = [os.path.join('labelTxt', f) for f in files]"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "d64f0fa9-2ae4-4636-8866-a5303a490669",
- "metadata": {
- "id": "d64f0fa9-2ae4-4636-8866-a5303a490669"
- },
- "outputs": [],
- "source": [
- "def read_annotations(f):\n",
- "    with open(f, 'r') as fd:\n",
- "        lines = fd.readlines()\n",
- "\n",
- "    bounding_boxes = []\n",
- "\n",
- "    for line in lines:\n",
- "        tokens = line.split()\n",
- "        x1, y1, x2, y2, x3, y3, x4, y4 = map(float, tokens[:8])\n",
- "        label = tokens[8]\n",
- "        bounding_box = {'annot': f, 'x1': x1, 'y1': y1, 'x2': x2, 'y2': y2, 'x3': x3, 'y3': y3, 'x4': x4, 'y4': y4, 'label': label}\n",
- "        bounding_boxes.append(bounding_box)\n",
- "    return bounding_boxes"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "696a9865-8a7d-45e4-9f8b-eea4b424c91f",
- "metadata": {
- "id": "696a9865-8a7d-45e4-9f8b-eea4b424c91f"
- },
- "outputs": [],
- "source": [
- "annot = []\n",
- "for f in files:\n",
- "    annot.extend(read_annotations(f))"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "d6d95cd5-990c-4ce0-9a0d-c127a8a456b6",
- "metadata": {
- "id": "d6d95cd5-990c-4ce0-9a0d-c127a8a456b6",
- "outputId": "6a8f5bb2-a3f7-4b6e-93ed-dce74799ad54"
- },
- "outputs": [
- {
- "data": {
- "text/html": [
- "\n",
- "\n",
- "
\n",
- " \n",
- " \n",
- " | \n",
- " annot | \n",
- " x1 | \n",
- " y1 | \n",
- " x2 | \n",
- " y2 | \n",
- " x3 | \n",
- " y3 | \n",
- " x4 | \n",
- " y4 | \n",
- " label | \n",
- " filename | \n",
- "
\n",
- " \n",
- " \n",
- " \n",
- " | 0 | \n",
- " labelTxt/10011_0_0.txt | \n",
- " 828.32 | \n",
- " 18.09 | \n",
- " 866.47 | \n",
- " 43.15 | \n",
- " 775.87 | \n",
- " 181.07 | \n",
- " 737.72 | \n",
- " 156.01 | \n",
- " pylon | \n",
- " images/10011_0_0.tiff | \n",
- "
\n",
- " \n",
- " | 1 | \n",
- " labelTxt/10011_0_0.txt | \n",
- " 817.90 | \n",
- " 155.27 | \n",
- " 864.29 | \n",
- " 185.28 | \n",
- " 740.46 | \n",
- " 376.68 | \n",
- " 694.07 | \n",
- " 346.67 | \n",
- " pylon | \n",
- " images/10011_0_0.tiff | \n",
- "
\n",
- " \n",
- " | 2 | \n",
- " labelTxt/10011_0_0.txt | \n",
- " 834.47 | \n",
- " 685.91 | \n",
- " 875.96 | \n",
- " 712.86 | \n",
- " 782.08 | \n",
- " 857.43 | \n",
- " 740.59 | \n",
- " 830.48 | \n",
- " pylon | \n",
- " images/10011_0_0.tiff | \n",
- "
\n",
- " \n",
- " | 3 | \n",
- " labelTxt/10011_0_0.txt | \n",
- " 816.47 | \n",
- " 431.04 | \n",
- " 865.36 | \n",
- " 464.02 | \n",
- " 743.65 | \n",
- " 644.47 | \n",
- " 694.76 | \n",
- " 611.49 | \n",
- " pylon | \n",
- " images/10011_0_0.tiff | \n",
- "
\n",
- " \n",
- " | 4 | \n",
- " labelTxt/10011_0_0.txt | \n",
- " 719.43 | \n",
- " -35.14 | \n",
- " 770.69 | \n",
- " 0.74 | \n",
- " 737.59 | \n",
- " 48.01 | \n",
- " 686.33 | \n",
- " 12.12 | \n",
- " pylon | \n",
- " images/10011_0_0.tiff | \n",
- "
\n",
- " \n",
- "
\n",
- "
"
- ],
- "text/plain": [
- " annot x1 y1 x2 y2 x3 y3 \\\n",
- "0 labelTxt/10011_0_0.txt 828.32 18.09 866.47 43.15 775.87 181.07 \n",
- "1 labelTxt/10011_0_0.txt 817.90 155.27 864.29 185.28 740.46 376.68 \n",
- "2 labelTxt/10011_0_0.txt 834.47 685.91 875.96 712.86 782.08 857.43 \n",
- "3 labelTxt/10011_0_0.txt 816.47 431.04 865.36 464.02 743.65 644.47 \n",
- "4 labelTxt/10011_0_0.txt 719.43 -35.14 770.69 0.74 737.59 48.01 \n",
- "\n",
- " x4 y4 label filename \n",
- "0 737.72 156.01 pylon images/10011_0_0.tiff \n",
- "1 694.07 346.67 pylon images/10011_0_0.tiff \n",
- "2 740.59 830.48 pylon images/10011_0_0.tiff \n",
- "3 694.76 611.49 pylon images/10011_0_0.tiff \n",
- "4 686.33 12.12 pylon images/10011_0_0.tiff "
- ]
- },
- "execution_count": 4,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "import pandas as pd\n",
- "df = pd.DataFrame(annot)\n",
- "df['filename'] = df['annot'].apply(lambda x: x.replace('labelTxt', 'images').replace('.txt', '.tiff'))\n",
- "df.head()"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "1b4ccdaa-6162-4684-9808-303966e080bd",
- "metadata": {
- "id": "1b4ccdaa-6162-4684-9808-303966e080bd",
- "outputId": "96945552-7694-4a14-dc5d-b04cc0a93a2f"
- },
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "total annotations 18113\n"
- ]
- }
- ],
- "source": [
- "print('total annotations', len(df))"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "c46545d0-3e52-4257-91cf-68e1a2b8d10c",
- "metadata": {
- "id": "c46545d0-3e52-4257-91cf-68e1a2b8d10c"
- },
- "outputs": [],
- "source": [
- "df.index.name = 'index'\n",
- "df[['filename', 'x1', 'y1', 'x2', 'y2', 'x3', 'y3', 'x4', 'y4', 'label']].to_csv('mafat.csv',index_label='index')"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "92a5df03-d456-40a2-a01f-42a47f6835b5",
- "metadata": {
- "id": "92a5df03-d456-40a2-a01f-42a47f6835b5",
- "outputId": "b7f5eb0b-65ad-47bc-c4ac-8f2075572019"
- },
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "index,filename,x1,y1,x2,y2,x3,y3,x4,y4,label\n",
- "0,images/10011_0_0.tiff,828.32,18.09,866.47,43.15,775.87,181.07,737.72,156.01,pylon\n",
- "1,images/10011_0_0.tiff,817.9,155.27,864.29,185.28,740.46,376.68,694.07,346.67,pylon\n",
- "2,images/10011_0_0.tiff,834.47,685.91,875.96,712.86,782.08,857.43,740.59,830.48,pylon\n",
- "3,images/10011_0_0.tiff,816.47,431.04,865.36,464.02,743.65,644.47,694.76,611.49,pylon\n",
- "4,images/10011_0_0.tiff,719.43,-35.14,770.69,0.74,737.59,48.01,686.33,12.12,pylon\n",
- "5,images/10011_0_0.tiff,834.54,344.16,874.27,369.97,779.75,515.51,740.02,489.7,pylon\n",
- "6,images/10011_0_0.tiff,1233.3,540.65,1238.65,547.24,1233.78,551.2,1228.43,544.61,heavy_equipment\n",
- "7,images/10011_0_0.tiff,817.78,769.25,866.68,801.74,746.22,983.05,697.32,950.56,pylon\n",
- "8,images/10011_1280_0.tiff,701.87,421.42,755.68,465.62,744.99,478.63,691.18,434.43,pylon\n"
- ]
- }
- ],
- "source": [
- "# This is the input format required by fastdup\n",
- "!head mafat.csv"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "620799ea-3318-4a74-8dd0-d74ec3f42849",
- "metadata": {
- "id": "620799ea-3318-4a74-8dd0-d74ec3f42849"
- },
- "source": [
- "# Run fastdup to crop and build a model for the crops"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "d8dcc080-7ef8-4789-8e14-7b56794c4d22",
- "metadata": {
- "id": "d8dcc080-7ef8-4789-8e14-7b56794c4d22"
- },
- "outputs": [],
- "source": [
- "import numpy as np\n",
- "import cv2\n",
- "\n",
- "!rm -fr output"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "d5abac7d-3b78-4090-9c6a-50abea31b0db",
- "metadata": {
- "id": "d5abac7d-3b78-4090-9c6a-50abea31b0db"
- },
- "outputs": [],
- "source": [
- "import pandas as pd\n",
- "import fastdup\n",
- "fd = fastdup.create(input_dir='.', work_dir='output')\n"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "a4264a0c-d313-4f8a-9d4f-eb0e2cc6e10e",
- "metadata": {
- "id": "a4264a0c-d313-4f8a-9d4f-eb0e2cc6e10e"
- },
- "outputs": [],
- "source": [
- "# To run advanced bounding boxes in fastdup, email info@databasevisual.com to get your free license.\n",
- "# Rotated bounding boxes are not supported in the free version."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "94156d52-1c7d-400f-a0c2-63df5648a0e9",
- "metadata": {
- "id": "94156d52-1c7d-400f-a0c2-63df5648a0e9",
- "outputId": "c843b0e6-7163-49af-e34b-a4efa90c4268"
- },
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "FastDup Software, (C) copyright 2022 Dr. Amir Alush and Dr. Danny Bickson.\n",
- "2023-03-15 06:27:02 [INFO] Going to loop over dir /var/folders/4m/17tfvm293lg5scctpk1cd2940000gn/T/tmpfvaycon7.csv\n",
- "2023-03-15 06:27:02 [INFO] Found total 18113 images to run on\n",
- "FastDup Software, (C) copyright 2022 Dr. Amir Alush and Dr. Danny Bickson.utes 0 Features\n",
- "2023-03-15 06:29:06 [INFO] Going to loop over dir /var/folders/4m/17tfvm293lg5scctpk1cd2940000gn/T/crops_input.csv\n",
- "2023-03-15 06:29:06 [INFO] Found total 18113 images to run on\n",
- "2023-03-15 06:29:28 [INFO] Found total 18113 images to run onimated: 0 Minutes 0 Features\n",
- "2023-03-15 06:29:31 [INFO] 3396) Finished write_index() NN model\n",
- "2023-03-15 06:29:31 [INFO] Stored nn model index file output/nnf.index\n",
- "2023-03-15 06:29:33 [INFO] Total time took 27053 ms\n",
- "2023-03-15 06:29:33 [INFO] Found a total of 12 fully identical images (d>0.990), which are 0.02 %\n",
- "2023-03-15 06:29:33 [INFO] Found a total of 573 nearly identical images(d>0.980), which are 1.05 %\n",
- "2023-03-15 06:29:33 [INFO] Found a total of 32089 above threshold images (d>0.900), which are 59.05 %\n",
- "2023-03-15 06:29:33 [INFO] Found a total of 1811 outlier images (d<0.050), which are 3.33 %\n",
- "2023-03-15 06:29:33 [INFO] Min distance found 0.625 max distance 0.992\n",
- "2023-03-15 06:29:33 [INFO] Running connected components for ccthreshold 0.950000 \n",
- ".0\n",
- " ########################################################################################\n",
- "\n",
- "Dataset Analysis Summary: \n",
- "\n",
- " Dataset contains 18113 images\n",
- " Valid images are 100.00% (18,113) of the data, invalid are 0.00% (0) of the data\n",
- " Similarity: 21.91% (3,969) belong to 60 similarity clusters (components).\n",
- " 78.09% (14,144) images do not belong to any similarity cluster.\n",
- " Largest cluster has 382 (2.11%) images.\n",
- " For a detailed analysis, use `.connected_components()`\n",
- "(similarity threshold used is 0.9, connected component threshold used is 0.95).\n",
- "\n",
- " Outliers: 5.95% (1,077) of images are possible outliers, and fall in the bottom 5.00% of similarity values.\n",
- " For a detailed list of outliers, use `.outliers()`.\n"
- ]
- }
- ],
- "source": [
- "\n",
- "fd.run(annotations=df, overwrite=True, license='XXX', bounding_box='rotated', augmentation_additive_margin=15,\n",
- " verbose=False, ccthreshold=0.95)"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "a834aaaa-a76c-49bc-b293-c3c3e114d7aa",
- "metadata": {
- "id": "a834aaaa-a76c-49bc-b293-c3c3e114d7aa"
- },
- "source": [
- "# Find suspected wrong bounding boxes\n",
- "\n",
- "From: crop image name. To: similar images whose labels do not match."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "4e445a56-ffa9-448d-9e74-715413fc4f3c",
- "metadata": {
- "id": "4e445a56-ffa9-448d-9e74-715413fc4f3c",
- "outputId": "b622776d-2159-4e76-f7dd-a3f70c97d913"
- },
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "heavy_equipment\n"
- ]
- },
- {
- "name": "stderr",
- "output_type": "stream",
- "text": [
- "100%|████████████████████████████████████████████| 20/20 [00:00<00:00, 293.53it/s]"
- ]
- },
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "Finished OK. Components are stored as image files output/galleries/components_[index].jpg\n",
- "Stored components visual view in output/galleries/components.html\n",
- "Execution time in seconds 2.0\n"
- ]
- },
- {
- "name": "stderr",
- "output_type": "stream",
- "text": [
- "\n"
- ]
- },
- {
- "data": {
- "text/html": [
- "Components Report, slice: diff\n",
- "\n",
- "| component | num_images | mean_distance | labels |\n",
- "| --- | --- | --- | --- |\n",
- "| 2500 | 4 | 0.9791 | small_vessel: 3, medium_vessel: 1 |\n",
- "| 736 | 2 | 0.9777 | large_vehicle: 1, medium_vessel: 1 |\n",
- "| 4622 | 2 | 0.9754 | medium_vessel: 1, small_vessel: 1 |\n",
- "| 251 | 2 | 0.9753 | medium_vessel: 1, small_vessel: 1 |\n",
- "| 8885 | 2 | 0.9742 | bus: 1, large_vehicle: 1 |\n",
- "| 334 | 4 | 0.9742 | medium_vehicle: 2, medium_vessel: 1, small_vessel: 1 |\n",
- "| 1892 | 3 | 0.9739 | small_vessel: 2, small_vehicle: 1 |\n",
- "| 1851 | 4 | 0.9737 | medium_vessel: 3, small_vessel: 1 |\n",
- "| 2523 | 4 | 0.9731 | small_vessel: 3, medium_vessel: 1 |\n",
- "| 184 | 7 | 0.973 | double_trailer_truck: 5, large_vehicle: 2 |\n",
- "| 2511 | 2 | 0.973 | bus: 1, small_vessel: 1 |\n",
- "| 4117 | 2 | 0.9729 | small_vehicle: 1, small_vessel: 1 |\n",
- "| 6470 | 3 | 0.9724 | medium_vehicle: 2, large_vehicle: 1 |\n",
- "| 250 | 3 | 0.972 | small_vessel: 2, medium_vehicle: 1 |\n",
- "| 347 | 3 | 0.9719 | medium_vessel: 2, small_vessel: 1 |\n",
- "| 1846 | 7 | 0.9717 | small_vehicle: 4, medium_vehicle: 3 |\n",
- "| 1404 | 4 | 0.9715 | medium_vehicle: 3, large_vehicle: 1 |\n",
- "| 2324 | 2 | 0.9713 | small_vehicle: 1, small_vessel: 1 |\n",
- "| 5611 | 2 | 0.9712 | large_vehicle: 1, medium_vehicle: 1 |\n",
- "| 1567 | 2 | 0.9711 | large_vehicle: 1, small_vehicle: 1 |"
- ],
- "text/plain": [
- ""
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- }
- ],
- "source": [
- "fd.vis.component_gallery(load_crops=True,enhance_image=True,keep_aspect_ratio=True,slice='diff', num_images=20, save_artifacts=True)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "44174ffd-72f0-4a63-8849-6989bf982fa2",
- "metadata": {
- "id": "44174ffd-72f0-4a63-8849-6989bf982fa2"
- },
- "outputs": [],
- "source": [
- "# Look at the raw clusters to link each cluster ID back to its files"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "d1129fcd-ab0b-4ef7-93a0-30fea445be2f",
- "metadata": {
- "id": "d1129fcd-ab0b-4ef7-93a0-30fea445be2f"
- },
- "outputs": [],
- "source": [
- "df = pd.read_csv('output/galleries/components.csv')"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "1422b9cd-34cf-496f-be2a-48ca5f358193",
- "metadata": {
- "id": "1422b9cd-34cf-496f-be2a-48ca5f358193",
- "outputId": "e343d422-06d9-4e4e-a64f-d35b781603b1"
- },
- "outputs": [
- {
- "data": {
- "text/html": [
- "| | Unnamed: 0 | component_id | files | label | distance | len |\n",
- "| --- | --- | --- | --- | --- | --- | --- |\n",
- "| 0 | 2500 | 2500 | ['crops/images13591_6400_1280.tiff_1_412_4_409_9_415_6_418.jpg', 'crops/images13591_6400_1280.tiff_4_410_7_407_12_413_9_416.jpg', 'crops/images13591_6400_1280.tiff_7_408_10_406_15_411_12_414.jpg', 'crops/images1675_3840_11520.tiff_2_77_8_76_10_92_4_93.jpg'] | ['small_vessel', 'small_vessel', 'small_vessel', 'medium_vessel'] | 0.9791 | 4 |\n",
- "| 1 | 736 | 736 | ['crops/images1081_1280_10240.tiff_556_144_559_139_574_149_571_154.jpg', 'crops/images18849_6400_0.tiff_1080_96_1085_94_1092_112_1087_114.jpg'] | ['large_vehicle', 'medium_vessel'] | 0.9777 | 2 |\n",
- "| 2 | 4622 | 4622 | ['crops/images1675_3840_11520.tiff_7_36_15_39_6_60_-1_57.jpg', 'crops/images214_10240_1280.tiff_1267_223_1271_221_1276_230_1272_233.jpg'] | ['medium_vessel', 'small_vessel'] | 0.9754 | 2 |\n",
- "| 3 | 251 | 251 | ['crops/images10669_8960_0.tiff_1178_336_1186_324_1191_326_1183_339.jpg', 'crops/images4079_3840_5120.tiff_169_1175_176_1178_174_1182_167_1179.jpg'] | ['medium_vessel', 'small_vessel'] | 0.9753 | 2 |\n",
- "| 4 | 8885 | 8885 | ['crops/images5532_1280_0.tiff_815_931_817_935_807_941_804_937.jpg', 'crops/images5532_1280_0.tiff_1073_1007_1073_1002_1085_1003_1085_1008.jpg'] | ['large_vehicle', 'bus'] | 0.9742 | 2 |"
- ],
- "text/plain": [
- " Unnamed: 0 component_id \\\n",
- "0 2500 2500 \n",
- "1 736 736 \n",
- "2 4622 4622 \n",
- "3 251 251 \n",
- "4 8885 8885 \n",
- "\n",
- " files \\\n",
- "0 ['crops/images13591_6400_1280.tiff_1_412_4_409_9_415_6_418.jpg', 'crops/images13591_6400_1280.tiff_4_410_7_407_12_413_9_416.jpg', 'crops/images13591_6400_1280.tiff_7_408_10_406_15_411_12_414.jpg', 'crops/images1675_3840_11520.tiff_2_77_8_76_10_92_4_93.jpg'] \n",
- "1 ['crops/images1081_1280_10240.tiff_556_144_559_139_574_149_571_154.jpg', 'crops/images18849_6400_0.tiff_1080_96_1085_94_1092_112_1087_114.jpg'] \n",
- "2 ['crops/images1675_3840_11520.tiff_7_36_15_39_6_60_-1_57.jpg', 'crops/images214_10240_1280.tiff_1267_223_1271_221_1276_230_1272_233.jpg'] \n",
- "3 ['crops/images10669_8960_0.tiff_1178_336_1186_324_1191_326_1183_339.jpg', 'crops/images4079_3840_5120.tiff_169_1175_176_1178_174_1182_167_1179.jpg'] \n",
- "4 ['crops/images5532_1280_0.tiff_815_931_817_935_807_941_804_937.jpg', 'crops/images5532_1280_0.tiff_1073_1007_1073_1002_1085_1003_1085_1008.jpg'] \n",
- "\n",
- " label \\\n",
- "0 ['small_vessel', 'small_vessel', 'small_vessel', 'medium_vessel'] \n",
- "1 ['large_vehicle', 'medium_vessel'] \n",
- "2 ['medium_vessel', 'small_vessel'] \n",
- "3 ['medium_vessel', 'small_vessel'] \n",
- "4 ['large_vehicle', 'bus'] \n",
- "\n",
- " distance len \n",
- "0 0.9791 4 \n",
- "1 0.9777 2 \n",
- "2 0.9754 2 \n",
- "3 0.9753 2 \n",
- "4 0.9742 2 "
- ]
- },
- "execution_count": 7,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "df.head()"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "bcb6a063-698c-480b-88e4-8ec3c9bfdb27",
- "metadata": {
- "id": "bcb6a063-698c-480b-88e4-8ec3c9bfdb27"
- },
- "outputs": [],
- "source": [
- "# Looking at good labels"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "5225bde9-baea-4a45-92fd-baab7d6d4553",
- "metadata": {
- "id": "5225bde9-baea-4a45-92fd-baab7d6d4553",
- "outputId": "0e571cfd-274a-48b3-90e7-de3bdfc83024"
- },
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "small_aircraft\n"
- ]
- },
- {
- "name": "stderr",
- "output_type": "stream",
- "text": [
- "100%|████████████████████████████████████████████| 20/20 [00:00<00:00, 295.60it/s]"
- ]
- },
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "Finished OK. Components are stored as image files output/galleries/components_[index].jpg\n",
- "Stored components visual view in output/galleries/components.html\n",
- "Execution time in seconds 2.0\n"
- ]
- },
- {
- "name": "stderr",
- "output_type": "stream",
- "text": [
- "\n"
- ]
- },
- {
- "data": {
- "text/html": [
- "Components Report, slice: same\n",
- "\n",
- "| component | num_images | mean_distance | labels |\n",
- "| --- | --- | --- | --- |\n",
- "| 9946 | 12 | 0.9512 | container: 12 |\n",
- "| 1038 | 11 | 0.951 | medium_vehicle: 11 |\n",
- "| 584 | 11 | 0.9628 | container: 11 |\n",
- "| 9948 | 11 | 0.9508 | container: 11 |\n",
- "| 2732 | 11 | 0.951 | bus: 11 |\n",
- "| 8817 | 10 | 0.9678 | container: 10 |\n",
- "| 198 | 9 | 0.9697 | double_trailer_truck: 9 |\n",
- "| 8684 | 9 | 0.9506 | container: 9 |\n",
- "| 8827 | 8 | 0.9624 | container: 8 |\n",
- "| 3052 | 8 | 0.9542 | pylon: 8 |\n",
- "| 587 | 8 | 0.9657 | container: 8 |\n",
- "| 1657 | 8 | 0.9553 | pylon: 8 |\n",
- "| 582 | 8 | 0.9643 | container: 8 |\n",
- "| 9986 | 8 | 0.951 | container: 8 |\n",
- "| 1661 | 8 | 0.9549 | pylon: 8 |\n",
- "| 7355 | 8 | 0.9559 | medium_vessel: 8 |\n",
- "| 8653 | 7 | 0.9505 | container: 7 |\n",
- "| 8668 | 7 | 0.9568 | container: 7 |\n",
- "| 1664 | 7 | 0.9513 | pylon: 7 |\n",
- "| 1092 | 7 | 0.9506 | small_aircraft: 7 |"
- ],
- "text/plain": [
- ""
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- }
- ],
- "source": [
- "fd.vis.component_gallery(load_crops=True,enhance_image=True,keep_aspect_ratio=True,slice='same', num_images=20, save_artifacts=True)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "5b86b38f-2f3e-4ab5-911b-f43079f82e93",
- "metadata": {
- "id": "5b86b38f-2f3e-4ab5-911b-f43079f82e93"
- },
- "outputs": [],
- "source": [
- "# Let's look at outliers at the satellite image level"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "4082bd38-22ab-445b-a9a2-a72856352870",
- "metadata": {
- "id": "4082bd38-22ab-445b-a9a2-a72856352870",
- "outputId": "ea579444-fbf5-4168-e935-40200b33b69f"
- },
- "outputs": [
- {
- "name": "stderr",
- "output_type": "stream",
- "text": [
- "100%|██████████████████████████████████████████| 20/20 [00:00<00:00, 34836.41it/s]\n"
- ]
- },
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "Stored outliers visual view in output/galleries/outliers.html\n"
- ]
- },
- {
- "data": {
- "text/html": [
- "Outliers Report\n",
- "Showing image outliers, one per row\n",
- "| Distance | Path | label |\n",
- "| 0.62536 | images/17939_1280_0.tiff | large_vessel |\n",
- "| 0.653902 | images/2978_1280_7680.tiff | pylon |\n",
- "| 0.662745 | images/12365_3840_0.tiff | pylon |\n",
- "| 0.679181 | images/12365_1280_2560.tiff | pylon |\n",
- "| 0.682993 | images/19122_3840_2560.tiff | pylon |\n",
- "| 0.683379 | images/18029_0_5120.tiff | pylon |\n",
- "| 0.689777 | images/4752_3840_3840.tiff | large_vessel |\n",
- "| 0.692824 | images/5431_6400_5120.tiff | large_vessel |\n",
- "| 0.693226 | images/12365_2560_1280.tiff | pylon |\n",
- "| 0.693298 | images/12365_1280_2560.tiff | pylon |\n",
- "| 0.693298 | images/12365_0_2560.tiff | pylon |\n",
- "| 0.694098 | images/12365_0_3840.tiff | pylon |\n",
- "| 0.697002 | images/1296_2560_2560.tiff | large_vessel |\n",
- "| 0.700964 | images/5795_1280_3840.tiff | pylon |\n",
- "| 0.703567 | images/12365_2560_1280.tiff | pylon |\n",
- "| 0.716707 | images/1081_0_8960.tiff | large_vessel |\n",
- "| 0.720739 | images/12365_1280_2560.tiff | pylon |\n",
- "| 0.721096 | images/9967_1280_3840.tiff | pylon |\n",
- "| 0.721096 | images/9967_1280_5120.tiff | pylon |\n",
- "| 0.72187 | images/12365_3840_0.tiff | pylon |\n"
- ],
- "text/plain": [
- ""
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- }
- ],
- "source": [
- "fd.vis.outliers_gallery()"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "f7998fe4-db21-4c06-aca6-3287119b74d2",
- "metadata": {
- "id": "f7998fe4-db21-4c06-aca6-3287119b74d2"
- },
- "outputs": [],
- "source": [
- "# Now we look at outliers at the crop level"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "925c986e-18d9-4a6f-adc5-2cd7949f8424",
- "metadata": {
- "id": "925c986e-18d9-4a6f-adc5-2cd7949f8424",
- "outputId": "149f3684-c7e4-463d-a606-b0e580d7813e"
- },
- "outputs": [
- {
- "name": "stderr",
- "output_type": "stream",
- "text": [
- "100%|██████████████████████████████████████████| 20/20 [00:00<00:00, 34620.75it/s]"
- ]
- },
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "Stored outliers visual view in output/galleries/outliers.html\n"
- ]
- },
- {
- "name": "stderr",
- "output_type": "stream",
- "text": [
- "\n"
- ]
- },
- {
- "data": {
- "text/html": [
- "Outliers Report\n",
- "Showing image outliers, one per row\n",
- "| Distance | Path | label |\n",
- "| 0.62536 | images/17939_1280_0.tiff | large_vessel |\n",
- "| 0.653902 | images/2978_1280_7680.tiff | pylon |\n",
- "| 0.662745 | images/12365_3840_0.tiff | pylon |\n",
- "| 0.679181 | images/12365_1280_2560.tiff | pylon |\n",
- "| 0.682993 | images/19122_3840_2560.tiff | pylon |\n",
- "| 0.683379 | images/18029_0_5120.tiff | pylon |\n",
- "| 0.689777 | images/4752_3840_3840.tiff | large_vessel |\n",
- "| 0.692824 | images/5431_6400_5120.tiff | large_vessel |\n",
- "| 0.693226 | images/12365_2560_1280.tiff | pylon |\n",
- "| 0.693298 | images/12365_1280_2560.tiff | pylon |\n",
- "| 0.693298 | images/12365_0_2560.tiff | pylon |\n",
- "| 0.694098 | images/12365_0_3840.tiff | pylon |\n",
- "| 0.697002 | images/1296_2560_2560.tiff | large_vessel |\n",
- "| 0.700964 | images/5795_1280_3840.tiff | pylon |\n",
- "| 0.703567 | images/12365_2560_1280.tiff | pylon |\n",
- "| 0.716707 | images/1081_0_8960.tiff | large_vessel |\n",
- "| 0.720739 | images/12365_1280_2560.tiff | pylon |\n",
- "| 0.721096 | images/9967_1280_3840.tiff | pylon |\n",
- "| 0.721096 | images/9967_1280_5120.tiff | pylon |\n",
- "| 0.72187 | images/12365_3840_0.tiff | pylon |\n"
- ],
- "text/plain": [
- ""
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- }
- ],
- "source": [
- "fd.vis.outliers_gallery(load_crops=True)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "47cfd1cc-7db6-4256-9550-62ab7fe3e81e",
- "metadata": {
- "id": "47cfd1cc-7db6-4256-9550-62ab7fe3e81e"
- },
- "outputs": [],
- "source": [
- "# We look for the brightest satellite images"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "4a861aab-50a2-4f39-944e-f139fe60327a",
- "metadata": {
- "id": "4a861aab-50a2-4f39-944e-f139fe60327a",
- "outputId": "ce13b08d-b6ea-4439-b86b-bdc2d52fb9ac"
- },
- "outputs": [
- {
- "name": "stderr",
- "output_type": "stream",
- "text": [
- "100%|█████████████████████████████████████████████| 20/20 [00:00<00:00, 51.94it/s]\n"
- ]
- },
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "Stored mean visual view in output/galleries/mean.html\n"
- ]
- },
- {
- "data": {
- "text/html": [
- "Bright Image Report\n",
- "Showing example images, sort by descending order\n",
- "| mean | filename | label |\n",
- "| 204.1419 | images/1362_5120_1280.tiff | small_aircraft |\n",
- "| 200.4892 | images/9606_7680_2560.tiff | pylon |\n",
- "| 198.2261 | images/9606_7680_2560.tiff | pylon |\n",
- "| 194.7672 | images/9606_7680_2560.tiff | pylon |\n",
- "| 183.3484 | images/12872_3840_5120.tiff | small_aircraft |\n",
- "| 183.298 | images/5247_3840_3840.tiff | small_aircraft |\n",
- "| 181.3148 | images/5247_6400_3840.tiff | small_aircraft |\n",
- "| 175.75 | images/5247_6400_3840.tiff | small_aircraft |\n",
- "| 174.6339 | images/8688_5120_2560.tiff | small_aircraft |\n",
- "| 171.6046 | images/5247_7680_3840.tiff | small_aircraft |\n",
- "| 170.957 | images/14919_0_2560.tiff | pylon |\n",
- "| 170.8223 | images/6298_11520_6400.tiff | heavy_equipment |\n",
- "| 170.8009 | images/12872_3840_5120.tiff | small_aircraft |\n",
- "| 169.5945 | images/12872_3840_5120.tiff | small_aircraft |\n",
- "| 169.5548 | images/14919_3840_7680.tiff | pylon |\n",
- "| 168.5053 | images/5247_7680_3840.tiff | small_aircraft |\n",
- "| 168.0129 | images/12872_3840_5120.tiff | small_aircraft |\n",
- "| 167.867 | images/5247_5120_3840.tiff | small_aircraft |\n",
- "| 167.8209 | images/12872_3840_5120.tiff | small_aircraft |\n",
- "| 167.4776 | images/1362_5120_0.tiff | small_aircraft |\n"
- ],
- "text/plain": [
- ""
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- }
- ],
- "source": [
- "fd.vis.stats_gallery(metric='mean')"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "9711f363-9d0f-4d42-b4cd-66f5f9ab1b00",
- "metadata": {
- "id": "9711f363-9d0f-4d42-b4cd-66f5f9ab1b00"
- },
- "outputs": [],
- "source": [
- "# Now we look for the most blurry images"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "c0a2d9d9-5180-4ebe-b073-f7feef1e4c6d",
- "metadata": {
- "id": "c0a2d9d9-5180-4ebe-b073-f7feef1e4c6d",
- "outputId": "96d72992-2dd0-4d8c-a266-d7fe6f47b1c8"
- },
- "outputs": [
- {
- "name": "stderr",
- "output_type": "stream",
- "text": [
- "100%|███████████████████████████████████████████| 20/20 [00:00<00:00, 2195.86it/s]"
- ]
- },
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "Stored blur visual view in output/galleries/blur.html\n"
- ]
- },
- {
- "name": "stderr",
- "output_type": "stream",
- "text": [
- "\n"
- ]
- },
- {
- "data": {
- "text/html": [
- "Blurry Image Report\n",
- "Showing example images, sort by ascending order\n",
- "| blur | filename | label |\n",
- "| 0.3481 | output/crops/images10781_5120_5120.tiff_183_192_191_190_201_225_193_227.jpg | large_vehicle |\n",
- "| 0.4059 | output/crops/images10781_5120_5120.tiff_202_286_202_293_175_294_175_287.jpg | large_vehicle |\n",
- "| 0.4622 | output/crops/images10781_5120_5120.tiff_185_107_193_104_203_139_195_141.jpg | large_vehicle |\n",
- "| 0.5688 | output/crops/images19448_1280_7680.tiff_1183_516_1192_506_1196_510_1187_520.jpg | medium_vehicle |\n",
- "| 0.57 | output/crops/images19231_3840_3840.tiff_861_966_860_958_894_954_895_962.jpg | large_vehicle |\n",
- "| 0.6532 | output/crops/images9249_3840_1280.tiff_91_1179_102_1180_102_1186_90_1185.jpg | medium_vehicle |\n",
- "| 0.7369 | output/crops/images8528_0_5120.tiff_89_1261_105_1253_109_1260_93_1268.jpg | medium_vehicle |\n",
- "| 0.8693 | output/crops/images5307_3840_8960.tiff_531_205_542_206_542_211_531_211.jpg | medium_vehicle |\n",
- "| 0.9059 | output/crops/images8528_0_8960.tiff_749_662_754_657_764_668_759_673.jpg | medium_vehicle |\n",
- "| 0.9141 | output/crops/images5809_1280_8960.tiff_1149_828_1151_846_1146_846_1144_829.jpg | medium_vehicle |\n",
- "| 0.9563 | output/crops/images4580_7680_3840.tiff_225_1109_238_1112_237_1117_224_1114.jpg | medium_vehicle |\n",
- "| 0.9667 | output/crops/images2289_7680_7680.tiff_675_176_688_167_691_172_678_180.jpg | medium_vehicle |\n",
- "| 0.9992 | output/crops/images8528_0_8960.tiff_761_657_765_653_775_664_771_668.jpg | medium_vehicle |\n",
- "| 1.0048 | output/crops/images13085_1280_3840.tiff_1198_423_1202_426_1197_431_1194_429.jpg | medium_vehicle |\n",
- "| 1.0294 | output/crops/images10781_5120_5120.tiff_181_236_188_234_198_267_191_269.jpg | large_vehicle |\n",
- "| 1.0413 | output/crops/images6286_1280_3840.tiff_1048_1189_1054_1185_1060_1195_1055_1198.jpg | medium_vehicle |\n",
- "| 1.0931 | output/crops/images9967_1280_1280.tiff_1120_881_1130_883_1129_888_1119_886.jpg | medium_vehicle |\n",
- "| 1.1291 | output/crops/images10781_5120_5120.tiff_164_239_172_237_181_270_174_272.jpg | large_vehicle |\n",
- "| 1.1569 | output/crops/images6290_5120_10240.tiff_692_404_696_408_683_418_680_413.jpg | medium_vehicle |\n",
- "| 1.1676 | output/crops/images10781_5120_5120.tiff_169_226_176_223_187_257_179_260.jpg | large_vehicle |\n"
- ],
- "text/plain": [
- ""
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- }
- ],
- "source": [
- "fd.vis.stats_gallery(metric='blur',load_crops=True)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "a8fe9bbf-6be1-4907-b555-53605befbf6d",
- "metadata": {
- "id": "a8fe9bbf-6be1-4907-b555-53605befbf6d"
- },
- "outputs": [],
- "source": []
- }
- ],
- "metadata": {
- "kernelspec": {
- "display_name": "Python 3 (ipykernel)",
- "language": "python",
- "name": "python3"
- },
- "language_info": {
- "codemirror_mode": {
- "name": "ipython",
- "version": 3
- },
- "file_extension": ".py",
- "mimetype": "text/x-python",
- "name": "python",
- "nbconvert_exporter": "python",
- "pygments_lexer": "ipython3",
- "version": "3.8.16"
- },
- "colab": {
- "provenance": []
- }
- },
- "nbformat": 4,
- "nbformat_minor": 5
-}
\ No newline at end of file
diff --git a/examples/mafat-final.ipynb b/examples/satellite-image-analysis.ipynb
similarity index 99%
rename from examples/mafat-final.ipynb
rename to examples/satellite-image-analysis.ipynb
index 03eef553..e66e715b 100644
--- a/examples/mafat-final.ipynb
+++ b/examples/satellite-image-analysis.ipynb
@@ -5,7 +5,7 @@
"id": "2d3a2ba6-3ba0-4770-b025-c88adf5b292e",
"metadata": {},
"source": [
- "# Fastdup for Sattelite Imagery\n",
+ "# Satellite Image Analysis\n",
"In this notebook we load satellite data from Mafat Competition https://mafatchallenge.mod.gov.il/, which consists of 16 bit grayscale images with rotated bounding boxes.\n",
"\n",
"We show how to work with this dataset using fastdup. It takes 140 seconds to process 18,000 bounding boxes and find all similarities.\n",
diff --git a/gallery/coco_thumbnail.jpg b/gallery/coco_thumbnail.jpg
index cd2f2bad..dd430231 100644
Binary files a/gallery/coco_thumbnail.jpg and b/gallery/coco_thumbnail.jpg differ
diff --git a/gallery/dino.png b/gallery/dino.png
deleted file mode 100644
index c9591d74..00000000
Binary files a/gallery/dino.png and /dev/null differ
diff --git a/gallery/dino_thumbnail.jpg b/gallery/dino_thumbnail.jpg
new file mode 100644
index 00000000..dd5103da
Binary files /dev/null and b/gallery/dino_thumbnail.jpg differ
diff --git a/gallery/feature_vector.jpg b/gallery/feature_vector.jpg
new file mode 100644
index 00000000..de54148b
Binary files /dev/null and b/gallery/feature_vector.jpg differ
diff --git a/gallery/feature_vector.png b/gallery/feature_vector.png
deleted file mode 100644
index 38e4f807..00000000
Binary files a/gallery/feature_vector.png and /dev/null differ
diff --git a/gallery/food_101_thumbnail.jpg b/gallery/food_101_thumbnail.jpg
deleted file mode 100644
index f1852844..00000000
Binary files a/gallery/food_101_thumbnail.jpg and /dev/null differ
diff --git a/gallery/food_thumbnail.jpg b/gallery/food_thumbnail.jpg
new file mode 100644
index 00000000..cec28709
Binary files /dev/null and b/gallery/food_thumbnail.jpg differ
diff --git a/gallery/nbviewer_logo.png b/gallery/nbviewer_logo.png
new file mode 100644
index 00000000..3d872a1f
Binary files /dev/null and b/gallery/nbviewer_logo.png differ
diff --git a/gallery/ocr_thumbnail.jpg b/gallery/ocr_thumbnail.jpg
new file mode 100644
index 00000000..8bc377b3
Binary files /dev/null and b/gallery/ocr_thumbnail.jpg differ
diff --git a/gallery/product-matching.jpg b/gallery/product-matching.jpg
index e03ebfb6..df9c6805 100644
Binary files a/gallery/product-matching.jpg and b/gallery/product-matching.jpg differ
diff --git a/gallery/purple-3d-text-editable-text-style-vector-2753015678 b/gallery/purple-3d-text-editable-text-style-vector-2753015678
new file mode 100644
index 00000000..d47abbd0
Binary files /dev/null and b/gallery/purple-3d-text-editable-text-style-vector-2753015678 differ
diff --git a/gallery/satellite.png b/gallery/satellite.png
deleted file mode 100644
index c8d4deff..00000000
Binary files a/gallery/satellite.png and /dev/null differ
diff --git a/gallery/satellite_thumbnail.jpg b/gallery/satellite_thumbnail.jpg
new file mode 100644
index 00000000..32a9d17a
Binary files /dev/null and b/gallery/satellite_thumbnail.jpg differ
diff --git a/gallery/surveillance.png b/gallery/surveillance.png
deleted file mode 100644
index f72d87b1..00000000
Binary files a/gallery/surveillance.png and /dev/null differ
diff --git a/gallery/surveillance_thumbnail.jpg b/gallery/surveillance_thumbnail.jpg
new file mode 100644
index 00000000..f39ac233
Binary files /dev/null and b/gallery/surveillance_thumbnail.jpg differ
diff --git a/gallery/video-yolov5-detection.png b/gallery/video-yolov5-detection.png
index e0d8c54b..f33114c0 100644
Binary files a/gallery/video-yolov5-detection.png and b/gallery/video-yolov5-detection.png differ