HamletMonkey/image_dataset_explorer

Image Dataset Explorer

Dataset Statistics

data_vis.py - Visualizes the distribution of a compiled image dataset.

The result is a single image with 6 subplots (2 rows, 3 columns) containing:

  1. Class distribution in image dataset
  2. Normalized co-occurrence matrix (Jaccard similarity)
  3. Histogram of image aspect ratio
  4. Mean area of bounding box per class
  5. Aspect ratio of bounding box in image dataset
  6. Square root of relative area (size) of bounding box to image (per class)
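
For subplot 2, the following is a minimal sketch of how a Jaccard-normalized co-occurrence matrix between classes could be computed. It is not taken from data_vis.py: the function name jaccard_cooccurrence and the input format (one list of class names per image) are assumptions made for illustration. For two classes A and B, the score is the number of images containing both divided by the number of images containing either.

```python
from itertools import combinations


def jaccard_cooccurrence(image_labels):
    """Return (classes, matrix) for a list of per-image class-name lists.

    Hypothetical helper for illustration; not part of data_vis.py.
    matrix[a][b] = |images with a and b| / |images with a or b|.
    """
    # Map each class to the set of image indices in which it appears.
    images_per_class = {}
    for idx, labels in enumerate(image_labels):
        for name in set(labels):
            images_per_class.setdefault(name, set()).add(idx)

    classes = sorted(images_per_class)
    matrix = {a: {b: 0.0 for b in classes} for a in classes}
    for a in classes:
        matrix[a][a] = 1.0  # a class always co-occurs with itself

    for a, b in combinations(classes, 2):
        inter = len(images_per_class[a] & images_per_class[b])
        union = len(images_per_class[a] | images_per_class[b])
        matrix[a][b] = matrix[b][a] = inter / union
    return classes, matrix


# Example with four images (made-up labels).
labels = [["cat", "dog"], ["cat"], ["dog", "bird"], ["cat", "dog"]]
classes, matrix = jaccard_cooccurrence(labels)
print(matrix["cat"]["dog"])  # 2 shared images out of 4 in the union
```

A score near 1 means two classes almost always appear together; a score of 0 means they never share an image.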

Running of Script

Run data_vis.py in a terminal: python data_vis.py --ann_path ./path/to/annotations

To add a main title to the plot, e.g. 'Raw Dataset': python data_vis.py --ann_path ./path/to/annotations --title 'Raw Dataset'

The resulting image is saved in the working directory.

IoU: Duplicate Annotations Check

iou_score_vis.py - Visualizes and computes the IoU (intersection over union) score for every pair of bounding boxes within an image. A high IoU score indicates a potentially duplicated annotation of the same object, especially if both labels are of the same class.
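
As an illustration of the score itself, here is a minimal sketch of pairwise IoU over the boxes of one image. It is not the code in iou_score_vis.py: the helper names iou and pairwise_iou and the (xmin, ymin, xmax, ymax) box format are assumptions.

```python
from itertools import combinations


def iou(box_a, box_b):
    """IoU of two boxes given as (xmin, ymin, xmax, ymax). Hypothetical helper."""
    # Corners of the overlapping rectangle, if any.
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def pairwise_iou(boxes):
    """Return (i, j, score) for every unordered pair of (label, box) entries."""
    return [
        (i, j, iou(boxes[i][1], boxes[j][1]))
        for i, j in combinations(range(len(boxes)), 2)
    ]


# Made-up annotations: two overlapping "cat" boxes and one distant "dog" box.
boxes = [
    ("cat", (0, 0, 10, 10)),
    ("cat", (5, 0, 15, 10)),
    ("dog", (20, 20, 30, 30)),
]
pairs = pairwise_iou(boxes)
for i, j, score in pairs:
    print(boxes[i][0], boxes[j][0], round(score, 3))
```

Pairs with a high score and matching labels are the ones worth inspecting first as possible duplicates.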

The main function iou_score_plot() also returns a pd.DataFrame containing the coordinates, classes and iou_score of each bounding box pair. To get the intersection coordinates of each bounding box pair, apply the function iou_inter_coord() to the filtered DataFrame.
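
To make the idea of intersection coordinates concrete, the sketch below computes the overlapping rectangle of two boxes. It does not reproduce iou_inter_coord(), whose signature is not shown here; the function name intersection_coords and the (xmin, ymin, xmax, ymax) box format are assumptions.

```python
def intersection_coords(box_a, box_b):
    """Return the overlap of two (xmin, ymin, xmax, ymax) boxes, or None.

    Hypothetical helper for illustration; not the repository's iou_inter_coord().
    """
    xmin = max(box_a[0], box_b[0])
    ymin = max(box_a[1], box_b[1])
    xmax = min(box_a[2], box_b[2])
    ymax = min(box_a[3], box_b[3])
    # Boxes that do not overlap (or only touch) have no intersection area.
    if xmax <= xmin or ymax <= ymin:
        return None
    return (xmin, ymin, xmax, ymax)


print(intersection_coords((0, 0, 10, 10), (5, 0, 15, 10)))
print(intersection_coords((0, 0, 10, 10), (20, 20, 30, 30)))
```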

Running of Script

Run iou_score_vis.py in a terminal: python iou_score_vis.py --xmlpath ./path/to/annotations -p True

The returned pandas.DataFrame (df_iou) is saved as iou_score.csv in the working directory.

About

Visualization tools to help identify possible issues with an image dataset
