Scripts to visualize the Kaggle NDSB training images
latest commit cb1475a3cd
Eben Olson authored
circlepack adding circlepack
viewer adding viewer
.gitignore Initial commit
DejaVuSansMono-Bold.ttf add font
README.md update README
machine_setup.sh initial commit
make_bubblechart.py initial commit
make_mosaics.py initial commit
taxonomy.txt initial commit

README.md

kaggle-ndsb-visualization

This is a collection of scripts for visualizing plankton images from the Kaggle National Data Science Bowl competition. The images within each training class are compiled into a single mosaic image, and a bubble plot is created which groups the mosaics according to the provided taxonomy. Because the resulting image is very large, it is saved as a tile pyramid, which can be viewed with the included Polymaps viewer.
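To give a rough sense of what a tile pyramid involves, the sketch below counts the tiles at each zoom level for a square image. It assumes 256-pixel tiles and a resolution that doubles with each zoom level, a common convention for web map viewers; the tile size and level scheme actually used by these scripts are not stated here, and `pyramid_levels` is a hypothetical helper, not part of the repository.

```python
import math


def pyramid_levels(image_size, tile_size=256):
    """Return a list of (zoom, pixels_per_side, tiles_per_side) tuples.

    Assumes a square image. Zoom 0 is a single tile, and each further
    zoom level doubles the number of tiles along each side until the
    level covers at least image_size pixels.
    """
    max_zoom = max(0, math.ceil(math.log2(image_size / tile_size)))
    levels = []
    for zoom in range(max_zoom + 1):
        tiles_per_side = 2 ** zoom
        levels.append((zoom, tiles_per_side * tile_size, tiles_per_side))
    return levels


# Illustrative size only; the real bubble plot dimensions are not given here.
levels = pyramid_levels(16384)
total_tiles = sum(tiles * tiles for _, _, tiles in levels)
print(len(levels), "levels,", total_tiles, "tiles in total")
```

The viewer only needs to load the tiles covering the current view at the current zoom level, which is why a very large image stays browsable.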

Dependencies

These scripts require numpy, scipy, PIL and matplotlib. Executing machine_setup.sh on a fresh Ubuntu 14.04 installation will install all necessary packages.
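If you are not using machine_setup.sh, a quick check along the following lines can confirm that the listed packages are importable. This is a minimal sketch using only the standard library; `missing_packages` is a hypothetical helper, and it checks import names only, not versions.

```python
import importlib.util

# Import names for the packages listed above. PIL is imported as "PIL"
# whether it is provided by the original PIL or by Pillow.
REQUIRED = ["numpy", "scipy", "PIL", "matplotlib"]


def missing_packages(names=REQUIRED):
    """Return the names that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages()
if missing:
    print("Missing packages: " + ", ".join(missing))
else:
    print("All required packages are importable.")
```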

Data Preparation

Clone this repository into a directory adjacent to your data directory (named visualization in the layout below), with the training images placed in data/train.

.
├── data
│   └── train
└── visualization
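Before running the scripts, it can help to confirm that the training data is where it is expected. The sketch below counts the files in each class directory. It assumes that data/train holds one subdirectory per class, and `count_images_per_class` is a hypothetical helper rather than part of the repository.

```python
from pathlib import Path


def count_images_per_class(train_dir):
    """Map each class subdirectory name to the number of files it contains."""
    train_dir = Path(train_dir)
    if not train_dir.is_dir():
        raise FileNotFoundError(f"training directory not found: {train_dir}")
    return {
        class_dir.name: sum(1 for f in class_dir.iterdir() if f.is_file())
        for class_dir in sorted(train_dir.iterdir())
        if class_dir.is_dir()
    }


# Example, run from inside the visualization directory:
#     counts = count_images_per_class("../data/train")
#     print(len(counts), "classes,", sum(counts.values()), "images")
```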

Usage

  1. python make_mosaics.py will generate mosaic images in the mosaics subdirectory.
  2. python make_bubblechart.py will generate the bubble plot and write tile images into the pyramid subdirectory.
  3. After generation, the bubble plot can be viewed by opening the file viewer/index.html in a web browser.
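As an illustration of the kind of layout involved in the first step, the sketch below places a class's images on a near-square grid of equally sized cells. It is an assumption made for illustration only: the layout make_mosaics.py actually uses may differ, and `grid_positions` is a hypothetical helper.

```python
import math


def grid_positions(n_images, cell_size):
    """Return (cols, rows, positions) for a near-square grid.

    positions[i] is the (x, y) pixel offset of the top-left corner of
    image i, filling the grid row by row.
    """
    if n_images <= 0:
        return 0, 0, []
    # Integer ceil(sqrt(n)), avoiding floating-point rounding.
    cols = math.isqrt(n_images - 1) + 1
    rows = math.ceil(n_images / cols)
    positions = [
        ((i % cols) * cell_size, (i // cols) * cell_size)
        for i in range(n_images)
    ]
    return cols, rows, positions


cols, rows, positions = grid_positions(10, cell_size=64)
print(cols, "columns x", rows, "rows")
```

Each image would then be pasted into the mosaic at its offset, for example with PIL's Image.paste.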

System Requirements

Memory usage is currently unoptimized: at least 16GB of RAM is required to render the bubble plot, so use of an Amazon EC2 m3.2xlarge instance is recommended. NumPy's memmap can be used to reduce memory usage, but it greatly increases computation time. Mosaic creation (make_mosaics.py) is much less memory intensive.
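For readers unfamiliar with memmap, the sketch below shows the general idea: the canvas array is backed by a file on disk instead of being held entirely in RAM. The canvas size, file name, and patch location are made up for illustration, and this is not how make_bubblechart.py is currently written.

```python
import os
import tempfile

import numpy as np

# Hypothetical canvas size for illustration; the real bubble plot is far larger.
height, width, channels = 2048, 2048, 3
shape = (height, width, channels)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "canvas.dat")

    # mode="w+" creates (or overwrites) a zero-filled file of the right size.
    canvas = np.memmap(path, dtype=np.uint8, mode="w+", shape=shape)

    # Writes go to the memory-mapped file; only the pages touched need RAM.
    canvas[100:164, 200:264] = 255
    canvas.flush()
    del canvas  # release the mapping before reopening

    # Reopen read-only and inspect the data without loading the whole array.
    reopened = np.memmap(path, dtype=np.uint8, mode="r", shape=shape)
    patch_mean = float(reopened[100:164, 200:264].mean())
    untouched_max = int(reopened[:100].max())
    del reopened  # release the mapping before the directory is removed

print(patch_mean, untouched_max)
```

The trade-off noted above follows from this design: every access may involve disk I/O, so rendering is slower than with an in-memory array.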

Screenshots

  1. Zoomed out completely
  2. Zooming in on one group
  3. A single mosaic