This repository contains the supporting code for the paper:
Staib, Matthew and Jegelka, Stefanie. Wasserstein k-means++ for Cloud Regime Histogram Clustering. In Proceedings of the Seventh International Workshop on Climate Informatics, 2017.
@inproceedings{staib2017wasserstein,
author = {Staib, Matthew and Jegelka, Stefanie},
title = {Wasserstein k-means++ for Cloud Regime Histogram Clustering},
booktitle = {Proceedings of the Seventh International Workshop on Climate
Informatics: CI 2017},
year = {2017}
}
- TFOCS
- Gurobi (for the Wasserstein gradient oracle)
- MOSEK (for an alternative Wasserstein gradient oracle)
- FastEMD (for an alternative Wasserstein gradient oracle)
- sinkhornTransport.m from Marco Cuturi (for an alternative Wasserstein gradient oracle)
- ISCCP D1 dataset
- First, parse the ISCCP dataset: navigate to the directory containing all the .hdf files, then run extract_cloud_histograms.
- Add the compute-optimal-transport directory to the path, along with any other third-party code (e.g. TFOCS).
- Run cluster_histograms, which loads the preprocessed ISCCP data and runs various clustering algorithms. Be sure to modify the first few lines of cluster_histograms to load the .mat file from step 1, wherever it was stored.
At this point, the figures from the paper can be generated by running weather_state_plots and prepare_globe_plots, followed by make_globe_plots.
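The full workflow above can be sketched as a single MATLAB session. This is only an illustration under stated assumptions: the variables repoDir, dataDir and tfocsDir and the paths assigned to them are hypothetical placeholders, and the sketch assumes the scripts named above live at the root of this repository and can be called by name once that root is on the path.

```matlab
% Hypothetical locations; replace with the paths on your machine.
repoDir  = '/path/to/this/repository';   % assumed: root of this repository
dataDir  = '/path/to/hdf/files';         % assumed: directory holding the .hdf files
tfocsDir = '/path/to/TFOCS';             % assumed: local TFOCS installation

% Make the repository scripts and third-party code visible to MATLAB.
addpath(repoDir);
addpath(fullfile(repoDir, 'compute-optimal-transport'));
addpath(tfocsDir);

% Step 1: parse the dataset from inside the directory with the .hdf files.
cd(dataDir);
extract_cloud_histograms;

% Step 3: cluster the preprocessed histograms. Edit the first few lines of
% cluster_histograms beforehand so it loads the .mat file written in step 1.
cd('/path/to/this/repository');          % same hypothetical path as repoDir
cluster_histograms;

% Generate the figures from the paper.
weather_state_plots;
prepare_globe_plots;
make_globe_plots;
```

The repository path is repeated as a literal in the second cd call because these are scripts that share the base workspace, so a variable such as repoDir may not survive if a script clears the workspace. Other solvers from the dependency list (Gurobi, MOSEK, FastEMD, sinkhornTransport.m) would need their own addpath calls if the corresponding gradient oracle is used.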