Overview

This repository provides 10 synthetic datasets of orbital actions of stars (sibling stars), generated by our numerical simulation that mimics the formation process of the Milky Way. Each dataset ("/dataset/seed**.csv") contains pre-processed three-dimensional orbital action features of 275 stars. Each star is associated with an empirical uncertainty set consisting of 101 feature instances, so each dataset contains 275 [stars] x 101 [feature instances per star] = 27775 [feature instances], together with pre-computed penalties. This repository also provides the R source code to reproduce the experimental results in our paper [1].
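
The dataset layout described above can be illustrated with a small sketch. It is written in Python for illustration only (the repository itself is in R), and it makes a loud assumption: that the rows of a dataset are stored star by star, 101 consecutive rows per star. The actual column names and row order of "seed**.csv" are not stated here, so the helper `group_by_star` and the placeholder rows are hypothetical.

```python
# Illustrative only: the real column layout of seed**.csv is not documented here.
N_STARS = 275        # stars per dataset (from the description above)
N_INSTANCES = 101    # feature instances per uncertainty set
N_DIMS = 3           # orbital action features per instance


def group_by_star(rows, n_stars=N_STARS, n_instances=N_INSTANCES):
    """Split a flat list of feature rows into one uncertainty set per star.

    Assumes (hypothetically) that rows are stored star by star: the first
    n_instances rows belong to star 0, the next n_instances to star 1, etc.
    """
    if len(rows) != n_stars * n_instances:
        raise ValueError("unexpected number of rows: %d" % len(rows))
    return [rows[i * n_instances:(i + 1) * n_instances] for i in range(n_stars)]


# Placeholder rows standing in for one dataset (all zeros, not real data).
rows = [(0.0,) * N_DIMS for _ in range(N_STARS * N_INSTANCES)]
uncertainty_sets = group_by_star(rows)
total_instances = sum(len(s) for s in uncertainty_sets)
print(len(uncertainty_sets), total_instances)  # 275 27775
```

If the files turn out to carry an explicit star identifier column, grouping by that column would be the safer choice than relying on row order.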

While [1] provides a greedy and optimistic approach to clustering (GOC) and applies GOC to the synthetic datasets (whose true clusters are known), [2] further applies GOC to real-world orbital action datasets of stars. Please see [1] for further details of these synthetic datasets and experimental results.

[1] A. Okuno and K. Hattori. (2022) "A Greedy and Optimistic Approach to Clustering with a Specified Uncertainty of Covariates", arXiv:2204.08205
[2] K. Hattori, A. Okuno, and I. U. Roederer. (2023) "Finding r-II sibling stars in the Milky Way with the Greedy Optimistic Clustering algorithm", The Astrophysical Journal, 946(1), 48.

BibTeX Citation

If you use these datasets, please cite our papers [1] and [2] with the following BibTeX entries:

@article{Okuno2022,
    year      = {2022},
    publisher = {CoRR},
    volume    = {},
    number    = {},
    pages     = {},
    author    = {Okuno, Akifumi and Hattori, Kohei},
    title     = {A Greedy and Optimistic Approach to Clustering with a Specified Uncertainty of Covariates},
    journal   = {arXiv preprint arXiv:2204.08205},
    note = {submitted.}
}

@article{Hattori2023,
    year      = {2023},
    publisher = {The American Astronomical Society},
    volume    = {946},
    number    = {1},
    pages     = {48},
    author    = {Hattori, Kohei and Okuno, Akifumi and Roederer, Ian U.},
    title     = {Finding r-II Sibling Stars in the Milky Way with the Greedy Optimistic Clustering Algorithm},
    journal   = {The Astrophysical Journal}
}

Contact Information

A. Okuno (ISM and RIKEN AIP, okuno@ism.ac.jp; https://okuno.net/)
K. Hattori (NAOJ, ISM, and U. Michigan, khattori@ism.ac.jp; https://koheihattori.github.io/)

Figure: (left) true classes; (right) clusters detected by GOC.

Repository Descriptions

/dataset

This folder contains 10 datasets, "seed1.csv" to "seed10.csv". See [1] (Example 1 in Section 2 and Appendix A) for a more detailed description of these datasets. The instances are already standardized (with a few outliers removed); this processing was done by "/verbose/preprocessing.R". "/dataset/verbose" contains intermediate products of the numerical simulation used to generate the synthetic datasets.
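
As a rough illustration of what standardizing feature instances can mean, the sketch below z-scores each feature column (subtract the column mean, divide by the sample standard deviation). This is an assumption about the kind of transformation involved, not a description of what "/verbose/preprocessing.R" actually does; the function name `standardize` and the toy rows are hypothetical, and outlier removal is not covered.

```python
import statistics


def standardize(rows):
    """Z-score each column: (x - column mean) / column sample std. dev."""
    columns = list(zip(*rows))
    means = [statistics.mean(c) for c in columns]
    sds = [statistics.stdev(c) for c in columns]  # sample SD (n - 1 denominator)
    return [
        tuple((x - m) / s for x, m, s in zip(row, means, sds))
        for row in rows
    ]


# Toy three-dimensional rows (made-up numbers, not taken from the datasets).
toy = [(1.0, 10.0, 100.0), (2.0, 20.0, 300.0), (3.0, 30.0, 200.0)]
z = standardize(toy)
for row in z:
    print(row)
```

After this transformation every column has mean 0 and sample standard deviation 1, so the three action features contribute on a comparable scale to distance-based clustering.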

/verbose

"/verbose/dependencies.R" lists the required packages (used in experiments.R and visualization.R).
"/verbose/preprocessing.R" preprocesses (standardizes) the dataset instances.
"/verbose/evaluation_metrics.R" provides clustering scores (NMI and F-measure).
"/verbose/GOC.R" provides the GOC function, equipped with a clustering oracle "clm". In particular, we employed k-means, k-medoids (in ClusterR), and Gaussian mixture models (in ClusterR and Mclust).
"/verbose/summary_in_LaTeX_form.R" outputs the summary of experimental results (in the form of LaTeX tables).
"/verbose/convergence.R" outputs the summary of experimental results on the convergence of GOC.
"/verbose/affinity_propagation.R" conducts experiments on affinity propagation.
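
To give a feel for how a clustering oracle can be combined with uncertainty sets and penalties, here is a minimal sketch in Python. It is not the code in "/verbose/GOC.R" and should not be read as the exact algorithm of [1]: it assumes an alternating scheme in which a k-means oracle clusters the currently selected instances, and each item then optimistically switches to the candidate in its uncertainty set that minimizes squared distance to its cluster center plus a weighted penalty. The names `goc_sketch`, `kmeans_oracle`, the weight `lam`, and the toy data are all hypothetical.

```python
def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(p, centers):
    """Index of the center closest to point p."""
    return min(range(len(centers)), key=lambda k: sq_dist(p, centers[k]))


def kmeans_oracle(points, centers, n_iter=10):
    """Plain Lloyd iterations started from the given centers."""
    for _ in range(n_iter):
        labels = [nearest(p, centers) for p in points]
        updated = []
        for k in range(len(centers)):
            members = [p for p, l in zip(points, labels) if l == k]
            if members:
                n = len(members)
                updated.append(tuple(sum(c) / n for c in zip(*members)))
            else:
                updated.append(centers[k])  # keep an empty cluster's center
        centers = updated
    return [nearest(p, centers) for p in points], centers


def goc_sketch(uncertainty_sets, penalties, init_centers, lam=1.0, n_rounds=5):
    # Assumption: the first candidate in each set is the starting instance.
    chosen = [s[0] for s in uncertainty_sets]
    centers = list(init_centers)
    for _ in range(n_rounds):
        labels, centers = kmeans_oracle(chosen, centers)
        # Optimistic step: per item, pick the candidate with the smallest
        # (squared distance to its cluster center + lam * penalty).
        chosen = [
            min(zip(s, pen),
                key=lambda cp: sq_dist(cp[0], centers[l]) + lam * cp[1])[0]
            for s, pen, l in zip(uncertainty_sets, penalties, labels)
        ]
    labels, centers = kmeans_oracle(chosen, centers)
    return labels, chosen, centers


# Toy data: 4 items, 2 candidates each (made up, not from the datasets).
sets = [
    [(0.0, 0.0), (0.5, 0.0)],
    [(1.0, 0.0), (0.6, 0.0)],
    [(10.0, 0.0), (10.4, 0.0)],
    [(11.0, 0.0), (10.6, 0.0)],
]
pens = [[0.0, 0.01]] * 4
labels, chosen, centers = goc_sketch(sets, pens, [(0.0, 0.0), (10.0, 0.0)])
print(labels)
print(chosen)
```

In this toy run each item moves to the candidate nearer its cluster center because the small penalty is outweighed by the reduction in distance, which makes the two clusters more condensed. Swapping `kmeans_oracle` for another routine with the same inputs and outputs mirrors the idea of a pluggable oracle.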
