oknakfm/GOC

Overview

This repository provides 10 synthetic orbital action datasets of stars (sibling stars), generated by our numerical simulation that mimics the formation process of the Milky Way. Each dataset ("/dataset/seed**.csv") contains pre-processed three-dimensional orbital action features of 275 stars. Each star comes with an empirical uncertainty set consisting of 101 feature instances. Overall, each dataset contains 275 [stars] x 101 [feature instances per uncertainty set, i.e. per star] = 27775 [feature instances] associated with pre-computed penalties. This repository also provides R source code to reproduce the experimental results in our paper [1].
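The size arithmetic above can be checked with a short sketch. The repository's own code is in R; the example here uses plain Python purely for illustration, and it runs on randomly generated placeholder values, not on the actual CSV files. It assumes (this is not stated in this README) that rows are grouped star by star, so that 101 consecutive rows form one uncertainty set. The function name `to_uncertainty_sets` is hypothetical, and the real column layout of the "seed**.csv" files should be checked before reusing any of this.

```python
import random

N_STARS = 275        # stars per dataset (from the README)
N_INSTANCES = 101    # feature instances per uncertainty set (from the README)
N_FEATURES = 3       # orbital action dimensions (from the README)


def to_uncertainty_sets(rows):
    """Group a flat list of feature rows into one uncertainty set per star.

    Assumption (not confirmed by the README): rows are ordered star by star,
    i.e. rows 0..100 belong to the first star, rows 101..201 to the second,
    and so on.
    """
    rows = list(rows)
    expected = N_STARS * N_INSTANCES
    if len(rows) != expected:
        raise ValueError(f"expected {expected} rows, got {len(rows)}")
    return [
        rows[i * N_INSTANCES:(i + 1) * N_INSTANCES]
        for i in range(N_STARS)
    ]


# Placeholder data standing in for one dataset; these are NOT real values.
random.seed(0)
flat = [
    [random.gauss(0.0, 1.0) for _ in range(N_FEATURES)]
    for _ in range(N_STARS * N_INSTANCES)
]

sets = to_uncertainty_sets(flat)
total_instances = sum(len(s) for s in sets)
print(len(sets), len(sets[0]), len(sets[0][0]), total_instances)
```

Grouping by star in this way makes it easy to iterate over one uncertainty set at a time, which is the unit the clustering in [1] works with.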

While [1] provides a greedy and optimistic approach to clustering (GOC) and applies GOC to the synthetic datasets (whose true clusters are known), [2] further applies GOC to real-world orbital action datasets of stars. Please see [1] for further details of these synthetic datasets and experimental results.

[1] A. Okuno and K. Hattori. (2022) "A Greedy and Optimistic Approach to Clustering with a Specified Uncertainty of Covariates", arXiv:2204.08205
[2] K. Hattori, A. Okuno, and I. U. Roederer. (2023) "Finding r-II sibling stars in the Milky Way with the Greedy Optimistic Clustering algorithm", The Astrophysical Journal, 946(1), 48.

BibTeX Citation

If you use these datasets, please cite our papers [1] and [2] with the following BibTeX entries:

@article{Okuno2022,
    year      = {2022},
    publisher = {CoRR},
    volume    = {},
    number    = {},
    pages     = {},
    author    = {Okuno, Akifumi and Hattori, Kohei},
    title     = {A Greedy and Optimistic Approach to Clustering with a Specified Uncertainty of Covariates},
    journal   = {arXiv preprint arXiv:2204.08205},
    note = {submitted.}
}

@article{Hattori2023,
    year      = {2023},
    publisher = {The American Astronomical Society},
    volume    = {946},
    number    = {1},
    pages     = {48},
    author    = {Hattori, Kohei and Okuno, Akifumi and Roederer, Ian U.},
    title     = {Finding r-II Sibling Stars in the Milky Way with the Greedy Optimistic Clustering Algorithm},
    journal   = {The Astrophysical Journal}
}

Contact Information

A. Okuno (ISM and RIKEN AIP, okuno@ism.ac.jp; https://okuno.net/)
K. Hattori (NAOJ, ISM, and U. Michigan, khattori@ism.ac.jp; https://koheihattori.github.io/)


(Left) true classes, (Right) clusters detected by GOC

Repository Descriptions

The repository contains 10 datasets, "seed1.csv" to "seed10.csv". See [1] (Example 1 in Section 2 and Appendix A) for a more detailed description of these datasets. The datasets hold the standardized instances (after removal of a few outliers); they were processed by "/verbose/preprocessing.R". In "/dataset/verbose", you can find intermediate products of the numerical simulation used to generate the synthetic datasets.

This script visualizes the datasets and the clustering results. The visualizations are saved to "/output/visualization".

This script applies GOC to our synthetic datasets. The clustering results are saved to "/output".
