pengchzn/yonder

YONDER

https://img.shields.io/pypi/v/yonder

https://img.shields.io/badge/build-passing-successw

yonder: A Python package for data denoising and reconstruction

Main paper: J-PLUS: A catalogue of globular cluster candidates around the M81/M82/NGC3077 triplet of galaxies

You can get the docs here!

YONDER is a package that uses singular value decomposition to perform low-rank data denoising and reconstruction. It takes a tabular data matrix and an error matrix as input and returns a denoised version of the original dataset as output. The approach enables a more accurate data analysis in the presence of uncertainties. Consequently, this package can be used as a simple toolbox to perform astronomical data cleaning.
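
To illustrate the general idea of low-rank denoising described above, here is a minimal sketch that uses a plain truncated SVD from NumPy. It is not YONDER's algorithm: it ignores the error matrix entirely, and the function name `truncated_svd_denoise` and the synthetic data are assumptions made only for this example.

```python
import numpy as np


def truncated_svd_denoise(X, rank):
    """Return the rank-`rank` truncated-SVD approximation of X.

    Illustrative only: unlike YONDER, this ignores per-entry uncertainties.
    """
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    # Keep only the `rank` largest singular values and their vectors.
    return U[:, :rank] @ np.diag(s[:rank]) @ Vt[:rank, :]


rng = np.random.default_rng(0)

# Synthetic rank-2 signal plus Gaussian noise (made-up data for the sketch).
signal = rng.normal(size=(200, 2)) @ rng.normal(size=(2, 10))
noisy = signal + 0.1 * rng.normal(size=signal.shape)

denoised = truncated_svd_denoise(noisy, rank=2)

# Compare the distance to the noise-free signal before and after truncation.
err_noisy = np.linalg.norm(noisy - signal)
err_denoised = np.linalg.norm(denoised - signal)
print(f"error before: {err_noisy:.3f}, after: {err_denoised:.3f}")
```

Discarding the small singular values removes the part of the noise that lies outside the dominant low-rank subspace, which is why the reconstruction is closer to the underlying signal in this toy case.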

How to install YONDER

YONDER can be installed from PyPI with pip:

pip install yonder

Alternatively, you can clone the repository and install YONDER from inside the yonder directory:

git clone https://github.com/pengchzn/yonder
cd yonder
python setup.py install

How to use YONDER

Here is a simple example of how to use YONDER:

from yonder import yonder
import pandas as pd

# Import the observed data and the corresponding error matrix
X = pd.read_csv('./datasets/Xobs.csv')
Xsd = pd.read_csv('./datasets/Xsd.csv')

# Run YONDER on the data matrix and its error matrix
U, S, V = yonder.yonder(X, Xsd, 2)

# Reconstruct the denoised data from the returned factors
result = U @ S @ V.T

After the YONDER procedure, you can connect any additional algorithms or models to the denoised data.

Here is the distribution of noisy data and the distribution of denoised data in our test case:

https://github.com/pengchzn/yonder/blob/main/tests/figures/Noisy_data.png

https://github.com/pengchzn/yonder/blob/main/tests/figures/Denoised_data.png

In addition, to mimic everyday use, we ran HDBSCAN on both the noisy and the denoised data. The figure below shows the results: YONDER reduces the noise effectively, and the denoised data yield a better classification.

https://github.com/pengchzn/yonder/blob/main/tests/figures/classification.png

You can run the test example in this notebook locally. If you are new to Python or don't know how to run YONDER locally, you can click here to create a new Colaboratory notebook and run YONDER in the cloud.

Requirements

  • Python 3
  • NumPy >= 1.21.5
  • SciPy >= 1.7.3

YONDER relies primarily on a recent version of SciPy for singular value decomposition. Make sure your SciPy installation is up to date before using YONDER.
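
One way to confirm that an environment meets the minimum versions listed above is a short check with the standard library's `importlib.metadata`. This is only a sketch: the helper names `version_tuple`, `meets_minimum`, and `check_requirements` are hypothetical and not part of YONDER, and the version parsing is deliberately simple (it reads only the leading numeric components).

```python
from importlib.metadata import PackageNotFoundError, version

# Minimum versions taken from the requirements list above.
MINIMUM_VERSIONS = {"numpy": "1.21.5", "scipy": "1.7.3"}


def version_tuple(text):
    """Parse the leading numeric components of a version string."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if `installed` is at least `minimum`."""
    return version_tuple(installed) >= version_tuple(minimum)


def check_requirements(minimums=MINIMUM_VERSIONS):
    """Map each package name to (installed version or None, requirement met)."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            report[name] = (None, False)
            continue
        report[name] = (installed, meets_minimum(installed, minimum))
    return report


for name, (installed, ok) in check_requirements().items():
    status = "OK" if ok else "needs install or upgrade"
    print(f"{name}: {installed} ({status})")
```

If a package is reported as missing or too old, upgrading it with pip before running YONDER should resolve the mismatch.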

Copyright & License

2021 Peng Chen (pengchzn@gmail.com) & Rafael S. de Souza (drsouza@shao.ac.cn)

This program is free software: you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.

References

Citing yonder

If you want to cite yonder, please use the following citation.

Software Citation: Peng Chen, & Rafael S. de Souza. (2022). pengchzn/yonder: RNAAS Submission (1.1). Zenodo. https://doi.org/10.5281/zenodo.6321520