DUPLEX with external test set #50

Closed
zecojls opened this issue Jan 9, 2023 · 1 comment

Comments

zecojls commented Jan 9, 2023

Hi Leo,

Is it possible to extend the DUPLEX algorithm to work with external test sets? The idea is to sample from a larger primary SSL, given a small secondary SSL, where both share the same underlying spectral properties. The required parameters would be the primary X, the secondary X, and the subset size (n).

Cheers

zecojls commented Jan 9, 2023

I've been playing with this idea and found a way to search for similar samples. Do you think that this makes sense?

  1. Run PCA on primary and get scores (P).
  2. Project secondary onto primary space and get scores (S).
  3. Calculate the covariance (C) of the first p principal component scores of the secondary, S (I set p = 20).
  4. Compute the Mahalanobis distance of each row of P from the origin, using the covariance C estimated from S:
    mahalanobis(P, center = FALSE, cov = C)
  5. Keep the n primary samples with the smallest Mahalanobis distance.
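
The steps above can be sketched in base R roughly as follows. This is only an illustration under a few assumptions: the function name `select_neighbors` and its argument names are hypothetical (they are not part of any package), "top n" is read as the n smallest distances, and both matrices are assumed to have the same columns in the same order. With `center = FALSE`, distances are measured from the origin of the primary PCA space, which is the primary mean; passing `center = colMeans(S)` instead would measure them from the secondary centroid.

```r
# Hypothetical helper (not part of any package): picks the n primary samples
# closest to the secondary set in the primary PCA space.
select_neighbors <- function(X_primary, X_secondary, n, p = 20) {
  X_primary <- as.matrix(X_primary)
  X_secondary <- as.matrix(X_secondary)

  # Cap p so that the covariance of the secondary scores can be inverted
  p <- min(p, ncol(X_primary), nrow(X_primary) - 1, nrow(X_secondary) - 1)

  # 1. PCA on the primary set; keep the first p score columns (P)
  pca <- prcomp(X_primary, center = TRUE, scale. = FALSE)
  P <- pca$x[, seq_len(p), drop = FALSE]

  # 2. Project the secondary set onto the primary PCA space (S)
  S <- predict(pca, newdata = X_secondary)[, seq_len(p), drop = FALSE]

  # 3. Covariance of the secondary scores
  C <- cov(S)

  # 4. Squared Mahalanobis distance of each primary sample, no centering
  #    (as in the post); center = colMeans(S) would be an alternative
  d <- mahalanobis(P, center = FALSE, cov = C)

  # 5. Keep the n samples with the smallest distance
  idx <- order(d)[seq_len(min(n, length(d)))]
  list(index = idx, distance = d[idx])
}

# Usage on simulated data (random numbers, not real spectra)
set.seed(1)
X_primary <- matrix(rnorm(200 * 10), nrow = 200)
X_secondary <- matrix(rnorm(30 * 10, sd = 0.5), nrow = 30)
sel <- select_neighbors(X_primary, X_secondary, n = 50, p = 5)
head(sel$index)
```

Note that `mahalanobis()` returns squared distances; since only the ranking is used here, that does not change which samples are selected.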

PCA of secondary (red) projected onto primary space (black):
image

After running the algorithm, neighbor samples (n=15000) are selected:
image
