Organization of clustering and correlation analytics #13
Comments
Great minds think alike :)
For all the reasons you just stated, I am making a new data object that
will hold all cluster data. Specifically, it will hold how many clusters, if
any, each gene pair has, and for each cluster it will hold a sample mask
describing the cluster. So in the future, once we get mixture models into
the new KINC, correlation analytics will take two data object inputs, an
emx and a ccm (cluster data object), and produce a cmx. Your mixture model
analytic will take an emx as input and output a ccm.
Workflow:
*.emx --(mixture model analytic)--> *.ccm
*.emx --(correlation analytic)--> *.cmx
*.ccm --^
Once I release KINC version 3.1 it will have the new cluster matrix data
type. So what you need to study is the expression matrix and cluster matrix
data types because your analytic will be using those as input/output.
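To make the cluster data idea concrete, here is a minimal Python sketch. It is only an illustration: `ClusterMatrix`, `set_pair`, and `correlate` are hypothetical names, not KINC's actual API, and the in-memory dict layout is an assumption. It shows each gene pair holding zero or more sample masks, and a correlation step that takes an emx plus a ccm and produces one value per cluster.

```python
from dataclasses import dataclass, field
from math import sqrt

Mask = list[bool]  # one flag per sample: True if the sample is in the cluster


@dataclass
class ClusterMatrix:
    """Hypothetical stand-in for a ccm: per gene pair, a list of sample masks."""
    n_samples: int
    pairs: dict[tuple[int, int], list[Mask]] = field(default_factory=dict)

    def set_pair(self, i: int, j: int, masks: list[Mask]) -> None:
        if any(len(m) != self.n_samples for m in masks):
            raise ValueError("every mask needs one entry per sample")
        self.pairs[(min(i, j), max(i, j))] = masks

    def masks(self, i: int, j: int) -> list[Mask]:
        # A pair with no entry simply has zero clusters.
        return self.pairs.get((min(i, j), max(i, j)), [])


def pearson(x: list[float], y: list[float]) -> float:
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # correlation is undefined for constant input
    return sxy / sqrt(sxx * syy)


def correlate(emx: list[list[float]], ccm: ClusterMatrix) -> dict:
    """emx + ccm -> cmx: one correlation per cluster of each gene pair."""
    cmx = {}
    for (i, j), masks in ccm.pairs.items():
        cmx[(i, j)] = [
            pearson([v for v, keep in zip(emx[i], m) if keep],
                    [v for v, keep in zip(emx[j], m) if keep])
            for m in masks
        ]
    return cmx


# Toy data (made up for illustration): two genes, six samples, two clusters.
emx = [
    [1.0, 2.0, 3.0, 9.0, 8.0, 7.0],  # gene 0
    [2.0, 4.0, 6.0, 1.0, 2.0, 3.0],  # gene 1
]
ccm = ClusterMatrix(n_samples=6)
ccm.set_pair(0, 1, [[True] * 3 + [False] * 3, [False] * 3 + [True] * 3])
cmx = correlate(emx, ccm)
# The first cluster correlates positively, the second negatively.
```

Keeping the masks in their own object means the correlation step never needs to know which clustering method produced them.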
Sincerely,
Joshua Burns
On Tue, Dec 19, 2017 at 6:41 AM, Ben Shealy wrote:
As I'm looking at how to implement mixture model clustering, I'm beginning
to see a multi-stage pipeline with options at several points:
*.emx ---> clustering ---> correlation ---> *.cmx
clustering:
- none
- k-means
- GMM
correlation:
- Pearson
- Spearman
- ...
So I'm trying to figure out how to best implement this pipeline for the
long-term. It looks like KINCv1 can combine clustering with any correlation
method, with minimal duplication. Perhaps we will need to create a new data
type for the "augmented" expression matrix? It would parallel the
PairWiseClusterList from KINCv1. Then the clustering and correlation
analytics could be kept separate and the user could simply use the pipeline
illustrated above.
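One way to picture keeping the two stages separate is the Python sketch below. It is a rough sketch under stated assumptions, not KINC code: the registries `CLUSTERING` and `CORRELATION` and the function `run_pipeline` are hypothetical names. Only the "none" clustering option is implemented here; k-means and GMM would be registered the same way, as functions that return sample masks.

```python
from math import sqrt
from typing import Callable, Sequence

Vector = Sequence[float]
Mask = list[bool]
Clusterer = Callable[[Vector, Vector], list[Mask]]
Correlator = Callable[[Vector, Vector], float]


def no_clustering(x: Vector, y: Vector) -> list[Mask]:
    """The 'none' option: a single cluster containing every sample."""
    return [[True] * len(x)]


def pearson(x: Vector, y: Vector) -> float:
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # undefined for constant input
    return sxy / sqrt(sxx * syy)


def ranks(v: Vector) -> list[float]:
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(v)), key=lambda k: v[k])
    out = [0.0] * len(v)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and v[order[end + 1]] == v[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1
        for k in order[pos:end + 1]:
            out[k] = avg
        pos = end + 1
    return out


def spearman(x: Vector, y: Vector) -> float:
    # Spearman correlation is Pearson correlation computed on ranks.
    return pearson(ranks(x), ranks(y))


# Hypothetical registries: each stage is chosen independently by name.
CLUSTERING: dict[str, Clusterer] = {"none": no_clustering}
CORRELATION: dict[str, Correlator] = {"pearson": pearson, "spearman": spearman}


def run_pipeline(emx, clustering: str = "none", correlation: str = "pearson"):
    cluster = CLUSTERING[clustering]
    corr = CORRELATION[correlation]
    cmx = {}
    for i in range(len(emx)):
        for j in range(i + 1, len(emx)):
            x, y = emx[i], emx[j]
            cmx[(i, j)] = [
                corr([a for a, keep in zip(x, m) if keep],
                     [b for b, keep in zip(y, m) if keep])
                for m in cluster(x, y)
            ]
    return cmx


# Toy data (made up): gene 1 is a monotonic but nonlinear function of gene 0.
emx = [[1.0, 2.0, 3.0, 4.0], [1.0, 4.0, 9.0, 16.0]]
by_pearson = run_pipeline(emx, "none", "pearson")
by_spearman = run_pipeline(emx, "none", "spearman")
```

Because every clustering option returns masks and every correlation option consumes plain vectors, adding a new method to one stage requires no change to the other.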
Ah, so that's how the *.ccm fits into all of this. Excellent!
This is some excellent communication! See their grand plan, Ben? It's awesome. We gotta get past the default so we can use the ecosystem.