correlation between celltypes and age #1845

Open
5 tasks
FADHLyemen opened this issue May 17, 2021 · 5 comments
Labels
Enhancement ✨, good first issue

Comments

@FADHLyemen

  • Additional function parameters / changed functionality / changed defaults?
  • New analysis tool: A simple analysis tool you have been using and are missing in sc.tools?
  • New plotting function: A kind of plot you would like to see in sc.pl?
  • External tools: Do you know an existing package that should go into sc.external.*?
  • Other?

...

How can I compute the correlation between cell types and age in scanpy?

@giovp added the "good first issue" label on May 19, 2021
@FADHLyemen
Author

@giovp I want to make a correlation plot between cell types and the continuous variables stored in .obs.

@Koncopd
Member

Koncopd commented May 25, 2021

I would say this is not a scanpy question.
It is not clear what you mean by the correlation between a categorical variable with multiple categories and a continuous variable.
If you have a binary categorical variable, you can calculate the point-biserial correlation, but for a multi-category variable you would have to discretize your continuous variable and run a chi-squared test. You can also try ANOVA. If you think you know which variables are dependent and which are independent, you can use logistic regression and look at its coefficients, or try ANCOVA.
Some additional information with examples:
https://datascience.stackexchange.com/questions/893/how-to-get-correlation-between-two-categorical-variable-and-a-categorical-variab
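
For readers following along, here is a rough sketch of three of the options named above (point-biserial correlation, one-way ANOVA, and a chi-squared test on discretized age) using scipy.stats. The `obs` table, its column names, the cell type labels, and the age bins are all made up for illustration; in practice something like `adata.obs` would take its place.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Synthetic stand-in for adata.obs: one row per cell (all values are made up).
rng = np.random.default_rng(0)
obs = pd.DataFrame({
    "celltype": rng.choice(["B", "T", "NK"], size=300),
    "age": rng.uniform(20, 80, size=300),
})

# Binary case: point-biserial correlation between "is a T cell" (0/1) and age.
is_t = (obs["celltype"] == "T").astype(int)
r_pb, p_pb = stats.pointbiserialr(is_t, obs["age"])

# Multi-category case, option 1: one-way ANOVA of age across cell types.
groups = [g["age"].to_numpy() for _, g in obs.groupby("celltype")]
f_stat, p_anova = stats.f_oneway(*groups)

# Multi-category case, option 2: discretize age (arbitrary bins),
# then run a chi-squared test on the cell type x age-bin contingency table.
age_bin = pd.cut(obs["age"], bins=[20, 40, 60, 80], include_lowest=True)
table = pd.crosstab(obs["celltype"], age_bin)
chi2, p_chi2, dof, expected = stats.chi2_contingency(table)

print(f"point-biserial r={r_pb:.3f}, ANOVA p={p_anova:.3f}, chi2 p={p_chi2:.3f}")
```

Because the synthetic cell types are drawn independently of age, none of these tests should be expected to show a strong association here; the point is only the shape of the calls.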

@FADHLyemen
Author

@Koncopd It is a correlation between two continuous variables, as celltypes is continuous and age is also continuous. How can I correlate X with continuous variables stored in .obs?

@Koncopd
Member

Koncopd commented May 27, 2021

Are celltypes really continuous? What does this variable look like?
For continuous variables you can do:
from scipy.stats import pearsonr
r, _ = pearsonr(adata.obs["celltypes"], adata.obs["age"])
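
As for the earlier question about correlating X with a continuous .obs column, one possible sketch is to compute a Pearson coefficient per column of the expression matrix. The helper name `pearson_columns` and the synthetic data are hypothetical; with an AnnData object, a dense `adata.X` (or `adata.X.toarray()` if it is sparse) and `adata.obs["age"].to_numpy()` would presumably take the place of `X` and `age`.

```python
import numpy as np

# Synthetic stand-ins: X is cells x genes, age has one value per cell.
rng = np.random.default_rng(0)
n_cells, n_genes = 200, 5
age = rng.uniform(20, 80, size=n_cells)
X = rng.normal(size=(n_cells, n_genes))
X[:, 0] += 0.1 * age  # gene 0 is made age-dependent on purpose


def pearson_columns(X, y):
    """Pearson correlation of every column of X with the vector y."""
    Xc = X - X.mean(axis=0)  # center each gene
    yc = y - y.mean()        # center the covariate
    cov = Xc.T @ yc
    norm = np.sqrt((Xc ** 2).sum(axis=0)) * np.sqrt((yc ** 2).sum())
    return cov / norm


r = pearson_columns(X, age)
print(np.round(r, 3))
```

This gives the same coefficient as calling `pearsonr` column by column, just without the p-values; a column with zero variance would produce a division by zero and needs to be filtered out first.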

@FADHLyemen
Author

@Koncopd It is the number of cell types per cohort, or the relative_frequencies per group:
image

Is this something researchers are looking for? Or do you think this is not a good approach, since the cell counts depend on how many cells there are per sample?
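
One way to reduce the dependence on the number of cells per sample is to work with per-sample relative frequencies and correlate those with age across samples. The following is only a sketch under assumptions: the column names `sample`, `celltype`, and `age` are hypothetical, the toy table stands in for `adata.obs`, and each sample is assumed to have a single age.

```python
import pandas as pd
from scipy.stats import pearsonr

# Toy stand-in for adata.obs: one row per cell (values are made up).
obs = pd.DataFrame({
    "sample": ["s1"] * 4 + ["s2"] * 4 + ["s3"] * 4 + ["s4"] * 4,
    "celltype": ["T", "T", "T", "B",
                 "T", "T", "B", "B",
                 "T", "B", "B", "B",
                 "B", "B", "B", "B"],
    "age": [25] * 4 + [40] * 4 + [55] * 4 + [70] * 4,
})

# Relative frequency of each cell type within each sample (each row sums to 1),
# so samples with more cells do not dominate.
props = pd.crosstab(obs["sample"], obs["celltype"], normalize="index")

# One age value per sample, aligned to the rows of props.
age = obs.groupby("sample")["age"].first().loc[props.index]

# Pearson correlation of each cell type's proportion with age, across samples.
corr = {}
for ct in props.columns:
    r, p = pearsonr(props[ct], age)
    corr[ct] = r

print(props)
print(corr)
```

Keep in mind that proportions within a sample sum to 1, so an increase in one cell type forces a decrease in the others; the per-type coefficients are therefore not independent of each other.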

3 participants