Interaction term in nebula #16
Hi AdelynTsai,
Thank you for your question.
Yes, you can do that.
To include the interaction term between sqrtCAA and biochemical in the
model, you can try the following
cov.mm <- model.matrix(~sqrtCAA*biochemical + Batch_Flowcell + Gender + Age_At_Death, data=meta.data.f)
Then, you should see an additional column sqrtCAA:biochemical in cov.mm
corresponding to the interaction term.
Best regards,
Liang
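A minimal sketch of the difference between the two design matrices, using base R only. The toy data frame `meta` and every value in it are made up for illustration; only the formulas follow the thread.

```r
# Toy sample-level metadata; all values are invented for illustration.
set.seed(1)
meta <- data.frame(
  sqrtCAA        = runif(12),
  biochemical    = rnorm(12),
  Batch_Flowcell = factor(rep(c("A", "B", "C"), 4)),
  Gender         = factor(rep(c("F", "M"), 6)),
  Age_At_Death   = round(runif(12, 70, 95))
)

# Main effects only.
mm_main <- model.matrix(~ sqrtCAA + biochemical + Batch_Flowcell + Gender +
                          Age_At_Death, data = meta)

# `*` expands to both main effects plus their interaction.
mm_int <- model.matrix(~ sqrtCAA * biochemical + Batch_Flowcell + Gender +
                         Age_At_Death, data = meta)

# The only extra column is the interaction term.
extra_col <- setdiff(colnames(mm_int), colnames(mm_main))
print(extra_col)

# That column is the element-wise product of the two variables.
is_product <- isTRUE(all.equal(unname(mm_int[, extra_col]),
                               meta$sqrtCAA * meta$biochemical))
print(is_product)
```

The resulting matrix would then be passed as pred= in the nebula call, as in the original code.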
On Mon, Jan 30, 2023 at 4:18 PM AdelynTsai wrote:
Hi,
Thank you for the tool. It's very helpful!
I'm wondering if it's possible to include an interaction term in the
nebula, and if so, how should I code it?
Here's how I code now without the interaction:
cov.mm <- model.matrix(~sqrtCAA + Batch_Flowcell + Gender + Age_At_Death, data=meta.data.f)
nebulafit <- nebula(count=nebula.mm.f, id=meta.data.f$subject, pred=cov.mm, offset=total)
sqrtCAA is the phenotype of interest (it's a continuous phenotype),
Batch_Flowcell + Gender + Age_At_Death are the fixed covariates, and
subject is the random covariate.
However, I also have some biochemical measures and I want to know how the
effect of biochemical measures together with sqrtCAA can affect expression.
I'd like to do this analysis with Nebula. Please let me know if this is
possible.
Thank you!
Hi Liang, Thank you again!
Hi AdelynTsai,
Please see the interpretation of logFC in my answer to the previous
question
#14 .
Best regards,
Liang
On Thu, Feb 2, 2023 at 2:56 PM AdelynTsai wrote:
Hi Liang,
Thank you for your response!
One other question I have is that given that my phenotype (sqrtCAA) is a
continuous variable, can I interpret the logFC_sqrtCAA in the summary
output as the correlation coefficient (i.e. beta/estimate)?
Thank you again!
Hi Liang, Thank you so much again!
Hi AdelynTsai,
I need the following information to better understand what's going on.
Are sqrtCAA and biochem cell-level or sample-level variables (sample-level
variables share the same value across all cells from a sample)? Is biochem
a binary variable or continuous?
How many samples and cells are there in your data? And how many columns in
your design matrix cov.mm? What do you get if you put both variables in the
model without the interaction term?
Best regards,
Liang
On Thu, Mar 2, 2023 at 8:34 PM AdelynTsai wrote:
Hi Liang,
Thanks for your previous answers. I've started doing interaction analysis
using Nebula. As previously mentioned, I used cov.mm <-
model.matrix(~sqrtCAA*biochem + Batch_Flowcell + Gender + Age_At_Death,
data=meta.data.f).
I have some questions about interpreting the results from the interaction
analysis. I attached the results from 2 genes here from DEG analysis with
sqrtCAA alone, with biochemical measures alone (cd31_tx_std &
ab40_tbs_ln_std) and with interactions between sqrtCAA x biochemical
measures.
I know I should be looking specifically at the interaction results in
the sqrtCAA:biochem column, but I'm wondering why the logFC_sqrtCAA
and logFC_biochem from the interaction analysis, as well as the
corresponding se and p, are so different from the logFC, se and p I got
when I analyzed sqrtCAA and the biochemical measures alone?
Thank you so much again!
Nebula_interaction_Q.xlsx
<https://github.com/lhe17/nebula/files/10875195/Nebula_interaction_Q.xlsx>
Hi Liang,
The number of samples and cells differs by biochemical measure and cell type. For the example I gave, Astrocyte cd31_tx has 78 samples and 17722 cells. For microglia ab40_tbs, there are 78 samples and 18409 cells. In general, I have 74 to 78 samples and 971 to 44151 cells across all the cell types.
For the design matrix, when I used model.matrix(~sqrtCAA*biochem + Batch_Flowcell + Gender + Age_At_Death, data=meta.data.f), there are 9 columns corresponding to the variables I gave in the model.matrix (I have 5 different batch_flowcell levels, which gives 4 batch_flowcell columns in the design matrix). On the other hand, when I put both variables in the model without the interaction term, which is model.matrix(~sqrtCAA + biochem + Batch_Flowcell + Gender + Age_At_Death, data=meta.data.f), I have 8 columns. It seems that when I used the interaction term in the model.matrix, I got a column sqrtCAA:biochem, which is the product of sqrtCAA x biochem. I included the two design matrices from astrocyte_cd31tx in the 2nd and 3rd tabs of the Excel file attached here.
As for the results when I put both variables in the model without the interaction term, I put them in the first tab of the Excel file. There is no additional sqrtCAA:biochem column if I don't include the interaction term.
Thank you for your help.
Hi AdelynTsai,
Thank you for your information.
Based on the summary statistics and information you shared, my
interpretation is that biochem and sqrtCAA have a significant
interaction effect on the gene expression. sqrtCAA modulates the effect
of biochem. For example, biochem has a strong positive effect (i.e.,
higher biochem increases the expression) on HSPD1 in Ast when sqrtCAA is
small, but this effect moves towards negative when sqrtCAA becomes
large. In the model without the interaction term, the logFC of biochem
gives a marginal effect of biochem. Because the positive effect of
biochem in the sqrtCAA_low group and the negative effect in the
sqrtCAA_high group cancel out when the two groups are considered
together, the overall marginal effect of biochem is not significant.
Best regards,
Liang
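The modulation described above can be made concrete with a little arithmetic. In a log-linear model with an interaction term, the slope of biochem at a given sqrtCAA is the biochem main effect plus the interaction coefficient times sqrtCAA, so the main-effect logFC of biochem is its slope only at sqrtCAA = 0. A minimal sketch, assuming a log link; the two coefficient values are hypothetical and are not taken from the attached spreadsheet:

```r
# Hypothetical log-scale estimates, chosen only for illustration.
b_biochem <- 0.8    # main effect: slope of biochem when sqrtCAA == 0
b_inter   <- -0.5   # coefficient of the sqrtCAA:biochem column

# Slope of biochem on the log scale at a given value of sqrtCAA.
conditional_slope <- function(sqrtCAA) b_biochem + b_inter * sqrtCAA

s <- c(0, 1, 2, 3)
effects <- data.frame(sqrtCAA = s, biochem_effect = conditional_slope(s))
print(effects)

# Value of sqrtCAA at which the effect of biochem changes sign.
sign_change <- -b_biochem / b_inter
print(sign_change)
```

With these made-up numbers the effect of biochem is positive for small sqrtCAA and negative for large sqrtCAA. A model without the interaction term reports a single slope averaged over the observed sqrtCAA values, which is why its logFC, se and p can differ substantially from the main-effect row of the interaction model.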
On 3/3/2023 6:46 PM, AdelynTsai wrote:
Hi Liang,
sqrtCAA and biochem are sample-level variables. Both of them are
continuous variables.
Samples and cells differ by the biochem measures and cell type. For
the example I gave, Astrocyte cd31_tx has 78 samples and 17722 cells.
For microglia ab40_tbs, there are 78 samples and 18409 cells. In
general, I've a range from 74~78 samples and 971 cells to 44151 cells
among all the cell types I have.
For the design matrix, when I used model.matrix(~sqrtCAA*biochem +
Batch_Flowcell + Gender + Age_At_Death, data=meta.data.f), there are 9
columns corresponding to the variables I gave in the model.matrix
(I've 5 different batch_flowcell that makes it 4 different
batch_flowcell columns in the design matrix). On the other hand, when
I put both variables in the model without the interaction term, which
is model.matrix(~sqrtCAA + biochem + Batch_Flowcell + Gender +
Age_At_Death, data=meta.data.f), I've 8 columns. It seems like when I
used the interaction term in the model.matrix, I got a column
sqrtCAA:biochem which is the product of sqrtCAA x biochem. I included
the two design matrices in the 2nd and 3rd tab of the excel file
attached here.
As for the results when I put both variables in the model without the
interaction term, I put them in the first tab in the excel file.
There're no additional sqrtCAA:biochem columns if I don't include the
interaction term.
Thank you for your help.
Nebula_interaction_Q.xlsx
<https://github.com/lhe17/nebula/files/10884922/Nebula_interaction_Q.xlsx>
Hi AdelynTsai,
I think that the situation for NR4A3 is not very different from the
previous one except that sqrtCAA now has a significant marginal effect as
well. The interaction effect in the case of VCAM1 is the opposite. This can
be illustrated in the following example. Marginally, there is no
correlation between G (expression) and S (sqrtCAA) or B (biochem). However,
for those with low S=0, G is anti-correlated with B (strong negative
effect), and for those with high S=1, G is positively correlated with B
(strong positive effect of the interaction term).
G S B
1 0 -1
0 0 0
-1 0 1
-1 1 -1
0 1 0
1 1 1
Best regards,
Liang
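The toy table above can be checked numerically. A minimal sketch that uses ordinary least squares (lm) instead of nebula's negative binomial mixed model; that substitution is an assumption made only to keep the example self-contained in base R:

```r
# The six rows of the toy table: G (expression), S (sqrtCAA), B (biochem).
toy <- data.frame(
  G = c(1, 0, -1, -1, 0, 1),
  S = c(0, 0, 0, 1, 1, 1),
  B = c(-1, 0, 1, -1, 0, 1)
)

# Main effects only: the opposite slopes in the two S groups cancel out.
fit_main <- lm(G ~ S + B, data = toy)

# With the interaction term: B has its own slope within each S group.
fit_int <- lm(G ~ S * B, data = toy)

print(round(coef(fit_main), 10))
print(round(coef(fit_int), 10))
```

In this toy data the main-effects fit gives slopes of 0 for both S and B, while the interaction fit gives -1 for B (the slope when S = 0) and 2 for S:B, so the slope of B when S = 1 is -1 + 2 = 1. A main effect and the interaction coefficient can therefore have opposite signs without any contradiction.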
On Tue, Mar 7, 2023 at 12:26 AM AdelynTsai wrote:
Hi Liang,
Thank you so much for your answer. That's really helpful.
Looking at the results in more detail and following your interpretation,
I found some results that are hard for me to interpret. I give examples
in the attached Excel file.
For NR4A3 from Fib x cd31, how can sqrtCAA and cd31 both have a strong
positive effect on its expression while the logFC of sqrtCAAxcd31 is
strongly negative?
For VCAM1 from Fib x cldn5, sqrtCAA has a moderately positive effect and
cldn5 has a strong negative effect on its expression, so how does that
turn into a strongly positive logFC of sqrtCAAxcldn5?
I much appreciate your help!
Nebula_interaction_question.xlsx
<https://github.com/lhe17/nebula/files/10903815/Nebula_interaction_question.xlsx>