Results with bin.size.limit set to 12 and 72 were the same #26

Closed
YuZhang-learner opened this issue Jul 25, 2022 · 12 comments

Comments

@YuZhang-learner

Hi,
When I set the bin.size.limit parameter to 12 and to 72, the results were the same and very different from pNST. However, the pNST result is more consistent with the inferences from my other results.

image

Therefore, I would like to know whether changing bin.size.limit to 36 or 48 could bring the result closer to the pNST results.

In addition, if I want to use the pNST results, how can I get the contributions of the various processes for rare or abundant species, as in iCAMP pipeline step 15 (summarize core, rare, and other taxa)?

@DaliangNing
Owner

Based on the Bin-12 and Bin-72 results, bin.size.limit values of 36 and 48 will probably give very similar results. If you are targeting stochasticity levels similar to the NST results, you may try setting bin.size.limit to 96 or even 144. Or, if you have environmental measurements, try using key environmental factors to find a reasonable bin.size.limit with the phylogenetic signal test, following Steps 7 and 8 in the example code https://github.com/DaliangNing/iCAMP1/blob/master/Examples/SimpleOTU/icamp.test.r.

But in some studies, iCAMP results can be quite different from NST. After all, they are based on very different null model algorithms: one is based on phylogenetic bins, and the other is on the entire community. Both have special assumptions, advantages, and limitations. You may use the one more reasonable and consistent with other evidence.

For the NST of a subset of taxa in the communities, you may extract sub-community matrices (rare vs. abundant) and then calculate NST for each of them separately, which means performing the null model randomization for each type of subcommunity separately. This is the easier option: just apply pNST to each subcommunity matrix. An alternative is to perform the null model randomization for the entire communities but calculate the null betaMNTD for each type of subcommunity separately, and then compare it with the observed betaMNTD values to get the NST for each type of subcommunity. I will try to write a function for this in a few weeks, after I finish some urgent tasks.
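The first option (splitting the community matrix before running pNST) could look roughly like the sketch below. It uses only base R on a toy matrix; the counts and the two relative-abundance cutoffs are assumptions chosen for illustration, not values recommended by iCAMP or NST.

```r
# Toy community matrix: rows are samples, columns are taxa (made-up counts).
comm <- matrix(
  c(500, 300, 150, 40,  9, 1,
    450, 350, 120, 60, 15, 5,
    600, 250, 100, 30, 18, 2),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("S", 1:3), paste0("OTU", 1:6))
)

# Mean relative abundance of each taxon across samples.
rel.abund <- colMeans(comm / rowSums(comm))

# Assumed cutoffs for this toy example only.
abundant.cutoff <- 0.10
rare.cutoff <- 0.02

# drop = FALSE keeps the result a matrix even if only one taxon is selected.
comm.abundant <- comm[, rel.abund >= abundant.cutoff, drop = FALSE]
comm.rare <- comm[, rel.abund < rare.cutoff, drop = FALSE]

colnames(comm.abundant)
colnames(comm.rare)
```

Each sub-matrix would then be passed as the `comm` argument of `NST::pNST`, with the other arguments kept as in the full-community run, so that the randomization is done separately for each subcommunity.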

@YuZhang-learner
Author

@DaliangNing Thank you for your advice. I want to know, does this parameter have to be a multiple of 12? How about 60 or 84? Here are some experiments with my data.

bin.size.limit=48
image

bin.size.limit=72

image

bin.size.limit=96

image

result of pNST
image

Can I set the parameter to 60 or 84 to bring the result closer to pNST? I want to reproduce the trend in the pNST results, where the stochasticity of the first three rows is larger than that of the last three rows.

@DaliangNing
Owner

Yes, you can. When phylogenetic signal test is not applicable, pNST is a practical way to optimize bin.size.limit, although unfortunately, it is still time-consuming now.
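One possible way to make this comparison systematic, once iCAMP has been run at several bin.size.limit values, is sketched below. All numbers are made-up placeholders, and the two criteria (mean absolute difference and Spearman correlation across groups) are assumptions about what "closer to pNST" could mean, not a procedure defined by iCAMP or NST.

```r
# Hypothetical stochasticity per group from pNST (made-up values).
pnst.stoch <- c(A = 0.70, B = 0.65, C = 0.40, D = 0.35)

# Hypothetical stochasticity per group from iCAMP at each candidate
# bin.size.limit (made-up values; row names are the candidate settings).
icamp.stoch <- rbind(
  "60" = c(A = 0.55, B = 0.56, C = 0.54, D = 0.53),
  "72" = c(A = 0.60, B = 0.58, C = 0.50, D = 0.47),
  "84" = c(A = 0.66, B = 0.63, C = 0.45, D = 0.40)
)

# Criterion 1: mean absolute difference from pNST across groups.
mean.abs.diff <- apply(icamp.stoch, 1, function(x) mean(abs(x - pnst.stoch)))

# Criterion 2: does the trend across groups follow the same ranking?
trend.rho <- apply(icamp.stoch, 1, function(x) {
  cor(x, pnst.stoch, method = "spearman")
})

# Candidate with the smallest difference from pNST.
best <- names(which.min(mean.abs.diff))
best
```

Because each iCAMP run is time-consuming, a short list of candidates is probably more practical than a fine grid.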

@YuZhang-learner
Author

@DaliangNing Does this parameter of bin.size.limit have to be a multiple of 12?

@DaliangNing
Owner

No need. You may use any number. Using 12 as the interval is just my preference; it gave enough difference between two options when I explored the simulated data before.

@YuZhang-learner
Author

YuZhang-learner commented Aug 5, 2022

@DaliangNing Thanks for your advice; I got a good iCAMP result. I also ran pNST and NCM analyses on the same dataset, but the results left me a little confused.
From group A to group D, HoS increases and the contribution of deterministic processes increases, but DL decreases and the m value of NCM also decreases. At the same time, my niche breadth index also decreased from A to D. Homogeneous selection indicates a specific habitat that selects specific microorganisms (as evidenced by the narrow niche breadth), which are less likely to survive in another habitat, thus decreasing the migration rate (m value). But shouldn't DL increase in that case? Many articles say that a decrease in m means an increase in DL. So I don't quite understand why DL decreases when HoS increases and the m value decreases.

image

@DaliangNing
Owner

The iCAMP result is about the 'relative importance' of assembly processes. The dispersal rate can be lower and dispersal limitation more obvious, yet dispersal limitation can still be relatively less important because selection increased.
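A small arithmetic illustration of this point, using made-up "absolute strengths" in arbitrary units (a conceptual simplification, not how iCAMP computes its output): dispersal limitation can grow in absolute terms while its relative share shrinks, as long as selection grows faster.

```r
# Hypothetical absolute strengths of each process (arbitrary units, made up).
group.A <- c(HoS = 2, DL = 4, Other = 4)
group.D <- c(HoS = 10, DL = 5, Other = 5)

# Relative importance: each process as a share of the total within a group.
rel.A <- group.A / sum(group.A)
rel.D <- group.D / sum(group.D)

# DL is stronger in D than in A in absolute terms...
group.D["DL"] > group.A["DL"]
# ...but accounts for a smaller share, because HoS increased much more.
rel.D["DL"] < rel.A["DL"]
```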

@YuZhang-learner
Author

An error occurred when I ran the following code:
pnstout=NST::pNST(comm=otu, pd.desc=pd.big$pd.file, pd.wd=pd.big$pd.wd, pd.spname=pd.big$tip.label, group=treat.use, abundance.weighted=TRUE, rand=rand.time, phylo.shuffle=TRUE, nworker=nworker, output.rand = TRUE, SES=FALSE, RC=FALSE)

Error in Ops.data.frame(group[, 1], grp.lev[i]) :
‘==’ only defined for equally-sized data frames
Calls: -> which -> Ops.data.frame
Execution halted

I don't know why. Was it because my "group" (i.e., "treat.use") contains only one group?

@DaliangNing
Owner

pNST should be able to deal with one-level group input. Please send me (ningdaliang@ou.edu) your input files and R code to debug.

@DaliangNing
Owner

Yuzhang has figured out that the problem was caused by the input format. The input for 'group' (i.e., 'treat.use' in this case) should be a 'data.frame' in R.
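A minimal sketch of preparing such an input in base R is shown below. The sample names, the column name `treatment`, and the single group label are placeholders; the assumption here is that the row names of the group table should match the sample (row) names of the community matrix.

```r
# Toy community matrix: rows are samples, columns are taxa (made-up counts).
otu <- matrix(1:12, nrow = 4,
              dimnames = list(paste0("S", 1:4), paste0("OTU", 1:3)))

# One-level grouping: every sample belongs to the same group.
treat.use <- data.frame(
  treatment = rep("A", nrow(otu)),
  row.names = rownames(otu),
  stringsAsFactors = FALSE
)

# Quick checks before passing treat.use as the 'group' argument.
class(treat.use)                                  # plain "data.frame"
identical(rownames(treat.use), rownames(otu))     # sample names line up

# If the grouping comes from a larger table, keep it as a one-column
# data.frame with drop = FALSE instead of letting it collapse to a vector.
meta <- data.frame(treatment = rep("A", 4), site = c("x", "x", "y", "y"),
                   row.names = rownames(otu))
treat.from.meta <- meta[, "treatment", drop = FALSE]
```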

Below is our discussion about the significance test.
"In this case, different groups can be considered under different metacommunities and the original dataset is too large to be processed together, thus splitting data by group becomes a good choice. For the significance test, you can get bootstrapping results for each group using the function nst.boot (set out.detail=TRUE; in the output, check the element detail - NST.boot). Then, you can calculate the difference of bootstrapping results between two groups, and if the observed NST in group A > group B, the proportion of bootstrapping NST A < B is the P value based on bootstrapping. Similarly, iCAMP has a function icamp.boot to get bootstrapping results for each group."
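The P value described above could be computed along these lines. The bootstrapping vectors here are simulated stand-ins for the per-group output of `nst.boot`, and `boot.p` is a hypothetical helper, not a function from NST or iCAMP. Pairing the bootstrap draws of the two groups by index is also an assumption of this sketch.

```r
# Hypothetical helper: proportion of bootstrap draws that contradict the
# observed direction of the difference between two groups.
boot.p <- function(obs.a, obs.b, boot.a, boot.b) {
  stopifnot(length(boot.a) == length(boot.b))
  if (obs.a > obs.b) {
    mean(boot.a < boot.b)
  } else {
    mean(boot.a > boot.b)
  }
}

# Simulated stand-ins for bootstrapped NST values of two groups (made up).
set.seed(1)
boot.A <- rnorm(1000, mean = 0.70, sd = 0.05)
boot.B <- rnorm(1000, mean = 0.55, sd = 0.05)

# Observed NST values (made up to match the simulated means).
obs.A <- 0.70
obs.B <- 0.55

p.value <- boot.p(obs.A, obs.B, boot.A, boot.B)
p.value
```

With real data, `boot.A` and `boot.B` would be replaced by the bootstrapping results extracted from each group's separate run.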

As the problems are solved, I am closing this issue.

@M123mym

M123mym commented Sep 16, 2022

Hello, Mr. DaliangNing,
An error, "tree is NULL", occurred when I ran the following code:
WENTU1

@DaliangNing
Copy link
Owner

This must be a format issue. Please send me (ningdaliang@ou.edu) your input files and R code to debug.
