behaviour of rrarefy() #144
Comments
The data(mite)
rrarefy(mite, pmin(rowSums(mite), 100))
I would consider adding alternative tricks, but they should be well thought out and designed. For instance, how would you align results with other variables after clandestine removal of some unspecified rows? A simple thing, and an improvement to the current policy, is to return the non-rarefied community (with a warning) if |
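A minimal sketch of the "return the non-rarefied community with a warning" idea, written in base R so that it runs without vegan installed. The names `rarefy_row` and `rarefy_keep_small` and the toy matrix `comm` are hypothetical and are not part of vegan; `rarefy_row` is only a stand-in for what rrarefy() does to a single row.

```r
# Hypothetical stand-in for one row of rrarefy(): draw `size` individuals
# without replacement from a vector of species counts.
rarefy_row <- function(counts, size) {
  individuals <- rep(seq_along(counts), times = counts)
  drawn <- individuals[sample.int(length(individuals), size)]
  tabulate(drawn, nbins = length(counts))
}

# Rarefy every row to `depth`, but leave rows with fewer than `depth`
# individuals untouched and warn about them (assumed policy, not vegan's API).
rarefy_keep_small <- function(x, depth) {
  x <- as.matrix(x)
  totals <- rowSums(x)
  small <- totals < depth
  if (any(small)) {
    warning("rows left non-rarefied (total < ", depth, "): ",
            paste(which(small), collapse = ", "))
  }
  out <- x
  for (i in which(!small)) {
    out[i, ] <- rarefy_row(x[i, ], depth)
  }
  out
}

# Toy community matrix: the second sample has only 8 individuals.
set.seed(1)
comm <- matrix(c(50, 30, 40,
                  5,  2,  1,
                 60, 70, 10), nrow = 3, byrow = TRUE,
               dimnames = list(paste0("s", 1:3), c("spA", "spB", "spC")))
res <- suppressWarnings(rarefy_keep_small(comm, 100))
rowSums(res)
```

Because no rows are removed, the result keeps the same row order as the input, so it still lines up with any other per-sample variables.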
I agree that silently removing samples is inadmissible and I am probably not familiar enough with the vegan universe to judge all the ramifications. If I get it right,
although without the warning. I maintain though that it could be a legitimate choice to exclude samples that have (way) fewer individuals (as opposed to keeping them non-rarefied). Would you consider including an option as |
I'd rather have an option to turn these rows into |
Good point about the warnings. I was thinking of setting the option to
Thanks for taking the feedback seriously and thanks again for providing this resource to the community! |
This would call for four alternative strategies:
|
Commit f5ff44b in branch rrarefy suggests an implementation of all four alternatives I outlined above (not yet documented). The design must be discussed. One side effect is that the behaviour is now different from other functions: |
I really don't like the change in f5ff44b: it adds unnecessary complexity to the function, produces confusing output and makes it inconsistent with other rrarefy(mite[rowSums(mite) >= 100,], 100) without need of building complicated branches in functions. However, the current behaviour of |
PR #145 addresses this issue. It is based on the minimal change where I won't implement the fix of f5ff44b: no vegan function should work so that random rows are removed in
rrarewithdrop <- function(x, sample)
{
    rrarefy(x[rowSums(x) >= sample, , drop = FALSE], sample)
}
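The alignment concern raised earlier in the thread can be handled by computing the row filter once and applying it to both the community matrix and any per-sample table. The sketch below uses only base R; `comm`, `env`, and the `pH` column are made-up illustration data, not anything from vegan.

```r
# Toy community matrix and a hypothetical per-sample metadata table.
comm <- matrix(c(50, 30, 40,
                  5,  2,  1,
                 60, 70, 10), nrow = 3, byrow = TRUE,
               dimnames = list(paste0("s", 1:3), c("spA", "spB", "spC")))
env <- data.frame(site = rownames(comm),
                  pH = c(6.1, 5.4, 7.0),
                  stringsAsFactors = FALSE)

depth <- 100

# Compute the filter once, explicitly, so it is visible which rows go.
keep <- rowSums(comm) >= depth

# Apply the same logical index to both objects.
comm_kept <- comm[keep, , drop = FALSE]
env_kept <- env[keep, , drop = FALSE]

# Sanity check: the two objects still describe the same samples.
stopifnot(identical(rownames(comm_kept), env_kept$site))

# comm_kept can now be passed on, e.g. rrarefy(comm_kept, depth) with vegan.
```

Keeping `keep` as a named object, rather than filtering inside a wrapper, makes the removal explicit in the analysis script.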
Hello,
I use
vegan::rrarefy
to rarefy my community datasets. Because I have a lot of samples and the sampling depth (it's sequencing data) is quite uneven, I choose to rarefy at depth X and exclude samples with N < X. However, the function attempts to rarefy each community and throws the error
cannot take a sample larger than the population when replace = FALSE
Therefore I always have to manually exclude those samples first. I was wondering if it wouldn't be possible to handle this internally, i.e., exclude all samples with N < X and issue a warning along the lines of
WARNING: the following samples had fewer counts than the chosen rarefaction depth and were therefore excluded from the dataset: sample[1], sample[6] (...)
If you agree that this behaviour would be desirable, I could also rewrite the function and submit it as a pull request.
thanks for providing this great resource!
best
Fabian
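For illustration, the manual exclusion step described above could be wrapped roughly as follows. This is only a sketch of the requested behaviour in base R, not something vegan provides: `drop_shallow_samples` and the toy matrix `comm` are hypothetical names, and the filtered result would still have to be passed to rrarefy() afterwards.

```r
# Drop samples (rows) with fewer than `depth` counts and name them in a warning.
drop_shallow_samples <- function(x, depth) {
  totals <- rowSums(x)
  shallow <- totals < depth
  if (any(shallow)) {
    labels <- rownames(x)
    if (is.null(labels)) labels <- as.character(seq_len(nrow(x)))
    warning("the following samples had fewer counts than the chosen ",
            "rarefaction depth and were excluded: ",
            paste(labels[shallow], collapse = ", "), call. = FALSE)
  }
  x[!shallow, , drop = FALSE]
}

# Toy data: sample s2 has only 8 counts.
comm <- matrix(c(50, 30, 40,
                  5,  2,  1,
                 60, 70, 10), nrow = 3, byrow = TRUE,
               dimnames = list(paste0("s", 1:3), c("spA", "spB", "spC")))

# Collect the warning text while letting the call finish normally.
msgs <- character(0)
filtered <- withCallingHandlers(
  drop_shallow_samples(comm, 100),
  warning = function(w) {
    msgs <<- c(msgs, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)
rownames(filtered)
```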