Give informative error when values are missing in addRDA #626

RiboRings · 2024-08-10T08:12:49Z

Fix for issue #432.

Usage examples:

# Import TreeSE
library(mia)
data("enterotype", package = "mia")
tse <- enterotype

# Throw error when na.action is na.fail and some values are missing

tse <- addCCA(tse, formula = assay ~ ClinicalStatus + Gender + Age)
# Error: Variables contain missing values. Set na.action to na.exclude to remove
# samples with missing values.

tse <- addRDA(tse, formula = assay ~ ClinicalStatus + Gender + Age,
              FUN = vegan::vegdist, method = "bray")
# Error: Variables contain missing values. Set na.action to na.exclude to remove
# samples with missing values.

# Work as usual when na.action is na.omit or na.exclude

tse <- addCCA(tse, formula = assay ~ ClinicalStatus + Gender + Age,
              na.action = na.exclude)

tse <- addRDA(tse, formula = assay ~ ClinicalStatus + Gender + Age,
              FUN = vegan::vegdist, method = "bray",
              na.action = na.exclude)
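
The check behind the error above can be sketched in base R as follows. This is only an illustration under stated assumptions: `check_na_action` is a hypothetical helper name, not the code added to R/runCCA.R, and the data frame stands in for the sample metadata.

```r
# Hypothetical helper (not mia's implementation): fail early with an
# informative message when covariates contain NA and na.action is na.fail.
check_na_action <- function(variables, na.action = na.fail) {
    if (anyNA(variables) && identical(na.action, na.fail)) {
        stop("Variables contain missing values. Set na.action to na.exclude ",
             "to remove samples with missing values.", call. = FALSE)
    }
    # Otherwise delegate to the chosen handler (e.g. na.omit or na.exclude)
    na.action(variables)
}

# Toy metadata with missing values in two of three rows
vars <- data.frame(Age = c(30, NA, 45), Gender = c("F", "M", NA))

# The default na.fail path raises the informative error
msg <- tryCatch(check_na_action(vars), error = function(e) conditionMessage(e))

# na.exclude keeps only the complete rows
kept <- check_na_action(vars, na.action = na.exclude)
```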

codecov · 2024-08-10T08:57:15Z

Codecov Report

Attention: Patch coverage is 40.00000% with 9 lines in your changes missing coverage. Please review.

Please upload report for BASE (devel@5282a72). Learn more about missing BASE report.

Files	Patch %	Lines
R/runCCA.R	40.00%	9 Missing ⚠️

Additional details and impacted files

@@           Coverage Diff            @@
##             devel     #626   +/-   ##
========================================
  Coverage         ?   67.80%           
========================================
  Files            ?       44           
  Lines            ?     5302           
  Branches         ?        0           
========================================
  Hits             ?     3595           
  Misses           ?     1707           
  Partials         ?        0

antagomir · 2024-08-10T08:57:47Z

Thanks!

Original data dimension

dim(enterotype)
[1] 553 280

Dimension after RDA (or CCA):

dim(reducedDim(tse, "RDA"))
[1] 280 6

Ok the samples match and they are not dropped. This solution seems OK to me.

We could consider providing an optional (hidden?) argument to drop out samples with missing values but perhaps this is not critical for now.

Confirm that we have sufficient unit tests in place, and documentation is clear about the key issues related to this one.

I am wondering where the 6-component solution is defined: why is the default 6 dimensions, and is it possible to change that somehow? (This is not really related to this issue, but it would be useful to find out. If there is no time now, perhaps open another issue on it?)

RiboRings · 2024-08-10T09:54:43Z

> We could consider providing an optional (hidden?) argument to drop out samples with missing values but perhaps this is not critical for now.

I think this can be simply done by using na.omit instead of na.exclude, so in a sense the argument is already there.
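
For reference, the general difference between the three handlers can be seen with a small base R model. This is a sketch of the standard `stats` behavior on toy data, not of what addRDA does internally: `na.fail` stops, `na.omit` drops incomplete rows from the result, and `na.exclude` drops them from the fit but pads the result with NA so that it lines up with the input.

```r
# Toy data with one missing covariate value (row 2)
df <- data.frame(y = c(1, 2, 3, 4, 5), x = c(1, NA, 3, 4, 6))

# na.fail: refuses to fit when values are missing
failed <- inherits(
    try(lm(y ~ x, data = df, na.action = na.fail), silent = TRUE),
    "try-error")

# na.omit: the incomplete row is dropped from the output
fit_omit <- lm(y ~ x, data = df, na.action = na.omit)
n_omit <- length(residuals(fit_omit))   # 4

# na.exclude: same fit, but the output is padded back to the input length
fit_excl <- lm(y ~ x, data = df, na.action = na.exclude)
n_excl <- length(residuals(fit_excl))   # 5, with NA in position 2
```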

antagomir · 2024-08-10T09:57:09Z

> > We could consider providing an optional (hidden?) argument to drop out samples with missing values but perhaps this is not critical for now.
>
> I think this can be simply done by using na.omit instead of na.exclude, so in a sense the argument is already there.

Right, sounds good.

Could/should we mention this explicitly in the documentation (@details section perhaps)?

Good to check that sufficient unit tests are in place.

antagomir · 2024-08-27T20:45:59Z

Resolve conflicts.

TuomasBorman

Seems good!

TuomasBorman · 2024-09-01T11:56:59Z

> > We could consider providing an optional (hidden?) argument to drop out samples with missing values but perhaps this is not critical for now.
>
> I think this can be simply done by using na.omit instead of na.exclude, so in a sense the argument is already there.

That works only for the get* functions. When an add* function is used, the result is added with .add_object_to_reduceddim. That function checks whether the result is missing samples compared to the TreeSE. If it is and subset_result = FALSE, the function adds samples to the result so that it matches the TreeSE.

However, that might not be the behavior the user wants. (If the result is missing samples, the user specified na.omit, and dropping them is what the user wants.)

So I suggest that we:

- change the default value of subset_result to TRUE
- rename subset_result --> subset.result

subset_result is not documented anywhere; it is intended only for us to control the behavior. Otherwise, the function should work (check).
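
The padding behavior described above can be illustrated with a minimal base R sketch. `pad_to_samples` is a hypothetical name used only for illustration; it is not the internal .add_object_to_reduceddim, and the sample and axis names are made up.

```r
# Hypothetical illustration: re-add dropped samples as NA rows so that the
# ordination result has one row per sample in the original object.
pad_to_samples <- function(res, sample_names) {
    padded <- matrix(NA_real_, nrow = length(sample_names), ncol = ncol(res),
                     dimnames = list(sample_names, colnames(res)))
    # Fill in the rows that survived NA handling, matched by sample name
    padded[rownames(res), ] <- res
    padded
}

# Result where sample "s2" was dropped because of a missing covariate
res <- matrix(1:6, nrow = 3,
              dimnames = list(c("s1", "s3", "s4"), c("RDA1", "RDA2")))
padded <- pad_to_samples(res, paste0("s", 1:4))
```

Subsetting instead of padding would amount to keeping `res` as it is and dropping "s2" from the object the result is attached to.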

antagomir · 2024-09-01T12:25:07Z

ok

RiboRings · 2024-09-08T18:03:06Z

Hi! I got back to this PR.

I noticed that the example with GlobalPatterns is very slow. Is it ok if I change the dataset to enterotype?

RiboRings · 2024-09-08T18:36:00Z

This PR is ready to merge from my side.

Give informative error when values are missing in addRDA

0d1528a

RiboRings requested a review from TuomasBorman August 10, 2024 08:13

RiboRings linked an issue Aug 10, 2024 that may be closed by this pull request

Suggestion to remove na.action argument from runRDA #432

Closed

Merge branch 'devel' into na_action

9615ae8

TuomasBorman approved these changes Sep 1, 2024

Change subset.results to TRUE, polish examples and fix tests

e631c25

TuomasBorman and others added 3 commits September 9, 2024 08:49

Merge branch 'devel' into na_action

9ddba15

up

fc3e228

up

e4d7dc6

TuomasBorman approved these changes Sep 9, 2024

TuomasBorman and others added 2 commits September 9, 2024 16:48

Merge branch 'devel' into na_action

24b0179

Add miaViz to suggested packages

88cea4e

TuomasBorman merged commit 52d2a23 into devel Sep 10, 2024
3 checks passed

TuomasBorman deleted the na_action branch September 10, 2024 05:46
