bug fix and adding recent updates to vignettes #624

cristinamullin · 2025-07-29T18:59:06Z

Check vignettes to make sure recent updates have been incorporated
Consider adding tests to avoid issues in the future

add unit to grouping cols for TADA_AggregateMeasurements, fix TN and TP bug, add combos to harmonization table, update vignette

remove intermediate variables and update scatterplot examples

Combine TADA_AutoFilter with TADA_RetainRequired and delete TADA_AutoFilter

incorporate changes made to TADA_FindQCActivities, TADA_RetainRequired, and TADA_AutoFilter

cristinamullin · 2025-07-29T20:59:36Z

Need to look into these tomorrow:

── Failed tests ─────────────────────────────────────────────────────────────────────────────
Failure (test-TADARefTables.R:28:3): No combos were missed in NP key from harmonization table
dim(orphs)[1] == 0 is not TRUE

`actual`:   FALSE
`expected`: TRUE 

Failure (test-Transformations.R:21:3): np summation key matches nutrient harmonization ref
any(is.na(check$np)) is not FALSE

`actual`:   TRUE 
`expected`: FALSE

hillarymarler · 2025-07-30T19:56:35Z

Need to look into these tomorrow:

── Failed tests ─────────────────────────────────────────────────────────────────────────────
Failure (test-TADARefTables.R:28:3): No combos were missed in NP key from harmonization table
dim(orphs)[1] == 0 is not TRUE

`actual`:   FALSE
`expected`: TRUE 

Failure (test-Transformations.R:21:3): np summation key matches nutrient harmonization ref
any(is.na(check$np)) is not FALSE

`actual`:   TRUE 
`expected`: FALSE

I took a look at this and it looks like the following may be missing from TADA_GetNutrientSummationRef

I haven't worked on the NutrientSummation ref before, so if you'd like me to work on adding those additional rows, we might need to have a quick chat about everything I should consider when filling this ref table out, @cristinamullin

cristinamullin · 2025-07-30T20:36:20Z

#588 I will work to incorporate this feedback into this PR. Also, thanks @hillarymarler for looking into the missing combinations. They will need to get prioritized for inclusion in the TN/TP calculations in that ref table. I have not edited this since Elise and I worked on it when she was an ORISE with our branch. I'll take a look & may loop the TADA team in to review the logic.

cristinamullin · 2025-07-31T23:29:07Z

I found a bug in TADA_ConvertSpecialChars where we are getting new columns with TADA.TADA. if the function input column already contains the TADA prefix. Writing a test to double check for this issue.

test_that("Column names do not contain the pattern 'TADA.TADA.'", {
  test_TADA.TADA. <-
    TADA_ConvertSpecialChars(
      Data_Nutrients_UT,
      "TADA.DetectionQuantitationLimitMeasure.MeasureValue"
    )
  # Create a logical vector indicating which columns contain the pattern
  pattern_found <- grepl("TADA.TADA.", colnames(test_TADA.TADA.))
  
  # Test should pass if none of the columns contain the pattern
  expect_false(any(pattern_found), info = "Some column names contain the pattern 'TADA.TADA.'")
})

test_that("Column names do not contain the pattern 'TADA.TADA.'", {
  test_TADA.TADA. <-
    TADA_ConvertSpecialChars(
      Data_Nutrients_UT,
      "TADA.ResultMeasureValue"
    )
  # Create a logical vector indicating which columns contain the pattern
  pattern_found <- grepl("TADA.TADA.", colnames(test_TADA.TADA.))
  
  # Test should pass if none of the columns contain the pattern
  expect_false(any(pattern_found), info = "Some column names contain the pattern 'TADA.TADA.'")
})

test_that("Column names do not contain the pattern 'TADA.TADA.'", {
  test_TADA.TADA. <- TADA_AutoClean(Data_R5_TADAPackageDemo)
  # Create a logical vector indicating which columns contain the pattern
  pattern_found <- grepl("TADA.TADA.", colnames(test_TADA.TADA.))
  
  # Test should pass if none of the columns contain the pattern
  expect_false(any(pattern_found), info = "Some column names contain the pattern 'TADA.TADA.'")
})

I added logic to only run TADA_ConvertSpecialChars on a col if the associated flag col for that col does not already exist.

add tests and examples for TADA_ConvertSpecialChars and TADA_CalculateTotalNP

wokenny13 · 2025-08-01T18:35:11Z

I am looking at the example in the Module 1 vignette and just for phosphorus ( TADAProfileClean5)

It seems like there are only 3 other phosphorus TADA.ComparableDataIdentifier in our data frame. Since they are already as P would it then be converted as is to Total Phosphorus? Should we expect the Total Phosphorus, Mixed Forms to only increase by a maximum of (the sum of 62 + 8 + 125) 195 rows once you run TADA_CalculateTotalNP?

It seems like it goes from 581 rows to 1180? Are duplicate rows being added on to the Total Phosphorus, Mixed Forms summation?

cristinamullin · 2025-08-01T18:50:34Z

The TN/TP function is not working correctly...

today <- Sys.Date()
oneyearago <- as.character(today - 1 * 365)
testdat <- TADA_DataRetrieval(statecode = "UT", 
                              startDate = oneyearago, 
                              characteristicType = "Nutrient",
                              ask = FALSE)

testdat <- TADA_SimpleCensoredMethods(testdat,
                                      nd_method = "multiplier",
                                      nd_multiplier = 0.5,
                                      od_method = "as-is",
                                      od_multiplier = "null")

testdat2 <- TADA_CalculateTotalNP(testdat, daily_agg = "max")

unique(testdat2$TADA.ResultMeasureValueDataTypes.Flag)

# Filter data to show only rows where TADA.ResultMeasureValue is NA
filtered_data <- testdat2[is.na(testdat$TADA.ResultMeasureValue), ]

unique(filtered_data$TADA.NutrientSummation.Flag)
unique(filtered_data$TADA.ResultMeasureValueDataTypes.Flag)

# isolate rows that are NA related to TP and TN function
filtered_data_2 <- filtered_data[!(
  filtered_data$TADA.ResultMeasureValueDataTypes.Flag %in% 
    c("TP estimated from one or more subspecies.", 
      "TN estimated from one or more subspecies.")
)]

# isolate rows that are NA related to TP and TN function
# reduces number of columns... why
filtered_data_3 <- filtered_data[!(
  filtered_data$TADA.NutrientSummation.Flag %in% 
    c("Used to calculate Total Phosphorus as P.", 
      "Used to calculate Total Nitrogen as N." #,
      #"Not used to calculate Total N or P."
      )
)]

# these should not be used to calculate TN and TP if result value is NA
# and TADA.ResultMeasureValueDataTypes.Flag equals any of the following...
# [1] "NA - Not Available"                                        
# [2] "Result Value/Unit Cannot Be Estimated From Detection Limit"
# [3] "Text"

cristinamullin · 2025-08-01T18:58:01Z

I am looking at the example in the Module 1 vignette and just for phosphorus ( TADAProfileClean5)

It seems like there are only 3 other phosphorus TADA.ComparableDataIdentifier in our data frame. Since they are already as P would it then be converted as is to Total Phosphorus? Should we expect the Total Phosphorus, Mixed Forms to only increase by a maximum of (the sum of 62 + 8 + 125) 195 rows once you run TADA_CalculateTotalNP?

It seems like it goes from 581 rows to 1180? Are duplicate rows being added on to the Total Phosphorus, Mixed Forms summation?

I noticed this too. I did add additional grouping cols on lines 342-351 in Transformations.R. I will comment these out for now. Can you test this again? Taking a break on the PR for now - please feel free to jump in if you can find/fix the issue!

# "TADA.ResultMeasure.MeasureUnitCode", # this may not need to be included since we have TADA.ComparableDataIdentifier
# consider adding columns below... 
# "OrganizationIdentifier", # should be okay to include here since this is by MonitoringLocationIdentifier which are org specific anyway
# "OrganizationFormalName",
# "CountryCode",
# "StateCode",
# "CountyCode",
# "HUCEightDigitCode",
# "MonitoringLocationTypeName",
# "MonitoringLocationDescriptionText",

wokenny13 · 2025-08-01T19:02:31Z

Yes, I can take a look into this

cristinamullin · 2025-08-01T19:02:54Z

R/RequiredCols.R

@@ -449,10 +452,71 @@ TADA_CheckRequiredFields <- function(.data) {
 #' reducedcols_Data_Nutrients_UT <- TADA_RetainRequired(Data_Nutrients_UT)
 #'
 TADA_RetainRequired <- function(.data) {


This needs review. I combined functions TADA_RetainRequired and TADA_AutoFilter here. Does that make sense or are they may be better left as separate functions? Possible new name suggestion: maybe TADA_ReduceCols

cristinamullin · 2025-08-01T19:05:59Z

R/Transformations.R

    "TADA.ComparableDataIdentifier",
-    # "TADA.ResultMeasure.MeasureUnitCode",
+    # "TADA.ResultMeasure.MeasureUnitCode", # this may not need to be included since we have TADA.ComparableDataIdentifier
+    # consider adding columns below... 


@wokenny13 FYI - these column additions were meant to address this issue with metadata being carried over. Maybe it can be addressed elsewhere in this function (or might not be possible). Needs more research: #588

add dataRetrieval depends to EPATADA depends, improve print messages, add example to TADA_CalculateTotalNP, address QC and non-numeric results within TADA_CalculateTotalNP, add TADA_CalculateTotalNP tests

move functions to more applicable .R files

the default should be clean = FALSE for TADA_AggregateMeasurements and TADA_RunKeyFlagFunctions, add better messages for flag functions

add more TN/TP combinations, update TADA_CalculateTotalNP documentation, add tests

add additional matches, handle aggregated values, handle non-numeric results

fix tests and add more, update logic

cristinamullin · 2025-08-12T02:29:31Z

ideally, these could be run in any order... or we should think about running censored data handling in autoclean to ensure the proper order....

Suggest to add logic for NA TADA.ResultMeasure.MeasureUnitCode's to TADA_ConvertSpecialChars in future PR

test_that("TADA_SimpleCensoredMethods does not introduce duplicates or NAs in result or unit cols", {
  
  testdat <- TADA_RandomTestingData()
  
  testdat <- TADA_ConvertSpecialChars(testdat, 
                                      col = "TADA.ResultMeasureValue",
                                      clean = TRUE)
  
  # Test to ensure the column is entirely numeric
  expect_true(is.numeric(testdat$TADA.ResultMeasureValue))
  
  # Test to ensure value column does not contain any NA values
  expect_true(!any(is.na(testdat$TADA.ResultMeasureValue)))
  
  # TADA_ConvertSpecialChars does not handle this yet 8/11/25
  # # Test to ensure unit column does not contain any NA values
  # expect_true(!any(is.na(testdat$TADA.ResultMeasure.MeasureUnitCode)))
  
  testdat2 <- TADA_SimpleCensoredMethods(testdat,
                                         nd_method = "multiplier",
                                         nd_multiplier = 0.5,
                                         od_method = "as-is",
                                         od_multiplier = "null")
  
  testdat3 <- TADA_ConvertSpecialChars(testdat3, 
                                      col = "TADA.ResultMeasureValue",
                                      clean = TRUE)
  
  # Test to ensure the column is entirely numeric
  expect_true(is.numeric(testdat3$TADA.ResultMeasureValue))
  
  # Test to ensure value column does not contain any NA values
  expect_true(!any(is.na(testdat3$TADA.ResultMeasureValue)))
  
  # # Test to ensure unit column does not contain any NA values
  # expect_true(!any(is.na(testdat2$TADA.ResultMeasure.MeasureUnitCode)))
})

cristinamullin · 2025-08-12T15:54:48Z

See new combined PR: #630

cristinamullin added 8 commits July 28, 2025 17:25

add new TADA_ConvertSpecialChars clean = T option

013ccd1

fix TN and TP harmonization

8affacf

add unit to grouping cols for TADA_AggregateMeasurements, fix TN and TP bug, add combos to harmonization table, update vignette

Update TADAModule1.Rmd

89f4a8b

remove intermediate variables and update scatterplot examples

Update TADAModule1.Rmd

4a46565

removes TADA_AutoFilter and adds functionality to TADA_RetainRequired

310a149

Combine TADA_AutoFilter with TADA_RetainRequired and delete TADA_AutoFilter

update vignettes

1d6ffbf

incorporate changes made to TADA_FindQCActivities, TADA_RetainRequired, and TADA_AutoFilter

Update TADA_RetainRequired.Rd

c2d5951

Update TADAModule1.Rmd

5f15474

cristinamullin requested review from hillarymarler and wokenny13 July 29, 2025 18:59

cristinamullin added 2 commits July 29, 2025 16:21

update refs

5800357

update example data

8b8575d

cristinamullin linked an issue Jul 31, 2025 that may be closed by this pull request

Suggestion to include additional metadata fields in TADA_CalculateTotalNP output #588

Closed

8 tasks

cristinamullin added 2 commits July 31, 2025 12:05

add missing rows to NP/TN summation key, add metadata to output

d3a1c53

Update RequiredCols.R

80226af

cristinamullin added 3 commits July 31, 2025 19:44

add tests

98eebb2

only run if not run previously

990d711

define flag col and add tests

1796e24

add tests and examples for TADA_ConvertSpecialChars and TADA_CalculateTotalNP

revert column additions

959aec0

cristinamullin commented Aug 1, 2025

View reviewed changes

cristinamullin linked an issue Aug 1, 2025 that may be closed by this pull request

add curl as dependency #627

Closed

cristinamullin added 11 commits August 1, 2025 16:46

add curl as dependency

dd07628

update TADA_CalculateTotalNP

0575e56

add dataRetrieval depends to EPATADA depends, improve print messages, add example to TADA_CalculateTotalNP, address QC and non-numeric results within TADA_CalculateTotalNP, add TADA_CalculateTotalNP tests

organize functions

8e9e4a4

move functions to more applicable .R files

update docs

280b732

Update Transformations.R

19701ff

update TADA_AggregateMeasurements and TADA_RunKeyFlagFunctions

71e71d8

the default should be clean = FALSE for TADA_AggregateMeasurements and TADA_RunKeyFlagFunctions, add better messages for flag functions

update TADA_CalculateTotalNP

3bb4958

add more TN/TP combinations, update TADA_CalculateTotalNP documentation, add tests

update TADA_CalculateTotalNP

5f7599f

add additional matches, handle aggregated values, handle non-numeric results

fix NA bug in TADA_CalculateTotalNP

e184454

remove duplicated info, improve TADA_CalculateTotalNP flags

76899d4

fix TADA_CalculateTotalNP and TADA_IDCensoredData logic and create tests

52a6943

fix tests and add more, update logic

cristinamullin linked an issue Aug 11, 2025 that may be closed by this pull request

Getting all combinations of characteristic-fraction-speciation-unit in harmonization table #319

Closed

6 tasks

update tests

ba56bee

cristinamullin mentioned this pull request Aug 12, 2025

Combined PR review #630

Merged

cristinamullin closed this Aug 12, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

bug fix and adding recent updates to vignettes #624

bug fix and adding recent updates to vignettes #624

Uh oh!

cristinamullin commented Jul 29, 2025 •

edited

Loading

Uh oh!

cristinamullin commented Jul 29, 2025

Uh oh!

hillarymarler commented Jul 30, 2025 •

edited

Loading

Uh oh!

cristinamullin commented Jul 30, 2025

Uh oh!

cristinamullin commented Jul 31, 2025 •

edited

Loading

Uh oh!

wokenny13 commented Aug 1, 2025

Uh oh!

cristinamullin commented Aug 1, 2025

Uh oh!

cristinamullin commented Aug 1, 2025

Uh oh!

wokenny13 commented Aug 1, 2025

Uh oh!

cristinamullin Aug 1, 2025

Uh oh!

cristinamullin Aug 1, 2025 •

edited

Loading

Uh oh!

cristinamullin commented Aug 12, 2025

Uh oh!

cristinamullin commented Aug 12, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

bug fix and adding recent updates to vignettes #624

bug fix and adding recent updates to vignettes #624

Uh oh!

Conversation

cristinamullin commented Jul 29, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

cristinamullin commented Jul 29, 2025

Uh oh!

hillarymarler commented Jul 30, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

cristinamullin commented Jul 30, 2025

Uh oh!

cristinamullin commented Jul 31, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

wokenny13 commented Aug 1, 2025

Uh oh!

cristinamullin commented Aug 1, 2025

Uh oh!

cristinamullin commented Aug 1, 2025

Uh oh!

wokenny13 commented Aug 1, 2025

Uh oh!

cristinamullin Aug 1, 2025

Choose a reason for hiding this comment

Uh oh!

cristinamullin Aug 1, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

cristinamullin commented Aug 12, 2025

Uh oh!

cristinamullin commented Aug 12, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

cristinamullin commented Jul 29, 2025 •

edited

Loading

hillarymarler commented Jul 30, 2025 •

edited

Loading

cristinamullin commented Jul 31, 2025 •

edited

Loading

cristinamullin Aug 1, 2025 •

edited

Loading