Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Big int cohort id fix #163

Merged
merged 26 commits into from
Apr 9, 2024
Merged

Big int cohort id fix #163

merged 26 commits into from
Apr 9, 2024

Conversation

azimov
Copy link
Contributor

@azimov azimov commented Apr 3, 2024

Resolves #162

R/Analyses.R Outdated
checkmate::assertLogical(outcomeOfInterest, add = errorMessages)
checkmate::assertNumeric(trueEffectSize, len = 1, null.ok = TRUE, add = errorMessages)
checkmate::assertInt(riskWindowStart, null.ok = TRUE, add = errorMessages)
checkmate::assertInt(riskWindowEnd, null.ok = TRUE, add = errorMessages)
checkmate::reportAssertions(collection = errorMessages)
if (!is.null(startAnchor) && !grepl("start$|end$", startAnchor, ignore.case = TRUE)) {
stop("startAnchor should have value 'cohort start' or 'cohort end'")
stop("startAnchor should have value \'cohort start\' or \'cohort end\'")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you help me understand why you're escaping single quotes? It is not needed when the string uses double quotes

R/RunAnalyses.R Outdated
createCmDataTask <- function(i) {
refRow <- subset[i, ]
refRow <- subset[i,]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per our code guidelines, a comma should be followed by a space. Why remove it here?

R/RunAnalyses.R Outdated
)
return(task)

if (file.exists(file.path(outputFolder, refRow$psFile))) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the condition? Is there something wrong with the code for generating the reference table that will specify strataFiles to be created without there being psFiles? Maybe best to fix it there?

R/RunAnalyses.R Outdated
studyPop <- readRDS(file.path(outputFolder, refRow$studyPopFile))
studyPop <- addPsToStudyPopulation(studyPop, ps)
saveRDS(studyPop, file.path(outputFolder, refRow$psFile))
tryCatch({
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the error message thrown by saveRDS not informative enough? If so, maybe we should have a generic wrapper for saveRDS in CohortMethod that we always use?

R/RunAnalyses.R Outdated
params$psFile))
ps <- applyTrimMatchStratify(ps, params$args)
saveRDS(ps, params$strataFile)
tryCatch({
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yikes, this tryCatch block covers multiple function calls, but the error message will be about the saving of the stataFile.

R/RunAnalyses.R Outdated
if (length(covariatesToExclude) != 0) {
covariates <- covariates %>%
filter(!.data$covariateId %in% covariatesToExclude)
tryCatch({
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Double yikes! What is the point in wrapping such a large block of code in a try catch?

R/RunAnalyses.R Outdated
attr(class(filteredCohortMethodData), "package") <- "CohortMethod"
saveCohortMethodData(filteredCohortMethodData, params$prefilteredCovariatesFile)
}, error = function(err) {
ParallelLogger::logError(err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code has no effect. Errors are automatically caught by ParalllelLogger. Just rethrowing the error will achieve the same thing

@schuemie
Copy link
Member

schuemie commented Apr 8, 2024

I'm sorry, I can't accept this PR. It contains a lot of changes that have nothing to do with the issue at hand.

I'm open to rethinking how we deal with errors, but that should be a separate issue and a separate PR (but only after we agree on the approach).

Also, please adhere to the style guide (as also implemented by the styler package).

@azimov
Copy link
Contributor Author

azimov commented Apr 8, 2024 via email

@schuemie
Copy link
Member

schuemie commented Apr 9, 2024

Thanks! Not sure why we're failing check now. Seems to be some SQL issue. I'll merge and debug

@schuemie schuemie merged commit 8ef8bcb into develop Apr 9, 2024
2 of 10 checks passed
@schuemie
Copy link
Member

schuemie commented Apr 9, 2024

Ah, the problem is that SQLite does not allow changing the column type of an existing table. The good news is that SQLite INT is dynamic, and will contain BIGINT values.

Why do we have cohort IDs > 2^31?

@azimov
Copy link
Contributor Author

azimov commented Apr 9, 2024

Why do we have cohort IDs > 2^31?

These are generally cohorts made from templates or subsets where we do something like concept_id * 1000 as cohort_definition_id, this happened when running this on the reward cohorts

@schuemie
Copy link
Member

Ok, we'll probably need fixes throughout HADES then. I always assumed, since we're assigning cohort IDs ourselves, we would not exceed 2^31.

I think I fixed the unit tests now, by fixing the SqlRender translation of ALTER TABLE ALTER COLUMN.

I also removed the PostgreSQL-specific version of the migration script. We really shouldn't have platform-specific code in HADES packages other than SqlRender and DatabaseConnector.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants