Aggregating data with temporal covariate settings produces NA timeId in covariatesContinuous #114
Comments
Hi @javier-gracia-tabuenca-tuni - can you link me to where in CohortDiagnostics you found this code? Upon first glance, the parameter
Sorry, this code is not in CohortDiagnostics. I'm trying to run a custom cohort diagnostic that shows, in Temporal Characterisation, the count of visits in each time period. Here is some code to reproduce the error: title: "R Notebook"
However, I traced the error and it seems to originate in FeatureExtraction, which is why I posted it here. Here is simpler code to reproduce the error in FeatureExtraction: title: "R Notebook"
Hi @javier-gracia-tabuenca-tuni - thanks for the code to help reproduce this problem! I've taken your code and changed it, since some of the functions you referenced are missing and I don't have the Asthma cohort it refers to. To keep things simple, I constructed a very basic cohort containing everyone in the Eunomia data set who has an observation_period entry. Below is the code:
library(SqlRender)
library(Eunomia)
#> Loading required package: DatabaseConnector
library(FeatureExtraction)
#> Loading required package: Andromeda
#> Loading required package: dplyr
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
connectionDetails <- getEunomiaConnectionDetails()
connection <- connect(connectionDetails)
#> Connecting using SQLite driver
cdmDatabaseSchema <- "main"
oracleTempSchema <- NULL
cohortDatabaseSchema <- "main"
cohortTable <- "cohort"
# Construct a cohort of people
cohortMembers <- querySql(connection, "SELECT 1 cohort_definition_id,
person_id subject_id,
observation_period_start_date cohort_start_date,
observation_period_end_date cohort_end_date
FROM observation_period")
# Insert into the cohort table
DatabaseConnector::insertTable(connection = connection,
tableName = cohortTable,
data = cohortMembers,
dropTableIfExists = FALSE,
createTable = FALSE)
# Characterize visit counts over the following time windows
tempCovaraitesSettings <- createTemporalCovariateSettings(
useVisitConceptCount = TRUE,
temporalStartDays = c(0,31,61,91),
temporalEndDays = c(30,60,90,365*2+1)
)
tempCov <- getDbCovariateData(
connectionDetails = connectionDetails,
oracleTempSchema = oracleTempSchema,
cdmDatabaseSchema = cdmDatabaseSchema,
cohortTable = cohortTable,
cohortDatabaseSchema = cdmDatabaseSchema,
cohortTableIsTemp = FALSE,
cohortId = 1,
rowIdField = "subject_id",
covariateSettings = tempCovaraitesSettings,
aggregated = TRUE
)
#> Connecting using SQLite driver
#> Sending temp tables to server
#> Constructing features on server
#> |======================================================================| 100%
#> Executing SQL took 0.0273 secs
#> Fetching data from server
#> Fetching data took 0.125 secs
data.frame(tempCov$timeRef)
#> timeId startDay endDay
#> 1 1 0 30
#> 2 2 31 60
#> 3 3 61 90
#> 4 4 91 731
data.frame(tempCov$covariatesContinuous)
#> cohortDefinitionId covariateId countValue minValue maxValue averageValue
#> 1 1 9201911 15 0 1 0.002807412
#> standardDeviation medianValue p10Value p25Value p75Value p90Value timeId
#> 1 0 0 0 0 0 0 NA
Created on 2021-01-12 by the reprex package (v0.3.0)
As shown, the temporal analysis requests 4 different time periods, but the resulting continuous covariate result does not identify them: timeId is NA. The relevant code seems to be FeatureExtraction/inst/sql/sql_server/ConceptCounts.sql, lines 177 to 179 in da48fda.
@schuemie - tagging you here to get your input on this particular code: is this a bug, or a decision not to include the temporal time identifiers in the results when constructing these types of covariates? And does the answer change at all when doing this at the patient level vs. aggregating the results?
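For anyone who wants to check their own results for the same symptom, here is a minimal sketch in base R. It assumes nothing beyond the output shown above: the two data frames are toy copies of timeRef and covariatesContinuous (only a few columns kept), and missingTimeIds is a hypothetical helper name, not a FeatureExtraction function.

```r
# Toy stand-ins for tempCov$timeRef and tempCov$covariatesContinuous,
# copied from the output above (only the relevant columns are kept).
timeRef <- data.frame(
  timeId = 1:4,
  startDay = c(0, 31, 61, 91),
  endDay = c(30, 60, 90, 731)
)
covariatesContinuous <- data.frame(
  cohortDefinitionId = 1,
  covariateId = 9201911,
  countValue = 15,
  timeId = NA_integer_
)

# Hypothetical helper: which requested time windows have no aggregated rows?
missingTimeIds <- function(timeRef, covariates) {
  present <- covariates$timeId[!is.na(covariates$timeId)]
  setdiff(timeRef$timeId, present)
}

# Number of aggregated rows that lost their time window
naRows <- sum(is.na(covariatesContinuous$timeId))
# Time windows that are not represented in the aggregated result
uncovered <- missingTimeIds(timeRef, covariatesContinuous)

print(naRows)
print(uncovered)
```

With a real result, the same two checks could be run on data.frame(tempCov$timeRef) and data.frame(tempCov$covariatesContinuous); any NA rows or uncovered windows would point to the problem described here.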
Thanks Anthon. The asthma cohort was in atlas-demo.ohdsi.org, but yes, using a query to build the cohort is much better for reproducing. I'll do it like that next time. I see you dug deeper than I did. Let's wait to see what @schuemie has to say.
Column timeId is all NA. I can't find why this happens.
I found this error when using CohortDiagnostics while trying to get the temporal characterization of visit counts.