After doing a hybrid of the Linux bulk load and some R scripts (#17), I'm seeing duplication of row IDs in some tables. For example, my observations table has
Does person 8 (0141bae5-c190-4e83-aab0-eed8dc2e91bd) have a 424213003 observation from visit 211 (acf6725e-23ea-4e10-a1f6-d8e2196d51fb) on 1985-09-13 or does person 1 (00002c66-a365-4e88-8e80-d52bcad4869e) have a 233604007 from visit 10 (120aa894-4465-4b04-af96-1928191f1c36) on 2010-07-26?
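One way to confirm which tables are affected is to group each table by its row ID and list every ID that appears more than once. The following is only a minimal sketch: it uses Python's built-in sqlite3 with an in-memory table as a stand-in for the PostgreSQL CDM schema, the helper name `find_duplicate_ids` is hypothetical, and the sample rows (including the shared `observation_id` of 1) are made up for illustration.

```python
import sqlite3


def find_duplicate_ids(conn, table, id_column):
    """Return (id, count) pairs for ids that occur more than once in a table."""
    # Table and column names are interpolated directly, so pass trusted names only.
    query = (
        f"SELECT {id_column}, COUNT(*) AS n FROM {table} "
        f"GROUP BY {id_column} HAVING COUNT(*) > 1 ORDER BY {id_column}"
    )
    return conn.execute(query).fetchall()


# Illustrative stand-in for the observation table (hypothetical rows).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE observation "
    "(observation_id INTEGER, person_id INTEGER, observation_date TEXT)"
)
conn.executemany(
    "INSERT INTO observation VALUES (?, ?, ?)",
    [
        (1, 8, "1985-09-13"),
        (1, 1, "2010-07-26"),  # same observation_id, different person
        (2, 1, "2010-07-26"),
    ],
)

duplicates = find_duplicate_ids(conn, "observation", "observation_id")
print(duplicates)  # [(1, 2)]
```

The same GROUP BY / HAVING query should work unchanged against PostgreSQL for observation, drug_exposure and measurement, substituting each table's own ID column.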
Could this have something to do with my vocabulary choices? I'm using everything that can be downloaded without requiring a licence. I have loaded the remotely-downloaded CPT codes.
Possibly relevant warnings?
> local.LoadCDMTables(cd,"cdm_synthea10","native")
Connecting using PostgreSQL driver
Running: insert_person.sql
|======================================================================| 100%
Executing SQL took 0.00914 secs
Running: insert_observation_period.sql
|======================================================================| 100%
Executing SQL took 0.027 secs
Running: insert_visit_occurrence.sql
|======================================================================| 100%
Executing SQL took 0.153 secs
Running: insert_condition_occurrence.sql
|======================================================================| 100%
Executing SQL took 20.2 secs
Running: insert_observation.sql
|======================================================================| 100%
Executing SQL took 21.7 secs
Running: insert_measurement.sql
|======================================================================| 100%
Executing SQL took 8.85 mins
Running: insert_procedure_occurrence.sql
|======================================================================| 100%
Executing SQL took 47.6 secs
Running: insert_drug_exposure.sql
|======================================================================| 100%
Executing SQL took 1.3 mins
Running: insert_condition_era.sql
|======================================================================| 100%
Executing SQL took 0.152 secs
Running: insert_drug_era.sql
|======================================================================| 100%
Executing SQL took 25.2 secs
Warning messages:
1: In SqlRender::render(sqlQuery, cdm_schema = cdmDatabaseSchema, synthea_schema = syntheaDatabaseSchema, :
Parameter 'vocab_schema' not found in SQL
2: In SqlRender::render(sqlQuery, cdm_schema = cdmDatabaseSchema, synthea_schema = syntheaDatabaseSchema, :
Parameter 'vocab_schema' not found in SQL
3: In SqlRender::render(sqlQuery, cdm_schema = cdmDatabaseSchema, synthea_schema = syntheaDatabaseSchema, :
Parameter 'synthea_schema' not found in SQL
4: In SqlRender::render(sqlQuery, cdm_schema = cdmDatabaseSchema, synthea_schema = syntheaDatabaseSchema, :
Parameter 'vocab_schema' not found in SQL
5: In SqlRender::render(sqlQuery, cdm_schema = cdmDatabaseSchema, synthea_schema = syntheaDatabaseSchema, :
Parameter 'synthea_schema' not found in SQL
6: In SqlRender::render(sqlQuery, cdm_schema = cdmDatabaseSchema, synthea_schema = syntheaDatabaseSchema, :
Parameter 'vocab_schema' not found in SQL
7: In SqlRender::render(sqlQuery, cdm_schema = cdmDatabaseSchema, synthea_schema = syntheaDatabaseSchema, :
Parameter 'synthea_schema' not found in SQL
>
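Going by their wording, these warnings appear to mean only that a parameter was passed to the renderer but the SQL file being rendered contains no matching `@parameter` placeholder. The sketch below imitates that check in Python to show the idea; it is not SqlRender's implementation, and the template string, the `vocab_schema` value, and the helper name `unused_parameters` are assumptions made for illustration.

```python
import re


def unused_parameters(sql, params):
    """Return the names in params that have no @name placeholder in sql."""
    placeholders = set(re.findall(r"@(\w+)", sql))
    return [name for name in params if name not in placeholders]


# Hypothetical template that uses two of the three supplied parameters.
template = "INSERT INTO @cdm_schema.person SELECT * FROM @synthea_schema.patients;"
supplied = {
    "cdm_schema": "cdm_synthea10",
    "synthea_schema": "native",
    "vocab_schema": "cdm_synthea10",  # assumed value, never referenced above
}

print(unused_parameters(template, supplied))  # ['vocab_schema']
```

If that reading is right, the warnings by themselves would not explain duplicated row IDs.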
Closing. I found that running the R scripts after a failed bulk load may have led to duplicate row IDs, mixed-up concept realms, and concept_id values of 0 in the clinical tables (#17). In the end, I set the schema search path to include native and cdm_synthea10 and ran the queries in ETL/SQL/*.sql by hand, starting with "creating visit logic tables..." (ETL/SQL/AllVisitTable.sql).
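After a reload like this, a quick sanity check is to count the rows in each clinical table whose concept_id is 0. The following is a minimal sketch under the same assumptions as any offline example: sqlite3 stands in for PostgreSQL, the helper name `count_unmapped` is hypothetical, and the non-zero concept IDs are made-up placeholders rather than real vocabulary entries.

```python
import sqlite3


def count_unmapped(conn, checks):
    """Map each table name to the number of rows whose concept column is 0."""
    counts = {}
    for table, concept_column in checks:
        # Names are interpolated directly, so pass trusted names only.
        row = conn.execute(
            f"SELECT COUNT(*) FROM {table} WHERE {concept_column} = 0"
        ).fetchone()
        counts[table] = row[0]
    return counts


# Illustrative stand-ins for two clinical tables (hypothetical rows).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE observation (observation_id INTEGER, observation_concept_id INTEGER)"
)
conn.execute(
    "CREATE TABLE measurement (measurement_id INTEGER, measurement_concept_id INTEGER)"
)
conn.executemany(
    "INSERT INTO observation VALUES (?, ?)", [(1, 0), (2, 4000001), (3, 0)]
)
conn.executemany("INSERT INTO measurement VALUES (?, ?)", [(1, 3000001)])

result = count_unmapped(
    conn,
    [
        ("observation", "observation_concept_id"),
        ("measurement", "measurement_concept_id"),
    ],
)
print(result)  # {'observation': 2, 'measurement': 0}
```

Running the same counts before and after the hand-run ETL gives a rough measure of whether the unmapped rows went away.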
The same duplication of row IDs occurs in drug_exposure and measurement.
Also, source_to_concept_map is empty.