Issue with loadVocabFromCsv #198

katy-sadowski · 2024-07-14T18:21:44Z

I got the following error when trying to load the vocabulary tables from csv. It seems it's putting quotes around the whole list of column names in the insert, so it thinks that the list is the name of a single column. I checked and my table looks normal, with the 4 columns, as does the csv I'm inserting.

> cd <- DatabaseConnector::createConnectionDetails(
+   dbms     = "postgresql", 
+   server   = "localhost/ohdsi", 
+   user     = "postgres", 
+   password = "postgres", 
+   port     = 5432)
> 
> cdmSchema      <- "dbt_synthea_1k"
> cdmVersion     <- "5.4"
> syntheaVersion <- "3.0.0"
> syntheaSchema  <- "synthea_1k"
> syntheaFileLoc <- "~/Synthea/output/csv"
> vocabFileLoc   <- "~/Synthea/vocab_shard_1k"
> ETLSyntheaBuilder::LoadVocabFromCsv(connectionDetails = cd, cdmSchema = cdmSchema, vocabFileLoc = vocabFileLoc)
Connecting using PostgreSQL driver
Working on file ~/Synthea/vocab_shard_1k/concept_ancestor.csv
 - reading file 
 - type converting
 - uploading 263067 rows of data in 1 chunks.
  |==========================================================================================| 100%
Executing SQL took 0.605 secs
 - chunk uploading started on 2024-07-13 19:30:27 for rows 1 to 263067
  |                                                                                          |   0%Error in rJava::.jcall(batchedInsert, "Z", "executeBatch") : 
  java.sql.BatchUpdateException: Batch entry 0 INSERT INTO dbt_synthea_1k.concept_ancestor ("ancestor_concept_id,descendant_concept_id,min_levels_of_separation,max_levels_of_separation") VALUES(5.82111112743323E14) was aborted: ERROR: column "ancestor_concept_id,descendant_concept_id,min_levels_of_separat" of relation "concept_ancestor" does not exist
  Position: 46  Call getNextException to see other errors in the batch.


burrowse · 2024-07-24T20:32:53Z

@katy-sadowski Would you be able to send me the file you are loading? The only thing I can think of is that the function name is misleading: despite the "Csv" in the name, it expects a tab delimiter rather than a comma when it reads each file:

vocabTable <-
  data.table::fread(
    file = paste0(vocabFileLoc, "/", csv),
    stringsAsFactors = FALSE,
    header = TRUE,
    sep = "\t",
    na.strings = ""
  )
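
To illustrate the likely cause, here is a minimal sketch that reads a small comma-delimited file with the same fread() arguments. The temporary file and its three lines are made up for the demonstration; only the header mirrors the column list from the error message. With sep = "\t" no tabs are found, so each line should come back as a single field and the whole header becomes one column name.

```r
library(data.table)

# Made-up sample: a comma-delimited file with the concept_ancestor header.
csvPath <- tempfile(fileext = ".csv")
writeLines(
  c(
    "ancestor_concept_id,descendant_concept_id,min_levels_of_separation,max_levels_of_separation",
    "1,2,0,0",
    "1,3,1,1"
  ),
  csvPath
)

# Same arguments as the loader: the tab separator never matches,
# so the header line is treated as the name of one column.
asTab <- fread(file = csvPath, stringsAsFactors = FALSE, header = TRUE,
               sep = "\t", na.strings = "")
print(ncol(asTab))
print(names(asTab))

# Reading with a comma separator recovers the four expected columns.
asComma <- fread(file = csvPath, stringsAsFactors = FALSE, header = TRUE,
                 sep = ",", na.strings = "")
print(names(asComma))
```

A single column named after the whole header would explain why the INSERT quotes the full comma-separated list as one identifier.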

katy-sadowski · 2024-07-25T23:21:59Z

Ah, yep, that is it! Maybe a param could be added to specify the separator? (No rush - this is not blocking me 😄 )
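
A rough sketch of what such a parameter could look like, assuming the read step stays the same fread() call quoted above. readVocabFile and its sep argument are hypothetical names used for illustration only and are not part of ETLSyntheaBuilder; defaulting sep to a tab would keep the current behavior for existing callers.

```r
library(data.table)

# Hypothetical helper (not an ETLSyntheaBuilder function): the fread() call
# from the loader with the separator exposed as an argument.
readVocabFile <- function(vocabFileLoc, csv, sep = "\t") {
  data.table::fread(
    file = paste0(vocabFileLoc, "/", csv),
    stringsAsFactors = FALSE,
    header = TRUE,
    sep = sep,          # "\t" by default, "," for true CSV files
    na.strings = ""
  )
}

# Usage with a made-up comma-delimited file in a temporary directory.
demoDir <- tempdir()
writeLines(
  c(
    "ancestor_concept_id,descendant_concept_id,min_levels_of_separation,max_levels_of_separation",
    "1,2,0,0"
  ),
  file.path(demoDir, "concept_ancestor.csv")
)
commaTable <- readVocabFile(demoDir, "concept_ancestor.csv", sep = ",")
print(names(commaTable))
```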
