I am running into an (understandable) error when reading CSV files with pgloader. If a line has fewer columns than the table expects, pgloader throws a "list index out of range" error and the whole loading process stops.
Although the error is to be expected with junk CSV files, I would like to make pgloader a bit more resilient here: if it detects a line with fewer columns than expected, it should write that line to reject.log and skip it.
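To illustrate the behaviour being asked for, here is a minimal sketch in plain Python. It is not pgloader's actual code: the function name split_rows and the expected_columns parameter are hypothetical, and the rejected rows are simply returned instead of being written to reject.log.

```python
import csv
import io


def split_rows(lines, expected_columns, delimiter=","):
    """Separate well-formed rows from rows that have too few columns.

    Hypothetical helper for illustration only; pgloader's own reader
    is organised differently.
    """
    accepted, rejected = [], []
    for row in csv.reader(lines, delimiter=delimiter):
        if len(row) < expected_columns:
            # Too few columns: set the row aside instead of failing later
            # with an IndexError when a missing column is accessed.
            rejected.append(row)
            continue
        accepted.append(row)
    return accepted, rejected


# The second line has only two of the three expected columns.
sample = io.StringIO("1,alice,paris\n2,bob\n3,carol,rome\n")
accepted, rejected = split_rows(sample, expected_columns=3)
```

In a real loader the rejected rows would go to the reject log and loading would continue with the accepted ones.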
The problem is that I don't know exactly where in pgloader the error is raised. I am looking at readlines in pgloader.csvreader.CSVReader, but I can't pinpoint it there.
Is there a way to show the full stack trace in the pgloader logs when the "list index out of range" exception occurs? That would make finding the offending line a lot easier. Alternatively, can somebody point me to the correct location? Then I could change this behaviour myself.
OK, I made a commit in my own copy of the code that logs the full stack trace at debug level if an exception occurs in pgloader.pgloader.PGLoader.run (which already catches and logs the exception itself).
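The general idea of that change can be sketched with the standard logging and traceback modules. This is an assumption about the approach, not the actual commit: the logger name, run_with_trace and short_row are all hypothetical names made up for the example.

```python
import logging
import traceback

log = logging.getLogger("pgloader_sketch")  # hypothetical logger name


def run_with_trace(task):
    """Run task(); on failure log the error, plus the full trace at debug level."""
    try:
        task()
        return None
    except Exception as exc:
        # Short message at error level, as the existing handler would do.
        log.error("%s", exc)
        # Full stack trace only when debug logging is enabled.
        trace = traceback.format_exc()
        log.debug(trace)
        return trace


def short_row():
    # Simulates reading the third column of a two-column line.
    columns = ["1", "alice"]
    return columns[2]


trace = run_with_trace(short_row)
```

The returned trace names the function and line where the IndexError was raised, which is the information needed to locate the failing column access.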
Try running pgloader with the -d (--debug) option; that should provide the stack trace you're looking for. This is indeed a bug we need to fix. I'm curious about the test case (some data, the setup, the version, the stack trace).
Missing columns in the source file are now properly handled in pgloader (version 3).