Better error message for copy #2854

Merged
merged 1 commit into from
Feb 11, 2024
@@ -191,7 +191,7 @@ template<class T, bool IS_SIGNED = true>
inline void simpleIntegerCast(
const char* input, uint64_t len, T& result, LogicalTypeID typeID = LogicalTypeID::ANY) {
if (!trySimpleIntegerCast<T, IS_SIGNED>(input, len, result)) {
-throw ConversionException(stringFormat("Cast failed. {} is not in {} range.",
+throw ConversionException(stringFormat("Cast failed. Could not convert \"{}\" to {}.",
std::string{input, len}, LogicalTypeUtils::toString(typeID)));
}
}
10 changes: 7 additions & 3 deletions src/processor/operator/persistent/reader/csv/driver.cpp
@@ -35,9 +35,13 @@ void ParsingDriver::addValue(
stringFormat("Error in file {}, on line {}: expected {} values per row, but got more.",
reader->fileInfo->path, reader->getLineNumber(), reader->numColumns));
}

-function::CastString::copyStringToVector(
-chunk.getValueVector(columnIdx).get(), rowNum, value, &reader->option);
+try {
+function::CastString::copyStringToVector(
+chunk.getValueVector(columnIdx).get(), rowNum, value, &reader->option);
+} catch (ConversionException& e) {
+throw CopyException(stringFormat("Error in file {} on line {}: {}", reader->fileInfo->path,
+reader->getLineNumber(), e.what()));
+}
}

bool ParsingDriver::addRow(uint64_t /*rowNum*/, common::column_id_t columnCount) {
4 changes: 2 additions & 2 deletions test/test_files/csv/errors.test
@@ -30,10 +30,10 @@ Copy exception: Error in file ${KUZU_ROOT_DIRECTORY}/dataset/csv-error-tests/too

-STATEMENT LOAD FROM "${KUZU_ROOT_DIRECTORY}/dataset/csv-error-tests/union-no-conversion.csv" (HEADER=TRUE) RETURN *
---- error
-Conversion exception: Could not convert to union type UNION(u:UINT8, s:INT8): a.
+Copy exception: Error in file ${KUZU_ROOT_DIRECTORY}/dataset/csv-error-tests/union-no-conversion.csv on line 2: Conversion exception: Could not convert to union type UNION(u:UINT8, s:INT8): a.

# Test that errors in serial mode don't hang the database.
# File is large so the window for the race is large enough.
-STATEMENT LOAD FROM "${KUZU_ROOT_DIRECTORY}/dataset/csv-error-tests/large-conversion-failure.csv" (HEADER=TRUE, PARALLEL=FALSE) RETURN *
---- error
-Conversion exception: Cast failed. a is not in INT64 range.
+Copy exception: Error in file ${KUZU_ROOT_DIRECTORY}/dataset/csv-error-tests/large-conversion-failure.csv on line 1002: Conversion exception: Cast failed. Could not convert "a" to INT64.
7 changes: 7 additions & 0 deletions test/test_files/exceptions/copy/invalid_row.test
@@ -21,3 +21,10 @@
-STATEMENT COPY watch FROM "${KUZU_ROOT_DIRECTORY}/dataset/copy-fault-tests/invalid-row/eWatches.csv"
---- error
Runtime exception: Unable to find primary key value 5.

+-CASE InvalidHeader
+-STATEMENT create node table tableOfTypes (id INT64, int64Column INT64, doubleColumn DOUBLE, booleanColumn BOOLEAN, dateColumn DATE, timestampColumn TIMESTAMP, stringColumn STRING, listOfInt INT64[], PRIMARY KEY (id));
+---- ok
+-STATEMENT COPY tableOfTypes FROM "${KUZU_ROOT_DIRECTORY}/dataset/copy-test/node/csv/types_50k.csv"
+---- error
+Copy exception: Error in file ${KUZU_ROOT_DIRECTORY}/dataset/copy-test/node/csv/types_50k.csv on line 1: Conversion exception: Cast failed. Could not convert "id" to INT64.
6 changes: 3 additions & 3 deletions test/test_files/exceptions/copy/wrong_header.test
@@ -23,7 +23,7 @@ False
True
-STATEMENT COPY person2 FROM "${KUZU_ROOT_DIRECTORY}/dataset/copy-fault-tests/wrong-header/vPerson.csv" (HEADER=true)
---- error
-Conversion exception: Cast failed. Guodong is not in INT64 range.
+Copy exception: Error in file ${KUZU_ROOT_DIRECTORY}/dataset/copy-fault-tests/wrong-header/vPerson.csv on line 2: Conversion exception: Cast failed. Could not convert "Guodong" to INT64.
-STATEMENT COPY person3 FROM "${KUZU_ROOT_DIRECTORY}/dataset/copy-fault-tests/wrong-header/vPerson.csv" (HEADER=true)
---- error
Binder exception: Number of columns mismatch. Expected 1 but got 2.
@@ -39,7 +39,7 @@ Binder exception: Number of columns mismatch. Expected 2 but got 1.
Binder exception: Number of columns mismatch. Expected 3 but got 4.
-STATEMENT COPY knows FROM "${KUZU_ROOT_DIRECTORY}/dataset/copy-fault-tests/wrong-header/eKnowsMissingColumn.csv" (HEADER=true)
---- error
-Conversion exception: Error occurred during parsing interval. Field name is missing.
+Copy exception: Error in file ${KUZU_ROOT_DIRECTORY}/dataset/copy-fault-tests/wrong-header/eKnowsMissingColumn.csv on line 2: Conversion exception: Error occurred during parsing interval. Field name is missing.

-CASE ParquetHeaderMismatch
-STATEMENT CREATE NODE TABLE User(name STRING, age INT64, PRIMARY KEY (name));
@@ -85,7 +85,7 @@ Binder exception: Number of columns mismatch. Expected 3 but got 2.
---- ok
-STATEMENT COPY person FROM "${KUZU_ROOT_DIRECTORY}/dataset/tinysnb/vPerson.csv" (HEADER=true)
---- error
-Conversion exception: Cast failed. Alice is not in INT64 range.
+Copy exception: Error in file ${KUZU_ROOT_DIRECTORY}/dataset/tinysnb/vPerson.csv on line 2: Conversion exception: Cast failed. Could not convert "Alice" to INT64.

-CASE HeaderError
-STATEMENT create node table person (ID INT64, fName STRING, PRIMARY KEY (ID))
@@ -26,7 +26,7 @@
RETURN to_int64(foo_string) AS foo, to_int64(empty_string) AS empty;
## Outcome: the result should be, in any order:
---- error
-Conversion exception: Cast failed. foo is not in INT64 range.
+Conversion exception: Cast failed. Could not convert "foo" to INT64.

# `toInteger()` handling mixed number types
-CASE Scenario3
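
The core of the change in driver.cpp is a catch-and-rethrow: a conversion failure raised while casting one CSV field is wrapped in a copy error that names the file and line. Below is a minimal standalone sketch of that pattern, not the project's actual code. The two exception structs are hypothetical stand-ins for the `ConversionException` and `CopyException` types named in the diff, `castToInt64` and `parseCsvField` are made-up helpers, and the "Conversion exception: " / "Copy exception: " prefixes are assumed to be prepended by the exception classes so that the result matches the messages expected in the tests above.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for the exception types named in the diff.
// Assumption: each class prepends its own prefix to the message.
struct ConversionException : std::runtime_error {
    explicit ConversionException(const std::string& msg)
        : std::runtime_error("Conversion exception: " + msg) {}
};

struct CopyException : std::runtime_error {
    explicit CopyException(const std::string& msg)
        : std::runtime_error("Copy exception: " + msg) {}
};

// Hypothetical strict cast: succeeds only if the whole string parses as an integer.
inline std::int64_t castToInt64(const std::string& text) {
    try {
        std::size_t consumed = 0;
        const long long value = std::stoll(text, &consumed);
        if (consumed == text.size()) {
            return value;
        }
    } catch (const std::exception&) {
        // std::stoll rejected the input; report it below in the new message format.
    }
    throw ConversionException("Cast failed. Could not convert \"" + text + "\" to INT64.");
}

// Wraps the low-level conversion error with the file path and line number,
// mirroring the try/catch added around copyStringToVector in the diff.
inline std::int64_t parseCsvField(
    const std::string& path, std::uint64_t lineNumber, const std::string& field) {
    try {
        return castToInt64(field);
    } catch (const ConversionException& e) {
        throw CopyException("Error in file " + path + " on line " +
                            std::to_string(lineNumber) + ": " + e.what());
    }
}
```

Under these assumptions, calling `parseCsvField("data.csv", 2, "Alice")` throws a `CopyException` whose `what()` carries the file, the line, and the original conversion message. This is the same shape as the updated expectations in errors.test and wrong_header.test.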