PHOENIX-7267 CsvBulkLoadTool fails job due to a bad record with "(sta…#2399
xavifeds8 wants to merge 5 commits into
2404a9d to 41a8c42
With commons-csv 1.0, CsvBulkLoadTool would fail the entire MapReduce job when encountering a malformed CSV record. Sanity test for the upgrade: https://gist.github.com/xavifeds8/bd6015a1733ddbf630cbbdb453bdbc0d
Changes made:
Tests:
40f08c1 to ff599da
Hi @virajjasani, could you please trigger a CI run for this PR? Also, let me know if the changes look good to you.
@xavifeds8 the CI builds are temporarily broken. Could you run all CSV bulk load related tests, and some more, locally to confirm nothing is broken?
virajjasani left a comment
+1, need manual verification of build and some test results
    try (CSVParser csvParser =
        CSVParser.builder().setFormat(csvFormat).setReader(new StringReader(input)).get()) {
      return Iterables.getFirst(csvParser, null);
    } catch (UncheckedIOException e) {
Please ensure that the newer csvParser actually throws UncheckedIOException for a bad record. We should not catch anything more unless needed
Hey @NihalJain
https://github.com/apache/commons-csv/blob/6f93c7edfa0f758f757227b1d30588411fdbf669/src/main/java/org/apache/commons/csv/CSVParser.java#L234
In commons-csv 1.14.1, CSVParser wraps the IOException in an UncheckedIOException.
I have also added a UT to verify this behaviour: 25be349
[INFO] -------------------------------------------------------
[INFO] T E S T S
[INFO] -------------------------------------------------------
[INFO] Running org.apache.phoenix.mapreduce.CsvToKeyValueMapperTest
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.064 s -- in org.apache.phoenix.mapreduce.CsvToKeyValueMapperTest
[INFO]
[INFO] Results:
[INFO]
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 23.525 s
[INFO] Finished at: 2026-05-18T12:02:52+05:30
[INFO] ------------------------------------------------------------------------
% cat phoenix-core/target/surefire-reports/org.apache.phoenix.mapreduce.CsvToKeyValueMapperTest-output.txt
Exception type: java.io.UncheckedIOException
Message: org.apache.commons.csv.CSVException: (startline 1) EOF reached before encapsulated token finished
Cause type: org.apache.commons.csv.CSVException
Cause message: (startline 1) EOF reached before encapsulated token finished
phoenix %
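The pattern discussed in this thread (a parser surfacing a checked IOException as an UncheckedIOException, and the caller catching only that type so one bad record does not fail the whole run) can be sketched with the JDK alone. This is a minimal illustration, not Phoenix or commons-csv code: the class BadRecordSketch and its methods are hypothetical, and the quote-counting check is a toy stand-in for real CSV parsing.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BadRecordSketch {

    // Hypothetical toy parser: treats an odd number of double quotes as an
    // unterminated quoted field and reports it with a checked IOException.
    // Real CSV parsing is more involved; this only triggers the error path.
    static String[] parseStrict(String line) throws IOException {
        long quotes = line.chars().filter(c -> c == '"').count();
        if (quotes % 2 != 0) {
            throw new IOException("EOF reached before encapsulated token finished");
        }
        return line.split(",");
    }

    // Wraps the checked exception so it can escape an API without a
    // "throws IOException" clause (for example, an Iterator).
    static String[] parseUnchecked(String line) {
        try {
            return parseStrict(line);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Caller side: catch only UncheckedIOException, record the bad line,
    // and continue with the remaining input.
    static List<String[]> parseAll(List<String> lines, List<String> rejected) {
        List<String[]> good = new ArrayList<>();
        for (String line : lines) {
            try {
                good.add(parseUnchecked(line));
            } catch (UncheckedIOException e) {
                rejected.add(line);
            }
        }
        return good;
    }

    public static void main(String[] args) {
        List<String> rejected = new ArrayList<>();
        List<String[]> good =
            parseAll(Arrays.asList("1,alice", "2,\"bob", "3,carol"), rejected);
        System.out.println(good.size() + " parsed, " + rejected.size() + " rejected");
    }
}
```

Catching the narrow exception type keeps unrelated runtime failures visible, which is the concern raised in the review comment above.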
…rtline 1) EOF reached before encapsulated token finished"