
[Enhancement] When a CSV file includes a separator at the end of a line, the stream load should ignore it. #10025

@lide-reed


Search before asking

  • I had searched in the issues and found no similar issues.

Description

When a CSV file includes a separator at the end of a line, the stream load should ignore it.
The current code reports an error, and users must fix the file themselves.
For example:
```
+ curl --location-trusted -u admin:upbAGZ2495 -T nation.tbl -H label:nation-1651803743 -H 'column_separator:|' http://10.0.10.191:8030/api/tpchdb/nation/_stream_load
{
    "TxnId": 384,
    "Label": "nation-1651803743",
    "TwoPhaseCommit": "false",
    "Status": "Fail",
    "Message": "too many filtered rows",
    "NumberTotalRows": 25,
    "NumberLoadedRows": 0,
    "NumberFilteredRows": 25,
    "NumberUnselectedRows": 0,
    "LoadBytes": 2224,
    "LoadTimeMs": 5,
    "BeginTxnTimeMs": 0,
    "StreamLoadPutTimeMs": 0,
    "ReadDataTimeMs": 0,
    "WriteDataTimeMs": 3,
    "CommitAndPublishTimeMs": 0,
    "ErrorURL": "http://10.0.10.13:8040/api/_load_error_log?file=__shard_3/error_log_insert_stmt_4141b040d918db2c-53d1e1450b976b2_4141b040d918db2c_53d1e1450b976b2"
}
```

The error log reports:

```
Reason: actual column number is more than schema column number. actual number: 5, column separator: [|], line delimiter: [
], schema number: 4; . src line [0|ALGERIA|0| haggle. carefully final deposits detect slyly agai|];
```

Solution

Simply ignore a trailing separator at the end of a line when loading, instead of counting the empty field it produces as an extra column.
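As an illustration of the behavior described above (not the actual Doris source), a minimal C++ sketch of line splitting: a line like `0|ALGERIA|0|...|` splits into 5 fields because the trailing `|` yields an empty last field, which trips the 4-column schema check. Dropping that one empty trailing field restores the expected count. The helper name `split_line` is hypothetical.

```cpp
#include <string>
#include <vector>

// Hypothetical helper, not the actual Doris implementation: split a CSV line
// on a single-character separator, then discard the one empty trailing field
// that a separator at the end of the line produces.
std::vector<std::string> split_line(const std::string& line, char sep) {
    std::vector<std::string> fields;
    std::string::size_type start = 0;
    while (true) {
        auto pos = line.find(sep, start);
        if (pos == std::string::npos) {
            fields.push_back(line.substr(start));
            break;
        }
        fields.push_back(line.substr(start, pos - start));
        start = pos + 1;
    }
    // A trailing separator always yields exactly one extra empty field at the
    // end; ignoring it makes the column count match the schema again.
    if (!line.empty() && line.back() == sep) {
        fields.pop_back();
    }
    return fields;
}
```

With this change, `split_line("0|ALGERIA|0|...|", '|')` produces 4 fields rather than 5, so the row would no longer be filtered. Note this sketch deliberately drops only one trailing empty field, so a line ending in `||` (a genuinely empty last column plus a trailing separator) still resolves correctly.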

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

