[Bug] BE crash while load json data #34280

@cambyzju

Search before asking

  • I had searched in the issues and found no similar issues.

Version

1.2 & 2.0

The master branch does not crash, possibly because it uses a different version of the rapidjson library.

What's Wrong?

*** Query id: 1643dd385c317bb5-2370a4f3153c9ba0 ***
*** Aborted at 1714320572 (unix time) try "date -d @1714320572" if you are using GNU date ***
*** Current BE git commitID: a7d2ef9 ***
*** SIGSEGV address not mapped to object (@0x0) received by PID 98855 (TID 0x7f701866f700) from PID 0; stack trace: ***
 0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) in /usr/local/service/doris/lib/be/doris_be
 1# os::Linux::chained_handler(int, siginfo*, void*) in /usr/local/jdk/jre/lib/amd64/server/libjvm.so
 2# JVM_handle_linux_signal in /usr/local/jdk/jre/lib/amd64/server/libjvm.so
 3# signalHandler(int, siginfo*, void*) in /usr/local/jdk/jre/lib/amd64/server/libjvm.so
 4# 0x00007F7127FE0400 in /lib64/libc.so.6
 5# __strlen_sse2_pminub in /lib64/libc.so.6
 6# std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::compare(char const*) const in /usr/local/service/doris/lib/be/doris_be
 7# doris::vectorized::NewJsonReader::_vhandle_simple_json(std::vector<COW<doris::vectorized::IColumn>::mutable_ptr<doris::vectorized::IColumn>, std::allocator<COW<doris::vectorized::IColumn>::mutable_ptr<doris::vectorized::IColumn> > >&, std::vector<doris::SlotDescriptor*, std::allocator<doris::SlotDescriptor*> > const&, bool*, bool*) in /usr/local/service/doris/lib/be/doris_be
 8# doris::vectorized::NewJsonReader::get_next_block(doris::vectorized::Block*, unsigned long*, bool*) in /usr/local/service/doris/lib/be/doris_be
 9# doris::vectorized::VFileScanner::_get_block_impl(doris::RuntimeState*, doris::vectorized::Block*, bool*) in /usr/local/service/doris/lib/be/doris_be
10# doris::vectorized::VScanner::get_block(doris::RuntimeState*, doris::vectorized::Block*, bool*) in /usr/local/service/doris/lib/be/doris_be
11# doris::vectorized::ScannerScheduler::_scanner_scan(doris::vectorized::ScannerScheduler*, doris::vectorized::ScannerContext*, doris::vectorized::VScanner*) in /usr/local/service/doris/lib/be/doris_be
12# doris::ThreadPool::dispatch_thread() in /usr/local/service/doris/lib/be/doris_be
13# doris::Thread::supervise_thread(void*) in /usr/local/service/doris/lib/be/doris_be
14# start_thread in /lib64/libpthread.so.0
15# __clone in /lib64/libc.so.6
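
Frames 5 and 6 show strlen being reached from std::string::compare(char const*) with a fault at address 0x0, which is consistent with a null pointer being passed as the C string to compare. The following is only a minimal sketch of that failure mode and a defensive guard. The helper name_equals and its parameters are hypothetical and are not taken from the Doris source.

```cpp
#include <string>

// Hypothetical helper (not from the Doris code base): compares a column
// name with a C string that may be null.
bool name_equals(const std::string& column_name, const char* json_key) {
    // std::string::compare(const char*) requires a valid null-terminated
    // string. Passing nullptr is undefined behavior and typically faults
    // inside strlen, so reject it before comparing.
    if (json_key == nullptr) {
        return false;
    }
    return column_name.compare(json_key) == 0;
}
```

With this guard, name_equals("k1", nullptr) returns false instead of dereferencing a null pointer.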

What You Expected?

The BE should not crash.

How to Reproduce?

  1. Create a test table:
CREATE TABLE `test` (
  `k1` int(11) NULL,
  `v` int(11) NULL
) ENGINE=OLAP
DUPLICATE KEY(`k1`)
COMMENT 'OLAP'
DISTRIBUTED BY HASH(`k1`) BUCKETS 1
PROPERTIES (
"replication_num"="1");
  2. Prepare a malformed JSON file, test.json:
    [[{"k1":33,"v":44}]]

  3. Disable the simdjson reader in the BE config:
    enable_simdjson_reader=false

  4. Use Stream Load to load this JSON file with fuzzy_parse enabled:
    curl --location-trusted -u admin: -T test.json -H "fuzzy_parse:true" -H "strip_outer_array:true" -H "expect: 100-continue" -H "columns:k1,v" -H "format:json" http://127.0.0.1:8030/api/example_db/test/_stream_load
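
The file in the steps above is malformed in the sense that, with strip_outer_array:true, each element of the outer array is expected to be a JSON object (one row), whereas here the first element is itself an array. As a rough illustration of that shape mismatch, the sketch below inspects only the first element of the outer array. The function first_row_is_object is a hypothetical pre-check written for this example. It is not a JSON parser and not part of Doris or rapidjson.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical pre-check: returns true when the text starts with an outer
// array whose first element begins with '{', i.e. looks like an object.
// It inspects only leading characters and does not validate the JSON.
bool first_row_is_object(const std::string& json) {
    std::size_t i = 0;
    // Advance past any whitespace at the current position.
    auto skip_ws = [&]() {
        while (i < json.size() &&
               std::isspace(static_cast<unsigned char>(json[i]))) {
            ++i;
        }
    };
    skip_ws();
    if (i >= json.size() || json[i] != '[') {
        return false;  // no outer array to strip
    }
    ++i;  // step over the outer '['
    skip_ws();
    return i < json.size() && json[i] == '{';
}
```

For the file above, first_row_is_object("[[{\"k1\":33,\"v\":44}]]") returns false, while the well-formed variant "[{\"k1\":33,\"v\":44}]" returns true.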

Anything Else?

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!