[Fix](inverted index) try not to delete entries in DorisCompoundReader::close #36420

Merged
qidaye merged 1 commit into apache:master from airborne12:fix-compound-reader
Jun 18, 2024

@airborne12
Member

Proposed changes

Issue Number: close #xxx

Deleting entries in DorisCompoundReader::close() can lead to a coredump in certain cases, especially with problematic inverted index files. An example coredump stack trace follows:

*** SIGSEGV address not mapped to object (@0x20) received by PID 11696 (TID 14029 OR 0x7fe5edc8f700) from PID 32; stack trace: ***
 0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /root/doris/be/src/common/signal_handler.h:421
 1# os::Linux::chained_handler(int, siginfo*, void*) in /usr/local/software/jdk1.8.0_131/jre/lib/amd64/server/libjvm.so
 2# JVM_handle_linux_signal in /usr/local/software/jdk1.8.0_131/jre/lib/amd64/server/libjvm.so
 3# signalHandler(int, siginfo*, void*) in /usr/local/software/jdk1.8.0_131/jre/lib/amd64/server/libjvm.so
 4# 0x00007FEFF3CEA090 in /lib/x86_64-linux-gnu/libc.so.6
 5# doris::segment_v2::DorisCompoundReader::fileExists(char const*) const at /root/doris/be/src/olap/rowset/segment_v2/inverted_index_compound_reader.cpp:223
 6# lucene::index::SegmentInfos::_FindSegmentsFile::doRun() at /root/doris/be/src/clucene/src/core/CLucene/index/SegmentInfos.cpp:1074
 7# lucene::index::SegmentInfos::FindSegmentsFile<lucene::index::DirectoryIndexReader*>::run() at /root/doris/be/src/clucene/src/core/CLucene/index/_SegmentInfos.h:508
 8# lucene::index::DirectoryIndexReader::open(lucene::store::Directory*, int, bool, lucene::index::IndexDeletionPolicy*) at /root/doris/be/src/clucene/src/core/CLucene/index/DirectoryIndexReader.cpp:189
 9# lucene::index::IndexReader::open(lucene::store::Directory*, int, bool, lucene::index::IndexDeletionPolicy*) at /root/doris/be/src/clucene/src/core/CLucene/index/IndexReader.cpp:131
10# doris::segment_v2::FulltextIndexSearcherBuilder::build(lucene::store::Directory*, std::optional<std::variant<std::shared_ptr<lucene::search::IndexSearcher>, std::shared_ptr<lucene::util::bkd::bkd_reader> > >&) in /mnt/ssd01/pipline/OpenSourceDoris/clusterEnv/P0/Cluster0/be/lib/doris_be
11# doris::segment_v2::IndexSearcherBuilder::get_index_searcher(lucene::store::Directory*) at /root/doris/be/src/olap/rowset/segment_v2/inverted_index_searcher.cpp:101
12# doris::segment_v2::InvertedIndexReader::create_index_searcher(lucene::store::Directory*, std::variant<std::shared_ptr<lucene::search::IndexSearcher>, std::shared_ptr<lucene::util::bkd::bkd_reader> >*, doris::MemTracker*, doris::segment_v2::InvertedIndexReaderType) at /root/doris/be/src/olap/rowset/segment_v2/inverted_index_reader.cpp:237
13# doris::segment_v2::InvertedIndexReader::handle_searcher_cache(doris::segment_v2::InvertedIndexCacheHandle*, doris::OlapReaderStatistics*) in /mnt/ssd01/pipline/OpenSourceDoris/clusterEnv/P0/Cluster0/be/lib/doris_be
14# doris::segment_v2::FullTextIndexReader::query(doris::OlapReaderStatistics*, doris::RuntimeState*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, void const*, doris::segment_v2::InvertedIndexQueryType, std::shared_ptr<roaring::Roaring>&) at /root/doris/be/src/olap/rowset/segment_v2/inverted_index_reader.cpp:340
15# doris::segment_v2::InvertedIndexIterator::read_from_inverted_index(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, void const*, doris::segment_v2::InvertedIndexQueryType, unsigned int, std::shared_ptr<roaring::Roaring>&, bool) at /root/doris/be/src/olap/rowset/segment_v2/inverted_index_reader.cpp:1180
16# doris::MatchPredicate::evaluate(std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<doris::vectorized::IDataType const> > const&, doris::segment_v2::InvertedIndexIterator*, unsigned int, roaring::Roaring*) const at /root/doris/be/src/olap/match_predicate.cpp:69
17# doris::segment_v2::SegmentIterator::_apply_inverted_index_on_column_predicate(doris::ColumnPredicate*, std::vector<doris::ColumnPredicate*, std::allocator<doris::ColumnPredicate*> >&, bool*) at /root/doris/be/src/olap/rowset/segment_v2/segment_iterator.cpp:1008
18# doris::segment_v2::SegmentIterator::_apply_inverted_index() in /mnt/ssd01/pipline/OpenSourceDoris/clusterEnv/P0/Cluster0/be/lib/doris_be
19# doris::segment_v2::SegmentIterator::_get_row_ranges_by_column_conditions() at /root/doris/be/src/olap/rowset/segment_v2/segment_iterator.cpp:516
20# doris::segment_v2::SegmentIterator::_lazy_init() at /root/doris/be/src/olap/rowset/segment_v2/segment_iterator.cpp:365
21# doris::segment_v2::SegmentIterator::_next_batch_internal(doris::vectorized::Block*) at /root/doris/be/src/olap/rowset/segment_v2/segment_iterator.cpp:2176
22# doris::segment_v2::SegmentIterator::next_batch(doris::vectorized::Block*) at /root/doris/be/src/olap/rowset/segment_v2/segment_iterator.cpp:2090
23# doris::segment_v2::LazyInitSegmentIterator::next_batch(doris::vectorized::Block*) at /root/doris/be/src/olap/rowset/segment_v2/lazy_init_segment_iterator.h:45
24# doris::BetaRowsetReader::next_block(doris::vectorized::Block*) at /root/doris/be/src/olap/rowset/beta_rowset_reader.cpp:372
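
One plausible reading of the trace above is that fileExists() reached the entry table after close() had already released it. The sketch below illustrates that hazard and the "do not delete in close()" idea in isolation. It is not the Doris implementation: SketchCompoundReader, entries_, stream_open_ and lookup_survives_close are hypothetical names, and the real class layout may differ.

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for a compound-file directory reader. All names here
// are illustrative assumptions and are not taken from the Doris source.
class SketchCompoundReader {
public:
    SketchCompoundReader()
            : entries_(std::make_unique<std::map<std::string, long>>()) {}

    void add_entry(const std::string& name, long length) {
        (*entries_)[name] = length;
    }

    // The lookup dereferences the entry table, so the table must stay valid
    // for as long as any caller can still reach this object.
    bool file_exists(const std::string& name) const {
        return entries_->find(name) != entries_->end();
    }

    // close() only marks the underlying stream as released. The entry table
    // is left alone and is freed by the destructor (through unique_ptr).
    void close() {
        stream_open_ = false;
        // Deliberately NOT calling entries_.reset() here: a later
        // file_exists() call would then dereference a null pointer.
    }

    bool is_open() const { return stream_open_; }

private:
    std::unique_ptr<std::map<std::string, long>> entries_;
    bool stream_open_ = true;
};

// Returns true when lookups still behave sensibly after close().
bool lookup_survives_close() {
    SketchCompoundReader reader;
    reader.add_entry("segments.gen", 20);
    reader.close();
    return !reader.is_open() && reader.file_exists("segments.gen") &&
           !reader.file_exists("missing_file");
}
```

Tying the table's lifetime to the destructor means a caller that still holds the reader after an early close() gets a well-defined answer from the lookup rather than touching freed memory.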

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@airborne12
Member Author

run buildall

@github-actions
Contributor

clang-tidy review says "All clean, LGTM! 👍"

@doris-robot

TPC-H: Total hot run time: 40220 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 928e5aaa1cf1cce7953b93c2312b9726db8e9a6c, data reload: false

------ Round 1 ----------------------------------
q1	17622	4509	4299	4299
q2	2025	194	192	192
q3	10517	1148	1128	1128
q4	10199	779	794	779
q5	7537	2667	2578	2578
q6	220	137	135	135
q7	965	622	598	598
q8	9216	2059	2082	2059
q9	9030	6499	6458	6458
q10	8821	3674	3680	3674
q11	448	235	237	235
q12	492	232	229	229
q13	17772	2977	2965	2965
q14	268	216	226	216
q15	516	461	483	461
q16	548	376	375	375
q17	960	693	703	693
q18	8377	8031	7695	7695
q19	6676	1537	1490	1490
q20	676	335	327	327
q21	5107	3297	3329	3297
q22	417	337	338	337
Total cold run time: 118409 ms
Total hot run time: 40220 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4567	4555	4414	4414
q2	377	265	266	265
q3	3161	2919	2761	2761
q4	1919	1664	1699	1664
q5	5491	5525	5564	5525
q6	223	129	126	126
q7	2257	1809	1833	1809
q8	3266	3398	3407	3398
q9	8699	8643	8658	8643
q10	4060	3851	3896	3851
q11	592	487	482	482
q12	812	625	614	614
q13	16085	3185	3011	3011
q14	276	259	260	259
q15	514	473	486	473
q16	479	430	433	430
q17	1835	1524	1512	1512
q18	8205	7667	7735	7667
q19	1741	1534	1522	1522
q20	2893	1938	1887	1887
q21	6001	4865	4711	4711
q22	741	580	581	580
Total cold run time: 74194 ms
Total hot run time: 55604 ms

@doris-robot

TPC-DS: Total hot run time: 172332 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 928e5aaa1cf1cce7953b93c2312b9726db8e9a6c, data reload: false

query1	794	390	367	367
query2	6332	2459	2348	2348
query3	6467	204	213	204
query4	22334	17081	17288	17081
query5	3701	478	484	478
query6	259	168	177	168
query7	4502	303	293	293
query8	326	295	307	295
query9	8604	2410	2391	2391
query10	599	304	273	273
query11	10465	10189	9914	9914
query12	141	84	80	80
query13	1609	361	366	361
query14	10081	6187	6935	6187
query15	222	187	185	185
query16	7144	266	261	261
query17	1083	534	522	522
query18	1912	276	277	276
query19	195	152	154	152
query20	94	83	82	82
query21	212	128	129	128
query22	4324	3987	4062	3987
query23	33881	33625	33585	33585
query24	11314	2809	2907	2809
query25	647	372	358	358
query26	1372	151	149	149
query27	3009	324	321	321
query28	7550	2034	2055	2034
query29	993	622	591	591
query30	253	149	151	149
query31	942	757	742	742
query32	92	52	53	52
query33	765	286	274	274
query34	900	487	473	473
query35	747	672	631	631
query36	1117	931	960	931
query37	148	76	95	76
query38	2856	2743	2717	2717
query39	853	813	805	805
query40	210	123	135	123
query41	57	53	52	52
query42	113	95	99	95
query43	587	546	562	546
query44	1175	732	728	728
query45	194	165	160	160
query46	1062	748	697	697
query47	1863	1759	1759	1759
query48	375	311	293	293
query49	855	414	407	407
query50	757	388	406	388
query51	6791	6696	6604	6604
query52	110	88	97	88
query53	353	302	298	298
query54	851	452	442	442
query55	75	75	79	75
query56	277	281	268	268
query57	1186	1033	1038	1033
query58	255	233	240	233
query59	3588	3138	3174	3138
query60	325	270	266	266
query61	94	92	93	92
query62	623	447	444	444
query63	322	297	297	297
query64	8943	2210	1761	1761
query65	3184	3116	3099	3099
query66	1489	328	328	328
query67	15164	14844	14912	14844
query68	4494	537	537	537
query69	458	315	312	312
query70	1104	1051	1136	1051
query71	405	268	267	267
query72	7444	5462	5503	5462
query73	732	324	323	323
query74	5875	5522	5471	5471
query75	3362	2655	2727	2655
query76	2189	988	924	924
query77	456	307	307	307
query78	10364	10075	10171	10075
query79	2280	532	540	532
query80	2141	460	453	453
query81	603	220	222	220
query82	825	113	110	110
query83	301	171	169	169
query84	280	92	85	85
query85	2176	282	283	282
query86	488	310	332	310
query87	3301	3034	3142	3034
query88	3853	2354	2365	2354
query89	484	388	386	386
query90	1883	189	191	189
query91	129	103	103	103
query92	66	48	52	48
query93	2484	511	509	509
query94	1301	200	261	200
query95	411	318	320	318
query96	593	270	266	266
query97	3228	3099	3060	3060
query98	219	201	202	201
query99	1251	831	847	831
Total cold run time: 274897 ms
Total hot run time: 172332 ms

@doris-robot

ClickBench: Total hot run time: 30.69 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 928e5aaa1cf1cce7953b93c2312b9726db8e9a6c, data reload: false

query1	0.04	0.04	0.03
query2	0.09	0.04	0.04
query3	0.23	0.05	0.05
query4	1.68	0.06	0.07
query5	0.49	0.50	0.48
query6	1.12	0.73	0.75
query7	0.02	0.02	0.01
query8	0.05	0.04	0.05
query9	0.55	0.50	0.48
query10	0.54	0.55	0.54
query11	0.15	0.11	0.12
query12	0.15	0.12	0.12
query13	0.60	0.58	0.60
query14	0.78	0.79	0.79
query15	0.82	0.80	0.84
query16	0.36	0.37	0.38
query17	1.00	1.04	1.01
query18	0.23	0.25	0.23
query19	1.79	1.68	1.68
query20	0.01	0.01	0.01
query21	15.40	0.65	0.64
query22	4.11	7.70	1.88
query23	18.44	1.33	1.25
query24	2.21	0.22	0.21
query25	0.14	0.08	0.08
query26	0.27	0.18	0.17
query27	0.08	0.07	0.08
query28	13.24	1.01	1.00
query29	12.71	3.28	3.24
query30	0.26	0.06	0.06
query31	2.84	0.40	0.38
query32	3.28	0.47	0.47
query33	2.90	2.88	2.99
query34	17.33	4.70	4.66
query35	4.78	4.61	4.49
query36	0.65	0.46	0.49
query37	0.17	0.16	0.15
query38	0.15	0.14	0.14
query39	0.04	0.04	0.04
query40	0.15	0.15	0.14
query41	0.09	0.05	0.04
query42	0.05	0.05	0.04
query43	0.05	0.04	0.04
Total cold run time: 110.04 s
Total hot run time: 30.69 s

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 36.48% (9003/24679)
Line Coverage: 28.02% (73742/263194)
Region Coverage: 27.49% (38307/139334)
Branch Coverage: 24.18% (19520/80730)
Coverage Report: http://coverage.selectdb-in.cc/coverage/928e5aaa1cf1cce7953b93c2312b9726db8e9a6c_928e5aaa1cf1cce7953b93c2312b9726db8e9a6c/report/index.html

Contributor

@xiaokang xiaokang left a comment

LGTM

@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions bot added the "approved" label (Indicates a PR has been approved by one committer.) on Jun 18, 2024
@github-actions
Contributor

PR approved by anyone and no changes requested.

@qidaye qidaye merged commit ab18c0f into apache:master Jun 18, 2024
airborne12 added a commit to airborne12/apache-doris that referenced this pull request Jun 18, 2024
…r::close (apache#36420)

Deleting entries in DorisCompoundReader::close() might lead to a
coredump in certain cases, especially with problematic inverted index
files.
airborne12 added a commit that referenced this pull request Jun 18, 2024
…leak (#36387)

## Proposed changes

Issue Number: close #xxx

Pick from #36146 #36420
dataroaring pushed a commit that referenced this pull request Jun 21, 2024
…r::close (#36420)

Deleting entries in DorisCompoundReader::close() might lead to a
coredump in certain cases, especially with problematic inverted index
files.

Labels

approved (Indicates a PR has been approved by one committer.), dev/2.1.4-merged, dev/3.0.0-merged, p0_c, reviewed

5 participants