searchd crashes when using match() query with SENTENCING #187

Closed
vishnushettigar opened this issue Apr 25, 2019 · 33 comments

7 participants
@vishnushettigar
commented Apr 25, 2019

Describe the environment

description:
product: ROG ZENITH EXTREME
vendor: ASUSTeK COMPUTER INC.
physical id: 0
version: Rev 1.xx
Type - 64 bits
RAM : 16GB
CPU : AMD Ryzen Threadripper 1950X 16-Core Processor

Manticore Search version:
Manticore 3.0.0 1f3ff17b@190423 dev

OS version:
Distributor ID: Ubuntu
Description: Ubuntu 18.04.2 LTS
Release: 18.04
Codename: bionic

Describe the problem

Once enough data is indexed (around 500,000 rows in the RT index), searchd crashes when I run a MATCH query with the SENTENCE operator.

Description of the issue:
This does not happen for every SENTENCE query. As far as I can tell, it happens only for queries whose match count is greater than 300.

Steps to reproduce:
We used the same data set you provided for benchmarking, but with an RT index instead of a plain index.

Sphinx configuration:

index full
{
    type = rt
    path = /media/m2/indexed_data/manticore3/
    html_strip = 1
    mlock = 1

    rt_attr_uint = story_id
    rt_attr_uint = comment_ranking
    rt_attr_uint = author_comment_count
    rt_attr_uint = story_comment_count

    rt_attr_timestamp = story_time

    rt_field = story_text
    rt_field = story_author
    rt_field = comment_author
    rt_field = comment_text

    rt_attr_uint = comment_id

    rt_attr_string = story_author
    rt_attr_string = comment_author
    rt_attr_string = story_url
}

searchd
{
    listen = 9306:mysql41
    query_log = /var/log/query.log
    log = /var/log/searchd.log
    pid_file = /tmp/searchd.pid
    binlog_path = /media/m2/indexed_data/manticore3/
    qcache_max_bytes = 0
}

This is the SQL query we used:

SELECT * FROM full WHERE MATCH('node SENTENCE needs'); SHOW META;

Messages from log files:

------- FATAL: CRASH DUMP -------            
[Thu Apr 25 12:08:25.728 2019] [ 1697]            
            
--- crashed SphinxQL request dump ---            
SELECT * FROM full WHERE MATCH('node SENTENCE needs')            
--- request dump end ---            
Manticore 3.0.0 1@1 dev            
Handling signal 11            
-------------- backtrace begins here ---------------            
Program compiled with 7            
Configured with flags: Configured by CMake with these definitions: -DCMAKE_BUILD_TYPE=RelWithDebInfo -DDL_UNIXODBC=1 -DUNIXODBC_LIB=libodbc.so.2 -DDL_EXPAT=1 -DEXPAT_LIB=libexpat.so.1 -DUSE_LIBICONV=1 -DDL_MYSQL=1 -DMYSQL_LIB=libmysqlclient.so.20 -DDL_PGSQL=1 -DPGSQL_LIB=libpq.so.5 -DDATADIR=/usr/local/var/data -DFULL_SHARE_DIR=/usr/local/share/manticore -DUSE_BISON=ON -DUSE_FLEX=ON -DUSE_SYSLOG=1 -DWITH_EXPAT=ON -DWITH_ICONV=ON -DWITH_MYSQL=1 -DWITH_ODBC=ON -DWITH_PGSQL=ON -DWITH_RE2=1 -DWITH_STEMMER=ON -DWITH_ZLIB=ON -DGALERA_SOVERSION=31            
Host OS is Linux mispl-wkstn-25 4.15.0-47-generic #50-Ubuntu SMP Wed Mar 13 10:44:52 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux            
Stack bottom = 0x7f45b4864ed7, thread stack size = 0x100000            
Trying manual backtrace:            
Something wrong with thread stack, manual backtrace may be incorrect (fp=0x562b7e02dcd0)            
Wrong stack limit or frame pointer, manual backtrace failed (fp=0x562b7e02dcd0, stack=0x7f45b4860000, stacksize=0x100000)            
Trying system backtrace:            
begin of system symbols:            
./searchd(_Z12sphBacktraceib+0xcb)[0x562b7dacc67b]
./searchd(_ZN16SphCrashLogger_c11HandleCrashEi+0x18a)[0x562b7d9515da]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x12890)[0x7f45b65ff890]
./searchd(_ZN11ExtRanker_TILb1EE15GetFilteredDocsEv+0xe2)[0x562b7dbc7be2]
./searchd(_ZN17ExtRanker_State_TI24RankerState_Proximity_fnILb1ELb0EELb1EE10GetMatchesEv+0x7e3)[0x562b7dbd4423]
./searchd(_ZNK13CSphIndex_VLN13MatchExtendedILb0ELb0ELb0EEEvP16CSphQueryContextPK9CSphQueryiPP15ISphMatchSorterP10ISphRankerii+0x83)[0x562b7da708f3]
./searchd(_ZNK13CSphIndex_VLN16ParsedMultiQueryEPK9CSphQueryP15CSphQueryResultiPP15ISphMatchSorterRK9XQQuery_tP8CSphDictRK18CSphMultiQueryArgsP18CSphQueryNodeCacheRK20SphWordStatChecker_t+0x1000)[0x562b7da4f830]
./searchd(_ZNK13CSphIndex_VLN10MultiQueryEPK9CSphQueryP15CSphQueryResultiPP15ISphMatchSorterRK18CSphMultiQueryArgs+0x656)[0x562b7da5fa46]
./searchd(_ZNK9RtIndex_c10MultiQueryEPK9CSphQueryP15CSphQueryResultiPP15ISphMatchSorterRK18CSphMultiQueryArgs+0x893)[0x562b7dbfe223]
./searchd(_ZNK9RtIndex_c12MultiQueryExEiPK9CSphQueryPP15CSphQueryResultPP15ISphMatchSorterRK18CSphMultiQueryArgs+0x79)[0x562b7dc00a79]
./searchd(_ZN15SearchHandler_c16RunLocalSearchesEv+0x6df)[0x562b7d9948bf]
./searchd(_ZN15SearchHandler_c9RunSubsetEii+0x27db)[0x562b7d9a8d9b]
./searchd(_ZN15SearchHandler_c10RunQueriesEv+0xb5)[0x562b7d9a9635]
./searchd(_Z17HandleMysqlSelectR14SqlRowBuffer_cR15SearchHandler_c+0x142)[0x562b7d9a9d82]
./searchd(_ZN16CSphinxqlSession7ExecuteERK10CSphStringR16ISphOutputBufferRhR9ThdDesc_t+0xec2)[0x562b7d9daec2]
./searchd(+0x202a40)[0x562b7d9afa40]
./searchd(_ZN10ThdJobQL_t4CallEv+0x1e3)[0x562b7d9b0473]
./searchd(_ZN11CSphThdPool4TickEPv+0x99)[0x562b7dad99e9]
./searchd(_Z20sphThreadProcWrapperPv+0x35)[0x562b7dad71a5]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76db)[0x7f45b65f46db]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x3f)[0x7f45b53b988f]
-------------- backtrace ends here ---------------            
            

searchd.log

@githubmanticore githubmanticore added the bug label Apr 25, 2019

@glookka

Contributor

commented Apr 25, 2019

Fixed in 850ae85

@glookka glookka closed this Apr 25, 2019

@vishnushettigar

Author

commented Apr 25, 2019

@glookka, @githubmanticore With a different data set we got a similar error with the new build:

------- FATAL: CRASH DUMP -------
[Thu Apr 25 11:39:09.001 2019] [11148]

--- crashed SphinxQL request dump ---
select count(*) from some_rt_index where match('courts SENTENCE courts')
--- request dump end ---
Manticore 3.0.0 30e7e553@190425 dev
Handling signal 11
-------------- backtrace begins here ---------------
Program compiled with 7
Configured with flags: Configured by CMake with these definitions: -DCMAKE_BUILD_TYPE=RelWithDebInfo -DDL_UNIXODBC=1 -DUNIXODBC_LIB=libodbc.so.2 -DDL_EXPAT=1 -DEXPAT_LIB=libexpat.so.1 -DUSE_LIBICONV=1 -DDL_MYSQL=1 -DMYSQL_LIB=libmysqlclient.so.20 -DDATADIR=/usr/local/var/data -DFULL_SHARE_DIR=/usr/local/share/manticore -DUSE_BISON=ON -DUSE_FLEX=ON -DUSE_SYSLOG=1 -DWITH_EXPAT=ON -DWITH_ICONV=ON -DWITH_MYSQL=1 -DWITH_ODBC=ON -DWITH_RE2=1 -DWITH_ZLIB=ON -DGALERA_SOVERSION=31
Host OS is Linux mispl-wkstn-25 4.15.0-47-generic #50-Ubuntu SMP Wed Mar 13 10:44:52 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Stack bottom = 0x7f633f5dded7, thread stack size = 0x20000
Trying manual backtrace:
Something wrong with thread stack, manual backtrace may be incorrect (fp=0x561eefb58c10)
Wrong stack limit or frame pointer, manual backtrace failed (fp=0x561eefb58c10, stack=0x7f633f5e0000, stacksize=0x20000)
Trying system backtrace:
begin of system symbols:
./searchd(_Z12sphBacktraceib+0xcb)[0x561eef62bd9b]
./searchd(_ZN16SphCrashLogger_c11HandleCrashEi+0x18a)[0x561eef4ae0ea]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x12890)[0x7f633f1bb890]
./searchd(_ZN9ExtUnit_c12GetDocsChunkEv+0x35d)[0x561eef7b11fd]
./searchd(_ZN11ExtRanker_TILb1EE15GetFilteredDocsEv+0x58)[0x561eef728908]
./searchd(_ZN17ExtRanker_State_TI24RankerState_Proximity_fnILb1ELb1EELb1EE10GetMatchesEv+0x6ae)[0x561eef72e0fe]
./searchd(_ZNK9RtIndex_c10MultiQueryEPK9CSphQueryP15CSphQueryResultiPP15ISphMatchSorterRK18CSphMultiQueryArgs+0x29ff)[0x561eef76111f]
./searchd(_ZNK9RtIndex_c12MultiQueryExEiPK9CSphQueryPP15CSphQueryResultPP15ISphMatchSorterRK18CSphMultiQueryArgs+0x79)[0x561eef761809]
./searchd(_ZN15SearchHandler_c16RunLocalSearchesEv+0x6df)[0x561eef4f29ef]
./searchd(_ZN15SearchHandler_c9RunSubsetEii+0x27db)[0x561eef506d4b]
./searchd(_ZN15SearchHandler_c10RunQueriesEv+0xb5)[0x561eef5075e5]
./searchd(_Z17HandleMysqlSelectR14SqlRowBuffer_cR15SearchHandler_c+0x142)[0x561eef507d32]
./searchd(_ZN16CSphinxqlSession7ExecuteERK10CSphStringR16ISphOutputBufferRhR9ThdDesc_t+0x704)[0x561eef538984]
./searchd(+0x1eb9f0)[0x561eef50d9f0]
./searchd(_ZN10ThdJobQL_t4CallEv+0x1e3)[0x561eef50e423]
./searchd(_ZN11CSphThdPool4TickEPv+0x99)[0x561eef639109]
./searchd(_Z20sphThreadProcWrapperPv+0x35)[0x561eef6368c5]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76db)[0x7f633f1b06db]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x3f)[0x7f633df7588f]
-------------- backtrace ends here ---------------
Please, create a bug report in our bug tracker (https://github.com/manticoresoftware/manticore/issues)
and attach there:
a) searchd log, b) searchd binary, c) searchd symbols.
Look into the chapter 'Reporting bugs' in the documentation
(http://docs.manticoresearch.com/latest/html/reporting_bugs.html)
[Thu Apr 25 11:39:09.194 2019] [11147] watchdog: got USR1, performing dump of child's stack
Will run gdb on /home/codaxtr_user/searchd, pid 11148
You can obtain the sources of this version from https://github.com/manticoresoftware/manticoresearch/archive/30e7e553.zip
and set up debug env with this shippet (select wget or curl version below):

  wget https://codeload.github.com/manticoresoftware/manticoresearch/zip/30e7e553 -O manticore.zip
  curl https://codeload.github.com/manticoresoftware/manticoresearch/zip/30e7e553 -o manticore.zip

Unpack the sources by command:
  mkdir -p /tmp/manticore && unzip manticore.zip -d /tmp/manticore

For comfortable debug also suggest to append a substitution def to your ~/.gdbinit file:
  set substitute-path "/home/mis/Projects/manticore" /tmp/manticore/manticoresearch-30e7e553
--- 1 active threads ---
thd 0, proto sphinxql, state query, command select
------- CRASH DUMP END -------
[Thu Apr 25 11:39:12.198 2019] [11147] watchdog: main process 11148 crashed via CRASH_EXIT (exit code 2), will be restarted
@glookka

Contributor

commented Apr 25, 2019

Fixed this crash in dd7be98

@vishnushettigar

Author

commented Apr 25, 2019

@glookka, @githubmanticore We are still getting the same error with the new build.
This does not happen for every SENTENCE query. As far as I can tell, it happens only for queries whose match count is greater than 300.

------- FATAL: CRASH DUMP -------
[Thu Apr 25 13:54:50.336 2019] [32379]

--- crashed SphinxQL request dump ---
select count(*) from some_rt_index where match('courts SENTENCE courts')
--- request dump end ---
Manticore 3.0.0 eb386a09@190425 dev
Handling signal 11
-------------- backtrace begins here ---------------
Program compiled with 7
Configured with flags: Configured by CMake with these definitions: -DCMAKE_BUILD_TYPE=RelWithDebInfo -DDL_UNIXODBC=1 -DUNIXODBC_LIB=libodbc.so.2 -DDL_EXPAT=1 -DEXPAT_LIB=libexpat.so.1 -DUSE_LIBICONV=1 -DDL_MYSQL=1 -DMYSQL_LIB=libmysqlclient.so.20 -DDATADIR=/usr/local/var/data -DFULL_SHARE_DIR=/usr/local/share/manticore -DUSE_BISON=ON -DUSE_FLEX=ON -DUSE_SYSLOG=1 -DWITH_EXPAT=ON -DWITH_ICONV=ON -DWITH_MYSQL=1 -DWITH_ODBC=ON -DWITH_RE2=1 -DWITH_ZLIB=ON -DGALERA_SOVERSION=31
Host OS is Linux mispl-wkstn-25 4.15.0-47-generic #50-Ubuntu SMP Wed Mar 13 10:44:52 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Stack bottom = 0x7f6cdbca6ed7, thread stack size = 0x20000
Trying manual backtrace:
Something wrong with thread stack, manual backtrace may be incorrect (fp=0x55801817dc10)
Wrong stack limit or frame pointer, manual backtrace failed (fp=0x55801817dc10, stack=0x7f6cdbcb0000, stacksize=0x20000)
Trying system backtrace:
begin of system symbols:
./searchd(_Z12sphBacktraceib+0xcb)[0x558017c50d9b]
./searchd(_ZN16SphCrashLogger_c11HandleCrashEi+0x18a)[0x558017ad30ea]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x12890)[0x7f6cdb8a5890]
./searchd(_ZN9ExtUnit_c12GetDocsChunkEv+0x35d)[0x558017dd61ad]
./searchd(_ZN11ExtRanker_TILb1EE15GetFilteredDocsEv+0x58)[0x558017d4d908]
./searchd(_ZN17ExtRanker_State_TI24RankerState_Proximity_fnILb1ELb1EELb1EE10GetMatchesEv+0x6ae)[0x558017d530fe]
./searchd(_ZNK9RtIndex_c10MultiQueryEPK9CSphQueryP15CSphQueryResultiPP15ISphMatchSorterRK18CSphMultiQueryArgs+0x29ff)[0x558017d8611f]
./searchd(_ZNK9RtIndex_c12MultiQueryExEiPK9CSphQueryPP15CSphQueryResultPP15ISphMatchSorterRK18CSphMultiQueryArgs+0x79)[0x558017d86809]
./searchd(_ZN15SearchHandler_c16RunLocalSearchesEv+0x6df)[0x558017b179ef]
./searchd(_ZN15SearchHandler_c9RunSubsetEii+0x27db)[0x558017b2bd4b]
./searchd(_ZN15SearchHandler_c10RunQueriesEv+0xb5)[0x558017b2c5e5]
./searchd(_Z17HandleMysqlSelectR14SqlRowBuffer_cR15SearchHandler_c+0x142)[0x558017b2cd32]
./searchd(_ZN16CSphinxqlSession7ExecuteERK10CSphStringR16ISphOutputBufferRhR9ThdDesc_t+0x704)[0x558017b5d984]
./searchd(+0x1eb9f0)[0x558017b329f0]
./searchd(_ZN10ThdJobQL_t4CallEv+0x1e3)[0x558017b33423]
./searchd(_ZN11CSphThdPool4TickEPv+0x99)[0x558017c5e109]
./searchd(_Z20sphThreadProcWrapperPv+0x35)[0x558017c5b8c5]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76db)[0x7f6cdb89a6db]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x3f)[0x7f6cda65f88f]
-------------- backtrace ends here ---------------
Please, create a bug report in our bug tracker (https://github.com/manticoresoftware/manticore/issues)
and attach there:
a) searchd log, b) searchd binary, c) searchd symbols.
Look into the chapter 'Reporting bugs' in the documentation
(http://docs.manticoresearch.com/latest/html/reporting_bugs.html)
[Thu Apr 25 13:54:50.717 2019] [32378] watchdog: got USR1, performing dump of child's stack
Will run gdb on /home/codaxtr_user/searchd, pid 32379
You can obtain the sources of this version from https://github.com/manticoresoftware/manticoresearch/archive/eb386a09.zip
and set up debug env with this shippet (select wget or curl version below):

  wget https://codeload.github.com/manticoresoftware/manticoresearch/zip/eb386a09 -O manticore.zip
  curl https://codeload.github.com/manticoresoftware/manticoresearch/zip/eb386a09 -o manticore.zip

Unpack the sources by command:
  mkdir -p /tmp/manticore && unzip manticore.zip -d /tmp/manticore

For comfortable debug also suggest to append a substitution def to your ~/.gdbinit file:
  set substitute-path "/home/mis/Projects/manticore" /tmp/manticore/manticoresearch-eb386a09
--- 1 active threads ---
thd 0, proto sphinxql, state query, command select
------- CRASH DUMP END -------
[Thu Apr 25 13:54:53.726 2019] [32378] watchdog: main process 32379 crashed via CRASH_EXIT (exit code 2), will be restarted

@githubmanticore

Contributor

commented Apr 25, 2019

➤ Ilya Kuznetsov commented:

Could you upload your index and config to our write-only FTP?

ftp: dev.manticoresearch.com
user: manticorebugs
pass: shithappens

as described here: https://github.com/manticoresoftware/manticoresearch/wiki/Write-only-FTP

@vishnushettigar

Author

commented Apr 26, 2019

@githubmanticore @glookka We have uploaded the data needed to reproduce the bug to the FTP server, in the directory github-issue-187.

@githubmanticore

Contributor

commented Apr 26, 2019

➤ Ilya Kuznetsov commented:

Should be fixed in 7e13d37

@vishnushettigar

Author

commented Apr 26, 2019

➤ Ilya Kuznetsov commented:

Should be fixed in 7e13d37

Verified that it is fixed. Thanks.

@glookka glookka closed this Apr 26, 2019

@abhijo89-uc

commented May 6, 2019

@glookka @vishnushettigar @manticoresearch It seems this issue still exists.

After indexing 55,000,000 records

------- FATAL: CRASH DUMP -------
[Mon May  6 05:09:01.236 2019] [35774]

--- crashed SphinxQL request dump ---
SELECT student_id, student_info_id, last_updated_ts, data.cc,SNIPPET(student_name,
 '(john)', 'limit=200', 'before_match=<b>', 'after_match=</b>', 'allow_empty=0','weight_order=1') 
as student_name_snippet, 
        weight() AS weight_sum FROM all_student WHERE MATCH('(@student_fullname(john 
SENTENCE "Computer"))') ORDER BY joining_date_ts DESC LIMIT 0,10 OPTION ranker=bm25,
 max_matches=1000;SHOW META
--- request dump end ---
Manticore 3.0.0 2373d4b2@190426 dev
Handling signal 11
-------------- backtrace begins here ---------------
Program compiled with 7
Configured with flags: Configured by CMake with these definitions: -DCMAKE_BUILD_TYPE=RelWithDebInfo -DDL_UNIXODBC=1 -DUNIXODBC_LIB=libodbc.so.2 -DDL_EXPAT=1 -DEXPAT_LIB=libexpat.so.1 -DUSE_LIBICONV=1 -DDL_MYSQL=1 -DMYSQL_LIB=libmysqlclient.so.20 -DDATADIR=/usr/local/var/data -DFULL_SHARE_DIR=/usr/local/share/manticore -DUSE_BISON=ON -DUSE_FLEX=ON -DUSE_SYSLOG=1 -DWITH_EXPAT=ON -DWITH_ICONV=ON -DWITH_MYSQL=1 -DWITH_ODBC=ON -DWITH_RE2=1 -DWITH_ZLIB=ON -DGALERA_SOVERSION=31
Host OS is Linux mispl-wkstn-25 4.15.0-47-generic #50-Ubuntu SMP Wed Mar 13 10:44:52 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Stack bottom = 0x7f67f0bd9ed7, thread stack size = 0x20000
Trying manual backtrace:
Something wrong with thread stack, manual backtrace may be incorrect (fp=0x5596558d0c10)
Wrong stack limit or frame pointer, manual backtrace failed (fp=0x5596558d0c10, stack=0x7f67f0be0000, stacksize=0x20000)
Trying system backtrace:
begin of system symbols:
/usr/bin/searchd(_Z12sphBacktraceib+0xcb)[0x5596553a3d9b]
/usr/bin/searchd(_ZN16SphCrashLogger_c11HandleCrashEi+0x18a)[0x5596552260ea]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x12890)[0x7f67f68b8890]
/usr/bin/searchd(_ZN9ExtUnit_c12GetDocsChunkEv+0x391)[0x5596555291e1]
/usr/bin/searchd(_ZN11ExtRanker_TILb1EE15GetFilteredDocsEv+0x58)[0x5596554a0908]
/usr/bin/searchd(_ZN21ExtRanker_WeightSum_cILb1EE10GetMatchesEv+0x17a)[0x5596554a479a]
/usr/bin/searchd(_ZNK13CSphIndex_VLN13MatchExtendedILb1ELb0ELb0EEEvP16CSphQueryContextPK9CSphQueryiPP15ISphMatchSorterP10ISphRankerii+0x8b)[0x559655349f8b]
/usr/bin/searchd(_ZNK13CSphIndex_VLN16ParsedMultiQueryEPK9CSphQueryP15CSphQueryResultiPP15ISphMatchSorterRK9XQQuery_tP8CSphDictRK18CSphMultiQueryArgsP18CSphQueryNodeCacheRK20SphWordStatChecker_t+0x1000)[0x5596553269d0]
/usr/bin/searchd(_ZNK13CSphIndex_VLN10MultiQueryEPK9CSphQueryP15CSphQueryResultiPP15ISphMatchSorterRK18CSphMultiQueryArgs+0x656)[0x559655336b76]
/usr/bin/searchd(_ZNK9RtIndex_c10MultiQueryEPK9CSphQueryP15CSphQueryResultiPP15ISphMatchSorterRK18CSphMultiQueryArgs+0x883)[0x5596554d6fa3]
/usr/bin/searchd(_ZNK9RtIndex_c12MultiQueryExEiPK9CSphQueryPP15CSphQueryResultPP15ISphMatchSorterRK18CSphMultiQueryArgs+0x79)[0x5596554d9809]
/usr/bin/searchd(_ZN15SearchHandler_c16RunLocalSearchMTER13LocalSearch_tR13ThreadLocal_t+0x401)[0x55965522e9a1]
/usr/bin/searchd(_Z21LocalSearchThreadFuncPv+0x1b8)[0x55965522ebf8]
/usr/bin/searchd(_ZN16SphCrashLogger_c13ThreadWrapperEPv+0x47)[0x559655225827]
/usr/bin/searchd(_Z20sphThreadProcWrapperPv+0x35)[0x5596553ae8c5]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76db)[0x7f67f68ad6db]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x3f)[0x7f67f567288f]
-------------- backtrace ends here ---------------
@githubmanticore

Contributor

commented May 6, 2019

➤ Ilya Kuznetsov commented:

How large is your index? Could you upload the index and config to our write-only FTP?

ftp: dev.manticoresearch.com
user: manticorebugs
pass: shithappens

as described here: https://github.com/manticoresoftware/manticoresearch/wiki/Write-only-FTP

@abhijo89-uc

commented May 6, 2019

@githubmanticore: Unfortunately it's confidential data that we can't share. We tried to reproduce it with a different name and the query works; it fails only with this query. Perhaps "John" is such a common name that it matches about 60% of the data set. If you make changes, we can build the binary and test it on our end.
Total index size: ~400 GB (as reported by du -shc * in the index folder).

@githubmanticore

Contributor

commented May 6, 2019

➤ Ilya Kuznetsov commented:

Can you start searchd with a --coredump option and upload your binary and the core dump it generated after the crash?

@abhijo89-uc

commented May 6, 2019

@githubmanticore: Where can I find the core dump file?

@tomatolog

Contributor

commented May 6, 2019

This should show you the location of the core dump file:

cat /proc/sys/kernel/core_pattern

You can also change it.

You also need to check that ulimit -c is large enough for the user that starts searchd.
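
Editorial note: a minimal sketch of the two checks above, done from Python on Linux. The function names are hypothetical. It reads the pattern from the file itself (entries under /proc report a size of 0 in directory listings, so the content has to be read) and looks at the core size limit of the current process, which only matches searchd's limit if the script runs as the same user in the same environment.

```python
import resource


def classify_core_pattern(pattern):
    """Classify a kernel.core_pattern value by where the dump ends up."""
    pattern = pattern.strip()
    if not pattern:
        return "empty"
    if pattern.startswith("|"):
        return "pipe"      # the dump is piped to a helper program
    if pattern.startswith("/"):
        return "absolute"  # the dump is written to this absolute path
    return "relative"      # the dump is written relative to the working directory


def core_limit_allows(size_bytes, soft_limit):
    """Return True if a core file of size_bytes fits under the soft core limit."""
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return size_bytes <= soft_limit


def read_core_settings():
    """Read the current pattern and this process's soft core size limit (bytes)."""
    with open("/proc/sys/kernel/core_pattern") as fh:
        pattern = fh.read().strip()
    soft, _hard = resource.getrlimit(resource.RLIMIT_CORE)
    return pattern, soft


if __name__ == "__main__":
    try:
        pattern, soft = read_core_settings()
    except OSError as exc:
        print(f"could not read core settings: {exc}")
    else:
        limit = "unlimited" if soft == resource.RLIM_INFINITY else f"{soft} bytes"
        print(f"core_pattern={pattern!r} ({classify_core_pattern(pattern)}), soft limit={limit}")
```

For a daemon that is already running, the "Max core file size" row in /proc/<pid>/limits shows the limit that actually applies to it.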

@abhijo89-uc

commented May 6, 2019

@tomatolog It seems searchd is running as the manticore user, and /proc/sys/kernel/core_pattern has zero size:

manticore@prod.search.2:/proc/sys/kernel$ ll
dr-xr-xr-x 1 root root 0 May  3 11:00 ./
dr-xr-xr-x 1 root root 0 May  3 11:00 ../
-rw-r--r-- 1 root root 0 May  6 08:32 acct
-rw-r--r-- 1 root root 0 May  6 08:32 acpi_video_flags
-rw-r--r-- 1 root root 0 May  6 08:32 auto_msgmni
-r--r--r-- 1 root root 0 May  6 08:32 bootloader_type
-r--r--r-- 1 root root 0 May  6 08:32 bootloader_version
-rw------- 1 root root 0 May  6 08:32 cad_pid
-r--r--r-- 1 root root 0 May  3 11:00 cap_last_cap
-rw-r--r-- 1 root root 0 May  6 08:32 core_pattern
-rw-r--r-- 1 root root 0 May  6 08:32 core_pipe_limit
-rw-r--r-- 1 root root 0 May  6 08:32 core_uses_pid
-rw-r--r-- 1 root root 0 May  6 08:32 ctrl-alt-del
-rw-r--r-- 1 root root 0 May  6 08:32 dmesg_restrict
-rw-r--r-- 1 root root 0 May  6 08:32 domainname
-rw-r--r-- 1 root root 0 May  6 08:32 ftrace_dump_on_oops
-rw-r--r-- 1 root root 0 May  6 08:32 ftrace_enabled
-rw-r--r-- 1 root root 0 May  6 08:32 hardlockup_all_cpu_backtrace
-rw-r--r-- 1 root root 0 May  6 08:32 hardlockup_panic
-rw-r--r-- 1 root root 0 May  3 11:00 hostname
manticore@prod.search.2:/proc/sys/kernel$ ulimit -c 
unlimited
@tomatolog

Contributor

commented May 6, 2019

You can set the core dump location like this:

echo "/tmp/core.%e.%p" > /proc/sys/kernel/core_pattern

However, this lasts only until the box is restarted, and I'm not sure whether the daemon needs to be restarted after that change.

To make it persistent, you have to change /etc/sysctl.conf:

kernel.core_pattern=/tmp/core.%e.%p
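
Editorial note: to know which file name to look for after a crash, it helps to see how this pattern expands. In it, %e stands for the name of the crashing executable (or thread) and %p for its PID. The sketch below mimics only those two specifiers plus the literal %%; `expand_core_pattern` is a hypothetical helper, not a kernel interface, and other specifiers are left untouched.

```python
def expand_core_pattern(pattern, exe, pid):
    """Expand %e, %p and %% in a core_pattern string; leave other specifiers as-is."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "%" and i + 1 < len(pattern):
            spec = pattern[i + 1]
            if spec == "%":
                out.append("%")        # literal percent sign
            elif spec == "e":
                out.append(exe)        # executable (or thread) name
            elif spec == "p":
                out.append(str(pid))   # PID of the dumped process
            else:
                out.append("%" + spec) # not handled in this sketch
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


if __name__ == "__main__":
    # Name and PID here are made-up example values.
    print(expand_core_pattern("/tmp/core.%e.%p", "searchd", 1234))
```

With the pattern above, the core file is therefore expected directly under /tmp with the name and PID of whatever crashed.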
@abhijo89-uc

commented May 6, 2019

I did all the steps, but I could not generate a core dump file.
When I try it with sleep, a core dump is created. Example:

sleep 20 &
pkill -11 sleep

This generates a core dump file in the /tmp folder for the manticore user, which means my system is working fine.

Is there anything else I can do?

@tomatolog

Contributor

commented May 6, 2019

To save a core dump when searchd crashes, you have to start the daemon with the --coredump option, like this:

searchd --coredump -c sphinx.conf

Have you changed your startup script?

@abhijo89-uc

commented May 6, 2019

Yes. I did it in the service file generator and reloaded the daemon.

@tomatolog

Contributor

commented May 7, 2019

You can attach gdb before the crash, like this:

$>gdb searchd
gdb>attach daemon_pid
gdb>continue

Then issue the query that causes the crash; gdb will stop when it happens:

gdb>thread apply all bt
gdb>generate-core-file core_path_name
@abhijo89-uc

commented May 7, 2019

@manticoresearch With your new release I could reproduce the issue and generate a core dump file.
The file is big, around 148G.

no_user@demo2:/tmp$ ll -h
total 136G
drwxrwxrwt  9 root      root       12K May  7 07:43 ./
drwxr-xr-x 23 root      root      4.0K May  7 06:27 ../
drwxrwxrwt  2 root      root      4.0K May  7 06:27 .ICE-unix/
drwxrwxrwt  2 root      root      4.0K May  7 06:27 .Test-unix/
drwxrwxrwt  2 root      root      4.0K May  7 06:27 .X11-unix/
drwxrwxrwt  2 root      root      4.0K May  7 06:27 .XIM-unix/
drwxrwxrwt  2 root      root      4.0K May  7 06:27 .font-unix/
-rwxrwxrwx  1 manticore manticore 148G May  7 07:17 core.LocalSearch.3299*

We can't download and send it through ftp.

@tomatolog

Contributor

commented May 7, 2019

We do not need the core file itself, as it can only be used on the box where it was generated.
From the core file we need the output of:

gdb>thread apply all bt
gdb>info locals

and maybe other data later. Alternatively, provide access to your box, where we could inspect the generated core file.
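
Editorial note: with a core file this large, it can be easier to collect that output non-interactively. The sketch below only builds a gdb command line using gdb's -batch and -ex options; `gdb_batch_command` is a hypothetical helper, and the binary and core paths are the ones that appear earlier in this thread, so adjust them to your box.

```python
import shlex


def gdb_batch_command(binary, core_file, commands=("thread apply all bt", "info locals")):
    """Build an argv list that runs the given gdb commands against a core file and exits."""
    argv = ["gdb", "-batch"]
    for cmd in commands:
        argv += ["-ex", cmd]  # each -ex runs one gdb command in order
    argv += [binary, core_file]
    return argv


if __name__ == "__main__":
    argv = gdb_batch_command("/usr/bin/searchd", "/tmp/core.LocalSearch.3299")
    # Print a copy-pasteable shell line; redirect its output to a file to attach it here.
    print(shlex.join(argv))
```

The printed line can be run as the user that owns the core file, and the same list can be passed to subprocess.run if you prefer to capture the output from Python.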

@abhijo89-uc

commented May 7, 2019

@manticoresearch I attached the thread apply all bt result, but when I try info locals it says "No symbol table info available."

(gdb) info locals
No symbol table info available.

backtrace.txt

@tomatolog

Contributor

commented May 7, 2019

Could you load the core file you saved into gdb:

gdb>core-file core_path_name

then go up the stack to CSphIndex_VLN:

gdb>thread 1
gdb>up 4

At this level you should be in Thread 1 (Thread 0x7f62497ed700 (LWP 9595)), in CSphIndex_VLN::ParsedMultiQuery. Then run:

gdb>p this->m_sFilename
gdb>p m_sFilename

This prints the path and file name of the disk chunk that causes the crash. Then upload only the files of this disk chunk to our write-only FTP. They should look similar to path/rt_name.13.sp*

The path, rt_name, and disk chunk number should be taken from the gdb p output.
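
Editorial note: once the prefix and chunk number are known, the files to upload can be listed with a glob. This is a sketch under the naming shown above (prefix, then chunk number, then an extension starting with sp); `disk_chunk_files` is a hypothetical helper, and rt_name and 13 are the placeholders from the example path, not real values.

```python
import glob
import os
import tempfile


def disk_chunk_files(index_prefix, chunk_no):
    """List files named <index_prefix>.<chunk_no>.sp* in sorted order."""
    pattern = f"{glob.escape(index_prefix)}.{chunk_no}.sp*"
    return sorted(glob.glob(pattern))


def _demo():
    """Create a few empty files in a temp dir and show which ones match chunk 13."""
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("rt_name.13.spa", "rt_name.13.spd", "rt_name.130.spa", "rt_name.ram"):
            open(os.path.join(tmp, name), "w").close()
        for path in disk_chunk_files(os.path.join(tmp, "rt_name"), 13):
            print(os.path.basename(path))


if __name__ == "__main__":
    _demo()
```

Because the chunk number is followed by a dot in the pattern, chunk 13 does not pick up the files of chunk 130.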

@abhijo89

Contributor

commented May 10, 2019

@vishnushettigar will take over from where @abhijo89-uc left off.

@vishnu-uc

commented May 10, 2019

@githubmanticore, @glookka, @tomatolog I have uploaded the data to the directory ftp://dev.manticoresearch.com/github-issue-187/vishnu-uc/. Please check. The archive also contains the configuration file.

@vishnu-uc

commented May 10, 2019

@glookka, @githubmanticore, @tomatolog I have uploaded the query that causes the crash to the file /github-issue-187/vishnu-uc/query.txt.

@tomatolog

Contributor

commented May 10, 2019

You uploaded the RAM part of the index; however, the backtrace from the crash shows that the crash was in a disk chunk. To investigate this crash I need the exact disk chunk that causes it. The disk chunk files look like data_index.XX.sp*

You could upload your whole RT index (the RAM part along with the disk chunks); however, as you said it is quite big, I suggest finding the disk chunk that causes the crash as I described here.

@tomatolog

Contributor

commented May 10, 2019

Never mind, I just found that you uploaded the query that causes the crash on that RT index. I can see the crash now and will let you know when this issue is fixed.

@vishnu-uc

commented May 10, 2019

Never mind, I just found that you uploaded the query that causes the crash on that RT index. I can see the crash now and will let you know when this issue is fixed.

@tomatolog Please let us know if you need more data to reproduce the bug.

@tomatolog

Contributor

commented May 10, 2019

I've just fixed this crash in 055586a. You need to update the daemon to get the fix.

@tomatolog tomatolog added the Resolved label May 10, 2019

@vishnu-uc

commented May 10, 2019

I've just fixed this crash in 055586a. You need to update the daemon to get the fix.

@tomatolog Great, thanks a lot. We will update the daemon.

@abhijo89-uc

commented May 15, 2019

We tested the fix. Looks good. Thanks @manticoresearch
