spurious crash on use-after-delete in kqueue polling (master-agent) #1312

Closed
githubmanticore opened this issue Aug 1, 2023 · 1 comment

@githubmanticore
Contributor

Found while investigating #2164, with these configs:

1st instance - manticorel.conf

searchd 
{ 
    listen = 9312 
    listen = 9306:mysql 
 
    pid_file = searchd.pid 
    binlog_path = # disable logging 
    persistent_connections_limit = 30 
 
    log = searchd.log 
    query_log = query.log 
} 
 
 
index YoPPoY 
{ 
    type = template 
    charset_table   = non_cjk 
    morphology      = libstemmer_fr 
    stopwords       = fr 
} 
 
index WebPages0 : YoPPoY 
{ 
    type = plain 
    path        = data/WebPages0 
} 
 
index WebPages1 : YoPPoY 
{ 
    type = plain 
    path        = data/WebPages1 
} 
 
index WebPages2 : YoPPoY 
{ 
    type = plain 
    path        = data/WebPages2 
} 
 
index WebPages3 : YoPPoY 
{ 
    type = plain 
    path        = data/WebPages3 
} 
 
index WebPages 
{ 
    type    = distributed 
 
    local   = WebPages0 
    local   = WebPages1 
    local   = WebPages2 
    local   = WebPages3 
} 

2nd instance (main one) - manticore.conf

searchd 
{ 
    listen = 9307:mysql 
    pid_file = searchdd.pid 
    max_packet_size = 30M 
    log = searchdd.log 
    query_log = queryd.log 
} 
 
index WebPagesLB 
{ 
     type             = distributed 
     agent = localhost:9312:WebPages 
     ha_strategy      = nodeads 
} 

Run the 1st instance: ./searchd -c manticorel.conf

Then run the main instance, ./searchd -c manticore.conf, and execute:

mysql -h0 -P9307

SELECT Id, SNIPPET(Body, QUERY()) FROM WebPagesLB WHERE MATCH('maintenant'); 

Repeat the last query several times. It will most probably return ERROR 1064 (42000): index WebPagesLB: agent localhost:9312: connect timed out, which is perfectly OK.
After several tries searchd crashes.
(It is best to run it under a debugger to localize the problem.)

The reason is that with kqueue we make as few syscalls as kqueue allows.
That is, we collect all changes to the set of polled sockets and then issue a single kevent() call that does everything: it first registers all the new conditions, then waits for events until the timeout.

However, for timed-out connections this causes a race: we schedule the removal of such a connection and then delete it right away. By the time kqueue is actually invoked, the connection referenced by the scheduled change is already deleted and its socket closed. Invoking the callback on it is UB (undefined behavior), which in some cases results in a crash and in other cases goes unnoticed while still being UB.
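
To make the ordering problem concrete, here is a minimal, portable sketch of the pattern. It is not Manticore's code and it does not call kevent(). `BatchingPoller`, `Change`, `open`, `schedule_remove`, `destroy_now`, `destroy_guarded`, `flush` and `stale_after_timeout` are all hypothetical names made up for illustration. Connections are modelled as integer ids, so the sketch can count queued changes that point at an already destroyed connection instead of actually dereferencing freed memory. The guard shown (dropping queued changes when the connection is destroyed) is just one possible approach and is not necessarily what the actual fix does.

```cpp
#include <algorithm>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for one entry of the batched change list.
enum class Kind { Add, Remove };
struct Change {
    int conn_id;  // plays the role of the pointer stored with the event
    Kind kind;
};

// Hypothetical model of a poller that defers all changes until one
// combined "apply changes, then wait" call (flush() below).
class BatchingPoller {
public:
    // Create a connection and queue its registration.
    void open(int id) {
        live_.insert(id);
        pending_.push_back({id, Kind::Add});
    }

    // Queue removal from polling; nothing is applied yet.
    void schedule_remove(int id) { pending_.push_back({id, Kind::Remove}); }

    // Problematic order: the connection goes away while a queued
    // change still refers to it.
    void destroy_now(int id) { live_.erase(id); }

    // One possible guard (an assumption, not the real fix): drop every
    // queued change for this connection before destroying it.
    void destroy_guarded(int id) {
        pending_.erase(
            std::remove_if(pending_.begin(), pending_.end(),
                           [id](const Change& c) { return c.conn_id == id; }),
            pending_.end());
        live_.erase(id);
    }

    // Stand-in for the single combined call. Returns how many queued
    // changes referred to a connection that no longer exists; in real
    // code each of those would be a use-after-delete.
    std::size_t flush() {
        std::size_t stale = 0;
        for (const Change& c : pending_) {
            if (live_.count(c.conn_id) == 0) {
                ++stale;
            }
        }
        pending_.clear();
        return stale;
    }

private:
    std::unordered_set<int> live_;  // connections that still exist
    std::vector<Change> pending_;   // changes not yet applied
};

// Replays the timed-out connection scenario and reports the number of
// stale changes seen by the final flush.
std::size_t stale_after_timeout(bool guarded) {
    BatchingPoller poller;
    poller.open(1);
    poller.flush();             // registration is applied normally
    poller.schedule_remove(1);  // connection timed out
    if (guarded) {
        poller.destroy_guarded(1);
    } else {
        poller.destroy_now(1);
    }
    return poller.flush();      // the deferred combined call
}
```

In this model `stale_after_timeout(false)` reports one stale change, which corresponds to the crash scenario above, and `stale_after_timeout(true)` reports none. Deferring the destruction until after the combined call would be another way to get the same effect.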

@sanikolaev
Collaborator

Done in a06a553
