Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

iproto.eval + box.session.storage == memory leak #3279

Closed
unera opened this issue Mar 25, 2018 · 2 comments
Closed

iproto.eval + box.session.storage == memory leak #3279

unera opened this issue Mar 25, 2018 · 2 comments
Assignees
Labels
app bug Something isn't working
Milestone

Comments

@unera
Copy link
Collaborator

unera commented Mar 25, 2018

my connector uses iproto.eval to add some service fucntions into box.session.storage.

So each version of my app uses its own functions.

Yesterday my app was crashed (see fault trace).

I wrote a testcase to reproduce the problem:

#!/usr/bin/tarantool

box.cfg{ listen = 3302 }
box.schema.user.create('test', { password = 'test', if_not_exists = true })
box.schema.user.grant('test', 'read,write,execute', 'universe', nil, {if_not_exists=true})

local r = require 'net.box'
local log = require 'log'
local fiber = require 'fiber'
local json = require 'json'

local uri = 'test:test@localhost:3302'

local no = 0
local prev_session = 0
while true do
    local c = r.connect(uri)

    local foo = string.format('box.session.storage.foo%s', no)
    no = no + 1

    local ev = string.format([[
        %s = function()
            local res = box.runtime.info()
            res.session = box.session.id()
            return res   
        end
    ]], foo)


    c:eval(ev)
    local info = c:call(foo)
    c:close()


    if prev_session == info.session then
        box.error("Someone broken")
    end
    prev_session = info.session


    if no % 1000 == 0 then
        log.info('%6d: Lua mem: %10d, used: %10d',
            no / 1000, info.lua, info.used)
    end

--  uncomment the following lines if You have socket problem while it is running
--     if no % 10000 == 0 then
--         fiber.sleep(10) -- TIME_WAIT sockets
--     end
    
    if no % 100000 == 0 then
        log.info('%6d: Collecting garbage', no / 1000)
        collectgarbage('collect')
    end
end

log of the script:

....
2018-03-25 12:51:08.351 [25765] main/101/leak.lua I>    546: Lua mem:  321882717, used:   12582912
2018-03-25 12:51:09.194 [25765] main/101/leak.lua I>    547: Lua mem:  366923731, used:   12582912
2018-03-25 12:51:10.341 [25765] main/101/leak.lua I>    548: Lua mem:  188786289, used:   12582912
2018-03-25 12:51:11.179 [25765] main/101/leak.lua I>    549: Lua mem:  232037116, used:   12582912
2018-03-25 12:51:11.842 [25765] main/101/leak.lua I>    550: Lua mem:  277078550, used:   12582912
2018-03-25 12:51:12.489 [25765] main/101/leak.lua I>    551: Lua mem:  322513200, used:   12582912
2018-03-25 12:51:13.319 [25765] main/101/leak.lua I>    552: Lua mem:  367557574, used:   12582912
2018-03-25 12:51:14.416 [25765] main/101/leak.lua I>    553: Lua mem:  234899432, used:   12582912
2018-03-25 12:51:15.288 [25765] main/101/leak.lua I>    554: Lua mem:  230267603, used:   12582912
2018-03-25 12:51:15.888 [25765] main/101/leak.lua I>    555: Lua mem:  275312397, used:   12582912
2018-03-25 12:51:16.537 [25765] main/101/leak.lua I>    556: Lua mem:  320746207, used:   12582912
2018-03-25 12:51:17.355 [25765] main/101/leak.lua I>    557: Lua mem:  365786801, used:   12582912
.....
2018-03-25 12:55:10.114 [25765] main/101/leak.lua I>    804: Lua mem:  416299232, used:   12582912
2018-03-25 12:55:10.906 [25765] main/101/leak.lua I>    805: Lua mem:  461340666, used:   12582912
2018-03-25 12:55:12.299 [25765] main/101/leak.lua I>    806: Lua mem:  507172312, used:   12582912
2018-03-25 12:55:13.272 [25765] main/101/leak.lua I>    807: Lua mem:  552213746, used:   12582912
2018-03-25 12:55:14.748 [25765] main/101/leak.lua I>    808: Lua mem:  275439349, used:   12582912
2018-03-25 12:55:15.531 [25765] main/101/leak.lua I>    809: Lua mem:  318816348, used:   12582912
2018-03-25 12:55:16.186 [25765] main/101/leak.lua I>    810: Lua mem:  364251838, used:   12582912
2018-03-25 12:55:17.247 [25765] main/101/leak.lua I>    811: Lua mem:  409292432, used:   12582912
2018-03-25 12:55:18.098 [25765] main/101/leak.lua I>    812: Lua mem:  454334286, used:   12582912
2018-03-25 12:55:19.078 [25765] main/101/leak.lua I>    813: Lua mem:  500165176, used:   12582912
2018-03-25 12:55:20.025 [25765] main/101/leak.lua I>    814: Lua mem:  545206610, used:   12582912
2018-03-25 12:55:21.267 [25765] main/101/leak.lua I>    815: Lua mem:  361140658, used:   12582912
2018-03-25 12:55:22.897 [25765] main/101/leak.lua I>    816: Lua mem:  306307372, used:   12582912
....
2018-03-25 13:13:24.420 [25765] main/101/leak.lua I>   1657: Lua mem: 1076951992, used:   12582912
2018-03-25 13:13:25.925 [25765] main/101/leak.lua I>   1658: Lua mem: 1121998266, used:   12582912
2018-03-25 13:13:27.675 [25765] main/101/leak.lua I>   1659: Lua mem: 1161697045, used:   12582912
2018-03-25 13:13:30.134 [25765] main/101/leak.lua I>   1660: Lua mem:  557027099, used:   12582912
2018-03-25 13:13:31.401 [25765] main/101/leak.lua I>   1661: Lua mem:  587848737, used:   12582912
2018-03-25 13:13:32.226 [25765] main/101/leak.lua I>   1662: Lua mem:  633289907, used:   12582912
2018-03-25 13:13:32.951 [25765] main/101/leak.lua I>   1663: Lua mem:  678334921, used:   12582912
2018-03-25 13:13:33.667 [25765] main/101/leak.lua I>   1664: Lua mem:  723380355, used:   12582912
2018-03-25 13:13:34.402 [25765] main/101/leak.lua I>   1665: Lua mem:  769213481, used:   12582912
2018-03-25 13:13:35.372 [25765] main/101/leak.lua I>   1666: Lua mem:  814261015, used:   12582912
....

The bug is not in net.box, because I stubmled on the bug through perl connector.

@unera
Copy link
Collaborator Author

unera commented Mar 25, 2018

box.session.storage destructor is broken:

> c = 0 for sid in pairs(getmetatable(box.session).aggregate_storage) do if not box.session.exists(sid) then c = c + 1 end end
---
...

> c
---
- 12602

@unera
Copy link
Collaborator Author

unera commented Mar 25, 2018

cleanup test case:

#!/usr/bin/tarantool

box.cfg{ listen = 3302 }
box.schema.user.create('test', { password = 'test', if_not_exists = true })
box.schema.user.grant('test', 'read,write,execute', 'universe', nil, {if_not_exists=true})

local r = require 'net.box'
local log = require 'log'
local fiber = require 'fiber'
local json = require 'json'

local uri = 'test:test@localhost:3302'

local no = 0
while true do
    local c = r.connect(uri)

    c:eval [[
        for i = 1, 10000 do
            box.session.storage[i] = math.random()
        end
    ]]
    
    
    no = no + 1

    if no % 1000 == 0 then
        local info = box.runtime.info()
        log.info('%6d: Lua mem: %10d, used: %10d',
            no / 1000, info.lua, info.used)
    end
    
    if no % 10000 == 0 then
        log.info('%6d: Collecting garbage', no / 1000)
        collectgarbage('collect')
    end
    
    c:close()
end

result:

2018-03-25 18:47:57.889 [29750] main/101/leak.lua C> version 1.7.2-388-g51c2d55
2018-03-25 18:47:57.889 [29750] main/101/leak.lua C> log level 5
2018-03-25 18:47:57.889 [29750] main/101/leak.lua I> mapping 1073741824 bytes for tuple arena...
2018-03-25 18:47:57.890 [29750] main/101/leak.lua I> recovery start
2018-03-25 18:47:57.890 [29750] main/101/leak.lua I> recovering from `./00000000000000000000.snap'
2018-03-25 18:47:57.893 [29750] main/101/leak.lua I> recover from `./00000000000000000000.xlog'
2018-03-25 18:47:57.893 [29750] main/101/leak.lua I> done `./00000000000000000000.xlog'
2018-03-25 18:47:57.893 [29750] main/101/leak.lua I> recover from `./00000000000000000002.xlog'
2018-03-25 18:47:57.894 [29750] main/101/leak.lua I> done `./00000000000000000002.xlog'
2018-03-25 18:47:57.894 [29750] main/102/hot_standby I> recover from `./00000000000000000002.xlog'
2018-03-25 18:47:57.894 [29750] iproto/102/iproto I> binary: started
2018-03-25 18:47:57.895 [29750] iproto/102/iproto I> binary: bound to 0.0.0.0:3302
2018-03-25 18:47:57.895 [29750] main/101/leak.lua I> done `./00000000000000000002.xlog'
2018-03-25 18:47:57.895 [29750] main/101/leak.lua I> ready to accept requests
2018-03-25 18:47:59.846 [29750] main/101/leak.lua I>      1: Lua mem:  133597306, used:   12582912
2018-03-25 18:48:01.670 [29750] main/101/leak.lua I>      2: Lua mem:  276262724, used:   12582912
2018-03-25 18:48:03.347 [29750] main/101/leak.lua I>      3: Lua mem:  451157223, used:   12582912
2018-03-25 18:48:05.201 [29750] main/101/leak.lua I>      4: Lua mem:  569303284, used:   12582912
2018-03-25 18:48:07.041 [29750] main/101/leak.lua I>      5: Lua mem:  744340031, used:   12582912
2018-03-25 18:48:09.111 [29750] main/101/leak.lua I>      6: Lua mem:  819163529, used:   12582912
2018-03-25 18:48:11.013 [29750] main/101/leak.lua I>      7: Lua mem:  994217584, used:   12582912
2018-03-25 18:48:12.689 [29750] main/101/leak.lua I>      8: Lua mem: 1168973874, used:   12582912
2018-03-25 18:48:14.510 [29750] main/101/leak.lua I>      9: Lua mem: 1344191076, used:   12582912
2018-03-25 18:48:16.696 [29750] main/101/leak.lua I>     10: Lua mem: 1342927643, used:   12582912
2018-03-25 18:48:16.696 [29750] main/101/leak.lua I>     10: Collecting garbage
2018-03-25 18:48:18.643 [29750] main/101/leak.lua I>     11: Lua mem: 1487792468, used:   12582912
2018-03-25 18:48:20.410 [29750] main/101/leak.lua I>     12: Lua mem: 1662641843, used:   12582912
2018-03-25 18:48:22.167 [29750] main/101/leak.lua I>     13: Lua mem: 1837843241, used:   12582912
2018-03-25 18:48:23.571 [29750] main/101/leak.lua F> not enough memory

@kostja kostja added this to the 1.9.1 milestone Mar 25, 2018
@kostja kostja added bug Something isn't working app labels Mar 25, 2018
GeorgyKirichenko pushed a commit that referenced this issue Mar 27, 2018
* session_run_on_disconnect_triggers is called only if there are
corresponding triggers so move session_storage_cleanup to
session_destroy.
* fix session storage cleanup path: use
"box.session.aggregate_storage[sid]" instead of
"session.aggregate_storage[sid]" (what was wrong)

Fixed #3279
@kostja kostja closed this as completed in cd48321 Mar 27, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
app bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants