Table persistence #7120
In any case the idea is implement I've implemented a few just to showcase:
Looks alright, no real comment.
Looks sane at a glance.
I think Played a bit with the code yesterday to see how to handle bytecode dumps and ended up with a single file that does what we want.
local bitser = require("bitser")
local dump = require("dump")
local codecs = {
-- bitser: binary form, fast encode/decode, low size. Not human readable.
bitser = {
id = "Bitser serializer",
serialize = function(t, file)
print(t, file)
local ok, str = pcall(bitser.dumps, t)
if not ok then
return nil, "cannot serialize " .. tostring(t)
end
return str
end,
deserialize = function(file)
local f, err, str
f, err = io.open(file, "rb")
if not f then
return nil, err
end
str, err = f:read("*a")
f:close()
if not str then
return nil, err
end
local ok, t = pcall(bitser.loads, str)
if not ok then
return nil, "malformed serialized data"
end
return t
end,
},
-- dump: human readable, pretty printed, fast enough for most user cases.
dump = {
id = "Dump serializer",
-- by default we store plain text, but we can store lua bytecode instead
serialize = function(t, file, as_bytecode)
local content, err = dump(t)
if not content then
-- tostring() keeps %s safe on Lua versions that reject non-string arguments
return nil, string.format("cannot serialize table %s: %s", tostring(t), tostring(err))
end
local str
if as_bytecode then
str, err = load("return " .. content)
if not str then
print(string.format("cannot convert table to bytecode: %s, ignoring", err))
else
print(string.format("file %s will contain bytecode", file))
str = string.dump(str, true)
end
end
if not str then
print(string.format("file %s will contain plain text", file))
str = "return " .. content
end
return str
end,
deserialize = function(file)
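-- on failure pcall returns false plus the error message, so the second
-- value is either the loaded table or the error
local ok, t = pcall(dofile, file)
if not ok then
return nil, t
end
return t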
end,
}
}
local Persistence = {}
function Persistence:new(o)
o = o or {}
setmetatable(o, self)
self.__index = self
return o:init(o.path, o.codec)
end
function Persistence:init(path, codec)
if type(path) ~= "string" then
return nil, "path is required"
end
self.path = path
self.codec = codec or "dump"
return self
end
function Persistence:stats()
local lfs = require("lfs")
local size = lfs.attributes(self.path, "size") or -1
local exists = size ~= -1
local timestamp = "NYI"
-- self.codec holds the codec name, so look the codec table up first
return self.path, codecs[self.codec].id, exists, size, timestamp
end
function Persistence:loadFile()
-- dot call: a colon call would pass the codec table as the file argument
return codecs[self.codec].deserialize(self.path) or {}
end
function Persistence:saveFile(t, as_bytecode)
local key = self.codec
local str, err = codecs[key].serialize(t, self.path, as_bytecode)
if not str then
return nil, err
end
return self:writeFile(str)
end
function Persistence:writeFile(str)
local f, err = io.open(self.path, "wb")
if not f then
return nil, err
end
f:write(str)
f:close()
return true
end
return Persistence
Still sounds a bit weak, because I would prefer to name the file "cache" rather than "persistance". Anyhow, that's how it works:
-- binary persistance
local bin_cache = require("persistance"):new{
path = "myfile.dat",
codec = "bitser",
}
bin_cache:saveFile(table)
print(bin_cache:stats())
local t = bin_cache:loadFile()
-- plain text persistance
local plain_cache = require("persistance"):new{
path = "otherfile.what",
}
-- write plain text
plain_cache:saveFile(table)
-- try bytecode first, it will fallback to plain text if it fails
plain_cache:saveFile(table, true)
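To make the "try bytecode first, fall back to plain text" behaviour concrete, here is a minimal standalone sketch. It does not use the real `dump` or `bitser` modules: `tiny_dump`, `to_storable` and `from_storable` are hypothetical names, and `tiny_dump` only handles flat tables with string keys, so treat it as an illustration of the flow rather than the code in this PR.

```lua
-- Hypothetical stand-in for the "dump" library: serializes a flat table with
-- string keys and string/number/boolean values as a Lua table constructor.
local function tiny_dump(t)
    local keys = {}
    for k in pairs(t) do keys[#keys + 1] = k end
    table.sort(keys) -- deterministic output
    local parts = {}
    for _, k in ipairs(keys) do
        local v = t[k]
        local repr = type(v) == "string" and string.format("%q", v) or tostring(v)
        parts[#parts + 1] = string.format("[%q] = %s", k, repr)
    end
    return "{" .. table.concat(parts, ", ") .. "}"
end

-- Lua 5.1 / LuaJIT expose loadstring; Lua 5.2+ folded it into load.
local load_chunk = loadstring or load

-- Returns the string to write to disk plus its kind: bytecode if requested
-- and possible, plain Lua source otherwise.
local function to_storable(t, as_bytecode)
    local source = "return " .. tiny_dump(t)
    if as_bytecode then
        local chunk = load_chunk(source)
        if chunk then
            local ok, bytecode = pcall(string.dump, chunk, true)
            if ok then return bytecode, "bytecode" end
        end
    end
    return source, "text"
end

-- Loads either form back into a table, returning nil, err on failure.
local function from_storable(str)
    local chunk, err = load_chunk(str)
    if not chunk then return nil, err end
    local ok, t = pcall(chunk)
    if not ok then return nil, t end
    return t
end
```

Because the loader accepts both source and bytecode, the reader side does not need to know which form was written.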
As far as naming goes, something along the lines of ser/deser would be fine with me (e.g., I kinda like the name of Rust's https://github.com/serde-rs/serde). And I'm strongly refraining from the obvious "desser(t)" pun ;p.
I like this one-file module.
ferment
Or a smaller
Well, here's my shitty skeuomorphic take:
lacticAcidify() -- ferment
sulfurDioxidize() -- inhibit fermentation
I see it's going to be hard to reach name consensus 😄 I like
frontend/persist.lua
end
end

function Persist:writeFile(str)
Should this one be public? If not, either rename it to Persist:_writeFile() so we know it is not, or inline it in :saveFile().
I think it is a legacy workaround while I played with bytecode. No longer needed. I'm going to inline it in saveFile
Tried to move Both calibre and the font list work as well as before, but now we're able to load a table with the metadata of hundreds of thousands of books in a few hundred milliseconds.
(Be careful with the settings and history files: they have some safety features to avoid losing them, like "rename previous to .old and use it if current is unreadable" and ffiutil.fsyncOpenedFile(), so I think it's better to leave them untouched :) Unless you want to add these safety features as options to Persist :) but yes, later.)
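For readers unfamiliar with the "rename previous to .old and use it if current is unreadable" pattern mentioned above, here is a rough standalone sketch in plain Lua. The function names (`read_table`, `save_with_backup`, `load_with_fallback`) are hypothetical and this is not how the settings or history code is actually written; in particular it skips the fsync step, since `ffiutil.fsyncOpenedFile()` is specific to this project.

```lua
-- Lua 5.1 / LuaJIT expose loadstring; Lua 5.2+ folded it into load.
local load_chunk = loadstring or load

-- Reads a file containing "return { ... }" and returns the table, or nil, err.
local function read_table(path)
    local f, err = io.open(path, "rb")
    if not f then return nil, err end
    local str = f:read("*a")
    f:close()
    local chunk, load_err = load_chunk(str or "")
    if not chunk then return nil, load_err end
    local ok, t = pcall(chunk)
    if not ok or type(t) ~= "table" then
        return nil, "unreadable: " .. path
    end
    return t
end

-- Keeps the previous version as <path>.old before writing the new content.
local function save_with_backup(path, source)
    local existing = io.open(path, "rb")
    if existing then
        existing:close()
        os.remove(path .. ".old") -- os.rename does not overwrite on every platform
        os.rename(path, path .. ".old")
    end
    local f, err = io.open(path, "wb")
    if not f then return nil, err end
    f:write(source)
    f:close()
    return true
end

-- Prefers the current file and falls back to the .old copy if it is unreadable.
local function load_with_fallback(path)
    local t = read_table(path)
    if t then return t, "current" end
    t = read_table(path .. ".old")
    if t then return t, "backup" end
    return nil, "no readable copy of " .. path
end
```

If something like this were added to Persist as an option, the second return value would let callers log when a backup had to be used.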
frontend/fontlist.lua

local t, err = cache:loadFile()
if not t then
    print("error loading from cache", err)
logger.info(cache.path, err, ": initializing new one")
or something like that
Nothing to add to @poire-z's comments right now.
@poire-z: I think I addressed all your comments. Except this one:
Does it make sense to have functions for table-to-string serialization and string-to-table deserialization? a la
Or maybe a single function that returns the table with a "codec" and doesn't need to create a new object, to be used like:
local ser = Persist:getCodec("bitser")
local str = ser.serialize(t)
local nt = ser.deserialize(str)
or
local binarifier = Persist:getCodec("bitser").serialize
print(binarifier(t))
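As a rough illustration of the `getCodec` proposal, the sketch below returns a codec table whose functions work on strings, with no object creation and no file access. The `plain` codec is a hypothetical placeholder (flat tables with string keys and number values only) standing in for the real bitser and dump codecs; only the `getCodec`, `serialize` and `deserialize` names come from the proposal above.

```lua
-- Hypothetical codec registry. "plain" is a placeholder codec for flat
-- tables with string keys and number values, one "key=value" per line.
local codecs = {
    plain = {
        id = "plain",
        serialize = function(t)
            local parts = {}
            for k, v in pairs(t) do
                parts[#parts + 1] = string.format("%s=%s", k, tostring(v))
            end
            table.sort(parts) -- deterministic output
            return table.concat(parts, "\n")
        end,
        deserialize = function(str)
            local t = {}
            for k, v in str:gmatch("([^=\n]+)=([^\n]*)") do
                t[k] = tonumber(v)
            end
            return t
        end,
    },
}

local Persist = {}

-- Returns the codec table for `name`, or nil plus an error message.
function Persist:getCodec(name)
    local codec = codecs[name]
    if not codec then
        return nil, "unknown codec: " .. tostring(name)
    end
    return codec
end
```

Callers that only need a string in memory can then use the codec directly, while file handling stays in the object created with `new`.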
frontend/persist.lua
str, err = load("return " .. content)
if not str then
local bytecode, err = load("return " .. dump(t))
if not bytecode then
print("cannot convert table to bytecode: %s, ignoring", err)
logger.warn ? or is that just fine because it will never happen?
It will never happen in the context of fontlist, but it will happen if the table to serialize has more than 65536 nested tables.
Done 👍
This one was solved by naming it persist.lua :) I wasn't asking for a serializer/deserializer. But if you're at it:
Yes, that's just perfect, and allows flexibility for the caller.
It turns out that the dump "deserializer" was pretty hardcoded to be used from a file, as it was using
Is this ready? And not too risky for 2021.01?
Yeah (as long as it passes your reviews)
Nope, but I would like to merge it at least a couple of nightlies before release.
It is safe to bump base even without merging this.
No more comments, looks fine at the API level.
(I must say I skipped a bit the local ok, str = and local t, err = checks and flows - not fun to get into :) so trusting you on this.)
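For anyone who does want to follow those checks, the two shapes are the usual Lua conventions: `local ok, value = pcall(...)` catches a raised error, and `local value, err = ...` reports a failure as `nil` plus a message. A small sketch of converting the first into the second, using hypothetical helpers `try` and `parse_number` that are not part of this PR:

```lua
-- Hypothetical helper: run a function that may raise and convert the outcome
-- to the `value` / `nil, err` convention.
local function try(fn, ...)
    local ok, result = pcall(fn, ...)
    if not ok then
        return nil, tostring(result)
    end
    return result
end

-- Example function that raises on bad input (level 0: no position prefix).
local function parse_number(s)
    local n = tonumber(s)
    if not n then
        error("not a number: " .. tostring(s), 0)
    end
    return n
end
```

Note that on failure the second value returned by `pcall` is the error itself, which is why only two variables are needed on the left-hand side.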
Looks sane enough at a glance, so merging to get in a few days of testing.
A test to show the difference in encoding/decoding speed and the result cache size