Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Current cache size: -1630142464 B #9860

Closed
gerhaher opened this issue Dec 3, 2022 · 17 comments
Closed

Current cache size: -1630142464 B #9860

gerhaher opened this issue Dec 3, 2022 · 17 comments

Comments

@gerhaher
Copy link

gerhaher commented Dec 3, 2022

  • KOReader version: 2022.11
  • Device: Kobo Libra2

Issue

+10 000 books & Mosaic with cover images.

InkedFileManager_2022-12-03_092007

@Frenzie
Copy link
Member

Frenzie commented Dec 3, 2022

That's strange, I wonder if it could be an issue in lfs?

-- DB management
function BookInfoManager:getDbSize()
local file_size = lfs.attributes(self.db_location, "size") or 0
return util.getFriendlySize(file_size)
end

koreader/frontend/util.lua

Lines 1147 to 1167 in fa9f0ac

--- Gets human friendly size as string
---- @int size (bytes)
---- @bool right_align (by padding with spaces on the left)
---- @treturn string
function util.getFriendlySize(size, right_align)
local frac_format = right_align and "%6.1f" or "%.1f"
local deci_format = right_align and "%6d" or "%d"
size = tonumber(size)
if not size or type(size) ~= "number" then return end
if size > 1000*1000*1000 then
return T(C_("Data storage size", "%1 GB"), string.format(frac_format, size/1000/1000/1000))
end
if size > 1000*1000 then
return T(C_("Data storage size", "%1 MB"), string.format(frac_format, size/1000/1000))
end
if size > 1000 then
return T(C_("Data storage size", "%1 kB"), string.format(frac_format, size/1000))
else
return T(C_("Data storage size", "%1 B"), string.format(deci_format, size))
end
end

@NiLuJe
Copy link
Member

NiLuJe commented Dec 3, 2022

lfs returns a lua_Integer (e.g., ptrdiff_t, signed), and we then squash that into a Lua number, so there may very well be an overflow somewhere since we don't have 64bits of precision available anywhere.

What's the actual size of the db as returned by stat?

@Frenzie
Copy link
Member

Frenzie commented Dec 3, 2022

An overflow is an obvious thought, but could the cache size really be much larger than a few hundred MB?

@NiLuJe
Copy link
Member

NiLuJe commented Dec 3, 2022

Err, true, plus, FAT32 means we can't blow past 2GB anyway, which would be the actual overflow threshold for INT32_MAX in bytes.

The fact that the size check in getFriendlySize appears to point to the "bytes" branch is doubly weird, and probably hints at the overflow happening either directly in lfs during the lua_Integer push, or in the tonumber cast?

@gerhaher
Copy link
Author

gerhaher commented Dec 4, 2022

What's the actual size of the db as returned by stat?

Where can I see this info?

@Frenzie
Copy link
Member

Frenzie commented Dec 4, 2022

I think it's the settings/bookinfo_cache.sqlite3 file.

@gerhaher
Copy link
Author

gerhaher commented Dec 4, 2022

OK, that file is 2 674 468 kB.

@Frenzie
Copy link
Member

Frenzie commented Dec 4, 2022

In that case I take back what I said, although I thought Lua numbers were closer to 64-bit regardless of architecture.

@NiLuJe
Copy link
Member

NiLuJe commented Dec 4, 2022

Yup, that'll overflow the lua_Integer cast on 32-bit arches.

@NiLuJe
Copy link
Member

NiLuJe commented Dec 4, 2022

I mean, we could just add 0x0100000000 to the value returned by lfs if it's negative (as, yeah, lua_Number is wider than lua_Integer), but that verges on undefined behavior ;p.

@NiLuJe
Copy link
Member

NiLuJe commented Dec 4, 2022

Or, better yet, fix lfs to use lua_pushnumber instead of lua_pushinteger ;).

@NiLuJe
Copy link
Member

NiLuJe commented Dec 4, 2022

I mean, we could just add 0x0100000000 to the value returned by lfs if it's negative (as, yeah, lua_Number is wider than lua_Integer), but that verges on undefined behavior ;p.

Well, maybe not so crazy after all...

https://github.com/coreutils/coreutils/blob/aaa306ad76d48b27efd75101d57b6ec5bf90362c/src/stat.c#L1498-L1502

@NiLuJe
Copy link
Member

NiLuJe commented Dec 4, 2022

See also https://git.busybox.net/busybox/tree/coreutils/stat.c ;).

I'll push an lfs patch to see how it fares first, as it should be slightly less intrusive, ironically ;).

@NiLuJe
Copy link
Member

NiLuJe commented Dec 4, 2022

Namely,

diff --git a/src/lfs.c b/src/lfs.c
index e5e5ee4..150396e 100644
--- a/src/lfs.c
+++ b/src/lfs.c
@@ -29,6 +29,7 @@
 #include <stdio.h>
 #include <string.h>
 #include <stdlib.h>
+#include <inttypes.h>
 #include <time.h>
 #include <sys/stat.h>
 
@@ -830,74 +831,74 @@ static void push_st_mode(lua_State * L, STAT_STRUCT * info)
 /* device inode resides on */
 static void push_st_dev(lua_State * L, STAT_STRUCT * info)
 {
-  lua_pushinteger(L, (lua_Integer) info->st_dev);
+  lua_pushnumber(L, (uintmax_t) info->st_dev);
 }
 
 /* inode's number */
 static void push_st_ino(lua_State * L, STAT_STRUCT * info)
 {
-  lua_pushinteger(L, (lua_Integer) info->st_ino);
+  lua_pushnumber(L, (uintmax_t) info->st_ino);
 }
 
 /* number of hard links to the file */
 static void push_st_nlink(lua_State * L, STAT_STRUCT * info)
 {
-  lua_pushinteger(L, (lua_Integer) info->st_nlink);
+  lua_pushnumber(L, (unsigned long) info->st_nlink);
 }
 
 /* user-id of owner */
 static void push_st_uid(lua_State * L, STAT_STRUCT * info)
 {
-  lua_pushinteger(L, (lua_Integer) info->st_uid);
+  lua_pushnumber(L, (unsigned long) info->st_uid);
 }
 
 /* group-id of owner */
 static void push_st_gid(lua_State * L, STAT_STRUCT * info)
 {
-  lua_pushinteger(L, (lua_Integer) info->st_gid);
+  lua_pushnumber(L, (unsigned long) info->st_gid);
 }
 
 /* device type, for special file inode */
 static void push_st_rdev(lua_State * L, STAT_STRUCT * info)
 {
-  lua_pushinteger(L, (lua_Integer) info->st_rdev);
+  lua_pushnumber(L, (unsigned long) info->st_rdev);
 }
 
 /* time of last access */
 static void push_st_atime(lua_State * L, STAT_STRUCT * info)
 {
-  lua_pushinteger(L, (lua_Integer) info->st_atime);
+  lua_pushnumber(L, (long) info->st_atime);
 }
 
 /* time of last data modification */
 static void push_st_mtime(lua_State * L, STAT_STRUCT * info)
 {
-  lua_pushinteger(L, (lua_Integer) info->st_mtime);
+  lua_pushnumber(L, (long) info->st_mtime);
 }
 
 /* time of last file status change */
 static void push_st_ctime(lua_State * L, STAT_STRUCT * info)
 {
-  lua_pushinteger(L, (lua_Integer) info->st_ctime);
+  lua_pushnumber(L, (long) info->st_ctime);
 }
 
 /* file size, in bytes */
 static void push_st_size(lua_State * L, STAT_STRUCT * info)
 {
-  lua_pushinteger(L, (lua_Integer) info->st_size);
+  lua_pushnumber(L, (uintmax_t) info->st_size);
 }
 
 #ifndef _WIN32
 /* blocks allocated for file */
 static void push_st_blocks(lua_State * L, STAT_STRUCT * info)
 {
-  lua_pushinteger(L, (lua_Integer) info->st_blocks);
+  lua_pushnumber(L, (uintmax_t) info->st_blocks);
 }
 
 /* optimal file system I/O blocksize */
 static void push_st_blksize(lua_State * L, STAT_STRUCT * info)
 {
-  lua_pushinteger(L, (lua_Integer) info->st_blksize);
+  lua_pushnumber(L, (unsigned long) info->st_blksize);
 }
 #endif

(casts stolen^Wheavily inspired from busybox, but basically the uintmax_t ones are are all the types with a 64 variants).

@Frenzie
Copy link
Member

Frenzie commented Dec 4, 2022

Yeah, a double seems like it makes more sense. Files larger than 2 GB are rather common. I distinctly remember switching to NTFS back in '04 or '05 because I kept running into the 2 GB limit, so in my view they were hardly rare back then either. ;-)

@NiLuJe
Copy link
Member

NiLuJe commented Dec 4, 2022

I was mistaken earlier, the FAT32 limit is actually 4GB, but the Lua_Integer overflow happens at 2GB, which explains the fact that we're seeing this at all ;).

@Frenzie
Copy link
Member

Frenzie commented Dec 4, 2022

In that case I was running into the 4 GB limit in '04 or '05, making my remark all the more relevant. ;-P

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants