Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ffi/koptcontext: fix language handling #1613

Merged

Conversation

benoit-pierre
Copy link
Contributor

@benoit-pierre benoit-pierre commented May 17, 2023

Don't set context field from a string that could be garbage collected…


This change is Reviewable

@benoit-pierre
Copy link
Contributor Author

This is enough to fix this flaky test:

assert.are.same(kc_table.language, new_kc_table.language)

(But valgrind still complains about a number of uninitialized memory tests).

@NiLuJe
Copy link
Member

NiLuJe commented May 17, 2023

That would imply that ffi.new("char[?]", luastring) doesn't actually create a copy but just a pointer to the string data?

That's... weird o_O.

@benoit-pierre
Copy link
Contributor Author

No, it's a copy alright, but which can be garbage collected too, no?

@benoit-pierre benoit-pierre force-pushed the pr/fix_koptcontext_language_handling branch from 02a767b to 4e384a5 Compare May 17, 2023 23:31
@NiLuJe
Copy link
Member

NiLuJe commented May 17, 2023

Right, yeah; but it shouldn't be an issue in theory, I think? (at least outside of busted wrecking havoc on LuaJIT/FFI ;p).

(IIRC, it's also similar to what @ezdiy settled on in his original mupdf bump branch, FWIW).

@benoit-pierre
Copy link
Contributor Author

It is if the pointer is to be longer lived than the LUA FFI object.

@NiLuJe
Copy link
Member

NiLuJe commented May 17, 2023

I mean, unless something's killing the ref via a self.language = nil somewhere, the very fact that's it's inside that object pins it to that object lifecycle at worst.

@benoit-pierre
Copy link
Contributor Author

FWIU the assignment to the context language field does not create a reference to the FFI object.

@NiLuJe
Copy link
Member

NiLuJe commented May 17, 2023

I'd understand this if we were doing something like self.language = ffi.new("const char*", luastring) (I'm noteven sure if this actually works, can't check rn), because at that point, yes, we're creating a simple cdata pointer to actual Lua data that might get GC'ed before that cdata object itself (and, in the context above, will definitely be collectable as soon as the function returns); but with self.language = ffi.new("char[?]", luastring, #luastring + 1), my understanding is that it's a proper cdata byte array, with a null-terminated copy of the string inside, tracked by the GC; so, as long as any refs to it are alive, it's not released. And self.language should be the only ref to it, so as long as it's not replaced (or self itself is replaced/destroyed), the actual memory should be pinned.

(I don't have anything against the fix, I'd just like to understand what's happening ^^).

@benoit-pierre
Copy link
Contributor Author

From luajit's documentation:

Garbage Collection of cdata Objects

All explicitly (ffi.new(), ffi.cast() etc.) or implicitly (accessors) created cdata objects are garbage collected. You need to ensure to retain valid references to cdata objects somewhere on a Lua stack, an upvalue or in a Lua table while they are still in use. Once the last reference to a cdata object is gone, the garbage collector will automatically free the memory used by it (at the end of the next GC cycle).

Please note, that pointers themselves are cdata objects, however they are not followed by the garbage collector. So e.g. if you assign a cdata array to a pointer, you must keep the cdata object holding the array alive as long as the pointer is still in use:

@NiLuJe
Copy link
Member

NiLuJe commented May 17, 2023

Yeah, but, despite C arrays decaying to pointers, this isn't a pointer, it's an array ;).

@benoit-pierre
Copy link
Contributor Author

The koptcontext field is a pointer, the code is setting a FFI structure member.

@NiLuJe
Copy link
Member

NiLuJe commented May 17, 2023

Oh, dammit.

char *language;

Yup, never even thought to double-check that.

@NiLuJe
Copy link
Member

NiLuJe commented May 17, 2023

I keep forgetting that this thing is a struct, this isn't the first time it's tripped me up >_<".

(FWIW, all that I wrote above was based on the false assumption that self was a plain Lua table).

@benoit-pierre
Copy link
Contributor Author

And those 2 patches take care of the (many) uninitialized memory access Valgrind complains about:

  • leptonica (similar fix in upstream's master):
diff --git a/src/pix2.c b/src/pix2.c
index 6c1b06f4..e5b3a9a4 100644
--- a/src/pix2.c
+++ b/src/pix2.c
@@ -1838,7 +1838,7 @@ PIX     *pixd;
     pixGetDimensions(pixs, &ws, &hs, &d);
     wd = ws + left + right;
     hd = hs + top + bot;
-    if ((pixd = pixCreateNoInit(wd, hd, d)) == NULL)
+    if ((pixd = pixCreate(wd, hd, d)) == NULL)
         return (PIX *)ERROR_PTR("pixd not made", procName, NULL);
     pixCopyResolution(pixd, pixs);
     pixCopyColormap(pixd, pixs);
@@ -1917,7 +1917,7 @@ PIX     *pixd;
         return (PIX *)ERROR_PTR("width must be > 0", procName, NULL);
     if (hd <= 0)
         return (PIX *)ERROR_PTR("height must be > 0", procName, NULL);
-    if ((pixd = pixCreateNoInit(wd, hd, d)) == NULL)
+    if ((pixd = pixCreate(wd, hd, d)) == NULL)
         return (PIX *)ERROR_PTR("pixd not made", procName, NULL);
     pixCopyResolution(pixd, pixs);
     pixCopySpp(pixd, pixs);
  • libk2pdfopt:
diff --git a/k2pdfoptlib/k2proc.c b/k2pdfoptlib/k2proc.c
index 9052cd68..36f13e89 100644
--- a/k2pdfoptlib/k2proc.c
+++ b/k2pdfoptlib/k2proc.c
@@ -1579,6 +1579,7 @@ printf("Creating single wrmap.\n");
 #endif
             wrmaps0=&_wrmaps0;
             wrectmaps_init(wrmaps0);
+            memset(&wrmap, 0, sizeof (wrmap));
             wrmap.srcpageno=newregion->pageno;
             wrmap.srcwidth=newregion->bmp->width;
             wrmap.srcheight=newregion->bmp->height;
diff --git a/k2pdfoptlib/textrows.c b/k2pdfoptlib/textrows.c
index 88dcf712..2155fcce 100644
--- a/k2pdfoptlib/textrows.c
+++ b/k2pdfoptlib/textrows.c
@@ -529,6 +529,7 @@ printf("    row[%2d].lc/h5050/cap = %3d %3d %3d\n",i,textrows->textrow[i].lcheig
         {
         int lch,rh,c1new,partial,r0,nrt;
         int *prowthresh,*rthresh;
+        prowthresh=NULL;
 
         lch=0;
         if (i>0 && textrows->textrow[i].capheight > textrows->textrow[i-1].capheight*1.8)
diff --git a/k2pdfoptlib/wrapbmp.c b/k2pdfoptlib/wrapbmp.c
index de01921a..adb754c0 100644
--- a/k2pdfoptlib/wrapbmp.c
+++ b/k2pdfoptlib/wrapbmp.c
@@ -235,6 +235,7 @@ exit(10);
         WRECTMAP _wrmap,*wrmap;
 
         wrmap=&_wrmap;
+        memset(wrmap, 0, sizeof (WRECTMAP));
         wrmap->srcpageno = region->pageno;
         wrmap->srcwidth = region->bmp8->width;
         wrmap->srcheight = region->bmp8->height;
@@ -318,6 +319,7 @@ k2printf("3.  wbh=%d x %d, tmp=%d x %d x %d, new_base=%d, wbbase=%d\n",wrapbmp->
     WRECTMAP _wrmap,*wrmap;
 
     wrmap=&_wrmap;
+    memset(wrmap, 0, sizeof (WRECTMAP));
     wrmap->srcpageno = region->pageno;
     wrmap->srcwidth = region->bmp8->width;
     wrmap->srcheight = region->bmp8->height;

@NiLuJe
Copy link
Member

NiLuJe commented May 17, 2023

this isn't the first time it's tripped me up >_<".

Which is probably where that fixme above it came from ^^. Which also may not entirely be well thought-through in terms of scoping ;p.

@benoit-pierre
Copy link
Contributor Author

this isn't the first time it's tripped me up >_<".

Which is probably where that fixme above it came from ^^. Which also may not be entirely be well thought-through in terms of scoping ;p.

Right, should I get rid of it?

@NiLuJe
Copy link
Member

NiLuJe commented May 17, 2023

Right, should I get rid of it?

Yup, and I would definitely add a reminder that this is a struct to explain the roundabout strdup, as I or others will surely forget that fact again in the future ;o).

@benoit-pierre
Copy link
Contributor Author

    if language then
        -- As `self` is a struct, and `self.language` a `char *`,
        -- we need to make a C copy of the language string.
        self.language = C.strdup(language)
    end

?

@NiLuJe
Copy link
Member

NiLuJe commented May 18, 2023

Yup, perfect!

Don't set context field from a string that could be garbage collected…
@benoit-pierre benoit-pierre force-pushed the pr/fix_koptcontext_language_handling branch from 4e384a5 to f785523 Compare May 18, 2023 00:06
@Frenzie Frenzie merged commit 27e7c5c into koreader:master May 18, 2023
@Frenzie
Copy link
Member

Frenzie commented May 18, 2023

I keep forgetting that this thing is a struct, this isn't the first time it's tripped me up >_<".

Same. :-)

@benoit-pierre benoit-pierre deleted the pr/fix_koptcontext_language_handling branch May 18, 2023 09:08
Frenzie added a commit to Frenzie/koreader that referenced this pull request May 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants