This issue was moved to a discussion.

[Question] mupdf, mupdf-third: static libs or dynamic libs needed? #1396

Closed
Mark-Joy opened this issue Nov 12, 2021 · 17 comments

@Mark-Joy

Compiling PyMuPDF from source requires the mupdf and mupdf-third libs. Could you please clarify which kind of libs they are?
In my experience, mupdf-third is a static lib while mupdf is a dynamic lib.

@JorjMcKie
Collaborator

Embarrassingly, I am not sure. All I know is that MuPDF needs to be built with CFLAGS="-fPIC". Sounds like dynamic to me.

@Mark-Joy
Author

It seems that since 1.18, mupdf no longer provides libmupdf-third.so (dynamic lib): https://www.mail-archive.com/pld-cvs-commit@lists.pld-linux.org/msg475799.html
So it may not actually be required by pymupdf.
But you can still get libmupdf-third.a by running a static build of mupdf.

In my build, to get pymupdf to compile successfully, I had to build libmupdf-third.a separately from libmupdf.so.

If it turns out not to be required, could you please remove mupdf-third from setup.py?

@Mark-Joy
Author

Since 1.18.0, mupdf no longer produces libmupdf-third.so in its build, so I tried removing mupdf-third from PyMuPDF's setup.py.
mupdf 1.19.0 + PyMuPDF 1.19.1 installed successfully and seem to work fine without mupdf-third.
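
One way to avoid hard-coding the decision is to look for the library file before adding it to the link list. The following is a minimal sketch, not PyMuPDF's actual setup.py logic; the function names and the directory list are assumptions made for illustration.

```python
from pathlib import Path


def find_mupdf_third(lib_dirs):
    """Return the first libmupdf-third.a or libmupdf-third.so found, else None."""
    for directory in lib_dirs:
        for suffix in (".a", ".so"):
            candidate = Path(directory) / f"libmupdf-third{suffix}"
            if candidate.exists():
                return candidate
    return None


def link_libraries(lib_dirs):
    """Build the list of libraries to link, adding mupdf-third only if present."""
    libs = ["mupdf"]
    if find_mupdf_third(lib_dirs) is not None:
        libs.append("mupdf-third")
    return libs


if __name__ == "__main__":
    # Illustrative search path; a real build would use its configured -L directories.
    print(link_libraries(["/usr/lib", "/usr/lib64", "/usr/local/lib"]))
```

A check like this only covers the directories it is given, so a library installed elsewhere would still be missed.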

@JorjMcKie
Collaborator

Ok, thanks for digging this out.
I will remove that lib from the setup.py then.

@JorjMcKie
Collaborator

Just tried to compile under Linux; that does not work:

x86_64-linux-gnu-g++ -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -Wl,-Bsymbolic-functions -Wl,-z,relro -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 build/temp.linux-x86_64-3.8/fitz/fitz_wrap.o -lmupdf -o build/lib.linux-x86_64-3.8/fitz/_fitz.cpython-38-x86_64-linux-gnu.so
//usr/local/lib/libmupdf.a(load-jpx.o): In function `jpx_read_image':
load-jpx.c:(.text.jpx_read_image+0x83): undefined reference to `opj_set_default_decoder_parameters'
load-jpx.c:(.text.jpx_read_image+0xa3): undefined reference to `opj_create_decompress'
load-jpx.c:(.text.jpx_read_image+0xbd): undefined reference to `opj_set_info_handler'
load-jpx.c:(.text.jpx_read_image+0xcf): undefined reference to `opj_set_warning_handler'
load-jpx.c:(.text.jpx_read_image+0xe1): undefined reference to `opj_set_error_handler'
load-jpx.c:(.text.jpx_read_image+0xf1): undefined reference to `opj_setup_decoder'
load-jpx.c:(.text.jpx_read_image+0x103): undefined reference to `opj_stream_default_create'
load-jpx.c:(.text.jpx_read_image+0x131): undefined reference to `opj_stream_set_read_function'
load-jpx.c:(.text.jpx_read_image+0x140): undefined reference to `opj_stream_set_skip_function'
load-jpx.c:(.text.jpx_read_image+0x14f): undefined reference to `opj_stream_set_seek_function'
load-jpx.c:(.text.jpx_read_image+0x161): undefined reference to `opj_stream_set_user_data'
load-jpx.c:(.text.jpx_read_image+0x16c): undefined reference to `opj_stream_set_user_data_length'
...
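
The `opj_` symbols in the log above belong to the OpenJPEG API, which suggests that the static libmupdf.a was linked without the library that provides them. When a log like this is long, it can help to collect the undefined symbols and group them by prefix to see which third-party libraries are missing. This is a small illustrative sketch; the helper names are made up, and the sample text is copied from the first two error lines above.

```python
import re
from collections import Counter

# Matches the GNU ld message form seen in the log: undefined reference to `name'
UNDEFINED_RE = re.compile(r"undefined reference to `([^']+)'")


def undefined_symbols(linker_output):
    """Return undefined symbol names in order of first appearance, without repeats."""
    seen = []
    for name in UNDEFINED_RE.findall(linker_output):
        if name not in seen:
            seen.append(name)
    return seen


def count_by_prefix(symbols):
    """Count symbols by the text before the first underscore (e.g. 'opj')."""
    return Counter(symbol.split("_", 1)[0] for symbol in symbols)


SAMPLE = """\
load-jpx.c:(.text.jpx_read_image+0x83): undefined reference to `opj_set_default_decoder_parameters'
load-jpx.c:(.text.jpx_read_image+0xa3): undefined reference to `opj_create_decompress'
"""

if __name__ == "__main__":
    symbols = undefined_symbols(SAMPLE)
    print(symbols)
    print(count_by_prefix(symbols))
```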

@JorjMcKie
Collaborator

Tried with this setup.py:

DEFAULT = [
    "mupdf",
    # "mupdf-third",
]

ALPINE = DEFAULT + [
    "jbig2dec",
    "jpeg",
    "openjp2",
    "harfbuzz",
]
ARCH_LINUX = DEFAULT + [
    "jbig2dec",
    "openjp2",
    "jpeg",
    "freetype",
    "gumbo",
]
OPENSUSE = ARCH_LINUX + [
    "harfbuzz",
    "png16",
]
FEDORA = ARCH_LINUX + [
    "harfbuzz",
    "leptonica",
    "tesseract",
]
NIX = ARCH_LINUX + ["harfbuzz"]
LIBRARIES = {
    "default": DEFAULT,
    "ubuntu": DEFAULT,
    "arch": ARCH_LINUX,
    "manjaro": ARCH_LINUX,
    "artix": ARCH_LINUX,
    "opensuse": OPENSUSE,
    "fedora": FEDORA,
    "alpine": ALPINE,
    "nix": NIX,
}

...

@Mark-Joy
Author

Mark-Joy commented Nov 13, 2021

What I did:

  • Removed all previous versions of the mupdf libs in /usr/lib/: libmupdf.a, libmupdf.so, libmupdf-third.a
  • Compiled mupdf 1.19.0 as a shared lib with the command below:
    make HAVE_X11=no HAVE_GLUT=no shared=yes build=release install
    It gave me libmupdf.so (dynamic lib) in /usr/lib/
    Your build error mentions libmupdf.a, which is a static lib.
  • Modified setup.py, then went to the pymupdf folder and ran: pip install .

@JorjMcKie
Collaborator

But this is a major change:
Your installation seems to end up with an additional runtime dependency on a *.so file, or am I wrong?
You cannot distribute the result in one wheel.

What I am doing is creating one self-contained binary.

@Mark-Joy
Author

I have a workaround on my system. To avoid compiling the static lib libmupdf-third.a, I simply create a symlink:

ln -s libmupdf.so libmupdf-third.so

Now I don't have to modify setup.py.

I want to confirm one thing, though. Could you try deleting /usr/lib/libmupdf.so and then running python3 -c "import fitz"?
This checks whether fitz is independent of libmupdf.so.
When I do so, I get the error below:

python3 -c "import fitz"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/data/data/com.termux/files/usr/lib/python3.10/site-packages/fitz/__init__.py", line 10, in <module>
    from fitz.fitz import *
  File "/data/data/com.termux/files/usr/lib/python3.10/site-packages/fitz/fitz.py", line 17, in <module>
    from . import _fitz
ImportError: dlopen failed: library "libmupdf.so" not found
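
As the traceback shows, a shared library that is missing at load time surfaces in Python as an ImportError. The same check can be scripted so that it reports the message instead of aborting. This is a minimal sketch; `try_import` is a hypothetical helper name, not part of PyMuPDF.

```python
import importlib


def try_import(module_name):
    """Return (True, None) on success, or (False, message) if the import fails."""
    try:
        importlib.import_module(module_name)
    except ImportError as exc:
        # Covers both a missing module and a failed dlopen of a dependency.
        return False, str(exc)
    return True, None


if __name__ == "__main__":
    ok, error = try_import("fitz")
    print("import fitz:", "ok" if ok else f"failed ({error})")
```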

@JorjMcKie
Collaborator

Confirmed:
In my Linux installation (Windows WSL), mupdf is not installed at all, so /usr/lib/ contains neither libmupdf.so nor any other mupdf component, but pymupdf can be imported just fine.

@sedimentation-fault

sedimentation-fault commented Dec 8, 2022

I got bitten by the "missing mupdf-third library" bug too! I kept on getting

creating build
creating build/temp.linux-x86_64-3.10
creating build/temp.linux-x86_64-3.10/fitz
x86_64-pc-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -O2 -march=x86-64-v3 -pipe -ftree-vectorize -fwrapv -fPIC -I/usr/include/mupdf -I/usr/local/include/mupdf -I/usr/include/freetype2 -I/usr/include/python3.10 -c fitz/fitz_wrap.c -o build/temp.linux-x86_64-3.10/fitz/fitz_wrap.o -Wno-incompatible-pointer-types -Wno-pointer-sign -Wno-sign-compare
creating build/lib.linux-x86_64-3.10
creating build/lib.linux-x86_64-3.10/fitz
x86_64-pc-linux-gnu-g++ -pthread -shared -Wl,-O1 -Wl,--as-needed build/temp.linux-x86_64-3.10/fitz/fitz_wrap.o -L/usr/lib64 -lmupdf -lmupdf-third -ljbig2dec -lopenjp2 -ljpeg -lfreetype -lgumbo -o build/lib.linux-x86_64-3.10/fitz/_fitz.cpython-310-x86_64-linux-gnu.so
/usr/lib/gcc/x86_64-pc-linux-gnu/11.2.0/../../../../x86_64-pc-linux-gnu/bin/ld: cannot find -lmupdf-third
collect2: error: ld returned 1 exit status
error: command '/usr/bin/x86_64-pc-linux-gnu-g++' failed with exit code 1

I am on Gentoo. Finally, I had to look into setup.py...and I realized it was...heuristic - to put it nicely. Many questions arose:

  • Why does it use "ARCH_LINUX" for OSes that are not Arch?
  • Why does it need /etc/os-release? (This file was a dangling symlink because it used relative paths and my /etc is itself a symbolic link to somewhere deeper in the file hierarchy, so I had to spend half a day chasing dangling symlinks in my /etc ... :roll:)
  • Why would "Fedora" need "leptonica" and "tesseract", while I, on Gentoo, would not? I do have both leptonica and tesseract installed, and I use them for deskewing and OCRing respectively...
  • If 'linux' is defined as linux = sys.platform.startswith('linux') and 'openbsd' is defined as sys.platform.startswith('openbsd'), then why is 'darwin' defined the same way? Is "darwin" on the same "level" as "Linux" or "OpenBSD"? Well, no, it is an OS variant, in this case of "Ubuntu", so the variable would be better named 'ubuntu_darwin'. Inconsistent naming brings great confusion if you approach something with an outsider's mind. To me, "darwin" is nothing. Never heard of an OS with this name.
  • Why is 'gentoo' missing from the list?

Questions over questions... To make it short, I just commented out

`# "mupdf-third",`

from the DEFAULT list and moved on. I can only hope that this is not going to bite me when I actually try to run PyMuPDF!

[Edit: Just formatted it nicely, didn't add or remove anything].
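
The /etc/os-release problem described above (a dangling symlink, and a distribution missing from the table) can be handled by treating an unreadable file and an unknown ID the same way: fall back to a default. The following is a sketch under that assumption; the function names are hypothetical, the quote handling is deliberately simple, and this is not the logic setup.py actually used.

```python
def parse_os_release(text):
    """Parse os-release style KEY=value lines into a dict (simple unquoting only)."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        info[key] = value.strip().strip('"').strip("'")
    return info


def read_os_id(path="/etc/os-release"):
    """Return the distribution ID, or None if the file is missing or unreadable."""
    try:
        with open(path, encoding="utf-8") as handle:
            return parse_os_release(handle.read()).get("ID")
    except OSError:
        # A dangling symlink raises FileNotFoundError, a subclass of OSError.
        return None


def libraries_for(os_id, table, default_key="default"):
    """Look up the library list for os_id, falling back to the default entry."""
    return table.get(os_id, table[default_key])


if __name__ == "__main__":
    table = {"default": ["mupdf"], "arch": ["mupdf", "jbig2dec", "openjp2"]}
    print(libraries_for(read_os_id(), table))
```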

@sedimentation-fault

Just saw this

https://en.wikipedia.org/wiki/Darwin_(operating_system)

so it's that Darwin, after all, not "Ubuntu Darwin"...so I take my comment regarding naming of the 'darwin' variable back.

@julian-smith-artifex-com
Collaborator

Please post the exact command (including any environmental variable settings) you are using to build PyMuPDF. It would also be useful to know the values of sys.platform and os.uname()[0] on your system.

[As of PyMuPDF-1.20.0, setup.py defaults to downloading and building MuPDF locally, links it directly into PyMuPDF, and does not use the various distribution-specific variables such as DEFAULT, ALPINE, ARCH_LINUX, FEDORA and LIBRARIES etc.]

@sedimentation-fault

> Please post the exact command (including any environmental variable settings) you are using to build PyMuPDF.

Indeed, that slipped through, sorry... The command I used was:

PYMUPDF_SETUP_MUPDF_BUILD='' python setup.py install

the reason behind it being that I already have the latest (1.21.0) mupdf installed, so I saw no reason to violate the DRY (Don't Repeat Yourself) principle.

> It would also be useful to know the values of sys.platform and os.uname()[0] on your system.

Right again, sorry for the omission. Here they are:

python -c 'import sys; import os; import fitz; print(sys.version, "\n", sys.platform, "\n", os.uname()[0] , "\n", fitz.__doc__)'
3.10.0 (default, Feb 11 2022, 00:50:04) [GCC 11.2.0] 
 linux 
 Linux 
 
PyMuPDF 1.21.0rc2: Python bindings for the MuPDF 1.21.0 library.
Version date: 2022-11-07 00:00:01.
Built for Python 3.10 on linux (64-bit).

@julian-smith-artifex-com
Collaborator

Thanks for that information.

So my understanding is:

  • We're trying to link PyMuPDF with the system mupdf on Gentoo, and ld is failing to find libmupdf-third.a.
  • There is some evidence that omitting -l mupdf-third from the link command can make things work.

However, for example, https://packages.debian.org/sid/amd64/libmupdf-dev/filelist includes /usr/lib/libmupdf-third.a, so we probably require -l mupdf-third on some systems, and so can't simply remove all mention of -l mupdf-third when linking with the system mupdf. Instead we'll probably need to use an environmental variable to control things.

Before doing this, could you try removing -l mupdf-third from the link command by editing your setup.py, and verify that this gives a PyMuPDF that works on your system?
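
The environment-variable idea mentioned above could look roughly like this. It is only a sketch: the variable name EXAMPLE_LINK_MUPDF_THIRD is invented for illustration and is not a documented PyMuPDF setting, and the default of linking mupdf-third is an assumption.

```python
import os

# Hypothetical variable name, used only for this example.
ENV_NAME = "EXAMPLE_LINK_MUPDF_THIRD"


def link_mupdf_third(environ=None):
    """Return False only when the variable is explicitly set to '0'."""
    env = os.environ if environ is None else environ
    return env.get(ENV_NAME, "1").strip() != "0"


def system_link_libraries(environ=None):
    """Libraries to pass to the linker when using the system MuPDF."""
    libs = ["mupdf"]
    if link_mupdf_third(environ):
        libs.append("mupdf-third")
    return libs


if __name__ == "__main__":
    print(system_link_libraries())
```

Keeping the default at "link it" would leave existing builds unchanged, while a system without libmupdf-third.a could opt out by setting the variable to 0.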

@sedimentation-fault

sedimentation-fault commented Dec 13, 2022

Sorry for the delay... As I said initially, I just deleted that -l mupdf-third, compilation worked, and I never looked back. :-)

In #1396 (comment) @Mark-Joy said that starting with 1.18.0 mupdf stopped building mupdf-third. So it seems that the correct way to do it is to check the version of the installed mupdf and act accordingly, either by specifying mupdf-third or by dropping it.

[Edit: Formatting only.]
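
A version-based decision could be sketched as follows. The 1.18 cutoff is taken from the comment referenced above; the function names are hypothetical, and how the installed MuPDF version string is obtained is left out. Since the Debian package mentioned earlier in the thread ships libmupdf-third.a, a version check alone may not be reliable on every distribution.

```python
import re


def parse_version(text):
    """Turn '1.21.0rc2' into (1, 21, 0), keeping only leading digits of each part."""
    numbers = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        numbers.append(int(match.group()))
    return tuple(numbers)


def expects_mupdf_third(mupdf_version, cutoff=(1, 18)):
    """Return True for versions older than the cutoff (assumed to be 1.18)."""
    return parse_version(mupdf_version) < cutoff


if __name__ == "__main__":
    for version in ("1.17.0", "1.18.0", "1.21.0rc2"):
        print(version, expects_mupdf_third(version))
```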

@JorjMcKie JorjMcKie removed their assignment Feb 10, 2023
julian-smith-artifex-com added a commit to ArtifexSoftware/PyMuPDF-julian that referenced this issue Apr 25, 2023
julian-smith-artifex-com added a commit that referenced this issue Apr 25, 2023
@JorjMcKie
Collaborator

I think this is no longer an open issue and can safely be transferred to "Discussions".

@pymupdf pymupdf locked and limited conversation to collaborators May 24, 2023
@JorjMcKie JorjMcKie converted this issue into discussion #2421 May 24, 2023
