Visual debugging fails -- ImportError: MagickWand shared library not found #81

Closed
cjwinchester opened this issue Sep 10, 2018 · 13 comments

@cjwinchester

Hello! First, thanks to everyone who contributes to this library -- we use it all the time and it is fantastic.

My problem: I'm getting an import error when I try to use the to_image() method on a PDF page.

Here's the code that fails:

import pdfplumber as pp

FILE = 'path/to/my/file.pdf'

with pp.open(FILE) as pdf:
    test = pdf.pages[0]
    crop = (0, 100, 295, test.height-100)
    cropped_page = test.crop(crop)
    cropped_page.to_image()

Here's the traceback -- I think it's throwing up when the code calls import wand.image:

OSError                                   Traceback (most recent call last)
~/.virtualenvs/bga-ttQ8tdNt/lib/python3.7/site-packages/wand/api.py in <module>()
    179 try:
--> 180     libraries = load_library()
    181 except (OSError, IOError):

~/.virtualenvs/bga-ttQ8tdNt/lib/python3.7/site-packages/wand/api.py in load_library()
    134         return libwand, libmagick
--> 135     raise IOError('cannot find library; tried paths: ' + repr(tried_paths))
    136 

OSError: cannot find library; tried paths: []

During handling of the above exception, another exception occurred:

ImportError                               Traceback (most recent call last)
<ipython-input-37-4c7bb7edc98c> in <module>()
      3     crop = (0, 100, 295, test.height - 100)
      4     cropped_page = test.crop(crop)
----> 5     cropped_page.to_image()
      6 

~/.virtualenvs/bga-ttQ8tdNt/lib/python3.7/site-packages/pdfplumber/page.py in to_image(self, resolution)
    214         For conversion_kwargs, see http://docs.wand-py.org/en/latest/wand/image.html#wand.image.Image
    215         """
--> 216         from .display import PageImage, DEFAULT_RESOLUTION
    217         res = resolution or DEFAULT_RESOLUTION
    218         return PageImage(self, resolution=res)

~/.virtualenvs/bga-ttQ8tdNt/lib/python3.7/site-packages/pdfplumber/display.py in <module>()
      1 import PIL.Image
      2 import PIL.ImageDraw
----> 3 import wand.image
      4 import sys, os
      5 from io import BytesIO

~/.virtualenvs/bga-ttQ8tdNt/lib/python3.7/site-packages/wand/image.py in <module>()
     18 
     19 from . import compat
---> 20 from .api import MagickPixelPacket, libc, libmagick, library
     21 from .color import Color
     22 from .compat import (binary, binary_type, encode_filename, file_types,

~/.virtualenvs/bga-ttQ8tdNt/lib/python3.7/site-packages/wand/api.py in <module>()
    204     raise ImportError('MagickWand shared library not found.\n'
    205                       'You probably had not installed ImageMagick library.\n'
--> 206                       'Try to install:\n  ' + msg)
    207 
    208 #: (:class:`ctypes.CDLL`) The MagickWand library.

ImportError: MagickWand shared library not found.
You probably had not installed ImageMagick library.
Try to install:
  brew install freetype imagemagick
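
A quick way to confirm that the failure comes from Wand itself rather than from pdfplumber is to import wand.image on its own. The snippet below is only a sketch; `check_wand_import` is a hypothetical helper name, not part of either library.

```python
import importlib


def check_wand_import():
    """Try to import wand.image and report whether it succeeded.

    Returns (True, "") on success, or (False, <error message>) when the
    import fails, e.g. because the MagickWand shared library is missing.
    """
    try:
        importlib.import_module("wand.image")
    except ImportError as exc:
        return False, str(exc)
    return True, ""


ok, message = check_wand_import()
print("wand.image importable:", ok)
if not ok:
    print(message)
```

If this fails with the same "MagickWand shared library not found" message, the problem is in how Wand locates ImageMagick, independent of pdfplumber.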

I'm on a MacBook Pro running High Sierra and Python 3.7. I'm using pipenv to manage dependencies (but I get the same error when I try it in a basic virtualenv).

$ pipenv shell
$ python --version
Python 3.7.0

Here's my Pipfile:

$ cat Pipfile
[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"

[packages]
jupyter = "*"
pdfplumber = "*"
wand = "*"

[dev-packages]

[requires]

I've confirmed that imagemagick and freetype are installed and working:

$ brew list freetype --versions
freetype 2.9.1 2.9

$ brew list imagemagick --versions
imagemagick 7.0.7-28 7.0.7-21 7.0.7-27 7.0.6-10 7.0.8-11_1 7.0.7-32 7.0.7-25 7.0.7-23 7.0.5-4

Any advice would be greatly appreciated. Thanks again!

@jsvine
Owner

jsvine commented Sep 11, 2018

Thanks for the kind words, Cody, and a special thanks for the helpful details.

I recall the following fix having worked for me: https://stackoverflow.com/questions/37011291/python-wand-image-is-not-recognized/41772062#41772062

Does it work for you? If so, I can add a note to the readme.

@cjwinchester
Author

Progress! Now I'm getting a wand PolicyError:

PolicyError                               Traceback (most recent call last)
<ipython-input-6-474b1ba4e342> in <module>()
      3     crop = (0, 100, 295, test.height-100)
      4     cropped_page = test.crop(crop)
----> 5     cropped_page.to_image()

~/.virtualenvs/bga-ttQ8tdNt/lib/python3.7/site-packages/pdfplumber/page.py in to_image(self, resolution)
    216         from .display import PageImage, DEFAULT_RESOLUTION
    217         res = resolution or DEFAULT_RESOLUTION
--> 218         return PageImage(self, resolution=res)
    219 
    220 class DerivedPage(Page):

~/.virtualenvs/bga-ttQ8tdNt/lib/python3.7/site-packages/pdfplumber/display.py in __init__(self, page, original, resolution)
     42                 page.pdf.stream.name,
     43                 page.page_number - 1,
---> 44                 resolution
     45             )
     46         else:

~/.virtualenvs/bga-ttQ8tdNt/lib/python3.7/site-packages/pdfplumber/display.py in get_page_image(pdf_path, page_no, resolution)
     23     """
     24     page_path = "{0}[{1}]".format(pdf_path, page_no)
---> 25     with wand.image.Image(filename=page_path, resolution=resolution) as img:
     26         if img.alpha_channel:
     27             img.background_color = wand.image.Color('white')

~/.virtualenvs/bga-ttQ8tdNt/lib/python3.7/site-packages/wand/image.py in __init__(self, image, blob, file, filename, format, width, height, depth, background, resolution)
   2742                     self.read(blob=blob, resolution=resolution)
   2743                 elif filename is not None:
-> 2744                     self.read(filename=filename, resolution=resolution)
   2745                 # clear the wand format, otherwise any subsequent call to
   2746                 # MagickGetImageBlob will silently change the image to this

~/.virtualenvs/bga-ttQ8tdNt/lib/python3.7/site-packages/wand/image.py in read(self, file, filename, blob, resolution)
   2820             r = library.MagickReadImage(self.wand, filename)
   2821         if not r:
-> 2822             self.raise_exception()
   2823 
   2824     def close(self):

~/.virtualenvs/bga-ttQ8tdNt/lib/python3.7/site-packages/wand/resource.py in raise_exception(self, stacklevel)
    220             warnings.warn(e, stacklevel=stacklevel + 1)
    221         elif isinstance(e, Exception):
--> 222             raise e
    223 
    224     def __enter__(self):

PolicyError: not authorized `PDF' @ error/constitute.c/IsCoderAuthorized/408

Here's what's in my policy file at /usr/local/Cellar/imagemagick@6/6.9.10-11_1/etc/ImageMagick-6/policy.xml:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE policymap [
  <!ELEMENT policymap (policy)+>
  <!ATTLIST policymap xmlns CDATA #FIXED ''>
  <!ELEMENT policy EMPTY>
  <!ATTLIST policy xmlns CDATA #FIXED '' domain NMTOKEN #REQUIRED
    name NMTOKEN #IMPLIED pattern CDATA #IMPLIED rights NMTOKEN #IMPLIED
    stealth NMTOKEN #IMPLIED value CDATA #IMPLIED>
]>
<!--
  Configure ImageMagick policies.

  Domains include system, delegate, coder, filter, path, or resource.

  Rights include none, read, write, execute and all.  Use | to combine them,
  for example: "read | write" to permit read from, or write to, a path.

  Use a glob expression as a pattern.

  Suppose we do not want users to process MPEG video images:

    <policy domain="delegate" rights="none" pattern="mpeg:decode" />

  Here we do not want users reading images from HTTP:

    <policy domain="coder" rights="none" pattern="HTTP" />

  The /repository file system is restricted to read only.  We use a glob
  expression to match all paths that start with /repository:

    <policy domain="path" rights="read" pattern="/repository/*" />

  Lets prevent users from executing any image filters:

    <policy domain="filter" rights="none" pattern="*" />

  Any large image is cached to disk rather than memory:

    <policy domain="resource" name="area" value="1GP"/>

  Define arguments for the memory, map, area, width, height and disk resources
  with SI prefixes (.e.g 100MB).  In addition, resource policies are maximums
  for each instance of ImageMagick (e.g. policy memory limit 1GB, -limit 2GB
  exceeds policy maximum so memory limit is 1GB).

  Rules are processed in order.  Here we want to restrict ImageMagick to only
  read or write a small subset of proven web-safe image types:

    <policy domain="delegate" rights="none" pattern="*" />
    <policy domain="filter" rights="none" pattern="*" />
    <policy domain="coder" rights="none" pattern="*" />
    <policy domain="coder" rights="read|write" pattern="{GIF,JPEG,PNG,WEBP}" />
-->
<policymap>
  <!-- <policy domain="system" name="shred" value="2"/> -->
  <!-- <policy domain="system" name="precision" value="6"/> -->
  <!-- <policy domain="system" name="memory-map" value="anonymous"/> -->
  <!-- <policy domain="system" name="max-memory-request" value="256MiB"/> -->
  <!-- <policy domain="resource" name="temporary-path" value="/tmp"/> -->
  <policy domain="resource" name="memory" value="256MiB"/>
  <policy domain="resource" name="map" value="512MiB"/>
  <policy domain="resource" name="width" value="16KP"/>
  <policy domain="resource" name="height" value="16KP"/>
  <!-- <policy domain="resource" name="list-length" value="128"/> -->
  <policy domain="resource" name="area" value="128MB"/>
  <policy domain="resource" name="disk" value="1GiB"/>
  <!-- use curl -->
  <policy domain="delegate" rights="none" pattern="URL" />
  <policy domain="delegate" rights="none" pattern="HTTPS" />
  <policy domain="delegate" rights="none" pattern="HTTP" />
  <!--
    Imagemagick does not need to have been explicitly built against Ghostscript
    to be vulnerable to Ghostscript-related vulnerabilities. convert will
    happily use tools it can find at runtime, regardless of build options.
    http://seclists.org/oss-sec/2018/q3/142
    https://www.kb.cert.org/vuls/id/332928
    If you need to use Ghostscript functionality you should comment out the
    below line, at your own risk.
  -->
  <policy domain="coder" rights="none" pattern="{EPS,PS2,PS3,PS,PDF,XPS}" />
  <!-- <policy domain="resource" name="file" value="768"/> -->
  <!-- <policy domain="resource" name="thread" value="4"/> -->
  <!-- <policy domain="resource" name="throttle" value="0"/> -->
  <!-- <policy domain="resource" name="time" value="3600"/> -->
  <!-- <policy domain="coder" rights="none" pattern="MVG" /> -->
  <!-- in order to avoid to get image with password text -->
  <policy domain="path" rights="none" pattern="@*"/>
  <!-- <policy domain="cache" name="memory-map" value="anonymous"/> -->
  <!-- <policy domain="cache" name="synchronize" value="True"/> -->
  <!-- <policy domain="cache" name="shared-secret" value="passphrase" stealth="true"/> -->
</policymap>

If I remove PDF from this list --

<policy domain="coder" rights="none" pattern="{EPS,PS2,PS3,PS,PDF,XPS}" />

-- and restart my notebook, I no longer get an error, but the image doesn't display. I think it's there, tho?

# this does nothing
with pp.open(FILE) as pdf:
    test = pdf.pages[0]
    crop = (0, 100, 295, test.height-100)
    cropped_page = test.crop(crop)
    im = cropped_page.to_image()
    im
# printing shows the object exists, tho
with pp.open(FILE) as pdf:
    test = pdf.pages[0]
    crop = (0, 100, 295, test.height-100)
    cropped_page = test.crop(crop)
    im = cropped_page.to_image()
    print(im)

# <pdfplumber.display.PageImage object at 0x104e2fb00>

I'll keep Googling around and trying stuff, but I am all ears if anyone has a solution. (There's a similar open wand issue from someone pulling a PDF from S3, and some discussion here about changing the policy.xml file.)
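
A likely explanation for the missing inline image, offered as a sketch rather than a confirmed diagnosis: by default IPython/Jupyter only auto-displays the value of the last top-level expression in a cell, and a bare `im` nested inside a `with` block is not a top-level expression. The example below only parses two cell bodies with the standard `ast` module to show the difference; `open_pdf` is a placeholder name and nothing is executed.

```python
import ast


def last_top_level_is_expression(cell_source):
    """Return True if the final top-level statement of a cell is a bare expression."""
    tree = ast.parse(cell_source)
    return bool(tree.body) and isinstance(tree.body[-1], ast.Expr)


# `im` is nested inside the with-block, so the last top-level node is the
# `with` statement itself.
inside_with = """
with open_pdf() as pdf:
    im = pdf.pages[0].to_image()
    im
"""

# `im` is dedented, so it is the last top-level expression of the cell.
after_with = """
with open_pdf() as pdf:
    im = pdf.pages[0].to_image()
im
"""

print(last_top_level_is_expression(inside_with))  # False
print(last_top_level_is_expression(after_with))   # True
```

If that is the cause, dedenting the final `im` so it sits after the `with` block, or calling `IPython.display.display(im)` inside the block, should make the image show up in the notebook.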

@jsvine
Owner

jsvine commented Sep 12, 2018

Hmmm, that's a new one to me. Thanks for flagging. Does the error occur with every PDF you try, or just certain PDFs?

@cjwinchester
Author

All of 'em -- I've tried a half-dozen so far. A comment I found somewhere suggested that default policies were updated after ImageTragick, but I'm not sure if this is related?

Once I delete PDF from the list of patterns for that policy, the PageImage methods seem to work fine -- I can e.g. im.save() -- it's just not getting returned to show up inline.

I'll keep poking around, will update if I find anything useful. Thanks for taking a look!

@HiCraigChen

HiCraigChen commented Nov 5, 2018

I also got PolicyError: not authorized PDF when reading a PDF with wand.image.
I solved the problem by adding
<policy domain="coder" rights="read" pattern="PDF" /> to my policy.xml.
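
To see which active rules currently apply to the PDF coder before editing anything, the policy file can be inspected with the standard library. This is a minimal sketch: `pdf_coder_rules` is a hypothetical helper, and the sample string is a trimmed-down stand-in for the policy.xml quoted earlier in this thread.

```python
import xml.etree.ElementTree as ET

# Trimmed sample based on the policy.xml posted above (illustrative only).
SAMPLE_POLICY = """
<policymap>
  <policy domain="resource" name="memory" value="256MiB"/>
  <policy domain="coder" rights="none" pattern="{EPS,PS2,PS3,PS,PDF,XPS}" />
</policymap>
"""


def pdf_coder_rules(policy_xml):
    """Return (pattern, rights) for every active coder rule that names PDF."""
    root = ET.fromstring(policy_xml)
    rules = []
    for policy in root.iter("policy"):
        pattern = policy.get("pattern", "")
        # Patterns may be a single name ("PDF") or a brace list ("{PS,PDF}").
        names = pattern.strip("{}").split(",")
        if policy.get("domain") == "coder" and "PDF" in names:
            rules.append((pattern, policy.get("rights")))
    return rules


print(pdf_coder_rules(SAMPLE_POLICY))
```

Rules that sit inside XML comments are skipped by the parser, so only rules that are actually in effect are reported. A `rights` value of `none` for PDF is what produces the PolicyError above.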

@jsvine
Owner

jsvine commented Jan 27, 2021

(Closing this issue, but keeping it pinned so that other people can benefit from it.)

jsvine closed this as completed Jan 27, 2021

@OK-JH

OK-JH commented Apr 8, 2021

I had the same problem, and my solution was to uninstall the latest version of ImageMagick and install ImageMagick-6.9.12-6-Q16-x86-dll.exe (my Python is 32-bit).

@Cppowboy

Cppowboy commented Jun 8, 2021

> I also got PolicyError: not authorized PDF when reading a PDF with wand.image.
> I solved the problem by adding
> <policy domain="coder" rights="read" pattern="PDF" /> to my policy.xml.

Could you please tell me where that policy.xml file is?

@jsvine
Owner

jsvine commented Jun 8, 2021

@Cppowboy: I believe the location of policy.xml will depend on your operating system and the manner in which you installed ImageMagick. Hopefully a web search with those details can point you in the correct direction.
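
As a starting point for that search, a few likely locations can be globbed from Python. This is only a sketch: the candidate patterns are assumptions (the first two follow the shape of paths reported in this thread, the others are common Homebrew prefixes), and `find_policy_files` is a hypothetical helper name.

```python
import glob

# Candidate locations are guesses, not an exhaustive or authoritative list.
CANDIDATE_PATTERNS = [
    "/etc/ImageMagick-*/policy.xml",
    "/usr/local/Cellar/imagemagick*/*/etc/ImageMagick-*/policy.xml",
    "/usr/local/etc/ImageMagick-*/policy.xml",
    "/opt/homebrew/etc/ImageMagick-*/policy.xml",
]


def find_policy_files(patterns=CANDIDATE_PATTERNS):
    """Return a sorted list of existing files matching any candidate pattern."""
    found = set()
    for pattern in patterns:
        found.update(glob.glob(pattern))
    return sorted(found)


for path in find_policy_files():
    print(path)
```

ImageMagick's own `convert -list policy` (or `magick -list policy` on version 7) should also report which policy files were loaded, which is more reliable than guessing paths.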

@Cppowboy

Cppowboy commented Jun 9, 2021

> @Cppowboy: I believe the location of policy.xml will depend on your operating system and the manner in which you installed ImageMagick. Hopefully a web search with those details can point you in the correct direction.

Thank you.

@anjanvb

anjanvb commented Nov 12, 2022

In my case the policy file was in /etc/ImageMagick-6/policy.xml

And then I had to uncomment this line

<policy domain="module" rights="none" pattern="{PS,PDF,XPS}" />

And change the rights to read|write in this line towards the bottom of the file

<policy domain="coder" rights="none" pattern="PDF" />

to

<policy domain="coder" rights="read|write" pattern="PDF" />

@touchwolf

touchwolf commented Mar 13, 2023

My laptop is an M1 MacBook Pro, and my original Homebrew installation is arm64-based. To proceed, I needed to install the x86 version of Homebrew and then add the line
alias ibrew='arch -x86_64 /usr/local/bin/brew' to my ~/.zshrc file.
It's important to remember to change the permissions of /usr/local/bin/brew to 755.

After these steps, I can run ibrew install freetype imagemagick@6 and
echo 'export PATH="/usr/local/opt/imagemagick@6/bin:$PATH"' >> ~/.zshrc to ensure that my $PATH is updated. Finally, I need to add this line to ~/.zshrc: export MAGICK_HOME=/usr/local/opt/imagemagick@6
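
For anyone who would rather not edit shell startup files (for example inside a notebook), a rough Python equivalent of the last step is to set MAGICK_HOME in the process environment before Wand is imported for the first time. This is a sketch under assumptions: the default path is the Homebrew location mentioned above, `point_wand_at` is a hypothetical helper name, and it relies on Wand consulting MAGICK_HOME when it loads the library at import time.

```python
import os


def point_wand_at(magick_home="/usr/local/opt/imagemagick@6"):
    """Set MAGICK_HOME unless the environment already defines it.

    Returns the value that is in effect afterwards.
    """
    return os.environ.setdefault("MAGICK_HOME", magick_home)


chosen = point_wand_at()
print("MAGICK_HOME =", chosen)

# Only import wand (or call page.to_image()) after this point, e.g.:
# import wand.image
```

Because `setdefault` is used, a MAGICK_HOME that is already exported in the shell takes precedence over the default given here.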

jsvine unpinned this issue Jul 17, 2023

@rangyf

rangyf commented Dec 4, 2023

> Thanks for the kind words, Cody, and a special thanks for the helpful details.
>
> I recall the following fix having worked for me: https://stackoverflow.com/questions/37011291/python-wand-image-is-not-recognized/41772062#41772062
>
> Does it work for you? If so, I can add a note to the readme.

This saved my day.
I kept asking Copilot how to fix the problem and tried a lot of methods, and all of them failed.
