Cannot open the file because of jpeg file #197

Open
wawrzek opened this issue Oct 12, 2019 · 2 comments

Comments


wawrzek commented Oct 12, 2019

read_epub fails on a cover image. When I inspect the contents of the EPUB, there is a cover image, but it has a .jpg extension rather than .jpeg.
I can see three manifest items with the word Cover in content.opf:

[niewod@manila] /home/niewod/Dropbox/Books/Humble_Books/SF #>grep Cover content.opf
    <item href="Images/RobotDreams5x8Coverv3.0Front.jpg" id="RobotDreams5x8Coverv3.0Front.jpg" media-type="image/jpeg" />
    <item href="Images/Cover.jpg" id="Cover.jpg" media-type="image/jpeg" />
    <item href="Images/Cover.jpeg" id="Cover.jpeg" media-type="image/jpeg" />
    <reference href="Text/titlepage.xhtml" title="Cover" type="cover" />

Does the library try to open each of them? Cover.jpeg doesn't exist in the archive.

---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
<ipython-input-19-f48e74a49f71> in <module>
----> 1 book =epub.read_epub('robotdreams.epub')

/usr/lib/python3.7/site-packages/ebooklib/epub.py in read_epub(name, options)
   1737     reader = EpubReader(name, options)
   1738
-> 1739     book = reader.load()
   1740     reader.process()
   1741

/usr/lib/python3.7/site-packages/ebooklib/epub.py in load(self)
   1395
   1396     def load(self):
-> 1397         self._load()
   1398
   1399         return self.book

/usr/lib/python3.7/site-packages/ebooklib/epub.py in _load(self)
   1684     def _load(self):
   1685         try:
-> 1686             self.zf = zipfile.ZipFile(self.file_name, 'r', compression=zipfile.ZIP_DEFLATED, allowZip64=True)
   1687         except zipfile.BadZipfile as bz:
   1688             raise EpubException(0, 'Bad Zip file')

/usr/lib/python3.7/zipfile.py in __init__(self, file, mode, compression, allowZip64, compresslevel)
   1202             while True:
   1203                 try:
-> 1204                     self.fp = io.open(file, filemode)
   1205                 except OSError:
   1206                     if filemode in modeDict:

FileNotFoundError: [Errno 2] No such file or directory: 'robotdreams.epub'

In [20]: book =epub.read_epub('robotvisions.epub')

In [21]: cd Classic/
/media/dropbox/Dropbox/Books/Humble_Books/SF/Classic

In [22]: book =epub.read_epub('robotdreams.epub')
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-22-f48e74a49f71> in <module>
----> 1 book =epub.read_epub('robotdreams.epub')

/usr/lib/python3.7/site-packages/ebooklib/epub.py in read_epub(name, options)
   1737     reader = EpubReader(name, options)
   1738
-> 1739     book = reader.load()
   1740     reader.process()
   1741

/usr/lib/python3.7/site-packages/ebooklib/epub.py in load(self)
   1395
   1396     def load(self):
-> 1397         self._load()
   1398
   1399         return self.book

/usr/lib/python3.7/site-packages/ebooklib/epub.py in _load(self)
   1692         # 1st check metadata
   1693         self._load_container()
-> 1694         self._load_opf_file()
   1695
   1696         self.zf.close()

/usr/lib/python3.7/site-packages/ebooklib/epub.py in _load_opf_file(self)
   1662
   1663         self._load_metadata()
-> 1664         self._load_manifest()
   1665         self._load_spine()
   1666         self._load_guide()

/usr/lib/python3.7/site-packages/ebooklib/epub.py in _load_manifest(self)
   1531                     ei.file_name = unquote(r.get('href'))
   1532                     ei.media_type = media_type
-> 1533                     ei.content = self.read_file(zip_path.join(self.opf_dir, ei.get_name()))
   1534             else:
   1535                 # different types

/usr/lib/python3.7/site-packages/ebooklib/epub.py in read_file(self, name)
   1402         # Raises KeyError
   1403         name = zip_path.normpath(name)
-> 1404         return self.zf.read(name)
   1405
   1406     def _load_container(self):

/usr/lib/python3.7/zipfile.py in read(self, name, pwd)
   1426     def read(self, name, pwd=None):
   1427         """Return file bytes for name."""
-> 1428         with self.open(name, "r", pwd) as fp:
   1429             return fp.read()
   1430

/usr/lib/python3.7/zipfile.py in open(self, name, mode, pwd, force_zip64)
   1465         else:
   1466             # Get info object for name
-> 1467             zinfo = self.getinfo(name)
   1468
   1469         if mode == 'w':

/usr/lib/python3.7/zipfile.py in getinfo(self, name)
   1393         if info is None:
   1394             raise KeyError(
-> 1395                 'There is no item named %r in the archive' % name)
   1396
   1397         return info

KeyError: "There is no item named 'OEBPS/Images/Cover.jpeg' in the archive"
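
To see up front which manifest entries point at files that are not in the archive, something like the following could be used. It is only a sketch built on the standard library, not part of ebooklib; the function name `missing_manifest_items` is made up for illustration, and it assumes the package file is located through META-INF/container.xml and uses the usual OPF namespace.

```python
import posixpath
import zipfile
import xml.etree.ElementTree as ET
from urllib.parse import unquote

# Namespaces used by the container file and the OPF package file.
CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}
OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}


def missing_manifest_items(epub_path):
    """Return archive paths that the manifest lists but the zip lacks.

    Hypothetical helper for illustration; not an ebooklib function.
    """
    with zipfile.ZipFile(epub_path) as zf:
        # META-INF/container.xml points at the package (OPF) file.
        container = ET.fromstring(zf.read("META-INF/container.xml"))
        rootfile = container.find(".//c:rootfile", CONTAINER_NS)
        opf_path = rootfile.get("full-path")
        opf_dir = posixpath.dirname(opf_path)

        names = set(zf.namelist())
        package = ET.fromstring(zf.read(opf_path))

        missing = []
        for item in package.findall("opf:manifest/opf:item", OPF_NS):
            href = item.get("href")
            if href is None:
                continue
            # Manifest hrefs are relative to the directory of the OPF file.
            full = posixpath.normpath(posixpath.join(opf_dir, unquote(href)))
            if full not in names:
                missing.append(full)
        return missing
```

Calling `missing_manifest_items('robotdreams.epub')` on the file from this report would presumably list OEBPS/Images/Cover.jpeg, going by the KeyError above.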

wawrzek commented Oct 12, 2019

Removing
<item href="Images/Cover.jpeg" id="Cover.jpeg" media-type="image/jpeg" /> from content.opf fixes the problem. I wonder how book readers deal with this kind of problem.
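
The manual fix can also be scripted. The sketch below writes a repaired copy of the EPUB with every dangling manifest item dropped, using only the standard library. The name `write_repaired_copy` is hypothetical, and two things are assumptions worth knowing about: ElementTree re-serializes the OPF with an explicit `opf:` prefix (equivalent XML, but not byte-identical), and any spine `itemref` that pointed at a removed item is left untouched.

```python
import posixpath
import zipfile
import xml.etree.ElementTree as ET
from urllib.parse import unquote

CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}
OPF_URI = "http://www.idpf.org/2007/opf"
# Serialize OPF elements with an explicit "opf:" prefix instead of "ns0:".
ET.register_namespace("opf", OPF_URI)


def write_repaired_copy(src_path, dst_path):
    """Copy src_path to dst_path without manifest items whose file is absent.

    Hypothetical helper for illustration. Returns the removed href values.
    """
    removed = []
    with zipfile.ZipFile(src_path) as src:
        container = ET.fromstring(src.read("META-INF/container.xml"))
        rootfile = container.find(".//c:rootfile", CONTAINER_NS)
        opf_path = rootfile.get("full-path")
        opf_dir = posixpath.dirname(opf_path)
        names = set(src.namelist())

        package = ET.fromstring(src.read(opf_path))
        manifest = package.find("{%s}manifest" % OPF_URI)
        for item in list(manifest):
            href = item.get("href")
            if href is None:
                continue
            full = posixpath.normpath(posixpath.join(opf_dir, unquote(href)))
            if full not in names:
                manifest.remove(item)
                removed.append(href)

        new_opf = (b'<?xml version="1.0" encoding="utf-8"?>\n'
                   + ET.tostring(package, encoding="utf-8"))

        # Copy entries in their original order so "mimetype" stays first,
        # reusing each ZipInfo to keep its compression method.
        with zipfile.ZipFile(dst_path, "w") as dst:
            for info in src.infolist():
                if info.filename == opf_path:
                    data = new_opf
                else:
                    data = src.read(info)
                dst.writestr(info, data)
    return removed
```

Writing to a new file rather than editing in place keeps the original around in case the rewritten OPF upsets a reader.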

@aerkalov
Owner

Yeah, the problem is that the library reads the content of every item in the manifest. Maybe we could add an ignore flag that sets content to None when an item cannot be read, or maybe skip that item in the manifest altogether.
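
The two options could look roughly like this. It is a standalone sketch over `zipfile`, not ebooklib code: `load_contents` and its `on_missing` parameter are hypothetical names, and the only library behaviour it relies on is that `ZipFile.read` raises `KeyError` for a name that is not in the archive, as the traceback shows.

```python
import zipfile


def load_contents(zf, names, on_missing="raise"):
    """Read each archive member in `names` from the open ZipFile `zf`.

    Hypothetical helper illustrating the policies discussed above:
      "raise": propagate the KeyError (the behaviour in the traceback)
      "none":  keep the item but store None as its content
      "skip":  leave the item out of the result entirely
    """
    if on_missing not in ("raise", "none", "skip"):
        raise ValueError("unknown on_missing policy: %r" % (on_missing,))

    contents = {}
    for name in names:
        try:
            contents[name] = zf.read(name)
        except KeyError:
            # ZipFile.read raises KeyError when the member does not exist.
            if on_missing == "raise":
                raise
            if on_missing == "none":
                contents[name] = None
            # on_missing == "skip": record nothing for this name
    return contents
```

Storing None keeps the manifest entry visible to callers, who then have to handle empty content; skipping hides the broken entry but also loses the fact that the manifest mentioned it.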
