Wallabag: epubs being converted to htm in 2024.03+ #11528

Closed
dlgoodr opened this issue Mar 9, 2024 · 11 comments · Fixed by #11532

dlgoodr commented Mar 9, 2024

  • KOReader version: 2024.03
  • Device: android (leaf2 and inkpalm) (can test kindle if needed)

Issue

EPUB files downloaded through the Wallabag plugin are being converted to HTML, and much of the cruft that Wallabag strips is restored, along with guessed chapter marks.

Steps to reproduce

Install latest, download article epub, weep. Restore to 2024.01 for the expected behavior.

This isn't a crash, just an unwelcome change in experience that I can't figure out how to revert without going back to 2024.01.

Contributor

poire-z commented Mar 9, 2024

Probably #11492 (I don't use Wallabag, so I have no idea of the context).
Try manually reverting the small changes in 3e7ab19.

poire-z changed the title from "epubs being converted to htm in 2024.03+" to "Wallabag: epubs being converted to htm in 2024.03+" on Mar 9, 2024
Author

dlgoodr commented Mar 9, 2024

Based on my poor understanding of Lua, we shouldn't be affected: the server correctly returns application/epub+zip for the file:

direct wget of epub
dlg@manila ~ % wget https://wb.example.org/api/entries/1668524955/export.epub
--2024-03-09 10:09:13--  https://wb.example.org/api/entries/1668524955/export.epub
Resolving wb.example.org (wb.example.org)... 100.12.12.12
Connecting to wb.example.org (wb.example.org)| 100.12.12.12 |:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 890444 (870K) [application/epub+zip]
Saving to: ‘export.epub’

export.epub         100%[===================>] 869.57K   711KB/s    in 1.2s    

2024-03-09 10:09:15 (711 KB/s) - ‘export.epub’ saved [890444/890444]

dlg@manila ~ % file export.epub 
export.epub: EPUB document
dlg@manila ~ % 

but reverting 3e7ab19 definitely fixes the issue for me.

Contributor

poire-z commented Mar 9, 2024

From a quick reading, it feels like this change negates the original aim:

-    if article.mimetype ~= "text/html" then -- different from text/html
+    if type(article.mimetype) == "string" and article.mimetype:find("^text/html") then -- starts/equal to text/html

@Frenzie ?
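
To make the inversion concrete, here is a minimal sketch that evaluates both conditions from the diff against a few sample values. The function names `old_check` and `new_check` and the sample list are made up for illustration and are not taken from the plugin; the `~= nil` is only added so the second function returns a boolean.

```lua
-- Condition before the change: true for anything that is not exactly "text/html".
local function old_check(mimetype)
    return mimetype ~= "text/html"
end

-- Condition after the change: true only for strings that start with "text/html".
-- (find returns an index or nil, so compare against nil to get a boolean.)
local function new_check(mimetype)
    return type(mimetype) == "string" and mimetype:find("^text/html") ~= nil
end

-- Hypothetical sample values, chosen only to show where the two checks disagree.
local samples = { "text/html", "text/html; charset=utf-8", "application/epub+zip" }
for _, m in ipairs(samples) do
    print(m, old_check(m), new_check(m))
end
```

For an exact `text/html` and for `application/epub+zip` the two checks give opposite answers, so whatever branch this condition guards is now taken in the opposite cases.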

Member

Frenzie commented Mar 9, 2024 via email

Author

dlgoodr commented Mar 10, 2024

Just adding a `not` does not fix the issue.

dlgoodr added a commit to dlgoodr/koreader that referenced this issue Mar 10, 2024
…than exactly (koreader#11492)"

Introduces regression, fixes koreader#11528

This reverts commit 3e7ab19.
Member

Frenzie commented Mar 10, 2024

@dlgoodr Could you share a URL where you see this issue and your Wallabag version for good measure?

Author

dlgoodr commented Mar 11, 2024

Member

Frenzie commented Mar 11, 2024

@dlgoodr That's an export. Please provide the original URL.

Author

dlgoodr commented Mar 12, 2024

Member

Frenzie commented Mar 13, 2024

Curious, my Wallabag gives text/html; utf-8 for that. In any case, it will simply have been the obvious case of an exact text/html match, thanks.
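
Since the reported value can apparently be either a bare `text/html` or carry a suffix after a semicolon, one way to compare such values consistently is to reduce them to the bare media type first. The sketch below is only an illustration: `media_type` is a hypothetical helper, not a function from KOReader or Wallabag.

```lua
-- Hypothetical helper: reduce a Content-Type style value to its bare media type.
local function media_type(value)
    if type(value) ~= "string" then
        return nil
    end
    -- Keep everything before the first ";" (parameters such as a charset follow it).
    local bare = value:match("^([^;]*)")
    -- Trim surrounding whitespace and lowercase for case-insensitive comparison.
    bare = bare:match("^%s*(.-)%s*$"):lower()
    if bare == "" then
        return nil
    end
    return bare
end

print(media_type("text/html; utf-8"))
print(media_type("application/epub+zip"))
```

With this normalization both `text/html` and `text/html; utf-8` compare equal to `"text/html"`, and a missing value comes back as `nil` rather than raising an error.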

Frenzie added a commit that referenced this issue Mar 13, 2024
…11532)

A typo snuck in #11492, which should've read `not type(article.mimetype) == "string" or type(article.mimetype) == "string" and not article.mimetype:find("^text/html")`. But in most cases the behavior would've been identically broken because of the same underlying issue: Wallabag mimetype is actually HTTP content-type.
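
A note on reading that expression literally: in Lua, `not` binds more tightly than `==`, so `not type(x) == "string"` is parsed as `(not type(x)) == "string"`, which is always false. The sketch below compares the expression as written with a spelled-out version. It assumes the intent is "a missing or non-string mimetype counts as not HTML"; that reading and both function names are assumptions, not the code that was merged.

```lua
-- The expression exactly as quoted in the commit message.
local function as_written(mimetype)
    return not type(mimetype) == "string"
        or type(mimetype) == "string" and not mimetype:find("^text/html")
end

-- Assumed intent, spelled out: non-strings count as "not HTML",
-- strings count as "not HTML" unless they start with "text/html".
local function is_not_html(mimetype)
    if type(mimetype) ~= "string" then
        return true
    end
    return mimetype:find("^text/html") == nil
end

-- Hypothetical sample values; false stands in for a missing mimetype here.
for _, m in ipairs({ "text/html", "text/html; utf-8", "application/epub+zip", false }) do
    print(tostring(m), as_written(m), is_not_html(m))
end
```

The two versions agree for string values and differ only when the mimetype is not a string, where the literal expression yields false.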

Fixes #11528.

Also introduces a new setting associated with the behavior in case people have different preferences.
Frenzie added this to the 2024.04 milestone Mar 13, 2024
Member

Frenzie commented Mar 13, 2024

Tests welcome.
