Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wallabag can't download some articles #5025

Closed
mwoz123 opened this issue May 13, 2019 · 14 comments · Fixed by #5026 or #5036
Closed

Wallabag can't download some articles #5025

mwoz123 opened this issue May 13, 2019 · 14 comments · Fixed by #5026 or #5036

Comments

@mwoz123
Copy link
Contributor

mwoz123 commented May 13, 2019

  • KOReader version: v2019.04-80-g4ba7f98_2019-05-05
  • Device: Inbook Prime (android)

Issue

Can download couple articles from wallabag. Names:
Inicjatywa Tato.Net | 22 narzędzia, które pomogą Ci wygrać ojcostwo
» Czy policja może namierzyć sprawców fałszywych alarmów bombowych? -- Niebezpiecznik.pl…

Could it be | character in first case? coudn't be stored on fat fs ?
but what it could be in second one? I don't think it's » as couple similar (same source) were processed e.g. this one was processed:
» KeePass — jak zacząć swoją przygodę z managerem haseł? -- Niebezpiecznik.pl --

Steps to reproduce

Add below urls to wallabag :
https://tato.net/czytelnia/22-narzedzia-ktore-pomoga-ci-wygrac-ojcostwo/
https://niebezpiecznik.pl/post/czy-policja-moze-namierzyc-sprawcow-falszywych-alarmow-bombowych/?more
and then try download them on koreader.

@Frenzie
Copy link
Member

Frenzie commented May 14, 2019

Wallabag's own export produces the filename 22 narzędzia, które pomogą Ci wygrać ojcostwo.epub.

Looks like that isn't used at all, but the logic looks sound enough either way:

local title = util.replaceInvalidChars(article.title)
local local_path = self.directory .. article_id_preffix .. article.id .. article_id_postfix .. title:sub(1,30) .. ".epub"

The URL https://tato.net/czytelnia/22-narzedzia-ktore-pomoga-ci-wygrac-ojcostwo/ processes just fine on my H2O (which also uses FAT32).

I can confirm https://niebezpiecznik.pl/post/czy-policja-moze-namierzyc-sprawcow-falszywych-alarmow-bombowych/?more to result in failure.

I can't properly investigate atm.

@Frenzie
Copy link
Member

Frenzie commented May 14, 2019

the logic looks sound enough

Actually, I think it's not UTF-8 aware, so you get an invalid filename like [w-id_1955191] » Czy policja może namierzy�.epub. It's not entirely clear to me why this only sometimes breaks down, but that's probably besides the point.

@Frenzie
Copy link
Member

Frenzie commented May 14, 2019

@poire-z You've played around with this UTF-8 stuff a bit. Is there a more elegant solution than this?

diff --git a/plugins/wallabag.koplugin/main.lua b/plugins/wallabag.koplugin/main.lua
index a6aeee68..0638f787 100644
--- a/plugins/wallabag.koplugin/main.lua
+++ b/plugins/wallabag.koplugin/main.lua
@@ -298,7 +298,9 @@ function Wallabag:download(article)
     local skip_article = false
     local item_url = "/api/entries/" .. article.id .. "/export.epub"
     local title = util.replaceInvalidChars(article.title)
-    local local_path = self.directory .. article_id_preffix .. article.id .. article_id_postfix .. title:sub(1,30) .. ".epub"
+    title = title:sub(1,30)
+    title = util.fixUtf8(title, "")
+    local local_path = self.directory .. article_id_preffix .. article.id .. article_id_postfix .. title .. ".epub"
     logger.dbg("Wallabag: DOWNLOAD: id: ", article.id)
     logger.dbg("Wallabag: DOWNLOAD: title: ", article.title)
     logger.dbg("Wallabag: DOWNLOAD: filename: ", local_path)

@poire-z
Copy link
Contributor

poire-z commented May 14, 2019

May be using util.splitToChars() (or util.splitToWords() if that can make some sense) to get a table of utf8 chars, and loop thru that table to append these utf8 chars until we reach the max nb of bytes allowed for that string?

@Frenzie
Copy link
Member

Frenzie commented May 14, 2019

Btw, this problem should apply equally to all places where util.replaceInvalidChars() is used. It might make the most sense to do it in that function, with a default (adjustable) cut-off point of ~240-250.

I'm a bit conflicted about it. On the one hand it seems to be outside the scope of the function, while on the other hand any characters after 255 are invalid by definition on the lowest common denominator of FAT32.

@poire-z
Copy link
Contributor

poire-z commented May 14, 2019

May be another function with a more adequate filename like util.makeSafeFilename(maxlength=240) (where you could remove other unnacepatble chars like , and possibly keep util.replaceInvalidChars() for where we want to do that on larger text?

From some old notes of mine:

# Not accepted under Windows : " * / : < > ? \ |
# Not accepted under Linux :  /
# Not accepted under MacOSX :  :

@Frenzie
Copy link
Member

Frenzie commented May 14, 2019

for where we want to do that on larger text?

That scenario doesn't make sense to me. I think the current function should either be renamed as suggested and expanded, or made local to util and then used in the safeFilename function.

@poire-z
Copy link
Contributor

poire-z commented May 14, 2019

^ fine with me, it indeed seems every place that use util.replaceInvalidChars() is for making (part of) a filename. So, possibly a few places to adapt and unclutter to make use of a supersafe util.safeFilename() :)

Frenzie added a commit to Frenzie/koreader that referenced this issue May 14, 2019
Fixes koreader#5025

The OPDS browser was doing some fancier stuff in a way that should be abstracted away in util (because it applies anywhere files will be saved):

https://github.com/koreader/koreader/blob/eace8d25c1cbf9bd13e98220098494e8fb63c18f/frontend/ui/widget/opdsbrowser.lua#L482-L491
@Frenzie Frenzie added this to the 2019.06 milestone May 14, 2019
Frenzie added a commit that referenced this issue May 14, 2019
Fixes #5025

The OPDS browser was doing some fancier stuff in a way that should be abstracted away in util (because it applies anywhere files will be saved):

https://github.com/koreader/koreader/blob/eace8d25c1cbf9bd13e98220098494e8fb63c18f/frontend/ui/widget/opdsbrowser.lua#L482-L491
@mwoz123
Copy link
Contributor Author

mwoz123 commented May 21, 2019

Can we reopen this issue?
Today I checked with koreader version
*_2019-05-18

And first article is now downloading fine but the second one :
https://niebezpiecznik.pl/post/czy-policja-moze-namierzyc-sprawcow-falszywych-alarmow-bombowych/?more
Still doesn't.

I've even deleted it and re-added (with this application https://play.google.com/store/apps/details?id=fr.gaulupeau.apps.InThePoche) to my framabag server.
But still can't download 1 article.

@Frenzie Frenzie reopened this May 22, 2019
@Frenzie
Copy link
Member

Frenzie commented May 22, 2019

Is it possible to provide some logs? What does your /proc/mounts say?

@mwoz123
Copy link
Contributor Author

mwoz123 commented May 22, 2019

You're probably looking for this:
/dev/block/vold/31:9 /mnt/sdcard vfat rw,dirsync,nosuid,nodev,noexec,relatime,uid=1000,gid=1015,fmask=0002,dmask=0002,allow_utime=0020,codepage=cp437,iocharset=iso8859-1,shortname=mixed,utf8,errors=remount-ro 0 0

full log:
terminal_output.txt

@Frenzie
Copy link
Member

Frenzie commented May 22, 2019

Okay, so then it should be detected as FAT. What do your logs say?

@mwoz123
Copy link
Contributor Author

mwoz123 commented May 22, 2019

Which logs? Can you paste command?

@Frenzie
Copy link
Member

Frenzie commented May 22, 2019

Your regular adb logcat, or more targeted (if desired) something like adb logcat KOReader:V. But I've spotted the problem; I simply forgot to pass the path.

The logic of the util.getSafeFilename() function should be reversed. Assume the worst unless we know it's safe. (Or always assume the worst, but for now I'd prefer not to. ;-) )

Frenzie added a commit to Frenzie/koreader that referenced this issue May 22, 2019
Frenzie added a commit that referenced this issue May 22, 2019
mwoz123 pushed a commit to mwoz123/koreader that referenced this issue Mar 29, 2020
…der#5026)

Fixes koreader#5025

The OPDS browser was doing some fancier stuff in a way that should be abstracted away in util (because it applies anywhere files will be saved):

https://github.com/koreader/koreader/blob/eace8d25c1cbf9bd13e98220098494e8fb63c18f/frontend/ui/widget/opdsbrowser.lua#L482-L491
mwoz123 pushed a commit to mwoz123/koreader that referenced this issue Mar 29, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants