Add WikiReader plugin #9534
base: master
Kudos for having built a standalone plugin and having it hack into KOReader's core to get it to work :) But the ReaderUI/ReaderLink hack is probably not acceptable if this plugin becomes part of KOReader. If it were part of KOReader, you could just add support into ReaderLink for links like: So, the question is: do we want this to be the main/official way to support local Wikipedia (and co) dumps, given that it requires the more common ZIM files (that I know nothing about) to be post-processed by this tool before they are usable? Does your 2.4 GB archive contain images? Or would it be way bigger if it had images? |
Supporting ZIM files is easy using libzim, but it requires ICU, which is a big library to bundle. There is also the 4 GB file size limit on FAT32. Most ZIM archives with images are way bigger than that. |
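To make the FAT32 point concrete, here is a minimal Python sketch (illustrative only, not part of the plugin) that checks a file size against the FAT32 per-file limit of 4 GiB minus one byte. The function name and the example sizes are assumptions chosen for illustration.

```python
# FAT32 stores the file size in a 32-bit field, so the largest
# possible file is 2**32 - 1 bytes (4 GiB minus one byte).
FAT32_MAX_FILE_BYTES = 2**32 - 1


def fits_on_fat32(size_bytes: int) -> bool:
    """Return True if a single file of this size can be stored on FAT32."""
    return 0 <= size_bytes <= FAT32_MAX_FILE_BYTES


# Hypothetical sizes: a 2.4 GB database and a 100 GB ZIM archive.
small_db = 2_400_000_000
large_zim = 100 * 10**9

print(fits_on_fat32(small_db))   # the 2.4 GB database is under the limit
print(fits_on_fat32(large_zim))  # a 100 GB archive is not
```

Anything above that limit would have to be split into several files or stored on a different filesystem.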
Thanks. Yes, I chose to do it this way, with the ReaderLink patch, so that all changes are restricted to the plugin files, but it is too hacky. Bundling libzim would be a nicer solution I guess, but I do not know how to do that with all the C dependencies; that's why I chose to build a small tool that converts ZIM files into formats KOReader already bundles support for. The DB I linked is without images because I did not add support for those yet, as I wanted some feedback on how this plugin should be done "properly". I was not aware of the unofficial plugins repo; I'd like it if those plugins were findable in the native KOReader UI to download and install. Otherwise it would be nice IMO if this plugin could evolve into something that could be included in the KOReader package. |
Adding a reference to #2333. |
I refactored the plugin to use the event as @poire-z recommended, which turned out to be quite easy. I think a native |
That's indeed better. Ideally, we would have
I think nearly all (Kobo, Kindle, PocketBook) except Android must use FAT32.
I'd like to have more feedback from other users: would you use/try this? I'm quite an avid reader of Wikipedia articles on KOReader, but having done and maintained our "on-line" Wikipedia lookup and "Save as EPUB" features, I think I would be very frustrated with this plugin:
So, ok, it may be better than nothing for when you are on the road with no wifi. So, it feels a bit like you may end up being the single user of this plugin :) And I'm not fond of shipping "good enough in the meantime" stuff, I like us to ship perfect and stable stuff :) |
end

-- Encode the db path in the URL, URL escaping doesn't work for some reason, so use base64
local prefix = "kolocalwiki://" .. escape.base64_encode(self.db_path) .. "#" -- Title will be after the hashtag
`base642bin` from the `sha2` module (it's in base) is liable to be much faster than this, FWIW. (I haven't actually checked turbo, but sha2 is awesome).
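The link scheme in the snippet above (base64-encoded database path, then `#`, then the article title) can be sketched as a round trip. This is a minimal Python illustration, not the plugin's Lua code: the function names are hypothetical, and the URL-safe base64 alphabet is an assumption made here so the encoded path never contains `/` or `+`.

```python
import base64

SCHEME = "kolocalwiki://"  # scheme string taken from the snippet above


def build_link(db_path: str, title: str) -> str:
    """Encode the database path as base64 and append the title after '#'."""
    # Assumption: URL-safe base64; the plugin itself may use the standard alphabet.
    encoded = base64.urlsafe_b64encode(db_path.encode("utf-8")).decode("ascii")
    return f"{SCHEME}{encoded}#{title}"


def parse_link(link: str) -> tuple[str, str]:
    """Split a link back into (db_path, title)."""
    if not link.startswith(SCHEME):
        raise ValueError("not a kolocalwiki link")
    # Base64 output never contains '#', so the first '#' always separates
    # the encoded path from the title, even if the title contains '#'.
    encoded, _, title = link[len(SCHEME):].partition("#")
    db_path = base64.urlsafe_b64decode(encoded.encode("ascii")).decode("utf-8")
    return db_path, title


link = build_link("/mnt/onboard/wiki/en.db", "Douglas Adams")
print(parse_link(link))
```

Because the path is opaque base64, characters that URL escaping handled poorly (spaces, slashes, non-ASCII) survive the round trip unchanged.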
Okay, that is fair: if I were the only one using it, then of course it shouldn't be bundled. Perhaps we can wait and see whether people are interested, and if so we could look into how to do this properly. Oh, and by the way, 50000 articles covered a lot more than I thought; I encourage people who are interested to download the preconverted DB and play around. |
User feedback here: I'd definitely use it. I actually wanted something like this a few months ago, before I knew about KOReader, to have an offline copy of Wikivoyage on my e-reader during a trip. I had one based on Kiwix on my phone, but the battery life wasn't good enough, nor was the screen big enough for it to be comfortable. This feature would have been optimal. |
Same here. I also linked to this from a Mobileread forum at https://www.mobileread.com/forums/showthread.php?p=4267358#post4267358 . I bought a Kobo partly for the ZIM feature, only to find out that it mostly no longer works. This would be fantastic! |
IMHO this is the way to go.
I'm working on a package manager to make this happen. Isn't ready yet and won't be ready for a few more months.
IMHO we have too many plugins already in this repo. Adding new ones should be considered carefully, as they will need a maintainer, some documentation and will generate new tickets that would need to be addressed.
I'm not opposed to merging this particular plugin. But having some more users doesn't make the thing more appealing to me. Doing a quick
|
I wonder how hard it is to write a minimal ZIM parser in Lua to avoid the conversion step. I gather the C++ lib is as big as it is because of the write support, which this does not need. The format is somewhat documented and doesn't look excessively complex. |
What is missing for a merge? |
omg this exists?! i want this so much! for me the zim converter step is a minor issue, i need to jump through some hoops to get the files on the kobo anyways, converting them doesn't seem like a big deal, at least short term. of course, it would be best if we could just read the ZIM files directly, but from what i gathered, that's typically done with kiwix tools and a web browser, not sure that's something we want to embark on here... |
A "web browser" we already have; the problem when I last checked is that the relevant libs were rather substantial in size. |
"libs"?
if we already have a web browser in koreader, is there *already* a way
to just fire that up against a running kiwix server? because that would
serve me perfectly fine in the short term, i can hack that out myself!
where *is* that web browser?
|
libzim (a few hundred kB) has ICU as a dependency (much more). I could be mistaken or misremembering but I don't think the necessary dependencies would go for any less than a couple of MB.
I was only referring to the rendering part, but you can trivially pull in anything over HTTP and stuff it in an HtmlBoxWidget. Pictures might be a bit harder to get in. Though if you're thinking of taking that approach I'd be remiss if I didn't point out you can simply run Lynx/Elinks/Links2 in the terminal emulator plugin. |
On 2024-04-26 13:10:14, Frans de Jonge wrote:
> > "libs"?
> libzim (a few hundred kB) has ICU as a dependency (much more). I could be mistaken or misremembering but I don't think the necessary dependencies would go for any less than a couple of MB.
right, i was thinking of outsourcing the libzim stuff to the
operator... i've read guides like:
https://phire.cc/Offline-Wikipedia-on-the-Kobo.html
... which basically do this: download the zim file, fire up the kiwix
server somehow and get nickel to load the files locally. they actually
patch the nickel.so to tweak some things which i find utterly bonkers,
but kudos to them on that...
so i figured hey, if *we* have a working web browser, why can't we just
load http://localhost:kiwix/ and be done with it?
> > if we already have a web browser in koreader, is there *already* a way to just fire that up against a running kiwix server? because that would serve me perfectly fine in the short term, i can hack that out myself!
> I was only referring to the rendering part, but you can trivially pull in anything over HTTP and stuff it in an HtmlBoxWidget. Pictures might be a bit harder.
oh right, by "web browser" you mean "we have a web browser rendering
widget for EPUB files anyway", right? :) like we don't have a full thing
in there...
> Though if you're thinking of taking that approach I'd be remiss if I didn't point out you can simply run Lynx/Elinks/Links2 in the terminal emulator plugin.
omg, we ship lynx?
|
Right, I meant Zim files can be read and whatever HTML is in them can be rendered.
lol no, just the terminal plugin but you can run e.g. Alpine or Debian in chroot or just run it straight up. It works quite well. |
So, just for the record, the reason I'm looking into this again is that I just bought a 512GiB microSD card. For 80$CAD. That's 60$USD. It's crazy. I can not only fit the "all maxi" wikipedia en zim file in there (which is all of english wikipedia, with images), it can fit MULTIPLE TIMES. It's bonkers. So yeah, we've passed the threshold where you can store all of wikipedia in a tiny little thing the size of my fingernail. I think arguments about "oh, but this is too big" are kind of moot at this point. The question is: how well would the converter work on a 100GB ZIM file, and would the resulting sqlite database be even useable? (It also puts into perspective concerns about ICU's size, IMHO. I don't know about how it would impact koreader, but here on Debian, the package is 37... megabytes...) |
That depends. You can spend 80$CAD on an SD card. Not everyone can. As long as it's optional, do whatever, but the moment you start needing to change hardware you start leaving a large number of people out. In fact, many of the people you would leave out are exactly the reason why Wikipedia packages ZIM files in the first place: people living in poor countries under dictatorships, like North Korea.
E-readers are a piece of tech that doesn't get obsoleted just because the web got fatter, I'd very much like for them to stay that way. I have currently no reason to switch my old Boox ML67. I already lost the ability to update KOReader there and it's OK, but notice some devices are more limited than whatever you have. Also, some systems (no idea if that's also true for the currently supported ones) can't install applications outside of main storage, which can be quite small. That's a limitation a bigger SD card won't fix. |
Hi!
The Wikipedia plugin of KOReader is wonderful, but it does not support offline files. To fix this and a few other things, I built a new plugin called WikiReader. It allows reading a database from local disk that contains HTML files and showing them in the default Reader. It is built for reading Wikipedia articles offline, but in principle other sources are compatible. Some of the features supported are:
This database is an SQLite DB and uses `zstd` to compress articles. To get this database, ZIM files can be converted. For this purpose, I created another small repo here: https://github.com/Bartvelp/zim-converter. There you can also find an example database.

I think that the code of this plugin does not follow the absolute best practices, but it shouldn't interfere with anything if simply not used. I needed to use some hacks to allow hyperlinks and used a global variable to maintain a single plugin instance.
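The storage idea described above (an SQLite database whose article bodies are stored compressed) can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the actual format produced by zim-converter: the table and column names are hypothetical, and `zlib` is swapped in for `zstd` because zstd is not in the Python standard library.

```python
import sqlite3
import zlib  # stand-in for zstd; the real database uses zstd compression


def create_db(conn: sqlite3.Connection) -> None:
    """Create a hypothetical one-table schema: title -> compressed HTML."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS articles ("
        "title TEXT PRIMARY KEY, html BLOB NOT NULL)"
    )


def put_article(conn: sqlite3.Connection, title: str, html: str) -> None:
    """Compress the HTML and store it under the given title."""
    blob = zlib.compress(html.encode("utf-8"))
    conn.execute(
        "INSERT OR REPLACE INTO articles (title, html) VALUES (?, ?)",
        (title, blob),
    )


def get_article(conn: sqlite3.Connection, title: str):
    """Return the decompressed HTML for a title, or None if it is missing."""
    row = conn.execute(
        "SELECT html FROM articles WHERE title = ?", (title,)
    ).fetchone()
    if row is None:
        return None
    return zlib.decompress(row[0]).decode("utf-8")


conn = sqlite3.connect(":memory:")
create_db(conn)
sample_html = "<html><body><h1>Example</h1><p>Hello, offline wiki.</p></body></html>"
put_article(conn, "Example", sample_html)
print(get_article(conn, "Example"))
```

Compressing each article separately keeps lookups cheap: a reader only has to decompress the one row it fetches, not the whole archive.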
Some notable things that are not yet included are:
Perhaps this plugin is not ready as-is, but I hope you agree that it adds nice features and shows promise.
I tested it on my laptop (Linux AppImage), Android, and a PocketBook Touch HD 3.