Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kiwix-serve: start even when one ZIM is damaged #658

Closed
schuellerf opened this issue Jan 20, 2024 · 13 comments · Fixed by #666
Closed

kiwix-serve: start even when one ZIM is damaged #658

schuellerf opened this issue Jan 20, 2024 · 13 comments · Fixed by #666

Comments

@schuellerf
Copy link
Contributor

Please add a command line switch, maybe "--skip-broken" to skip broken ZIM files and continue startup

I guess the command line option just has to change this to a warning and skip the exit(1)
https://github.com/kiwix/kiwix-tools/blob/main/src/server/kiwix-serve.cpp#L353

@kelson42
Copy link
Contributor

@schuellerf Whatbis the use case? Why such an option is necessary?
@rgaudin @mgautierfr @veloman-yunkan Domyou think this is a good idea to implement such a feature?

@kelson42
Copy link
Contributor

kelson42 commented Jan 20, 2024

Use case has been given in the linked PR: skip files which are downlaoding. Seems a weak reason to me, problem can be simpler an better solved other ways.

@schuellerf
Copy link
Contributor Author

@kelson42
I usually start it as a Docker with *.zim.
How could I solve this better in other ways?

@kelson42
Copy link
Contributor

@kelson42

I usually start it as a Docker with *.zim.

How could I solve this better in other ways?

Use a downloader which makes the file renaming at the end... or do it yourself.

@veloman-yunkan
Copy link
Collaborator

@rgaudin @mgautierfr @veloman-yunkan Domyou think this is a good idea to implement such a feature?

The only argument against accepting this feature proposal from a user who also kindly implemented it is that the short option names are a limited resource that should be expended sparingly. That's all I was able to come up with.

@mgautierfr
Copy link
Member

I agree with @veloman-yunkan, I don't see real reason to not accept this feature.
We could discuss the naming or the implementation (if applicable) but I have nothing against the feature.

@rgaudin
Copy link
Member

rgaudin commented Jan 29, 2024

I believe the presented use case does not justify it: kiwix-serve expects ZIM files and @schuellerf knowingly feeds it with files that are not ZIM (yet).
As @kelson42 mentioned, this is fixing in kiwix-serve something that has nothing to do with it. That's exactly why most download softwares writes to a differently named file (xxx.part) then renames the completed file upon success.

That said, I believe having the ability to start kiwix-serve ignoring broken ZIMs (whatever the reason) would be valuable. I don't know about our end users but found myself several times in need of moving around erroneous ZIMs or playing with the CLI to get it to start.

It's easy to implement as suggested but many many kiwix-serve instances are unattended and simply printing a warning on stderr seems insuffiscient in those cases. Now adding an endpoint for this means design, code and maintenance… 🤷‍♂️

@schuellerf
Copy link
Contributor Author

@rgaudin exactly my point - a "background service" should try it's best to just start. I can have a quick look if it is sane (in terms of design, code and potential maintenance) to keep track of broken files and show them in the UI? Maybe as disabled buttons with this short explanation as alt-text or similar 🤔

@mgautierfr
Copy link
Member

That's exactly why most download softwares writes to a differently named file (xxx.part) then renames the completed file upon success.

Which ones ?

  • aria2c download directly to foo.zim
  • wget also
  • curl write on stdout by default
  • firefox download to a tmp file foo.<random_number>.zim.part but create and empty foo.zim to "save" the file slot on fs.

They all have option to change the output filename, but you will have to rename the file to .zim by hand after.

@rgaudin
Copy link
Member

rgaudin commented Jan 29, 2024

They all have option to change the output filename, but you will have to rename the file to .zim by hand after.

If you're using CLI ; just use appropriate flags and/or rename on success. What does it have to do with kiwix-serve?

Interactive softwares expect valid files on opening. Daemons may behave differently as already discussed.

As I wrote, I think it's safe to assume that broken ZIMs make it to Kiwix reader and kiwix-serve especially. That's no reason to pass it files that are being downloaded… as it will anyway only appear on restart.

@mgautierfr
Copy link
Member

If you're using CLI ; just use appropriate flags and/or rename on success. What does it have to do with kiwix-serve?

It has nothing to do with kiwix-serve. Original PR is not about downloader. You (and @kelson42) say that user is using the wrong tool and that it should use the right one instead of changing kiwix-serve. I'm asking you which one.

Interactive softwares expect valid files on opening.

oocalc (open office calc), evince (pdf), totem (video) correctly open valid files if we pass them valid and invalid files on the command line.
Even kiwix-desktop opens valid zim file if you pass it both a valid and broke file on the command line.

but many many kiwix-serve instances are unattended and simply printing a warning on stderr seems insuffiscient in those cases.

The PR add an option to skip invalid zim and continue opening valid ones.
By default, kiwix-serve will stop, not just print a warning on stderr.
If user pass an option explicitly asking to skipBroken, we should assume it is enough to print a warning.

I believe having the ability to start kiwix-serve ignoring broken ZIMs (whatever the reason) would be valuable.
[...] Now adding an endpoint for this means design, code and maintenance… 🤷‍♂️
As I wrote, I think it's safe to assume that broken ZIMs make it to Kiwix reader and kiwix-serve especially. That's no reason to pass it files that are being downloaded… as it will anyway only appear on restart.

I'm not sure to understand if you are for or against this feature.

@rgaudin
Copy link
Member

rgaudin commented Jan 29, 2024

I'm not sure to understand if you are for or against this feature.

  • I am in-favor because broken ZIM are to be expected.
  • I think passing downloading ZIM to kiwix-serve is wrong. It should be possible (see point 1.) but if it were just for this use case (which is not possible), I'd be against 😉
  • I think implementing as suggested (printing on stderr) is better than current situation, but having access to this information from the web-server would be useful (because most of the time, special users only have access to it via its web interface)

@kelson42 kelson42 removed the question label Jan 30, 2024
@kelson42
Copy link
Contributor

@schuellerf We finally agreed to integrate the feature. I don't do technical code review but one thing is missing: the man page update

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment