Skip to content

delete_callback should support to ignore silently the fact that fpath is missing #192

@benoit74

Description

@benoit74

We have a helper delete_callback at

This delete_callback is meant to be used as a callback when adding an item to the ZIM, typically to delete original file once it has been added to the ZIM.

There are some edge cases where the file to delete might already be gone, typically when the scraper encounters an Exception and decides to stop ZIM creation and delete all temporary files on exit. Due to the concurrency of these operations, the deletion of temporary files on exit might complete before the delete_callback is invoked.

I think that we should enhance the delete_callback to either:

  1. always silently ignore when the file to delete is already gone
  2. provide a parameter to activate the silent ignore of missing file

Solution 2 would help in the sense that some scrapers might have silent bugs if we silently ignore deletion issues, leading to huge disk space consumption for nothing. However the edge cases causing the issue mentioned where file is already gone are pretty rare and difficult to produce, so I'm more in favor of solution 1 since I would assume most scraper developers will never realize such situations might appear until hit by them (probably in production).

This issue has been originally unveiled by @dan-niles in Youtube scraper, kudos.

Metadata

Metadata

Assignees

Labels

enhancementNew feature or request

Type

No type

Projects

No projects

Milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions