More details on errors #78

rgaudin · 2021-01-14T11:20:11Z

Just realized that our fondamentaux ZIM was drastically smaller than it should be (and previously was): 2.35 GB and 2.54 GB instead of 9.36 GB.
Those numbers come from three different runs a few days apart: Nov 4th (the large one), Nov 6th and Nov 10th.

With an exit-code of 0, we had no idea those newer ZIMs were problematic.

Understanding we can't fail on every single error when scraping a generic website, we could still be a little smarter by recording and exposing the number of failed fetches so that our QA process can evaluate whether the output is OK or not.
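To make the idea concrete, here is a minimal sketch of what "recording and exposing" could look like. Everything in it is an assumption for illustration: `FetchStats`, `output_looks_ok` and the 5% threshold are hypothetical names and values, not something the scraper or the crawler provides.

```python
from dataclasses import dataclass


@dataclass
class FetchStats:
    """Hypothetical counter for fetch outcomes (not an existing API)."""

    succeeded: int = 0
    failed: int = 0

    def record(self, ok: bool) -> None:
        # Count one fetch as either succeeded or failed.
        if ok:
            self.succeeded += 1
        else:
            self.failed += 1

    @property
    def total(self) -> int:
        return self.succeeded + self.failed

    @property
    def failure_ratio(self) -> float:
        # Share of failed fetches; 0.0 when nothing was fetched at all.
        return self.failed / self.total if self.total else 0.0


def output_looks_ok(stats: FetchStats, max_failure_ratio: float = 0.05) -> bool:
    """QA gate: require at least one fetch and a failure share within the threshold.

    The 5% default is an arbitrary placeholder, not a recommended value.
    """
    return stats.total > 0 and stats.failure_ratio <= max_failure_ratio
```

The run itself could still exit with 0; the QA process would read the exposed counts and decide on its own whether the ZIM is acceptable.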

@ikreymer, how realistic is adding a count of succeeded/failed fetches? I think the error count in stdout only covers the web pages, right? Those runs didn't report any: 1513 / 1513 (100.00%), errors: 0 (0.00%).
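Until such a count exists, a QA script could at least read the page-level figures from the status line quoted above. This is only a sketch: the regular expression is inferred from that single quoted line, and `CrawlProgress` and `parse_status_line` are hypothetical names. As noted, these figures seem to cover pages only, so they would not have caught the missing resources.

```python
import re
from typing import NamedTuple, Optional


class CrawlProgress(NamedTuple):
    """Page-level figures extracted from one status line (hypothetical type)."""

    done: int
    total: int
    errors: int


# Pattern inferred from the one status line quoted in this issue;
# the real output format may differ.
_STATUS_RE = re.compile(
    r"(\d+)\s*/\s*(\d+)\s*\([\d.]+%\),\s*errors:\s*(\d+)\s*\([\d.]+%\)"
)


def parse_status_line(line: str) -> Optional[CrawlProgress]:
    """Return the counts found in a status line, or None if it does not match."""
    match = _STATUS_RE.search(line)
    if match is None:
        return None
    done, total, errors = (int(group) for group in match.groups())
    return CrawlProgress(done, total, errors)
```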


stale · 2021-03-19T23:55:24Z

This issue has been automatically marked as stale because it has not had recent activity. It will now be reviewed manually. Thank you for your contributions.

rgaudin · 2023-02-01T11:50:57Z

This now exists.

rgaudin added enhancement upstream labels Jan 14, 2021

rgaudin mentioned this issue Jan 14, 2021

Videos missing webrecorder/browsertrix-crawler#4

Closed

stale bot added the stale label Mar 19, 2021

rgaudin self-assigned this Feb 1, 2023

rgaudin closed this as completed Feb 1, 2023

stale bot removed the stale label Feb 1, 2023
