Releases: scrapinghub/product-extraction-benchmark
1.0.0
We compare the quality of product extraction between Zyte Automatic Extraction (Zyte), Diffbot, and open-source tools (extruct) for the following attributes: price, availability (whether the product is in stock or out of stock), and SKU.
Attached are:
- snapshots in WARC format, which can be served by pywb:
dataset-warc.zip
- screenshots of pages (before snapshot creation):
dataset-jpeg.zip
See README for more details.
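As a rough illustration of preparing the attached snapshots for replay, the sketch below unpacks only the WARC files from a downloaded archive. The helper name `extract_warcs`, the output directory, and the assumption that the archive members end in `.warc` or `.warc.gz` are illustrative and not taken from the release; check the README for the actual layout.

```python
import zipfile
from pathlib import Path

# Assumed file suffixes for WARC snapshots inside the archive (not confirmed by the release).
WARC_SUFFIXES = (".warc", ".warc.gz")


def extract_warcs(zip_path, out_dir):
    """Extract only WARC members of zip_path into out_dir and return their paths, sorted."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(zip_path) as zf:
        for name in zf.namelist():
            # Skip directories and any non-WARC files bundled in the archive.
            if name.lower().endswith(WARC_SUFFIXES):
                extracted.append(Path(zf.extract(name, out)))
    return sorted(extracted)
```

A call such as `extract_warcs("dataset-warc.zip", "warcs")` would leave the WARC files under `warcs/`. From there, pywb's `wb-manager init <collection>` and `wb-manager add <collection> <file>` commands can add them to a collection, and `wayback` starts the replay server; the collection name is up to you.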