Danny Lin edited this page Apr 15, 2023 · 21 revisions

Differences between WebScrapBook and ScrapBook X

Here's a quick review of the key differences between WebScrapBook (0.87.0) and its predecessor, ScrapBook X (1.14.7).

Improvements

  • Cross platform support:
    • Besides Firefox Desktop, Chromium-based browsers and mobile browsers (Firefox for Android, Kiwi browser, Yandex browser, etc.) are also supported.
    • Added markings in a captured page can be viewed normally with most browsers, including mobile browsers and those without WebScrapBook installed.
    • Fulltext search for generated static site pages is available without prior browser configuration.
  • Remote access:
    • Remote access of scrapbooks from multiple devices is supported, as long as a backend server is properly configured.
  • More customizable capture features:
    • Support capture options such as link, remove, save current, and save used for images, media, styles, etc.
    • Support capture options for CSS images, favicon, canvas, embed, object elements, shadow roots, etc.
    • Support pre-processing or per-site customization of the web page being captured using the capture helper(s).
  • Improved batch capture:
    • Support batch capture for selected tabs or links.
    • Support customization of capture mode or options for all or individual tasks in the batch capture manager.
  • More flexible save format and data structure:
    • ID of the captured page is based on UTC, so captures from different timezones won't conflict with each other.
    • A web page can be saved under different filename patterns, such as datetime, title, source domain, or UUID, and can be saved in a subfolder.
    • A web page can be saved as a ZIP package (HTZ or MAFF) or single HTML file.
  • More flexible scrapbook:
    • An item of any type in a scrapbook can own child items.
    • An item in a scrapbook can be a child of multiple items.
    • An item deleted from the scrapbook is put into the recycle bin and can be recovered later, which guards against accidental data loss.
    • Support more operations for multi-selected items, such as open, view source page, browse in the file manager, search within, sort, and capture again.
    • Support cross-scrapbook item locating, fulltext search, and hinting or finding captured pages.
  • Improved fulltext cache:
    • Performance of fulltext caching is greatly improved with the help of a backend server.
    • Auto-update fulltext cache when a page is captured or edited.
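
To illustrate why a UTC-based ID (listed under "More flexible save format and data structure") avoids timezone conflicts, here is a minimal sketch. The 17-digit YYYYMMDDhhmmssSSS layout and the function name utc_id are assumptions made for illustration; this page does not specify the exact ID format.

```python
from datetime import datetime, timedelta, timezone


def utc_id(moment: datetime) -> str:
    """Format a timezone-aware datetime as a 17-digit UTC string.

    The YYYYMMDDhhmmssSSS layout is an assumption for this sketch.
    """
    if moment.tzinfo is None:
        raise ValueError("moment must be timezone-aware")
    utc = moment.astimezone(timezone.utc)
    # strftime has no millisecond directive, so derive it from microseconds.
    return utc.strftime("%Y%m%d%H%M%S") + f"{utc.microsecond // 1000:03d}"


UTC_PLUS_8 = timezone(timedelta(hours=8))
UTC_MINUS_4 = timezone(timedelta(hours=-4))

# Two captures at the same local wall-clock time in different timezones
# are different instants, so their UTC-based IDs differ.
capture_east = datetime(2023, 4, 15, 20, 30, 0, 123000, tzinfo=UTC_PLUS_8)
capture_west = datetime(2023, 4, 15, 20, 30, 0, 123000, tzinfo=UTC_MINUS_4)

id_east = utc_id(capture_east)
id_west = utc_id(capture_west)
```

An ID built from local wall-clock time would be identical for both captures above, which is the kind of conflict that a UTC-based ID rules out.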

Limitations

  • The WebExtension framework used by WebScrapBook is more restricted than the XUL/XPCOM used by ScrapBook X. Some features are not natively supported and require a corresponding backend server, which leads to suboptimal performance for operations such as organizing the scrapbook(s). (This may be insignificant given the performance improvements of most modern browsers, though.)

  • The following features are theoretically implementable, but many technical issues remain unresolved, and they will not be implemented in the near future:

    • Interactive in-depth capture
    • Combine wizard

Compatibility

The data structure of ScrapBook X differs from that of WebScrapBook, and the two cannot be used interchangeably. Automatic conversion is available through the conversion tool of PyWebScrapBook.

Install Python and PyWebScrapBook, and then run the following command in a CLI to convert the data into a WebScrapBook-compliant format:

wsb convert sb2wsb /path/to/scrapbook /path/to/webscrapbook

A scrapbook left incomplete by an earlier conversion from ScrapBook X to WebScrapBook can be migrated further using the command:

wsb convert migrate /path/to/webscrapbook [/path/to/output]

This tool supports in-place conversion when the output path is omitted. However, it's recommended to output to another path, or to manage the data with a version control tool, to guard against potential errors.
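
If an in-place migration is still preferred, one way to follow the advice above is to copy the scrapbook before running the command. The following is a minimal sketch, not part of PyWebScrapBook: the helper names backup_scrapbook, migrate_command, and migrate_in_place are hypothetical, and only the wsb convert migrate invocation itself comes from this page.

```python
import shutil
import subprocess
from datetime import datetime, timezone
from pathlib import Path


def backup_scrapbook(root, backup_parent):
    """Copy the scrapbook directory into backup_parent and return the copy.

    backup_parent must lie outside root, otherwise the copy would
    recurse into itself.
    """
    root = Path(root)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    target = Path(backup_parent) / f"{root.name}-backup-{stamp}"
    shutil.copytree(root, target)  # raises FileExistsError if target exists
    return target


def migrate_command(root, output=None):
    """Build the argument list for `wsb convert migrate`."""
    cmd = ["wsb", "convert", "migrate", str(root)]
    if output is not None:
        cmd.append(str(output))
    return cmd


def migrate_in_place(root, backup_parent):
    """Back up first, then migrate in place (output path omitted)."""
    backup = backup_scrapbook(root, backup_parent)
    # check=True raises CalledProcessError if the conversion fails,
    # in which case the backup copy is still intact.
    subprocess.run(migrate_command(root), check=True)
    return backup
```

Calling migrate_in_place("/path/to/webscrapbook", "/path/to/backups") would run the in-place command only after the copy succeeds.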

Backporting WebScrapBook to ScrapBook X is also available:

wsb convert wsb2sb /path/to/webscrapbook /path/to/scrapbook

Because the data structure of WebScrapBook is more flexible, conversion from ScrapBook X to WebScrapBook is lossless, but some information may be lost when converting from WebScrapBook to ScrapBook X, such as:

  • items appended to multiple parents (only the first occurrence is preserved)
  • items in the recycle bin
  • data file paths will change, and links between items may break
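
Before backporting, it can help to list which items fall into the first two categories above. The sketch below works on a plain mapping from a parent ID to its child IDs. That mapping, the "recycle" key for the recycle bin, and the function name lossy_items are assumptions for illustration, not the documented on-disk format or an API of PyWebScrapBook.

```python
from collections import defaultdict


def lossy_items(toc):
    """Find items that a wsb2sb conversion could drop or flatten.

    toc maps a parent ID to a list of child IDs (assumed layout).
    Returns (multi_parent, recycled):
      multi_parent: child ID -> list of its distinct parents, for
                    children listed under more than one parent
      recycled:     sorted IDs of everything under the "recycle" key
    """
    parents = defaultdict(list)
    for parent, children in toc.items():
        for child in children:
            if parent not in parents[child]:
                parents[child].append(parent)
    multi_parent = {c: p for c, p in parents.items() if len(p) > 1}

    # Walk the recycle bin subtree iteratively; `seen` guards against cycles.
    recycled = []
    seen = set()
    stack = list(toc.get("recycle", []))
    while stack:
        item = stack.pop()
        if item in seen:
            continue
        seen.add(item)
        recycled.append(item)
        stack.extend(toc.get(item, []))
    return multi_parent, sorted(recycled)


example_toc = {
    "root": ["a", "b"],
    "a": ["c"],
    "b": ["c"],      # "c" also appears under "a"
    "recycle": ["d"],
    "d": ["e"],
}
multi_parent, recycled = lossy_items(example_toc)
```

In this example, "c" is reported with both of its parents, and "d" and "e" are reported as recycled.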

Also note that this tool's compatibility is validated against ScrapBook X, and there may be minor compatibility issues if the output scrapbook is used by a legacy ScrapBook implementation that lacks features introduced by ScrapBook X, such as:

  • note: legacy ScrapBook handles special chars (e.g. <&>) inconsistently, while ScrapBook X accepts plain text only.
  • note pages
  • container items whose type property is not "folder"
  • files with special or non-ASCII chars in filename
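
For the filename item in the list above, a converted scrapbook can be scanned for non-ASCII filenames before it is handed to a legacy ScrapBook implementation. This is a minimal sketch with hypothetical helper names. It checks only for non-ASCII characters, because this page does not define which characters count as "special".

```python
from pathlib import Path


def non_ascii_names(names):
    """Return the names that contain at least one non-ASCII character."""
    return [name for name in names if not name.isascii()]


def scan_scrapbook(root):
    """List files under root (as relative paths) with a non-ASCII filename."""
    root = Path(root)
    return sorted(
        str(path.relative_to(root))
        for path in root.rglob("*")
        if path.is_file() and not path.name.isascii()
    )
```

Running scan_scrapbook("/path/to/scrapbook") on the wsb2sb output gives a list of files to rename or review.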