Danny Lin edited this page Oct 5, 2020 · 1 revision

Generate site index (WebScrapBook < 0.79)

A site indexer is built in and can be used to generate a list of the captured web pages. It can also be used to import data from ScrapBook (X).

Usage

  • Open the site indexer (toolbar button > Generate site index), drag and drop the ScrapBook folder into its tab (or select it with the "pick a folder" or "pick ZIP files" option), and wait for the indexing to complete.

    Example: If <default download folder> is C:\User\Test\Downloads and <scrapbook folder> is the default WebScrapBook/data, the folder structure should look like this after you have captured some pages:

    C:\User\Test\Downloads\WebScrapBook
    C:\User\Test\Downloads\WebScrapBook\data
    

    Drag C:\User\Test\Downloads\WebScrapBook into the indexer to generate an index for the scrapbook folder.

  • When the indexing process completes, a ZIP file will be generated and downloaded. Extract its contents into the original ScrapBook folder and open the generated tree/map.html or tree/frame.html to view the list of captured pages.

    Example: Following the above case, your scrapbook folder should look like this after you have merged the ZIP contents into it:

    C:\User\Test\Downloads\WebScrapBook
    C:\User\Test\Downloads\WebScrapBook\data
    C:\User\Test\Downloads\WebScrapBook\tree
    C:\User\Test\Downloads\WebScrapBook\tree\favicon
    C:\User\Test\Downloads\WebScrapBook\tree\feed.atom
    C:\User\Test\Downloads\WebScrapBook\tree\frame.html
    C:\User\Test\Downloads\WebScrapBook\tree\fulltext.js
    C:\User\Test\Downloads\WebScrapBook\tree\index.html
    C:\User\Test\Downloads\WebScrapBook\tree\map.html
    C:\User\Test\Downloads\WebScrapBook\tree\meta.js
    C:\User\Test\Downloads\WebScrapBook\tree\toc.js
    

    Depending on the options, favicon, feed.atom, fulltext.js, and index.html may not exist.

  • Files such as tree/meta.js and tree/toc.js, for metadata and table of contents (TOC) respectively, can be manually edited, as long as the data format is kept correct.

  • The site indexer can be run repeatedly for a ScrapBook folder. Pages that were created, updated, or deleted since the last indexing will be detected, and updated index files will be generated. A backup is created for each file that needs to be updated (e.g. tree.bak/meta.js for tree/meta.js), and you can use it to restore the file if something goes wrong. A zero-sized file is created for each file that should be deleted; you can delete it or just leave it there (tip: locate such files by sorting by size in the file manager).

  • After new pages are captured or metadata is edited, run the indexer again to add the newly captured pages or check for potential errors, then refresh tree/map.html or tree/frame.html to view the updated result.

  • If a backend server is hosted, click Index for server to index all scrapbooks on the server. The related files are then updated automatically through the backend server, so manually unzipping and merging the ZIP file(s) is not needed.
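
The manual steps in the list above (merging the downloaded ZIP, locating zero-sized placeholder files, and restoring a file from its backup) can also be scripted. The following is a minimal Python sketch, not part of WebScrapBook. It assumes that the ZIP mirrors the ScrapBook folder layout (tree/ at its top level) and that backups sit in a tree.bak folder next to tree. The function names are hypothetical. Note that a size check cannot tell an indexer placeholder from a captured file that is legitimately empty, so review the list before deleting anything.

```python
import shutil
import zipfile
from pathlib import Path


def merge_index_zip(zip_path, scrapbook_dir):
    """Extract the indexer's ZIP into the ScrapBook folder.

    Existing files with the same relative path are overwritten.
    Returns the names of the entries stored in the ZIP.
    """
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(Path(scrapbook_dir))
        return zf.namelist()


def find_zero_sized_files(scrapbook_dir):
    """List every file of size 0 under the ScrapBook folder.

    These are candidates for deletion (assumption: they are the
    placeholders the indexer writes for files that should be removed).
    """
    return sorted(
        p for p in Path(scrapbook_dir).rglob("*")
        if p.is_file() and p.stat().st_size == 0
    )


def restore_from_backup(scrapbook_dir, name):
    """Copy tree.bak/<name> back over tree/<name>.

    Assumes tree.bak lives directly inside the ScrapBook folder.
    """
    root = Path(scrapbook_dir)
    backup = root / "tree.bak" / name
    target = root / "tree" / name
    shutil.copy2(backup, target)
    return target
```

For the example above, scrapbook_dir would be C:\User\Test\Downloads\WebScrapBook, and restore_from_backup(scrapbook_dir, "meta.js") would bring back the previous tree/meta.js.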

Tips

  • You can add the generated tree/map.html or tree/frame.html to browser bookmarks for later use.

  • You can create multiple ScrapBook folders as needed. To transfer data between them, move the item (a directory or archive file) from one ScrapBook folder to another, and run the site indexer for both ScrapBook folders to update the index. Additionally, you can drop multiple ScrapBook folders into the site indexer to index all of them at once.
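
Moving an item between two ScrapBook folders, as described in the tip above, can be sketched in Python as follows. This is an illustration only: move_item is a hypothetical helper, and the data subfolder is assumed from the default WebScrapBook/data layout shown earlier. The sketch refuses to overwrite an existing item in the destination.

```python
import shutil
from pathlib import Path


def move_item(item_name, src_scrapbook, dst_scrapbook, data_subdir="data"):
    """Move one captured item (a directory or an archive file) from one
    ScrapBook folder to another and return its new path.

    data_subdir is an assumption based on the default folder layout.
    """
    src = Path(src_scrapbook) / data_subdir / item_name
    dst_dir = Path(dst_scrapbook) / data_subdir
    dst = dst_dir / item_name
    if not src.exists():
        raise FileNotFoundError(src)
    if dst.exists():
        # Never silently replace an item that is already there.
        raise FileExistsError(dst)
    dst_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.move(str(src), str(dst)))
```

After the move, run the site indexer for both ScrapBook folders so that each index reflects the change.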
