Chrome extension

Sigit Dewanto edited this page Dec 31, 2019 · 6 revisions

Installation

You can install the extension from the Chrome Web Store or build it yourself (see the instructions below).

Build the extension files

  1. Install Node.js and npm.
  2. Clone the Webdext repository: git clone git@github.com:seagatesoft/webdext.git
  3. Enter the Webdext directory and run: npm install
  4. Run gulp build-chrome; the extension files will be built into the build directory.

Install the extension

  1. Open chrome://extensions/ in Chrome or Chromium.
  2. Check the "Developer mode" checkbox.
  3. Click the "Load unpacked extension" button.
  4. Browse to the directory where the extension files are saved (the build directory, if you built them yourself) and click "Open".

Usage

Intelligent extraction

  1. Open a web page containing a list of data records.
  2. Click the Webdext icon on the toolbar.
  3. Click the "Intelligent Extract" button and wait a few seconds.
  4. The extracted data records are displayed in a new tab.
  5. Intelligent extraction can find more than one data region. Use the pagination controls to reach the data region you are interested in.
  6. You can label the columns and remove unnecessary ones.
  7. You can export the data to CSV or JSON by clicking the "Export Data" button.
  8. You can create an XPath extractor (wrapper) and store it for later use. Click the "Save Extractor" button and type a name for your extractor; Webdext learns the XPath extractor from the currently displayed data region. You must keep the original web page tab open while creating the extractor. Extraction with an XPath extractor is faster than Intelligent Extraction.
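
To make the idea of an XPath extractor (wrapper) more concrete, here is a minimal sketch in Python. It is not Webdext's code or its storage format. The page fragment, the EXTRACTOR dictionary, and the extract function are all hypothetical, and the sketch assumes an extractor can be thought of as one XPath that selects each record plus one relative XPath per labeled column. It uses only the limited XPath subset of the standard library's xml.etree.ElementTree.

```python
import xml.etree.ElementTree as ET

# Hypothetical page fragment; kept well-formed so the standard library can parse it.
PAGE = """
<html><body>
  <ul id="products">
    <li class="item"><span class="name">Laptop</span><span class="price">$900</span></li>
    <li class="item"><span class="name">Phone</span><span class="price">$500</span></li>
  </ul>
</body></html>
"""

# Hypothetical extractor: one XPath per record, one relative XPath per labeled column.
EXTRACTOR = {
    "record": ".//li[@class='item']",
    "columns": {
        "name": "./span[@class='name']",
        "price": "./span[@class='price']",
    },
}


def extract(page, extractor):
    """Apply the stored XPaths to the page and return one dict per data record."""
    root = ET.fromstring(page)
    rows = []
    for record in root.findall(extractor["record"]):
        row = {}
        for label, xpath in extractor["columns"].items():
            node = record.find(xpath)
            # A missing column becomes an empty string rather than an error.
            row[label] = "" if node is None else "".join(node.itertext()).strip()
        rows.append(row)
    return rows


records = extract(PAGE, EXTRACTOR)
print(records)
```

Because the XPaths are already known, applying them is a direct lookup on the page, which fits the note above that a saved extractor is faster than running Intelligent Extraction again.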

Extraction using existing extractor (wrapper)

  1. Open a web page that contains a list of data records and uses a template that an existing extractor can process.
  2. Click the Webdext icon on the toolbar.
  3. Click the "Use Existing Extractor" button.
  4. Click the "Extract" button below the extractor you want to use.
  5. The extracted data records are displayed in a new tab.
  6. You can export the data to CSV or JSON by clicking the "Export Data" button.
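
As a rough illustration of how the two export formats differ, the sketch below serializes a few labeled records to both CSV and JSON in Python. The exact layout of Webdext's exported files is not described on this page, so the sample records, the column labels, and the to_csv and to_json helpers are assumptions made for the example.

```python
import csv
import io
import json

# Hypothetical extracted records; the keys stand in for user-assigned column labels.
records = [
    {"name": "Laptop", "price": "$900"},
    {"name": "Phone", "price": "$500"},
]


def to_json(rows):
    """Serialize the records as a JSON array of objects."""
    return json.dumps(rows, indent=2)


def to_csv(rows):
    """Serialize the records as CSV, with the column labels as the header row.

    Assumes at least one row, and that every row has the same labels.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


json_text = to_json(records)
csv_text = to_csv(records)
print(csv_text)
print(json_text)
```

CSV tends to suit spreadsheets, while JSON keeps each record as a self-describing object, which is usually easier to consume from other programs.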

Extractors management

  1. Click the Webdext icon on the toolbar.
  2. Click the "List of Existing Extractors" button.
  3. The list of existing extractors is displayed in a new tab.
  4. You can view the internal details (XPaths) of an extractor by clicking the "Show" button.
  5. You can also delete an extractor or create a new one manually.