Overview of how MWS can be used to scrape data from a website: 1) Highlighted website data corresponding to the scraped data table, 2) red circle which represents the user’s detected gaze, 3) webcam feed showing how user’s gaze is being detected, 4) MWS feedback dialog (also uttered via speech synthesis) used to respond to user commands and guide user actions, 5) user feedback dialog used to display what voice command was detected by the system
MWS is a Chrome browser extension that enables web scraping via gaze detection and voice commands in a few simple steps. Once initiated on a website, MWS uses the webcam feed to detect the point on the page where the user is looking, and continuously determines and highlights the data table containing that point. When the desired data table is highlighted, the user can issue voice commands to stop the gaze detection and save the highlighted data table, choose which of its columns to keep or remove, and download the data table as a JSON file.
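The column filtering and JSON export steps can be illustrated with a minimal sketch. The table shape (`columns` plus `rows`), the sample values, and the function names `filterColumns`, `keepColumns`, `removeColumns` and `tableToJson` are all assumptions made for illustration; MWS's actual data structures may differ.

```javascript
// Hypothetical in-memory shape for a scraped table (not taken from MWS's code).
const table = {
  columns: ["Title", "Cited by", "Year"],
  rows: [
    ["Paper A", "12", "2020"],
    ["Paper B", "3", "2021"],
  ],
};

// Return a new table containing only the columns whose name passes `predicate`.
function filterColumns(source, predicate) {
  const kept = source.columns
    .map((name, index) => ({ name, index }))
    .filter(({ name }) => predicate(name));
  return {
    columns: kept.map(({ name }) => name),
    rows: source.rows.map((row) => kept.map(({ index }) => row[index])),
  };
}

// "Keep" and "remove" are the same operation with the predicate inverted.
const keepColumns = (source, names) =>
  filterColumns(source, (name) => names.includes(name));
const removeColumns = (source, names) =>
  filterColumns(source, (name) => !names.includes(name));

// Serialize the table as a JSON array with one object per row.
function tableToJson(source) {
  const records = source.rows.map((row) =>
    Object.fromEntries(source.columns.map((name, i) => [name, row[i]]))
  );
  return JSON.stringify(records, null, 2);
}
```

In a browser, the resulting string could then be offered as a file download, for example by wrapping it in a `Blob` and creating an object URL for it. Whether MWS does it this way is not stated here.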
Team Members: Kapaya Katongo
Paper: https://github.com/Kapaya/multimodal-web-scraping/blob/main/paper.pdf
Demo: https://youtu.be/Ee34bGjcYgU
Code: https://github.com/Kapaya/multimodal-web-scraping
I've mostly tested MWS on Google Scholar because, unlike most other websites, its CSS definitions do not interfere with MWS's interface. Here are the steps required to run it:
- Clone the GitHub repository or download the provided zip
- Open Chrome and load its browser extensions page by pasting "chrome://extensions" into the address bar
- Once on the browser extensions page, make sure "Developer mode", in the top right, is toggled on
- Click "Load unpacked" and select the root of the cloned repository folder or unzipped folder
- Navigate to https://scholar.google.com/citations?user=G-HM2ikAAAAJ&hl=en, right-click to open the browser context menu and click on MWS
- You will be prompted to give access to your camera and microphone. Please allow access to both
- The MWS interface will appear in the bottom right of the page. WebGazer.js can take about a minute to initialize the first time it runs in a browser. Please wait until the webcam feed appears in the interface. If a minute passes without the webcam feed appearing, please refresh the page and retry.
- Run through the tasks shown in the demo video and described in the User Study section of the paper. The available voice commands can be viewed by saying "help" or referencing the table in the paper.
MWS's code is laid out as follows:
- modules/boostrap: contains Bootstrap code used for styling
- modules/underscore: contains the Underscore library, which is used to debounce the voice recognition event listener
- modules/webgazer: contains WebGazer.js source files and dependencies
- modules/calibration.js: contains code for the in-website calibration mechanism
- modules/constants.js: central definition of variables used across modules
- modules/controls.js: contains code that renders MWS's interface and implements its functionality
- modules/dom-helpers.js: contains DOM-related helper methods used across modules
- modules/gaze-recognition.js: contains code that implements the gaze detection using WebGazer.js
- modules/gaze-utils.js: contains gaze-related helper methods used across modules
- modules/scraper.js: contains code that implements the web scraping after the web scraping program is generated
- modules/visual-feedback.js: contains code that implements visual feedback such as highlighting of row and column DOM elements
- modules/voice-recognition.js: contains code that implements voice recognition using the Web Speech API
- modules/voice-synthesis.js: contains code that implements speech synthesis using the Web Speech API
- modules/wrapper-induction.js: contains code that implements the wrapper induction algorithm to generate a web scraping program
- background.js: contains code that initiates MWS from the browser context menu
- index.css: contains CSS that styles various parts of MWS's interface
- index.js: entry point of MWS
- manifest.json: contains configuration for defining MWS's behavior as a Chrome extension
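To give a feel for what a voice recognition module has to do once a transcript arrives, here is a minimal sketch of mapping recognized text to a command. It is not taken from MWS's code: apart from "help", the command words, the `COMMANDS` table and the `parseCommand` function are hypothetical, and the actual vocabulary is the one listed in the paper.

```javascript
// Hypothetical command vocabulary. Only "help" is named in this document;
// the other entries are placeholders for illustration.
const COMMANDS = [
  { name: "help", pattern: /^help$/ },
  { name: "stop", pattern: /^stop$/ },
  { name: "keep", pattern: /^keep column (\d+)$/ },
  { name: "remove", pattern: /^remove column (\d+)$/ },
  { name: "download", pattern: /^download$/ },
];

// Normalize a recognized transcript and match it against the vocabulary.
// Returns { name, args } for a known command, or null otherwise.
function parseCommand(transcript) {
  const text = transcript
    .trim()
    .toLowerCase()
    .replace(/[.!?]+$/, ""); // drop trailing punctuation added by the recognizer
  for (const { name, pattern } of COMMANDS) {
    const match = text.match(pattern);
    if (match) {
      return { name, args: match.slice(1) };
    }
  }
  return null; // unrecognized: the caller can ask the user to repeat
}
```

In the browser, the transcript would typically come from a Web Speech API `SpeechRecognition` result event, and a `null` result is a natural point to respond through the feedback dialog and speech synthesis.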