fredericvergnaud/extractify

Important: to cite Extractify in a publication, please use the following: « Extractify. Frederic Vergnaud, Mines Paris, PSL University, Centre for the Sociology of Innovation, i3 CNRS, France, https://github.com/fredericvergnaud/extractify »

Presentation

Extractify is a free extension for Chromium, written in JavaScript using the Atom editor, whose purpose is to scrape structured data from the web. It is designed in particular for collecting online comments and online conversations, such as forum threads.

It allows you to:

  1. Select structured information on a web page (such as tables with rows and columns), either by selecting it directly on the page or by manually entering HTML tags and the related CSS code
  2. Select the pagination of pages with the same structure and level
  3. Repeat the process as many times as desired for lower levels
  4. Scrape the whole selection
  5. Finally, obtain a file in JSON format that can be easily imported into other software, for example L@ME.
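
Before importing the exported JSON file into other software, it can help to look at its structure. The layout of the export is not described here, so the sketch below makes no assumption about it: `describeShape` is a hypothetical helper (not part of Extractify) that summarizes the shape of any parsed JSON value, and the file name in the usage comment is a placeholder.

```javascript
// Hypothetical helper: describe the shape of any parsed JSON value
// without assuming anything about Extractify's actual export schema.
function describeShape(value, depth = 0, maxDepth = 3) {
  if (Array.isArray(value)) {
    if (value.length === 0 || depth >= maxDepth) return `array(${value.length})`;
    // Use the first element as a representative sample of the array.
    return `array(${value.length}) of ${describeShape(value[0], depth + 1, maxDepth)}`;
  }
  if (value !== null && typeof value === "object") {
    if (depth >= maxDepth) return "object";
    const fields = Object.keys(value).map(
      (key) => `${key}: ${describeShape(value[key], depth + 1, maxDepth)}`
    );
    return `{ ${fields.join(", ")} }`;
  }
  return value === null ? "null" : typeof value;
}

// Usage in Node.js (the file name is a placeholder, not a name Extractify produces):
// const fs = require("fs");
// const data = JSON.parse(fs.readFileSync("extractify-export.json", "utf8"));
// console.log(describeShape(data));
```

Only the first element of each array is inspected, so the summary is a quick orientation rather than a full schema.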

What it does not allow: everything else!

Manual installation for Chrome

  1. Press the green « Clone or download » button on this page to download the latest version
  2. Unzip the downloaded archive
  3. In the Chrome address bar, open the extensions page by typing « chrome://extensions/ »
  4. Switch on « Developer mode » in the upper right corner
  5. Finally, load the folder extractify-master as an « unpacked extension »

Usage

Go to the wiki to see how to use Extractify.

Love it? Tell me!

Found a bug? Don’t be afraid to open an issue.