Scraper for Industry Canada's list of Federally registered Corporations
Perl
Switch branches/tags
Nothing to show
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Failed to load latest commit information.
README
ca_corps.csv
scrape.pl

README

This repository contains code to scrape Industry Canada's Federal Corporation search tool.

https://www.ic.gc.ca/app/scr/cc/CorporationsCanada/fdrlCrpSrch.html

Results are scraped by searching by Corporation Number as a 6 digit integer. The search string is prefix matched to the Corporation Number, resulting in up to 10 results in an HTML table. This table is scraped into a CSV file.

Requires: Scrappy Perl scraping library

Installing Dependencies: The easiest way is to install cpanm ("CPAN minus" - a great little Perl package manager) and then use it to install Scrappy:

  As root: curl -L http://cpanmin.us | perl - --self-upgrade
  Then: cpanm --sudo Scrappy