mcn-source-ct

Part of my MCN (make clean no) project.

Scripts for downloading Common Crawl (commoncrawl.org) data and extracting .no domains from it.

Howto:

  • git submodule init
  • git submodule update
  • sudo apt install python-bs4 parallel
  • ./get-indexes.sh
  • ./verify-indexes.sh
  • ./list_domains.sh
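
The extraction step can be illustrated with a minimal sketch. This is not the code in list_domains.sh; it assumes index lines in the Common Crawl CDX style (a key, a timestamp, then a JSON object with a "url" field), and the function name no_domains is hypothetical.

```python
import json
from urllib.parse import urlsplit


def no_domains(index_lines):
    """Return the sorted, de-duplicated .no hostnames found in index lines.

    Assumption: each useful line ends with a JSON object that has a "url"
    field. Lines that do not match this shape are skipped.
    """
    hosts = set()
    for line in index_lines:
        start = line.find("{")
        if start == -1:
            continue  # no JSON payload on this line
        try:
            record = json.loads(line[start:])
            url = record.get("url")
            if not isinstance(url, str):
                continue
            host = urlsplit(url).hostname  # lower-cased by urlsplit
        except ValueError:
            continue  # malformed JSON or malformed URL
        if host and host.endswith(".no"):
            hosts.add(host)
    return sorted(hosts)


if __name__ == "__main__":
    sample = [
        'no,example)/ 20200101000000 {"url": "http://Example.no/"}',
        'com,example)/ 20200101000000 {"url": "https://example.com/"}',
    ]
    for host in no_domains(sample):
        print(host)
```

The sketch keeps full hostnames (including subdomains such as www.example.no); reducing them to registered domains would need a public-suffix list, which is left out here.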

Source: http://commoncrawl.org

Description: Looks for domains in data from the Common Crawl project.

Credit: This result uses data from the Common Crawl Foundation, whose terms of use may be found at http://commoncrawl.org/terms-of-use/
