Skip to content

0xboz/alexa_top_sites_scraper

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

3 Commits
 
 
 
 
 
 

Repository files navigation

Alexa.com Top Sites Scraper

What is this all about?

Alexa.com is an internet traffic ranking company which provides premium data for its customers. It also offers a limited list of top sites on the web for free.

This script was created to scrape those top sites and store them in a text file. It simply goes through all sub-catagories, retrieve the sites in each table and remove any duplicates.

And of course, if you need the premium data from Alexa.com, you can find more information here.

Pricing: $0.0025 per URL returned (e.g. $.25 for 100 URLs) at the time of writing.

Prerequisite

Virtualenv or similar (such as, pipenv) is recommended before running the script.

Python 3.6.7

pip install requests==2.21.0
pip install beautifulsoup4==4.6.3

How to use the script

git clone https://github.com/0xboz/alexatopsites.git
cd alexatopsites
python alexatopsites.py

A folder named as the current date will be created when you run the script each time. All results will be stored in a text file.

About

Alexa.com top sites scraper

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages