- Designed by fht.im
- Data from http://commoncrawl.org/
Visit url.fht.im
Make sure you have installed python3 and virtualenv.
virtualenv venv -p /usr/bin/python3 # or run "which python3" to find your python3 path
source venv/bin/activate
cd super-Django-CC && pip install -r requirements.txt
python manager.py runserver 127.0.0.1:8001
Then visit localhost:8001 to see a preview.
Get the code, then docker build and docker run
git clone https://github.com/imfht/super-Django-CC && cd super-Django-CC && docker build . -t super_django_cc
Run it
docker run -p8001:8001 -d super_django_cc
Then visit localhost:8001 to see a preview.
- What is this?
It shows how many URLs and websites have been exposed to web crawlers.
- Why do I get very few results for my site?
All the data comes from commoncrawl.org. Although it has crawled a huge number of pages on the internet, crawling every page of every website is impossible.
- TOS & Rate limiting
The TOS of this site is the same as http://commoncrawl.org/terms-of-use/. Respectful robots are welcome. Respectful means a maximum rate of 5 req/s. If you want to increase it, please use Common Crawl's open data or contact me.
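
If you write a robot, one simple way to stay under the 5 req/s limit is to space requests at least 0.2 seconds apart on the client side. The sketch below is only an illustration: the `Throttle` class and its `max_per_second`, `clock`, and `sleep` parameters are hypothetical names, not part of this project, and it says nothing about how the site enforces the limit.

```python
import time


class Throttle:
    """Hypothetical helper: space calls so at most `max_per_second` start per second."""

    def __init__(self, max_per_second=5.0, clock=time.monotonic, sleep=time.sleep):
        if max_per_second <= 0:
            raise ValueError("max_per_second must be positive")
        # 5 req/s means at least 0.2 s between the start of two requests.
        self.min_interval = 1.0 / max_per_second
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # earliest time the next request may start

    def wait(self):
        """Block until the next request is allowed; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            # Re-read the clock in case the sleep overshot.
            now = max(self._clock(), self._next_allowed)
        self._next_allowed = now + self.min_interval
        return delay
```

Create one `Throttle()` per robot and call its `wait()` method right before each request. The `clock` and `sleep` arguments exist only so the timing can be replaced in tests.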