imfht/super-Django-CC

About

Online site

Visit url.fht.im

Preview

Build

Install from source

Make sure python3 and virtualenv are installed.

1. Create a virtual environment and activate it.

virtualenv venv -p /usr/bin/python3 # or run `which python3` to find your python3 path
source venv/bin/activate

2. Install the requirements

cd super-Django-CC && pip install -r requirements.txt

3. Run it

python manager.py runserver 127.0.0.1:8001

Then visit localhost:8001 to see a preview.

Build with Docker

Get the code and build the image

git clone https://github.com/imfht/super-Django-CC && cd super-Django-CC && docker build . -t super_django_cc 

Run it

docker run -p8001:8001 -d super_django_cc

Then visit localhost:8001 to see a preview.

Q&A

  1. What is this?
    It shows how many URLs and websites have been exposed to web crawlers.
  2. Why do I get very few results for my site?
    All the data comes from commoncrawl.org. Although it has crawled a huge number of pages on the internet, crawling every page of every website is impossible.
  3. TOS & Rate limiting
    The site's TOS is the same as http://commoncrawl.org/terms-of-use/. Respectful robots are welcome. Respectful means a maximum rate of 5 req/s. If you want a higher rate, please use Common Crawl's open data or contact me.
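
For robot authors, one way to stay under the 5 req/s limit is to space requests at least 0.2 seconds apart on the client side. The sketch below is only an illustration: the `Throttle` class and its parameters are hypothetical names and are not part of super-Django-CC, and the HTTP call itself is left to whatever client you already use.

```python
import time


class Throttle:
    """Hypothetical helper: keep calls at least 1/max_rate seconds apart."""

    def __init__(self, max_rate=5.0, clock=time.monotonic, sleep=time.sleep):
        # 5 req/s means one request every 0.2 s at most.
        self.min_interval = 1.0 / max_rate
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # no request has been sent yet

    def wait(self):
        """Block until the next request may be sent; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            # Re-read the clock in case the sleep overshot.
            now = max(self._clock(), self._next_allowed)
        self._next_allowed = now + self.min_interval
        return delay


# Usage sketch (the request itself is an assumption, so it is left as a comment):
#
#   throttle = Throttle(max_rate=5)
#   for url in urls:
#       throttle.wait()
#       ...  # send one request for `url` here
```

The clock and sleep functions are injectable so the spacing logic can be checked without real waiting. A single shared `Throttle` only limits one process; several workers would each need a proportionally lower `max_rate`.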

About

super-Django-CC is a simple web interface for commoncrawl.org
