
A small Python program that scrapes a website and the pages it needs to obtain all schools (primary and secondary) in Tanzania. Warning: scrape according to the website's terms of service; don't overload any server.


AnoRebel/template-scraper


template-scraper

Requisites

We need the following to run this script successfully:

  1. Python 3. I'm using Python 3.5.2 and haven't tested other versions, but I think it should work on any Python 3.x version (www.python.org).
  2. The 'json' library, which is part of the Python standard library.
  3. Either the 'requests' module (pip install requests) or the 'urllib2' module (pip install yieldfrom.urllib.requests). Note that 'urllib2' is the Python 2 name; on Python 3 the standard library's 'urllib.request' does the same job.
  4. The 'time' module, also part of the standard library. We use it to pause between requests so the program doesn't overload the server.
  5. BeautifulSoup4 (pip install bs4 or pip install beautifulsoup4). I use it for the scraping itself, but you can use any scraper you're used to.

You need a good internet connection for it to run fast. This version is an alternative to my school-template-scraper (http://www.github.com/anorebel/school-template-scraper) that uses JSON instead of MongoDB.

general.py combines primary.py and secondary.py, so you can either run those two or just run general.py. In case of any problem, contact me.
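To show how the pieces listed above fit together (fetch a page, wait, parse, save as JSON), here is a minimal sketch. It is not the code from primary.py, secondary.py or general.py. To stay dependency-free it swaps in the standard library's 'urllib.request' and 'html.parser' for 'requests' and BeautifulSoup4. The names LinkTextParser, fetch, scrape and save_as_json, the one-second delay, and the idea that school names are the text of link tags are all assumptions made for illustration.

```python
import json
import time
from html.parser import HTMLParser
from urllib.request import urlopen


class LinkTextParser(HTMLParser):
    """Collect the text inside every <a> tag.

    Assumption: school names appear as link text. The real site's
    markup may differ, so adjust the tag check to match it.
    """

    def __init__(self):
        super().__init__()
        self._in_link = False
        self.names = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._in_link = True

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_link = False

    def handle_data(self, data):
        text = data.strip()
        if self._in_link and text:
            self.names.append(text)


def fetch(url):
    """Download one page and return its body as text."""
    with urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def scrape(urls, delay_seconds=1.0, fetch_page=fetch, sleep=time.sleep):
    """Fetch each URL in turn, pausing between requests.

    fetch_page and sleep are parameters so the function can be tried
    offline with canned pages.
    """
    names = []
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay_seconds)  # pause so the server is not flooded
        parser = LinkTextParser()
        parser.feed(fetch_page(url))
        names.extend(parser.names)
    return names


def save_as_json(names, path):
    """Write the collected names to a JSON file."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump({"schools": names}, handle, ensure_ascii=False, indent=2)
```

To try it without touching any server, pass a dictionary lookup as fetch_page, for example scrape(["a", "b"], fetch_page=pages.__getitem__, sleep=lambda s: None), where pages maps those keys to HTML strings. With a real site, pick a delay that respects its terms of service.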
