scrap yp.com.hk

A one-off crawler for company info from yp.com.hk.

Built with BeautifulSoup and Requests.

Run get_url_pagination.py to collect the URLs of all search results (~15,000), or use the included CSV file directly. Then run main.py to crawl each company's name, tel, fax and email.
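
For readers using this as learning material, here is a minimal sketch of the kind of per-page extraction main.py performs. It is not the actual code from this repo: the CSS selectors, the function names (`parse_company`, `fetch_company`, `write_records`) and the sample HTML are all hypothetical, since the real yp.com.hk markup is not described here and changes over time.

```python
import csv
import io

from bs4 import BeautifulSoup

FIELDS = ("name", "tel", "fax", "email")

# Hypothetical selectors: replace them with whatever the live page uses.
SELECTORS = {
    "name": "h1.company-name",
    "tel": "span.tel",
    "fax": "span.fax",
    "email": "a.email",
}


def parse_company(html):
    """Extract name, tel, fax and email from one company page's HTML."""
    soup = BeautifulSoup(html, "html.parser")
    record = {}
    for field in FIELDS:
        node = soup.select_one(SELECTORS[field])
        # A missing field becomes an empty string instead of raising.
        record[field] = node.get_text(strip=True) if node else ""
    return record


def fetch_company(url, timeout=10):
    """Download one page with Requests and parse it."""
    import requests  # imported here so parse_company works offline

    response = requests.get(url, timeout=timeout)
    response.raise_for_status()
    return parse_company(response.text)


def write_records(records, out):
    """Write the parsed records as CSV to an open text file object."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(records)


# Made-up sample page, used to try the parser without network access.
SAMPLE_HTML = """
<html><body>
  <h1 class="company-name">Example Co.</h1>
  <span class="tel">2345 6789</span>
  <a class="email" href="mailto:info@example.com">info@example.com</a>
</body></html>
"""

if __name__ == "__main__":
    buffer = io.StringIO()
    write_records([parse_company(SAMPLE_HTML)], buffer)
    print(buffer.getvalue())
```

Keeping the selectors in one dictionary means that when the site's structure changes, only that mapping needs updating.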

Because the website's structure changes frequently, I recommend using this project as learning material only.