bahtou/TechCrunch-HomePage-Spider

Spider that crawls the home page of TechCrunch (http://techcrunch.com/)

The Scrapy framework is used to scrape information from the TechCrunch home page.
For each post, the spider extracts the author, the author's link, the headline, the headline link and the time posted.

The data is then stored in a MySQL database using MySQLdb.
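
The storage step could be sketched roughly as follows. The table name, column names and helper functions are assumptions made for illustration, not the schema used by this repository. The helpers accept any DB-API connection, and the `%s` placeholders match the parameter style MySQLdb uses.

```python
# Hypothetical column names for the five extracted fields.
FIELDS = ("author", "author_link", "headline", "headline_link", "posted_at")


def build_insert(table, item):
    """Return (sql, params) for one scraped item.

    The table name is interpolated directly, so it must come from trusted
    code, never from scraped data. Values go through %s placeholders.
    """
    columns = ", ".join(FIELDS)
    placeholders = ", ".join(["%s"] * len(FIELDS))
    sql = "INSERT INTO {} ({}) VALUES ({})".format(table, columns, placeholders)
    params = tuple(item.get(field) for field in FIELDS)
    return sql, params


def save_items(connection, items, table="posts"):
    """Insert every item through a DB-API connection, then commit once."""
    cursor = connection.cursor()
    try:
        for item in items:
            sql, params = build_insert(table, item)
            cursor.execute(sql, params)
        connection.commit()
    finally:
        cursor.close()


# Usage with MySQLdb (credentials and database name are placeholders):
#   import MySQLdb
#   conn = MySQLdb.connect(host="localhost", user="scraper",
#                          passwd="secret", db="techcrunch")
#   save_items(conn, scraped_items)
#   conn.close()
```

Passing values as parameters instead of formatting them into the SQL string lets the driver handle quoting, which matters for headlines that contain quotes or other special characters.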

Check out Scrapy:
    http://scrapy.org/
