Scraping bloomberg.com to extract article text and save it to a mongodb collection.

har777/bloomberg_scraping

Uses Scrapy to crawl bloomberg.com recursively for article text. The data is stored in real time to your local MongoDB instance (currently configured as host: localhost, db: data, collection: items). cd into the directory and run "scrapy crawl bloomberg_spider" to start the crawler.
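The flow described above, where each scraped item is written straight to a local MongoDB collection, is typically done with a Scrapy item pipeline. A minimal sketch follows; the class name, setting names, and defaults are assumptions for illustration, not this repository's actual code:

```python
# Hypothetical MongoDB item pipeline for a Scrapy project.
# Class and setting names (MONGO_URI, MONGO_DB, MONGO_COLLECTION)
# are assumptions, not taken from this repository.

class MongoPipeline:
    def __init__(self, mongo_uri, mongo_db, mongo_collection):
        self.mongo_uri = mongo_uri
        self.mongo_db = mongo_db
        self.mongo_collection = mongo_collection

    @classmethod
    def from_crawler(cls, crawler):
        # Read connection details from Scrapy settings, with the
        # defaults this README mentions (localhost, db "data",
        # collection "items").
        return cls(
            mongo_uri=crawler.settings.get("MONGO_URI", "mongodb://localhost:27017"),
            mongo_db=crawler.settings.get("MONGO_DB", "data"),
            mongo_collection=crawler.settings.get("MONGO_COLLECTION", "items"),
        )

    def open_spider(self, spider):
        # Import here so the module can be inspected without pymongo installed.
        import pymongo
        self.client = pymongo.MongoClient(self.mongo_uri)
        self.collection = self.client[self.mongo_db][self.mongo_collection]

    def close_spider(self, spider):
        self.client.close()

    def process_item(self, item, spider):
        # Each article item is inserted as a plain document.
        self.collection.insert_one(dict(item))
        return item
```

A pipeline like this would be enabled in the project's settings.py, e.g. ITEM_PIPELINES = {"bloomberg_scraping.pipelines.MongoPipeline": 300} (the module path here is a guess).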
