- This project gathers news from Turkish newspapers and stores it in MongoDB.
- Currently supported newspapers:
- Cumhuriyet
- Milliyet
- Posta
- Sabah
- Star
- Download the binary for your operating system from the releases page, then run:
gazeteci [--mongo=MONGO_DB_URL] [--db=MONGO_DB] [--coll=MONGO_COLLECTION]
- For every newspaper, the RSS feeds are checked to collect news URLs, and the full text of each article is then scraped from those URLs.
- To add a new newspaper, implement the NewsFeed interface and register your new newspaper with gazeteci.Register.
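
The registration step could look roughly like the sketch below. This README does not list the methods of NewsFeed or the signature of gazeteci.Register, so everything here is an assumption: the `Name` and `FeedURLs` methods, the local `Register` function, the `registry` map, and the `exampleGazete` type are all hypothetical stand-ins, defined locally so the example compiles on its own. Check the project source for the real interface before copying this shape.

```go
package main

import (
	"fmt"
	"sort"
)

// NewsFeed is a HYPOTHETICAL stand-in for the project's interface.
// The real method set is not documented here and may differ.
type NewsFeed interface {
	Name() string       // assumed: display name of the newspaper
	FeedURLs() []string // assumed: RSS feed URLs to poll
}

// registry and Register are local stand-ins for gazeteci.Register;
// the real function's signature is an assumption.
var registry = map[string]NewsFeed{}

// Register stores a newspaper under its name.
func Register(f NewsFeed) {
	registry[f.Name()] = f
}

// exampleGazete is an invented newspaper used only for illustration.
type exampleGazete struct{}

func (exampleGazete) Name() string { return "Example Gazete" }

func (exampleGazete) FeedURLs() []string {
	return []string{"https://example.com/rss"}
}

// Registering from init makes the newspaper available as soon as
// its package is imported.
func init() {
	Register(exampleGazete{})
}

// registeredNames returns the registered newspaper names, sorted.
func registeredNames() []string {
	names := make([]string, 0, len(registry))
	for n := range registry {
		names = append(names, n)
	}
	sort.Strings(names)
	return names
}

func main() {
	for _, n := range registeredNames() {
		fmt.Println(n, registry[n].FeedURLs())
	}
}
```

Registering from `init` is one common Go pattern for plug-in style registries (the standard library's `database/sql` drivers use it); whether this project follows the same pattern is not stated here.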
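
The first half of the RSS step described above (reading a feed to collect news URLs) can be sketched with Go's standard `encoding/xml` package. This is a minimal illustration, not the project's actual code: the `rssFeed` and `rssItem` types, the `extractLinks` function, and the sample feed are all made up for the example, and it assumes a plain RSS 2.0 layout with `channel > item > link` elements.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// rssFeed mirrors only the parts of an RSS 2.0 document needed here.
// (Hypothetical type, not taken from the project.)
type rssFeed struct {
	Items []rssItem `xml:"channel>item"`
}

type rssItem struct {
	Title string `xml:"title"`
	Link  string `xml:"link"`
}

// extractLinks returns the article URLs listed in an RSS 2.0 feed,
// skipping items whose link is empty.
func extractLinks(data []byte) ([]string, error) {
	var feed rssFeed
	if err := xml.Unmarshal(data, &feed); err != nil {
		return nil, err
	}
	links := make([]string, 0, len(feed.Items))
	for _, it := range feed.Items {
		if l := strings.TrimSpace(it.Link); l != "" {
			links = append(links, l)
		}
	}
	return links, nil
}

// sampleFeed is an invented feed used only for illustration.
const sampleFeed = `<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"><channel>
<title>Example</title>
<item><title>A</title><link>https://example.com/a</link></item>
<item><title>B</title><link> https://example.com/b </link></item>
</channel></rss>`

func main() {
	links, err := extractLinks([]byte(sampleFeed))
	if err != nil {
		panic(err)
	}
	for _, l := range links {
		// Each URL would next be fetched and scraped for the full text.
		fmt.Println(l)
	}
}
```

In a real run the feed bytes would come from an HTTP response rather than a constant, and each returned URL would be passed to a newspaper-specific scraper.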