disinfoRG/ArticleParser

ArticleParser

ArticleParser is part of the 0archive project. Its purpose is to:

  1. pull raw data from several scraper databases,
  2. translate raw data into a standardized format and save it to the database of ArticleParser, and then
  3. publish the resulting dataset in the database to several places for storage.

One example of an upstream scraper is ZeroScraper, also from 0archive. Datasets can currently be published to local files or to Google Drive folders.
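The three steps above can be illustrated with a minimal sketch. It is not ArticleParser's actual code: the table names (`raw_article`, `publication`), the helper functions, the toy HTML parsing, and the JSON Lines output format are all assumptions made for illustration, and in-memory SQLite stands in for the real scraper and parser databases.

```python
import json
import sqlite3

# Hypothetical schemas; the real scraper and ArticleParser databases differ.
def make_scraper_db():
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE raw_article (id INTEGER PRIMARY KEY, url TEXT, html TEXT)")
    db.execute(
        "INSERT INTO raw_article (url, html) VALUES (?, ?)",
        ("https://example.com/a", "<title>Hello</title><p>Body text</p>"),
    )
    return db

def make_parser_db():
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE publication (url TEXT PRIMARY KEY, title TEXT, text TEXT)")
    return db

def pull(scraper_db):
    """Step 1: read raw rows from a scraper database."""
    return scraper_db.execute("SELECT url, html FROM raw_article").fetchall()

def translate(url, html):
    """Step 2: turn one raw row into a standardized record (toy parsing only)."""
    title = html.split("<title>")[1].split("</title>")[0] if "<title>" in html else ""
    text = html.split("<p>")[1].split("</p>")[0] if "<p>" in html else ""
    return {"url": url, "title": title, "text": text}

def save(parser_db, record):
    """Step 2, continued: store the standardized record in the parser database."""
    parser_db.execute(
        "INSERT OR REPLACE INTO publication VALUES (:url, :title, :text)", record
    )

def publish(parser_db, out_path):
    """Step 3: write the dataset to a local file, one JSON object per line."""
    rows = parser_db.execute(
        "SELECT url, title, text FROM publication ORDER BY url"
    ).fetchall()
    with open(out_path, "w", encoding="utf-8") as f:
        for url, title, text in rows:
            record = {"url": url, "title": title, "text": text}
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
    return len(rows)

def run_pipeline(out_path):
    """Run pull, translate/save, and publish in order; return the record count."""
    scraper_db, parser_db = make_scraper_db(), make_parser_db()
    for url, html in pull(scraper_db):
        save(parser_db, translate(url, html))
    parser_db.commit()
    return publish(parser_db, out_path)
```

Calling `run_pipeline("dataset.jsonl")` would write the single sample record to a local file. Keeping the publish step separate from parsing is one way to let the same stored dataset go to more than one destination, such as a local file and a Google Drive folder.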

For an overview, refer to the diagram of the 0archive system architecture.

The code runs on Python 3.7 or above. The system has been tested on MariaDB 10.3.