The Wayback Machine contains more than 600 billion archives today, and this volume of web pages needs to be analyzed. Our goal is to produce reports about the hosts and domains in these archives to help inform web archiving efforts.
This repository is being developed as part of the 2018 Google Summer of Code.