• A Chrome browser extension

    JavaScript 67 102 Updated Oct 19, 2018
  • JavaScript 7 13 1 issue needs help Updated Oct 19, 2018
  • One webpage for every book ever published!

    Python 856 266 14 issues need help Updated Oct 19, 2018
  • brozzler - distributed browser-based web crawler

    Python 192 43 Apache-2.0 Updated Oct 18, 2018
  • Page diff for the Wayback Machine

    JavaScript 3 1 Updated Oct 17, 2018
  • Python Client Library for the Archive.org OpenLibrary API

    Python 32 14 AGPL-3.0 1 issue needs help Updated Oct 17, 2018
  • JavaScript 1 AGPL-3.0 Updated Oct 16, 2018
  • HTML 8 3 AGPL-3.0 1 issue needs help Updated Oct 16, 2018
  • Swift Updated Oct 14, 2018
  • MIRROR of upstream IA repository

    Rust 1 Updated Oct 12, 2018
  • Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.

    Java 1,306 576 Updated Oct 12, 2018
  • The Internet Archive BookReader

    JavaScript 412 171 AGPL-3.0 Updated Oct 12, 2018
  • JavaScript 3 4 AGPL-3.0 Updated Oct 11, 2018
  • WARC-writing MITM HTTP/S proxy

    Python 142 30 Updated Oct 11, 2018
  • Web access control (exclusion oracle) tools for optional use with the Wayback Machine

    JavaScript 8 Updated Oct 9, 2018
  • A Python 3.4 application that calculates and returns simhash values for Internet Archive's snapshots

    Python 2 Updated Oct 8, 2018
  • Internet Archive utility that converts ABBYY to EPUB3

    Python 2 AGPL-3.0 Updated Oct 2, 2018
  • Internet Archive Decentralized Web Common API

    JavaScript 27 9 AGPL-3.0 Updated Oct 2, 2018
  • IA's public Wayback Machine (moved from SourceForge)

    Java 259 148 Updated Oct 1, 2018
  • Decentralized Web gateway for the Internet Archive

    Python 16 3 AGPL-3.0 Updated Oct 1, 2018
  • RethinkDB Python library

    Python 7 3 Apache-2.0 Updated Sep 28, 2018
  • surt

    Forked from rajbot/surt

    Sort-friendly URI Reordering Transform (SURT) Python module

    Python 20 11 AGPL-3.0 Updated Sep 27, 2018
  • A queue-controlled browser automation tool for improving web crawl quality

    Python 45 22 Apache-2.0 Updated Sep 25, 2018
  • A repository of cleanup bots implementing the openlibrary-client

    Python 1 5 2 issues need help Updated Sep 23, 2018
  • Python 17 22 Updated Sep 23, 2018
  • Trough: Big data, small databases.

    Python 9 2 BSD-2-Clause Updated Sep 20, 2018
  • Archive.org OPDS Bookserver - A standard for digital book distribution

    Python 55 10 AGPL-3.0 Updated Sep 14, 2018
  • JavaScript 2 1 AGPL-3.0 Updated Sep 11, 2018
  • Python script to create CDX index files of WARC data

    Arc 11 14 AGPL-3.0 Updated Sep 10, 2018
  • Python 4 8 Updated Aug 30, 2018