Popular repositories

  1. ArchiveSpark

    An Apache Spark framework that facilitates access to Web archives and enables easy data extraction and derivation, developed by the Internet Archive and L3S Research Center.

    Jupyter Notebook, 45 stars, 7 forks

  2. HadoopConcatGz

    A splittable Hadoop InputFormat for concatenated GZIP files and *.(w)arc.gz

    Java, 6 stars, 2 forks

  3. Exspec

    Don't write specs anymore, just save 'em while testing your code interactively. Specs will become a byproduct.

    Ruby, 5 stars

  4. Web2Warc

    An easy-to-use and highly customizable crawler that lets you create your own small Web archives (WARC/CDX)

    Scala, 4 stars, 2 forks

  5. internetarchive-transfer-scripts

    Scripts to transfer archive.org collections, using https://github.com/jjjake/internetarchive

    Python, 4 stars, 2 forks

  6. IABooksOnArchiveSpark

    Analyze digitized books from the Internet Archive remotely with ArchiveSpark

    Scala, 3 stars

104 contributions in the last year

Contribution activity

April 2017
