Source code and scripts for the Webis Web Archiver
You need to have Docker installed.

Then, on a Unix machine:

  • Run src-bash/ to archive web pages. It displays usage hints.
  • Run src-bash/ to reproduce pages from an archive. It displays usage hints.

The scripts automatically download and run the Docker image (over 2 GB, due to all the fonts it includes).

For other OSes, have a look at the shell scripts and adjust the call to docker run accordingly.

Custom user simulation scripts

  • Write a class that extends InteractionScript.

  • You can use the ScrollDownScript as an example, or extend it.

  • The utility class Windows offers static helper methods for frequently used interactions.

  • Compile your script with the binaries in the class path and create a JAR from it.

  • Place the JAR into a directory named "scriptname-1.0.0", where you replace "scriptname" by the name of your script.

  • Create a file "script.conf" with the following content and put it into the same directory:

    script = packages.of.your.ScriptClass
    environment.version = 1.0.0

    where you replace "packages.of.your.ScriptClass" accordingly. For the example ScrollDownScript, that would be

    script = de.webis.webarchive.environment.scripts.ScrollDownScript

  • When running either of the scripts above, specify the directory that contains the new directory with "--scriptsdirectory" and give the script name (as in the new directory) with "--script".
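
The packaging steps above can be sketched as a small shell script. This is only an illustration under stated assumptions: the script name "myscript", the class name "com.example.MyScript", and the variables SCRIPTS_DIR and MY_JAR are hypothetical placeholders, not names used by the Webis Web Archiver. Compiling the JAR is not shown, since it depends on where the binaries are located on your machine.

```shell
#!/bin/sh
# Minimal sketch: lay out a custom script directory as described above.
# All names below are hypothetical placeholders; replace them with your own.
set -eu

SCRIPT_NAME="myscript"                      # assumed script name
SCRIPT_CLASS="com.example.MyScript"         # assumed fully qualified class name
SCRIPTS_DIR="${SCRIPTS_DIR:-$(mktemp -d)}"  # directory later given to --scriptsdirectory

# The directory must be named "<scriptname>-1.0.0".
TARGET="$SCRIPTS_DIR/$SCRIPT_NAME-1.0.0"
mkdir -p "$TARGET"

# Copy the compiled JAR into the directory, if one was supplied via MY_JAR.
if [ -n "${MY_JAR:-}" ] && [ -f "$MY_JAR" ]; then
  cp "$MY_JAR" "$TARGET/"
fi

# Write script.conf next to the JAR.
cat > "$TARGET/script.conf" <<EOF
script = $SCRIPT_CLASS
environment.version = 1.0.0
EOF

echo "Created $TARGET"
```

With this layout, the value of SCRIPTS_DIR is what you would specify with "--scriptsdirectory", and "myscript" is what you would give with "--script".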


