Branch: master
Commits on Jun 19, 2019
Commits on May 23, 2019
  1. updating code to run on python3.5

    Fernando-Melo authored and root committed May 23, 2019
Commits on May 21, 2019
  1. Update IndexArcs

    igobranco authored and root committed May 21, 2019
  2. Update gitignore.

    igobranco authored and root committed May 21, 2019
Commits on May 20, 2019
  1. Update hadoop version

    igobranco committed May 20, 2019
  2. Freeze pywb version

    igobranco committed May 20, 2019
Commits on Jun 27, 2018
  1. Now only indexing cdx requests that correspond to http status codes of 100s, 200s and 300s. Changed hadoop input type to lineinput to run faster in the new cluster

    Fernando-Melo committed Jun 27, 2018
Commits on Dec 12, 2016
Commits on Dec 5, 2016
  1. index arcs to cdx

    Fernando-Melo committed Dec 5, 2016
  2. Index Arcs to CDX

    root committed Dec 5, 2016
Commits on Feb 22, 2016
  1. indexwarcsjob.py: update hard-coded path to be correct (TODO: load from env)

    ikreymer committed Feb 22, 2016
    
    reqs: update to correct python-hadoop git
Commits on Nov 24, 2015
Commits on Oct 9, 2015
  1. build local zipnum: add line numbering to final index (summary), add option to specify cdx lines per block

    ikreymer committed Oct 9, 2015
Commits on Jun 11, 2015
  1. Merge pull request ikreymer#2 from machawk1/patch-1

    ikreymer committed Jun 11, 2015
    Fixed various MD weirdnesses
  2. Fixed various MD weirdnesses

    machawk1 committed Jun 11, 2015
Commits on Mar 29, 2015
  1. minor fixes from latest CC index build:

    ikreymer committed Mar 29, 2015
    - fix typos
    - update reducer start for cluster job to be much later
    - update reqs
Commits on Mar 13, 2015
  1. readme tweaks

    ikreymer committed Mar 13, 2015
  2. readme tweaks

    ikreymer committed Mar 13, 2015
  3. Add local cluster build info

    ikreymer committed Mar 13, 2015
Commits on Mar 12, 2015
Commits on Mar 11, 2015
  1. readme update

    ikreymer committed Mar 11, 2015
  2. make scripts runnable

    ikreymer committed Mar 11, 2015
  3. update path to index_env.sh

    ikreymer committed Mar 11, 2015
  4. fix typos

    ikreymer committed Mar 11, 2015
  5. tweak README

    ikreymer committed Mar 11, 2015
  6. Update README after refactor

    ikreymer committed Mar 11, 2015
  7. add .gitignore and LICENSE

    ikreymer committed Mar 11, 2015
Commits on Mar 10, 2015
  1. refactoring: use cmdline options instead of fixed constants!

    ikreymer committed Mar 10, 2015
    rename job files to end in job
    add integrated samplecdx script for running samplecdxjob and converting to sequencefile
    add seqfileutils