Commits on Jun 23, 2008
Commits on Apr 18, 2008
  1. TAG 1.1

    Stuart Sierra committed Apr 19, 2008
  2. Omits XHTML header/footer & encloses in div.prohtml.

    Stuart Sierra committed Apr 19, 2008
Commits on Mar 27, 2008
  1. Makefile.in additions for altlaw_parse_pdf.

    Stuart Sierra committed Mar 27, 2008
Commits on Feb 16, 2008
  1. Altlaw PDF Parser - v1.0

    dpkp committed Feb 16, 2008
    Parses court opinions in PDF format and outputs them in the bulk.resource.org XHTML format.
Commits on Feb 3, 2008
  1. DAR features 0.1 - dpkp

    dpkp committed Feb 4, 2008
Commits on Jan 18, 2008
  1. PDF Parser updates - v0.02

    dpkp committed Jan 18, 2008
Commits on Jan 14, 2008
  1. Initial Import - Altlaw PDF Parser

    dpkp committed Jan 14, 2008
Commits on Nov 20, 2007
  1. Altered pdftohtml to output em-based units.

    Stuart Sierra committed Nov 20, 2007
    Running "pdftohtml file.pdf" with no other arguments will create
    file.html, containing HTML divs with the minimum necessary CSS
    styling information to position the text with HTML em units.
    This allows browser font-resizing to work correctly.
    No font information is included in the HTML file.
    
    The output is not a complete HTML file, but rather a fragment
    of UTF-8 encoded HTML to be included in an HTML template.
    
    This is a messy hack, and breaks some command line arguments,
    notably "-xml".  But "-stdout" still works.
  2. Fixed more word-spacing issues.

    dpkp committed Nov 20, 2007
    Message-ID: <cfe1bfbd0711182230k948b4d0g964fa0e9893ff21e@mail.gmail.com>
    Date: Sun, 18 Nov 2007 22:30:38 -0800
    From: "Dana Powers" <dana.powers@gmail.com>
    To: lawcommons-dev@lawcommons.org
    Subject: Re: [lawcommons-dev] Improved PDF-to-HTML rendering
    
    I've fixed a few more bugs in the pdf rendering code.  I've only
    tested the patch against 0.6.1 (0.6.2 was released recently), but I
    think it should work with either.
    These should look better after the patch (they had strings of
    mashed-together words):
    http://www.altlaw.org/v1/cases/194306
    http://www.altlaw.org/v1/cases/194316
    http://www.altlaw.org/v1/cases/194317
    I've also improved the footnote parsing, so that should look better everywhere.
  3. Imported upstream poppler-0.6.2

    Stuart Sierra committed Nov 20, 2007
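
The "Altered pdftohtml to output em-based units" commit above says the tool now emits a UTF-8 HTML fragment, positioned in em units, that is meant to be dropped into an HTML template. A minimal sketch of that last step follows. The template, the function names, and the 12pt base font size are assumptions made for illustration; none of them come from the repository.

```python
import html

# Hypothetical page template; the template actually used with the
# fragment is not shown in the commit log.
TEMPLATE = """<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>{title}</title>
</head>
<body>
{fragment}
</body>
</html>
"""


def wrap_fragment(fragment: str, title: str) -> str:
    """Embed an HTML fragment in a complete page.

    Only the title is escaped; the fragment is assumed to already be
    valid HTML and is inserted unchanged.
    """
    return TEMPLATE.format(title=html.escape(title), fragment=fragment)


def pt_to_em(points: float, base_font_pt: float = 12.0) -> float:
    """Convert a length in points to em units.

    The 12pt base is an assumed value, not one taken from the patch.
    Lengths in em scale with the browser's font size, which is why
    em-based positioning survives font resizing.
    """
    return points / base_font_pt


if __name__ == "__main__":
    # A stand-in fragment; real input could be the output of
    # `pdftohtml -stdout file.pdf`, per the commit message.
    left = pt_to_em(72.0)
    sample = f'<div style="position:absolute; left:{left}em">Opinion text</div>'
    print(wrap_fragment(sample, "Sample opinion"))
```

Keeping the fragment free of font information, as the commit describes, leaves typography to the enclosing template's stylesheet.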
Commits on Nov 7, 2007
  1. @dpkp
  2. TAG upstream 0.6.1

    Stuart Sierra committed Nov 7, 2007
  3. Imported upstream poppler-0.6.1.

    Stuart Sierra committed Nov 7, 2007