Commits on Jun 15, 2011
  1. Merge pull request #2 from mala/patch-1

    fix regexp for InvalidEncoding
    committed Jun 15, 2011
  2. fix regexp for InvalidEncoding

    mala committed Jun 15, 2011
Commits on Jun 13, 2011
  1. Checking in changes prior to tagging of version 0.29_01.

    Changelog diff is:
    
    diff --git a/Changes b/Changes
    index b4636be..0d407a6 100644
    --- a/Changes
    +++ b/Changes
    @@ -1,5 +1,31 @@
     Revision history for Perl extension XML::Liberal
    
    +0.29_01  Mon Jun 13 09:10:24 PDT 2011
    +        Bug Fixes:
    +        - InvalidEncoding: update error-message regex
    +        - HTMLEntity: ignore case in `&`
    +        - XHTMLEmptyTag: accept all HTML 4 elements
    +        - LowAsciiChars: fix all chars [RT#57958]; accept leading zeroes
    +        - ControlCode: don't claim to handle errors it doesn't fix [RT#57500]
    +        - UndeclaredNS: fix a weird Unicode edge case
    +        - EntityRef: leave `&` untouched within CDATA sections
    +        - UnquotedAttribute: fix missing attribute value
    +        - StandaloneAttribute: allow arbitrary whitespace
    +
    +        New Features:
    +        - UnclosedHTML: allow missing end tags for HTML-like elements
    +        - TrailingDoctype: allow a `<!DOCTYPE>` declaration after root element
    +        - TrailingElements: allow more elements after root element
    +        - NestedCDATA: fix nested CDATA sections
    +
    +        Improvements:
    +        - improve some error messages
    +        - remove uses of `$&` [RT#59237]
    +        - make remedies fully pluggable
    +        - fix location reporting for XML documents with long lines
    +        - add missing dependencies
    +        - depend on HTML::Entities, not ...::Numbered
    +
     0.22  Sat Oct  3 04:38:52 PDT 2009
             - BAD-euc.xml doesn't fail with recent libxml2 apparently(?)
    committed Jun 13, 2011
  2. Add missing dependencies to Makefile.PL

    arc committed Jun 12, 2011
  3. Make error locations correct for >80-character lines

    It seems that libxml2 doesn't report column location in a reliable way: it
    left-truncates the context to 80 characters.  Since the context is the only
    way of determining the column, we have to scan the line containing the error
    for the context.
    arc committed Jun 12, 2011
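The context-scanning idea in the commit above can be sketched as follows. This is a Python illustration only, not the module's Perl code; the function name `locate_error` and the `caret_offset` parameter (the position of the error within the context string) are assumptions made for the example.

```python
from typing import Optional


def locate_error(line: str, context: str, caret_offset: int) -> Optional[int]:
    """Return the 0-based index of the error within `line`, or None.

    `context` is the (possibly left-truncated) excerpt reported by the
    parser, and `caret_offset` is where the error sits inside that excerpt.
    Both names are hypothetical stand-ins for whatever the parser supplies.
    """
    # Scan the full line for the excerpt; the first match is assumed to be
    # the right one, which can be wrong if the excerpt repeats in the line.
    start = line.find(context)
    if start < 0:
        return None
    return start + caret_offset
```

With a 129-character line and an 80-character excerpt taken from its end, the function recovers the position in the full line rather than the position within the excerpt.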
  4. Make StandaloneAttribute remedy handle more cases

    Now allows any amount of whitespace between a malformed attribute and
    whatever comes next.
    arc committed Jun 12, 2011
  5. Make EntityRef remedy smarter about CDATA sections

    If the document contains any CDATA sections (or PIs), it fixes
    broken-ampersand errors one at a time, as discovered by the parser.  This
    preserves the expected content of those CDATA sections and PIs.
    
    If the document doesn't contain any, we bulk-fix all ampersands for
    efficiency, just as before.
    arc committed Jun 11, 2011
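A minimal Python sketch of the two strategies described in the commit above, under stated assumptions: the names `fix_ampersands` and `error_offset` are hypothetical, the XML declaration is assumed not to count as a PI, and the regular expression is a simplification of what a real remedy would need.

```python
import re
from typing import Optional

# An ampersand that does not begin a well-formed entity or character reference.
BARE_AMP = re.compile(r"&(?!(?:[A-Za-z_][\w.-]*|#[0-9]+|#x[0-9A-Fa-f]+);)")


def has_cdata_or_pi(xml: str) -> bool:
    """True if the text contains a CDATA section or a processing instruction.

    Assumption: the XML declaration (`<?xml ...?>`) is not treated as a PI.
    """
    return "<![CDATA[" in xml or re.search(r"<\?(?!xml\s)", xml) is not None


def fix_ampersands(xml: str, error_offset: Optional[int] = None) -> str:
    if has_cdata_or_pi(xml):
        # Careful path: escape only the ampersand the parser complained
        # about, so ampersands inside CDATA sections and PIs stay untouched.
        if error_offset is None or xml[error_offset:error_offset + 1] != "&":
            return xml
        return xml[:error_offset] + "&amp;" + xml[error_offset + 1:]
    # Fast path: no CDATA or PIs, so every bare ampersand can be escaped at once.
    return BARE_AMP.sub("&amp;", xml)
```

In the careful path the caller would reparse after each fix and repeat until the parser stops reporting ampersand errors.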
  6. New NestedCDATA remedy

    Fixes documents which contain a <![CDATA[...]]> section that's had another
    XML CDATA section blindly pasted into it.
    arc committed Jun 11, 2011
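One common way to repair a nested CDATA section is to split each inner `]]>` as `]]]]><![CDATA[>`, so that the inner markers survive as literal text. The Python sketch below illustrates that technique; it is an assumption that the remedy works this way, and the function name `fix_nested_cdata` is hypothetical.

```python
OPEN, CLOSE = "<![CDATA[", "]]>"
SPLIT_CLOSE = "]]]]><![CDATA[>"  # ends one section and starts the next


def fix_nested_cdata(xml: str) -> str:
    """Rewrite nested CDATA sections so the inner markers become text.

    Intended only for documents the parser has already rejected: a valid
    CDATA section may legitimately contain the literal text "<![CDATA[".
    """
    out = []
    pos = 0
    while True:
        start = xml.find(OPEN, pos)
        if start < 0:
            out.append(xml[pos:])
            break
        out.append(xml[pos:start + len(OPEN)])
        i = start + len(OPEN)
        depth = 1
        while depth > 0:
            next_open = xml.find(OPEN, i)
            next_close = xml.find(CLOSE, i)
            if next_close < 0:
                # Unterminated section: copy the rest unchanged and give up.
                out.append(xml[i:])
                return "".join(out)
            if 0 <= next_open < next_close:
                # A pasted-in opener: keep it verbatim as character data.
                out.append(xml[i:next_open + len(OPEN)])
                i = next_open + len(OPEN)
                depth += 1
            else:
                out.append(xml[i:next_close])
                i = next_close + len(CLOSE)
                depth -= 1
                # Only the outermost terminator is left as a real "]]>".
                out.append(CLOSE if depth == 0 else SPLIT_CLOSE)
        pos = i
    return "".join(out)
```

After the rewrite, a standard XML parser reads the pasted-in section markers back as ordinary character data.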
  7. Rework the large-scale logic

    - A new class XML::Liberal::Error encapsulates the message and location of a
      parser error thrown by an underlying parser
    
    - A driver class is responsible for taking a parser exception and generating
      a suitable XML::Liberal::Error instance representing it
    
    - Each remedy class has a class method `apply` taking a driver, an error
      object, and a reference to the XML source text; it returns a boolean
      indicating whether it managed to fix the error in question
    
    This change allows us to write remedies which need to see the whole of the
    XML source text in order to determine whether they can fix the error.
    arc committed Jun 11, 2011
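The three roles described in the commit above (error object, driver, remedy with an `apply` class method) can be sketched in Python as follows. This is an illustration of the shape of the design, not the module's Perl code: `ExpatDriver`, `BareAmpersand`, and `parse_liberally` are hypothetical names, Python's built-in expat-based parser stands in for the underlying parser, and a one-element list stands in for the reference to the source text.

```python
import re
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Error:
    """Message and location of an error thrown by the underlying parser."""
    message: str
    line: int
    column: int


class ExpatDriver:
    """Wraps one parser and turns its exceptions into Error objects."""

    def parse(self, text):
        return ET.fromstring(text)

    def make_error(self, exc):
        line, column = exc.position
        return Error(str(exc), line, column)


class BareAmpersand:
    """Example remedy: escapes the first ampersand that starts no reference."""
    PATTERN = re.compile(r"&(?![A-Za-z_#][\w.#-]*;)")

    @classmethod
    def apply(cls, driver, error, source):
        # `source` is a one-element list so the remedy can edit the text in place.
        if "invalid token" not in error.message:
            return False
        source[0], count = cls.PATTERN.subn("&amp;", source[0], count=1)
        return count > 0


REMEDIES = [BareAmpersand]


def parse_liberally(driver, text, max_attempts=100):
    source = [text]
    for _ in range(max_attempts):
        try:
            return driver.parse(source[0])
        except ET.ParseError as exc:
            error = driver.make_error(exc)
            # Re-raise the parser's exception if no remedy claims a fix.
            if not any(r.apply(driver, error, source) for r in REMEDIES):
                raise
    raise ValueError("gave up after %d attempts" % max_attempts)
```

Because each remedy receives the whole source text rather than just the error message, it can inspect the document before deciding whether it is able to fix the error.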
  8. Use Class::Accessor, not hand-hacked accessor

    We already have C::A in the inheritance graph, so there's no downside to
    this.
    arc committed Jun 10, 2011
  9. Remove unused import

    arc committed Jun 10, 2011
  10. Add tests that remedies make the expected fix

    Rather than merely some fix that happens to make the document well-formed.
    arc committed Jun 9, 2011
  11. Fix Unicode edge case in UndeclaredNS remedy

    Something was, effectively, silently flipping the SvUTF8 flag on the
    document being parsed.
    arc committed Jun 9, 2011
  12. New TrailingElements remedy

    Fixes documents which have one or more elements after the end of the root
    element, by attempting to move them into the root element itself.
    
    There is also a test which ensures that TrailingDoctype and TrailingElements
    remedies can cooperate on a document which exhibits both problems.
    arc committed Jun 9, 2011
  13. New TrailingDoctype remedy

    Fixes documents which have a <!DOCTYPE> declaration after the end of the
    root element.
    arc committed Jun 8, 2011
  14. Fix overzealous ControlCode remedy

    It was being invoked for a couple of errors that it wasn't capable of fixing
    (and that no test cases triggered).
    arc committed Jun 8, 2011
  15. Make LowAsciiChars remedy slightly laxer

    Now allows any number of leading zeroes on the numeric character reference.
    arc committed Jun 8, 2011
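A small Python sketch of matching numeric character references with any number of leading zeroes. It assumes the offending references are simply dropped and that hexadecimal references are handled too; neither detail is stated in the commit, and the name `strip_low_ascii_refs` is hypothetical.

```python
import re

# Decimal or hexadecimal reference, tolerating any number of leading zeroes.
CHAR_REF = re.compile(r"&#(?:0*([0-9]+)|[xX]0*([0-9A-Fa-f]+));")

# Control characters below U+0020 that XML 1.0 does allow: tab, LF, CR.
ALLOWED = {0x09, 0x0A, 0x0D}


def strip_low_ascii_refs(xml: str) -> str:
    def replace(match):
        if match.group(1) is not None:
            code = int(match.group(1), 10)
        else:
            code = int(match.group(2), 16)
        # Drop references to forbidden control characters, keep the rest.
        if code < 0x20 and code not in ALLOWED:
            return ""
        return match.group(0)

    return CHAR_REF.sub(replace, xml)
```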
  16. Bugfixes and enhancements for XHTMLEmptyTag remedy

    - Fix all relevant elements (not just a subset of them), using HTML::Tagset
      to find the list of elements
    
    - Previously, the remedy was capable of fixing more elements than its
      constructor believed; change to use exactly the same list of elements in
      both places
    
    - Allow any letter case in the empty HTML tag
    arc committed Jun 8, 2011
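The rewrite performed by this kind of remedy can be sketched in Python as below. The element list is the set of empty elements defined by HTML 4 and is hard-coded here; the commit says the module takes its list from HTML::Tagset, which may differ. The name `close_empty_tags` is hypothetical, and attribute values containing `>` are not handled.

```python
import re

# Elements declared EMPTY in HTML 4.
HTML4_EMPTY = (
    "area", "base", "basefont", "br", "col", "frame", "hr",
    "img", "input", "isindex", "link", "meta", "param",
)

# One list drives both detection and rewriting; letter case is ignored.
EMPTY_TAG = re.compile(
    r"<(%s)(\s[^<>]*)?>" % "|".join(HTML4_EMPTY), re.IGNORECASE
)


def close_empty_tags(xml: str) -> str:
    def replace(match):
        attrs = (match.group(2) or "").rstrip()
        if attrs.endswith("/"):
            return match.group(0)  # already self-closing
        return "<%s%s/>" % (match.group(1), attrs)

    return EMPTY_TAG.sub(replace, xml)
```

Building the pattern from a single tuple avoids the mismatch described in the second bullet of the commit, where the list used for detection and the list used for fixing had drifted apart.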
  17. Remove uses of $&

    arc committed Jun 8, 2011
  18. New UnclosedHTML remedy

    This helps for feeds with a <description> containing a simplistic substring
    of a piece of HTML markup.  When the parser finds a close-tag mismatch where
    the unclosed element's name is that of a non-empty HTML element, inject an
    extra close-tag at that point.
    
    This change adds a dependency on HTML::Tagset.
    arc committed Jun 8, 2011
  19. Make remedies fully pluggable

    Previously, the LibXML driver had a large multi-way condition to decide
    which remedy (if any) could be applied to some error.  Instead, each remedy
    now has a class method which returns true iff it accepts the current driver,
    and the constructor is permitted to return nothing if the error does not
    look like one the remedy can fix.
    arc committed Jun 7, 2011
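The pluggable arrangement described in the commit above might look like this in Python. It is a sketch of the dispatch pattern only: the class names, the `handles_driver` and `new` method names, the driver identifier `"libxml"`, and the error strings are all illustrative assumptions rather than the module's actual interface.

```python
import re


class UndefinedEntity:
    PATTERN = re.compile(r"Entity '(\w+)' not defined")

    def __init__(self, name):
        self.name = name

    @classmethod
    def handles_driver(cls, driver):
        return driver == "libxml"

    @classmethod
    def new(cls, driver, message):
        # Return nothing when the error is not one this remedy can fix.
        match = cls.PATTERN.search(message)
        return cls(match.group(1)) if match else None


class BareAmpersand:
    @classmethod
    def handles_driver(cls, driver):
        return driver == "libxml"

    @classmethod
    def new(cls, driver, message):
        return cls() if "xmlParseEntityRef: no name" in message else None


# Adding a remedy means adding a class to this list; the driver itself
# no longer needs a multi-way condition.
REMEDIES = [UndefinedEntity, BareAmpersand]


def find_remedy(driver, message):
    for remedy_class in REMEDIES:
        if not remedy_class.handles_driver(driver):
            continue
        remedy = remedy_class.new(driver, message)
        if remedy is not None:
            return remedy
    return None
```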
  20. Ignore case in HTMLEntity remedy

    I have seen real-world feeds containing such things; this one sometimes
    does, for example: http://www.prospect.org/articles_rss.jsp
    arc committed Jun 7, 2011
  21. Better error message when HTMLEntity remedy fails

    Previously emitted the somewhat obscure message "can't find named HTML
    entities", when the remedy found no named entity references that it was able
    to convert to a numeric character reference.  Now emits an entity-specific
    message for the first such unrecognised entity reference.
    
    This involves changing our dependency from HTML::Entities::Numbered to plain
    old HTML::Entities.
    arc committed Jun 7, 2011
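A Python sketch of the behaviour described in the commit above: named HTML entity references are converted to numeric character references, and if nothing could be converted the failure names the first unrecognised entity. Python's standard `html.entities` table stands in for HTML::Entities, and the function name and message wording are assumptions.

```python
import re
from html.entities import name2codepoint

NAMED_REF = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")

# These five are predefined by XML itself and need no conversion.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}


def numeric_entities(xml: str) -> str:
    converted = 0
    unknown = []

    def replace(match):
        nonlocal converted
        name = match.group(1)
        if name in XML_PREDEFINED:
            return match.group(0)
        if name in name2codepoint:
            converted += 1
            return "&#%d;" % name2codepoint[name]
        unknown.append(name)
        return match.group(0)

    fixed = NAMED_REF.sub(replace, xml)
    if converted == 0:
        if unknown:
            # Entity-specific message for the first unrecognised reference.
            raise ValueError("unrecognised HTML entity: &%s;" % unknown[0])
        raise ValueError("no named HTML entity references found")
    return fixed
```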
  22. Adjust error regex for InvalidEncoding remedy

    It seems that libxml2 and/or XML::LibXML can sometimes produce different
    messages for invalid-encoding errors; perhaps the behaviour differs across
    versions.
    arc committed Jun 12, 2011
  23. Merge branch 'github'

    arc committed Jun 13, 2011
  24. gitignore build products

    arc committed Jun 13, 2011
  25. Checking in changes prior to tagging of version 0.22.

    Changelog diff is:
    
    Index: Changes
    ===================================================================
    --- Changes	(revision 31331)
    +++ Changes	(working copy)
    @@ -1,5 +1,8 @@
     Revision history for Perl extension XML::Liberal
    
    +0.22  Sat Oct  3 04:38:52 PDT 2009
    +        - BAD-euc.xml doesn't fail with recent libxml2 apparently(?)
    +
     0.21  Tue Mar 17 11:14:46 PDT 2009
             - Now it works with libxml 2.7.* or over if you use XML::LibXML 1.69_02 or over
               (Thanks to mala)
    committed with arc Oct 3, 2009