Commits on May 28, 2012
  1. Note that the encoding-related code is revamped

    kurtmckee committed May 28, 2012
    git-svn-id: http://feedparser.googlecode.com/svn/trunk@738 73d2b349-402e-0410-baf4-070fd12ab5b7
  2. Upgrade gb2312 to gb18030 in the XML declaration

    kurtmckee committed May 28, 2012
    Fixes issue 346.
    Thanks to Google user flytwokites for reporting this issue!
    
    git-svn-id: http://feedparser.googlecode.com/svn/trunk@737 73d2b349-402e-0410-baf4-070fd12ab5b7
  3. Demonstrate issue 346

    kurtmckee committed May 28, 2012
    git-svn-id: http://feedparser.googlecode.com/svn/trunk@736 73d2b349-402e-0410-baf4-070fd12ab5b7
  4. Consolidate `toUTF8()` into `convert_to_utf8()`

    kurtmckee committed May 28, 2012
    git-svn-id: http://feedparser.googlecode.com/svn/trunk@735 73d2b349-402e-0410-baf4-070fd12ab5b7
  5. Consolidate `_getCharacterEncoding()` into `convert_to_utf8()`

    kurtmckee committed May 28, 2012
    This additionally requires modifying `_toUTF8()`, since the BOM
    will have already been stripped from the data when it's passed
    to `_toUTF8()`. Changing the behavior of `_toUTF8()` forces the
    BOM-with-invalid-characters unit test to be updated as well.
    
    git-svn-id: http://feedparser.googlecode.com/svn/trunk@734 73d2b349-402e-0410-baf4-070fd12ab5b7
  6. Move HTTP charset detection to clean up code flow

    kurtmckee committed May 28, 2012
    The code now follows a clearer BOM -> XML -> HTTP -> RFC 3023 flow.
    
    git-svn-id: http://feedparser.googlecode.com/svn/trunk@733 73d2b349-402e-0410-baf4-070fd12ab5b7
  7. Add/modify comments in `_getCharacterEncoding()`

    kurtmckee committed May 28, 2012
    git-svn-id: http://feedparser.googlecode.com/svn/trunk@732 73d2b349-402e-0410-baf4-070fd12ab5b7
  8. Make encoding variable names consistent and clear

    kurtmckee committed May 28, 2012
    git-svn-id: http://feedparser.googlecode.com/svn/trunk@731 73d2b349-402e-0410-baf4-070fd12ab5b7
  9. Begin simplifying character encoding code

    kurtmckee committed May 28, 2012
    git-svn-id: http://feedparser.googlecode.com/svn/trunk@730 73d2b349-402e-0410-baf4-070fd12ab5b7
  10. Rename `_stripDoctype()` to `replace_doctype()`

    kurtmckee committed May 28, 2012
    Bonus: improve the function docs and make the code more PEP-8 compliant.
    
    git-svn-id: http://feedparser.googlecode.com/svn/trunk@729 73d2b349-402e-0410-baf4-070fd12ab5b7
  11. Consolidate the character encoding conversion code

    kurtmckee committed May 28, 2012
    git-svn-id: http://feedparser.googlecode.com/svn/trunk@728 73d2b349-402e-0410-baf4-070fd12ab5b7
  12. Stop processing early if the server sent HTTP 304

    kurtmckee committed May 28, 2012
    git-svn-id: http://feedparser.googlecode.com/svn/trunk@727 73d2b349-402e-0410-baf4-070fd12ab5b7
  13. Condense the `if` statements into the `for` loop

    kurtmckee committed May 28, 2012
    git-svn-id: http://feedparser.googlecode.com/svn/trunk@726 73d2b349-402e-0410-baf4-070fd12ab5b7
  14. Remove the redundant `if data is None` check

    kurtmckee committed May 28, 2012
    git-svn-id: http://feedparser.googlecode.com/svn/trunk@725 73d2b349-402e-0410-baf4-070fd12ab5b7
  15. Reflow, and add comments to, `_stripDoctype()`

    kurtmckee committed May 28, 2012
    git-svn-id: http://feedparser.googlecode.com/svn/trunk@724 73d2b349-402e-0410-baf4-070fd12ab5b7
  16. Compile the regular expressions at import

    kurtmckee committed May 28, 2012
    git-svn-id: http://feedparser.googlecode.com/svn/trunk@723 73d2b349-402e-0410-baf4-070fd12ab5b7
  17. Make `_getCharacterEncoding()` PEP8 compliant

    kurtmckee committed May 28, 2012
    Additionally, add 'u32' and 'utf32' to the list of declared XML
    encodings to match and override with the sniffed XML encoding.
    
    git-svn-id: http://feedparser.googlecode.com/svn/trunk@722 73d2b349-402e-0410-baf4-070fd12ab5b7
  18. Save the compiled regular expressions

    kurtmckee committed May 28, 2012
    git-svn-id: http://feedparser.googlecode.com/svn/trunk@721 73d2b349-402e-0410-baf4-070fd12ab5b7
  19. Improve the docstring and comments in `_toUTF8()`

    kurtmckee committed May 28, 2012
    git-svn-id: http://feedparser.googlecode.com/svn/trunk@720 73d2b349-402e-0410-baf4-070fd12ab5b7
  20. Simplify and correct UTF-16 BOM sniffing behavior

    kurtmckee committed May 28, 2012
    git-svn-id: http://feedparser.googlecode.com/svn/trunk@719 73d2b349-402e-0410-baf4-070fd12ab5b7
  21. Use the BOMs defined in the `codecs` module

    kurtmckee committed May 28, 2012
    git-svn-id: http://feedparser.googlecode.com/svn/trunk@718 73d2b349-402e-0410-baf4-070fd12ab5b7
  22. Rearrange the order of the BOM checks

    kurtmckee committed May 28, 2012
    git-svn-id: http://feedparser.googlecode.com/svn/trunk@717 73d2b349-402e-0410-baf4-070fd12ab5b7
  23. Rearrange the encoding sniffing

    kurtmckee committed May 28, 2012
    Bonus: shorter expressions and improved readability!
    
    git-svn-id: http://feedparser.googlecode.com/svn/trunk@716 73d2b349-402e-0410-baf4-070fd12ab5b7
  24. Use the BOMs defined in the `codecs` module

    kurtmckee committed May 28, 2012
    git-svn-id: http://feedparser.googlecode.com/svn/trunk@715 73d2b349-402e-0410-baf4-070fd12ab5b7
  25. `_UTF32_AVAILABLE` is no longer necessary

    kurtmckee committed May 28, 2012
    It's been replaced by adding LookupError exception handling.
    
    git-svn-id: http://feedparser.googlecode.com/svn/trunk@714 73d2b349-402e-0410-baf4-070fd12ab5b7
  26. Minimize the amount of code in the `try` statement

    kurtmckee committed May 28, 2012
    git-svn-id: http://feedparser.googlecode.com/svn/trunk@713 73d2b349-402e-0410-baf4-070fd12ab5b7
  27. Precompute the character encoding BOMs and markers

    kurtmckee committed May 28, 2012
    git-svn-id: http://feedparser.googlecode.com/svn/trunk@712 73d2b349-402e-0410-baf4-070fd12ab5b7
  28. Inline the `_parseHTTPContentType()` code

    kurtmckee committed May 28, 2012
    git-svn-id: http://feedparser.googlecode.com/svn/trunk@711 73d2b349-402e-0410-baf4-070fd12ab5b7
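
Several of the May 28 commits above concern BOM sniffing: using the BOMs defined in the `codecs` module, rearranging the order of the BOM checks, and replacing `_UTF32_AVAILABLE` with `LookupError` handling. The following is a minimal sketch of how those three ideas could fit together. It is not feedparser's actual code: the names `BOMS`, `sniff_bom()`, and `decode_with_candidates()` are hypothetical, and the check order shown is an assumption based on the fact that the UTF-32 LE BOM begins with the same two bytes as the UTF-16 LE BOM.

```python
import codecs

# Hypothetical table for this sketch. UTF-32 is checked before UTF-16
# because codecs.BOM_UTF32_LE starts with codecs.BOM_UTF16_LE; testing
# UTF-16 first would misidentify UTF-32 LE data.
BOMS = (
    (codecs.BOM_UTF32_BE, 'utf-32-be'),
    (codecs.BOM_UTF32_LE, 'utf-32-le'),
    (codecs.BOM_UTF8, 'utf-8'),
    (codecs.BOM_UTF16_BE, 'utf-16-be'),
    (codecs.BOM_UTF16_LE, 'utf-16-le'),
)


def sniff_bom(data):
    """Return (encoding, data_without_bom); encoding is None if no BOM."""
    for bom, encoding in BOMS:
        if data.startswith(bom):
            return encoding, data[len(bom):]
    return None, data


def decode_with_candidates(data, candidates):
    """Try each candidate encoding in order; return (text, encoding)."""
    for encoding in candidates:
        if not encoding:
            continue
        try:
            return data.decode(encoding), encoding
        except (LookupError, UnicodeDecodeError):
            # LookupError: this interpreter does not provide the codec.
            # UnicodeDecodeError: the bytes are not valid in this encoding.
            continue
    raise ValueError('no candidate encoding could decode the data')
```

Catching `LookupError` at the point of decoding means no availability flag has to be computed up front: a codec the interpreter lacks is skipped like any other failed candidate. The candidate list could be filled in the BOM -> XML -> HTTP -> RFC 3023 order mentioned in commit 6, though how feedparser builds that list is not shown here.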
Commits on May 23, 2012
  1. Catch SAXException instead of SAXParseException

    kurtmckee committed May 23, 2012
    Fixes issue 352.
    Thanks to William J. Bowman for reporting this!
    
    Additionally reported at:
    
    http://flexget.com/ticket/1446
    https://bugs.launchpad.net/lxml/+bug/1001301
    
    git-svn-id: http://feedparser.googlecode.com/svn/trunk@710 73d2b349-402e-0410-baf4-070fd12ab5b7
  2. Demonstrate issue 352

    kurtmckee committed May 23, 2012
    git-svn-id: http://feedparser.googlecode.com/svn/trunk@709 73d2b349-402e-0410-baf4-070fd12ab5b7
Commits on May 10, 2012
  1. Update HTTP Last-Modified example in docs

    kurtmckee committed May 10, 2012
    Fixes issue 350.
    Thanks to Google user buburno for reporting this!
    
    git-svn-id: http://feedparser.googlecode.com/svn/trunk@708 73d2b349-402e-0410-baf4-070fd12ab5b7
Commits on May 3, 2012
  1. Version bump to 5.1.2

    kurtmckee committed May 3, 2012
    git-svn-id: http://feedparser.googlecode.com/svn/trunk@706 73d2b349-402e-0410-baf4-070fd12ab5b7
  2. Remove references to the `date` key

    kurtmckee committed May 3, 2012
    This was the only place that that key was referenced; it currently
    exists in the feedparser code only as a mapping, and will be removed
    in a future release to reduce the number of reserved keys.
    
    Also, clean up some copypasta mistakes with the example pubDates.
    
    git-svn-id: http://feedparser.googlecode.com/svn/trunk@705 73d2b349-402e-0410-baf4-070fd12ab5b7
  3. Take note of the ENTITY declaration security fix

    kurtmckee committed May 3, 2012
    git-svn-id: http://feedparser.googlecode.com/svn/trunk@704 73d2b349-402e-0410-baf4-070fd12ab5b7
  4. Prevent ENTITY declarations from hiding in encoded documents

    kurtmckee committed May 3, 2012
    ENTITY declarations can be used to create denials of service through
    exponential memory consumption. The previous behavior allowed such
    declarations to hide in non-ASCII-compatible encoded documents. Now
    feedparser will normalize the encoding first and then replace the
    DOCTYPE and ENTITY declarations.
    
    git-svn-id: http://feedparser.googlecode.com/svn/trunk@703 73d2b349-402e-0410-baf4-070fd12ab5b7
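
The last commit above explains why the order of operations matters: a byte-level search for `<!ENTITY` finds nothing in a document stored in a non-ASCII-compatible encoding such as UTF-16. A minimal sketch of the "normalize the encoding first, then strip" idea follows. The pattern `ENTITY_DECL` and the function `strip_entities()` are hypothetical and deliberately simplified (the pattern does not handle entity values that contain `>`, and the sketch leaves the DOCTYPE itself in place); feedparser's real expressions may differ.

```python
import re

# Hypothetical, simplified pattern for this sketch only.
ENTITY_DECL = re.compile(r'<!ENTITY\s[^>]*>')


def strip_entities(raw, encoding):
    """Decode the document first, then drop ENTITY declarations.

    Returns the cleaned document as UTF-8 bytes.
    """
    text = raw.decode(encoding)
    return ENTITY_DECL.sub('', text).encode('utf-8')


doc = '<!DOCTYPE r [<!ENTITY a "aaaa">]><r>&a;</r>'
raw = doc.encode('utf-16-le')

# In UTF-16 every ASCII character is followed by a NUL byte, so the
# ASCII byte sequence never appears and a byte-level check misses it.
hidden = b'<!ENTITY' not in raw

cleaned = strip_entities(raw, 'utf-16-le')
```

Here `hidden` is `True`, showing that the declaration is invisible to a check on the raw bytes, while `cleaned` no longer contains the declaration because the substitution runs on decoded text.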