Commits on Dec 7, 2017
  1. Update for 2.1.2 release

    willkg committed Dec 7, 2017
  2. Merge pull request #338 from willkg/html5lib-10

    willkg committed Dec 7, 2017
    Support html5lib-python 1.0.1
  3. Support html5lib-python 1.0.1

    willkg committed Dec 7, 2017
    Fixes #337
Commits on Nov 21, 2017
  1. Merge pull request #335 from hugovk/fix-typo

    willkg committed Nov 21, 2017
    Fix typo
  2. Fix typo

    hugovk committed Nov 21, 2017
Commits on Nov 11, 2017
  1. Merge pull request #332 from sblondon/master

    willkg committed Nov 11, 2017
    Insert the provided type in error message when it's a wrong one
Commits on Nov 10, 2017
Commits on Oct 2, 2017
  1. Merge pull request #329 from willkg/release-2.1.1

    willkg committed Oct 2, 2017
    Prep for 2.1.1 release
  2. Prep for 2.1.1 release

    willkg committed Oct 2, 2017
  3. Merge pull request #327 from willkg/324-unicodedecodeerror

    willkg committed Oct 2, 2017
    fixes unicodedecodeerror in "python setup.py build" with LANG=
  4. Change how setup.py opens files

    willkg committed Oct 2, 2017
    In Python 3 environments with LANG= (an empty locale), Python uses the ascii
    codec to decode files. That broke when we added a non-ascii character to
    CHANGES in the last release.
    
    Fixes #324.
  5. Add lint, docs, and LANG= build rules to tox.ini

    willkg committed Oct 2, 2017
    This adds a bunch of rules to tox.ini to improve release quality:
    
    * docs: builds the docs and verifies no errors
    * lint: lints the bleach codebase and verifies no issues
    * py{27,33,34,35,36}-build-no-lang: builds bleach with LANG=
  6. Merge pull request #326 from willkg/prep-for-2.1.1

    willkg committed Oct 2, 2017
    Prep for v2.1.1 development
  7. Prep for v2.1.1 development

    willkg committed Oct 2, 2017
  8. Merge pull request #325 from staticshock/argument-must-of

    willkg committed Oct 2, 2017
    Fix grammar in error message, "argument must of text type"
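
The "Change how setup.py opens files" commit above comes down to not relying on the locale's default codec when reading text files. The following is a minimal sketch of that idea, not the code from bleach's setup.py: the helper names `read_text` and `demo` are hypothetical, and UTF-8 is assumed as the file encoding.

```python
import io
import os
import tempfile


def read_text(path, encoding="utf-8"):
    """Read a text file with an explicit codec instead of the locale default."""
    # io.open accepts an encoding argument on both Python 2 and Python 3,
    # so the result does not depend on LANG.
    with io.open(path, encoding=encoding) as fp:
        return fp.read()


def demo():
    """Write a file containing a non-ASCII character, then read it back."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        with io.open(path, "w", encoding="utf-8") as fp:
            fp.write(u"caf\u00e9")
        return read_text(path)
    finally:
        os.remove(path)
```

Passing the encoding explicitly keeps the build independent of the environment it runs in, which is the property the `build-no-lang` tox rules are meant to check.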
Commits on Sep 28, 2017
  1. Prep for Bleach 2.1 release

    willkg committed Sep 28, 2017
  2. Merge pull request #323 from willkg/298-sanitize-characters

    willkg committed Sep 28, 2017
    Convert invisible characters to ? in Characters tokens
  3. Convert invisible characters to ? in Characters tokens

    willkg committed Sep 27, 2017
    This prevents someone from using backspace and other invisible characters to
    trick a user into copying and pasting a seemingly innocuous command that does
    something they really don't want to do.
    
    I made the replacement character a constant, figuring people can replace it if
    they want something different.
    
    Fixes #298.
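
The replacement described in "Convert invisible characters to ? in Characters tokens" can be sketched roughly as follows. This is an illustration under stated assumptions, not bleach's implementation: the names `INVISIBLE_RE`, `REPLACEMENT_CHAR`, and `replace_invisible` are hypothetical, and "invisible" is assumed to mean ASCII control characters other than tab, newline, and carriage return.

```python
import re

# Assumption: treat ASCII control characters as invisible, except
# tab (\x09), newline (\x0a), and carriage return (\x0d).
INVISIBLE_RE = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")

# Kept as a module-level constant so callers can swap in another character.
REPLACEMENT_CHAR = "?"


def replace_invisible(text, replacement=REPLACEMENT_CHAR):
    """Replace each invisible character in text with a visible marker."""
    return INVISIBLE_RE.sub(replacement, text)
```

For example, a string that hides part of a command behind backspaces becomes visibly suspicious once each backspace is shown as `?`.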
Commits on Sep 21, 2017
  1. Merge pull request #321 from willkg/294-entities-in-attributes

    willkg committed Sep 21, 2017
    Fix character entities in HTML attributes
  2. Fix character entities in HTML attributes

    willkg committed Sep 21, 2017
    This is squirrely, but what's going on is that the tokenizer no longer consumes
    entities so when the serializer goes to convert all & to &amp;, it's not a good
    thing to do.
    
    This wraps the serializer and looks for HTML attribute values, undoes the
    categorical "all & shall be &amp;!" and then redoes it taking care only to
    escape bare &.
  3. Merge pull request #320 from willkg/clarify-contexts

    willkg committed Sep 21, 2017
    Further clarify where it's safe to use bleach.clean() output
  4. Merge pull request #319 from willkg/143-entities-2

    willkg committed Sep 21, 2017
    prevent unescaping of entities (#143)
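
The "undo, then redo" step from "Fix character entities in HTML attributes" might look something like the sketch below. It is not the wrapper bleach uses: `BARE_AMP_RE` and `fix_attribute_value` are hypothetical names, and an entity is assumed to be anything of the form `&name;`, `&#123;`, or `&#x1f;` without checking the name against a list of known entities.

```python
import re

# Assumption: an "&" is bare unless it begins something shaped like an
# entity: "&", then a name or numeric reference, then ";".
BARE_AMP_RE = re.compile(
    r"&(?!(?:#[0-9]+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);)"
)


def fix_attribute_value(serialized_value):
    """Repair an attribute value in which every & was escaped to &amp;."""
    # Step 1: undo the blanket escaping done by the serializer.
    unescaped = serialized_value.replace("&amp;", "&")
    # Step 2: escape again, but only the & that do not start an entity.
    return BARE_AMP_RE.sub("&amp;", unescaped)
```

With an original attribute value of `?a=1&b=2&amp;c=3`, blanket escaping yields `?a=1&amp;b=2&amp;amp;c=3`; the two steps above bring it back to `?a=1&amp;b=2&amp;c=3`, so the entity that was already present is not escaped twice.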
Commits on Sep 20, 2017
  1. Rework entity fix to not escape all &

    willkg committed Sep 19, 2017
    This redoes the entity fix such that it doesn't escape all &. What this does is
    add a "sanitize_characters" pass for Characters tokens which looks for entities
    within the token data and extracts them into Entity tokens.
    
    Handling entities this way prevents them from being expanded during
    tokenization, but keeps them as distinct entities during serialization so they
    don't get escaped.
    
    This handles the &curren problem by requiring that an entity always start with a
    & and end with a ;. If it doesn't, then it's treated like characters and all &
    are escaped.
  2. Merge manual regression test in

    willkg committed Jul 28, 2017
    We don't have the problem with \r anymore, so we can merge this in with the
    other regression tests.
  3. Prevent HTMLTokenizer from unescaping entities

    willkg committed Jul 27, 2017
    This overrides the HTMLTokenizer's .consumeEntity() method such that it doesn't
    convert character entities.
    
    This also fixes some other escaping/unescaping oddities so that the output of
    bleach.clean() is more correct in regards to intended behavior.
    
    One thing this breaks is the idempotent property for bleach.clean(). It now
    escapes text more correctly, and escaping is not an idempotent transform, so
    bleach.clean() is no longer idempotent.
    
    For example, bleach.clean() can't differentiate between a user talking about
    code and saying this:
    
       I like my html wrapped in <b>!
    
    and this:
    
       I like my html escaped like this &lt;b&gt;!
    
    I'm not sure why we thought bleach.clean() could ever be correct and idempotent.
    Seems like that was an error.
  4. Merge pull request #318 from willkg/more-python3-fixes

    willkg committed Sep 20, 2017
    More Python 3 fixes for tests_websites
  5. Merge pull request #317 from willkg/fix-website-tests

    willkg committed Sep 20, 2017
    Fix test_websites to work with Python 3
  6. Merge pull request #315 from willkg/changes-clarification-2

    willkg committed Sep 20, 2017
    Change "Security issues" to "Security fixes"
  7. Change "Security issues" to "Security fixes"

    willkg committed Sep 20, 2017
    This is clearer regarding the intent of that block.
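
The pass described in "Rework entity fix to not escape all &" above, which pulls entities out of character data into their own tokens, can be illustrated with a small stand-alone sketch. The token dictionaries and the functions `split_entities` and `serialize` are hypothetical stand-ins rather than bleach or html5lib APIs, and the sketch only checks that an entity starts with `&` and ends with `;`, not that its name is a known entity.

```python
import re

# Assumption: an entity must start with "&" and end with ";".
ENTITY_RE = re.compile(r"&(#[0-9]+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);")


def split_entities(text):
    """Split text into Characters tokens and Entity tokens."""
    tokens = []
    pos = 0
    for match in ENTITY_RE.finditer(text):
        if match.start() > pos:
            # Plain text between the previous entity and this one.
            tokens.append({"type": "Characters", "data": text[pos:match.start()]})
        tokens.append({"type": "Entity", "name": match.group(1)})
        pos = match.end()
    if pos < len(text):
        tokens.append({"type": "Characters", "data": text[pos:]})
    return tokens


def serialize(tokens):
    """Emit entities unchanged and escape every & left in character data."""
    parts = []
    for token in tokens:
        if token["type"] == "Entity":
            parts.append("&" + token["name"] + ";")
        else:
            parts.append(token["data"].replace("&", "&amp;"))
    return "".join(parts)
```

Because `&curren` has no closing `;`, it stays inside a Characters token and its `&` is escaped, while `&amp;` passes through as an Entity token.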