Skip to content
This repository has been archived by the owner on Jul 4, 2023. It is now read-only.

test-bot: better xml character filtering #50092

Closed
wants to merge 1 commit into from
Closed

test-bot: better xml character filtering #50092

wants to merge 1 commit into from

Conversation

xu-cheng
Copy link
Member

@MikeMcQuaid
Copy link
Member

@xu-cheng 👏 👍. Could you try to add a PR after this is merged to e.g. bump the revision on one of the formulae that had failures before?

@xu-cheng xu-cheng closed this in 946c6de Mar 14, 2016
@xu-cheng xu-cheng deleted the test-bot-xml branch March 14, 2016 11:57
xu-cheng added a commit to Homebrew/brew that referenced this pull request Mar 14, 2016
@xu-cheng
Copy link
Member Author

@xu-cheng
Copy link
Member Author

Revert in 5a850cf because weird encoding problem.

@apjanke
Copy link
Contributor

apjanke commented Mar 17, 2016

Maybe the problem was the Unicode escape Ruby string syntax.

output = step.output.gsub(/[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\u10000-\u10FFFF]/, "")

If I understand correctly, the plain \u takes exactly 4 hex digits. Other counts of digits need to be enclosed by {...}. So, something like this?

output = step.output.gsub(/[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\u{10000}-\u{10FFFF}]/, "")

@MikeMcQuaid
Copy link
Member

@apjanke Could you try one with the new syntax? Might be worth adding a unit test (not for the test bot) to verify this is behaving as expected and doesn't blow up on 1.8.

@apjanke apjanke self-assigned this Mar 28, 2016
@apjanke
Copy link
Contributor

apjanke commented Mar 28, 2016

Sure. I'll take a hack at this.

@apjanke
Copy link
Contributor

apjanke commented Apr 4, 2016

I think I have a fix for this – apjanke/brew@eb8db20. I think it's not just the character replacement syntax, but needs an adjustment to the bytewise output buffering, too.

I don't know how to write a unit test for it, since test-bot is already a testing scaffold, and (I think) is dependent on the state of your local git repos.

Works locally for me on 10.9 and 10.11, as verified by manual inspection and running xmllint against the test-bot --junit output files.

[~/tmp/test-bot]
$ xmllint --noout brew-test-bot-orig.xml
brew-test-bot-orig.xml:4784: parser error : Input is not proper UTF-8, indicate encoding !
Bytes: 0xF1 0x63 0x65 0x73
  inflating: akka-2.4.2/lib/akka/i have sp+�ces.jar
                                           ^
[✘ ~/tmp/test-bot]
$ xmllint --noout brew-test-bot-after.xml
[~/tmp/test-bot]
$

It runs without error under HOMEBREW_RUBY="1.8.7" for me too, though the output is still mis-encoded. (Maybe Iconv is available on the test bot machines, which would help with this?) At least it's not a regression, and I don't think it matters as much since we're not running 1.8.7 on Jenkins.

Should I just push this and see how it goes? Might take a while to come up with a unit test, since the test-bot bits seem hard to isolate and test individually in a meaningful way, without some refactoring.

@apjanke
Copy link
Contributor

apjanke commented Apr 4, 2016

Oh, never mind about the Ruby 1.8.7 breakage. I had accidentally switched back to the master branch when running the 1.8.7 test. It seems like the fix works on 1.8.7 as well when I run test-bot with the right code.

[~/tmp/test-bot]
$ xmllint --noout  brew-test-bot-ruby187-before.xml
brew-test-bot-ruby187-before.xml:4778: parser error : Input is not proper UTF-8, indicate encoding !
Bytes: 0xF1 0x63 0x65 0x73
  inflating: akka-2.4.2/lib/akka/i have sp+�ces.jar
                                           ^
[✘ ~/tmp/test-bot]
$ xmllint --noout  brew-test-bot-ruby187-after.xml
[~/tmp/test-bot]
$

I think this suggests that it was actually the bytewise buffering that was the problem here, and not necessarily the invalid-XML-character replacement pattern used later. (Though that should definitely be altered too.)

@Homebrew Homebrew locked and limited conversation to collaborators Jul 10, 2016
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants