Fix ASCII dependency in strcpy_url and strlen_url #2535
Commit 3c630f9 partially reverted the changes from commit dd7521b because strcpy_url() had been modified unilaterally, without a matching change to strlen_url(). As a consequence, strcpy_url() again depended on ASCII encoding. This change fixes strlen_url() and strcpy_url() in parallel so that both use a common, host-encoding-independent criterion for deciding whether a URL character must be %-escaped.
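To illustrate the idea of one shared criterion, here is a minimal sketch, not the code from this pull request. The names url_needs_escaping(), escaped_len() and escaped_copy() are hypothetical, and the predicate built from the standard <ctype.h> classifiers is an assumption about what a host-encoding-independent test could look like. It assumes the default "C" locale, where on typical ASCII hosts bytes beyond 0x7F fall into none of the three classes. Handling of spaces and of the query part is deliberately left out.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical predicate: a byte needs %-escaping when the host's
   character classification says it is neither a control character,
   nor whitespace, nor a graphic character. There is no numeric
   comparison such as c >= 0x80, so nothing here assumes ASCII. */
static int url_needs_escaping(unsigned char c)
{
  return !(iscntrl(c) || isspace(c) || isgraph(c));
}

/* Length of the escaped form, excluding the terminating NUL. */
static size_t escaped_len(const char *url)
{
  size_t len = 0;
  assert(url != NULL);
  for(; *url; url++)
    len += url_needs_escaping((unsigned char)*url) ? 3 : 1;
  return len;
}

/* Copy url into out, %-escaping where the shared predicate says so.
   out must have room for escaped_len(url) + 1 bytes. */
static void escaped_copy(char *out, const char *url)
{
  assert(out != NULL && url != NULL);
  for(; *url; url++) {
    unsigned char c = (unsigned char)*url;
    if(url_needs_escaping(c)) {
      /* writes '%', two hex digits and a NUL that is overwritten
         by the next iteration or by the final terminator */
      snprintf(out, 4, "%%%02x", c);
      out += 3;
    }
    else
      *out++ = (char)c;
  }
  *out = '\0';
}
```

Because both the length computation and the copy call the same predicate, they cannot drift apart the way the two functions did after the partial revert.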
The current pull request does not change the behavior regarding control codes. Please check the condition in function
To actually include control codes in the set of characters to be escaped I can simplify that condition back to
For writing new test cases I need some guidance please. The test case test1138 apparently is the one that tests URLs with characters beyond 0x80. Can that be extended, or is a new test case in a new file required?
The documentation in the
But the file test1138 directly contains UTF-8. Do I understand it correctly that I can directly enter the control characters in the file then?
You're right and I was wrong. This should maintain the same functionality on ASCII systems and yet enable non-ASCII systems to do the right thing.
The test file format is actually totally encoding agnostic and can contain whatever is suitable, so test 1138 has some UTF-8 sequences in there to verify exactly this sort of %-encoding. This test continues to work after your patch, as the CI builds and tests here already show.
I will merge this asap.