Test cases for IRI and URI processing



A collection of URI and IRI strings useful for deterministic testing 
of user-agents or other URL parsers.

1) iris.xml
This is the main file of test cases.  It contains the Webkit tests in XML
form, combined with test cases from other sources [1][2].
Some tests include the expected results of parsing.

TODO: Expected results are still TBD.

NOTE:  I converted every \x byte escape to %, so \xC2 becomes %C2.
Some manual fixups are still required afterward, for example:
http://www.foo。" + "bar.com should become http://www.foo。bar.com
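
The conversion described in the note could be sketched as follows.  This is
only an illustration, not the script that was actually used: the function
names hex_to_percent and join_concatenation are hypothetical, and the second
helper assumes the leftover debris is always the literal sequence `" + "`
from a string literal that was split in the original test source.

```python
import re

# Matches a backslash-x escape followed by exactly two hex digits, e.g. \xC2.
_HEX_ESCAPE = re.compile(r"\\x([0-9A-Fa-f]{2})")


def hex_to_percent(text):
    """Rewrite every \\xNN escape in text as %NN (hypothetical helper)."""
    return _HEX_ESCAPE.sub(r"%\1", text)


def join_concatenation(text):
    """Remove a leftover '" + "' from a split string literal (assumed form)."""
    return text.replace('" + "', "")


print(hex_to_percent(r"http://example.com/\xC2\xA0"))
print(join_concatenation('http://www.foo。" + "bar.com'))
```

Other kinds of fixups may still need to be made by hand.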

In addition a plain text file was produced:

iris.txt - An attempt to store test URIs/IRIs in a plain text file,
one per line.  Expected results are not stored.  NOTE: Webkit had a bunch
of tests that set a <base href="">, but this file does not store any
"base" information.  In those cases only the "reference" is stored.

Supporting files (Removed from the repo to reduce confusion): 

miris.xml - More test cases that weren't in Webkit, modified with 
hostnames to help correlate URL parsing across DNS queries, HTTP requests, 
and the DOM.

tests.xml - DEPRECATED.  The original format borrowed from Julian [3].

2) TODOs

a) Merge the old tests from tests.xml into the new iris.xml format.

b) Expected results are questionable right now and will need more work.
Do not rely on them.

c) BUGS? I've only spot-checked the results; more review would be nice.
- The original byte escapes \xNN have been converted to %NN because
they generally represented illegal UTF-8 byte sequences that won't transport
well: they're illegal in both XML and a UTF-8 encoded text file.
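
To illustrate why such bytes cannot be stored raw, here is a small sketch
using only the Python standard library.  The helper name is_valid_utf8 and
the sample byte are chosen for illustration; they are not taken from the
test files.

```python
from urllib.parse import quote_from_bytes, unquote_to_bytes


def is_valid_utf8(raw):
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# 0xC2 starts a two-byte UTF-8 sequence, so on its own it is malformed.
lone_lead_byte = b"\xc2"
print(is_valid_utf8(lone_lead_byte))

# Percent-encoding turns the byte into plain ASCII that survives in XML
# and in a UTF-8 text file, and it can be decoded back to the same byte.
escaped = quote_from_bytes(lone_lead_byte)
print(escaped)
print(unquote_to_bytes(escaped) == lone_lead_byte)
```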

3) How to use iris.xml

To generate test URIs, use the following steps.  The "testUri" is the
end result string used for testing URI or IRI parsing.  In some cases
an HTML base must also be set, e.g. <base href="xyz" />, so that
resolution of the testUri against the base can be tested.

foreach tc:test in tc:group
  if tc:charset exists
    then set meta charset = tc:charset // Test document charset
  if tc:test has tc:scheme && tc:ref
    then testUri is tc:scheme appended with tc:ref
  else if tc:test has tc:scheme // meaning we're just testing the scheme as the URI
    then testUri is tc:scheme
  if tc:test has tc:base && tc:ref && you don't care to test base resolution
    then testUri is tc:base appended with tc:ref
  else if tc:test has tc:base && tc:ref // you do want to test base resolution
    then set html base ref = tc:base
    and testUri is tc:ref
  if tc:test has tc:uri && tc:ref
    then testUri is tc:uri appended with tc:ref
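
The steps above can be sketched in Python as follows.  This is a minimal
illustration under stated assumptions, not a reference implementation: it
assumes each tc:test has already been read into a plain dict keyed by the
element names without the tc: prefix ("scheme", "ref", "base", "uri"), and
the function name build_test_uri and the resolve_base flag are hypothetical.
Handling of tc:charset is left to the caller.

```python
def build_test_uri(test, resolve_base=True):
    """Return (test_uri, base_href) for one test case.

    test is assumed to be a dict with optional keys "scheme", "ref",
    "base" and "uri".  base_href is None unless the caller should emit
    <base href="..."> before using test_uri.
    """
    scheme = test.get("scheme")
    ref = test.get("ref")
    base = test.get("base")
    uri = test.get("uri")

    if scheme is not None and ref is not None:
        return scheme + ref, None
    if scheme is not None:
        # Only a scheme is present: the scheme itself is the test URI.
        return scheme, None
    if base is not None and ref is not None:
        if resolve_base:
            # Let the parser under test resolve ref against the base.
            return ref, base
        # Skip base resolution and test the plain concatenation.
        return base + ref, None
    if uri is not None and ref is not None:
        return uri + ref, None
    # None of the combinations described above applies.
    return None, None


print(build_test_uri({"base": "http://example.org/a/", "ref": "../b"}))
print(build_test_uri({"base": "http://example.org/a/", "ref": "../b"},
                     resolve_base=False))
```

Returning the base separately keeps the choice of whether to test base
resolution with the caller, matching the two base branches in the steps.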

4) Resources

[1] <http://trac.webkit.org/browser/trunk/LayoutTests/fast/url/script-tests>
[2] <http://greenbytes.de/tech/tc/uris/>
[3] <http://greenbytes.de/tech/webdav/urldecomp.xml>