Resolving a relative URI Reference against a base with only an authority does not work #81

Closed
@aucampia

Based on IETF RFC 3986, "Uniform Resource Identifier (URI): Generic Syntax", I would expect the following test to pass:

from rfc3986 import uri_reference  # type: ignore[import]
from rfc3986.uri import URIReference  # type: ignore[import]


def test_schema_only_base() -> None:
    relative_uri: URIReference = uri_reference("john.smith")
    base_uri: URIReference = uri_reference("example:")
    resolved_uri: URIReference = relative_uri.resolve_with(base_uri, strict=True)
    assert resolved_uri.unsplit() == "example:/john.smith"

This is because of:

5.2.1. Pre-parse the Base URI

Note that only the scheme component is required to be
present in a base URI; the other components may be empty or
undefined.

5.2.2. Transform References

# Branches that won't execute elided with [...]
if defined(R.scheme) then
   [...]
else
   if defined(R.authority) then
      [...]
   else
      if (R.path == "") then
         [...]
      else
         if (R.path starts-with "/") then
            [...]
         else
            T.path = merge(Base.path, R.path);
            T.path = remove_dot_segments(T.path);
         endif;
         T.query = R.query;
      endif;
      T.authority = Base.authority;
   endif;
   T.scheme = Base.scheme;
endif;

T.fragment = R.fragment;

5.2.3. Merge Paths

The pseudocode above refers to a "merge" routine for merging a
relative-path reference with the path of the base URI. This is
accomplished as follows:

  • If the base URI has a defined authority component and an empty
    path, then return a string consisting of "/" concatenated with the
    reference's path; otherwise,
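
For reference, the "merge" routine can be sketched in a few lines of Python. This is only an illustration of RFC 3986 section 5.2.3, not code from the rfc3986 library; the function name and signature are made up for the example. The first branch is the bullet quoted above, and the second branch follows the RFC's remaining bullet, which the quote cuts off after "otherwise,".

```python
from typing import Optional


def merge(base_authority: Optional[str], base_path: Optional[str], ref_path: str) -> str:
    """Illustrative sketch of the RFC 3986 section 5.2.3 "merge" routine."""
    # Treat an undefined base path like an empty one (assumption of this sketch).
    base_path = base_path or ""
    # First bullet: base has a defined authority and an empty path.
    if base_authority is not None and base_path == "":
        return "/" + ref_path
    # Remaining bullet in the RFC: keep the base path up to and including its
    # right-most "/", then append the reference path. If the base path has no
    # "/", rfind() returns -1 and the whole base path is dropped.
    return base_path[: base_path.rfind("/") + 1] + ref_path


print(merge("example.com", "", "john.smith"))
print(merge("a", "/b/c/d;p", "g"))
print(merge(None, None, "john.smith"))
```

In this sketch the leading "/" is only added when the base authority is defined, so it may be worth checking which branch applies to a base such as "example:" that has a scheme but no authority.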

So according to this, my understanding is that R, Base and T should be as follows:

R = URIReference(scheme=None, authority=None, path='john.smith', query=None, fragment=None)
Base = URIReference(scheme='example', authority=None, path=None, query=None, fragment=None)
T = URIReference(scheme='example', authority=None, path='/john.smith', query=None, fragment=None)

And T, when unsplit, will be "example:/john.smith". I may of course be missing something though, and if I am please do help me out here in seeing what it is.
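
The "unsplit" step corresponds to the component recomposition described in RFC 3986 section 5.3. A minimal standalone sketch of that recomposition is below; `recompose` is a hypothetical helper written for this illustration and is not the library's `unsplit()` implementation.

```python
from typing import Optional


def recompose(
    scheme: Optional[str],
    authority: Optional[str],
    path: Optional[str],
    query: Optional[str],
    fragment: Optional[str],
) -> str:
    """Illustrative sketch of RFC 3986 section 5.3 component recomposition."""
    result = ""
    if scheme is not None:
        result += scheme + ":"
    if authority is not None:
        result += "//" + authority
    # The path is always appended; an undefined path is treated as empty here.
    result += path or ""
    if query is not None:
        result += "?" + query
    if fragment is not None:
        result += "#" + fragment
    return result


# The T components listed above recompose to the expected string.
print(recompose("example", None, "/john.smith", None, None))
```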

When I run the actual test code it fails, so clearly the library's current implementation of RFC 3986 does not share my interpretation:

$ pytest test_rfc3986.py -rA --log-level DEBUG
============================================================================ test session starts ============================================================================
platform linux -- Python 3.10.1, pytest-6.2.5, py-1.11.0, pluggy-1.0.0
rootdir: /home/iwana/sw/d/gitlab.com/aucampia/scraps
plugins: asyncio-0.16.0
collected 1 item                                                                                                                                                            

test_rfc3986.py F                                                                                                                                                     [100%]

================================================================================= FAILURES ==================================================================================
___________________________________________________________________________ test_schema_only_base ___________________________________________________________________________

    def test_schema_only_base() -> None:
        relative_uri: URIReference = uri_reference("john.smith")
        base_uri: URIReference = uri_reference("example:")
>       resolved_uri: URIReference = relative_uri.resolve_with(base_uri, strict=True)

test_rfc3986.py:8: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = URIReference(scheme=None, authority=None, path='john.smith', query=None, fragment=None)
base_uri = URIReference(scheme='example', authority=None, path=None, query=None, fragment=None), strict = True

    def resolve_with(self, base_uri, strict=False):
        """Use an absolute URI Reference to resolve this relative reference.
    
        Assuming this is a relative reference that you would like to resolve,
        use the provided base URI to resolve it.
    
        See http://tools.ietf.org/html/rfc3986#section-5 for more information.
    
        :param base_uri: Either a string or URIReference. It must be an
            absolute URI or it will raise an exception.
        :returns: A new URIReference which is the result of resolving this
            reference using ``base_uri``.
        :rtype: :class:`URIReference`
        :raises rfc3986.exceptions.ResolutionError:
            If the ``base_uri`` is not an absolute URI.
        """
        if not isinstance(base_uri, URIMixin):
            base_uri = type(self).from_string(base_uri)
    
        if not base_uri.is_absolute():
>           raise exc.ResolutionError(base_uri)
E           rfc3986.exceptions.ResolutionError: example: is not an absolute URI.

/home/iwana/.local/lib/python3.10/site-packages/rfc3986/_mixin.py:266: ResolutionError
========================================================================== short test summary info ==========================================================================
FAILED test_rfc3986.py::test_schema_only_base - rfc3986.exceptions.ResolutionError: example: is not an absolute URI.
============================================================================= 1 failed in 0.10s =============================================================================
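
The error comes from the `is_absolute()` check, not from the merge step. RFC 3986 section 4.3 defines `absolute-URI = scheme ":" hier-part [ "?" query ]`, and `hier-part` may be an empty path, so under that grammar a scheme followed by a colon appears to qualify. The sketch below is a deliberately simplified check written for this illustration (it only verifies the scheme syntax and the absence of a fragment); the names are hypothetical and it is not the matcher the library uses.

```python
import re

# scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." ), then ":", then anything
# that does not contain a fragment delimiter. Characters in the hier-part and
# query are NOT validated in this simplified sketch.
_ABSOLUTE_SKETCH = re.compile(r"[A-Za-z][A-Za-z0-9+.\-]*:[^#]*")


def looks_absolute(uri: str) -> bool:
    """Return True if `uri` has a scheme and no fragment (simplified check)."""
    return _ABSOLUTE_SKETCH.fullmatch(uri) is not None


print(looks_absolute("example:"))
print(looks_absolute("john.smith"))
print(looks_absolute("example:/john.smith#frag"))
```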
