URL comparisons when retrieving metadata files #562

Closed
danbri opened this issue May 20, 2015 · 6 comments

@danbri
Contributor

danbri commented May 20, 2015

When comparing URLs for this purpose, what do we say, if anything, about URIs that are not string-identical, e.g. http://example.COM/ vs http://example.COM:80/, rel=canonical, etc.?

See http://www.w3.org/2015/05/20-csvw-irc#T14-27-34

@gkellogg
Member

This is based on Table Compatibility:

Two table descriptions are compatible if they have equivalent normalized url properties.

url is a link property, so the following is used:

If the property is a link property the value is turned into an absolute URL using the base URL.

So, they're not canonicalized, but are made absolute by joining to the base URL.
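To illustrate the "made absolute but not canonicalized" behaviour, here is a minimal sketch using Python's standard `urllib.parse.urljoin`. The function name `resolve_link` and the example URLs are assumptions made for illustration; they do not come from the spec or from any implementation.

```python
from urllib.parse import urljoin


def resolve_link(value: str, base_url: str) -> str:
    """Turn a link property value into an absolute URL by joining it to the base URL.

    Hypothetical helper: only resolves against the base, performs no canonicalization.
    """
    return urljoin(base_url, value)


# Hypothetical base URL (e.g. the location of a metadata document).
base = "http://example.com/data/metadata.json"

relative = resolve_link("tree-ops.csv", base)
already_absolute = resolve_link("http://example.COM:80/tree-ops.csv", base)

print(relative)
print(already_absolute)
```

With resolution alone, the second value keeps its upper-case host and explicit `:80`, so a plain string comparison would still treat it as different from `http://example.com/tree-ops.csv`.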

I think the text is clear, can we close this?

@danbri
Contributor Author

danbri commented May 22, 2015

The question is rather whether a modern W3C spec ought to be comparing URIs (IRIs?) as strings, rather than using some deeper check for equivalence. I'm not thinking of semweb "sameAs" in the sense of "refers to the same real-world thing", but rather of equivalences baked into the URI family of specs, e.g. the meaning of :80, case insensitivity of domain names, and perhaps some internationalization concerns.

@gkellogg
Member

Probably the best route is to simply reference [[RFC3986]] section 6 "Normalization and Comparison". There is an unbounded set of potential scheme-based normalization rules, but we might want to limit ourselves to just HTTP and HTTPS normatively.

This would affect Table Compatibility as well as Metadata Discovery (where it determines that the metadata refers to the URL used for discovery).
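As a rough sketch of what such a comparison could look like, the snippet below applies a small subset of the normalizations described in RFC 3986 section 6: lower-casing the scheme and host, dropping the default port for HTTP and HTTPS, and treating an empty path as "/". The names `normalize_url`, `urls_equivalent` and `DEFAULT_PORTS` are hypothetical. Percent-encoding normalization and dot-segment removal are deliberately left out, and userinfo is dropped, so this is not a complete implementation.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: only HTTP and HTTPS get scheme-based normalization.
DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_url(url: str) -> str:
    """Apply a partial, illustrative subset of RFC 3986 section 6 normalization."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    # urlsplit exposes the host lower-cased and without IPv6 brackets.
    host = parts.hostname or ""
    if ":" in host:
        host = f"[{host}]"  # restore brackets around an IPv6 literal
    netloc = host
    # Keep the port only when it differs from the scheme's default.
    if parts.port is not None and parts.port != DEFAULT_PORTS.get(scheme):
        netloc = f"{host}:{parts.port}"
    # An empty path is equivalent to "/" when an authority is present.
    path = parts.path or "/"
    return urlunsplit((scheme, netloc, path, parts.query, parts.fragment))


def urls_equivalent(a: str, b: str) -> bool:
    """Compare two URLs after normalization instead of as raw strings."""
    return normalize_url(a) == normalize_url(b)


print(urls_equivalent("http://example.COM/", "http://example.COM:80/"))
print(urls_equivalent("http://example.com/", "http://example.com:8080/"))
```

Under these assumptions the two URLs from the original question compare as equivalent, while a non-default port still makes the URLs distinct.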

I've marked it for discussion on the next call.

@gkellogg
Member

Note that commit 62ff8ec addresses this; it is not a PR until the issue has been discussed and resolved. In addition to my comments above, it makes link property normalization consistent with this. (I didn't consider @id normalization.)

gkellogg added a commit that referenced this issue May 26, 2015
…tadata compatibility and discovery. For consistency, `link` property normalization includes URL normalization. Note the issue marker in the section to find a better way of describing the schemes for which scheme-based normalization should/must be defined.

This should affect tests by exploring cases of link-property normalization, metadata discovery, and compatibility. #562.
@JeniT

JeniT commented May 27, 2015

I will raise this with TAG when I speak with them.

@JeniT

JeniT commented Jun 10, 2015

TAG has no objection to using [[RFC3986]] section 6 "Normalization and Comparison", so please go ahead with that PR.

@JeniT JeniT assigned gkellogg and unassigned JeniT Jun 10, 2015
gkellogg added a commit that referenced this issue Jun 15, 2015
Added test to ensure URL comparison is normalized. #545 and #562.