review used schema.org terms for existence and relevance #271

Closed
VladimirAlexiev opened this issue Jan 25, 2022 · 33 comments

@VladimirAlexiev
Contributor

The first example in #270 has two more problems:

  • https://schema.org/Purchase doesn't exist,
  • even if it did, I don't see how a Purchase can be semantically equivalent to relatedDocuments: IMHO a Purchase is an event
@OR13
Collaborator

OR13 commented May 10, 2022

TODO: find a better IRI for Purchase... or define it here.

@OR13
Collaborator

OR13 commented May 10, 2022

@nissimsan
Collaborator

@VladimirAlexiev , this is defined by us, not schema.org.
#271
Closing.

@VladimirAlexiev
Contributor Author

Let me quote from https://github.com/w3c-ccg/traceability-vocab/blob/main/docs/openapi/components/schemas/common/BillOfLading.yml#L38

    $linkedData:
      term: relatedDocuments
      '@id': https://schema.org/Purchase

@nissimsan please reopen

@OR13
Collaborator

OR13 commented Aug 17, 2022

No, do not reopen... create a new issue with a clear and actionable description something like:

title: Update broken IRIs in BillOfLading RDF Class
description: This RDF Class contains links to terms that are incorrect (include examples).

@VladimirAlexiev
Contributor Author

@OR13 Please read the issue title. I gave just one example.
Here's an idea: grep all schema.org terms from all schemas, and try to resolve those URLs to confirm their existence.
Reminder: schema.org semantic URLs use http not https, even though the schema.org site redirects to https.
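
A minimal Python sketch of the "grep all schema.org terms" idea above. It only extracts term IRIs from text and records which scheme each one uses; it does not contact the network. The regular expression, the function name `find_schema_org_terms`, and the second IRI in the sample are assumptions made for illustration, not part of the traceability-vocab tooling.

```python
import re

# Matches http:// or https:// schema.org IRIs, with an optional "www." prefix.
SCHEMA_ORG_IRI = re.compile(
    r"(https?)://(?:www\.)?schema\.org/([A-Za-z][A-Za-z0-9]*)"
)


def find_schema_org_terms(text):
    """Return a sorted list of unique (scheme, term) pairs found in text."""
    return sorted({(m.group(1), m.group(2)) for m in SCHEMA_ORG_IRI.finditer(text)})


# The first IRI is the one quoted in this issue; the second is a made-up
# line added only to show an http: reference being picked up as well.
sample = """
$linkedData:
  term: relatedDocuments
  '@id': https://schema.org/Purchase
  '@type': http://schema.org/Person
"""

for scheme, term in find_schema_org_terms(sample):
    print(f"{scheme}://schema.org/{term}")
```

Keeping the scheme next to each term makes it easy to see which references still use http: before deciding whether to rewrite them.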

@nissimsan
Collaborator

@VladimirAlexiev, can you elaborate on the latter, pls?
It sure looks like https:
image

@VladimirAlexiev
Contributor Author

@OR13
I guess they changed it recently, despite objections about changing what are supposed to be permanent URLs.

But could you please check in the ontology (JSONLD or Turtle) to make sure?

@TallTed
Contributor

TallTed commented Aug 23, 2022

Apparently, the powers that be at schema.org haven't read the CoolURIs article, never mind that @danbri has been involved in the worlds of Linked Data and semantic webs nigh unto forever....

FWIW, generally, if not universally, http:// URIs for schema.org redirect to https:// when dereferenced, and while the latest revisions of schema.org do use https://, anyone who's made significant use of their vocab may find it a significant hurdle to change all instances of http://.

There's nothing wrong with http:// being in the identifier URI, and https:// being the way you get the description of that http://-identified entity. For good or ill, there's no way to make a universal RDF statement that "all https:// URIs are owl:sameAs all http:// URIs", but schema.org (and others running into similar issues) could relatively easily include such a declaration on each term in their vocab document. Maybe if enough different schema.org users whine about this, they'll do it.
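
As a rough illustration of the per-term declaration described above, the following Python sketch emits one N-Triples statement per term, linking the http: IRI to its https: counterpart. The function name `same_as_triples` and the two-term list are assumptions for the example. Whether owl:sameAs is the right predicate is itself questioned later in this thread, so the predicate here is only a placeholder for whichever relation is chosen.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def same_as_triples(terms):
    """Yield one N-Triples line per term: http: IRI owl:sameAs https: IRI."""
    for term in terms:
        yield (
            f"<http://schema.org/{term}> "
            f"<{OWL_SAME_AS}> "
            f"<https://schema.org/{term}> ."
        )


# Hypothetical term list; a real run would iterate over the vocabulary.
for line in same_as_triples(["Person", "AudiobookFormat"]):
    print(line)
```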

@nissimsan
Collaborator

image
https:// alright.

@danbri

danbri commented Aug 25, 2022

re

Maybe if enough different schema.org users whine about this, they'll do it.

We would respond to data consumers saying they'd use it.

However it is not clear what property to use to associate non-type, non-property terms, e.g.

http://schema.org/AudiobookFormat and https://schema.org/AudiobookFormat

I am not convinced owl:sameAs works, as it is such a strong claim.

@nissimsan
Collaborator

Noting that a http -> https redirect still breaks the Verifiable Credential proof.

@danbri

danbri commented Aug 25, 2022 via email

@nissimsan
Collaborator

Yes, agreed, @danbri.

More generically, I consistently stick with whatever I get redirect to - http or https, cool or uncool. Simple and safe.

@TallTed
Contributor

TallTed commented Aug 25, 2022

@nissimsan -- Though it may appear to be both, "[sticking] with whatever [you] get redirect [sic] to" is NEITHER simple nor safe.

Browser redirection from the URI of a term being described, to the URI of the description of that term, does not imply in any way that the term being described is identified by the URI of the description of that term!

@TallTed
Contributor

TallTed commented Aug 25, 2022

[@nissimsan] Noting that a http -> https redirect still breaks the Verifiable Credential proof.

The identifier of an entity being in the http URI scheme should not break any proof, even if the identifier of the description of that entity (which may be reached by dereferencing the identifier of that entity) is identified by a URI in the https URI scheme!
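
A simplified Python illustration of the distinction being drawn here. It assumes a proof is computed over a serialized statement that contains the term IRI as a plain string. Real Linked Data proofs canonicalize the RDF dataset before hashing, which this sketch does not attempt. The statement and the `digest` helper are hypothetical.

```python
import hashlib


def digest(statement):
    """Return the SHA-256 hex digest of a serialized statement."""
    return hashlib.sha256(statement.encode("utf-8")).hexdigest()


# Hypothetical signed statement that uses an http: identifier.
signed = '<urn:example:doc> <http://schema.org/name> "Bill of Lading" .'

# The same statement with the identifier rewritten to https: after signing.
rewritten = signed.replace("http://schema.org/", "https://schema.org/")

# Dereferencing the identifier (and being redirected) does not change the
# signed bytes, so the digest is unchanged.
print(digest(signed) == digest(signed))

# Rewriting the identifier inside the document does change the digest.
print(digest(signed) == digest(rewritten))
```

Under these assumptions the redirect alone is harmless; the digest only changes if the identifier written in the document is itself altered.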

@danbri

danbri commented Aug 25, 2022

There are also the various kinds of HTTP redirection available. FWIW Schema.org's http: to https: redirections use "HTTP/1.1 301 Moved Permanently".
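
One way to observe the redirect type yourself is sketched below in Python, using the standard library's `http.client`, which does not follow redirects on its own. The helper names are assumptions for this example, and the first function needs network access when called.

```python
import http.client


def head_without_following(host, path, timeout=10):
    """Send one plain-HTTP HEAD request; return (status, Location header)."""
    conn = http.client.HTTPConnection(host, timeout=timeout)
    try:
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def is_scheme_upgrade(status, location, host, path):
    """True for a 301 whose target is the same host and path over https."""
    return status == 301 and location == f"https://{host}{path}"
```

For example, `head_without_following("schema.org", "/Person")` returns the raw status and `Location` header, which can then be passed to `is_scheme_upgrade` along with the same host and path.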

@TallTed
Contributor

TallTed commented Aug 25, 2022

@danbri --

However it is not clear what property to use to associate non-type, non-property terms, e.g.

http://schema.org/AudiobookFormat and https://schema.org/AudiobookFormat

I am not convinced owl:sameAs works, as it is such a strong claim.

Are you saying that "you" (schema.org) mean (or meant) to identify two different entities by those two URIs (or really, any two URIs which differ only in their scheme, http vs https; note please I'm not talking about any two schemes, where I agree, it's more complex)?

Or is the latter simply a newer identifier (a co-referent) for the same entity, as would be communicated by putting an owl:sameAs relation (in either direction, i.e., in either description, though optimally it would be in both) between them?

@TallTed
Contributor

TallTed commented Aug 25, 2022

[@danbri] FWIW Schema.org's http: to https: redirections use "HTTP/1.1 301 Moved Permanently"

That would seem to support my assertion that owl:sameAs is the intended relation between the http: and https: identifiers, directed from the former to the latter, though as a reflexive, symmetric, and transitive relation, it could be stated in either direction.

@danbri

danbri commented Aug 25, 2022 via email

@TallTed
Contributor

TallTed commented Aug 25, 2022

@danbri

(Not treating your last in order....)

Well, I would hope you wouldn't spray these anywhere meant to be consumed as RDF --

http://schema.org/Person owl:sameAs https://schema.org/
means we can't usefully say
http://schema.org/Person schema:supersededBy https://schema.org/

First, this makes no sense to me, as written --

http://schema.org/Person owl:sameAs https://schema.org/

-- though perhaps you meant to write --

<http://schema.org/Person> owl:sameAs <https://schema.org/Person>

Likewise, perhaps you meant this --

http://schema.org/Person schema:supersededBy https://schema.org/

-- to be written --

<http://schema.org/Person> schema:supersededBy <https://schema.org/Person>

Now, if you are concerned about versioning -- whether of http://schema.org/ (or https://schema.org/) writ large (i.e., all terms therein), or of https://schema.org/Person writ small -- I think that has some validity.

I think that validity is best addressed by identifying the "old" description which was superseded with some versioned URI which is linked from the "new" description which is identified by some other versioned URI.

Dereferencing the un-versioned URI should, in my opinion, always lead to the latest/current description, which should include a link to at least the most-recent previous description (recursive, so each description links to the next-most-recent), if not to all previous descriptions.

Dereferencing any versioned URI should lead to that version of the description, which optimally would include links to both more and less recent descriptions.

Your example of --

Dublin Core moved from http://purl.org/dc/Creator to http://purl.org/dc/terms/creator (or similar)

-- seems to me rather significantly different than migrating schema.org from the HTTP (unencrypted end-to-end) protocol to the HTTPS (encrypted end-to-end) protocol, whether or not the URIs that are used to identify entities are left as http:-scheme or migrated to https:-scheme.

It is unfortunate that many treat the URI of the HTML or other rendition of a description of a dereferenced URI (to which they may be routed by various 3xx and other means) as if it were the URI of the entity identified by the initial URI. Rather, the URI of the entity identified by the initial URI should be included within the HTML or other rendition of a description of a dereferenced URI which description should optimally be identified by its own URI, but all too often is not.

@VladimirAlexiev
Contributor Author

The switch-over was done in 12.0 on 2021-03-08 (see https://schema.org/docs/releases.html).
I bitched about it for a long time, @danbri allowed a reasonable amount of discussion, and obviously schema.org won't go back to http, and there are good reasons to modernize to https.
@OR13 Are there significant numbers of existing VCs that are broken by this switch-over? I think that all new VCs should use https for schema.org, and possibly even for other ontologies (and ask the creators of those other ontologies what they think about a switch).

So I suggest not to sidetrack this issue with that other issue.

grep all schema.org terms from all Traceability schemas, and try to resolve those URLs to confirm their existence.

@OR13 what do you think?

@OR13
Collaborator

OR13 commented Aug 29, 2022

This isn't the schema.org repo... AFAIK, we updated all the references to use https... so that we don't have any issues with this.... I prefer more, smaller, more actionable issues... rather than issues that are "review all links"... I welcome a separate issue per discovered broken link.

@danbri

danbri commented Sep 14, 2022

FWIW the last schema.org release contained this file, https://github.com/schemaorg/schemaorg/blob/main/data/releases/14.0/httpequivs.ttl which asserts owl:equivalentClass and owl:equivalentProperty relationship between http: and https: term URIs. I don't think it has a treatment for Enumeration members. /cc @rjw

p.s. yes sorry I had a typo in my earlier response to @TallTed

And +1 to @OR13 re actionable issues.

@TallTed
Contributor

TallTed commented Sep 14, 2022

https://github.com/schemaorg/schemaorg/blob/main/data/releases/14.0/httpequivs.ttl which asserts owl:equivalentClass and owl:equivalentProperty relationship between http: and https: term URIs

Hallelujah! Glad to hear it! (Hope I don't forget it!)

@VladimirAlexiev
Contributor Author

VladimirAlexiev commented Sep 19, 2022

@OR13 Please reopen this: it's a task to grep all schema.org terms and check them for existence in schema.org.
Do you expect me to find all cases? I've started posting specific cases, but can't someone else share in this work?

Here's a first cut, collapsing cases already reported in other issues:

    grep -hr 'http.*schema.org' .|perl -pe "s{^ +}{}; s{ '}{}; s{'$}{}; s{https://https://}{https://}; s{www.}{}"| sort|uniq >schema.txt

Attached: schema.txt

nissimsan reopened this Sep 19, 2022
@nissimsan
Collaborator

@VladimirAlexiev - reopening. The floor is yours... :)

@VladimirAlexiev
Contributor Author

@nissimsan Isn't anyone going to help with the list I made?

@brownoxford
Collaborator

Discussed on call, please review and indicate whether you are able to assist on this ticket.

@OR13
Collaborator

OR13 commented Feb 28, 2023

I suggest running the script and filing separate issues, and then closing this issue.

Issues that are large and not actionable tend to not progress well.

@nissimsan
Collaborator

We do use https://schema.org/Purchase on Bill of Lading. Def. seems like a mistake. I'll remove this.

@VladimirAlexiev
Contributor Author

@nissimsan The attachment shows 161 schema terms in traceability schemas. Will you make a HEAD request for each of these URLs and see whether they resolve?
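
A minimal Python sketch of such a check, using only the standard library. It assumes that the site answers with a non-200 status for terms that do not exist. The function names are made up for this example, and network failures (`urllib.error.URLError`) are deliberately left unhandled. `urllib` follows redirects by default, so http: and https: URLs are treated alike here.

```python
import urllib.error
import urllib.request


def head_status(url, timeout=10):
    """Issue a HEAD request and return the resulting HTTP status code."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses are raised as exceptions; report their code.
        return err.code


def unresolved(urls, fetch_status=head_status):
    """Return (url, status) pairs for every URL that did not answer 200."""
    failures = []
    for url in urls:
        status = fetch_status(url)
        if status != 200:
            failures.append((url, status))
    return failures
```

Passing the list from schema.txt to `unresolved` would give the candidates for separate issues. The `fetch_status` parameter exists so the logic can be exercised without network access.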

Here's a mistake: this lowercase term is a prop, so it cannot be used as @type:

    '@type': https://schema.org/identifier
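
This kind of mistake can be flagged mechanically. The Python sketch below relies on the schema.org naming convention that types start with an uppercase letter and properties with a lowercase one. The regular expression and the function name `suspicious_types` are assumptions, and all sample lines other than the `identifier` one are invented for the example.

```python
import re

# Captures the local name of a schema.org IRI used as the value of @type.
TYPE_LINE = re.compile(
    r"""['"]?@type['"]?\s*:\s*https?://schema\.org/([A-Za-z0-9]+)"""
)


def suspicious_types(text):
    """Return @type local names that begin with a lowercase letter."""
    return [name for name in TYPE_LINE.findall(text) if name[0].islower()]


sample = """
'@type': https://schema.org/identifier
'@type': https://schema.org/Person
'@id': https://schema.org/name
"""

print(suspicious_types(sample))
```

A hit from this check is only a hint that a property was used where a type is expected; it still needs to be confirmed against the vocabulary.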

@BenjaminMoe
Contributor

@VladimirAlexiev can you open a separate issue for this? #271 (comment)
