Invalid HTML on "schema.org" schema pages. #375

Closed
John-Nagle opened this Issue Mar 7, 2015 · 6 comments

5 participants
John-Nagle commented Mar 7, 2015

"http://schema.org/Thing" and its descendants are no longer valid HTML. This seems to have been broken in Q1-Q2 2014, based on archived versions in the Internet Archive. In particular, the list of "More Specific Types" uses "li" items which are no longer within a "ul". (There's also an extra "/table" and a bogus "/span" tag in there.) This makes it harder to machine-process that data.

I assume those pages are automatically generated from "https://github.com/schemaorg/schemaorg/blob/master/data/schema.rdfa", which was new in 2014, so this is probably just a bad template.

W3C validation report: http://validator.w3.org/check?uri=http%3A%2F%2Fschema.org%2FThing&charset=%28detect+automatically%29&doctype=Inline&group=0
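For anyone who wants to check a saved copy of a page for the stray "li" problem without going through the validator, here is a minimal sketch using only Python's standard `html.parser`. The class and function names are made up for illustration, and this is not a full validator: it only counts open "ul"/"ol"/"menu" containers and flags any "li" start tag seen while none is open.

```python
from html.parser import HTMLParser

# Elements that may legitimately contain <li> children.
LIST_CONTAINERS = {"ul", "ol", "menu"}


class OrphanLiFinder(HTMLParser):
    """Hypothetical helper: records <li> start tags seen outside any list container."""

    def __init__(self):
        super().__init__()
        self.open_lists = 0   # depth of currently open ul/ol/menu elements
        self.orphans = []     # (line, column) of each stray <li>

    def handle_starttag(self, tag, attrs):
        if tag in LIST_CONTAINERS:
            self.open_lists += 1
        elif tag == "li" and self.open_lists == 0:
            self.orphans.append(self.getpos())

    def handle_endtag(self, tag):
        # Ignore unmatched closing tags instead of letting the depth go negative.
        if tag in LIST_CONTAINERS and self.open_lists > 0:
            self.open_lists -= 1


def find_orphan_li(html_text):
    """Return the (line, column) positions of <li> tags that have no list parent."""
    finder = OrphanLiFinder()
    finder.feed(html_text)
    finder.close()
    return finder.orphans
```

Calling `find_orphan_li()` on the text of a saved page gives the positions to look at. An empty list means no stray "li" was found by this simple depth count, which is weaker than saying the page is valid.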

Contributor

danbri commented Mar 13, 2015

Thanks, yes we moved to a new codebase early last year. We ought to fix this, indeed...

@danbri danbri self-assigned this Mar 13, 2015

@danbri danbri added this to the sdo-gozer release milestone Mar 13, 2015

John-Nagle commented Mar 13, 2015

Thanks. The RDFa version of the same data, "http://schema.org/docs/schema_org_rdfa.html" (which comes from "https://github.com/schemaorg/schemaorg/blob/master/data/schema.rdfa") could also use some work. It's OK up until line 4536; then it gets weird. There are "href" attributes on "span" tags, starting at "action The movement the muscle generates." It looks like some schema of medical information in a similar, but not quite compatible, format was pasted in there. There's similar cut and paste trouble near "series" and "wikidoc". Some of this won't even parse properly as HTML5 in a browser.

Validator:
http://validator.w3.org/nu/?doc=http%3A%2F%2Fschema.org%2Fdocs%2Fschema_org_rdfa.html
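The "href" on "span" problem described above can also be located mechanically. This is a rough sketch with Python's standard `html.parser`, assuming the file has been saved locally. The names are hypothetical, and it reports only the line numbers of offending start tags, nothing more.

```python
from html.parser import HTMLParser


class SpanHrefFinder(HTMLParser):
    """Hypothetical helper: collects line numbers of <span> tags carrying an href attribute."""

    def __init__(self):
        super().__init__()
        self.hits = []  # 1-based line numbers of offending <span> start tags

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; names arrive lowercased.
        if tag == "span" and any(name == "href" for name, _ in attrs):
            self.hits.append(self.getpos()[0])


def find_span_href(html_text):
    """Return the line numbers where a <span> start tag has an href attribute."""
    finder = SpanHrefFinder()
    finder.feed(html_text)
    finder.close()
    return finder.hits
```

Running `find_span_href()` over the text of the RDFa file should list the lines where the suspect markup begins, which can then be compared against the validator output.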

Contributor

danbri commented Mar 13, 2015

Thanks. Don't spend too much time on the schema.rdfa file's compatibility - in its current form it is an implementation detail. It turned out not to be ideal to use RDFa for this, and we are looking into migration to JSON-LD anyway. But I need to integrate and test a decent parser for that first...

timbl commented Mar 13, 2015

Could you folks please make Turtle an option.

  • It is simpler than RDFa or JSON-LD.
  • It is a native graph language, not a tree language like JSON or XML.
  • It is the one required common language in the Linked Data Platform.
  • It can be read by old libraries.

Tim

Contributor

chaals commented Mar 13, 2015

13.03.2015, 22:53, "Tim Berners-Lee" notifications@github.com:

> Could you folks please make Turtle an option.

You mean getting a Turtle version of information we have, or reading Turtle from pages (how)?

cheers

> It is simpler than rdf/a or json/ld
> It is a native graph language not a tree language like json or xml
> It is the one required common language in the linked data platform.
> It can be read by things old libraries
> Tim

--
Charles McCathie Nevile - web standards - CTO Office, Yandex
chaals@yandex-team.ru - - - Find more at http://yandex.com

John-Nagle commented Mar 27, 2015

I have no strong preference on format. Please pick something parseable and implement it. Thank you.

@danbri danbri modified the milestones: 2015 Q2, sdo-gozer release May 12, 2015

danbri added a commit that referenced this issue Jul 24, 2015

Merge pull request #663 from schemaorg/legalhtml
Fixed broken html for terms pages (#375)