"http://schema.org/Thing" and its descendants are no longer valid HTML. This seems to have been broken in Q1-Q2 2014, based on archived versions in the Internet Archive. In particular, the list of "More Specific Types" uses "li" items which are no longer within a "ul". (There's also an extra "/table" and a bogus "/span" tag in there.) This makes it harder to machine-process that data.
I assume those pages are automatically generated from "https://github.com/schemaorg/schemaorg/blob/master/data/schema.rdfa", which was new in 2014, so this is probably just a bad template.
W3C validation report: http://validator.w3.org/check?uri=http%3A%2F%2Fschema.org%2FThing&charset=%28detect+automatically%29&doctype=Inline&group=0
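In case it helps anyone reproduce this without the validator, here is a minimal sketch of a check for "li" tags that are opened outside any list container. It uses only Python's standard `html.parser`. The names `OrphanLiFinder` and `find_orphan_li` are made up for this example, and the check is deliberately loose: it only tracks whether some "ul", "ol" or "menu" is currently open, not whether it is the direct parent.

```python
from html.parser import HTMLParser

# Elements that may legitimately contain <li> children.
LIST_CONTAINERS = {"ul", "ol", "menu"}


class OrphanLiFinder(HTMLParser):
    """Collects (line, column) positions of <li> tags opened outside any list."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # number of currently open ul/ol/menu elements
        self.orphans = []   # positions of <li> tags seen while depth == 0

    def handle_starttag(self, tag, attrs):
        if tag in LIST_CONTAINERS:
            self.depth += 1
        elif tag == "li" and self.depth == 0:
            self.orphans.append(self.getpos())

    def handle_endtag(self, tag):
        # Ignore stray closing tags so the counter never goes negative.
        if tag in LIST_CONTAINERS and self.depth > 0:
            self.depth -= 1


def find_orphan_li(html_text):
    """Return the positions of all <li> start tags found outside a list."""
    finder = OrphanLiFinder()
    finder.feed(html_text)
    finder.close()
    return finder.orphans
```

Feeding it the saved source of a term page should give a non-empty list if the "li" items really are outside a "ul"; a well-formed list should give an empty one.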
Thanks. Yes, we moved to a new codebase early last year. We ought to fix this, indeed...
Thanks. The RDFa version of the same data, "http://schema.org/docs/schema_org_rdfa.html" (which comes from "https://github.com/schemaorg/schemaorg/blob/master/data/schema.rdfa"), could also use some work. It's OK up until line 4536; then it gets weird. There are "href" attributes on "span" tags, starting at "action The movement the muscle generates." It looks like some schema of medical information in a similar, but not quite compatible, format was pasted in there. There's similar cut-and-paste trouble near "series" and "wikidoc". Some of this won't even parse properly as HTML5 in a browser.
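For what it's worth, the "href" on "span" problem is easy to locate mechanically. Below is a rough sketch, again using only Python's standard `html.parser`; `SpanHrefFinder` and `find_span_hrefs` are names invented for this example, and it reports nothing beyond the line number and the attribute value.

```python
from html.parser import HTMLParser


class SpanHrefFinder(HTMLParser):
    """Records (line number, href value) for every <span> carrying an href."""

    def __init__(self):
        super().__init__()
        self.hits = []

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        for name, value in attrs:
            if name == "href":
                # getpos() returns (line, column); only the line is kept here.
                self.hits.append((self.getpos()[0], value))


def find_span_hrefs(html_text):
    """Return a list of (line number, href value) pairs for <span href=...>."""
    finder = SpanHrefFinder()
    finder.feed(html_text)
    finder.close()
    return finder.hits
```

Running `find_span_hrefs` over a local copy of the file should list each offending line, which would make it easier to see where the pasted-in sections begin and end.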
Thanks. Don't spend too much time on the schema.rdfa file's compatibility - in its current form it is an implementation detail. It turned out not to be ideal to use RDFa for this, and we are looking into migration to JSON-LD anyway. But I need to integrate and test a decent parser for that first...
Could you folks please make Turtle an option?
I have no strong preference on format. Please pick something parseable and implement it. Thank you.
Fixed broken html for terms pages (#375)