Fix schema.rdfa reversions from sdo-deimos/3.0 release #1203
Our last release (sdo-deimos aka 3.0) included a number of unintended reversions. These need careful repair before we open a new branch for a subsequent release. The cause seems to be the March 24 2016 merging of the medical vocab changes into sdo-deimos (#11 (comment)). It seems I (@danbri) did not review the PR with sufficient care and our unit tests did not catch the mistake.
I can see no simple way to fix this beyond a careful comparison of before/after versions of data/schema.rdfa, which I shall be attempting here.
The commit https://github.com/schemaorg/schemaorg/blob/55a43a126c4981f090036008f49b0b6678f50ef0/data/schema.rdfa is too large to view in GitHub's web UI. On the command line, "git show 55a43a1" shows a lot of changes.
I do not trace any merge with
I've committed a very simple diff utility (/cc @mfhepp ) to the repo, https://github.com/schemaorg/schemaorg/blob/master/scripts/differ.py (no commandline flags yet).
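For readers who want the general idea without opening the script, here is a minimal sketch of a triple-level diff. It is not the contents of scripts/differ.py: the function name `diff_triples` and the tiny in-memory snapshots are assumptions for illustration, and parsing data/schema.rdfa into triples is left out entirely.

```python
# Illustrative sketch only; NOT the actual scripts/differ.py.
# Triples are modelled as plain (subject, predicate, object) tuples;
# loading them from an RDFa file is out of scope here.

def diff_triples(old, new):
    """Return (removed, added): triples only in old, and triples only in new."""
    old_set, new_set = set(old), set(new)
    removed = sorted(old_set - new_set)
    added = sorted(new_set - old_set)
    return removed, added

# Hypothetical miniature snapshots standing in for the 2.2 and 3.0 files.
v22 = [
    ("schema:BlogPosting", "rdfs:subClassOf", "schema:SocialMediaPosting"),
    ("schema:affiliation", "rdfs:subPropertyOf", "schema:memberOf"),
]
v30 = [
    ("schema:BlogPosting", "rdfs:subClassOf", "schema:Article"),
]

removed, added = diff_triples(v22, v30)
for s, p, o in removed:
    print(f"- {s} {p} {o} .")
for s, p, o in added:
    print(f"+ {s} {p} {o} .")
```

Working on sets of triples rather than on the text of the file keeps the comparison independent of line order and markup formatting, which is what makes a file this large tractable.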
Here are the outputs, in Turtle syntax, from comparing 2.2 and 3.0 data/schema.rdfa release snapshots (and ignoring extension files):
Looking in these three files for BlogPosting it is clear that the supertype reverted back to Article with the 3.0 release. Since #526 (see http://schema.org/docs/releases.html#g526) in v2.1, and therefore also since 2.2, it should have had a supertype of SocialMediaPosting:
https://gist.github.com/danbri/79625878bfbbc1ad098ddd91520aa573 has a first cut at a TODO list based on looking at all triples that went missing from v2.2 to v3.0, ignoring anything to do with medicine, and then investigating the remainder. The next task is to look at the non-medical additions in v3 and form a merged TODO list, which is then the basis for a pull request. (Aside: I will add the BlogPosting case too.)
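The filtering step described here can be sketched roughly as below. This is an assumption about the approach, not the code used to produce the gist: `MEDICAL_TERMS`, `non_medical`, and the sample triples are hypothetical, and a real run would take the medical term list from the health-lifesci extension.

```python
# Illustrative sketch only. MEDICAL_TERMS is a hypothetical stand-in for the
# full list of terms that moved to the health-lifesci extension.
MEDICAL_TERMS = {"schema:Drug"}

def non_medical(triples, medical_terms):
    """Keep triples whose subject and object are both outside medical_terms."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if s not in medical_terms and o not in medical_terms
    ]

# Hypothetical sample of triples that went missing between v2.2 and v3.0.
missing = [
    ("schema:publisher", "schema:rangeIncludes", "schema:Person"),
    ("schema:Drug", "rdf:type", "rdfs:Class"),
]

todo = non_medical(missing, MEDICAL_TERMS)
for s, p, o in todo:
    print(f"TODO: investigate {s} {p} {o} .")
```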
https://gist.github.com/danbri/d79a476cecc1bf425f007524eab11c81 - this is a quick list of terms that were in 2.2 core but not in the health-lifesci extension. This is not the best way to check, and @RichardWallis is looking into things more deeply, but I thought I'd share it since I had generated it (with the scripts/compare_health.py utility).
Comparison script 'compareterms.py' added to master.
Current output from script: https://gist.github.com/RichardWallis/ee96bcfbce20f96e6f967d3caf366a08
It indicates the following changes from sdo-phobos (v2.2) to v3.0 master:
Looking over the summary, I think we can discard half of the "dropped terms" after further investigation.
0.) Missing health-lifesci properties
These 4 are not in the 3.0 or extension dirs, and there is no discussion of them in the issue tracker:
We should consider restoring them, or amending the examples if they are not restored.
@twamarc can you comment on any of these 4?
1.) OK - Dropped term http://schema.org/Optometic
2.) OK - Dropped term http://schema.org/Radiograpy
Typo fixed. Now called: http://schema.org/Radiography
3.) OK - Dropped term http://schema.org/specialUsage
Not a vocabulary term. This is an implementation detail for vocabulary acknowledgements, and was changed as part of #1022
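Cases like the Radiograpy fix above show up in a diff as one dropped term plus one added term. A rough way to flag such pairs automatically is fuzzy string matching. The sketch below uses Python's standard difflib; the function name `likely_renames`, the 0.9 cutoff, and the sample lists (including the placeholder `exampleNewTerm`) are assumptions for illustration, and compareterms.py does not necessarily do this.

```python
import difflib

def likely_renames(dropped, added, cutoff=0.9):
    """Map each dropped term to its closest added term, if one is similar enough."""
    renames = {}
    for term in dropped:
        match = difflib.get_close_matches(term, added, n=1, cutoff=cutoff)
        if match:
            renames[term] = match[0]
    return renames

# Sample data: two dropped terms from the list above, one real added term,
# and one hypothetical placeholder ("exampleNewTerm").
dropped = ["Radiograpy", "specialUsage"]
added = ["Radiography", "exampleNewTerm"]

renames = likely_renames(dropped, added)
for old, new in renames.items():
    print(f"{old} -> {new} (probable typo fix)")
```

Anything flagged this way still needs a human check; a high similarity score only suggests a rename, it does not prove one.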
Here is a slightly reworked summary of "OK" vs "TODO" on all the changes that I found between 2.2 and 3.0. The format is rough but I think I was reasonably thorough, given the scale of the task.
Next task is to strip out the OK items and work up a pull request for the small fixes (reverse unwanted rollbacks) that it lists.
a. TODO: health-lifesci is missing definitions for breastfeedingWarning, healthCondition, prescriptionStatus, secondaryPrevention.
Missing: schema:BlogPosting rdfs:subClassOf schema:SocialMediaPosting .
3.) TODO: genre
4.) ingredients - this was already restored.
Addressed as a quick fix in http://schema.org/docs/releases.html#g1174
schema:ingredients a rdf:Property ;
... has this gone? seems to be in v3 too.
6.) TODO: check status of manufacturer ... we restored this as a quick fix too, looking at the diff of data/schema.rdfa vs data/releases/3.0/schema.rdfa
7.) TODO: netWorth - reversion see 60.) below
9.) TODO: affiliation was subPropertyOf memberOf
Established in 2.1 via http://schema.org/docs/releases.html#g596 (Documented that affiliation is a sub-property of memberOf.)
1212 Removed: schema:affiliation rdfs:subPropertyOf schema:memberOf .
2.2: schema:affiliation rdfs:subPropertyOf schema:memberOf .
TODO: confirm and restore.
10.) prepTime - TODO: link ISO
2.2: schema:prepTime rdfs:comment "The length of time it takes to prepare the recipe, in ISO 8601 duration format." .
11.) TODO: restore schema:publisher schema:rangeIncludes schema:Person .
schema:publisher schema:rangeIncludes schema:Person . has gone. - noted in #1198.
15.) totalTime - TODO link ISO
2.2: schema:totalTime rdfs:comment "The total time it takes to prepare and cook the recipe, in ISO 8601 duration format." .
In 2.1 we had http://schema.org/docs/releases.html#g577 "Amended videoFormat to indicate that it is expected on BroadcastEvent and ScreeningEvent, rather than TelevisionStation."
TODO: Restore per 2.2.
2.2: schema:videoFormat schema:domainIncludes schema:BroadcastEvent,
27.) parentOrganization - TODO repair and restore.
To be restored.
Unwanted rollback of http://schema.org/docs/releases.html#g535 ("Broadened domain of parentOrganization to allow any Organization, rather than only LocalBusiness. Noted parentOrganization and subOrganization as inverses.")
2.2: schema:parentOrganization schema:inverseOf schema:subOrganization ;
schema:subOrganization schema:inverseOf schema:parentOrganization . ...was also dropped in v3.0.
39.) branchCode TODO
2.2: schema:branchCode schema:domainIncludes schema:LocalBusiness,
45.) cookTime - TODO, restore hyperlink
2.2: schema:cookTime rdfs:comment "The time it takes to actually cook the dish, in ISO 8601 duration format." .
50.) TODO: investigate status of Dentist
In 3.0, added:
56.) TODO: restore codeSampleType (and sampleType)
TODO: We should restore this edit, https://github.com/schemaorg/schemaorg/pull/513/files
2.2: schema:codeSampleType rdfs:comment "What type of code sample: full (compile ready) solution, code snippet, inline code, scripts, template." .
TODO: Also fix sampleType (the superseded version).
60.) TODO: netWorth - rollback of #585
We chose to exclude "organization" from the text. This was rolled back in schema.rdfa and should be re-excluded, pending input from FIBO et al.
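Since our unit tests did not catch these reversions, one option is a regression check that pins down triples we know must be present. The sketch below is only an illustration under stated assumptions: `EXPECTED` lists triples named in the TODO items above, `missing_expected` is a hypothetical helper, and the snapshot is a hand-written stand-in for triples parsed from data/schema.rdfa.

```python
# Illustrative sketch only; not an existing test in the repo.
# Triples that this thread identifies as wrongly dropped in 3.0.
EXPECTED = [
    ("schema:BlogPosting", "rdfs:subClassOf", "schema:SocialMediaPosting"),
    ("schema:affiliation", "rdfs:subPropertyOf", "schema:memberOf"),
    ("schema:publisher", "schema:rangeIncludes", "schema:Person"),
    ("schema:parentOrganization", "schema:inverseOf", "schema:subOrganization"),
    ("schema:subOrganization", "schema:inverseOf", "schema:parentOrganization"),
]

def missing_expected(triples, expected=EXPECTED):
    """Return the expected triples that are absent from the given snapshot."""
    present = set(triples)
    return [t for t in expected if t not in present]

# Hypothetical snapshot in which only two of the five triples survive.
snapshot = [
    ("schema:affiliation", "rdfs:subPropertyOf", "schema:memberOf"),
    ("schema:publisher", "schema:rangeIncludes", "schema:Person"),
]

for s, p, o in missing_expected(snapshot):
    print(f"Missing: {s} {p} {o} .")
```

Each entry added to such a list when a release note is written would make a later rollback fail loudly instead of silently.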
Thanks @RichardWallis @danbri , this is helpful.
Therefore I fully agree with TODO: 0) a.
I propose the following patch (to be included in general pull request):
Queued up for re-publication via webschemas.org development site:
OK, this is all queued up for review in #1215 and is pushed to the webschemas.org site (but not schema.org itself). Here is a summary of the changes (please change URLs to webschemas.org to review the candidate fixes).
This may be over-cautious but for completeness I have gone back in releases.html to v2.0. All the improvements recorded for 2.0 are still current (or have been subsequently evolved further as documented in later releases).
Quick fixes and Examples
Manual review of 2.1, 2.2:
Site improvements (N/A here)
Site improvements: (N/A here)
Quick fixes and Examples
To summarize the manual review from docs/releases.html these are the things that appear to be missing in 3.0 and which #1203/#1215 repair. They are mostly from 2.1. Taking the above list of problems noted in 2.1 section of releases.html, and merging it against the fixes prepared which were based on triple-by-triple data/schema.rdfa comparison of 2.2 vs 3.0, we get the following 10 fixes, each of which is now annotated with the issue # and text of the 2.1 change that is being restored.
In addition, this review highlighted that there is a problem with #518: paymentStatus. This fix has been added to the PR and listed here as 0.).
@RichardWallis can you review these, i.e. confirm my belief that PR #1215 repairs what I say it repairs, and that as far as you can tell it puts us back on track with how v3.0+ ought to look?